Processing Control For Data Transfer Patents (Class 712/225)
  • Patent number: 8443174
    Abstract: Provided are a processor and a method of performing speculative load instructions in which a load instruction is performed only when it actually accesses memory. In all other cases a cancelling load instruction is performed instead, so that problems caused by accessing an input/output (I/O) mapped memory area and the like while performing speculative load instructions can be prevented using a purely software method, thereby improving processor performance.
    Type: Grant
    Filed: August 14, 2007
    Date of Patent: May 14, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hong-Seok Kim, Hee Seok Kim, Jeongwook Kim, Suk Jin Kim
  • Publication number: 20130117547
    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register, with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
    Type: Application
    Filed: December 29, 2012
    Publication date: May 9, 2013
    Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
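    A minimal C sketch of the interleaving this abstract describes, assuming 64-bit source registers holding four 16-bit packed elements (the element width and register size are illustrative, not taken from the filing):

      #include <stdint.h>
      #include <stdio.h>

      /* Interleave the two low elements of src1 with the two low elements of
         src2: dest = { src1[0], src2[0], src1[1], src2[1] }. */
      static uint64_t unpack_low(uint64_t src1, uint64_t src2)
      {
          uint16_t a0 = (uint16_t)(src1 >>  0);
          uint16_t a1 = (uint16_t)(src1 >> 16);
          uint16_t b0 = (uint16_t)(src2 >>  0);
          uint16_t b1 = (uint16_t)(src2 >> 16);

          return  (uint64_t)a0
               | ((uint64_t)b0 << 16)
               | ((uint64_t)a1 << 32)
               | ((uint64_t)b1 << 48);
      }

      int main(void)
      {
          uint64_t src1 = 0x0004000300020001ULL;   /* elements 1..4 */
          uint64_t src2 = 0x0008000700060005ULL;   /* elements 5..8 */
          /* Prints 0006000200050001: elements alternate between the sources. */
          printf("%016llx\n", (unsigned long long)unpack_low(src1, src2));
          return 0;
      }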
  • Publication number: 20130117546
    Abstract: A Load/Store Disjoint instruction, when executed by a CPU, accesses operands from two disjoint memory locations and sets condition code indicators to indicate whether or not the two operands appeared to be accessed atomically by means of block-concurrent interlocked fetch with no intervening stores to the operands from other CPUs. In a Load Pair Disjoint form of the instruction, the accesses are loads and the disjoint data is stored in general registers.
    Type: Application
    Filed: December 26, 2012
    Publication date: May 9, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: INTERNATIONAL BUSINESS MACHINES CORPORATION
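    A hedged software analogue of the Load Pair Disjoint semantics described above; the re-read check below stands in for the hardware's block-concurrent interlocked fetch and is only an illustration:

      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Fetch two disjoint words and report whether the pair appeared to be
         read atomically, i.e. with no intervening store to either word.    */
      static bool load_pair_disjoint(_Atomic long *a, _Atomic long *b,
                                     long *out_a, long *out_b)
      {
          long a0 = atomic_load(a);
          long b0 = atomic_load(b);
          long a1 = atomic_load(a);   /* re-read to detect intervening stores */
          long b1 = atomic_load(b);
          *out_a = a0;
          *out_b = b0;
          return a0 == a1 && b0 == b1;
      }

      int main(void)
      {
          _Atomic long x = 1, y = 2;
          long vx, vy;
          bool appeared_atomic = load_pair_disjoint(&x, &y, &vx, &vy);
          printf("x=%ld y=%ld appeared atomic=%d\n", vx, vy, appeared_atomic);
          return 0;
      }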
  • Publication number: 20130111192
    Abstract: A remote control device receives an instruction to transmit and determines whether or not to include an acknowledgement request in the instruction based on statistics regarding receipt of acknowledgements associated with previously transmitted instructions. If so, the remote control device includes the request before transmitting. The remote control device may determine whether or not to include the request in a variety of ways, depending on the implementation. In some implementations, the remote control may classify instructions into two or more different classifications and may treat instructions of different classifications differently. In other implementations, the remote control may treat the same instruction differently depending on the number of requested acknowledgements successfully received during a time period. In various other implementations, the remote control may perform various combinations of these approaches.
    Type: Application
    Filed: October 31, 2011
    Publication date: May 2, 2013
    Applicant: EchoStar Technologies L.L.C.
    Inventor: William R. Reams
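    One way such a statistics-based decision could look in code; the sliding window and the 75% threshold are assumptions made for this sketch, not details from the filing:

      #include <stdbool.h>
      #include <stdio.h>

      #define WINDOW 16                  /* how many past requests we remember */

      struct ack_stats {
          bool history[WINDOW];          /* true = acknowledgement was received */
          int  next;                     /* circular write index                */
      };

      static void record_ack(struct ack_stats *s, bool received)
      {
          s->history[s->next] = received;
          s->next = (s->next + 1) % WINDOW;
      }

      /* Include an acknowledgement request when fewer than 75% of the recent
         transmissions were acknowledged.                                     */
      static bool should_request_ack(const struct ack_stats *s)
      {
          int received = 0;
          for (int i = 0; i < WINDOW; i++)
              received += s->history[i];
          return received * 4 < WINDOW * 3;
      }

      int main(void)
      {
          struct ack_stats s = {0};
          for (int i = 0; i < WINDOW; i++)
              record_ack(&s, true);      /* start with a clean history         */
          for (int i = 0; i < 5; i++)
              record_ack(&s, false);     /* five transmissions went unanswered */
          printf("request ack: %d\n", should_request_ack(&s));   /* prints 1 */
          return 0;
      }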
  • Patent number: 8433884
    Abstract: A multiprocessor executes a plurality of threads without decreasing execution efficiency. The multiprocessor includes a first processor allocating a different register file to each of a predetermined number of threads to be executed from among plural threads, and executing the predetermined number of threads in parallel; and a second processor performing processing according to a processing request made by the first processor. The first processor has areas allocated to the plurality of threads in one-to-one correspondence, makes the processing request to the second processor according to an instruction included in one of the predetermined number of threads, upon receiving a request for writing a value resulting from the processing from the second processor, judges whether the one thread is being executed, and when judging negatively, performs control such that the obtained value is written into one of the areas allocated to the one thread.
    Type: Grant
    Filed: June 16, 2009
    Date of Patent: April 30, 2013
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Morishita
  • Patent number: 8412917
    Abstract: Disclosed are methods and systems for dynamically determining data-transfer paths. The data-transfer paths are dynamically determined in response to an instruction that facilitates data transfer among execution lanes in an integrated-circuit processing device operable to execute operations in parallel. In addition, embodiments include an integrated-circuit processing device operable to execute operations in parallel, including the capability of providing confirmation information to potential source lanes, the confirmation information indicating whether the potential source lanes may send data to requested destination lanes during a data-transfer interval.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: April 2, 2013
    Assignee: Calos Fund Limited Liability Company
    Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin
  • Patent number: 8411103
    Abstract: One embodiment of the invention sets forth a CROP configured to perform both color raster operations and atomic transactions. Upon receiving an atomic transaction, the distribution unit within the CROP transmits a read request to the L2 cache for retrieving the destination operand. The distribution unit also transmits the source operands and the operation code to the latency buffer for storage until the destination operand is retrieved from the L2 cache. The processing pipeline transmits the operation code, the source and destination operands and an atomic flag to the blend unit for processing. The blend unit performs the atomic transaction on the source and destination operands based on the operation code and returns the result of the atomic transaction to the processing pipeline for storage in the internal cache. The processing pipeline writes the result of the atomic transaction to the L2 cache for storage at the memory location associated with the atomic transaction.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: April 2, 2013
    Assignee: Nvidia Corporation
    Inventors: Narayan Kulshrestha, Adam Paul Dreyer, Chad D. Walker, Rui M. Bastos
  • Publication number: 20130080747
    Abstract: The present invention relates to a processor including: an instruction cache configured to store at least some of first instructions stored in an external memory and second instructions each including a plurality of micro instructions; a micro cache configured to store third instructions corresponding to the plurality of micro instructions included in the second instructions; and a core configured to read out the first and second instructions from the instruction cache and perform calculation, in which the core performs calculation by the first instructions from the instruction cache under a normal mode, and when the processor enters a micro instruction mode, the core performs calculation by the third instructions corresponding to the plurality of micro instructions provided from the micro cache.
    Type: Application
    Filed: September 10, 2012
    Publication date: March 28, 2013
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventor: Young-Su KWON
  • Patent number: 8401015
    Abstract: The present invention relates to a method for representing a partition of n w-bit intervals associated to d-bit data in a data communications network, said method comprising the steps of: providing in a memory (102), a datagram forwarding data structure (10) provided for indicating where to forward a datagram in said network, which data structure (10) is in the form of a tree comprising at least one leaf (11) and possibly a number of nodes (13) including partial nodes, said data structure (10) having a height (h), corresponding to a number of memory accesses required for looking up a largest stored non-negative integer smaller than or equal to a query key, step 201, reducing worst storage cost by using a technique for reduction of worst case storage cost that is selectable from: partial block tree compaction, virtual blocks, bit push pulling, block aggregation or split block trees, and variations thereof, step 202, updating the layered data structure partially including by using a technique for scheduling ma
    Type: Grant
    Filed: October 19, 2007
    Date of Patent: March 19, 2013
    Assignee: Oricane AB
    Inventor: Mikael Sundström
  • Patent number: 8402255
    Abstract: A processor that is configured to perform parallel operations in a computer system where one or more memory hazards may be present is described. An instruction fetch unit within the processor is configured to fetch instructions for detecting one or more critical memory hazards between memory addresses if memory operations are performed in parallel on multiple addresses corresponding to at least a partial vector of addresses. Note that critical memory hazards include memory hazards that lead to different results when the memory addresses are processed in parallel than when the memory addresses are processed sequentially. Furthermore, an execution unit within the processor is configured to execute the instructions for detecting the one or more critical memory hazards.
    Type: Grant
    Filed: September 1, 2011
    Date of Patent: March 19, 2013
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Publication number: 20130067206
    Abstract: Endpoint-based parallel data processing in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.
    Type: Application
    Filed: November 9, 2012
    Publication date: March 14, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20130067205
    Abstract: An apparatus includes a processor and a memory coupled to the processor. The memory stores an instruction packet (e.g., a VLIW instruction packet) including a first predicate independent instruction and a second predicate independent instruction. Each of the predicate independent instructions has the same destination.
    Type: Application
    Filed: September 9, 2011
    Publication date: March 14, 2013
    Applicant: QUALCOMM Incorporated
    Inventors: Erich J. Plondke, Lucian Codrescu, Mao Zeng, Charles J. Tabony, Suresh K. Venkumahanti
  • Patent number: 8386751
    Abstract: One embodiment of the present invention includes a heterogeneous, high-performance, scalable processor having at least one W-type sub-processor capable of processing W bits in parallel, W being an integer value, and at least one N-type sub-processor capable of processing N bits in parallel, N being an integer value smaller than W by a factor of two. The processor further includes a shared bus coupling the at least one W-type sub-processor and at least one N-type sub-processor, and shared memory coupled to the at least one W-type sub-processor and the at least one N-type sub-processor, wherein the W-type sub-processor rearranges memory to accommodate execution of applications allowing for fast operations.
    Type: Grant
    Filed: May 18, 2010
    Date of Patent: February 26, 2013
    Assignee: Icelero LLC
    Inventors: Amit Ramchandran, John Reid Hauser, Jr.
  • Publication number: 20130046954
    Abstract: Disclosed is an architecture, system and method for performing multi-thread DFA descents on a single input stream. An executer performs DFA transitions from a plurality of threads each starting at a different point in an input stream. A plurality of executers may operate in parallel to each other and a plurality of thread contexts operate concurrently within each executer to maintain the context of each thread which is state transitioning. A scheduler in each executer arbitrates instructions for the thread into an at least one pipeline where the instructions are executed. Tokens may be output from each of the plurality of executers to a token processor which sorts and filters the tokens into dispatch order.
    Type: Application
    Filed: January 18, 2012
    Publication date: February 21, 2013
    Inventors: Michael Ruehle, Umesh Ramkrishnarao Kasture, Vinay Janardan Naik, Nayan Amrutlal Suthar, Robert J. McMillen
  • Publication number: 20130036276
    Abstract: Systems and methods for providing additional instructions for supporting efficient memory corruption detection in a processor. A physical memory may be a DRAM with a spare bank of memory reserved for a hardware failover mechanism. Version numbers associated with data structures allocated in the memory may be generated so that version numbers of adjacent data structures are different. A processor determines that a fetched instruction is a memory access instruction corresponding to a first data structure within the memory. For instructions that are not a version update instruction, the processor compares the first version number and second version number stored in a location in the memory indicated by the generated address and flags an error if there is a mismatch. For version update instructions, the processor performs a memory access operation on the second version number with no comparison check.
    Type: Application
    Filed: August 2, 2011
    Publication date: February 7, 2013
    Inventors: Zoran Radovic, Darryl J. Gove, Graham Ricketson Murphy
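    A minimal software sketch of the version-tag check; here the version is kept next to each allocation and in the unused high bits of a 64-bit handle, which replaces the spare-bank DRAM and ISA support of the abstract purely for illustration:

      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>

      #define VERSION_SHIFT 60
      #define VERSION_MASK  0xFULL      /* 4-bit version, assumes 64-bit pointers */

      struct tagged_block {
          uint8_t version;              /* version recorded for the allocation */
          long    payload;
      };

      static uintptr_t make_handle(struct tagged_block *b)
      {
          return (uintptr_t)b | ((uintptr_t)b->version << VERSION_SHIFT);
      }

      /* Memory access with a version comparison; flags a mismatch as an error. */
      static long checked_load(uintptr_t handle)
      {
          uint8_t expected = (uint8_t)((handle >> VERSION_SHIFT) & VERSION_MASK);
          struct tagged_block *b =
              (struct tagged_block *)(handle & ~(VERSION_MASK << VERSION_SHIFT));
          if (b->version != expected) {
              fprintf(stderr, "version mismatch: possible memory corruption\n");
              exit(1);
          }
          return b->payload;
      }

      int main(void)
      {
          struct tagged_block *b = malloc(sizeof *b);
          b->version = 3;
          b->payload = 42;
          uintptr_t h = make_handle(b);
          printf("%ld\n", checked_load(h));   /* 42 */
          b->version = 4;                     /* simulate reuse by a new owner */
          checked_load(h);                    /* now reports a mismatch        */
          free(b);
          return 0;
      }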
  • Patent number: 8370609
    Abstract: This invention includes a circuit for tracking memory operations with trace-based execution. Each trace includes a sequence of operations that includes zero or more of the memory operations. The memory operations being executed form a set of active memory operations that have a predefined program order among them and corresponding ordering constraints. At least some of the active memory operations access the memory in an execution order that is different from the program order. Checkpoint entries are associated with each trace. Each entry refers to a checkpoint location. Memory operation ordering entries correspond to each one of the active memory operations. Violations of the ordering constraints result in overwriting the checkpoint locations associated with the selected trace as well as the checkpoint locations associated with traces that are younger than the selected trace.
    Type: Grant
    Filed: February 13, 2008
    Date of Patent: February 5, 2013
    Assignee: Oracle America, Inc.
    Inventors: John Gregory Favor, Paul G. Chan, Graham Ricketson Murphy, Joseph Byron Rowlands
  • Patent number: 8370844
    Abstract: Embodiments of the invention provide a mechanism for process migration on a massively parallel computer system. In particular, embodiments of the invention may be used to update process state data for a migrated compute node, such as MPI (or other communication library) state data, across a full collection of compute nodes present in a given parallel system executing a parallel task. Migrating a process from one compute node to another may be useful to address a variety of sub-optimal operating conditions. For example, one or more processes may be migrated to cure network congestion resulting from a poorly mapped task or when a compute node is predicted to experience a hardware failure.
    Type: Grant
    Filed: September 12, 2007
    Date of Patent: February 5, 2013
    Assignee: International Business Machines Corporation
    Inventors: Charles Jens Archer, David L. Darrington, Patrick Joseph McCarthy, Amanda Peters, Albert Sidelnik
  • Patent number: 8364937
    Abstract: A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a load instruction and a data-dependent instruction to the instruction pipeline. Based on an operating mode, such as ECC mode or parity mode, the data-dependent instruction may execute in either the first or the second instruction pipeline stage. Further, the execution of the data-dependent instruction may depend on whether the most recently executed instruction was misaligned.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: January 29, 2013
    Assignee: Rambus Inc.
    Inventors: William C. Moyer, Jeffrey W. Scott
  • Patent number: 8364803
    Abstract: The present invention relates to a method for routing in a data communications network, said method comprising the steps of: providing in a memory (102), a datagram forwarding data structure (10) provided for indicating where to forward a datagram in said network, which data structure (10) is in the form of a tree comprising at least one leaf (11) and possibly a number of nodes (13) including partial nodes, said data structure (10) having a height (h), corresponding to a number of memory accesses required for looking up a largest stored non-negative integer smaller than or equal to a query key, step 201, reducing worst storage cost by using a technique for reduction of worst case storage cost that is selectable from: partial block tree compaction, virtual blocks, bit push pulling, block aggregation or split block trees, and variations thereof, step 202, updating the layered data structure partially including by using a technique for scheduling maintenance work that is selectable from: vertical segmentation
    Type: Grant
    Filed: October 19, 2007
    Date of Patent: January 29, 2013
    Assignee: Oricane AB
    Inventor: Mikael Sundström
  • Publication number: 20130024673
    Abstract: A technology capable of reducing load on both system processing and filter operation and improving power consumption and performance is provided. In a digital signal processor, a program memory, a program counter, and a control logic circuit are provided, and a bit field of each instruction includes instruction stop flag information and bit field information. The control logic circuit performs control such that an instruction whose instruction stop flag information is cleared is executed as is and processing proceeds to the next instruction; execution of an instruction whose instruction stop flag information is set is stopped if an execution resumption trigger condition corresponding to the bit field information is not satisfied; and an instruction whose instruction stop flag information is set is executed, and processing proceeds to the next instruction, once the execution resumption trigger condition corresponding to the bit field information is satisfied.
    Type: Application
    Filed: July 21, 2012
    Publication date: January 24, 2013
    Inventor: Takanaga YAMAZAKI
  • Patent number: 8352712
    Abstract: A method and processor chip design for enabling a processor core to continue sending store operations speculatively to the store queue after the core receives indication that the store queue is full. The processor core is configured with speculative store logic that enables the processor core to continue issuing store operations while the store queue full signal is asserted. A copy of the speculatively issued store operation is placed within a speculative store buffer. The core waits for a signal from the store queue indicating the store operation was accepted into the store queue. When the speculatively-issued store operation is accepted within the store queue, the copy is discarded from the buffer. However, when the store operation is rejected, the speculative store logic re-issues the store operation ahead of normal store operations.
    Type: Grant
    Filed: May 6, 2004
    Date of Patent: January 8, 2013
    Assignee: International Business Machines Corporation
    Inventors: Robert H. Bell, Jr., Thomas Michael Capasso, Guy Lynn Guthrie, Hugh Shen, Jeffrey Adam Stuecheli
  • Publication number: 20120331276
    Abstract: A method of executing an instruction set to select a set of registers includes reading a first instruction of the instruction set; interpreting a first operand of the first instruction to represent a first register S to be selected; interpreting a second operand of the first instruction to represent a number N of registers to be selected; and selecting N consecutive registers starting at the first register S to form the set of registers.
    Type: Application
    Filed: December 20, 2011
    Publication date: December 27, 2012
    Applicant: CAMBRIDGE SILICON RADIO LIMITED
    Inventors: Peter Smith, David Richard Hargreaves
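    A small sketch of decoding the two operands into a register-set mask, assuming a 32-register file; the wrap-around behaviour past the last register is an assumption made for the example:

      #include <stdint.h>
      #include <stdio.h>

      #define NUM_REGS 32

      /* Build a bitmask selecting N consecutive registers starting at register S. */
      static uint32_t select_registers(unsigned s, unsigned n)
      {
          uint32_t mask = 0;
          for (unsigned i = 0; i < n; i++)
              mask |= 1u << ((s + i) % NUM_REGS);
          return mask;
      }

      int main(void)
      {
          /* Select four consecutive registers starting at r5: r5, r6, r7, r8. */
          printf("mask = 0x%08x\n", select_registers(5, 4));   /* 0x000001e0 */
          return 0;
      }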
  • Publication number: 20120317402
    Abstract: A facility is provided to enable operator message commands from multiple, distinct sources to be provided to a coupling facility of a computing environment for processing. These commands are used, for instance, to perform actions on the coupling facility, and may be received from consoles coupled to the coupling facility, as well as logical partitions or other systems coupled thereto. Responsive to performing the commands, responses are returned to the initiators of the commands.
    Type: Application
    Filed: June 10, 2011
    Publication date: December 13, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: David A. Elko, Steven N. Goss, Thomas C. Shaw
  • Patent number: 8332829
    Abstract: Within a data processing system, one or more register files are assigned to respective states of a graph for each of a plurality of clock cycles. A plurality of edges are inserted to form connections between the states of the graph, with respective weights being assigned to each of the edges. A best route through the graph is then determined based, at least in part, on the weights assigned to the edges.
    Type: Grant
    Filed: August 15, 2008
    Date of Patent: December 11, 2012
    Assignee: Calos Fund Limited Liability Company
    Inventor: Peter Mattson
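    A toy dynamic-programming sketch of the idea: register-file assignments as states per clock cycle, weighted edges between consecutive cycles, and the best route chosen by minimum total weight (the sizes and weights below are invented for the example):

      #include <stdio.h>

      #define CYCLES 3
      #define STATES 2      /* candidate register-file assignments per cycle */

      int main(void)
      {
          /* edge[c][i][j]: weight of moving from state i in cycle c to state j
             in cycle c+1.                                                     */
          int edge[CYCLES - 1][STATES][STATES] = {
              { { 1, 4 }, { 2, 1 } },
              { { 3, 1 }, { 5, 2 } },
          };
          int cost[CYCLES][STATES] = { { 0, 0 } };  /* best cost to reach state */

          for (int c = 0; c + 1 < CYCLES; c++)
              for (int j = 0; j < STATES; j++) {
                  int best = cost[c][0] + edge[c][0][j];
                  for (int i = 1; i < STATES; i++)
                      if (cost[c][i] + edge[c][i][j] < best)
                          best = cost[c][i] + edge[c][i][j];
                  cost[c + 1][j] = best;
              }

          int best = cost[CYCLES - 1][0] < cost[CYCLES - 1][1]
                   ? cost[CYCLES - 1][0] : cost[CYCLES - 1][1];
          printf("best route cost: %d\n", best);    /* prints 2 */
          return 0;
      }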
  • Publication number: 20120311306
    Abstract: Parallelism of processing can be improved while existing software resources are utilized substantially as they are. A data processing apparatus includes a plurality of processing units configured to process packets each including data and extended identification information added to the data, the extended identification information including identification information for identifying the data and instruction information indicating one or more processing instructions to the data, each processing unit in the plurality of processing units including: an input/output unit configured to obtain, in the packets, only a packet whose address information indicates said each processing unit in the plurality of processing units, the address information determined in accordance with the extended identification information; and an operation unit configured to execute the processing instruction in the packet obtained by the input/output unit.
    Type: Application
    Filed: June 1, 2012
    Publication date: December 6, 2012
    Applicant: MUSH-A CO., LTD.
    Inventor: Mitsuru Mushano
  • Patent number: 8327121
    Abstract: A microprocessor includes an N-way cache and a logic block that selectively enables and disables the N-way cache for at least one clock cycle if a first register load instruction and a second register load instruction, following the first register load instruction, are detected as pointing to the same index line in which the requested data is stored. The logic block further provides a disabling signal to the N-way cache for at least one clock cycle if the first and second instructions are detected as pointing to the same cache way.
    Type: Grant
    Filed: August 20, 2008
    Date of Patent: December 4, 2012
    Assignee: MIPS Technologies, Inc.
    Inventors: Ajit Karthik Mylavarapu, Sanjai Balakrishnan Athi
  • Publication number: 20120297172
    Abstract: Tools and techniques are described for multi-threaded processing for opening and saving documents. These tools may provide load processes for reading documents from storage devices, and for loading the documents into applications. These tools may spawn a load process thread for executing a given load process on a first processing unit, and an application thread may execute a given application on a second processing unit. A first pipeline may be created for executing the load process thread, with the first pipeline performing tasks associated with loading the document into the application. A second pipeline may be created for executing the application process thread, with the second pipeline performing tasks associated with operating on the documents. The tasks in the first pipeline are configured to pass tokens as input to the tasks in the second pipeline.
    Type: Application
    Filed: August 8, 2012
    Publication date: November 22, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Uladzislau Sudzilouski, Igor Zaika
  • Publication number: 20120272047
    Abstract: Method, apparatus, and program means for shuffling data. The method of one embodiment comprises receiving a first operand having a set of L data elements and a second operand having a set of L control elements. For each control element, data from a first operand data element designated by the individual control element is shuffled to an associated resultant data element position if its flush to zero field is not set, and a zero is placed into the associated resultant data element position if its flush to zero field is set.
    Type: Application
    Filed: July 2, 2012
    Publication date: October 25, 2012
    Inventors: William W. Macy, JR., Eric L. Debes, Patrice L. Roussel, Huy V. Nguyen
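    An illustrative C model of the shuffle, assuming four-element operands, a 2-bit source index per control element, and a single flush-to-zero bit (the field layout is an assumption for the example):

      #include <stdint.h>
      #include <stdio.h>

      #define L 4    /* number of data and control elements */

      /* Control element: low 2 bits select the source element, bit 7 is the
         flush-to-zero field.                                                */
      static void shuffle(const int32_t src[L], const uint8_t ctrl[L],
                          int32_t dst[L])
      {
          for (int i = 0; i < L; i++) {
              if (ctrl[i] & 0x80)
                  dst[i] = 0;                    /* flush-to-zero field set   */
              else
                  dst[i] = src[ctrl[i] & 0x03];  /* copy the selected element */
          }
      }

      int main(void)
      {
          int32_t src[L]  = { 10, 20, 30, 40 };
          uint8_t ctrl[L] = { 0x03, 0x80, 0x00, 0x02 };  /* pick 40, zero, 10, 30 */
          int32_t dst[L];
          shuffle(src, ctrl, dst);
          for (int i = 0; i < L; i++)
              printf("%d ", dst[i]);             /* prints 40 0 10 30 */
          printf("\n");
          return 0;
      }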
  • Publication number: 20120265971
    Abstract: A mapper unit of an out-of-order processor assigns a particular counter currently in a counter free pool to count a number of mappings of logical registers to a particular physical register from among multiple physical registers, responsive to an execution of an instruction by the mapper unit mapping at least one logical register to the particular physical register. The number of counters is less than the number of physical registers. The mapper unit, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
    Type: Application
    Filed: April 15, 2011
    Publication date: October 18, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: GREGORY W. ALEXANDER, BRIAN D. BARRICK, JOHN W. WARD, III
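    A simplified software model (an assumption-laden sketch, not the filing's hardware) of sharing a small pool of counters among a larger set of physical registers: a counter is taken from the free pool when a physical register first gains a logical mapping and is returned once its count drops back to zero:

      #include <stdio.h>

      #define NUM_PHYS     8    /* physical registers (kept small here)   */
      #define NUM_COUNTERS 4    /* fewer counters than physical registers */

      static int counter_value[NUM_COUNTERS];
      static int counter_free[NUM_COUNTERS] = { 1, 1, 1, 1 };
      static int phys_counter[NUM_PHYS] = { -1, -1, -1, -1, -1, -1, -1, -1 };

      static int alloc_counter(void)
      {
          for (int c = 0; c < NUM_COUNTERS; c++)
              if (counter_free[c]) { counter_free[c] = 0; return c; }
          return -1;            /* pool exhausted (never hit in this demo) */
      }

      /* A logical register is mapped to physical register p. */
      static void map_logical(int p)
      {
          if (phys_counter[p] < 0)
              phys_counter[p] = alloc_counter();
          counter_value[phys_counter[p]]++;
      }

      /* A mapping to physical register p is released. */
      static void unmap_logical(int p)
      {
          int c = phys_counter[p];
          if (--counter_value[c] == 0) {          /* last mapping gone    */
              counter_free[c] = 1;                /* recycle the counter  */
              phys_counter[p] = -1;
          }
      }

      int main(void)
      {
          map_logical(3);
          map_logical(3);       /* two logical registers map to physical reg 3 */
          unmap_logical(3);
          unmap_logical(3);     /* counter returns to the free pool            */
          printf("free counters: %d %d %d %d\n", counter_free[0],
                 counter_free[1], counter_free[2], counter_free[3]);
          return 0;
      }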
  • Patent number: 8289335
    Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
    Type: Grant
    Filed: February 3, 2006
    Date of Patent: October 16, 2012
    Assignee: MicroUnity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris, Alexia Massalin
  • Patent number: 8291200
    Abstract: A comparison circuit can reduce the amount of power consumed when searching a load queue or a store queue of a microprocessor. Some embodiments of the comparison circuit use a comparison unit that performs an initial comparison of addresses using a subset of the address bits. If the initial comparison results in a match, a second comparison unit can be enabled to compare another subset of the address bits.
    Type: Grant
    Filed: August 4, 2009
    Date of Patent: October 16, 2012
    Assignee: STMicroelectronics (Beijing) R&D Co., Ltd.
    Inventors: Kai-Feng Wang, Hong-Xia Sun, Yong-Qiang Wu
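    A hedged sketch of the two-stage comparison: a cheap match on a small subset of the address bits gates the full comparison, which is the part of the search that would otherwise dominate power. The bit subset and queue contents are assumptions for the example:

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define SUBSET_MASK 0xFFu     /* stage 1 compares only the low 8 bits */

      static int full_compares;     /* counts how often stage 2 was enabled */

      static bool addresses_match(uint32_t a, uint32_t b)
      {
          if ((a & SUBSET_MASK) != (b & SUBSET_MASK))
              return false;         /* cheap stage-1 miss: stage 2 skipped  */
          full_compares++;
          return a == b;            /* stage 2: full address comparison     */
      }

      int main(void)
      {
          uint32_t queue[] = { 0x1000, 0x2040, 0x30A0, 0x40A0 };
          uint32_t probe = 0x40A0;
          for (unsigned i = 0; i < sizeof queue / sizeof queue[0]; i++)
              if (addresses_match(queue[i], probe))
                  printf("hit at entry %u\n", i);
          printf("full comparisons: %d\n", full_compares);  /* 2 of 4 entries */
          return 0;
      }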
  • Publication number: 20120260074
    Abstract: A microprocessor performs an architectural instruction that instructs it to perform an operation on first and second source operands to generate a result and to write the result to a destination register only if its architectural condition flags satisfy a condition specified in the architectural instruction. A hardware instruction translator translates the instruction into first and second microinstructions. To execute the first microinstruction, an execution pipeline performs the operation on the source operands to generate the result. To execute the second microinstruction, it writes the destination register with the result generated by the first microinstruction if the architectural condition flags satisfy the condition, and writes the destination register with the current value of the destination register if the architectural condition flags do not satisfy the condition.
    Type: Application
    Filed: December 21, 2011
    Publication date: October 11, 2012
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: G. Glenn Henry, Gerard M. Col, Rodney E. Hooker, Terry Parks
  • Publication number: 20120260071
    Abstract: An architectural instruction instructs a microprocessor to perform an operation on first and second source operands to generate a result and to write the result to a destination register only if architectural condition flags satisfy a condition specified in the architectural instruction. A hardware instruction translator translates the architectural instruction into first and second microinstructions. To execute the first microinstruction, an execution pipeline performs the operation on the source operands to generate the result, determines whether the architectural condition flags satisfy the condition, and updates a non-architectural indicator to indicate whether the architectural condition flags satisfy the condition.
    Type: Application
    Filed: December 21, 2011
    Publication date: October 11, 2012
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: G. Glenn Henry, Gerard M. Col, Rodney E. Hooker, Terry Parks
  • Publication number: 20120260075
    Abstract: A microprocessor includes a hardware instruction translator that translates an architectural instruction into first and second microinstructions. To execute the first microinstruction, an execution pipeline performs the shift operation on the first source operand to generate the first result and a carry flag value and updates a non-architectural carry flag with the generated carry flag value. To execute the second microinstruction, it performs the second operation on the first result and the second operand to generate the second result and new condition flag values based on the second result. If the architectural condition flags satisfy the condition, it updates the architectural carry flag with the non-architectural carry flag value and updates at least one of the other architectural condition flags with the corresponding generated new condition flag values; otherwise, it updates the architectural condition flags with the current value of the architectural condition flags.
    Type: Application
    Filed: December 21, 2011
    Publication date: October 11, 2012
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: G. Glenn Henry, Gerard M. Col, Rodney E. Hooker, Terry Parks
  • Publication number: 20120254596
    Abstract: A system and method for controlling messaging between a first processor and a second processor is disclosed. The second processor controls one or more peripheral devices on behalf of a plurality of predetermined tasks being executed by the first processor. The system includes a message control module that receives an input message intended for the second processor from the first processor and maintains a message history based on the received input message and previously received input messages. The message history indicates which peripheral devices of the system are to be on and which tasks of the plurality of tasks requested the peripheral devices to be on. The message control module is further configured to generate an output message that includes output instructions for the second processor based on the message history and an output duration based on the message history. The second processor executes the output instructions.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Applicants: DENSO CORPORATION, DENSO INTERNATIONAL AMERICA, INC.
    Inventors: Wan-ping Yang, Koji Shinoda, Hiroaki Shibata
  • Patent number: 8281075
    Abstract: A technique for triggering a system bus write command with user code includes identifying a specific store-type instruction in a user instruction sequence. The specific store-type instruction is converted into a specific request-type command, which is configured to include core permission controls (that are stored in core configuration registers of a processor core by a trusted kernel) and user created data (stored in a cache memory). Slave devices are configured through register space (that is only accessible by the trusted kernel) with respective slave permission controls. The specific request-type command is then transmitted from the cache memory, via a system bus. In this case, the slave devices that receive the specific request-type command process the specific request-type command when the core permission controls are the same as the respective slave permission controls. The trusted kernel may be included in a hypervisor or an operating system.
    Type: Grant
    Filed: April 14, 2009
    Date of Patent: October 2, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana Baba Arimilli, Brian Mitchell Bass, David Wayne Cummings, Bernard Charles Drerup, Guy Lynn Guthrie, Ronald Nick Kalla, Hugh Shen, Michael Steven Siegel, William John Starke, Derek Edward Williams
  • Patent number: 8269784
    Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
    Type: Grant
    Filed: January 19, 2012
    Date of Patent: September 18, 2012
    Assignee: MicroUnity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris, Alexia Massalin
  • Patent number: 8271766
    Abstract: An information processing device including registers (105) for holding data and an operation device (102) for executing arithmetic and logic operations on input/output data held in the registers. The information processing device can issue an inter-register copy instruction for instructing data held in one register to be copied to another register. The information processing device further includes a copy information holding device (113) for reserving execution of a data copy operation requested by the inter-register copy instruction from a control unit (108), so as to execute the actual copy operation simultaneously with the succeeding instruction and hide the execution time of the copy operation. Thus, in the inter-register copy instruction execution phase, a reservation for a data copy operation is stored in the copy information holding device so that the execution phase is completed without performing the actual data copy operation.
    Type: Grant
    Filed: May 18, 2006
    Date of Patent: September 18, 2012
    Assignee: NEC Corporation
    Inventor: Noritaka Hoshi
  • Publication number: 20120226868
    Abstract: Devices and methods for providing deterministic execution of multithreaded applications are provided. In some embodiments, each thread is provided access to an isolated memory region, such as a private cache. In some embodiments, multiple private caches are synchronized via a modified MOESI coherence protocol. The modified coherence protocol may be configured to refrain from synchronizing the isolated memory regions until the end of an execution quantum. The execution quantum may end when all threads experience a quantum end event such as reaching a threshold instruction count, overflowing the isolated memory region, and/or attempting to access a lock released by a different thread in the same quantum.
    Type: Application
    Filed: March 1, 2012
    Publication date: September 6, 2012
    Applicant: University of Washington through its Center for Commercialization
    Inventors: Luis Henrique Ceze, Thomas Bergan, Joseph Devietti, Daniel Joseph Grossman, Jacob Eric Nelson
  • Patent number: 8255631
    Abstract: A method, processor, and data processing system for implementing a framework for priority-based scheduling and throttling of prefetching operations. A prefetch engine (PE) assigns a priority to a first prefetch stream, indicating a relative priority for scheduling prefetch operations of the first prefetch stream. The PE monitors activity within the data processing system and dynamically updates the priority of the first prefetch stream based on the activity (or lack thereof). Low priority streams may be discarded. The PE also schedules prefetching in a priority-based scheduling sequence that corresponds to the priority currently assigned to the scheduled active streams. When there are no prefetches within a prefetch queue, the PE triggers the active streams to provide prefetches for issuing. The PE determines when to throttle prefetching, based on the current usage level of resources relevant to completing the prefetch.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lei Chen, Lixin Zhang
  • Patent number: 8255095
    Abstract: A modular avionics system includes several cabinets arranged at various locations in an aircraft and interconnected in a network. The cabinets are used for controlling or processing signals from and to sensors, actuators and other systems of the aircraft. The system includes parallel processors, for example transputers. The cabinets comprise at least two core processor modules (CPM1, CPM2) and at least two input/output modules (IOM1, IOM2). The input/output modules (IOM1, IOM2) serve as interfaces to the systems to be controlled, and serve for the control and intermediate storage of the data flowing into and out of the cabinet. Each core processor module (CPM1, CPM2) communicates independently with each IOM module and CPM module by way of links; and in each core processor a number of independent system programs works under the control of an operating system.
    Type: Grant
    Filed: November 16, 2006
    Date of Patent: August 28, 2012
    Assignee: Airbus Operations GmbH
    Inventor: Heinz Girlich
  • Patent number: 8255673
    Abstract: Apparatus for processing data is provided comprising processing circuitry and monitoring circuitry for monitoring write transactions and performing transaction authorizations of certain transactions in dependence upon associated memory addresses. The processing circuitry is configured to enable execution of a write instruction corresponding to a write transaction to be monitored to continue to completion while the monitoring circuitry is performing monitoring of the write transactions and the monitoring circuitry is arranged to cause storage of write transaction data in an intermediate storage element for those transactions for which an authorization is required. Storage of write transaction data in an intermediate storage element enables the write transaction to be reissued in dependence upon the result of the transaction authorization although the corresponding write instruction has already completed.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: August 28, 2012
    Assignee: ARM Limited
    Inventors: Daniel Kershaw, Daren Croxford
  • Patent number: 8255672
    Abstract: A processor includes: a plurality of registers; an instruction readout circuit configured to read out an instruction from a memory; an instruction generation circuit configured to generate instructions for saving data into a predetermined storage area, for the respective registers, if the instruction read out by the instruction readout circuit is an instruction causing the data stored in each of the plurality of registers to be saved; and an instruction execution circuit configured to execute the instruction read out from the memory and the instructions generated by the instruction generation circuit.
    Type: Grant
    Filed: May 28, 2008
    Date of Patent: August 28, 2012
    Assignees: Semiconductor Components Industries, LLC, Sanyo Semiconductor Co., Ltd.
    Inventors: Iwao Honda, Shinya Kishida
  • Patent number: 8254401
    Abstract: The device comprises a memory (3) for storing several user share parameters and several advanceable amounts. A decision means (6) allocates a chosen service slice of the resource to a user selected as possessing the least advanced amount. It subsequently advances that user's amount according to a chosen increment. A memory link means (5) defines user queues of "FIFO" type, such that the user having the least advanced amount in a queue appears at the head of this queue. According to the invention, the memory (3) stores a limited number of increment values. The memory link means (5) associates one of these increment values with each user and allocates an increment value to each queue.
    Type: Grant
    Filed: July 1, 2010
    Date of Patent: August 28, 2012
    Assignee: Streamcore
    Inventor: Rémi Despres
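    A rough sketch (terminology and numbers are assumptions, not the filing's) of serving the user whose amount is least advanced and then advancing that amount by the user's increment, which gives the weighted sharing the abstract describes:

      #include <stdio.h>

      #define USERS 3

      static double amount[USERS]    = { 0.0, 0.0, 0.0 };
      static double increment[USERS] = { 1.0, 2.0, 4.0 };  /* smaller = larger share */

      /* Pick the user with the least advanced amount, serve it, advance it. */
      static int serve_next(void)
      {
          int best = 0;
          for (int u = 1; u < USERS; u++)
              if (amount[u] < amount[best])
                  best = u;
          amount[best] += increment[best];
          return best;
      }

      int main(void)
      {
          int served[USERS] = { 0 };
          for (int i = 0; i < 28; i++)
              served[serve_next()]++;
          /* Users are served in proportion 4:2:1, the inverse of their
             increments: prints "served: 16 8 4".                        */
          printf("served: %d %d %d\n", served[0], served[1], served[2]);
          return 0;
      }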
  • Patent number: 8250231
    Abstract: A method to reduce buffer capacity in a processor includes giving the data packets admittance to the processor through at least one interface, storing the data packets in at least one input buffer, and using a packet rate shaper outside of a processing pipeline to control flow of the data packets to the pipeline before the data packets enter the pipeline. First and second data packets are given admittance to the pipeline in dependence on cost information per packet that is dependent upon an expected time period of residence of the first data packet in the pipeline. Cost information dependent upon an expected time period of residence of the second data packet in the pipeline differs from said cost information dependent upon the expected time period of residence of the first data packet in the pipeline.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: August 21, 2012
    Assignee: Marvell International Ltd.
    Inventors: Thomas Bodén, Jakob Carlström
  • Publication number: 20120210102
    Abstract: A first hardware thread executes a software program instruction, which instructs the first hardware thread to initiate a second hardware thread. As such, the first hardware thread identifies one or more register values accessible by the first hardware thread. Next, the first hardware thread copies the identified register values to one or more registers accessible by the second hardware thread. In turn, the second hardware thread accesses the copied register values included in the accessible registers and executes software code accordingly.
    Type: Application
    Filed: April 21, 2012
    Publication date: August 16, 2012
    Applicant: International Business Machines Corporation
    Inventors: Giles Roger Frazier, Ronald P. Hall
  • Publication number: 20120204015
    Abstract: A computer-program product may have instructions that, when executed, cause a processor to perform operations including managing execution of application functions that access data in a shared buffer; determining if a first instruction that is stored at a first memory location causes, upon execution, data to be read from or written to the shared buffer; and when it is determined that the first instruction causes data to be read from or written to the shared buffer, 1) identify one or more replacement instructions to execute in place of the first instruction; 2) store the one or more replacement instructions; and 3) replace the first instruction at the first memory location with a second instruction that, when executed, causes the stored one or more replacement instructions to be executed.
    Type: Application
    Filed: April 13, 2012
    Publication date: August 9, 2012
    Applicant: APPLE INC.
    Inventors: Ronnie G. Misra, Joshua H. Shaffer
  • Publication number: 20120198214
    Abstract: One embodiment sets forth a technique for N-way memory barrier operation coalescing. When a first memory barrier is received for a first thread group, execution of subsequent memory operations for the first thread group is suspended until the first memory barrier is executed. Subsequent memory barriers for different thread groups may be coalesced with the first memory barrier to produce a coalesced memory barrier that represents memory barrier operations for multiple thread groups. When the coalesced memory barrier is being processed, execution of subsequent memory operations for the different thread groups is also suspended. However, memory operations for other thread groups that are not affected by the coalesced memory barrier may be executed.
    Type: Application
    Filed: April 6, 2012
    Publication date: August 2, 2012
    Inventors: Shirish GADRE, Charles McCARVER, Anjana RAJENDRAN, Omkar PARANJAPE, Steven James HEINRICH
  • Publication number: 20120198213
    Abstract: A packet handler for a packet processing system includes a plurality of parallel action machines, each of the plurality of parallel action machines being configured to perform a respective packet processing function; and a plurality of action machine input registers, wherein each of the plurality of parallel action machines is associated with one or more of the plurality of action machine input registers, and wherein an action machine of the plurality of parallel action machines is automatically triggered to perform its respective packet processing function in the event that data sufficient to perform the action machine's respective packet processing function is written into the action machine's one or more respective action machine input registers.
    Type: Application
    Filed: January 31, 2011
    Publication date: August 2, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Francois Abel, Jean Calvignac, Christoph Hagleitner, Fabrice Verplanken
  • Patent number: RE43825
    Abstract: A system and method forward data between processing elements. A first processing element includes an address register that stores a first memory address. A forwarding storage element is coupled to the first processing element. A second processing element, coupled to the forwarding storage element, transmits a second memory address to the forwarding storage element. The forwarding storage element transmits the second memory address to the first processing element, and the first processing element compares the second memory address with the first memory address.
    Type: Grant
    Filed: November 19, 2007
    Date of Patent: November 20, 2012
    Assignee: The United States of America as Represented by the Secretary of the Navy
    Inventors: Joel Zvi Apisdorf, Sam Brandon Sandbote, Michael Daniel Poole