Simultaneous Issuance Of Multiple Instructions Patents (Class 712/215)

Information processing apparatus and method of controlling register

Patent number: 8019973

Abstract: An information processing apparatus and a method of controlling the same that employs a register window system and a Simultaneous Multithreading method for reducing circuit areas by sharing a data transfer bus between threads, said bus connecting a master register and a work register provided for each thread and for avoiding interference in instruction execution with other threads caused by a conflict between accesses to a register between threads. An information processing apparatus and a method of controlling the information processing apparatus employing a register window system for register reading, in which a master register and a work register are held for each thread and a bus for transferring data from the master to the work register is shared by threads in order to realize Simultaneous Multithreading.

Type: Grant

Filed: December 15, 2009

Date of Patent: September 13, 2011

Assignee: Fujitsu Limited

Inventors: Takashi Suzuki, Toshio Yoshida
Reducing data hazards in pipelined processors to provide high processor utilization

Patent number: 8006072

Abstract: A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data.

Type: Grant

Filed: May 18, 2010

Date of Patent: August 23, 2011

Assignee: Micron Technology, Inc.

Inventors: Neal Andrew Cook, Alan T. Wootton, James Peterson
Thread migration control based on prediction of migration overhead

Patent number: 8006077

Abstract: A processing system features a first processing core to operate in a first node, a second processing core to operate in a second node, and random access memory (RAM) responsive to the first and second processing cores. The processing system also features control logic to perform operations such as (a) automatically updating a resident set size (RSS) counter to correspond to the RSS for the thread on the first node in response to allocation of a page frame for a thread in the first node, and (b) using the RSS counter to predict migration overhead when determining whether the thread should be migrated from the first processing core to the second processing core. Other embodiments are described and claimed.

Type: Grant

Filed: March 29, 2007

Date of Patent: August 23, 2011

Assignee: Intel Corporation

Inventors: Tong Li, Daniel Baumberger, Scott Hahn
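
Illustrative note on patent 8006077: the two control-logic operations in the abstract can be sketched roughly as follows. This is a minimal sketch, not the patented implementation. The names `ThreadState`, `predicted_migration_cost`, `should_migrate`, the 4096-byte page size, and the byte budget are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # bytes; assumed page size, not stated in the abstract


@dataclass
class ThreadState:
    """Hypothetical per-thread bookkeeping: node id -> resident page count."""
    rss_pages: dict = field(default_factory=dict)

    def on_page_frame_allocated(self, node: int) -> None:
        # Operation (a): update the RSS counter for the node where the frame was allocated.
        self.rss_pages[node] = self.rss_pages.get(node, 0) + 1


def predicted_migration_cost(thread: ThreadState, from_node: int) -> int:
    """Estimate overhead as the bytes resident on the node the thread would leave."""
    return thread.rss_pages.get(from_node, 0) * PAGE_SIZE


def should_migrate(thread: ThreadState, from_node: int, budget_bytes: int) -> bool:
    # Operation (b): migrate only if the predicted overhead fits an assumed budget.
    return predicted_migration_cost(thread, from_node) <= budget_bytes
```

The point of the sketch is that the counter is maintained at allocation time, so the migration decision reads a ready-made value instead of walking page tables.
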
System and method for optimization within a group priority issue schema for a cascaded pipeline

Patent number: 7996654

Abstract: The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions, (2) determine the dependency chain depth of all the instructions in the issue group, (3) schedule the instructions in order from the longest dependency chain depth to the shortest dependency chain depth, and (4) execute the issue group of instructions in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: February 19, 2008

Date of Patent: August 9, 2011

Assignee: International Business Machines Corporation

Inventor: David A. Luick
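
Illustrative note on patent 7996654: steps (2) and (3) of the abstract, determining dependency chain depth and ordering from longest to shortest, might look like the sketch below. The abstract does not define "depth"; here it is assumed to be the length of the longest chain of dependents starting at an instruction. The function name `order_by_chain_depth` and the dictionary input format are hypothetical.

```python
from functools import lru_cache


def order_by_chain_depth(deps):
    """deps maps instruction name -> names of instructions it depends on,
    all within the same issue group, listed in program order."""
    dependents = {name: [] for name in deps}
    for name, sources in deps.items():
        for src in sources:
            dependents[src].append(name)

    @lru_cache(maxsize=None)
    def depth(name):
        # Longest chain of dependents that starts at this instruction (assumed definition).
        return 1 + max((depth(d) for d in dependents[name]), default=0)

    # Longest chain first; sorted() is stable, so ties keep program order.
    return sorted(deps, key=lambda n: -depth(n))
```

For a group where `i1` depends on `i0` and `i2` depends on `i1`, `i0` heads the longest chain and is ordered first, which matches the longest-to-shortest rule in step (3).
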
System and method for prioritizing arithmetic instructions

Patent number: 7984270

Abstract: The present invention provides a system and method for prioritizing arithmetic instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one arithmetic instruction is in the issue group and, if so, schedule the at least one arithmetic instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one arithmetic instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: February 19, 2008

Date of Patent: July 19, 2011

Assignee: International Business Machines Corporation

Inventor: David A. Luick
Error detection device and method for error detection for a command decoder

Patent number: 7979783

Abstract: An error detection device for a command decoder is described, the command decoder reading out an associated sequence of control signal words from a command memory based on an input word, wherein the sequence of control signal words has at least one control signal word, having: a controller designed to provide the input word at a first time and the input word at a second time for reading out the command memory, wherein the second time is delayed with respect to the first time, to effect a readout of the sequence of control signal words at a first time and a readout of the sequence of control signal words at a second time; and a comparator designed to receive and compare the associated sequences of control signal words read out at the first and second times, and to output a signal indicating an error if the associated sequences of control signal words read out at the first and second times are different.

Type: Grant

Filed: February 8, 2007

Date of Patent: July 12, 2011

Assignee: Infineon Technologies AG

Inventors: Michael Goessel, Franz Klug, Steffen Marc Sonnekalb
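
Illustrative note on patent 7979783: the read-twice-and-compare idea in the abstract can be modeled in software as below. This is only a behavioral sketch; the abstract describes hardware, and the names `read_sequence`, `decode_with_check`, and the injectable `second_read` parameter (used here to simulate a faulty second readout) are hypothetical.

```python
def read_sequence(command_memory, input_word):
    """Return the sequence of control signal words stored for input_word."""
    return tuple(command_memory[input_word])


def decode_with_check(command_memory, input_word, second_read=read_sequence):
    """Read the same sequence twice and flag an error if the readouts differ."""
    first = read_sequence(command_memory, input_word)   # readout at the first time
    second = second_read(command_memory, input_word)    # delayed readout at the second time
    error = first != second                              # comparator output
    return first, error
```

A transient fault that corrupts only one of the two readouts makes the sequences differ, so the comparator raises the error signal.
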
Adaptive allocation of reservation station entries to an instruction set with variable operands in a microprocessor

Patent number: 7979677

Abstract: A method and device for adaptively allocating reservation station entries to an instruction set with variable operands in a microprocessor. The device includes logic for determining free reservation station queue positions in a reservation station. The device allocates an issue queue to an instruction and writes the instruction into the issue queue as an issue queue entry. The device reads an operand corresponding to the instruction from a general purpose register and writes the operand into a reservation station using one of the free reservation station positions as a write address. The device writes each reservation station queue position corresponding to said instruction into said issue queue entry. When the instruction is ready for issue to an execution unit, the device reads out the instruction from the issue queue entry and the reservation station queue positions to the execution unit.

Type: Grant

Filed: August 3, 2007

Date of Patent: July 12, 2011

Assignee: International Business Machines Corporation

Inventor: Dung Q. Nguyen
Data Parallel Function Call for Determining if Called Routine is Data Parallel

Publication number: 20110161623

Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.

Type: Application

Filed: December 30, 2009

Publication date: June 30, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
MECHANISM FOR SELECTING INSTRUCTIONS FOR EXECUTION IN A MULTITHREADED PROCESSOR

Publication number: 20110138153

Abstract: In one embodiment, a multithreaded processor includes a plurality of buffers, each configured to store instructions corresponding to a respective thread. The multithreaded processor also includes a pick unit coupled to the plurality of buffers. The pick unit may pick from at least one of the buffers in a given cycle, a valid instruction based upon a thread selection algorithm. The pick unit may further cancel, in the given cycle, the picking of the valid instruction in response to receiving a cancel indication.

Type: Application

Filed: February 14, 2011

Publication date: June 9, 2011

Inventor: Robert T. Golla
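
Illustrative note on publication 20110138153: a pick unit with per-thread buffers and a same-cycle cancel could be modeled as in the sketch below. The abstract does not name the thread selection algorithm; round-robin is assumed here purely for illustration, and the class name `PickUnit` and its methods are hypothetical.

```python
from collections import deque


class PickUnit:
    def __init__(self, num_threads):
        self.buffers = [deque() for _ in range(num_threads)]  # one buffer per thread
        self.next_thread = 0  # round-robin pointer (assumed selection algorithm)

    def pick(self, cancel=False):
        """Pick one valid instruction this cycle, or None if cancelled or nothing is ready."""
        n = len(self.buffers)
        for i in range(n):
            tid = (self.next_thread + i) % n
            if self.buffers[tid]:
                if cancel:
                    # Cancelled in the same cycle: the instruction stays buffered.
                    return None
                self.next_thread = (tid + 1) % n
                return tid, self.buffers[tid].popleft()
        return None
```

In this model a cancelled pick leaves both the buffer and the round-robin pointer untouched, so the same instruction is eligible again in the next cycle.
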
Processor

Patent number: 7953959

Abstract: A processor includes: an instruction buffer which holds a group of instructions that can be executed in parallel; an instruction decoding unit which decodes part or all of the group of instructions; and an instruction issuance control unit which detects whether or not a factor obstructing simultaneous execution of the group of instructions exists in the group of instructions and supplies the group of instructions to the instruction decoding unit by controlling the instruction buffer so that the instructions of the group of instructions are sequentially supplied when the factor exists and all the instructions of the group of instructions are simultaneously supplied when the factor does not exist.

Type: Grant

Filed: March 9, 2006

Date of Patent: May 31, 2011

Assignee: Panasonic Corporation

Inventor: Tetsu Hosoki
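
Illustrative note on patent 7953959: the issuance control described in the abstract, sequential supply when an obstructing factor exists and simultaneous supply otherwise, is sketched below. The abstract does not say what the factor is; a register dependence inside the group is assumed here, and the names `has_obstructing_factor` and `supply_to_decoder` are hypothetical.

```python
def has_obstructing_factor(group):
    """group is a list of (dest_reg, src_regs) tuples. A later instruction that reads
    or rewrites an earlier destination counts as the obstructing factor (assumption)."""
    written = set()
    for dest, srcs in group:
        if written & set(srcs) or dest in written:
            return True
        written.add(dest)
    return False


def supply_to_decoder(group):
    """Return the per-cycle batches handed to the instruction decoding unit."""
    if has_obstructing_factor(group):
        return [[insn] for insn in group]  # one instruction per cycle
    return [list(group)]                   # the whole group in a single cycle
```
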
Method and apparatus for separate control processing and data path processing in a dual path processor with a shared load/store unit

Patent number: 7949856

Abstract: According to embodiments of the invention, there is disclosed a computer processor architecture; and in particular a computer processor, a method of operating the same, and a computer program product that makes use of an instruction set for the computer.

Type: Grant

Filed: March 31, 2004

Date of Patent: May 24, 2011

Assignee: Icera Inc.

Inventor: Simon Knowles
Methods and apparatus for dynamic instruction controlled reconfigurable register file

Patent number: 7941648

Abstract: A scalable reconfigurable register file (SRRF) containing multiple register files, read and write multiplexer complexes, and a control unit operating in response to instructions is described. Multiple address configurations of the register files are supported by each instruction and different configurations are operable simultaneously during a single instruction execution. For example, with separate files of the size 32×32, supported configurations of 128×32 bits, 64×64 bits and 32×128 bits can be in operation each cycle. Single width, double width, quad width operands are optimally supported without increasing the register file size and without increasing the number of register file read or write ports.

Type: Grant

Filed: June 3, 2008

Date of Patent: May 10, 2011

Assignee: Altera Corporation

Inventors: Gerald George Pechanek, Edward A. Wolff
Voltage droop mitigation through instruction issue throttling

Patent number: 7937563

Abstract: A system and method for providing a digital real-time voltage droop detection and subsequent voltage droop reduction. A scheduler within a reservation station may store a weight value for each instruction corresponding to node capacitance switching activity for the instruction derived from pre-silicon power modeling analysis. For instructions picked with available source data, the corresponding weight values are summed together to produce a local current consumption value and this value is summed with any existing global current consumption values from corresponding schedulers of other processor cores yielding an activity event. The activity event is stored. Hashing functions within the scheduler are used to determine both a recent and an old activity average using the calculated activity event and stored older activity events.

Type: Grant

Filed: May 27, 2008

Date of Patent: May 3, 2011

Assignee: Advanced Micro Devices, Inc.

Inventors: Samuel D. Naffziger, Michael Gerard Butler
Preparing instruction groups for a processor having multiple issue ports

Patent number: 7934203

Abstract: During program code conversion, such as in a dynamic binary translator, automatic code generation provides target code 21 executable by a target processor 13. Multiple instruction ports 610 disperse a group of instructions to functional units 620 of the processor 13. Disclosed is a mechanism of preparing an instruction group 606 using a plurality of pools 700 having a hierarchical structure 711-715. Each pool represents a different overlapping subset of the issue ports 610. Placing an instruction 600 into a particular pool 700 also reduces vacancies in any one or more subsidiary pools in the hierarchy. In a preferred embodiment, a counter value 702 is associated with each pool 700 to track vacancies. A valid instruction group 606 is formed by picking the placed instructions 600 from the pools 700. The instruction groups are generated accurately and automatically. Decoding errors and stalls are minimized or completely avoided.

Type: Grant

Filed: May 27, 2005

Date of Patent: April 26, 2011

Assignee: International Business Machines Corporation

Inventors: William O. Lovett, David Haikney, Matthew Evans
Method and apparatus for separate control processing and data path processing in a dual path processor with a shared load/store unit

Patent number: 7921277

Abstract: According to embodiments of the invention, there is disclosed a computer processor architecture; and in particular a computer processor, a method of operating the same, and a computer program product that makes use of an instruction set for the computer.

Type: Grant

Filed: March 31, 2004

Date of Patent: April 5, 2011

Assignee: Icera Inc.

Inventor: Simon Knowles
Mechanism for selecting instructions for execution in a multithreaded processor

Patent number: 7890734

Abstract: In one embodiment, a multithreaded processor includes a plurality of buffers, each configured to store instructions corresponding to a respective thread. The multithreaded processor also includes a pick unit coupled to the plurality of buffers. The pick unit may pick from at least one of the buffers in a given cycle, a valid instruction based upon a thread selection algorithm. The pick unit may further cancel, in the given cycle, the picking of the valid instruction in response to receiving a cancel indication.

Type: Grant

Filed: June 30, 2004

Date of Patent: February 15, 2011

Assignee: Open Computing Trust I & II

Inventor: Robert T. Golla
Multi-threading processors, integrated circuit devices, systems, and processes of operation and manufacture

Patent number: 7890735

Abstract: A multi-threaded microprocessor (1105) for processing instructions in threads. The microprocessor (1105) includes first and second decode pipelines (1730.0, 1730.1), first and second execute pipelines (1740, 1750), and coupling circuitry (1916) operable in a first mode to couple first and second threads from the first and second decode pipelines (1730.0, 1730.1) to the first and second execute pipelines (1740, 1750) respectively, and the coupling circuitry (1916) operable in a second mode to couple the first thread to both the first and second execute pipelines (1740, 1750). Various processes of manufacture, articles of manufacture, processes and methods of operation, circuits, devices, and systems are disclosed.

Type: Grant

Filed: August 23, 2006

Date of Patent: February 15, 2011

Assignee: Texas Instruments Incorporated

Inventor: Thang Tran
System and method for the scheduling of load instructions within a group priority issue schema for a cascaded pipeline

Patent number: 7882335

Abstract: The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one load instruction is in the issue group and, if so, schedule the at least one load instruction in a first pipeline based upon a priority list; and (3) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: February 19, 2008

Date of Patent: February 1, 2011

Assignee: International Business Machines Corporation

Inventor: David A. Luick
System and method for prioritizing compare instructions

Patent number: 7877579

Abstract: The present invention provides a system and method for prioritizing compare instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one compare instruction is in the issue group and, if so, schedule the at least one compare instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one compare instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: February 19, 2008

Date of Patent: January 25, 2011

Assignee: International Business Machines Corporation

Inventor: David A. Luick
High speed multi-threaded reduced instruction set computer (RISC) processor with hardware-implemented thread scheduler

Patent number: 7873817

Abstract: A reduced instruction set computer (RISC) processor includes a processing core, which is arranged to process a software thread. A hardware-implemented scheduler is arranged to receive respective contexts of a plurality of software threads, to determine a schedule for processing of the software threads by the processing core, and to serve the contexts to the processing core in accordance with the schedule.

Type: Grant

Filed: October 18, 2005

Date of Patent: January 18, 2011

Inventors: Eli Aloni, Gilad Ayalon, Oren David
System and method for prioritizing branch instructions

Patent number: 7870368

Abstract: The present invention provides a system and method for prioritizing branch instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one branch instruction is in the issue group and, if so, schedule the at least one branch instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one branch instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: February 19, 2008

Date of Patent: January 11, 2011

Assignee: International Business Machines Corporation

Inventor: David A. Luick
System and method for prioritizing store instructions

Patent number: 7865700

Abstract: The present invention provides a system and method for prioritizing store instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one store instruction is in the issue group and, if so, schedule the at least one store instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one store instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: February 19, 2008

Date of Patent: January 4, 2011

Assignee: International Business Machines Corporation

Inventor: David A. Luick
Processor instruction including option bits encoding which instructions of an instruction packet to execute

Patent number: 7861061

Abstract: A processor and a method for executing VLIW instructions by first fetching a VLIW instruction and then identifying from option bits encoded in a first one of the instructions within the fetched VLIW instruction packet which, if any, of the remaining instructions within the VLIW instruction are to be executed in the same execution cycle as the first instruction. Finally, executing the first instruction and any remaining instructions identified from the encoded option bits.

Type: Grant

Filed: May 23, 2003

Date of Patent: December 28, 2010

Assignee: STMicroelectronics (R&D) Ltd.

Inventor: Zahid Hussain
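
Illustrative note on patent 7861061: the option-bit selection in the abstract can be sketched as follows. The bit layout is an assumption made for this example (bit `i-1` of the first instruction's option bits selects the instruction in slot `i`); the abstract does not specify the encoding, and the name `select_instructions` is hypothetical.

```python
def select_instructions(packet):
    """packet is a list of (opcode, option_bits) pairs for one fetched VLIW packet.
    Only the first entry's option_bits are consulted (assumed encoding:
    bit i-1 set means 'also execute packet[i] in the same cycle')."""
    first_opcode, option_bits = packet[0]
    selected = [first_opcode]  # the first instruction always executes
    for i in range(1, len(packet)):
        if option_bits & (1 << (i - 1)):
            selected.append(packet[i][0])
    return selected
```

With option bits `0b101` in a four-slot packet, the first instruction executes together with the instructions in slots 1 and 3, and slot 2 is skipped.
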
Register renaming of a partially updated data granule

Publication number: 20100312989

Abstract: A processor 2 supporting register renaming has a rename table 20 in which the flag register has multiple tag values associated therewith. These tag values indicate which virtual register corresponds to a destination flag register of the oldest instruction which wrote a still up-to-date value of a subset of the flags.

Type: Application

Filed: June 4, 2009

Publication date: December 9, 2010

Inventor: James Nolan Hardage
Apparatus for adjusting instruction thread priority in a multi-thread processor

Patent number: 7827388

Abstract: Each instruction thread in an SMT processor is associated with a software assigned base input processing priority. Unless some predefined event or circumstance occurs with an instruction being processed or to be processed, the base input processing priorities of the respective threads are used to determine the interleave frequency between the threads according to some instruction interleave rule. However, upon the occurrence of some predefined event or circumstance in the processor related to a particular instruction thread, the base input processing priority of one or more instruction threads is adjusted to produce one or more adjusted priority values. The instruction interleave rule is then enforced according to the adjusted priority value or values together with any base input processing priority values that have not been subject to adjustment.

Type: Grant

Filed: March 7, 2008

Date of Patent: November 2, 2010

Assignee: International Business Machines Corporation

Inventors: John Wesley Ward, III, Minh Michelle Quy Pham, Ronald Nick Kalla, Balaram Sinharoy
Acting on a subject system

Patent number: 7822592

Abstract: In an active system, an actor is able to effect action in a subject system. The actor and the subject system exist in an environment which can impact the subject system. Neither the actor nor the subject system has any control over the environment. The actor includes a model and a processor. The processor is guided by the model. The processor is arranged to effect action in the subject system. The subject system is known by the model. This allows the actor to be guided in its action on the subject system by the model of the subject system. Events can occur in the subject system either through the actions of the actor, as guided by the model, or through actions of other actors, or through a change in state of the subject system itself (e.g. the progression of a chemical reaction) or its environment (e.g. the passage of time). The actor keeps the model updated with its own actions.

Type: Grant

Filed: October 17, 2005

Date of Patent: October 26, 2010

Assignee: Manthatron-IP Limited

Inventor: Peter Hawkins
System and Method for Group Formation with Multiple Taken Branches Per Group

Publication number: 20100257340

Abstract: Disclosed are a method and a system for grouping processor instructions for execution by a processor, where the group of processor instructions includes at least two branch processor instructions. In one or more embodiments, an instruction buffer can decouple an instruction fetch operation from an instruction decode operation by storing fetched processor instructions in the instruction buffer until the fetched processor instructions are ready to be decoded. Group formation can involve removing processor instructions from the instruction buffer and routing the processor instruction to latches that convey the processor instructions to decoders. Processor instructions that are removed from instruction buffer in a single clock cycle can be called a group of processor instructions. In one or more embodiments, the first instruction in the group must be the oldest instruction in the instruction buffer and instructions must be removed from the instruction buffer ordered from oldest to youngest.

Type: Application

Filed: April 3, 2009

Publication date: October 7, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Richard William Doing, Kevin Neal Magil, Balaram Sinharoy, Jeffrey R. Summers, James A. Van Norstrand, JR.
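
Illustrative note on publication 20100257340: the oldest-to-youngest group formation described in the abstract might be modeled as below. The limits `max_size=4` and `max_branches=2` and the string encoding of instruction kinds are assumptions for this sketch, as is the function name `form_group`.

```python
from collections import deque


def form_group(buffer, max_size=4, max_branches=2):
    """Remove up to max_size instructions from the left (oldest) end of buffer.
    The group always starts with the oldest instruction and stays in age order."""
    group, branches = [], 0
    while buffer and len(group) < max_size:
        kind = buffer[0]  # peek at the oldest remaining instruction
        if kind == "branch":
            if branches == max_branches:
                break     # a further branch waits for the next group
            branches += 1
        group.append(buffer.popleft())
    return group
```

Everything returned by one call corresponds to the instructions removed in a single clock cycle, which is what the abstract calls a group.
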
Parallel generating of bundles of data objects

Patent number: 7810084

Abstract: Computer-implemented methods, computer systems and computer program products are provided for parallel processing a plurality of data objects with a plurality of processors. As disclosed herein, the data objects to be assembled for further processing may be in bundles, the bundles obeying first predefined criteria, which is dynamically controlled by using a bundle specific master table. The methods and systems may generate pipelines of data objects by pre-selecting and grouping the data objects according to second predefined criteria by a first group of the plurality of processors, and create the bundles from each pipeline of the pre-selected data objects by a second group of the plurality of processors.

Type: Grant

Filed: June 1, 2006

Date of Patent: October 5, 2010

Assignee: SAP AG

Inventor: Karsten S. Egetoft
Trace optimization via fusing operations of a target architecture operation set

Patent number: 7797517

Abstract: Reference architecture instructions are translated into target architecture operations. Sequences of operations, in a predicted execution order in some embodiments, form traces. In some embodiments, a trace is based on a plurality of basic blocks. In some embodiments, a trace is committed or aborted as a single entity. Sequences of operations are optimized by fusing collections of operations; fused operations specify a same observable function as respective collections, but advantageously enable more efficient processing. In some embodiments, a collection comprises multiple register operations. Fusing a register operation with a branch operation in a trace forms a fused reg-op/branch operation. In some embodiments, branch instructions translate into assert operations. Fusing an assert operation with another operation forms a fused assert operation. In some embodiments, fused operations only set architectural state, such as high-order portions of registers, that is subsequently read before being written.

Type: Grant

Filed: November 17, 2006

Date of Patent: September 14, 2010

Assignee: Oracle America, Inc.

Inventor: John Gregory Favor
Processing pipeline having parallel dispatch and method thereof

Patent number: 7793080

Abstract: One or more processor cores of a multiple-core processing device each can utilize a processing pipeline having a plurality of execution units (e.g., integer execution units or floating point units) that together share a pre-execution front-end having instruction fetch, decode and dispatch resources. Further, one or more of the processor cores each can implement dispatch resources configured to dispatch multiple instructions in parallel to multiple corresponding execution units via separate dispatch buses. The dispatch resources further can opportunistically decode and dispatch instruction operations from multiple threads in parallel so as to increase the dispatch bandwidth. Moreover, some or all of the stages of the processing pipelines of one or more of the processor cores can be configured to implement independent thread selection for the corresponding stage.

Type: Grant

Filed: December 31, 2007

Date of Patent: September 7, 2010

Inventors: Gene Shen, Sean Lie
Symbolic store-load bypass

Patent number: 7779236

Abstract: The invention provides a method and system for operating a pipelined microprocessor more quickly, by detecting instructions that load from identical memory locations as were recently stored to, without having to actually compute the referenced external memory addresses. The microprocessor examines the symbolic structure of instructions as they are encountered, so as to be able to detect identical memory locations by examination of their symbolic structure. For example, in a preferred embodiment, instructions that store to and load from an identical offset from an identical register are determined to be referencing the identical memory location, without having to actually compute the complete physical target address.

Type: Grant

Filed: November 19, 1999

Date of Patent: August 17, 2010

Assignee: STMicroelectronics, Inc.

Inventor: David L. Isaman
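
Illustrative note on patent 7779236: the symbolic match in the abstract, same offset from the same register, can be sketched with a small table keyed on `(base register, offset)`. The class name `SymbolicBypass` and its methods are hypothetical, and invalidating entries when the base register is rewritten is an assumption of this sketch rather than something the abstract states.

```python
class SymbolicBypass:
    def __init__(self):
        self.recent_stores = {}  # (base_register, offset) -> stored value

    def store(self, base, offset, value):
        # Record the store under its symbolic address; no address arithmetic is done.
        self.recent_stores[(base, offset)] = value

    def write_register(self, reg):
        # Once reg changes, old (reg, offset) keys no longer name the same location.
        self.recent_stores = {
            key: val for key, val in self.recent_stores.items() if key[0] != reg
        }

    def load(self, base, offset):
        """Return (hit, value); a hit means the load can take its data from the store."""
        key = (base, offset)
        if key in self.recent_stores:
            return True, self.recent_stores[key]
        return False, None
```

The sketch ignores aliasing, where a store through a different register targets the same memory location; a real design would have to account for that case.
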
Scalable method for producer and consumer elimination

Patent number: 7779165

Abstract: Producer and consumer processes may synchronize and transfer data using a shared data structure. After locating a potential transfer location that indicates an EMPTY status, a producer may store data to be transferred in the transfer location. A producer may use a compare-and-swap (CAS) operation to store the transfer data to the transfer location. A consumer may subsequently read the transfer data from the transfer location and store, such as by using a CAS operation, a DONE status indicator in the transfer location. The producer may notice the DONE indication and may then set the status location back to EMPTY to indicate that the location is available for future transfers, by the same or a different producer. The producer may also monitor the transfer location and time out if no consumer has picked up the transfer data.

Type: Grant

Filed: January 4, 2006

Date of Patent: August 17, 2010

Assignee: Oracle America, Inc.

Inventors: Mark S. Moir, Daniel S. Nussbaum, Ori Shalev, Nir N. Shavit
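
Illustrative note on patent 7779165: the EMPTY, data, DONE, EMPTY cycle of a transfer location can be sketched as below. Python has no hardware compare-and-swap, so `Slot.cas` simulates one with a lock; that substitution, along with the names `Slot`, `produce`, `consume`, `producer_finish`, and `retract`, is an assumption of this sketch.

```python
import threading

EMPTY, DONE = object(), object()  # sentinel status values


class Slot:
    """One transfer location; cas() simulates compare-and-swap with a lock."""

    def __init__(self):
        self._value = EMPTY
        self._lock = threading.Lock()

    def cas(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

    def read(self):
        with self._lock:
            return self._value


def produce(slot, item):
    return slot.cas(EMPTY, item)           # succeeds only if the slot is EMPTY


def consume(slot):
    item = slot.read()
    if item is EMPTY or item is DONE:
        return None
    return item if slot.cas(item, DONE) else None  # mark DONE after taking the data


def producer_finish(slot):
    return slot.cas(DONE, EMPTY)           # producer saw DONE and frees the slot


def retract(slot, item):
    return slot.cas(item, EMPTY)           # timeout path: withdraw an unclaimed item
```

Because every state change goes through `cas`, a producer that times out and a consumer that arrives late cannot both succeed on the same item: whichever swap lands first wins and the other fails.
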
Register file

Publication number: 20100199072

Abstract: A register file comprising a plurality of register entries for storing data values for use in the execution of data processing instructions is provided, and comprises at least one write port and at least one read port, and circuitry responsive to a write request received at said at least one write port to update one of said plurality of register entries identified by an address specified by said write request with a data value specified by said write request. The register file also comprises further circuitry responsive to a received control signal to set at least a portion of a predetermined register entry to a predetermined value. In this way, certain register file updating instructions can be executed in parallel with other instructions without the need for additional full write-ports as would be required for typical dual-issue, thereby reducing area and routing complexity and cost compared with the use of an additional write-port due to the lower gate count required by the proposed further circuitry.

Type: Application

Filed: February 2, 2009

Publication date: August 5, 2010

Applicant: ARM LIMITED

Inventor: Simon John Craske
Parallel operation device allowing efficient parallel operational processing

Patent number: 7769980

Abstract: In arithmetic/logic units (ALU) provided corresponding to entries, an MIMD instruction decoder generating a group of control signals in accordance with a Multiple Instruction-Multiple Data (MIMD) instruction and an MIMD register storing data designating the MIMD instruction are provided, and an inter-ALU communication circuit is provided. The amount and direction of movement of the inter-ALU communication circuit are set by data bits stored in a movement data register. It is possible to execute data movement and arithmetic/logic operation with the amount of movement and operation instruction set individually for each ALU unit. Therefore, in a Single Instruction-Multiple Data type processing device, Multiple Instruction-Multiple Data operation can be executed at high speed in a flexible manner.

Type: Grant

Filed: August 16, 2007

Date of Patent: August 3, 2010

Assignee: Renesas Technology Corp.

Inventors: Toshinori Sueyoshi, Masahiro Iida, Mitsutaka Nakano, Fumiaki Senoue, Katsuya Mizumoto
Method for allocating registers using simulated annealing controlled instruction scheduling

Patent number: 7761691

Abstract: A method for scheduling instructions for clustered digital signal processors comprising a plurality of clusters, each cluster including at least two functional units and a first register file having a first unit, a second unit and a single set of access ports shared by the functional units comprises steps of checking whether executing one instruction needs data to be read from the first unit and the second unit of the first register file, generating a copying instruction to transfer data from the first unit to the second unit of the first register file, checking whether there is a prior operation cycle available to perform the copying instruction, scheduling the copying instruction in the prior operation cycle, and scheduling the instruction after the copying instruction.

Type: Grant

Filed: October 27, 2005

Date of Patent: July 20, 2010

Assignee: National Tsing Hua University

Inventors: Chung-Lin Tang, Yung-Chia Lin, Jenq-Kuen Lee
PROCESSOR HAVING RECONFIGURABLE ARITHMETIC ELEMENT

Publication number: 20100174884

Abstract: A processor (101) in which a plurality of arithmetic elements executing instructions are embedded includes: fixed function arithmetic elements (121 to 123) each having a circuit configuration that is not dynamically reconfigurable; a reconfigurable arithmetic element (125) having a circuit configuration that is dynamically reconfigurable; and an arithmetic operation control unit (113) which allocates instructions to the fixed function arithmetic elements (121 to 123) and the reconfigurable arithmetic element (125) and issues the allocated instructions to the respective arithmetic elements.

Type: Application

Filed: November 9, 2006

Publication date: July 8, 2010

Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.

Inventors: Hiroyuki Morishita, Takao Yamamoto, Masaitsu Nakajima
INFORMATION HANDLING SYSTEM WITH REAL AND VIRTUAL LOAD/STORE INSTRUCTION ISSUE QUEUE

Publication number: 20100161945

Abstract: An information handling system includes a processor that may perform issue queue virtual load/store instruction operations. The issue queue maintains load and store instructions with a real/virtual dependency flag. The issue queue provides storage resources for real and virtual load/store instructions. Real load/store instructions execute in a load store unit LSU. Virtual load/store instructions are pending execution in the LSU. The LSU may keep track of each virtual load/store instruction within the issue queue by thread, type, and pointer data. Provided that all dependencies are clear for a pending virtual load/store instruction, the LSU marks the pending virtual load/store instruction as real. The pending virtual load/store instruction may then issue to the LSU as a real load/store instruction.

Type: Application

Filed: December 22, 2008

Publication date: June 24, 2010

Applicants: International Business Machines Corporation, IBM Corporation

Inventors: William E. Burky, Kurt A. Feiste, Dung Quoc Nguyen, Balaram Sinharoy, Albert Thomas Williams
Method and apparatus for increasing load bandwidth

Patent number: 7739483

Abstract: A method and apparatus for dual-target register allocation is described, intended to enable the efficient mapping/renaming of registers associated with instructions within a pipelined microprocessor architecture.

Type: Grant

Filed: September 28, 2001

Date of Patent: June 15, 2010

Assignee: Intel Corporation

Inventors: Rajesh Patel, James Dundas, Adi Yoaz
PARALLELING PROCESSING METHOD, SYSTEM AND PROGRAM

Publication number: 20100138810

Abstract: Paralleling processing system and method. When clusters are formed based on strongly connected components, a single cluster (fat cluster) having at least a predetermined number of blocks, or an expected processing time exceeding a predetermined threshold, is formed. The fat cluster is subjected to an unrolling process to make multiple copies of the processing of the fat cluster and to assign the copies to individual processors. Processing of the fat cluster is executed by the multiple processor devices in a pipelined manner. If a fat cluster to be iteratively executed cannot be executed in the pipelined manner because a processing result of an nth iteration of the fat cluster depends on a processing result of a preceding iteration of the fat cluster, an input value needed for execution of the fat cluster is generated based on a certain prediction, and the fat cluster is speculatively executed.

Type: Application

Filed: December 2, 2009

Publication date: June 3, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Hideaki Komatsu, Arquimedes Martinez Canedo, Takeo Yoshizawa
Methods and apparatus for independent processor node operations in a SIMD array processor

Patent number: 7730280

Abstract: A control processor is used for fetching and distributing single instruction multiple data (SIMD) instructions to a plurality of processing elements (PEs). One of the SIMD instructions is a thread start (Tstart) instruction, which causes the control processor to pause its instruction fetching. A local PE instruction memory (PE Imem) is associated with each PE and contains local PE instructions for execution on the local PE. Local PE Imem fetch, decode, and execute logic are associated with each PE. Instruction path selection logic in each PE is used to select between control processor distributed instructions and local PE instructions fetched from the local PE Imem. Each PE is also initialized to receive control processor distributed instructions. In addition, local hold generation logic is associated with each PE. A PE receiving a Tstart instruction causes the instruction path selection logic to switch to fetch local PE Imem instructions.

Type: Grant

Filed: April 18, 2007

Date of Patent: June 1, 2010

Assignee: Vicore Technologies, Inc.

Inventors: Gerald George Pechanek, Edwin Franklin Barry, Mihailo M. Stojancic
MULTI-CORE MICROCONTROLLER HAVING COMPARATOR FOR CHECKING PROCESSING RESULT

Publication number: 20100131741

Abstract: A microcontroller capable of improving processing performance as a whole by executing different programs by a plurality of CPUs and capable of detecting abnormality for safety-required processing by evaluating results of the same processing executed by the plurality of CPUs. A plurality of processing systems including CPUs and memories are provided, data output from the CPUs in each of the processing systems is separately compressed and stored by compressors for each of the CPUs, respectively. The compressed storage data is mutually compared by a comparator, and abnormality of processing can be detected when the comparison result indicates a mismatch. Even when the timings by which the same processing results are obtained are different when the plurality of CPUs asynchronously execute the same processing, the processing results of both of them can be easily compared with each other since compression is carried out by the compressors.

Type: Application

Filed: November 2, 2009

Publication date: May 27, 2010

Inventors: Hiromichi Yamada, Kotaro Shimamura, Kesami Hagiwara, Yoshikazu Kiyoshige, Yuichi Ishiguro
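The idea of compressing each CPU's output separately and comparing only the compressed values can be illustrated as follows. This is a sketch under assumptions: the abstract does not name a compression scheme, so CRC-32 from Python's `zlib` module stands in for the compressor, and `signature` and `results_match` are hypothetical names.

```python
import zlib

def signature(outputs):
    """Fold a stream of 32-bit output words into a single CRC-32 value.

    CRC-32 is an assumed stand-in for the compressor; the value depends only
    on the sequence of words, not on when each word was produced.
    """
    crc = 0
    for word in outputs:
        crc = zlib.crc32(word.to_bytes(4, "little"), crc)
    return crc

def results_match(cpu_a_outputs, cpu_b_outputs):
    # A mismatch between the two signatures indicates abnormal processing.
    return signature(cpu_a_outputs) == signature(cpu_b_outputs)
```

Since only the final signatures are compared, two CPUs that run the same processing asynchronously can be checked without aligning their outputs cycle by cycle.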
High-performance, superscalar-based computer system with out-of-order instruction execution

Patent number: 7721070

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.

Type: Grant

Filed: September 22, 2008

Date of Patent: May 18, 2010

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
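The in-order retirement rule in the abstract above (results wait in temporary registers until all prior instructions have executed) can be sketched with a small buffer model. The class `ReorderBuffer` and its methods are hypothetical names for illustration, not structures taken from the patent.

```python
from collections import OrderedDict

class ReorderBuffer:
    """Holds results in temporary storage and retires them in program order."""
    def __init__(self):
        self.pending = OrderedDict()  # tag -> (dest register, value, done flag)
        self.registers = {}           # architectural register file

    def issue(self, tag, dest):
        # Entries are inserted in program order.
        self.pending[tag] = (dest, None, False)

    def complete(self, tag, value):
        # Functional units may finish in any order; the entry keeps its position.
        dest, _, _ = self.pending[tag]
        self.pending[tag] = (dest, value, True)

    def retire(self):
        # Commit results from the oldest entry onward, stopping at the first
        # instruction that has not finished executing.
        retired = []
        while self.pending:
            tag, (dest, value, done) = next(iter(self.pending.items()))
            if not done:
                break
            self.registers[dest] = value
            del self.pending[tag]
            retired.append(tag)
        return retired
```

An instruction that finishes early stays in `pending` until every older instruction has completed, so the register file only ever reflects a prefix of the program.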
Mechanism for scheduling execution of threads for fair resource allocation in a multi-threaded and/or multi-core processing system

Patent number: 7707578

Abstract: A thread scheduling mechanism is provided that flexibly enforces performance isolation of multiple threads to alleviate the effect of anti-cooperative execution behavior with respect to a shared resource, for example, hoarding a cache or pipeline, using the hardware capabilities of simultaneous multi-threaded (SMT) or multi-core processors. Given a plurality of threads running on at least two processors in at least one functional processor group, the occurrence of a rescheduling condition indicating anti-cooperative execution behavior is sensed, and, if present, at least one of the threads is rescheduled such that the first and second threads no longer execute in the same functional processor group at the same time.

Type: Grant

Filed: December 16, 2004

Date of Patent: April 27, 2010

Assignee: VMware, Inc.

Inventors: John R. Zedlewski, Carl A. Waldspurger
Scheduling compatible threads in a simultaneous multi-threading processor using cycle per instruction value occurred during identified time interval

Patent number: 7698707

Abstract: Identifying compatible threads in a Simultaneous Multithreading (SMT) processor environment is provided by calculating a performance metric, such as cycles per instruction (CPI), that occurs when two threads are running on the SMT processor. The CPI that is achieved when both threads are executing on the SMT processor is determined. If the CPI that was achieved is better than the compatibility threshold, then information indicating the compatibility is recorded. When a thread is about to complete, the scheduler looks at the run queue to which the completing thread belongs to dispatch another thread. The scheduler identifies a thread that is (1) compatible with the thread that is still running on the SMT processor (i.e., the thread that is not about to complete), and (2) ready to execute. The CPI data is continually updated so that threads that are compatible with one another are continually identified.

Type: Grant

Filed: February 25, 2008

Date of Patent: April 13, 2010

Assignee: International Business Machines Corporation

Inventors: Jos Manuel Accapadi, Andrew Dunshea, Dirk Michel, Mysore Sathyanarayana Srinivas
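The record-then-select behaviour in the abstract above can be sketched as a small table keyed by thread pairs. Several details are assumptions made for this example: "better" is taken to mean a lower CPI, the fallback to the head of the run queue is invented, and `CompatibilityTable`, `record`, and `pick_next` are hypothetical names.

```python
class CompatibilityTable:
    """Tracks which thread pairs achieved a CPI better than the threshold."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.compatible = set()

    def record(self, a, b, cpi):
        # Assumption: a lower CPI is "better". Re-recording a pair updates
        # its status, mirroring the continual update of CPI data.
        pair = frozenset((a, b))
        if cpi < self.threshold:
            self.compatible.add(pair)
        else:
            self.compatible.discard(pair)

    def pick_next(self, running, run_queue):
        # Prefer the first ready thread known to be compatible with the
        # thread that is still running.
        for thread in run_queue:
            if frozenset((running, thread)) in self.compatible:
                return thread
        # Assumed fallback: take the head of the run queue, if any.
        return run_queue[0] if run_queue else None
```

Using a `frozenset` as the key makes the pairing symmetric, so compatibility recorded for (A, B) is also found when looking up (B, A).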
Predicated launching of compute thread arrays

Patent number: 7697007

Abstract: A controlling process may enable or disable the launching of a predicated process that has already been queued for launching, e.g. via a pushbuffer. The controlling process generates a report so that launching of the predicated process is enabled or disabled based on the report. The predicate may be global in application to enable or disable all subsequent launch commands. Alternatively, the predicate may be specific to one or more predicated processes. In an embodiment with a central processing unit (CPU) coupled to a graphics processing unit (GPU), the CPU may generate the controlling process that enables or disables the launch of the predicated process. Alternatively or additionally, the GPU may generate the controlling process that enables or disables the launch of the predicated process.

Type: Grant

Filed: July 12, 2006

Date of Patent: April 13, 2010

Assignee: NVIDIA Corporation

Inventor: Jerome F. Duluk, Jr.
Asynchronous multiple-order issue system architecture

Patent number: 7698535

Abstract: An asynchronous circuit is described for processing units of data having a program order associated therewith. The circuit includes an N-way-issue resource comprising N parallel pipelines. Each pipeline is operable to transmit a subset of the units of data in a first-in-first-out manner. The asynchronous circuit is operable to sequentially control transmission of the units of data in the pipelines such that the program order is maintained.

Type: Grant

Filed: September 16, 2003

Date of Patent: April 13, 2010

Assignee: Fulcrum Microsystems, Inc.

Inventors: Andrew Lines, Robert Southworth, Uri Cummings
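One way to see how N parallel first-in-first-out pipelines can still preserve program order is a round-robin split and merge. The abstract does not say that round-robin control is used, so that choice, together with the names `distribute` and `collect`, is an assumption made purely for illustration.

```python
from collections import deque

def distribute(items, n):
    """Send item i to pipeline i mod n; each pipeline is a FIFO."""
    pipes = [deque() for _ in range(n)]
    for i, item in enumerate(items):
        pipes[i % n].append(item)
    return pipes

def collect(pipes):
    """Drain the pipelines in the same rotating order to restore program order."""
    out = []
    i = 0
    while pipes[i % len(pipes)]:
        out.append(pipes[i % len(pipes)].popleft())
        i += 1
    return out
```

Each pipeline carries only a subset of the data units, yet reading them back in the same rotation used to write them yields the original sequence.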
Pipeline replay support for multi-cycle operations

Patent number: 7685403

Abstract: Instructions asserted in the instruction pipeline (3) of the microprocessor are accompanied by control information, comprising a group of bits, asserted within a control information pipeline (15) of the processor. The control information pipeline is synchronized to the instruction pipeline so that the control information for an instruction progresses in synchronism with the instruction. The control information may identify, directly or indirectly, the type of operation called for by the instruction and, if the operation is to be performed in parts, indicate the part to be performed. Means are included in the processor, such as a number of functional execution units (7), to interpret that control information and take appropriate action.

Type: Grant

Filed: June 16, 2003

Date of Patent: March 23, 2010

Inventors: Brett Coon, Godfrey D'Souza, Paul Serris
Method and apparatus for providing large register address space while maximizing cycletime performance for a multi-threaded register file set

Patent number: 7681018

Abstract: A parallel hardware-based multithreaded processor is described. The processor includes a general purpose processor that coordinates system functions and a plurality of microengines that support multiple hardware threads or contexts. The processor also includes a memory control system that has a first memory controller that sorts memory references based on whether the memory references are directed to an even bank or an odd bank of memory and a second memory controller that optimizes memory references based upon whether the memory references are read references or write references. Instructions for switching and branching based on executing contexts are also disclosed.

Type: Grant

Filed: January 12, 2001

Date of Patent: March 16, 2010

Assignee: Intel Corporation

Inventors: Gilbert Wolrich, Matthew J. Adiletta, William Wheeler
DUAL-ISSUANCE OF MICROPROCESSOR INSTRUCTIONS USING DUAL DEPENDENCY MATRICES

Publication number: 20100064121

Abstract: A dual-issue instruction is decoded to determine a plurality of LSU dependencies needed by an LSU part of the dual-issue instruction and a plurality of non-LSU dependencies needed by a non-LSU part of the dual-issue instruction. During dispatch of the dual-issue instruction by the microprocessor, the dual dependency matrices are employed as follows: a Load-Store Unit (LSU) dependency matrix is written with the plurality of LSU dependencies and a non-LSU dependency matrix is written with the plurality of non-LSU dependencies; an LSU issue valid (LSU IV) indicator is set as valid to issue; an LSU portion of the dual-issue instruction is issued once the plurality of LSU dependencies of the dual issue instruction are satisfied; a non-LSU issue valid (non-LSU IV) indicator is set as valid to issue; and a non-LSU portion of the dual-issue instruction is issued once the plurality of non-LSU dependencies of the dual issue instruction are satisfied.

Type: Application

Filed: September 11, 2008

Publication date: March 11, 2010

Applicant: International Business Machines Corporation

Inventors: Gregory W. Alexander, Brian D. Barrick, Lee E. Eisen, John W. Ward, III
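The two-matrix bookkeeping in the abstract above can be modelled for a single instruction as two dependency sets with one issue-valid flag each. This is a simplified sketch: real dependency matrices are bit arrays spanning the whole issue queue, clearing the flag after issue is an assumption, and `DualIssueEntry`, `resolve`, and `issue_ready` are hypothetical names.

```python
class DualIssueEntry:
    """One dual-issue instruction with separate LSU and non-LSU dependencies."""
    def __init__(self, lsu_deps, non_lsu_deps):
        self.lsu_deps = set(lsu_deps)          # row of the LSU dependency matrix
        self.non_lsu_deps = set(non_lsu_deps)  # row of the non-LSU matrix
        self.lsu_iv = True                     # LSU issue valid indicator
        self.non_lsu_iv = True                 # non-LSU issue valid indicator

    def resolve(self, producer):
        # A producing instruction has finished: clear it from both rows.
        self.lsu_deps.discard(producer)
        self.non_lsu_deps.discard(producer)

    def issue_ready(self):
        # Issue each part independently once its own dependencies are
        # satisfied. Assumption: the valid flag is cleared after issue so
        # that a part issues only once.
        issued = []
        if self.lsu_iv and not self.lsu_deps:
            self.lsu_iv = False
            issued.append("LSU")
        if self.non_lsu_iv and not self.non_lsu_deps:
            self.non_lsu_iv = False
            issued.append("non-LSU")
        return issued
```

Keeping the two dependency sets separate lets the LSU part issue as soon as its own operands are ready, without waiting on dependencies that matter only to the non-LSU part.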
Minimizing unscheduled D-cache miss pipeline stalls in a cascaded delayed execution pipeline

Patent number: 7676656

Abstract: A method and apparatus for minimizing unscheduled D-cache miss pipeline stalls is provided. In one embodiment, execution of an instruction in a processor is scheduled. The processor may have at least one cascaded delayed execution pipeline unit having two or more execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The method includes receiving an issue group of instructions, determining if a first instruction in the issue group is a load instruction, and if so, scheduling the first instruction to be executed in a pipeline in which execution is not delayed with respect to another pipeline in the cascaded delayed execution pipeline unit.

Type: Grant

Filed: July 2, 2008

Date of Patent: March 9, 2010

Assignee: International Business Machines Corporation

Inventor: David A. Luick
