Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)
  • Patent number: 7971034
    Abstract: A method, system, and computer program product for reduced overhead address mode change management in a pipelined, recycling microprocessor are provided. The recycling microprocessor includes logic executing thereon. The microprocessor also includes an instruction fetch unit (IFU) supporting computation of address adds in selected address modes and reporting non-equal comparison of the computation to the logic. The microprocessor further includes a fixed point unit determining whether the mode has changed and reporting changes to the logic. Upon determining the comparison yields an equal result but the mode has changed, a recycle event is triggered to ensure subsequent ofetches are relaunched in the correct mode and that no execution writebacks occur from work performed in an incorrect mode.
    Type: Grant
    Filed: March 19, 2008
    Date of Patent: June 28, 2011
    Assignee: International Business Machines Corporation
    Inventors: David S. Hutton, Michael Billeci, Fadi Y. Busaba, Brian R. Prasky, John G. Rell, Jr., Chung-Lung Kevin Shum, Charles F. Webb
  • Patent number: 7958337
    Abstract: An apparatus and method for executing instructions having a program order. The apparatus comprising a temporary buffer, tag assignment logic, a plurality of functional units, a plurality of data paths, a register array, a retirement control block, and a superscalar instruction retirement unit. The temporary buffer includes a plurality of temporary buffer locations to store result data for executed instructions, wherein the temporary buffer locations are arranged in a plurality of groups of temporary buffer locations. The tag assignment logic is configured to concurrently assign a tag to each instruction in a first set of instructions, wherein the tags are assigned such that the respective tag assigned to each of the instructions in the first set of instructions identifies a different one of the temporary buffer locations in a first one of the groups of temporary buffer locations.
    Type: Grant
    Filed: February 26, 2009
    Date of Patent: June 7, 2011
    Assignee: Seiko Epson Corporation
    Inventors: Johannes Wang, Sanjiv Garg, Trevor Deosaran
  • Patent number: 7958338
    Abstract: An instruction execution control device operates a plurality of threads in a simultaneous multi-thread system. The device has architecture registers (22-0, 22-1) for each thread, and a selection circuit (32, 24) which, when an operand data required for executing a function is read from a register file (20), selects in advance a thread to be read from the register file (20). This makes it possible to select an architecture register at an early stage, and although the number of circuits in a portion for selecting the architecture registers increases, the wiring amount of the circuits can be decreased, because the architecture register of the thread to be read is selected in advance.
    Type: Grant
    Filed: December 7, 2009
    Date of Patent: June 7, 2011
    Assignee: Fujitsu Limited
    Inventors: Yasunobu Akizuki, Toshio Yoshida, Tomohiro Tanaka, Ryuji Kan
  • Patent number: 7958339
    Abstract: An instruction execution control device operates a plurality of threads in a simultaneous multi-thread system. And the instruction execution control device has a thread selection circuit (30) which detects a state where an instruction has not been completed for a predetermined period during simultaneous multi-thread operation, and controls such that all the reservation stations (5, 6 and 7) can execute only a predetermined thread. Therefore if an entry that cannot be executed from the reservation stations (5, 6 and 7) exists, execution of an entry in the thread that cannot be executed can be enabled by stopping the execution of the thread which has been executed continuously.
    Type: Grant
    Filed: December 7, 2009
    Date of Patent: June 7, 2011
    Assignee: Fujitsu Limited
    Inventors: Yasunobu Akizuki, Toshio Yoshida
  • Patent number: 7949857
    Abstract: An improved method, device and system are presented for selecting a predetermined number of unused registers in a processor. The method includes partitioning registers in a processor into subsets; searching each subset for an unused register; determining whether every subset includes an unused register; if so, selecting an unused register from each subset; if not, partitioning the registers into new subsets with each subset having a different combination of registers; searching each of the new subsets for an unused register; determining whether each of the new subsets includes an unused register; if so, selecting an unused register from each new subset; and if not, searching each register serially to find the predetermined number of unused registers.
    Type: Grant
    Filed: April 9, 2008
    Date of Patent: May 24, 2011
    Assignee: International Business Machines Corporation
    Inventor: Kurt A. Feiste
  • Publication number: 20110099355
    Abstract: A multi-threaded microprocessor (1105) for processing instructions in threads. The microprocessor (1105) includes first and second decode pipelines (1730.0, 1730.1), first and second execute pipelines (1740, 1750), and coupling circuitry (1916) operable in a first mode to couple first and second threads from the first and second decode pipelines (1730.0, 1730.1) to the first and second execute pipelines (1740, 1750) respectively, and the coupling circuitry (1916) operable in a second mode to couple the first thread to both the first and second execute pipelines (1740, 1750). Various processes of manufacture, articles of manufacture, processes and methods of operation, circuits, devices, and systems are disclosed.
    Type: Application
    Filed: January 6, 2011
    Publication date: April 28, 2011
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Thang Tran
  • Patent number: 7934078
    Abstract: An system and method for retiring instructions in a superscalar microprocessor which executes a program comprising a set of instructions having a predetermined program order, the retirement system for simultaneously retiring groups of instructions executed in or out of order by the microprocessor. The retirement system comprises a done block for monitoring the status of the instructions to determine which instruction or group of instructions have been executed, a retirement control block for determining whether each executed instruction is retirable, a temporary buffer for storing results of instructions executed out of program order, and a register array for storing retirable-instruction results.
    Type: Grant
    Filed: September 17, 2008
    Date of Patent: April 26, 2011
    Assignee: Seiko Epson Corporation
    Inventors: Johannes Wang, Sanjiv Garg, Trevor Deosaran
  • Patent number: 7925868
    Abstract: Within a data processing system including a register renaming mechanism, register renaming for some conditional instructions which are predicted as not-executed is suppressed. The conditional instructions which are subject to such suppression of renaming may not be all conditional instructions, but may be those which are known to consume a particularly large number of physical registers if they are subject to renaming A conditional load multiple instruction in which multiple registers are loaded with new data values taken from memory in response to a single instruction is an example where the present technique may be used, particularly when one of the registers loaded is the program counter and accordingly the instruction is a conditional branch.
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: April 12, 2011
    Assignee: ARM Limited
    Inventors: Norbert Bernard Eugéne Lataille, Florent Begon, Cédric Denis Robert Airaud, Mélanie Vincent
  • Patent number: 7904697
    Abstract: An apparatus and method for executing a Load Register instruction in which the source data of the Load Register instruction is retained in its original physical register while the architected target register is mapped to this same physical target register. In this state the two architected registers alias to one physical register. When the source register of the Load Address instruction is specified as the target address of a subsequent instruction, a free physical register is assigned to the Load Registers source register. And with this assignment the alias is thus broken. Similarly when the target register of the Load Address instruction is the target address of a subsequent instruction, a new physical register is assigned to the Load Registers target address. And with this assignment the alias is thus broken.
    Type: Grant
    Filed: March 7, 2008
    Date of Patent: March 8, 2011
    Assignee: International Business Machines Corporation
    Inventors: Brian David Barrick, Brian William Curran, Lee Evan Eisen
  • Publication number: 20110055524
    Abstract: A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associated portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline.
    Type: Application
    Filed: November 8, 2010
    Publication date: March 3, 2011
    Inventors: Morris Marden, Matthew Merten, Alexandre Farcy, Avinash Sodani, James Hadley, Ilhyun Kim
  • Publication number: 20110010528
    Abstract: An information processing device implements a register renaming scheme for managing physical registers (e.g. hardware registers HR) coordinated with logical registers (e.g. software usable registers SUR) in conjunction with a renaming table. A first dedicated instruction is incorporated into an instruction set so that a free physical register is coordinated with a logical register designated by the first dedicated instruction. Alternatively, a second dedicated instruction is incorporated into the instruction set so that a physical register coordinated with a logical register designated by the second dedicated instruction is released to be free. In addition, the optimization is performed to change the number of software usable registers (SUR) and the number of renaming registers (RR) within the physical registers in conformity with the software executing the instruction set. Thus, it is possible to prevent the occurrence of an unwanted memory access instruction and dead time needed for releasing registers.
    Type: Application
    Filed: July 1, 2010
    Publication date: January 13, 2011
    Inventor: KENJI TAGATA
  • Patent number: 7852341
    Abstract: A method and system for patching instructions in a 3-D graphics pipeline. Specifically, in one embodiment, instructions to be executed within a scheduling process for a shader pipeline of the 3-D graphics pipeline are patchable. A scheduler includes a decode table, an expansion table, and a resource table that are each patchable. The decode table translates high level instructions to an appropriate microcode sequence. The patchable expansion table expands a high level instruction to a program of microcode if the high level instruction is complex. The resource table assigns the units for executing the microcode. Addresses within each of the tables can be patched to modify existing instructions and create new instructions. That is, contents in each address in the tables that are tagged can be replaced with a patch value of a corresponding register.
    Type: Grant
    Filed: October 5, 2004
    Date of Patent: December 14, 2010
    Assignee: Nvidia Corporation
    Inventors: Christian Rouet, Rui Bastos, Lordson Yue
  • Publication number: 20100312993
    Abstract: A register renaming table recovery method for use in a processor includes the following steps. Firstly, a flushing operation is performed on a renaming-history table according to a flushed ID. Then, a first renamed ID corresponding to a first register is acquired from an unflushed row of the renaming-history table that is immediately adjacent to the flushed ID. If the first renamed ID is occupied, a register renaming table is updated to rename the first register according to the first renamed ID. Whereas, if the first renamed ID is not occupied, the register renaming table is updated to keep the first register unrenamed.
    Type: Application
    Filed: November 17, 2009
    Publication date: December 9, 2010
    Applicant: RDC Semiconductor Co., Ltd.
    Inventors: Chien-Nan I, Chun-Wang Wei
  • Publication number: 20100306509
    Abstract: An out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The microprocessor includes a queue of entries, each entry configured to hold an instruction pointer of a load instruction and to hold information useable to identify a store instruction that caused the load instruction to be replayed on a first instance of the load instruction. A register alias table (RAT) encounters instructions in program order and generates dependencies used to determine when the instructions may execute out of program order. The RAT encounters the load instruction on a second instance, determines that the load instruction second instance instruction pointer matches the instruction pointer of an entry of the queue, and causes the load instruction on the second instance to have a dependency on the store instruction identified by the information in the matching entry.
    Type: Application
    Filed: October 23, 2009
    Publication date: December 2, 2010
    Applicant: VIA Technologies, inc.
    Inventors: Matthew Daniel Day, Rodney E. Hooker
  • Patent number: 7844800
    Abstract: A processor 2 utilising register renaming executes program instructions requiring a large number of architectural register specifiers to be renamed by dividing the renaming tasks into an initial set and a remaining set. The initial set are performed first and the results passed via a main channel 32 for further processing. The remaining set are performed in sequence with the results being passed via a background channel 34 for further processing. This technique is particularly useful for performing renaming operations for load/store multiple LDM instructions.
    Type: Grant
    Filed: August 21, 2007
    Date of Patent: November 30, 2010
    Assignee: ARM Limited
    Inventors: Melanie Emanuelle Lucie Vincent, Florent Begon, Cedric Denis Robert Airaud, Norbert Bernard Eugene Lataille
  • Patent number: 7844797
    Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
    Type: Grant
    Filed: May 6, 2009
    Date of Patent: November 30, 2010
    Assignee: Seiko Epson Corporation
    Inventors: Cheryl D. Senter, Johannes Wang
  • Patent number: 7840783
    Abstract: A system, method, and computer program product are provided for performing a register renaming operation utilizing hardware which operates in at least two modes. In operation, hardware is operated in at least two modes including a first mode for operating the hardware using a logical register of a first bit width and a second mode for operating the hardware using a logical register of a second bit width. The first bit width is twice a width of the second bit width. Additionally, a register renaming operation is performed, including renaming at least one logical register to at least one physical register of the first bit width, utilizing the hardware.
    Type: Grant
    Filed: September 10, 2007
    Date of Patent: November 23, 2010
    Assignee: Netlogic Microsystems, Inc.
    Inventors: Gaurav Singh, Srivatsan Srinivasan, Ricardo Ramirez, Wei-Hsiang Chen, Hai Ngoc Nguyen
  • Publication number: 20100268919
    Abstract: A register file, in a processor, includes a first plurality of registers of a first size, n-bits. A decoder uses a mapping that divides the register file into a second plurality M of registers having a second size. Each of the registers having the second size is assigned a different name in a continuous name space. Each register of the second size includes a plurality N of registers of the first size, n-bits. Each register in the plurality N of registers is assigned the same name as the register of the second size that includes that plurality. State information is maintained in the register file for each n-bit register. The dependence of an instruction on other instructions is detected through the continuous name space. The state information allows the processor to determine when the information in any portion, or all, of a register is valid.
    Type: Application
    Filed: April 20, 2009
    Publication date: October 21, 2010
    Inventors: Shailender Chaudhry, Marc Tremblay
  • Patent number: 7809929
    Abstract: A universal register rename mechanism for instructions with multiple targets using a common destination tag. For each instruction that updates multiple destinations, a single rename entry is allocated to handle all destinations associated with it. A rename entry now consists of a DTAG and a vector to indicate the type of destination(s) that is/are being updated by such a particular instruction. For example, a common DTAG can be assigned to a fixed point unit instruction (FXU) that updates general purpose register (GPR), fixed point exception register (XER), and condition code register (CR) destinations. During flush time, the DTAGs in the recovery link may be used to restore the information indicating that the youngest instruction updates a particular architected register. By using a single, universal rename structure for all types of destinations, a large saving in silicon and power can be realized without the need to sacrifice performance.
    Type: Grant
    Filed: April 18, 2007
    Date of Patent: October 5, 2010
    Assignee: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen, Balaram Sinharoy
  • Patent number: 7809930
    Abstract: A register renaming unit has mapping control circuitry which serves to suppress unnecessary mapping operations in dependence upon a detected current state of the data processing system. One example of circumstances which can be detected from the current state and in which mapping can be suppressed and the existing mapping reused are that in respect of the existing physically mapped register there are no pending writes, no pending reads and no pending requirement for that physically mapped register to be preserved as a recovery register. Another example of a current state in which a mapping can be reused is adjacent program instructions having mutually exclusive condition codes and sharing a destination register such that only one of those adjacent instructions will ever be executed.
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: October 5, 2010
    Assignee: ARM Limited
    Inventors: Frederic Claude Marie Piry, Norbert Bernard Eugene Lataille
  • Patent number: 7802079
    Abstract: A parallel data processing apparatus using a SIMD array of processing elements is disclosed. The apparatus makes use of a register in order to control issuance of instructions to the processing elements in the array.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: September 21, 2010
    Assignee: Clearspeed Technology Limited
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 7793081
    Abstract: There are provided methods and computer program products for implementing instruction set architectures with non-contiguous register file specifiers. A method for processing instruction code includes processing a fixed-width instruction of a fixed-width instruction set using a non-contiguous register specifier of a non-contiguous register specification. The fixed-width instruction includes the non-contiguous register specifier.
    Type: Grant
    Filed: April 3, 2008
    Date of Patent: September 7, 2010
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Robert Kevin Montoye, Brett Olsson, John-David Wellman
  • Patent number: 7779236
    Abstract: The invention provides a method and system for operating a pipelined microprocessor more quickly, by detecting instructions that load from identical memory locations as were recently stored to, without having to actually compute the referenced external memory addresses. The microprocessor examines the symbolic structure of instructions as they are encountered, so as to be able to detect identical memory locations by examination of their symbolic structure. For example, in a preferred embodiment, instructions that store to and load from an identical offset from an identical register are determined to be referencing the identical memory location, without having to actually compute the complete physical target address.
    Type: Grant
    Filed: November 19, 1999
    Date of Patent: August 17, 2010
    Assignee: STMicroelectronics, Inc.
    Inventor: David L. Isaman
  • Publication number: 20100205409
    Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.
    Type: Application
    Filed: February 4, 2010
    Publication date: August 12, 2010
    Applicant: STMicroelectronics (Beijing) R&D Co. Ltd.
    Inventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
  • Patent number: 7774583
    Abstract: A processing bypass register file system and method are disclosed. In one embodiment a processing bypass register file includes a rotating head pointer, and a plurality of write ports, storage cells and read ports. The write ports receive processing result information. The head pointer identifies which entries are written by the write ports. The plurality of cells store the processing result information. The read ports forward results to the processing data path, and to an architectural register file for retirement.
    Type: Grant
    Filed: September 29, 2006
    Date of Patent: August 10, 2010
    Inventors: Parag Gupta, Alexander Klaiber, James Van Zoeren
  • Patent number: 7774582
    Abstract: A data processing system including multiple execution pipelines each having multiple execution stages E1, E2, E3 may have instructions issued together in parallel despite a data dependency therebetween if it is detected that the result operand value for the older instruction will be generated in an execution stage prior to an execution stage which requires that result operand value as an input operand value to the younger instruction and accordingly cross-forwarding of the operand value is possible between the execution pipelines to resolve the data dependency.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: August 10, 2010
    Assignee: ARM Limited
    Inventors: David James Williamson, Glen Andrew Harris, Stephen John Hill
  • Patent number: 7769986
    Abstract: A method and apparatus for register renaming are provides in the illustrative embodiments. A mapper receives a request for a data in a logical register. The mapper searches an in-flight map table and a set of architected map tables for the data in the logical register. The mapper identifies an entry in one of the in-flight map table and an architected map table in the set of architected map tables that corresponds with the logical register in the request. The mapper returns a location of a physical register, which holds the requested data.
    Type: Grant
    Filed: May 1, 2007
    Date of Patent: August 3, 2010
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Abernathy, William Elton Burky, Jens Leenstra, Nicolas Maeding
  • Patent number: 7765384
    Abstract: A unified register rename mechanism for targets of different instruction types is provided in a microprocessor. The universal rename mechanism renames destinations of different instruction types using a single rename structure. Thus, an instruction that is updating a floating point register (FPR) can be renamed along with an instruction that is updating a general purpose register (GPR) or vector multimedia extensions (VMX) instructions register (VR) using the same rename structure because the number of architected states for GPR is the same as the number of architected states for FPR and VR. Each destination tag (DTAG) is assigned to one destination. A floating point instruction may be assigned to a DTAG, and then a fixed point instruction may be assigned to the next DTAG and so forth. With a universal rename mechanism, significant silicon and power can be saved by having only one rename structure for all instruction types.
    Type: Grant
    Filed: April 18, 2007
    Date of Patent: July 27, 2010
    Assignee: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen, Balaram Sinharoy
  • Patent number: 7761619
    Abstract: Disclosed are methods for handling RDMA connections carried over packet stream connections. In one aspect, I/O completion events are distributed among a number of processors in a multi-processor computing device, eliminating processing bottlenecks. For each processor that will accept I/O completion events, at least one completion queue is created. When an I/O completion event is received on one of the completion queues, the processor associated with that queue processes the event. In a second aspect, semantics of the interactions among a packet stream handler, an RDMA layer, and an RNIC are defined to control RDMA closures and thus to avoid implementation errors. In a third aspect, semantics are defined for transferring an existing packet stream connection into RDMA mode while avoiding possible race conditions. The resulting RNIC architecture is simpler than is traditional because the RNIC never needs to process both streaming messages and RDMA-mode traffic at the same time.
    Type: Grant
    Filed: May 13, 2005
    Date of Patent: July 20, 2010
    Assignee: Microsoft Corporation
    Inventors: Shuangtong Feng, James T. Pinkerton
  • Patent number: 7761691
    Abstract: A method for scheduling instructions for clustered digital signal processors comprising a plurality of clusters, each cluster including at least two functional units and a first register file having a first unit, a second unit and a single set of access ports shared by the functional units comprises steps of checking whether executing one instruction needs data to be read from the first unit and the second unit of the first register file, generating a copying instruction to transfer data from the first unit to the second unit of the first register file, checking whether there is a prior operation cycle available to perform the copying instruction, scheduling the copying instruction in the prior operation cycle, and scheduling the instruction after the copying instruction.
    Type: Grant
    Filed: October 27, 2005
    Date of Patent: July 20, 2010
    Assignee: National Tsing Hua University
    Inventors: Chung-Lin Tang, Yung-Chia Lin, Jenq-Kuen Lee
  • Patent number: 7747840
    Abstract: Methods for latest producer tracking in a processor. In one embodiment, the method includes the steps of (1) writing a physical register identification value in a first register rename map location specified by a first instruction, (2) writing a first in-register status value in a second register rename map location specified by the first instruction, (3) writing a producer tracking status value at a producer tracking map location specified by the physical register identification value, and (4) modifying, upon graduation of the first instruction, the first in-register status value only if the producer tracking map location stores the producer tracking status value written in step (3). Other methods are also presented.
    Type: Grant
    Filed: April 16, 2008
    Date of Patent: June 29, 2010
    Assignee: MIPS Technologies, Inc.
    Inventors: Kjeld Svendsen, Xing Yu Jiang
  • Publication number: 20100153690
    Abstract: One embodiment of the present invention provides a system that facilitates precise exception semantics. The system includes a processor that uses register rename maps to support out-of-order execution, where the register rename maps track mappings between native architectural registers and physical registers for a program executing on the processor. These register rename maps include: 1) a working rename map that maps architectural registers associated with a decoded instruction to corresponding physical registers; 2) a retire rename map that tracks and preserves a set of physical registers that are associated with retired instructions; and 3) a checkpoint rename map that stores a mapping between a set of architectural registers and a set of physical registers for a preceding checkpoint in the program. When the program signals an exception, the processor uses the checkpoint rename map to roll back program execution to the preceding checkpoint.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Applicant: SUN MICROSYSTEMS, INC.
    Inventors: Christopher A. Vick, Gregory M. Wright
  • Patent number: 7730285
    Abstract: A data processing system includes a plurality of functional units that selectively execute instructions. A register file includes a plurality of registers that store data corresponding to the instructions. A reorder buffer communicates with the register file and stores the data, includes at least one bypassable buffer location, and includes at least one non-bypassable buffer location.
    Type: Grant
    Filed: August 2, 2006
    Date of Patent: June 1, 2010
    Assignee: Marvell International Ltd.
    Inventors: Hong-Yi Chen, Richard Lee, Geoffrey K. Yung, Jensen Tjeng
  • Patent number: 7730282
    Abstract: A method and system for avoiding various hazards for instructions which are propagating through a microprocessor pipeline. When a plurality of instructions exist within the pipeline which read and write the same value, a vector is established to distinguish the older from the newer instructions. Further, before instructions are dispatched for execution, pointers are generated which identify the particular instruction which had the operand or parameter value needed. Accordingly, by monitoring both the recent vector and pointers, dated dependency hazards can be avoided.
    Type: Grant
    Filed: August 11, 2004
    Date of Patent: June 1, 2010
    Assignee: International Business Machines Corporation
    Inventors: James N. Dieffenderfer, Nathan S. Nunamaker, Sanjay B. Patel
  • Patent number: 7721071
    Abstract: A processor core and a method for distributive scoreboard scheduling in an out-of-order processor pipeline are described herein. In an embodiment, control logic appends operand availability bits to each instruction. The appended operand availability bits form a distributive scoreboard for each instruction. The appended operand availability bits are propagated together with the instruction through multiple stages of the processor pipeline. An instruction dispatch buffer stores the instruction and the operand availability bits. A dispatch controller determines when an instruction is to be issued. The determination is based, at least in part, on the operand availability bits stored in the instruction dispatch buffer.
    Type: Grant
    Filed: February 28, 2006
    Date of Patent: May 18, 2010
    Assignee: MIPS Technologies, Inc.
    Inventor: Xing Yu Jiang
  • Publication number: 20100122044
    Abstract: A parallel processing technique is described for performing parallel processing operations upon N-dimensional arrays of data elements for which a corresponding N-dimensional Scoreboard of status data is held. Hazard checking for data dependencies upon data elements within the N-dimensional array of data elements is performed by looking up the corresponding status value within the Scoreboard. The status data for a given data element within the Scoreboard is located at a position which can be derived from the position of the data elements within its N-dimensional array. Thus, a two-dimensional array of video macroblocks can have a corresponding two-dimensional Scoreboard of status data indicating whether individual macroblocks have, for example, either already been deblocked or have not already been deblocked.
    Type: Application
    Filed: July 11, 2006
    Publication date: May 13, 2010
    Inventors: Simon Ford, Dominic Hugo Symes, Alastair Reid
  • Patent number: 7716455
    Abstract: A high speed processor. The processor includes terminals that each execute a subset of the instruction set. In at least one of the terminals, the instructions are executed in an order determined by data flow. Instructions are loaded into the terminal in pages. A notation is made when an operand for an instruction is generated by another instruction. When operands for an instruction are available, that instruction is a “ready” instruction. A ready instruction is selected in each cycle and executed. To allow data to be transmitted between terminals, each terminal is provided with a receive station, such that data generated in one terminal may be transmitted to another terminal for use as an operand in that terminal. In one embodiment, one terminal is an arithmetic terminal, executing arithmetic operations such as addition, multiplication and division. The processor has a second terminal, which contains functional logic to execute all other instructions in the instruction set.
    Type: Grant
    Filed: December 3, 2004
    Date of Patent: May 11, 2010
    Assignee: STMicroelectronics, Inc.
    Inventor: Stefano Cervini
  • Patent number: 7711929
    Abstract: A method of tracking instruction dependency in a processor issuing instructions speculatively includes recording in an instruction dependency array (IDA) an entry for each instruction that indicates data dependencies, if any, upon other active instructions. An output vector read out from the IDA indicates data readiness based upon which instructions have previously been selected for issue. The output vector is used to select and read out issue-ready instructions from an instruction buffer.
    Type: Grant
    Filed: August 30, 2007
    Date of Patent: May 4, 2010
    Assignee: International Business Machines Corporation
    Inventors: William E. Burky, Krishnan Kailas
  • Patent number: 7711928
    Abstract: A user is provided with means to sample memory hierarchy via software. This allows a user to enhance memory-level parallelism via software. A status of information needed for execution of a second computer program instruction is read in response to execution of a first computer program instruction. Execution continues with execution of the second computer program instruction upon the status being a first status. Alternatively, a third computer program instruction is executed upon the status being a second status different from the first status. Thus, execution of the first computer program instruction allows control of the memory hierarchy, which in turn give the user control of the memory hierarchy.
    Type: Grant
    Filed: March 16, 2005
    Date of Patent: May 4, 2010
    Assignee: Oracle America, Inc.
    Inventors: Marc Tremblay, Shailender Chaudhry, Quinn A. Jacobson
  • Patent number: 7711898
    Abstract: Embodiments of the present invention relate to a system and method for implementing functions of a register translation table of a computer processor, with reduced area requirements as compared to known arrangements. In one embodiment, an apparatus may comprise a register alias table cache to map a logical register to a physical register. The register alias table cache may have a capacity corresponding to a subset of architectural logical registers. The apparatus may further comprise store logic coupled to the cache to perform operations to save an existing content of the physical register if a cache entry corresponding to the logical register is evicted from the cache. The apparatus may also comprise load logic coupled to the cache to perform operations to load a content to the physical register and to form a new entry in the cache if a needed mapping is not present in the cache.
    Type: Grant
    Filed: December 18, 2003
    Date of Patent: May 4, 2010
    Assignee: Intel Corporation
    Inventors: Avinash Sodani, Stephan J. Jourdan, Samie B. Samaan
  • Patent number: 7707387
    Abstract: The use of a configuration-based execution model in conjunction with a content addressable memory (CAM) architecture provides a mechanism that enables performance of a number of computing concepts, including conditional execution, (e.g., If-Then statements and while loops), function calls and recursion. If-then and while loops are implemented by using a CAM feature that emits only complete operand sets from the CAM for processing; different seed operands are generated for different conditional evaluation results, and that seed operand is matched with computed data to for an if-then branch or upon exiting a while loop. As a result, downstream operators retrieve only completed operands. Function calls and recursion are handled by using a return tag as an operand along with function parameter data into the input tag space of a function. A recursive function is split into two halves, a pre-recursive half and a post-recursive half that executes after pre-recursive calls.
    Type: Grant
    Filed: June 1, 2005
    Date of Patent: April 27, 2010
    Assignee: Microsoft Corporation
    Inventor: Ray A. Bittner, Jr.
  • Publication number: 20100058035
    Abstract: A method for double-issue complex instructions receives a complex instruction comprising a first portion and a second portion. The method sets a single issue queue slot and allocates an execution unit for the complex instruction, and identifies dependencies in the first and second portions. The method sets a dependency matrix slot and a consumers table slot for the first and section portion. In the event the first portion dependencies have been satisfied, the method issues the first portion and then issues the second portion from the single issue queue slot. In the event the second portion dependencies have not been satisfied, the method cancels the second portion issue.
    Type: Application
    Filed: August 28, 2008
    Publication date: March 4, 2010
    Applicant: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Todd A. Venton
  • Patent number: 7669038
    Abstract: A method is provided for evaluating two or more instructions in an out of order issue queue during a particular cycle of the queue, to select an instruction for issue during the next following cycle. If an instruction was previously designated to issue during the particular cycle, one or more instructions in the queue are evaluated to determine if any of them are dependent on the designated instruction. For the evaluation, each instruction placed into the queue is accompanied by corresponding logic elements that provide destination to source compares for the instruction. In an embodiment comprising a method, the oldest ready instruction in the queue during a particular cycle is identified.
    Type: Grant
    Filed: May 2, 2008
    Date of Patent: February 23, 2010
    Assignee: International Business Machines Corporation
    Inventors: William Elton Burky, Raymond Cheung Yeung
  • Patent number: 7669039
    Abstract: Intermediate results are passed between constituent instructions of an expanded instruction using register renaming resources and control logic. A first constituent instruction generates intermediate results and is assigned a PRN in a constituent instruction rename table, and writes intermediate results to the identified physical register. A second constituent instruction performs a look up in the constituent instruction rename table and reads the intermediate results from the physical register. Constituent instruction rename logic tracks the constituent instructions through the pipeline, and delete the constituent instruction rename table entry and returns the PRN to a free list when the second constituent instruction has read the intermediate results.
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: February 23, 2010
    Assignee: QUALCOMM Incorporated
    Inventors: Michael Scott McIlvaine, James Norris Dieffenderfer, Nathan Samuel Nunamaker, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Patent number: 7660971
    Abstract: A method for dependency tracking and flush recovery for an out-of-order processor includes recording, in a last definition (DEF) data structure, an identifier of a first instruction as the most recent instruction in an instruction sequence that defines contents of the particular logical register and recording, in a next DEF data structure, the identifier of the first instruction in association with an identifier of a previous second instruction also indicating an update to the particular logical register. In addition, a recovery array is updated to indicate which of the instructions in the instruction sequence updates each of the plurality of logical registers. In response to misspeculation during execution of the instruction sequence, the processor performs a recovery operation to place the identifier of the second instruction in the last DEF data structure by reference to the next DEF data structure and the recovery array.
    Type: Grant
    Filed: February 1, 2007
    Date of Patent: February 9, 2010
    Assignee: International Business Machines Corporation
    Inventors: Vikas Agarwal, William E. Burky, Krishnan Kailas, Balaram Sinharoy
  • Patent number: 7660970
    Abstract: Disclosed is a data processing system and method. The data processing method determines the number of static registers and the number of rotating registers for assigning a register to a variable contained in a certain program, assigns the register to the variable based on the number of the static registers and the number of the rotating registers, and compiles the program. Further, the method stores in the special register a value corresponding to the number of the rotating registers in the compiling operation, and obtains a physical address from a logical address of the register based on the value. Accordingly, the present invention provides an aspect of efficiently using register files by dynamically controlling the number of rotating registers and the number of static registers for a software pipelined loop, and has an effect capable of reducing the generations of spill/fill codes unnecessary during program execution to a minimum.
    Type: Grant
    Filed: August 21, 2006
    Date of Patent: February 9, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Suk-jin Kim, Jeong-wook Kim, Hong-seok Kim, Soo-jung Ryu
  • Patent number: 7653906
    Abstract: Activities may be delayed from being dispatched until another activity is ready to be dispatched. Dispatching more than activities increase overlapping in execution time of activities. By delaying the dispatch of the activities, power consumption and thermal dissipation on a multi-threading processor may be reduced.
    Type: Grant
    Filed: October 23, 2002
    Date of Patent: January 26, 2010
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, Ishmael F. Santos
  • Patent number: 7650485
    Abstract: A multithreading processor achieves a very large lookahead instruction window by allowing non-sequential fetch and processing of the dynamic instruction stream. A speculative thread is spawned at a specified point in the dynamic instruction stream and the instructions subsequent to the specified point are speculatively executed so that these instructions are fetched and issued out of sequential order. Very minimal modifications to existing processor design of a multithreading processor are required to achieve the very large lookahead instruction window. The modifications include changes to the control logic of the issue unit, only three additional bits in the register scoreboard.
    Type: Grant
    Filed: April 10, 2007
    Date of Patent: January 19, 2010
    Assignee: Sun Microsystems, Inc.
    Inventor: Yuan C. Chou
  • Publication number: 20090327662
    Abstract: A scoreboard for a video processor may keep track of only dispatched threads which have not yet completed execution. A first thread may itself snoop for execution of a second thread that must be executed before the first thread's execution. Thread execution may be freely reordered, subject only to the rule that a second thread, whose execution is dependent on execution of a first thread, can only be executed after the first thread.
    Type: Application
    Filed: June 30, 2008
    Publication date: December 31, 2009
    Inventors: Hong Jiang, James M. Holland, Prasoonkumar Surti
  • Publication number: 20090327661
    Abstract: Methods and apparatus relating to mechanisms to handle free physical register identifiers for SMT (Simultaneous Multi-Threading) out-of-order processors are described. In some embodiments, a physical register file stores both speculative data and architectural data corresponding to a plurality of registers. A free list logic may maintain free physical register identifiers corresponding to the plurality of registers. An instruction may read the architectural data from the physical register file at dispatch. Other embodiments are also described and claimed.
    Type: Application
    Filed: June 30, 2008
    Publication date: December 31, 2009
    Inventors: Zeev Sperber, David J. Sager, Fernando Latorre, Ori Lempel, Evgeni Krimer, Bishara Shomar