Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)
-
Patent number: 7971034Abstract: A method, system, and computer program product for reduced overhead address mode change management in a pipelined, recycling microprocessor are provided. The recycling microprocessor includes logic executing thereon. The microprocessor also includes an instruction fetch unit (IFU) supporting computation of address adds in selected address modes and reporting non-equal comparison of the computation to the logic. The microprocessor further includes a fixed point unit determining whether the mode has changed and reporting changes to the logic. Upon determining the comparison yields an equal result but the mode has changed, a recycle event is triggered to ensure subsequent fetches are relaunched in the correct mode and that no execution writebacks occur from work performed in an incorrect mode.Type: GrantFiled: March 19, 2008Date of Patent: June 28, 2011Assignee: International Business Machines CorporationInventors: David S. Hutton, Michael Billeci, Fadi Y. Busaba, Brian R. Prasky, John G. Rell, Jr., Chung-Lung Kevin Shum, Charles F. Webb
-
Patent number: 7958337Abstract: An apparatus and method for executing instructions having a program order. The apparatus comprising a temporary buffer, tag assignment logic, a plurality of functional units, a plurality of data paths, a register array, a retirement control block, and a superscalar instruction retirement unit. The temporary buffer includes a plurality of temporary buffer locations to store result data for executed instructions, wherein the temporary buffer locations are arranged in a plurality of groups of temporary buffer locations. The tag assignment logic is configured to concurrently assign a tag to each instruction in a first set of instructions, wherein the tags are assigned such that the respective tag assigned to each of the instructions in the first set of instructions identifies a different one of the temporary buffer locations in a first one of the groups of temporary buffer locations.Type: GrantFiled: February 26, 2009Date of Patent: June 7, 2011Assignee: Seiko Epson CorporationInventors: Johannes Wang, Sanjiv Garg, Trevor Deosaran
-
Patent number: 7958338Abstract: An instruction execution control device operates a plurality of threads in a simultaneous multi-thread system. The device has architecture registers (22-0, 22-1) for each thread, and a selection circuit (32, 24) which, when an operand data required for executing a function is read from a register file (20), selects in advance a thread to be read from the register file (20). This makes it possible to select an architecture register at an early stage, and although the number of circuits in a portion for selecting the architecture registers increases, the wiring amount of the circuits can be decreased, because the architecture register of the thread to be read is selected in advance.Type: GrantFiled: December 7, 2009Date of Patent: June 7, 2011Assignee: Fujitsu LimitedInventors: Yasunobu Akizuki, Toshio Yoshida, Tomohiro Tanaka, Ryuji Kan
-
Patent number: 7958339Abstract: An instruction execution control device operates a plurality of threads in a simultaneous multi-thread system. The device has a thread selection circuit (30) which detects a state where an instruction has not been completed for a predetermined period during simultaneous multi-thread operation, and controls the reservation stations (5, 6 and 7) so that they execute only a predetermined thread. Therefore, if the reservation stations (5, 6 and 7) hold an entry that cannot be executed, that entry's thread can be enabled to execute by stopping the thread that has been executing continuously.Type: GrantFiled: December 7, 2009Date of Patent: June 7, 2011Assignee: Fujitsu LimitedInventors: Yasunobu Akizuki, Toshio Yoshida
-
Patent number: 7949857Abstract: An improved method, device and system are presented for selecting a predetermined number of unused registers in a processor. The method includes partitioning registers in a processor into subsets; searching each subset for an unused register; determining whether every subset includes an unused register; if so, selecting an unused register from each subset; if not, partitioning the registers into new subsets with each subset having a different combination of registers; searching each of the new subsets for an unused register; determining whether each of the new subsets includes an unused register; if so, selecting an unused register from each new subset; and if not, searching each register serially to find the predetermined number of unused registers.Type: GrantFiled: April 9, 2008Date of Patent: May 24, 2011Assignee: International Business Machines CorporationInventor: Kurt A. Feiste
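The fallback sequence in this abstract (search one partition, then a different partition, then fall back to a serial scan) can be sketched in Python. This is only an illustration under assumptions the abstract does not state: the first partition is taken to be contiguous blocks, the second an interleaved (strided) grouping, and the function names `first_free`, `pick_one_per_subset`, and `select_unused` are hypothetical. In hardware the per-subset searches would run in parallel; here they are a simple loop.

```python
from typing import List, Optional, Sequence


def first_free(subset: Sequence[int], in_use: Sequence[bool]) -> Optional[int]:
    """Return the first unused register index in `subset`, or None."""
    for reg in subset:
        if not in_use[reg]:
            return reg
    return None


def pick_one_per_subset(partition, in_use) -> Optional[List[int]]:
    """Return one unused register per subset, or None if any subset has none."""
    picks = [first_free(subset, in_use) for subset in partition]
    if any(p is None for p in picks):
        return None
    return picks


def select_unused(in_use: Sequence[bool], count: int) -> Optional[List[int]]:
    """Select `count` unused registers: partition, repartition, then serial scan."""
    n = len(in_use)
    block = n // count  # assumed partition shape: `count` contiguous blocks
    contiguous = [range(i * block, (i + 1) * block) for i in range(count)]
    # Assumed second partition: interleaved, so each subset mixes different registers.
    interleaved = [range(i, n, count) for i in range(count)]

    for partition in (contiguous, interleaved):
        picks = pick_one_per_subset(partition, in_use)
        if picks is not None:
            return picks

    # Last resort from the abstract: search every register serially.
    free = [reg for reg in range(n) if not in_use[reg]]
    return free[:count] if len(free) >= count else None
```

With eight registers of which only registers 0 and 1 are free, the contiguous partition fails (the upper block has no free register) but the interleaved partition succeeds, so the serial scan is never reached.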
-
Publication number: 20110099355Abstract: A multi-threaded microprocessor (1105) for processing instructions in threads. The microprocessor (1105) includes first and second decode pipelines (1730.0, 1730.1), first and second execute pipelines (1740, 1750), and coupling circuitry (1916) operable in a first mode to couple first and second threads from the first and second decode pipelines (1730.0, 1730.1) to the first and second execute pipelines (1740, 1750) respectively, and the coupling circuitry (1916) operable in a second mode to couple the first thread to both the first and second execute pipelines (1740, 1750). Various processes of manufacture, articles of manufacture, processes and methods of operation, circuits, devices, and systems are disclosed.Type: ApplicationFiled: January 6, 2011Publication date: April 28, 2011Applicant: TEXAS INSTRUMENTS INCORPORATEDInventor: Thang Tran
-
Patent number: 7934078Abstract: A system and method for retiring instructions in a superscalar microprocessor which executes a program comprising a set of instructions having a predetermined program order, the retirement system for simultaneously retiring groups of instructions executed in or out of order by the microprocessor. The retirement system comprises a done block for monitoring the status of the instructions to determine which instruction or group of instructions have been executed, a retirement control block for determining whether each executed instruction is retirable, a temporary buffer for storing results of instructions executed out of program order, and a register array for storing retirable-instruction results.Type: GrantFiled: September 17, 2008Date of Patent: April 26, 2011Assignee: Seiko Epson CorporationInventors: Johannes Wang, Sanjiv Garg, Trevor Deosaran
-
Patent number: 7925868Abstract: Within a data processing system including a register renaming mechanism, register renaming for some conditional instructions which are predicted as not-executed is suppressed. The conditional instructions which are subject to such suppression of renaming may not be all conditional instructions, but may be those which are known to consume a particularly large number of physical registers if they are subject to renaming. A conditional load multiple instruction in which multiple registers are loaded with new data values taken from memory in response to a single instruction is an example where the present technique may be used, particularly when one of the registers loaded is the program counter and accordingly the instruction is a conditional branch.Type: GrantFiled: January 24, 2007Date of Patent: April 12, 2011Assignee: ARM LimitedInventors: Norbert Bernard Eugéne Lataille, Florent Begon, Cédric Denis Robert Airaud, Mélanie Vincent
-
Patent number: 7904697Abstract: An apparatus and method for executing a Load Register instruction in which the source data of the Load Register instruction is retained in its original physical register while the architected target register is mapped to this same physical target register. In this state the two architected registers alias to one physical register. When the source register of the Load Address instruction is specified as the target address of a subsequent instruction, a free physical register is assigned to the Load Register's source register. And with this assignment the alias is thus broken. Similarly when the target register of the Load Address instruction is the target address of a subsequent instruction, a new physical register is assigned to the Load Register's target address. And with this assignment the alias is thus broken.Type: GrantFiled: March 7, 2008Date of Patent: March 8, 2011Assignee: International Business Machines CorporationInventors: Brian David Barrick, Brian William Curran, Lee Evan Eisen
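The aliasing behaviour described here can be illustrated with a toy rename map in Python. It is a sketch under assumptions, not the patented design: the class `RenameMap` and its methods are hypothetical names, physical registers are handed out from a simple free list, and reclaiming a physical register once no architected register maps to it is left out.

```python
from typing import List


class RenameMap:
    """Toy architected-to-physical register map (hypothetical structure)."""

    def __init__(self, num_arch: int, num_phys: int) -> None:
        # Initially architected register i lives in physical register i.
        self.map: List[int] = list(range(num_arch))
        self.free: List[int] = list(range(num_arch, num_phys))

    def load_register(self, dst: int, src: int) -> None:
        """Register-to-register copy: no data moves, dst aliases src's physical register."""
        self.map[dst] = self.map[src]

    def write(self, dst: int) -> int:
        """A later instruction targets `dst`: assign a free physical register."""
        phys = self.free.pop(0)
        self.map[dst] = phys  # any alias involving `dst` is broken here
        return phys

    def aliased(self, a: int, b: int) -> bool:
        return self.map[a] == self.map[b]


rm = RenameMap(num_arch=4, num_phys=8)
rm.load_register(dst=1, src=0)   # r1 and r0 now share physical register 0
shared = rm.aliased(0, 1)
new_phys = rm.write(0)           # r0 is overwritten, so it gets physical register 4
```

After the final `write`, architected register 1 still maps to physical register 0 and so still holds the copied value, while register 0 has moved to a fresh physical register.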
-
Publication number: 20110055524Abstract: A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associate portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline.Type: ApplicationFiled: November 8, 2010Publication date: March 3, 2011Inventors: Morris Marden, Matthew Merten, Alexandre Farcy, Avinash Sodani, James Hadley, Ilhyun Kim
-
Publication number: 20110010528Abstract: An information processing device implements a register renaming scheme for managing physical registers (e.g. hardware registers HR) coordinated with logical registers (e.g. software usable registers SUR) in conjunction with a renaming table. A first dedicated instruction is incorporated into an instruction set so that a free physical register is coordinated with a logical register designated by the first dedicated instruction. Alternatively, a second dedicated instruction is incorporated into the instruction set so that a physical register coordinated with a logical register designated by the second dedicated instruction is released to be free. In addition, the optimization is performed to change the number of software usable registers (SUR) and the number of renaming registers (RR) within the physical registers in conformity with the software executing the instruction set. Thus, it is possible to prevent the occurrence of an unwanted memory access instruction and dead time needed for releasing registers.Type: ApplicationFiled: July 1, 2010Publication date: January 13, 2011Inventor: KENJI TAGATA
-
Patent number: 7852341Abstract: A method and system for patching instructions in a 3-D graphics pipeline. Specifically, in one embodiment, instructions to be executed within a scheduling process for a shader pipeline of the 3-D graphics pipeline are patchable. A scheduler includes a decode table, an expansion table, and a resource table that are each patchable. The decode table translates high level instructions to an appropriate microcode sequence. The patchable expansion table expands a high level instruction to a program of microcode if the high level instruction is complex. The resource table assigns the units for executing the microcode. Addresses within each of the tables can be patched to modify existing instructions and create new instructions. That is, contents in each address in the tables that are tagged can be replaced with a patch value of a corresponding register.Type: GrantFiled: October 5, 2004Date of Patent: December 14, 2010Assignee: Nvidia CorporationInventors: Christian Rouet, Rui Bastos, Lordson Yue
-
Publication number: 20100312993Abstract: A register renaming table recovery method for use in a processor includes the following steps. Firstly, a flushing operation is performed on a renaming-history table according to a flushed ID. Then, a first renamed ID corresponding to a first register is acquired from an unflushed row of the renaming-history table that is immediately adjacent to the flushed ID. If the first renamed ID is occupied, a register renaming table is updated to rename the first register according to the first renamed ID. Whereas, if the first renamed ID is not occupied, the register renaming table is updated to keep the first register unrenamed.Type: ApplicationFiled: November 17, 2009Publication date: December 9, 2010Applicant: RDC Semiconductor Co., Ltd.Inventors: Chien-Nan I, Chun-Wang Wei
-
Publication number: 20100306509Abstract: An out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The microprocessor includes a queue of entries, each entry configured to hold an instruction pointer of a load instruction and to hold information useable to identify a store instruction that caused the load instruction to be replayed on a first instance of the load instruction. A register alias table (RAT) encounters instructions in program order and generates dependencies used to determine when the instructions may execute out of program order. The RAT encounters the load instruction on a second instance, determines that the load instruction second instance instruction pointer matches the instruction pointer of an entry of the queue, and causes the load instruction on the second instance to have a dependency on the store instruction identified by the information in the matching entry.Type: ApplicationFiled: October 23, 2009Publication date: December 2, 2010Applicant: VIA Technologies, Inc.Inventors: Matthew Daniel Day, Rodney E. Hooker
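A rough Python model of the queue lookup in this abstract follows. The names `StoreCollisionQueue` and `rename_load` are hypothetical, the store is identified by an opaque label, and the capacity and oldest-first replacement policy are assumptions added only so the sketch is complete; the abstract does not specify them.

```python
from collections import OrderedDict
from typing import List, Optional


class StoreCollisionQueue:
    """Maps a load's instruction pointer to the store that forced its replay."""

    def __init__(self, capacity: int = 4) -> None:
        self.capacity = capacity  # assumed size, not taken from the abstract
        self.entries: "OrderedDict[int, str]" = OrderedDict()

    def record_replay(self, load_ip: int, store_info: str) -> None:
        """Called when a load had to be replayed because of `store_info`."""
        self.entries.pop(load_ip, None)
        self.entries[load_ip] = store_info
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # assumed policy: evict the oldest entry

    def lookup(self, load_ip: int) -> Optional[str]:
        return self.entries.get(load_ip)


def rename_load(queue: StoreCollisionQueue, load_ip: int,
                dependencies: List[str]) -> List[str]:
    """RAT step: on a matching instruction pointer, add a dependency on the store."""
    store_info = queue.lookup(load_ip)
    if store_info is not None:
        return dependencies + [store_info]
    return dependencies


queue = StoreCollisionQueue(capacity=2)
first = rename_load(queue, 0x40, ["r1"])       # first instance: nothing recorded yet
queue.record_replay(0x40, "store@0x3c")        # the load collided and was replayed
second = rename_load(queue, 0x40, ["r1"])      # second instance waits on the store
```

The second encounter of the load at `0x40` picks up an extra dependency on the recorded store, which is what keeps it from issuing ahead of that store again.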
-
Patent number: 7844800Abstract: A processor 2 utilising register renaming executes program instructions requiring a large number of architectural register specifiers to be renamed by dividing the renaming tasks into an initial set and a remaining set. The initial set are performed first and the results passed via a main channel 32 for further processing. The remaining set are performed in sequence with the results being passed via a background channel 34 for further processing. This technique is particularly useful for performing renaming operations for load/store multiple LDM instructions.Type: GrantFiled: August 21, 2007Date of Patent: November 30, 2010Assignee: ARM LimitedInventors: Melanie Emanuelle Lucie Vincent, Florent Begon, Cedric Denis Robert Airaud, Norbert Bernard Eugene Lataille
-
Patent number: 7844797Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.Type: GrantFiled: May 6, 2009Date of Patent: November 30, 2010Assignee: Seiko Epson CorporationInventors: Cheryl D. Senter, Johannes Wang
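The two conditions that block an out-of-order load (address collision and write pending) reduce to a short check over the older stores. The Python sketch below assumes exact-address matching and ignores access sizes and partial overlap; `OlderStore` and `can_issue_load_early` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class OlderStore:
    """A store that precedes the load in program order."""
    address: Optional[int]  # None means the store address is not yet calculated


def can_issue_load_early(load_address: int, older_stores: Iterable[OlderStore]) -> bool:
    """Return True only if no older store collides with or could collide with the load."""
    for store in older_stores:
        if store.address is None:
            return False  # write pending: the store address is still unknown
        if store.address == load_address:
            return False  # address collision: an older store writes this location
    return True
```

For example, a load from `0x100` may go ahead of an older store to `0x200`, but not ahead of an older store to `0x100` or of one whose address has not been computed yet.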
-
Patent number: 7840783Abstract: A system, method, and computer program product are provided for performing a register renaming operation utilizing hardware which operates in at least two modes. In operation, hardware is operated in at least two modes including a first mode for operating the hardware using a logical register of a first bit width and a second mode for operating the hardware using a logical register of a second bit width. The first bit width is twice a width of the second bit width. Additionally, a register renaming operation is performed, including renaming at least one logical register to at least one physical register of the first bit width, utilizing the hardware.Type: GrantFiled: September 10, 2007Date of Patent: November 23, 2010Assignee: Netlogic Microsystems, Inc.Inventors: Gaurav Singh, Srivatsan Srinivasan, Ricardo Ramirez, Wei-Hsiang Chen, Hai Ngoc Nguyen
-
Publication number: 20100268919Abstract: A register file, in a processor, includes a first plurality of registers of a first size, n-bits. A decoder uses a mapping that divides the register file into a second plurality M of registers having a second size. Each of the registers having the second size is assigned a different name in a continuous name space. Each register of the second size includes a plurality N of registers of the first size, n-bits. Each register in the plurality N of registers is assigned the same name as the register of the second size that includes that plurality. State information is maintained in the register file for each n-bit register. The dependence of an instruction on other instructions is detected through the continuous name space. The state information allows the processor to determine when the information in any portion, or all, of a register is valid.Type: ApplicationFiled: April 20, 2009Publication date: October 21, 2010Inventors: Shailender Chaudhry, Marc Tremblay
-
Patent number: 7809929Abstract: A universal register rename mechanism for instructions with multiple targets using a common destination tag. For each instruction that updates multiple destinations, a single rename entry is allocated to handle all destinations associated with it. A rename entry now consists of a DTAG and a vector to indicate the type of destination(s) that is/are being updated by such a particular instruction. For example, a common DTAG can be assigned to a fixed point unit instruction (FXU) that updates general purpose register (GPR), fixed point exception register (XER), and condition code register (CR) destinations. During flush time, the DTAGs in the recovery link may be used to restore the information indicating that the youngest instruction updates a particular architected register. By using a single, universal rename structure for all types of destinations, a large saving in silicon and power can be realized without the need to sacrifice performance.Type: GrantFiled: April 18, 2007Date of Patent: October 5, 2010Assignee: International Business Machines CorporationInventors: Hung Q. Le, Dung Q. Nguyen, Balaram Sinharoy
-
Patent number: 7809930Abstract: A register renaming unit has mapping control circuitry which serves to suppress unnecessary mapping operations in dependence upon a detected current state of the data processing system. One example of circumstances which can be detected from the current state and in which mapping can be suppressed and the existing mapping reused are that in respect of the existing physically mapped register there are no pending writes, no pending reads and no pending requirement for that physically mapped register to be preserved as a recovery register. Another example of a current state in which a mapping can be reused is adjacent program instructions having mutually exclusive condition codes and sharing a destination register such that only one of those adjacent instructions will ever be executed.Type: GrantFiled: January 24, 2007Date of Patent: October 5, 2010Assignee: ARM LimitedInventors: Frederic Claude Marie Piry, Norbert Bernard Eugene Lataille
-
Patent number: 7802079Abstract: A parallel data processing apparatus using a SIMD array of processing elements is disclosed. The apparatus makes use of a register in order to control issuance of instructions to the processing elements in the array.Type: GrantFiled: June 29, 2007Date of Patent: September 21, 2010Assignee: Clearspeed Technology LimitedInventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
-
Patent number: 7793081Abstract: There are provided methods and computer program products for implementing instruction set architectures with non-contiguous register file specifiers. A method for processing instruction code includes processing a fixed-width instruction of a fixed-width instruction set using a non-contiguous register specifier of a non-contiguous register specification. The fixed-width instruction includes the non-contiguous register specifier.Type: GrantFiled: April 3, 2008Date of Patent: September 7, 2010Assignee: International Business Machines CorporationInventors: Michael Karl Gschwind, Robert Kevin Montoye, Brett Olsson, John-David Wellman
-
Patent number: 7779236Abstract: The invention provides a method and system for operating a pipelined microprocessor more quickly, by detecting instructions that load from identical memory locations as were recently stored to, without having to actually compute the referenced external memory addresses. The microprocessor examines the symbolic structure of instructions as they are encountered, so as to be able to detect identical memory locations by examination of their symbolic structure. For example, in a preferred embodiment, instructions that store to and load from an identical offset from an identical register are determined to be referencing the identical memory location, without having to actually compute the complete physical target address.Type: GrantFiled: November 19, 1999Date of Patent: August 17, 2010Assignee: STMicroelectronics, Inc.Inventor: David L. Isaman
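The symbolic comparison in the preferred embodiment (same base register, same offset) can be sketched in Python as a table keyed by the pair rather than by a computed address. The class `SymbolicStoreTracker` is hypothetical. Dropping entries when the base register is overwritten is an assumption added here so the match stays valid; the abstract does not describe it, and the sketch does not handle two different register-plus-offset forms that happen to name the same address.

```python
from typing import Dict, Optional, Tuple


class SymbolicStoreTracker:
    """Remembers recent stores by (base register, offset), never by address."""

    def __init__(self) -> None:
        self.recent: Dict[Tuple[int, int], str] = {}

    def store(self, base_reg: int, offset: int, value_tag: str) -> None:
        self.recent[(base_reg, offset)] = value_tag

    def write_register(self, reg: int) -> None:
        # Assumption: once the base register changes, the same symbolic form
        # no longer names the same memory location, so forget those entries.
        self.recent = {k: v for k, v in self.recent.items() if k[0] != reg}

    def load(self, base_reg: int, offset: int) -> Optional[str]:
        """Return the tag of a recent store to the identical symbolic location."""
        return self.recent.get((base_reg, offset))


tracker = SymbolicStoreTracker()
tracker.store(base_reg=5, offset=16, value_tag="v1")
hit = tracker.load(base_reg=5, offset=16)     # same register, same offset
miss = tracker.load(base_reg=5, offset=24)    # different offset: no match
tracker.write_register(5)                     # base register is overwritten
stale = tracker.load(base_reg=5, offset=16)   # entry was dropped
```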
-
Publication number: 20100205409Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.Type: ApplicationFiled: February 4, 2010Publication date: August 12, 2010Applicant: STMicroelectronics (Beijing) R&D Co. Ltd.Inventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
-
Patent number: 7774583Abstract: A processing bypass register file system and method are disclosed. In one embodiment a processing bypass register file includes a rotating head pointer, and a plurality of write ports, storage cells and read ports. The write ports receive processing result information. The head pointer identifies which entries are written by the write ports. The plurality of cells store the processing result information. The read ports forward results to the processing data path, and to an architectural register file for retirement.Type: GrantFiled: September 29, 2006Date of Patent: August 10, 2010Inventors: Parag Gupta, Alexander Klaiber, James Van Zoeren
-
Patent number: 7774582Abstract: A data processing system including multiple execution pipelines each having multiple execution stages E1, E2, E3 may have instructions issued together in parallel despite a data dependency therebetween if it is detected that the result operand value for the older instruction will be generated in an execution stage prior to an execution stage which requires that result operand value as an input operand value to the younger instruction and accordingly cross-forwarding of the operand value is possible between the execution pipelines to resolve the data dependency.Type: GrantFiled: May 26, 2005Date of Patent: August 10, 2010Assignee: ARM LimitedInventors: David James Williamson, Glen Andrew Harris, Stephen John Hill
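The issue condition in this abstract comes down to comparing two stage numbers. A minimal Python sketch, assuming stages E1, E2, E3 are numbered 1 to 3 and using the hypothetical name `can_issue_together`:

```python
def can_issue_together(older_result_stage: int, younger_need_stage: int) -> bool:
    """Dependent instructions may issue in parallel if the older one's result
    is generated in a stage before the stage where the younger one needs it,
    so the value can be cross-forwarded between the pipelines."""
    return older_result_stage < younger_need_stage


E1, E2, E3 = 1, 2, 3  # assumed numbering of the execution stages
```

An older instruction producing its result in E1 can be paired with a younger one that consumes it in E2; if the result only appears in E3 and is needed in E1, the pair cannot be issued together.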
-
Patent number: 7769986Abstract: A method and apparatus for register renaming are provided in the illustrative embodiments. A mapper receives a request for a data in a logical register. The mapper searches an in-flight map table and a set of architected map tables for the data in the logical register. The mapper identifies an entry in one of the in-flight map table and an architected map table in the set of architected map tables that corresponds with the logical register in the request. The mapper returns a location of a physical register, which holds the requested data.Type: GrantFiled: May 1, 2007Date of Patent: August 3, 2010Assignee: International Business Machines CorporationInventors: Christopher Michael Abernathy, William Elton Burky, Jens Leenstra, Nicolas Maeding
-
Patent number: 7765384Abstract: A unified register rename mechanism for targets of different instruction types is provided in a microprocessor. The universal rename mechanism renames destinations of different instruction types using a single rename structure. Thus, an instruction that is updating a floating point register (FPR) can be renamed along with an instruction that is updating a general purpose register (GPR) or vector multimedia extensions (VMX) instructions register (VR) using the same rename structure because the number of architected states for GPR is the same as the number of architected states for FPR and VR. Each destination tag (DTAG) is assigned to one destination. A floating point instruction may be assigned to a DTAG, and then a fixed point instruction may be assigned to the next DTAG and so forth. With a universal rename mechanism, significant silicon and power can be saved by having only one rename structure for all instruction types.Type: GrantFiled: April 18, 2007Date of Patent: July 27, 2010Assignee: International Business Machines CorporationInventors: Hung Q. Le, Dung Q. Nguyen, Balaram Sinharoy
-
Patent number: 7761619Abstract: Disclosed are methods for handling RDMA connections carried over packet stream connections. In one aspect, I/O completion events are distributed among a number of processors in a multi-processor computing device, eliminating processing bottlenecks. For each processor that will accept I/O completion events, at least one completion queue is created. When an I/O completion event is received on one of the completion queues, the processor associated with that queue processes the event. In a second aspect, semantics of the interactions among a packet stream handler, an RDMA layer, and an RNIC are defined to control RDMA closures and thus to avoid implementation errors. In a third aspect, semantics are defined for transferring an existing packet stream connection into RDMA mode while avoiding possible race conditions. The resulting RNIC architecture is simpler than is traditional because the RNIC never needs to process both streaming messages and RDMA-mode traffic at the same time.Type: GrantFiled: May 13, 2005Date of Patent: July 20, 2010Assignee: Microsoft CorporationInventors: Shuangtong Feng, James T. Pinkerton
-
Patent number: 7761691Abstract: A method for scheduling instructions for clustered digital signal processors comprising a plurality of clusters, each cluster including at least two functional units and a first register file having a first unit, a second unit and a single set of access ports shared by the functional units comprises steps of checking whether executing one instruction needs data to be read from the first unit and the second unit of the first register file, generating a copying instruction to transfer data from the first unit to the second unit of the first register file, checking whether there is a prior operation cycle available to perform the copying instruction, scheduling the copying instruction in the prior operation cycle, and scheduling the instruction after the copying instruction.Type: GrantFiled: October 27, 2005Date of Patent: July 20, 2010Assignee: National Tsing Hua UniversityInventors: Chung-Lin Tang, Yung-Chia Lin, Jenq-Kuen Lee
-
Patent number: 7747840Abstract: Methods for latest producer tracking in a processor. In one embodiment, the method includes the steps of (1) writing a physical register identification value in a first register rename map location specified by a first instruction, (2) writing a first in-register status value in a second register rename map location specified by the first instruction, (3) writing a producer tracking status value at a producer tracking map location specified by the physical register identification value, and (4) modifying, upon graduation of the first instruction, the first in-register status value only if the producer tracking map location stores the producer tracking status value written in step (3). Other methods are also presented.Type: GrantFiled: April 16, 2008Date of Patent: June 29, 2010Assignee: MIPS Technologies, Inc.Inventors: Kjeld Svendsen, Xing Yu Jiang
-
Publication number: 20100153690Abstract: One embodiment of the present invention provides a system that facilitates precise exception semantics. The system includes a processor that uses register rename maps to support out-of-order execution, where the register rename maps track mappings between native architectural registers and physical registers for a program executing on the processor. These register rename maps include: 1) a working rename map that maps architectural registers associated with a decoded instruction to corresponding physical registers; 2) a retire rename map that tracks and preserves a set of physical registers that are associated with retired instructions; and 3) a checkpoint rename map that stores a mapping between a set of architectural registers and a set of physical registers for a preceding checkpoint in the program. When the program signals an exception, the processor uses the checkpoint rename map to roll back program execution to the preceding checkpoint.Type: ApplicationFiled: December 12, 2008Publication date: June 17, 2010Applicant: SUN MICROSYSTEMS, INC.Inventors: Christopher A. Vick, Gregory M. Wright
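The interplay of the three rename maps can be sketched in Python. The class `RenameMaps` and its method names are hypothetical, and two details are assumptions rather than statements from the abstract: a checkpoint is taken by copying the retire map, and an exception restores both the working and retire maps from the checkpoint.

```python
from typing import Dict


class RenameMaps:
    """Working, retire, and checkpoint rename maps (architectural -> physical)."""

    def __init__(self, num_arch: int) -> None:
        self.working: Dict[int, int] = {r: r for r in range(num_arch)}
        self.retire: Dict[int, int] = dict(self.working)
        self.checkpoint: Dict[int, int] = dict(self.working)

    def rename(self, arch: int, phys: int) -> None:
        """Decode: map an architectural register to a new physical register."""
        self.working[arch] = phys

    def retire_instruction(self, arch: int, phys: int) -> None:
        """Retire: the mapping becomes part of the retired state."""
        self.retire[arch] = phys

    def take_checkpoint(self) -> None:
        # Assumption: the checkpoint is a snapshot of the retired state.
        self.checkpoint = dict(self.retire)

    def signal_exception(self) -> None:
        """Roll execution state back to the preceding checkpoint."""
        self.working = dict(self.checkpoint)
        self.retire = dict(self.checkpoint)


maps = RenameMaps(num_arch=4)
maps.rename(1, 10)
maps.retire_instruction(1, 10)
maps.take_checkpoint()          # checkpoint sees r1 -> p10
maps.rename(2, 11)              # speculative work after the checkpoint
maps.retire_instruction(2, 11)
maps.signal_exception()         # r2 falls back to its checkpointed mapping
```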
-
Patent number: 7730285Abstract: A data processing system includes a plurality of functional units that selectively execute instructions. A register file includes a plurality of registers that store data corresponding to the instructions. A reorder buffer communicates with the register file and stores the data, includes at least one bypassable buffer location, and includes at least one non-bypassable buffer location.Type: GrantFiled: August 2, 2006Date of Patent: June 1, 2010Assignee: Marvell International Ltd.Inventors: Hong-Yi Chen, Richard Lee, Geoffrey K. Yung, Jensen Tjeng
-
Patent number: 7730282Abstract: A method and system for avoiding various hazards for instructions which are propagating through a microprocessor pipeline. When a plurality of instructions exist within the pipeline which read and write the same value, a vector is established to distinguish the older from the newer instructions. Further, before instructions are dispatched for execution, pointers are generated which identify the particular instruction which had the operand or parameter value needed. Accordingly, by monitoring both the recent vector and pointers, data dependency hazards can be avoided.Type: GrantFiled: August 11, 2004Date of Patent: June 1, 2010Assignee: International Business Machines CorporationInventors: James N. Dieffenderfer, Nathan S. Nunamaker, Sanjay B. Patel
-
Patent number: 7721071Abstract: A processor core and a method for distributive scoreboard scheduling in an out-of-order processor pipeline are described herein. In an embodiment, control logic appends operand availability bits to each instruction. The appended operand availability bits form a distributive scoreboard for each instruction. The appended operand availability bits are propagated together with the instruction through multiple stages of the processor pipeline. An instruction dispatch buffer stores the instruction and the operand availability bits. A dispatch controller determines when an instruction is to be issued. The determination is based, at least in part, on the operand availability bits stored in the instruction dispatch buffer.Type: GrantFiled: February 28, 2006Date of Patent: May 18, 2010Assignee: MIPS Technologies, Inc.Inventor: Xing Yu Jiang
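A per-instruction scoreboard of operand availability bits can be modelled in a few lines of Python. This sketch collapses the pipeline stages the bits travel through into a single buffer, and the names `BufferedInstruction` and `DispatchBuffer`, the wake-up by register number, and the oldest-ready-first issue policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class BufferedInstruction:
    name: str
    sources: Tuple[int, ...]
    available: List[bool]  # one operand availability bit per source register


class DispatchBuffer:
    def __init__(self) -> None:
        self.entries: List[BufferedInstruction] = []

    def insert(self, name: str, sources: Tuple[int, ...], ready_regs: Set[int]) -> None:
        """Append availability bits to the instruction as it enters the buffer."""
        bits = [src in ready_regs for src in sources]
        self.entries.append(BufferedInstruction(name, tuple(sources), bits))

    def broadcast_result(self, reg: int) -> None:
        """A result for `reg` was produced: set the matching bits in every entry."""
        for entry in self.entries:
            for i, src in enumerate(entry.sources):
                if src == reg:
                    entry.available[i] = True

    def issue(self) -> Optional[str]:
        """Issue the oldest instruction whose availability bits are all set."""
        for i, entry in enumerate(self.entries):
            if all(entry.available):
                del self.entries[i]
                return entry.name
        return None


buf = DispatchBuffer()
buf.insert("add", (1, 2), ready_regs={1})   # register 2 is not ready yet
buf.insert("sub", (3,), ready_regs={3})
issued_first = buf.issue()                  # "sub" issues out of order
issued_second = buf.issue()                 # nothing else is ready
buf.broadcast_result(2)
issued_third = buf.issue()                  # "add" is now ready
```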
-
Publication number: 20100122044Abstract: A parallel processing technique is described for performing parallel processing operations upon N-dimensional arrays of data elements for which a corresponding N-dimensional Scoreboard of status data is held. Hazard checking for data dependencies upon data elements within the N-dimensional array of data elements is performed by looking up the corresponding status value within the Scoreboard. The status data for a given data element within the Scoreboard is located at a position which can be derived from the position of the data elements within its N-dimensional array. Thus, a two-dimensional array of video macroblocks can have a corresponding two-dimensional Scoreboard of status data indicating whether individual macroblocks have, for example, either already been deblocked or have not already been deblocked.Type: ApplicationFiled: July 11, 2006Publication date: May 13, 2010Inventors: Simon Ford, Dominic Hugo Symes, Alastair Reid
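A two-dimensional scoreboard of the kind described here can be sketched as follows. This is an illustrative sketch only: the class name and the dependency rule (a macroblock waits for its left and upper neighbours) are assumptions, not details taken from the application.

```python
class Scoreboard2D:
    """Status grid whose entry for macroblock (x, y) lives at position (x, y)."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.done = [[False] * width for _ in range(height)]

    def is_done(self, x, y):
        # Neighbours outside the frame impose no dependency.
        if x < 0 or y < 0:
            return True
        return self.done[y][x]

    def can_process(self, x, y):
        # Assumed hazard rule: left and upper neighbours must already be deblocked.
        return self.is_done(x - 1, y) and self.is_done(x, y - 1)

    def mark_done(self, x, y):
        self.done[y][x] = True
```

Because the status position is derived directly from the macroblock position, the hazard check is a plain indexed lookup with no search.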
-
Patent number: 7716455Abstract: A high speed processor. The processor includes terminals that each execute a subset of the instruction set. In at least one of the terminals, the instructions are executed in an order determined by data flow. Instructions are loaded into the terminal in pages. A notation is made when an operand for an instruction is generated by another instruction. When operands for an instruction are available, that instruction is a “ready” instruction. A ready instruction is selected in each cycle and executed. To allow data to be transmitted between terminals, each terminal is provided with a receive station, such that data generated in one terminal may be transmitted to another terminal for use as an operand in that terminal. In one embodiment, one terminal is an arithmetic terminal, executing arithmetic operations such as addition, multiplication and division. The processor has a second terminal, which contains functional logic to execute all other instructions in the instruction set.Type: GrantFiled: December 3, 2004Date of Patent: May 11, 2010Assignee: STMicroelectronics, Inc.Inventor: Stefano Cervini
-
Patent number: 7711929Abstract: A method of tracking instruction dependency in a processor issuing instructions speculatively includes recording in an instruction dependency array (IDA) an entry for each instruction that indicates data dependencies, if any, upon other active instructions. An output vector read out from the IDA indicates data readiness based upon which instructions have previously been selected for issue. The output vector is used to select and read out issue-ready instructions from an instruction buffer.Type: GrantFiled: August 30, 2007Date of Patent: May 4, 2010Assignee: International Business Machines CorporationInventors: William E. Burky, Krishnan Kailas
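The dependency array and its output vector can be approximated with bitmasks. The sketch below is a hedged illustration: the slot numbering, the class and method names, and the rule that a dependency counts as satisfied once its producer has been selected for issue are assumptions made for the example.

```python
class DependencyArray:
    """Toy instruction dependency array; row i is a bitmask of producer slots."""

    def __init__(self, size):
        self.deps = [0] * size
        self.valid = 0    # slots holding active instructions
        self.issued = 0   # slots already selected for issue

    def record(self, slot, producer_slots):
        # Record which other active instructions this one depends on.
        mask = 0
        for producer in producer_slots:
            mask |= 1 << producer
        self.deps[slot] = mask
        self.valid |= 1 << slot
        self.issued &= ~(1 << slot)

    def ready_vector(self):
        # A slot is ready when every producer it names has been selected for issue.
        ready = 0
        waiting = self.valid & ~self.issued
        for slot, mask in enumerate(self.deps):
            if waiting & (1 << slot) and (mask & ~self.issued) == 0:
                ready |= 1 << slot
        return ready

    def select(self, slot):
        self.issued |= 1 << slot
```

The integer returned by `ready_vector` plays the role of the output vector used to pick issue-ready instructions from the instruction buffer.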
-
Patent number: 7711928Abstract: A user is provided with means to sample memory hierarchy via software. This allows a user to enhance memory-level parallelism via software. A status of information needed for execution of a second computer program instruction is read in response to execution of a first computer program instruction. Execution continues with execution of the second computer program instruction upon the status being a first status. Alternatively, a third computer program instruction is executed upon the status being a second status different from the first status. Thus, execution of the first computer program instruction allows control of the memory hierarchy, which in turn gives the user control of the memory hierarchy.Type: GrantFiled: March 16, 2005Date of Patent: May 4, 2010Assignee: Oracle America, Inc.Inventors: Marc Tremblay, Shailender Chaudhry, Quinn A. Jacobson
-
Patent number: 7711898Abstract: Embodiments of the present invention relate to a system and method for implementing functions of a register translation table of a computer processor, with reduced area requirements as compared to known arrangements. In one embodiment, an apparatus may comprise a register alias table cache to map a logical register to a physical register. The register alias table cache may have a capacity corresponding to a subset of architectural logical registers. The apparatus may further comprise store logic coupled to the cache to perform operations to save an existing content of the physical register if a cache entry corresponding to the logical register is evicted from the cache. The apparatus may also comprise load logic coupled to the cache to perform operations to load a content to the physical register and to form a new entry in the cache if a needed mapping is not present in the cache.Type: GrantFiled: December 18, 2003Date of Patent: May 4, 2010Assignee: Intel CorporationInventors: Avinash Sodani, Stephan J. Jourdan, Samie B. Samaan
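A register alias table cache with store-on-evict and load-on-miss behaviour might look like the sketch below. The least-recently-used eviction policy, the dictionary used as backing storage, and all names are assumptions for illustration; the abstract does not specify them. The sketch also assumes there are at least as many physical registers as cache entries.

```python
from collections import OrderedDict


class RatCache:
    """Toy register alias table cache covering a subset of logical registers."""

    def __init__(self, capacity, num_physical):
        self.capacity = capacity
        self.map = OrderedDict()             # logical -> physical, oldest first
        self.free = list(range(num_physical))
        self.phys = [0] * num_physical       # physical register contents
        self.backing = {}                    # saved contents of evicted registers

    def lookup(self, logical):
        if logical in self.map:
            self.map.move_to_end(logical)    # hit: refresh recency
            return self.map[logical]
        if len(self.map) >= self.capacity:
            # Store logic: save the evicted register's content before reuse.
            victim, victim_phys = self.map.popitem(last=False)
            self.backing[victim] = self.phys[victim_phys]
            self.free.append(victim_phys)
        # Load logic: restore content (0 if never saved) and form a new entry.
        physical = self.free.pop()
        self.phys[physical] = self.backing.get(logical, 0)
        self.map[logical] = physical
        return physical
```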
-
Patent number: 7707387Abstract: The use of a configuration-based execution model in conjunction with a content addressable memory (CAM) architecture provides a mechanism that enables performance of a number of computing concepts, including conditional execution (e.g., if-then statements and while loops), function calls and recursion. If-then statements and while loops are implemented by using a CAM feature that emits only complete operand sets from the CAM for processing; different seed operands are generated for different conditional evaluation results, and that seed operand is matched with computed data for an if-then branch or upon exiting a while loop. As a result, downstream operators retrieve only completed operands. Function calls and recursion are handled by using a return tag as an operand along with function parameter data into the input tag space of a function. A recursive function is split into two halves, a pre-recursive half and a post-recursive half that executes after pre-recursive calls.Type: GrantFiled: June 1, 2005Date of Patent: April 27, 2010Assignee: Microsoft CorporationInventor: Ray A. Bittner, Jr.
-
Publication number: 20100058035Abstract: A method for double-issue complex instructions receives a complex instruction comprising a first portion and a second portion. The method sets a single issue queue slot and allocates an execution unit for the complex instruction, and identifies dependencies in the first and second portions. The method sets a dependency matrix slot and a consumers table slot for the first and second portions. In the event the first portion dependencies have been satisfied, the method issues the first portion and then issues the second portion from the single issue queue slot. In the event the second portion dependencies have not been satisfied, the method cancels the second portion issue.Type: ApplicationFiled: August 28, 2008Publication date: March 4, 2010Applicant: International Business Machines CorporationInventors: Christopher M. Abernathy, Mary D. Brown, Todd A. Venton
-
Method and apparatus for back to back issue of dependent instructions in an out of order issue queue
Patent number: 7669038Abstract: A method is provided for evaluating two or more instructions in an out of order issue queue during a particular cycle of the queue, to select an instruction for issue during the next following cycle. If an instruction was previously designated to issue during the particular cycle, one or more instructions in the queue are evaluated to determine if any of them are dependent on the designated instruction. For the evaluation, each instruction placed into the queue is accompanied by corresponding logic elements that provide destination to source compares for the instruction. In an embodiment comprising a method, the oldest ready instruction in the queue during a particular cycle is identified.Type: GrantFiled: May 2, 2008Date of Patent: February 23, 2010Assignee: International Business Machines CorporationInventors: William Elton Burky, Raymond Cheung Yeung
-
Patent number: 7669039Abstract: Intermediate results are passed between constituent instructions of an expanded instruction using register renaming resources and control logic. A first constituent instruction generates intermediate results and is assigned a PRN in a constituent instruction rename table, and writes intermediate results to the identified physical register. A second constituent instruction performs a look up in the constituent instruction rename table and reads the intermediate results from the physical register. Constituent instruction rename logic tracks the constituent instructions through the pipeline, and deletes the constituent instruction rename table entry and returns the PRN to a free list when the second constituent instruction has read the intermediate results.Type: GrantFiled: January 24, 2007Date of Patent: February 23, 2010Assignee: QUALCOMM IncorporatedInventors: Michael Scott McIlvaine, James Norris Dieffenderfer, Nathan Samuel Nunamaker, Thomas Andrew Sartorius, Rodney Wayne Smith
-
Patent number: 7660971Abstract: A method for dependency tracking and flush recovery for an out-of-order processor includes recording, in a last definition (DEF) data structure, an identifier of a first instruction as the most recent instruction in an instruction sequence that defines contents of the particular logical register and recording, in a next DEF data structure, the identifier of the first instruction in association with an identifier of a previous second instruction also indicating an update to the particular logical register. In addition, a recovery array is updated to indicate which of the instructions in the instruction sequence updates each of the plurality of logical registers. In response to misspeculation during execution of the instruction sequence, the processor performs a recovery operation to place the identifier of the second instruction in the last DEF data structure by reference to the next DEF data structure and the recovery array.Type: GrantFiled: February 1, 2007Date of Patent: February 9, 2010Assignee: International Business Machines CorporationInventors: Vikas Agarwal, William E. Burky, Krishnan Kailas, Balaram Sinharoy
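The interplay of the last DEF structure, the next DEF structure, and the recovery array can be sketched in Python. This is a simplified illustration under stated assumptions: instruction identifiers increase in program order, every instruction writes exactly one register, and the names `DefTracker`, `dispatch`, and `flush` are hypothetical.

```python
class DefTracker:
    """Toy dependency tracker with flush recovery."""

    def __init__(self):
        self.last_def = {}   # logical register -> id of its most recent definer
        self.next_def = {}   # instruction id -> id of the previous definer of its register
        self.recovery = {}   # instruction id -> logical register it updates

    def dispatch(self, instr_id, dest_reg):
        # Link the new definer to the one it supersedes, then make it the latest.
        self.next_def[instr_id] = self.last_def.get(dest_reg)
        self.last_def[dest_reg] = instr_id
        self.recovery[instr_id] = dest_reg

    def flush(self, first_bad_id):
        # Undo misspeculated instructions youngest first, restoring each
        # register's last definer from the link recorded at dispatch.
        flushed = sorted((i for i in self.recovery if i >= first_bad_id), reverse=True)
        for instr_id in flushed:
            reg = self.recovery.pop(instr_id)
            previous = self.next_def.pop(instr_id)
            if previous is None:
                self.last_def.pop(reg, None)
            else:
                self.last_def[reg] = previous
```

Walking the flushed instructions from youngest to oldest means the last write to `last_def` for each register comes from the oldest flushed definer, which points at the surviving one.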
-
Patent number: 7660970Abstract: Disclosed is a data processing system and method. The data processing method determines the number of static registers and the number of rotating registers for assigning a register to a variable contained in a certain program, assigns the register to the variable based on the number of the static registers and the number of the rotating registers, and compiles the program. Further, the method stores in a special register a value corresponding to the number of the rotating registers determined in the compiling operation, and obtains a physical address from a logical address of the register based on the value. Accordingly, the present invention uses register files efficiently by dynamically controlling the number of rotating registers and the number of static registers for a software pipelined loop, and minimizes the generation of unnecessary spill/fill code during program execution.Type: GrantFiled: August 21, 2006Date of Patent: February 9, 2010Assignee: Samsung Electronics Co., Ltd.Inventors: Suk-jin Kim, Jeong-wook Kim, Hong-seok Kim, Soo-jung Ryu
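One way a logical register number could be translated to a physical one from a configurable rotating-register count is sketched below. The layout (static registers in the low indices, rotating registers above them) and the `rotation_base` parameter are assumptions for illustration; the abstract states only that the physical address is derived from the logical address and the stored value.

```python
def physical_register(logical, total_regs, num_rotating, rotation_base):
    """Map a logical register index to a physical index (hypothetical layout).

    num_rotating stands in for the value held in the special register;
    rotation_base is an assumed per-iteration offset of the pipelined loop.
    """
    num_static = total_regs - num_rotating
    if logical < num_static:
        # Static registers map one to one.
        return logical
    # Rotating registers wrap around within their own region.
    offset = (logical - num_static + rotation_base) % num_rotating
    return num_static + offset
```

With 16 registers, changing `num_rotating` from 8 to 4 moves the static/rotating boundary from index 8 to index 12 without changing the function, which mirrors the dynamic split the abstract describes.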
-
Patent number: 7653906Abstract: Activities may be delayed from being dispatched until another activity is ready to be dispatched. Dispatching more than one activity increases the overlap in execution time of activities. By delaying the dispatch of the activities, power consumption and thermal dissipation on a multi-threading processor may be reduced.Type: GrantFiled: October 23, 2002Date of Patent: January 26, 2010Assignee: Intel CorporationInventors: Yen-Kuang Chen, Ishmael F. Santos
-
Patent number: 7650485Abstract: A multithreading processor achieves a very large lookahead instruction window by allowing non-sequential fetch and processing of the dynamic instruction stream. A speculative thread is spawned at a specified point in the dynamic instruction stream and the instructions subsequent to the specified point are speculatively executed so that these instructions are fetched and issued out of sequential order. Very minimal modifications to existing processor design of a multithreading processor are required to achieve the very large lookahead instruction window. The modifications include changes to the control logic of the issue unit and only three additional bits in the register scoreboard.Type: GrantFiled: April 10, 2007Date of Patent: January 19, 2010Assignee: Sun Microsystems, Inc.Inventor: Yuan C. Chou
-
Publication number: 20090327662Abstract: A scoreboard for a video processor may keep track of only dispatched threads which have not yet completed execution. A first thread may itself snoop for execution of a second thread that must be executed before the first thread's execution. Thread execution may be freely reordered, subject only to the rule that a second thread, whose execution is dependent on execution of a first thread, can only be executed after the first thread.Type: ApplicationFiled: June 30, 2008Publication date: December 31, 2009Inventors: Hong Jiang, James M. Holland, Prasoonkumar Surti
-
Publication number: 20090327661Abstract: Methods and apparatus relating to mechanisms to handle free physical register identifiers for SMT (Simultaneous Multi-Threading) out-of-order processors are described. In some embodiments, a physical register file stores both speculative data and architectural data corresponding to a plurality of registers. A free list logic may maintain free physical register identifiers corresponding to the plurality of registers. An instruction may read the architectural data from the physical register file at dispatch. Other embodiments are also described and claimed.Type: ApplicationFiled: June 30, 2008Publication date: December 31, 2009Inventors: Zeev Sperber, David J. Sager, Fernando Latorre, Ori Lempel, Evgeni Krimer, Bishara Shomar