Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)
  • Patent number: 9268569
    Abstract: A method for suppressing branch misprediction behavior is contemplated in which a conditional branch instruction that would cause the flow of control to branch around instructions in response to a determination that a predicate vector is null is predicted not taken. However, in response to detecting that the prediction is incorrect, misprediction behavior is inhibited.
    Type: Grant
    Filed: February 24, 2012
    Date of Patent: February 23, 2016
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 9256433
    Abstract: Systems and methods for move operation elimination with bypass Multiple Instantiation Table (MIT) logic. An example processing system may comprise a first data structure configured to store a plurality of physical register values; a second data structure configured to store a plurality of pointers, each pointer referencing an element of the first data structure; a third data structure including a plurality of move elimination sets, each move elimination set comprising a plurality of bits representing a plurality of logical registers; and a logic configured to perform a data manipulation operation by causing an element of the second data structure to reference an element of the first data structure, the logic further configured to reflect results of two or more data manipulation operations by performing a single update of the third data structure.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: February 9, 2016
    Assignee: Intel Corporation
    Inventor: Jeremy Anderson
  • Patent number: 9250900
    Abstract: Methods and systems for implementing a microprocessor with a selective register file bypass network are disclosed. Late bypasses are removed from a register file bypass network of a microprocessor design. One or more late bypasses are then added back to the register file bypass network based at least in part upon the results of analyzing a plurality of instructions that are to be processed in an instruction pipeline of the microprocessor. An electronic design for at least the register file bypass network is then generated with these one or more late bypasses that are added to the register file bypass network. Without incurring additional hardware or cost for the microprocessor design, one or more bypasses in the register file bypass network may be optionally shared among multiple free-riders, and an entire port stage may also be optionally bypassed to another port stage based upon one or more criteria.
    Type: Grant
    Filed: October 1, 2014
    Date of Patent: February 2, 2016
    Assignee: Cadence Design Systems, Inc.
    Inventors: James Sangkyu Kim, Fei Sun, Kyle Satoshi Tsukamoto
  • Patent number: 9223701
    Abstract: A data processing apparatus is provided in which a processor unit accesses data values stored in a memory and a cache stores local copies of a subset of the data values. The cache maintains a status value for each local copy stored in the cache. When the processor unit executes a load-exclusive operation, a first data value is loaded from a specified memory location and an exclusive use monitor begins monitoring the specified memory location for accesses. When the processor unit executes a store-exclusive operation, a second data value is stored to the specified memory location if the exclusive use monitor indicates that the first data value has not been modified since the load-exclusive operation was executed.
    Type: Grant
    Filed: April 12, 2013
    Date of Patent: December 29, 2015
    Assignee: ARM Limited
    Inventors: Frederic Claude Marie Piry, Philippe Jean-Pierre Raphalen, Melanie Emanuelle Lucie Teyssier, Albin Pierick Tonnerre
  • Patent number: 9201944
    Abstract: Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operation. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.
    Type: Grant
    Filed: June 12, 2013
    Date of Patent: December 1, 2015
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Lawrence J. Ellison, Amit Ganesh, Vineet Marwah, Jesse Kamp, Anindya C. Patthak, Shasank K. Chavan, Michael J. Gleeson, Allison L. Holloway, Manosiz Bhattacharyya
  • Patent number: 9195501
    Abstract: Aspects of the disclosure are directed to a method of processing data with a graphics processing unit (GPU). According to some aspects, the method includes executing a first work item with a shader processor of the GPU, wherein the first work item includes one or more instructions for processing input data. The method also includes generating one or more values based on a result of the first work item, wherein the one or more values represent one or more characteristics of the result. The method also includes determining whether to execute a second work item based on the one or more values, wherein the second work item includes one or more instructions that are distinct from the one or more instructions of the first work item for processing the input data.
    Type: Grant
    Filed: July 12, 2011
    Date of Patent: November 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventor: Jukka-Pekka Arvo
  • Patent number: 9170818
    Abstract: A data processing device maintains register map information that maps accesses to architectural registers, as identified by instructions being executed, to physical registers of the data processing device. In response to determining that an instruction, such as a speculatively-executing conditional branch, indicates a checkpoint, the data processing device stores the register map information for subsequent retrieval depending on the resolution of the instruction. In addition, in response to the checkpoint indication the data processing device generates new register map information such that accesses to the architectural registers are mapped to different physical registers. The data processing device maintains a list, referred to as a free register list, of physical registers available to be mapped to an architectural registers.
    Type: Grant
    Filed: April 26, 2011
    Date of Patent: October 27, 2015
    Assignee: Freescale Semiconductor, Inc.
    Inventor: Thang M. Tran
  • Patent number: 9164763
    Abstract: An information processing apparatus includes an instruction supplying section that supplies a plurality of instructions as a single instruction group, an executing section that repetitively executes a plurality of execution processes corresponding to the plurality of instructions in parallel, an issue timing control section that controls an issue timing of each of the instructions to the executing section so that the plurality of execution processes are executed with a timing delayed in accordance with a predetermined latency, and an operand transforming section that transforms an operand register address of each of the instructions in accordance with a predetermined increment value upon every repetition of execution in the executing section.
    Type: Grant
    Filed: August 24, 2010
    Date of Patent: October 20, 2015
    Assignee: SONY CORPORATION
    Inventors: Satoshi Takashima, Hirokazu Hanaki
  • Patent number: 9158541
    Abstract: A processor may include a physical register file and a register renamer. The register renamer may be organized into even and odd banks of entries, where each entry stores an identifier of a physical register. The register renamer may be indexed by a register number of an architected register, such that the renamer maps a particular architected register to a corresponding physical register. Individual entries of the renamer may correspond to architected register aliases of a given size. Renaming aliases that are larger than the given size may involve accessing multiple entries of the renamer, while renaming aliases that are smaller than the given size may involve accessing a single renamer entry.
    Type: Grant
    Filed: November 3, 2010
    Date of Patent: October 13, 2015
    Assignee: Apple Inc.
    Inventor: Wei-Han Lien
  • Patent number: 9069546
    Abstract: A computer system assigns a particular counter from among a plurality of counters currently in a counter free pool to count a number of mappings of logical registers from among a plurality of logical registers to a particular physical register from among a plurality of physical registers, responsive to an execution of an instruction by a mapper unit mapping at least one logical register from among the plurality of logical registers to the particular physical register, wherein the number of the plurality of counters is less than a number of the plurality of physical registers. The computer system, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
    Type: Grant
    Filed: April 18, 2012
    Date of Patent: June 30, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gregory W. Alexander, Brian D. Barrick, John W. Ward, III
  • Patent number: 9069563
    Abstract: A technique for reducing store-hit-loads in an out-of-order processor includes storing a store address of a store instruction associated with a store-hit-load (SHL) pipeline flush in an SHL entry. In response to detecting another SHL pipeline flush for the store address, a current count associated with the SHL entry is updated. In response to the current count associated with the SHL entry reaching a first terminal count, a dependency for the store instruction is created such that execution of a younger load instruction with a load address that overlaps the store address stalls until the store instruction executes.
    Type: Grant
    Filed: September 16, 2011
    Date of Patent: June 30, 2015
    Assignee: International Business Machines Corporation
    Inventors: Brian R. Konigsburg, David S. Levitan, Brian R. Mestan, David Mui
  • Publication number: 20150127926
    Abstract: A processor instruction scheduler comprising an optimization engine which uses an optimization model for a processor architecture with: means to generate an optimization model for the optimization engine from a design of a processor and data representing optimization goals and constraints and a code stream, wherein the processor has at least two execution pipes and at least two registers, and wherein the code stream comprises processor instructions with corresponding register selections; and reordering means to generate an optimized code stream from the code stream with the optimal solution provided by the optimization engine for the optimization model by reordering the code stream, such that optimum values for the optimization goals under the given constraints are achieved without affecting the operation results of the code stream.
    Type: Application
    Filed: January 8, 2015
    Publication date: May 7, 2015
    Inventors: Juergen KOEHL, Jens LEENSTRA, Philipp PANITZ, Hans SCHLENKER
  • Publication number: 20150121040
    Abstract: Methods, devices, and systems for accessing packed registers are presented. A state of the packed registers may be tracked and it may be determined whether the register is directly accessible based on the state. If the register is not directly accessible, an action may be performed which allows the register to be accessed directly. The action may include injecting at least one uop for reorganizing the physical storage of the register such that it is directly accessible. The action may include aligning the data with the least significant bit of a physical register or otherwise aligning the data with the datapath. The action may also include changing the state of the packed registers.
    Type: Application
    Filed: October 24, 2014
    Publication date: April 30, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Robert E. Weidner, Jay E. Fleischman, Michael C. Sedmak, Michael Estlick, Richard McGowen, II, Emil Talpes
  • Publication number: 20150121041
    Abstract: Described herein are methods and processors for flag renaming in groups to eliminate dependencies of instructions. Decoder and execution units in the processor may be configured to rename flags into groups that allow each group to be treated separately as appropriate. This flag renaming eliminates flag dependencies with respect to instructions. This allows an instruction to write exactly the flags that the instruction wants without having to create merge dependencies. Methods and processors are provided for handling immediate values embedded in instructions. A 16 bit immediate bus and a 4 bit encoding/control bus are added at the interface between decode and execution units. For an 8 or 12 bit immediate, the upper 4 bits of the immediate bus contain the encoding bits. For a 16 bit immediate, the encoding/control bus contains the encoding bits. The encoding/control bus indicates when to look at the top four bits of the immediate bus.
    Type: Application
    Filed: October 24, 2014
    Publication date: April 30, 2015
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ashok Venkatachar, Karthik Punukollu, Srikanth Arekapudi, Samir A. Chitnis, Emil Talpes
  • Patent number: 9015450
    Abstract: Embodiments of a processor architecture efficiently implement shadow registers in hardware. A register system in a processor includes a set of physical data registers coupled to register renaming logic. The register renaming logic stores data in and retrieves data from the set of physical registers when the processor is in a first processor state. The register renaming logic identifies ones of the set of physical registers that have a first operational state as a first group of registers and identifies the remaining ones of the set of physical registers as a second group of registers in response to an indication that the processor is to enter a second processor state from the first processor state. The register renaming logic stores data in and retrieves data from the second group of registers but not the first group of registers when the processor is in the second processor state.
    Type: Grant
    Filed: January 20, 2010
    Date of Patent: April 21, 2015
    Assignee: STMicroelectronics (Beijing) R&D Co. Ltd.
    Inventors: Hong-Xia Sun, Peng Fei Zhu, Yong Qiang Wu
  • Publication number: 20150089200
    Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
    Type: Application
    Filed: December 5, 2014
    Publication date: March 26, 2015
    Applicant: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
  • Publication number: 20150089199
    Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
    Type: Application
    Filed: December 5, 2014
    Publication date: March 26, 2015
    Applicant: INTEL CORPORATION
    Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
  • Publication number: 20150089201
    Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
    Type: Application
    Filed: December 5, 2014
    Publication date: March 26, 2015
    Applicant: INTEL CORPORATION
    Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
  • Patent number: 8990544
    Abstract: A method and apparatus are described for using a previous column pointer to read a subset of entries of an array in a processor. The array may have a plurality of rows and columns of entries, and each entry in the subset may reside on a different row of the array. A previous column pointer may be generated for each of the rows of the array based on a plurality of bits indicating the number of valid entries in the subset to be read, the previous column pointer indicating whether each entry is in a current column or a previous column. The entries in the subset may be read and re-ordered, and invalid entries in the subset may be replaced with nulls. The valid entries and nulls may then be outputted.
    Type: Grant
    Filed: December 21, 2011
    Date of Patent: March 24, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Srikanth Arekapudi, Shloke Hajela
  • Patent number: 8972701
    Abstract: A data processing system is provided in which destination operands to be stored within architectural registers are constrained to have zero values added as prefixes in order that the architectural register value has a fixed bit width irrespective of the bit width of the destination operand being written thereto. Instead of adding these zero values everywhere in the data path, they are instead represented by zero flags in at least the physical registers utilized for register renaming operations and in the result queue prior to results being written to the architectural register file. This saves circuitry resources and reduces energy consumption.
    Type: Grant
    Filed: December 6, 2011
    Date of Patent: March 3, 2015
    Assignee: ARM Limited
    Inventors: James Nolan Hardage, Glen Andrew Harris, Mark Carpenter Glass
  • Patent number: 8972700
    Abstract: An instruction unit provides instructions for execution by a processor. A decode unit decodes instructions received from the instruction unit. Queues are coupled to receive instructions from the decode unit. Each instruction in a same queue is executed in order by a corresponding execution unit. An arbiter is coupled to each queue and to the execution unit that executes instructions of a first instruction type. The arbiter selects a next instruction of the first instruction type from a bottom entry of the queue for execution by the first execution unit.
    Type: Grant
    Filed: February 28, 2011
    Date of Patent: March 3, 2015
    Assignee: Freescale Semiconductor, Inc.
    Inventor: Thang M. Tran
  • Patent number: 8966230
    Abstract: Methods and apparatus relating to dynamic selection of execution stage are described. In some embodiments, logic may determine whether to execute an instruction at one of a plurality of stages in a processor. In some embodiments, the plurality of stages are to at least correspond to an address generation stage or an execution stage of the instruction. Other embodiments are also described and claimed.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: February 24, 2015
    Assignee: Intel Corporation
    Inventors: Deepak Limaye, Kulin N. Kothari, James D. Allen, James E. Phillips
  • Publication number: 20150019843
    Abstract: A method and apparatus for allowing an out-of-order processor to reuse an in-use physical register is disclosed herein. The method and apparatus uses identifiers, such as tokens and/or other identifiers in a rename map table (RMT) and a physical register file (PRF), to indicate whether an instruction result is allowed or disallowed to be written into a physical register.
    Type: Application
    Filed: November 27, 2013
    Publication date: January 15, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Anil KRISHNA, Sandeep S. NAVADA, Niket K. CHOUDHARY, Michael Scott MCILVAINE, Thomas Andrew SARTORIUS, Rodney Wayne SMITH, Kenneth Alan DOCKSER
  • Patent number: 8933953
    Abstract: A scoreboard for a video processor may keep track of only dispatched threads which have not yet completed execution. A first thread may itself snoop for execution of a second thread that must be executed before the first thread's execution. Thread execution may be freely reordered, subject only to the rule that a second thread, whose execution is dependent on execution of a first thread, can only be executed after the first thread.
    Type: Grant
    Filed: June 30, 2008
    Date of Patent: January 13, 2015
    Assignee: Intel Corporation
    Inventors: Hong Jiang, James M. Holland, Prasoonkumar Surti
  • Patent number: 8930679
    Abstract: An out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The microprocessor includes a queue of entries, each entry configured to hold information that identifies sources of a store instruction used to compute its store address and to hold a dependency that identifies an instruction upon which the store instruction depends for its data. A register alias table (RAT), coupled to the queue of entries, is configured to encounter instructions in program order and to generate dependencies used to determine when the instructions may execute out of program order. In response to encountering a load instruction the RAT determines whether sources of the load instruction used to compute its load address match the sources of the store instruction in an entry of the queue, and if so, causes the load instruction to share the dependency of the matching store instruction.
    Type: Grant
    Filed: October 23, 2009
    Date of Patent: January 6, 2015
    Assignee: Via Technologies, Inc.
    Inventors: Matthew Daniel Day, Rodney E. Hooker
  • Publication number: 20140380024
    Abstract: A method includes suppressing execution of at least one dependent instruction of a load instruction by a processor using stored dependency information responsive to an invalid status of the load instruction. A processor includes an execution unit to execute instructions and a scheduler. The scheduler is to select for execution in the execution unit a load instruction having at least one dependent instruction and suppress execution of the at least one dependent instruction using stored dependency information responsive to an invalid status of the load instruction.
    Type: Application
    Filed: June 25, 2013
    Publication date: December 25, 2014
    Inventors: Francesco Spadini, Michael Achenbach
  • Patent number: 8914615
    Abstract: A processor core supports execution of program instruction from both a first instruction set and a second instruction set. An architectural register file 18 containing architectural registers is shared by the two instruction sets. The two instruction sets employ logical register specifiers which for at least some values of those logical registers specifiers correspond to different architectural registers within the architectural register file 18. A first decoder 4 for the first instruction set and a second decoder 6 for the second instruction set serve to decode the logical register specifiers to a common register addressing format. This common register addressing format is used to supply register specifiers to renaming circuitry 10 for supporting register renaming in conjunction with a physical register file 16 and an architectural register file 18.
    Type: Grant
    Filed: December 2, 2011
    Date of Patent: December 16, 2014
    Assignee: ARM Limited
    Inventors: Glen Andrew Harris, James Nolan Hardage, Mark Carpenter Glass
  • Patent number: 8914617
    Abstract: Methods and apparatus relating to a hardware move elimination and/or next page prefetching are described. In some embodiments, a logic may provide hardware move eliminations based on stored data. In an embodiment, a next page prefetcher is disclosed. Other embodiments are also described and claimed.
    Type: Grant
    Filed: December 24, 2010
    Date of Patent: December 16, 2014
    Assignee: Intel Corporation
    Inventors: Shlomo Raikin, David J. Sager, Zeev Sperber, Evgeni Krimer, Ori Lempel, Stanislav Shwartsman, Adi Yoaz, Omer Golz
  • Patent number: 8914616
    Abstract: A data processing apparatus and method are provided. A processor performs data processing operations in response to data processing instructions which reference logical registers. A set of physical registers stores data values which are subjected to the data processing operations. A tag storage stores for each physical register a tag value indicative of one of the logical registers. The processor references the tag storage to perform the data processing operations. A tag value exchanger performs a tag switch exchanging two tag values in the tag storage when the processor executes a predetermined instruction which references two logical registers and for which a choice of which two physical registers are mapped to which of the two logical registers will have no effect on an outcome of the data processing operations. The tag value exchanger performs the tag switch with respect to the tag values indicative of the two logical registers.
    Type: Grant
    Filed: December 2, 2011
    Date of Patent: December 16, 2014
    Assignee: ARM Limited
    Inventor: Simon John Craske
  • Publication number: 20140344554
    Abstract: A dependency reordering method. The method includes accessing an input sequence of instructions, initializing three registers, and loading instruction numbers into a first register. The method further includes loading destination register numbers into a second register, broadcasting values from the first register to a position in a third register in accordance with a position number in the second register, overwriting positions in the third register in accordance with position numbers in the second register, and using information in the third register to populate a dependency matrix for grouping dependent instructions from the sequence of instructions.
    Type: Application
    Filed: November 22, 2011
    Publication date: November 20, 2014
    Inventor: Mohammad Abdallah
  • Publication number: 20140325188
    Abstract: A method for reducing a pipeline stall in a multi-pipelined processor includes finding a store instruction having a same target address as a load instruction and having a store value of the store instruction not yet written according to the store instruction, when the store instruction is being concurrently processed in a different pipeline than the load instruction and the store instruction occurs before the load instruction in a program order. The method also includes associating a target rename register of the load instruction as well as the load instruction with the store instruction, responsive to the finding step. The method further includes writing the store value of the store instruction to the target rename register of the load instruction and finishing the load instruction without reissuing the load instruction, responsive to writing the store value of the store instruction according to the store instruction to finish the store instruction.
    Type: Application
    Filed: April 24, 2013
    Publication date: October 30, 2014
    Applicant: International Business Machines Corporation
    Inventor: TAKESHI OGASAWARA
  • Publication number: 20140304492
    Abstract: A microprocessor implemented method for resolving dependencies for a load instruction in a load store queue (LSQ) is disclosed. The method comprises initiating a computation of a virtual address corresponding to the load instruction in a first clock cycle. It also comprises transmitting early calculated lower address bits of the virtual address to a load store queue (LSQ) in the same cycle as the initiating. Finally, it comprises performing a partial match in the LSQ responsive to and using the lower address bits to find a prior aliasing store, wherein the prior aliasing store stores to a same address as the load instruction.
    Type: Application
    Filed: May 19, 2014
    Publication date: October 9, 2014
    Applicant: Soft Machines, Inc.
    Inventors: Mohammad A. ABDALLAH, Ravishankar RAO
  • Publication number: 20140289501
    Abstract: Register renaming circuitry for a processing apparatus configured to process a stream of instructions from an instruction set specifying registers from an architectural set of registers. The apparatus including a physical set of registers configured to store data values being processed by the processing apparatus. Register renaming circuitry is configured to receive a stream of operations from an instruction decoder and to map registers that are to be written to by the stream of operations to physical registers within the physical set of registers that are currently available. The register renaming circuitry comprises register release circuitry configured to release the physical registers that have been mapped to the registers when a first set of conditions have been met, and to release the physical registers that have been mapped to the additional registers when a second set of conditions have been met.
    Type: Application
    Filed: March 20, 2013
    Publication date: September 25, 2014
    Inventors: Guillaume SCHON, Cedric Denis Robert AIRAUD, Frederic Jean Denis ARSANTO, Luca SCALABRINO
  • Publication number: 20140281415
    Abstract: Reconfiguring a register file using a rename table having a plurality of fields that indicate fracture information about a source register of an instruction for instructions which have narrow to wide dependencies.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Inventors: Bradley Gene BURGESS, Ashraf AHMED, Ravi IYENGAR
  • Publication number: 20140281393
    Abstract: Out-of-order CPUs, devices and methods diminish the time penalty from stalling the pipe to rebuild a rename table, such as due to a misprediction. A microprocessor can include a pipe that has a decoder, a dispatcher, and at least one execution unit. A rename table stores rename data, and a check-point table (“CPT”) stores rename data received from the dispatcher. A Re-Order Buffer (“ROB”) stores ROB data, and has a static mapping relationship with the CPT. If the rename table is flushed, such as due to a misprediction, the rename table is rebuilt at least in part by concurrent copying of rename data stored in the CPT, in coordination with walking the ROB.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Inventors: Ravi Iyengar, Prarthna Santhanakrishnan
  • Publication number: 20140281414
    Abstract: Out-of-order CPUs, devices and methods diminish the time penalty from stalling the pipe to rebuild a rename table, such as due to a misprediction. A microprocessor can include a pipe that has a decoder, a dispatcher, and at least one execution unit. A rename table stores rename data, and a check-point table (“CPT”) stores rename data received from the dispatcher. A Re-Order Buffer (“ROB”) stores ROB data, and has a dynamic mapping relationship with the CPT. If the rename table is flushed, such as due to a misprediction, the rename table is rebuilt at least in part by concurrent copying of rename data stored in the CPT, in coordination with walking the ROB.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Inventors: Ravi Iyengar, Prarthna Santhanakrishnan
  • Publication number: 20140281413
    Abstract: Methods and systems that allow the processor to effectively and efficiently reduce or eliminate the latency associated with instructions that copy the value of one register to another register. A processor includes a superforwarding table, a superforwarding logic block, and a computation engine. The superforwarding table stores an entry, wherein the entry has a valid bit, a key field, and a forward field. The superforwarding logic block determines which register contains the information needed for an instruction. The computation engine executes instructions.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Applicant: MIPS Technologies, Inc.
    Inventors: Qian WANG, Ranganathan Sudhakar
  • Publication number: 20140258687
    Abstract: A method and apparatus for register packing prior to register renaming in a microprocessor are provided. The method includes: receiving a plurality of micro operations (micro-ops) decoded from one or more instructions; packing a plurality of registers which are included in the micro-ops into a packed register structure including a plurality of packed registers based on a preset number of rename ports of a renamer through which the packed registers are read or written for register renaming; and sending the packed registers for register renaming.
    Type: Application
    Filed: March 8, 2013
    Publication date: September 11, 2014
    Inventors: Teik-Chung TAN, Bradley Gene BURGESS, Ravi IYENGAR
  • Publication number: 20140244978
    Abstract: The present invention provides a method and apparatus for checkpointing registers for transactional memory. Some embodiments of the apparatus include first rename logic configured to map up to a predetermined number of architectural registers to corresponding first physical registers that hold first values associated with the architectural registers. The mapping is responsive to a transaction modifying one or more of the first values associated with the architectural registers. Some embodiments of the apparatus also include microcode configured to write contents of the first physical registers to a memory in response to the transaction modifying first values associated with a number of the architectural registers that is larger than the predetermined number.
    Type: Application
    Filed: February 28, 2013
    Publication date: August 28, 2014
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventor: John M. King
  • Patent number: 8799626
    Abstract: A segmental allocation method of expanding RISC processor register includes the steps of a) setting an instruction format of the RISC processor, the destination register field being set having 6 bits to correspond to 64 registers and at least one source register field having at least 4 bits to correspond to at least 16 registers; b) providing two solutions to the problem resulting from that the instruction format in the step a) goes beyond range under some circumstances; and c) setting a register segment allocation algorithm having the steps of c1) providing and grouping a plurality of pseudo registers; c2) prioritizing the pseudo registers in each of the groups; c3) combining the groups pursuant to the priorities thereof; and c4) locating the physical register of lowest computational cost.
    Type: Grant
    Filed: September 9, 2011
    Date of Patent: August 5, 2014
    Assignee: National Chung Cheng University
    Inventors: Rong-Guey Chang, Yuan-Shin Hwang, Chia-Hsien Su
  • Publication number: 20140164742
    Abstract: An apparatus and method are provided for performing register renaming. Available register identifying circuitry is provided to identify which physical registers form a pool of physical registers available to be mapped by register renaming circuitry to an architectural register specified by an instruction to be executed. Configuration data whose value is modified during operation of the processing circuitry is stored such that, when the configuration data has a first value, the configuration data identifies at least one architectural register of the architectural register set which does not require mapping to a physical register by the register renaming circuitry. The register identifying circuitry is arranged to reference the modified data value, such that when the configuration data has the first value, the number of physical registers in the pool is increased due to the reduction in the number of architectural registers which require mapping to physical registers.
    Type: Application
    Filed: June 26, 2013
    Publication date: June 12, 2014
    Inventors: Frederic Claude Marie PIRY, Louis-Marie Vincent MOUTON, Luca SCALABRINO, Richard Roy GRISENTHWAITE, David Hennah MANSELL
  • Patent number: 8725989
    Abstract: In one embodiment, a processor can perform a function call from a main program to a function that is to operate on at least one vector-type operand, in which only scalar values are passed to the function, and input values to the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of a vector register file, and output values from the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of the vector register file. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 9, 2010
    Date of Patent: May 13, 2014
    Assignee: Intel Corporation
    Inventor: Tomasz Madajczak
  • Publication number: 20140122837
    Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.
    Type: Application
    Filed: October 28, 2013
    Publication date: May 1, 2014
    Applicant: STMicroelectronics (Beijing) R&D Co. Ltd.
    Inventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
  • Publication number: 20140101415
    Abstract: A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data.
    Type: Application
    Filed: December 10, 2013
    Publication date: April 10, 2014
    Applicant: Micron Technology, Inc.
    Inventors: Neal Andrew Crook, Alan T. Wootton, James Peterson
  • Patent number: 8683180
    Abstract: A method, processor, and computer program product employing an intermediate register mapper within a register renaming mechanism. A logical register lookup determines whether a hit to a logical register associated with the dispatched instruction has occurred. In this regard, the logical register lookup searches within at least one register mapper from a group of register mappers, including an architected register mapper, a unified main mapper, and an intermediate register mapper. A single hit to the logical register is selected among the group of register mappers. If an instruction having a mapper entry in the unified main mapper has finished but has not completed, the mapping contents of the register mapper entry in the unified main mapper are moved to the intermediate register mapper, and the unified register mapper entry is released, thus increasing a number of unified main mapper entries available for reuse.
    Type: Grant
    Filed: October 13, 2009
    Date of Patent: March 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, Michael Billeci, Lee E. Eisen
  • Patent number: 8661228
    Abstract: A processor includes an instruction fetch unit, an issue queue coupled to the instruction fetch unit, an execution unit coupled to the issue queue, and a multi-level register file including a first level register file having lower access latency and a second level register file having higher access latency. Each of the first and second level register files includes a plurality of physical registers for holding operands that is concurrently shared by a plurality of threads. The processor further includes a mapper that, at dispatch of an instruction specifying a source logical register from the instruction fetch unit to the issue queue, initiates a swap of a first operand associated with the source logical register that is in the second level register file with a second operand held in the first level register file. The issue queue, following the swap, issues the instruction to the execution unit for execution.
    Type: Grant
    Filed: April 16, 2012
    Date of Patent: February 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Hung Q. Le, Dung Q. Nguyen
  • Patent number: 8661230
    Abstract: A mapper unit of an out-of-order processor assigns a particular counter currently in a counter free pool to count a number of mappings of logical registers to a particular physical register from among multiple physical registers, responsive to an execution of an instruction by the mapper unit mapping at least one logical register to the particular physical register. The number of counters is less than the number of physical registers. The mapper unit, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
    Type: Grant
    Filed: April 15, 2011
    Date of Patent: February 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Gregory W. Alexander, Brian D. Barrick, John W. Ward, III
  • Publication number: 20140047219
    Abstract: A multi-level register hierarchy is disclosed comprising a first level pool of registers for caching registers of a second level pool of registers in a system wherein programs can dynamically release and re-enable architected registers such that released architected registers need not be maintained by the processor, the processor accessing operands from the first level pool of registers, wherein a last-use instruction is identified as having a last use of an architected register before being released, the last-use architected register being released causes the multi-level register hierarchy to discard any correspondence of an entry to said last use architected register.
    Type: Application
    Filed: October 18, 2013
    Publication date: February 13, 2014
    Applicant: International Business Machines Corporation
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Publication number: 20140047218
    Abstract: Multi-stage register renaming using dependency removal is described. In an embodiment, the registers are renamed in two stages. The first stage involves removing all the dependencies within a set of instructions which are being renamed together. The final stage then renames all registers in parallel using a renaming map. In various embodiments, the dependencies are removed in the first stage using a fixed mapping to rename destination registers in each instruction and in some embodiments the fixed mapping is based on the position of a destination register within the set of instructions. Dependent registers, which are those registers which are read in an instruction but have been written in a previous instruction in the set, are also renamed in the first stage. In addition to performing the renaming in the final stage, the renaming map is updated.
    Type: Application
    Filed: January 28, 2013
    Publication date: February 13, 2014
    Applicant: IMAGINATION TECHNOLOGIES LIMITED
    Inventor: Hugh Jackson
  • Patent number: 8631223
    Abstract: A processor includes an instruction sequencing unit, execution unit, and multi-level register file including a first level register file having a lower access latency and a second level register file having a higher access latency. Responsive to the processor processing a second instruction in a transactional code section to obtain as an execution result a second register value of the logical register, the mapper moves a first register value of the logical register to the second level register file, places the second register value in the first level register file, marks the second register value as speculative, and replaces a first mapping for the logical register with a second mapping. Responsive to unsuccessful termination of the transactional code section, the mapper designates the second register value in the first level register file as invalid so that the first register value in the second level register file becomes the working value.
    Type: Grant
    Filed: May 12, 2010
    Date of Patent: January 14, 2014
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Hung Q. Le, Dung Q. Nguyen