Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)
-
Patent number: 9268569
Abstract: A method for suppressing branch misprediction behavior is contemplated in which a conditional branch instruction that would cause the flow of control to branch around instructions in response to a determination that a predicate vector is null is predicted not taken. However, in response to detecting that the prediction is incorrect, misprediction behavior is inhibited.
Type: Grant
Filed: February 24, 2012
Date of Patent: February 23, 2016
Assignee: Apple Inc.
Inventor: Jeffry E. Gonion
-
Patent number: 9256433
Abstract: Systems and methods for move operation elimination with bypass Multiple Instantiation Table (MIT) logic. An example processing system may comprise a first data structure configured to store a plurality of physical register values; a second data structure configured to store a plurality of pointers, each pointer referencing an element of the first data structure; a third data structure including a plurality of move elimination sets, each move elimination set comprising a plurality of bits representing a plurality of logical registers; and a logic configured to perform a data manipulation operation by causing an element of the second data structure to reference an element of the first data structure, the logic further configured to reflect results of two or more data manipulation operations by performing a single update of the third data structure.
Type: Grant
Filed: March 15, 2013
Date of Patent: February 9, 2016
Assignee: Intel Corporation
Inventor: Jeremy Anderson
-
Patent number: 9250900
Abstract: Methods and systems for implementing a microprocessor with a selective register file bypass network are disclosed. Late bypasses are removed from a register file bypass network of a microprocessor design. One or more late bypasses are then added back to the register file bypass network based at least in part upon the results of analyzing a plurality of instructions that are to be processed in an instruction pipeline of the microprocessor. An electronic design for at least the register file bypass network is then generated with these one or more late bypasses that are added to the register file bypass network. Without incurring additional hardware or cost for the microprocessor design, one or more bypasses in the register file bypass network may be optionally shared among multiple free-riders, and an entire port stage may also be optionally bypassed to another port stage based upon one or more criteria.
Type: Grant
Filed: October 1, 2014
Date of Patent: February 2, 2016
Assignee: Cadence Design Systems, Inc.
Inventors: James Sangkyu Kim, Fei Sun, Kyle Satoshi Tsukamoto
-
Patent number: 9223701
Abstract: A data processing apparatus is provided in which a processor unit accesses data values stored in a memory and a cache stores local copies of a subset of the data values. The cache maintains a status value for each local copy stored in the cache. When the processor unit executes a load-exclusive operation, a first data value is loaded from a specified memory location and an exclusive use monitor begins monitoring the specified memory location for accesses. When the processor unit executes a store-exclusive operation, a second data value is stored to the specified memory location if the exclusive use monitor indicates that the first data value has not been modified since the load-exclusive operation was executed.
Type: Grant
Filed: April 12, 2013
Date of Patent: December 29, 2015
Assignee: ARM Limited
Inventors: Frederic Claude Marie Piry, Philippe Jean-Pierre Raphalen, Melanie Emanuelle Lucie Teyssier, Albin Pierick Tonnerre
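The load-exclusive / store-exclusive behaviour in this abstract can be illustrated with a small software model. This is only a sketch under simplifying assumptions: a single monitored address, a Python dict standing in for memory, and no cache status values. The names `ExclusiveMonitor`, `observe_store`, and `plain_store` are hypothetical and do not come from the patent.

```python
class ExclusiveMonitor:
    """Toy single-location exclusive use monitor (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory   # dict: address -> value (stand-in for memory)
        self.watched = None    # address currently being monitored, or None

    def load_exclusive(self, addr):
        # Start monitoring the location and return its current value.
        self.watched = addr
        return self.memory[addr]

    def observe_store(self, addr):
        # Another write to the monitored location clears the monitor.
        if self.watched == addr:
            self.watched = None

    def store_exclusive(self, addr, value):
        # The store succeeds only if the location was not modified
        # since the matching load-exclusive.
        ok = self.watched == addr
        if ok:
            self.memory[addr] = value
        self.watched = None    # assumption: the monitor is cleared either way
        return ok


def plain_store(monitor, addr, value):
    # An ordinary store (for example, from another agent) that the monitor sees.
    monitor.memory[addr] = value
    monitor.observe_store(addr)
```

In this model a `load_exclusive` followed directly by `store_exclusive` to the same address succeeds, while an intervening `plain_store` to that address makes the `store_exclusive` fail and leaves memory unchanged by it.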
-
Patent number: 9201944
Abstract: Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operations. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.
Type: Grant
Filed: June 12, 2013
Date of Patent: December 1, 2015
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Lawrence J. Ellison, Amit Ganesh, Vineet Marwah, Jesse Kamp, Anindya C. Patthak, Shasank K. Chavan, Michael J. Gleeson, Allison L. Holloway, Manosiz Bhattacharyya
-
Patent number: 9195501
Abstract: Aspects of the disclosure are directed to a method of processing data with a graphics processing unit (GPU). According to some aspects, the method includes executing a first work item with a shader processor of the GPU, wherein the first work item includes one or more instructions for processing input data. The method also includes generating one or more values based on a result of the first work item, wherein the one or more values represent one or more characteristics of the result. The method also includes determining whether to execute a second work item based on the one or more values, wherein the second work item includes one or more instructions that are distinct from the one or more instructions of the first work item for processing the input data.
Type: Grant
Filed: July 12, 2011
Date of Patent: November 24, 2015
Assignee: QUALCOMM Incorporated
Inventor: Jukka-Pekka Arvo
-
Patent number: 9170818
Abstract: A data processing device maintains register map information that maps accesses to architectural registers, as identified by instructions being executed, to physical registers of the data processing device. In response to determining that an instruction, such as a speculatively-executing conditional branch, indicates a checkpoint, the data processing device stores the register map information for subsequent retrieval depending on the resolution of the instruction. In addition, in response to the checkpoint indication the data processing device generates new register map information such that accesses to the architectural registers are mapped to different physical registers. The data processing device maintains a list, referred to as a free register list, of physical registers available to be mapped to an architectural register.
Type: Grant
Filed: April 26, 2011
Date of Patent: October 27, 2015
Assignee: Freescale Semiconductor, Inc.
Inventor: Thang M. Tran
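A rough software analogy for checkpointed register maps with a free register list is sketched below. It assumes that a checkpoint snapshots both the map and the free list, and that restoring simply reinstates that snapshot; the patent's actual hardware organisation is not described in the abstract. `RenameMap` and its method names are hypothetical.

```python
class RenameMap:
    """Toy architectural-to-physical register map with checkpoints."""

    def __init__(self, num_arch, num_phys):
        # Initially architectural register i lives in physical register i.
        self.map = list(range(num_arch))
        # Remaining physical registers form the free register list.
        self.free = list(range(num_arch, num_phys))
        self.checkpoints = []

    def rename_dest(self, arch_reg):
        # Map a written architectural register to a fresh physical register.
        phys = self.free.pop(0)
        self.map[arch_reg] = phys
        return phys

    def checkpoint(self):
        # Save the map (and, as a simplification, the free list) at a
        # speculative instruction such as a conditional branch.
        self.checkpoints.append((list(self.map), list(self.free)))

    def restore(self):
        # The speculation was wrong: reinstate the saved state.
        self.map, self.free = self.checkpoints.pop()
```

After `checkpoint()`, a call such as `rename_dest(2)` maps architectural register 2 to a different physical register; `restore()` returns the mapping and the free list to their saved contents.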
-
Patent number: 9164763
Abstract: An information processing apparatus includes an instruction supplying section that supplies a plurality of instructions as a single instruction group, an executing section that repetitively executes a plurality of execution processes corresponding to the plurality of instructions in parallel, an issue timing control section that controls an issue timing of each of the instructions to the executing section so that the plurality of execution processes are executed with a timing delayed in accordance with a predetermined latency, and an operand transforming section that transforms an operand register address of each of the instructions in accordance with a predetermined increment value upon every repetition of execution in the executing section.
Type: Grant
Filed: August 24, 2010
Date of Patent: October 20, 2015
Assignee: SONY CORPORATION
Inventors: Satoshi Takashima, Hirokazu Hanaki
-
Patent number: 9158541
Abstract: A processor may include a physical register file and a register renamer. The register renamer may be organized into even and odd banks of entries, where each entry stores an identifier of a physical register. The register renamer may be indexed by a register number of an architected register, such that the renamer maps a particular architected register to a corresponding physical register. Individual entries of the renamer may correspond to architected register aliases of a given size. Renaming aliases that are larger than the given size may involve accessing multiple entries of the renamer, while renaming aliases that are smaller than the given size may involve accessing a single renamer entry.
Type: Grant
Filed: November 3, 2010
Date of Patent: October 13, 2015
Assignee: Apple Inc.
Inventor: Wei-Han Lien
-
Patent number: 9069546
Abstract: A computer system assigns a particular counter from among a plurality of counters currently in a counter free pool to count a number of mappings of logical registers from among a plurality of logical registers to a particular physical register from among a plurality of physical registers, responsive to an execution of an instruction by a mapper unit mapping at least one logical register from among the plurality of logical registers to the particular physical register, wherein the number of the plurality of counters is less than a number of the plurality of physical registers. The computer system, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
Type: Grant
Filed: April 18, 2012
Date of Patent: June 30, 2015
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gregory W. Alexander, Brian D. Barrick, John W. Ward, III
-
Patent number: 9069563
Abstract: A technique for reducing store-hit-loads in an out-of-order processor includes storing a store address of a store instruction associated with a store-hit-load (SHL) pipeline flush in an SHL entry. In response to detecting another SHL pipeline flush for the store address, a current count associated with the SHL entry is updated. In response to the current count associated with the SHL entry reaching a first terminal count, a dependency for the store instruction is created such that execution of a younger load instruction with a load address that overlaps the store address stalls until the store instruction executes.
Type: Grant
Filed: September 16, 2011
Date of Patent: June 30, 2015
Assignee: International Business Machines Corporation
Inventors: Brian R. Konigsburg, David S. Levitan, Brian R. Mestan, David Mui
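The counting scheme in this abstract can be pictured with the following sketch. It makes several assumptions not stated in the abstract: the count starts at 1 on the first recorded flush, address overlap is modelled as exact equality, and the terminal count defaults to 2. `ShlTable`, `record_flush`, and `load_must_wait` are hypothetical names.

```python
class ShlTable:
    """Toy store-hit-load (SHL) tracking table."""

    def __init__(self, terminal_count=2):
        self.terminal_count = terminal_count
        self.counts = {}   # store address -> number of SHL flushes seen

    def record_flush(self, store_addr):
        # Called whenever an SHL pipeline flush is attributed to this store.
        self.counts[store_addr] = self.counts.get(store_addr, 0) + 1

    def load_must_wait(self, store_addr, load_addr, store_executed):
        # Once the count reaches the terminal count, a younger load to the
        # same address is held until the store has executed.
        hot = self.counts.get(store_addr, 0) >= self.terminal_count
        return hot and load_addr == store_addr and not store_executed
```

With the default terminal count, one flush is not enough to create the dependency; after a second flush for the same store address, a matching younger load waits until the store executes.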
-
Publication number: 20150127926
Abstract: A processor instruction scheduler comprising an optimization engine which uses an optimization model for a processor architecture with: means to generate an optimization model for the optimization engine from a design of a processor and data representing optimization goals and constraints and a code stream, wherein the processor has at least two execution pipes and at least two registers, and wherein the code stream comprises processor instructions with corresponding register selections; and reordering means to generate an optimized code stream from the code stream with the optimal solution provided by the optimization engine for the optimization model by reordering the code stream, such that optimum values for the optimization goals under the given constraints are achieved without affecting the operation results of the code stream.
Type: Application
Filed: January 8, 2015
Publication date: May 7, 2015
Inventors: Juergen KOEHL, Jens LEENSTRA, Philipp PANITZ, Hans SCHLENKER
-
Publication number: 20150121040
Abstract: Methods, devices, and systems for accessing packed registers are presented. A state of the packed registers may be tracked and it may be determined whether the register is directly accessible based on the state. If the register is not directly accessible, an action may be performed which allows the register to be accessed directly. The action may include injecting at least one uop for reorganizing the physical storage of the register such that it is directly accessible. The action may include aligning the data with the least significant bit of a physical register or otherwise aligning the data with the datapath. The action may also include changing the state of the packed registers.
Type: Application
Filed: October 24, 2014
Publication date: April 30, 2015
Applicant: Advanced Micro Devices, Inc.
Inventors: Robert E. Weidner, Jay E. Fleischman, Michael C. Sedmak, Michael Estlick, Richard McGowen, II, Emil Talpes
-
Publication number: 20150121041
Abstract: Described herein are methods and processors for flag renaming in groups to eliminate dependencies of instructions. Decoder and execution units in the processor may be configured to rename flags into groups that allow each group to be treated separately as appropriate. This flag renaming eliminates flag dependencies with respect to instructions. This allows an instruction to write exactly the flags that the instruction wants without having to create merge dependencies. Methods and processors are provided for handling immediate values embedded in instructions. A 16 bit immediate bus and a 4 bit encoding/control bus are added at the interface between decode and execution units. For an 8 or 12 bit immediate, the upper 4 bits of the immediate bus contain the encoding bits. For a 16 bit immediate, the encoding/control bus contains the encoding bits. The encoding/control bus indicates when to look at the top four bits of the immediate bus.
Type: Application
Filed: October 24, 2014
Publication date: April 30, 2015
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Ashok Venkatachar, Karthik Punukollu, Srikanth Arekapudi, Samir A. Chitnis, Emil Talpes
-
Patent number: 9015450
Abstract: Embodiments of a processor architecture efficiently implement shadow registers in hardware. A register system in a processor includes a set of physical data registers coupled to register renaming logic. The register renaming logic stores data in and retrieves data from the set of physical registers when the processor is in a first processor state. The register renaming logic identifies ones of the set of physical registers that have a first operational state as a first group of registers and identifies the remaining ones of the set of physical registers as a second group of registers in response to an indication that the processor is to enter a second processor state from the first processor state. The register renaming logic stores data in and retrieves data from the second group of registers but not the first group of registers when the processor is in the second processor state.
Type: Grant
Filed: January 20, 2010
Date of Patent: April 21, 2015
Assignee: STMicroelectronics (Beijing) R&D Co. Ltd.
Inventors: Hong-Xia Sun, Peng Fei Zhu, Yong Qiang Wu
-
Publication number: 20150089200
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
Type: Application
Filed: December 5, 2014
Publication date: March 26, 2015
Applicant: Intel Corporation
Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
-
Publication number: 20150089199
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
Type: Application
Filed: December 5, 2014
Publication date: March 26, 2015
Applicant: INTEL CORPORATION
Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
-
Publication number: 20150089201
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
Type: Application
Filed: December 5, 2014
Publication date: March 26, 2015
Applicant: INTEL CORPORATION
Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
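The three rotate applications above share one abstract. The point that the result depends only on the source operand and the rotate amount, with no flag input, can be shown with a plain function. The rotate direction (right) and the default 32-bit width are assumptions made for illustration, since the abstract specifies neither; `rotate_right` is a hypothetical name.

```python
def rotate_right(value, amount, width=32):
    """Rotate `value` right by `amount` bits within `width` bits.

    The result is a pure function of the source operand and the rotate
    amount: no carry flag is read or written.
    """
    mask = (1 << width) - 1
    amount %= width            # rotating by the full width is a no-op
    value &= mask
    return ((value >> amount) | (value << (width - amount))) & mask
```

Because nothing but the two inputs is consulted, back-to-back rotates in this model have no dependency on earlier flag-writing operations.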
-
Patent number: 8990544
Abstract: A method and apparatus are described for using a previous column pointer to read a subset of entries of an array in a processor. The array may have a plurality of rows and columns of entries, and each entry in the subset may reside on a different row of the array. A previous column pointer may be generated for each of the rows of the array based on a plurality of bits indicating the number of valid entries in the subset to be read, the previous column pointer indicating whether each entry is in a current column or a previous column. The entries in the subset may be read and re-ordered, and invalid entries in the subset may be replaced with nulls. The valid entries and nulls may then be outputted.
Type: Grant
Filed: December 21, 2011
Date of Patent: March 24, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: Srikanth Arekapudi, Shloke Hajela
-
Patent number: 8972701
Abstract: A data processing system is provided in which destination operands to be stored within architectural registers are constrained to have zero values added as prefixes in order that the architectural register value has a fixed bit width irrespective of the bit width of the destination operand being written thereto. Instead of adding these zero values everywhere in the data path, they are instead represented by zero flags in at least the physical registers utilized for register renaming operations and in the result queue prior to results being written to the architectural register file. This saves circuitry resources and reduces energy consumption.
Type: Grant
Filed: December 6, 2011
Date of Patent: March 3, 2015
Assignee: ARM Limited
Inventors: James Nolan Hardage, Glen Andrew Harris, Mark Carpenter Glass
-
Patent number: 8972700
Abstract: An instruction unit provides instructions for execution by a processor. A decode unit decodes instructions received from the instruction unit. Queues are coupled to receive instructions from the decode unit. Each instruction in a same queue is executed in order by a corresponding execution unit. An arbiter is coupled to each queue and to the execution unit that executes instructions of a first instruction type. The arbiter selects a next instruction of the first instruction type from a bottom entry of the queue for execution by the first execution unit.
Type: Grant
Filed: February 28, 2011
Date of Patent: March 3, 2015
Assignee: Freescale Semiconductor, Inc.
Inventor: Thang M. Tran
-
Patent number: 8966230
Abstract: Methods and apparatus relating to dynamic selection of execution stage are described. In some embodiments, logic may determine whether to execute an instruction at one of a plurality of stages in a processor. In some embodiments, the plurality of stages are to at least correspond to an address generation stage or an execution stage of the instruction. Other embodiments are also described and claimed.
Type: Grant
Filed: September 30, 2009
Date of Patent: February 24, 2015
Assignee: Intel Corporation
Inventors: Deepak Limaye, Kulin N. Kothari, James D. Allen, James E. Phillips
-
Publication number: 20150019843
Abstract: A method and apparatus for allowing an out-of-order processor to reuse an in-use physical register is disclosed herein. The method and apparatus uses identifiers, such as tokens and/or other identifiers in a rename map table (RMT) and a physical register file (PRF), to indicate whether an instruction result is allowed or disallowed to be written into a physical register.
Type: Application
Filed: November 27, 2013
Publication date: January 15, 2015
Applicant: QUALCOMM Incorporated
Inventors: Anil KRISHNA, Sandeep S. NAVADA, Niket K. CHOUDHARY, Michael Scott MCILVAINE, Thomas Andrew SARTORIUS, Rodney Wayne SMITH, Kenneth Alan DOCKSER
-
Patent number: 8933953
Abstract: A scoreboard for a video processor may keep track of only dispatched threads which have not yet completed execution. A first thread may itself snoop for execution of a second thread that must be executed before the first thread's execution. Thread execution may be freely reordered, subject only to the rule that a second thread, whose execution is dependent on execution of a first thread, can only be executed after the first thread.
Type: Grant
Filed: June 30, 2008
Date of Patent: January 13, 2015
Assignee: Intel Corporation
Inventors: Hong Jiang, James M. Holland, Prasoonkumar Surti
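The idea of a scoreboard that holds only dispatched, not-yet-completed threads can be sketched as a set of in-flight thread identifiers. The sketch assumes that a thread's dependencies are always dispatched before the thread itself, so that absence from the set means "already completed"; the abstract does not state this. `Scoreboard` and its methods are hypothetical names.

```python
class Scoreboard:
    """Toy scoreboard tracking only dispatched, uncompleted threads."""

    def __init__(self):
        self.in_flight = set()

    def dispatch(self, tid):
        # A thread enters the scoreboard when it is dispatched.
        self.in_flight.add(tid)

    def complete(self, tid):
        # A thread leaves the scoreboard as soon as it finishes.
        self.in_flight.discard(tid)

    def can_execute(self, depends_on):
        # A dependent thread snoops the scoreboard: it may run only when
        # none of the threads it depends on is still in flight.
        return not any(dep in self.in_flight for dep in depends_on)
```

Keeping only in-flight threads means the structure stays small no matter how many threads have already retired, which is the property the abstract emphasises.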
-
Patent number: 8930679
Abstract: An out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The microprocessor includes a queue of entries, each entry configured to hold information that identifies sources of a store instruction used to compute its store address and to hold a dependency that identifies an instruction upon which the store instruction depends for its data. A register alias table (RAT), coupled to the queue of entries, is configured to encounter instructions in program order and to generate dependencies used to determine when the instructions may execute out of program order. In response to encountering a load instruction the RAT determines whether sources of the load instruction used to compute its load address match the sources of the store instruction in an entry of the queue, and if so, causes the load instruction to share the dependency of the matching store instruction.
Type: Grant
Filed: October 23, 2009
Date of Patent: January 6, 2015
Assignee: Via Technologies, Inc.
Inventors: Matthew Daniel Day, Rodney E. Hooker
-
Publication number: 20140380024
Abstract: A method includes suppressing execution of at least one dependent instruction of a load instruction by a processor using stored dependency information responsive to an invalid status of the load instruction. A processor includes an execution unit to execute instructions and a scheduler. The scheduler is to select for execution in the execution unit a load instruction having at least one dependent instruction and suppress execution of the at least one dependent instruction using stored dependency information responsive to an invalid status of the load instruction.
Type: Application
Filed: June 25, 2013
Publication date: December 25, 2014
Inventors: Francesco Spadini, Michael Achenbach
-
Patent number: 8914615
Abstract: A processor core supports execution of program instructions from both a first instruction set and a second instruction set. An architectural register file 18 containing architectural registers is shared by the two instruction sets. The two instruction sets employ logical register specifiers which for at least some values of those logical register specifiers correspond to different architectural registers within the architectural register file 18. A first decoder 4 for the first instruction set and a second decoder 6 for the second instruction set serve to decode the logical register specifiers to a common register addressing format. This common register addressing format is used to supply register specifiers to renaming circuitry 10 for supporting register renaming in conjunction with a physical register file 16 and an architectural register file 18.
Type: Grant
Filed: December 2, 2011
Date of Patent: December 16, 2014
Assignee: ARM Limited
Inventors: Glen Andrew Harris, James Nolan Hardage, Mark Carpenter Glass
-
Patent number: 8914617
Abstract: Methods and apparatus relating to hardware move elimination and/or next page prefetching are described. In some embodiments, a logic may provide hardware move eliminations based on stored data. In an embodiment, a next page prefetcher is disclosed. Other embodiments are also described and claimed.
Type: Grant
Filed: December 24, 2010
Date of Patent: December 16, 2014
Assignee: Intel Corporation
Inventors: Shlomo Raikin, David J. Sager, Zeev Sperber, Evgeni Krimer, Ori Lempel, Stanislav Shwartsman, Adi Yoaz, Omer Golz
-
Patent number: 8914616
Abstract: A data processing apparatus and method are provided. A processor performs data processing operations in response to data processing instructions which reference logical registers. A set of physical registers stores data values which are subjected to the data processing operations. A tag storage stores for each physical register a tag value indicative of one of the logical registers. The processor references the tag storage to perform the data processing operations. A tag value exchanger performs a tag switch exchanging two tag values in the tag storage when the processor executes a predetermined instruction which references two logical registers and for which a choice of which two physical registers are mapped to which of the two logical registers will have no effect on an outcome of the data processing operations. The tag value exchanger performs the tag switch with respect to the tag values indicative of the two logical registers.
Type: Grant
Filed: December 2, 2011
Date of Patent: December 16, 2014
Assignee: ARM Limited
Inventor: Simon John Craske
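One way to picture a tag switch is a model in which each physical register carries the tag of the logical register it currently holds, and exchanging two tags changes which data each logical register sees without moving any data. Treating the "predetermined instruction" as a register exchange is an assumption made for this sketch, and `TagStorage` with its methods is a hypothetical name.

```python
class TagStorage:
    """Toy physical register set with one logical-register tag per entry."""

    def __init__(self, num_regs):
        self.tags = list(range(num_regs))   # tags[phys] = logical register
        self.values = [0] * num_regs        # data held in each physical register

    def phys_of(self, logical):
        # Look up which physical register currently holds this logical one.
        return self.tags.index(logical)

    def read(self, logical):
        return self.values[self.phys_of(logical)]

    def write(self, logical, value):
        self.values[self.phys_of(logical)] = value

    def tag_switch(self, logical_a, logical_b):
        # Exchange the two tag values; the data values stay where they are.
        pa, pb = self.phys_of(logical_a), self.phys_of(logical_b)
        self.tags[pa], self.tags[pb] = self.tags[pb], self.tags[pa]
```

After `tag_switch(0, 1)`, reads of logical registers 0 and 1 return each other's former contents even though the `values` list is untouched.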
-
Publication number: 20140344554
Abstract: A dependency reordering method. The method includes accessing an input sequence of instructions, initializing three registers, and loading instruction numbers into a first register. The method further includes loading destination register numbers into a second register, broadcasting values from the first register to a position in a third register in accordance with a position number in the second register, overwriting positions in the third register in accordance with position numbers in the second register, and using information in the third register to populate a dependency matrix for grouping dependent instructions from the sequence of instructions.
Type: Application
Filed: November 22, 2011
Publication date: November 20, 2014
Inventor: Mohammad Abdallah
-
Publication number: 20140325188
Abstract: A method for reducing a pipeline stall in a multi-pipelined processor includes finding a store instruction having a same target address as a load instruction and having a store value of the store instruction not yet written according to the store instruction, when the store instruction is being concurrently processed in a different pipeline than the load instruction and the store instruction occurs before the load instruction in a program order. The method also includes associating a target rename register of the load instruction as well as the load instruction with the store instruction, responsive to the finding step. The method further includes writing the store value of the store instruction to the target rename register of the load instruction and finishing the load instruction without reissuing the load instruction, responsive to writing the store value of the store instruction according to the store instruction to finish the store instruction.
Type: Application
Filed: April 24, 2013
Publication date: October 30, 2014
Applicant: International Business Machines Corporation
Inventor: TAKESHI OGASAWARA
-
Publication number: 20140304492
Abstract: A microprocessor implemented method for resolving dependencies for a load instruction in a load store queue (LSQ) is disclosed. The method comprises initiating a computation of a virtual address corresponding to the load instruction in a first clock cycle. It also comprises transmitting early calculated lower address bits of the virtual address to a load store queue (LSQ) in the same cycle as the initiating. Finally, it comprises performing a partial match in the LSQ responsive to and using the lower address bits to find a prior aliasing store, wherein the prior aliasing store stores to a same address as the load instruction.
Type: Application
Filed: May 19, 2014
Publication date: October 9, 2014
Applicant: Soft Machines, Inc.
Inventors: Mohammad A. ABDALLAH, Ravishankar RAO
-
Publication number: 20140289501
Abstract: Register renaming circuitry for a processing apparatus configured to process a stream of instructions from an instruction set specifying registers from an architectural set of registers. The apparatus includes a physical set of registers configured to store data values being processed by the processing apparatus. Register renaming circuitry is configured to receive a stream of operations from an instruction decoder and to map registers that are to be written to by the stream of operations to physical registers within the physical set of registers that are currently available. The register renaming circuitry comprises register release circuitry configured to release the physical registers that have been mapped to the registers when a first set of conditions have been met, and to release the physical registers that have been mapped to the additional registers when a second set of conditions have been met.
Type: Application
Filed: March 20, 2013
Publication date: September 25, 2014
Inventors: Guillaume SCHON, Cedric Denis Robert AIRAUD, Frederic Jean Denis ARSANTO, Luca SCALABRINO
-
Publication number: 20140281415
Abstract: Reconfiguring a register file using a rename table having a plurality of fields that indicate fracture information about a source register of an instruction for instructions which have narrow to wide dependencies.
Type: Application
Filed: March 15, 2013
Publication date: September 18, 2014
Inventors: Bradley Gene BURGESS, Ashraf AHMED, Ravi IYENGAR
-
Publication number: 20140281393
Abstract: Out-of-order CPUs, devices and methods diminish the time penalty from stalling the pipe to rebuild a rename table, such as due to a misprediction. A microprocessor can include a pipe that has a decoder, a dispatcher, and at least one execution unit. A rename table stores rename data, and a check-point table (“CPT”) stores rename data received from the dispatcher. A Re-Order Buffer (“ROB”) stores ROB data, and has a static mapping relationship with the CPT. If the rename table is flushed, such as due to a misprediction, the rename table is rebuilt at least in part by concurrent copying of rename data stored in the CPT, in coordination with walking the ROB.
Type: Application
Filed: March 14, 2013
Publication date: September 18, 2014
Inventors: Ravi Iyengar, Prarthna Santhanakrishnan
-
Publication number: 20140281414
Abstract: Out-of-order CPUs, devices and methods diminish the time penalty from stalling the pipe to rebuild a rename table, such as due to a misprediction. A microprocessor can include a pipe that has a decoder, a dispatcher, and at least one execution unit. A rename table stores rename data, and a check-point table (“CPT”) stores rename data received from the dispatcher. A Re-Order Buffer (“ROB”) stores ROB data, and has a dynamic mapping relationship with the CPT. If the rename table is flushed, such as due to a misprediction, the rename table is rebuilt at least in part by concurrent copying of rename data stored in the CPT, in coordination with walking the ROB.
Type: Application
Filed: March 14, 2013
Publication date: September 18, 2014
Inventors: Ravi Iyengar, Prarthna Santhanakrishnan
-
Publication number: 20140281413
Abstract: Methods and systems that allow the processor to effectively and efficiently reduce or eliminate the latency associated with instructions that copy the value of one register to another register. A processor includes a superforwarding table, a superforwarding logic block, and a computation engine. The superforwarding table stores an entry, wherein the entry has a valid bit, a key field, and a forward field. The superforwarding logic block determines which register contains the information needed for an instruction. The computation engine executes instructions.
Type: Application
Filed: March 14, 2013
Publication date: September 18, 2014
Applicant: MIPS Technologies, Inc.
Inventors: Qian WANG, Ranganathan Sudhakar
-
Publication number: 20140258687
Abstract: A method and apparatus for register packing prior to register renaming in a microprocessor are provided. The method includes: receiving a plurality of micro operations (micro-ops) decoded from one or more instructions; packing a plurality of registers which are included in the micro-ops into a packed register structure including a plurality of packed registers based on a preset number of rename ports of a renamer through which the packed registers are read or written for register renaming; and sending the packed registers for register renaming.
Type: Application
Filed: March 8, 2013
Publication date: September 11, 2014
Inventors: Teik-Chung TAN, Bradley Gene BURGESS, Ravi IYENGAR
-
Publication number: 20140244978
Abstract: The present invention provides a method and apparatus for checkpointing registers for transactional memory. Some embodiments of the apparatus include first rename logic configured to map up to a predetermined number of architectural registers to corresponding first physical registers that hold first values associated with the architectural registers. The mapping is responsive to a transaction modifying one or more of the first values associated with the architectural registers. Some embodiments of the apparatus also include microcode configured to write contents of the first physical registers to a memory in response to the transaction modifying first values associated with a number of the architectural registers that is larger than the predetermined number.
Type: Application
Filed: February 28, 2013
Publication date: August 28, 2014
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: John M. King
-
Patent number: 8799626
Abstract: A segmental allocation method of expanding RISC processor register includes the steps of a) setting an instruction format of the RISC processor, the destination register field being set having 6 bits to correspond to 64 registers and at least one source register field having at least 4 bits to correspond to at least 16 registers; b) providing two solutions to the problem resulting from that the instruction format in the step a) goes beyond range under some circumstances; and c) setting a register segment allocation algorithm having the steps of c1) providing and grouping a plurality of pseudo registers; c2) prioritizing the pseudo registers in each of the groups; c3) combining the groups pursuant to the priorities thereof; and c4) locating the physical register of lowest computational cost.
Type: Grant
Filed: September 9, 2011
Date of Patent: August 5, 2014
Assignee: National Chung Cheng University
Inventors: Rong-Guey Chang, Yuan-Shin Hwang, Chia-Hsien Su
-
Publication number: 20140164742
Abstract: An apparatus and method are provided for performing register renaming. Available register identifying circuitry is provided to identify which physical registers form a pool of physical registers available to be mapped by register renaming circuitry to an architectural register specified by an instruction to be executed. Configuration data whose value is modified during operation of the processing circuitry is stored such that, when the configuration data has a first value, the configuration data identifies at least one architectural register of the architectural register set which does not require mapping to a physical register by the register renaming circuitry. The register identifying circuitry is arranged to reference the modified data value, such that when the configuration data has the first value, the number of physical registers in the pool is increased due to the reduction in the number of architectural registers which require mapping to physical registers.
Type: Application
Filed: June 26, 2013
Publication date: June 12, 2014
Inventors: Frederic Claude Marie PIRY, Louis-Marie Vincent MOUTON, Luca SCALABRINO, Richard Roy GRISENTHWAITE, David Hennah MANSELL
-
Patent number: 8725989
Abstract: In one embodiment, a processor can perform a function call from a main program to a function that is to operate on at least one vector-type operand, in which only scalar values are passed to the function, and input values to the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of a vector register file, and output values from the function including the at least one vector-type operand are to be renamed from virtual registers identified in the function to physical registers of the vector register file. Other embodiments are described and claimed.
Type: Grant
Filed: December 9, 2010
Date of Patent: May 13, 2014
Assignee: Intel Corporation
Inventor: Tomasz Madajczak
-
Publication number: 20140122837
Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.
Type: Application
Filed: October 28, 2013
Publication date: May 1, 2014
Applicant: STMicroelectronics (Beijing) R&D Co. Ltd.
Inventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
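
Illustrative note (not part of the publication): the banked search described in the abstract above can be sketched in software as follows. The names `FREE`, `BUSY`, and `allocate_from_banks`, the equal-sized banks, and the "first free entry per bank" policy are assumptions for illustration only. Hardware would search the banks in parallel, whereas this sketch visits them in a loop.

```python
FREE, BUSY = 0, 1   # assumed encoding of the "first state" (free) and its opposite

def allocate_from_banks(state, num_banks, arch_regs):
    """Map architectural registers to free physical registers, one per bank.

    state is the physical register mapping table: state[p] is FREE or BUSY.
    The table is split into num_banks non-overlapping, equal-sized sections
    (len(state) is assumed to be a multiple of num_banks).
    """
    bank_size = len(state) // num_banks
    candidates = []
    for bank in range(num_banks):            # conceptually parallel in hardware
        base = bank * bank_size
        for phys in range(base, base + bank_size):
            if state[phys] == FREE:
                candidates.append(phys)      # first free entry found in this bank
                break
    mapping = {}
    # Pair each architectural register with one candidate; a register that
    # finds no candidate stays unmapped in this sketch (a real pipeline stalls).
    for arch, phys in zip(arch_regs, candidates):
        state[phys] = BUSY
        mapping[arch] = phys
    return mapping

table = [BUSY, FREE, FREE, FREE, BUSY, BUSY, BUSY, FREE]
mapping = allocate_from_banks(table, num_banks=4, arch_regs=[3, 5])
```

Because each bank contributes at most one candidate per cycle, up to `num_banks` architectural registers can be renamed together without one long sequential search of the whole table.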
-
Publication number: 20140101415
Abstract: A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data.
Type: Application
Filed: December 10, 2013
Publication date: April 10, 2014
Applicant: Micron Technology, Inc.
Inventors: Neal Andrew Crook, Alan T. Wootton, James Peterson
-
Patent number: 8683180
Abstract: A method, processor, and computer program product employing an intermediate register mapper within a register renaming mechanism. A logical register lookup determines whether a hit to a logical register associated with the dispatched instruction has occurred. In this regard, the logical register lookup searches within at least one register mapper from a group of register mappers, including an architected register mapper, a unified main mapper, and an intermediate register mapper. A single hit to the logical register is selected among the group of register mappers. If an instruction having a mapper entry in the unified main mapper has finished but has not completed, the mapping contents of the register mapper entry in the unified main mapper are moved to the intermediate register mapper, and the unified register mapper entry is released, thus increasing a number of unified main mapper entries available for reuse.
Type: Grant
Filed: October 13, 2009
Date of Patent: March 25, 2014
Assignee: International Business Machines Corporation
Inventors: Brian D. Barrick, Michael Billeci, Lee E. Eisen
-
Patent number: 8661228
Abstract: A processor includes an instruction fetch unit, an issue queue coupled to the instruction fetch unit, an execution unit coupled to the issue queue, and a multi-level register file including a first level register file having lower access latency and a second level register file having higher access latency. Each of the first and second level register files includes a plurality of physical registers for holding operands that is concurrently shared by a plurality of threads. The processor further includes a mapper that, at dispatch of an instruction specifying a source logical register from the instruction fetch unit to the issue queue, initiates a swap of a first operand associated with the source logical register that is in the second level register file with a second operand held in the first level register file. The issue queue, following the swap, issues the instruction to the execution unit for execution.
Type: Grant
Filed: April 16, 2012
Date of Patent: February 25, 2014
Assignee: International Business Machines Corporation
Inventors: Christopher M. Abernathy, Mary D. Brown, Hung Q. Le, Dung Q. Nguyen
-
Patent number: 8661230
Abstract: A mapper unit of an out-of-order processor assigns a particular counter currently in a counter free pool to count a number of mappings of logical registers to a particular physical register from among multiple physical registers, responsive to an execution of an instruction by the mapper unit mapping at least one logical register to the particular physical register. The number of counters is less than the number of physical registers. The mapper unit, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
Type: Grant
Filed: April 15, 2011
Date of Patent: February 25, 2014
Assignee: International Business Machines Corporation
Inventors: Gregory W. Alexander, Brian D. Barrick, John W. Ward, III
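
Illustrative note (not part of the patent): the counter free pool in the abstract above can be pictured with a small software model. The class name `MapperCounters`, its method names, and the choice to assign a counter on the very first mapping are assumptions for illustration; the patent may assign counters only in particular cases (for example, when a second logical register is mapped to the same physical register).

```python
class MapperCounters:
    """Toy model of a shared pool of reference counters for physical registers."""

    def __init__(self, num_counters, minimum=1):
        self.free_pool = list(range(num_counters))  # counters not currently in use
        self.counter_of = {}                        # physical register -> counter id
        self.count = {}                             # counter id -> number of mappings
        self.minimum = minimum                      # assumed threshold for release

    def add_mapping(self, phys):
        """Record that one more logical register maps to phys."""
        if phys not in self.counter_of:
            if not self.free_pool:
                raise RuntimeError("no free counter; a real mapper would stall")
            counter = self.free_pool.pop()
            self.counter_of[phys] = counter
            self.count[counter] = 0
        self.count[self.counter_of[phys]] += 1

    def remove_mapping(self, phys):
        """Drop one mapping; return the counter to the pool when below minimum."""
        counter = self.counter_of[phys]
        self.count[counter] -= 1
        if self.count[counter] < self.minimum:
            del self.counter_of[phys]
            del self.count[counter]
            self.free_pool.append(counter)

counters = MapperCounters(num_counters=2)
counters.add_mapping(7)      # first logical register mapped to physical register 7
counters.add_mapping(7)      # a second logical register shares the same register
counters.remove_mapping(7)   # one mapping remains, counter stays assigned
```

Keeping fewer counters than physical registers works in this model because only registers with tracked mappings hold a counter at any moment.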
-
Publication number: 20140047219
Abstract: A multi-level register hierarchy is disclosed comprising a first level pool of registers for caching registers of a second level pool of registers in a system wherein programs can dynamically release and re-enable architected registers such that released architected registers need not be maintained by the processor, the processor accessing operands from the first level pool of registers, wherein a last-use instruction is identified as having a last use of an architected register before being released, the last-use architected register being released causes the multi-level register hierarchy to discard any correspondence of an entry to said last use architected register.
Type: Application
Filed: October 18, 2013
Publication date: February 13, 2014
Applicant: International Business Machines Corporation
Inventors: Michael K. Gschwind, Valentina Salapura
-
Publication number: 20140047218
Abstract: Multi-stage register renaming using dependency removal is described. In an embodiment, the registers are renamed in two stages. The first stage involves removing all the dependencies within a set of instructions which are being renamed together. The final stage then renames all registers in parallel using a renaming map. In various embodiments, the dependencies are removed in the first stage using a fixed mapping to rename destination registers in each instruction and in some embodiments the fixed mapping is based on the position of a destination register within the set of instructions. Dependent registers, which are those registers which are read in an instruction but have been written in a previous instruction in the set, are also renamed in the first stage. In addition to performing the renaming in the final stage, the renaming map is updated.
Type: Application
Filed: January 28, 2013
Publication date: February 13, 2014
Applicant: IMAGINATION TECHNOLOGIES LIMITED
Inventor: Hugh Jackson
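
Illustrative note (not part of the publication): the two-stage scheme in the abstract above can be sketched as follows. The function name `rename_two_stage`, the `("tmp", position)` placeholder encoding, and the rule that the instruction at position i receives `free_regs[i]` are assumptions chosen for illustration. Each instruction is modeled as a pair `(dest, [srcs])` of architectural register names.

```python
def rename_two_stage(instrs, rename_map, free_regs):
    """Rename a group of instructions in two stages; return (renamed, new_map)."""
    # Stage 1: remove dependencies inside the group. Every destination gets a
    # fixed placeholder based on its position, and any source written earlier
    # in the group is redirected to the placeholder of its latest writer.
    last_writer = {}
    stage1 = []
    for pos, (dest, srcs) in enumerate(instrs):
        new_srcs = [("tmp", last_writer[s]) if s in last_writer else ("arch", s)
                    for s in srcs]
        stage1.append((("tmp", pos), new_srcs))
        last_writer[dest] = pos

    # Final stage: every operand can now be translated independently
    # (in parallel in hardware): placeholders become freshly allocated
    # physical registers, the rest are looked up in the renaming map.
    tmp_to_phys = {pos: free_regs[pos] for pos in range(len(instrs))}

    def lookup(operand):
        kind, value = operand
        return tmp_to_phys[value] if kind == "tmp" else rename_map[value]

    renamed = [(lookup(dest), [lookup(s) for s in srcs]) for dest, srcs in stage1]

    # Update the renaming map with the last writer of each destination.
    new_map = dict(rename_map)
    for dest, pos in last_writer.items():
        new_map[dest] = tmp_to_phys[pos]
    return renamed, new_map

group = [("r1", ["r2", "r3"]), ("r4", ["r1", "r2"]), ("r1", ["r4", "r1"])]
initial_map = {"r1": "p1", "r2": "p2", "r3": "p3", "r4": "p4"}
renamed, new_map = rename_two_stage(group, initial_map, ["p10", "p11", "p12"])
```

After stage 1 no operand depends on the result of renaming another instruction in the same group, which is what allows the final lookups to proceed independently.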
-
Patent number: 8631223
Abstract: A processor includes an instruction sequencing unit, execution unit, and multi-level register file including a first level register file having a lower access latency and a second level register file having a higher access latency. Responsive to the processor processing a second instruction in a transactional code section to obtain as an execution result a second register value of the logical register, the mapper moves a first register value of the logical register to the second level register file, places the second register value in the first level register file, marks the second register value as speculative, and replaces a first mapping for the logical register with a second mapping. Responsive to unsuccessful termination of the transactional code section, the mapper designates the second register value in the first level register file as invalid so that the first register value in the second level register file becomes the working value.
Type: Grant
Filed: May 12, 2010
Date of Patent: January 14, 2014
Assignee: International Business Machines Corporation
Inventors: Christopher M. Abernathy, Mary D. Brown, Hung Q. Le, Dung Q. Nguyen