Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)
-
Patent number: 10169045
Abstract: A method for dependency broadcasting through a source organized source view data structure is disclosed. The method comprises receiving an incoming instruction sequence using a global front end and grouping the instructions to form instruction blocks. Further, the method comprises populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the instruction blocks wherein an incoming instruction block writes its respective block number into fields of the register template corresponding to destination registers referred to by the incoming instruction block. The method also comprises populating a source organized source view data structure, wherein the source view data structure stores the instruction sources corresponding to the instruction blocks as read from the register template by incoming instruction blocks.
Type: Grant
Filed: January 17, 2017
Date of Patent: January 1, 2019
Assignee: Intel Corporation
Inventor: Mohammad Abdallah
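The register-template bookkeeping described in this abstract can be illustrated with a minimal sketch. It assumes the template is a simple map from register name to the number of the block that last wrote it; the names `process_block`, `template`, and `source_view` are hypothetical and do not come from the patent.

```python
# Hypothetical sketch: the register template maps each register to the
# number of the instruction block that most recently wrote it.

def process_block(template, source_view, block_no, sources, destinations):
    """Read the block's sources from the template, then claim its destinations."""
    # For each source register, record which earlier block produces its value
    # (None means no in-flight block has written that register).
    source_view[block_no] = {reg: template.get(reg) for reg in sources}
    # The incoming block writes its own number into its destination fields.
    for reg in destinations:
        template[reg] = block_no


template = {}
source_view = {}
process_block(template, source_view, 0, sources=["r1"], destinations=["r2"])
process_block(template, source_view, 1, sources=["r2", "r3"], destinations=["r4"])
process_block(template, source_view, 2, sources=["r4", "r2"], destinations=["r2"])
```

Reading the sources before writing the destinations matters here: block 2 both reads and writes `r2`, and it must see block 0 as the producer rather than itself.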
-
Patent number: 10140128
Abstract: A parallelized multiple dispatch ordered queue including an ordered queue, qualify logic, ordered select logic, and dispatch logic. The ordered queue stores candidates in order from oldest to youngest into multiple entries. The ordered queue is divided into N groups in which an i'th group includes every i'th entry of every N entries of the ordered queue, wherein i is an integer less than or equal to N. The qualify logic determines whether any candidate is ready to be dispatched. The ordered select logic respectively determines the oldest candidate in each group that is ready to be dispatched. The dispatch logic dispatches the oldest ready candidates in parallel. The shift logic shifts the stored candidates in the ordered queue to fill any vacant entries between remaining ones of the stored candidates without changing an order of the remaining ones of the stored candidates in the ordered queue. The ordered queue may have any size or depth and N is any suitable integer determining the number of candidates (e.g.
Type: Grant
Filed: March 10, 2015
Date of Patent: November 27, 2018
Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
Inventors: Qianli Di, Jianbin Wang, Weili Li, Xiaoyuan Yu, Xin Yu Gao
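The grouping, selection, and shifting steps in this abstract can be sketched in software as follows. This is only an illustration under simplifying assumptions: the queue is a Python list ordered oldest to youngest, groups are interleaved by index, and the function names and the `ready` predicate are hypothetical. Hardware would evaluate the groups in parallel rather than in a loop.

```python
def select_oldest_ready(queue, ready, n_groups):
    """Return the index of the oldest ready entry in each group.

    queue is ordered oldest (index 0) to youngest; group g holds entries
    g, g + n_groups, g + 2 * n_groups, and so on.
    """
    picks = []
    for g in range(n_groups):
        for idx in range(g, len(queue), n_groups):
            if ready(queue[idx]):
                picks.append(idx)
                break  # only the oldest ready entry of this group is chosen
    return picks


def dispatch_and_shift(queue, ready, n_groups):
    """Dispatch up to n_groups candidates, then compact the queue in order."""
    picks = set(select_oldest_ready(queue, ready, n_groups))
    dispatched = [queue[i] for i in sorted(picks)]
    # Shifting: the remaining entries close the gaps without being reordered.
    remaining = [c for i, c in enumerate(queue) if i not in picks]
    return dispatched, remaining


queue = ["a", "b", "c", "d", "e", "f"]
ready_set = {"b", "c", "e", "f"}
dispatched, remaining = dispatch_and_shift(queue, lambda c: c in ready_set, 2)
```

With two groups, entries `a, c, e` form one group and `b, d, f` the other, so one candidate per group can be dispatched in the same step.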
-
Patent number: 10133620
Abstract: A processor includes physical storage locations, and a register rename unit that includes a plurality of register rename storage structures. At a given time, each of a complete group of physical storage location identifiers is to be stored in one, but only one, of the plurality of register rename storage structures, unless there is an error. Each of the complete group of physical storage location identifiers is to identify a different one of the physical storage locations. The register rename unit is to detect an error when a first value, which is to be equal to an operation on the complete group of the physical storage location identifiers with no errors, is inconsistent with a second value. The second value is to represent the operation on all physical storage location identifiers that are to be stored in the plurality of register rename storage structures at the given time.
Type: Grant
Filed: January 10, 2017
Date of Patent: November 20, 2018
Assignee: Intel Corporation
Inventors: Alex Gerber, Yiannakis Sazeides, Arkady Bramnik, Ron Gabor
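The consistency check in this abstract can be illustrated with a small sketch. The abstract does not name the operation, so XOR is assumed here purely for illustration, and the structure names (`free_list`, `rename_table`, `reclaim`) are hypothetical.

```python
from functools import reduce
from operator import xor


def xor_all(ids):
    """Fold a collection of identifiers with XOR (the assumed operation)."""
    return reduce(xor, ids, 0)


def rename_state_consistent(all_ids, structures):
    """Compare the expected value with the value over all stored identifiers."""
    expected = xor_all(all_ids)  # first value: the error-free reference
    observed = xor_all(i for s in structures for i in s)  # second value
    return expected == observed


all_ids = range(8)
free_list = [4, 5, 6, 7]
rename_table = [0, 1, 2]
reclaim = [3]
ok = rename_state_consistent(all_ids, [free_list, rename_table, reclaim])

# Fault example: identifier 3 is lost and identifier 2 is stored twice.
bad = rename_state_consistent(all_ids, [free_list, rename_table, [2]])
```

A single XOR signature is cheap but cannot catch every fault pattern (two errors can cancel), which is one reason this should be read as a sketch rather than as the patented check.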
-
Patent number: 10127098
Abstract: An apparatus and method for recovering the functionality of a central processing unit core are disclosed herein. The apparatus for recovering the functionality of a central processing unit (CPU) core includes a functionality recovery buffer and a functionality recovery module unit. The functionality recovery buffer temporarily stores a value, to be stored in a register storage unit, in response to a write operation. The functionality recovery module unit performs the recovery of functionality by controlling the functionality recovery buffer when receiving a signal, indicating that a failure has been detected, from the outside.
Type: Grant
Filed: January 27, 2016
Date of Patent: November 13, 2018
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventor: Young-Su Kwon
-
Patent number: 10055224
Abstract: A method and apparatus for reconfiguring hardware structures to pipeline the execution of multiple special purpose hardware implemented functions, without saving intermediate results to memory, is provided. Pipelining functions in a program is typically performed by a first function saving its results (the “intermediate results”) to memory, and a second function subsequently accessing the memory to use the intermediate results as input. Saving and accessing intermediate results stored in memory incurs a heavy performance penalty, requires more power, consumes more memory bandwidth, and increases the memory footprint. Due to the ability to redirect the input and output of the hardware structures, intermediate results are passed directly from one special purpose hardware implemented function to another without storing the intermediate results in memory.
Type: Grant
Filed: December 10, 2015
Date of Patent: August 21, 2018
Assignee: Oracle International Corporation
Inventors: Kathirgamar Aingaran, Garret F. Swart
-
Patent number: 10019374
Abstract: A technique for operating a lower level cache memory of a data processing system includes receiving an operation that is associated with a first thread. Logical partition (LPAR) information for the operation is used to limit dependencies in a dependency data structure of a store queue of the lower level cache memory that are set and to remove dependencies that are otherwise unnecessary.
Type: Grant
Filed: April 28, 2016
Date of Patent: July 10, 2018
Assignee: International Business Machines Corporation
Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams
-
Patent number: 10002020
Abstract: A data processing apparatus and method of data processing are provided, which relate to the operation of a processor which maintains a call stack in dependence on the data processing instructions executed. The processor is configured to operate in a transactional execution mode when the data processing instructions seek access to a stored data item which is shared with a further processor. When the processor enters its transactional execution mode it stores a copy of the current stack depth indication and thereafter, when operating in its transactional execution mode, further modifications to the call stack are compared to the copy of the stack depth indication stored. If the relative stacking position of the required modification is in a positive stack growth direction with respect to the copy stored, the modification to the call stack is labelled as non-speculative.
Type: Grant
Filed: June 9, 2015
Date of Patent: June 19, 2018
Assignee: ARM Limited
Inventors: Matthew James Horsnell, Stephan Diestelhorst
-
Patent number: 9984430
Abstract: A scoreboard may keep track of thread dependencies. A set of threads with a common characteristic may be grouped so that if that characteristic is changed, the group of threads can be accessed to account for that change. Examples for such a characteristic include various types of scoreboard address changes. When the characteristic is changed the group of threads are used to identify threads affected by the characteristic change.
Type: Grant
Filed: April 15, 2013
Date of Patent: May 29, 2018
Assignee: Intel Corporation
Inventors: Prasoonkumar Surti, Thomas A. Piazza
-
Patent number: 9977683
Abstract: In one embodiment, a first thread of execution on a computing device receives a user-interface input. The first thread of execution is associated with a user interface of the computing device. The first thread of execution identifies a second thread of execution on the computing device to process the user-interface input. The second thread of execution is associated with the user interface and is de-coupled from the first thread of execution. The first thread of execution sends the user-interface input to the second thread of execution. The second thread of execution also processes the user-interface input to generate a user-interface output associated with the user-interface input.
Type: Grant
Filed: December 14, 2012
Date of Patent: May 22, 2018
Assignee: Facebook, Inc.
Inventor: Robert Douglas Arnold
-
Patent number: 9971604
Abstract: An approach is provided in which a mapper control unit receives dispatch information corresponding to a dispatching instruction that targets some of the register fields in a register. The mapper control unit selects, in a history buffer, an available history buffer entry that includes multiple field sets, each including an itag field. In turn, the mapper control unit modifies some of the history buffer field sets, including the itag fields, based on the existing content stored in the targeted register fields.
Type: Grant
Filed: February 26, 2015
Date of Patent: May 15, 2018
Assignee: International Business Machines Corporation
Inventors: Sundeep Chadha, Michael J. Genden, Dung Q. Nguyen, David R. Terry, Kenneth L. Ward
-
Patent number: 9959213
Abstract: A technique for operating a lower level cache memory of a data processing system includes receiving, by a store queue controller, an operation that is associated with a first thread. The store queue controller uses level one (L1) cache memory miss information for the operation to limit dependencies in a dependency data structure of a store queue of the lower level cache memory that are set and to remove dependencies that are otherwise unnecessary.
Type: Grant
Filed: April 28, 2016
Date of Patent: May 1, 2018
Assignee: International Business Machines Corporation
Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams
-
Patent number: 9959183
Abstract: Data is replicated into a memory cache with non-naturally aligned data boundaries to reduce the time needed to generate test cases for testing a processor. Placing data in the non-naturally aligned data boundaries as described herein allows replicated testing of the memory cache while preserving double word and quad word boundaries in segments of the replicated test data. This allows test cases to be generated for a section of memory and then replicated throughout the memory and tested by a single test branching back and using the next strand of the replicated test data in the memory cache.
Type: Grant
Filed: August 23, 2016
Date of Patent: May 1, 2018
Assignee: International Business Machines Corporation
Inventors: Manoj Dusanapudi, Shakti Kapoor
-
Patent number: 9952872
Abstract: An arithmetic processing device includes an instruction decode unit, an instruction execution unit and an instruction hold unit, wherein the instruction hold unit includes: a first holder including a plurality of first entries each configured to hold a decoded instruction; a second holder including a smaller number of second entries than the number of the first entries; a first selector configured to select an instruction to be registered in the second holder from instructions held in the first entries and store identification information that identifies the selected instruction into any of the second entries; and a second selector configured to sequentially select an executable instruction from instructions registered in the second holder, input the selected executable instruction to the instruction execution unit, and detect a dependency between the instruction inputted to the instruction execution unit and the instructions registered in the second holder.
Type: Grant
Filed: May 20, 2016
Date of Patent: April 24, 2018
Assignee: FUJITSU LIMITED
Inventors: Sota Sakashita, Yasunobu Akizuki
-
Patent number: 9946543
Abstract: A processor includes an execution pipeline configured to execute instructions for threads, wherein the architectural state of a thread includes a set of register windows for the thread. The processor also includes a physical register file (PRF) containing both speculative and architectural versions of registers for each thread. When an instruction that writes to a destination register enters a rename stage, the rename stage allocates an entry for the destination register in the PRF. When an instruction that has written to a speculative version of a destination register enters a commit stage, the commit stage converts the speculative version into an architectural version. It also deallocates an entry for a previous version of the destination register from the PRF. When a register-window-restore instruction that deallocates a register window enters the commit stage, the commit stage deallocates local and output registers for the deallocated register window from the PRF.
Type: Grant
Filed: March 14, 2016
Date of Patent: April 17, 2018
Assignee: Oracle International Corporation
Inventor: Yuan C. Chou
-
Patent number: 9934042
Abstract: A method for dependency broadcasting through a block organized source view data structure.
Type: Grant
Filed: March 17, 2014
Date of Patent: April 3, 2018
Assignee: INTEL CORPORATION
Inventor: Mohammad Abdallah
-
Patent number: 9928070
Abstract: A microprocessor with a fused reservation stations (RS) structure including a primary RS, a secondary RS, and a bypass system. The primary RS has an input for receiving issued instructions, has a push output for pushing the issued instructions to the secondary RS, and has at least one bypass output for dispatching issued instructions that are ready for dispatch. The secondary RS has an input coupled to the push output of the primary RS and has at least one dispatch output. The bypass system selects between the bypass output of the primary RS and at least one dispatch output of the secondary RS for dispatching selected issued instructions. The primary and secondary RS may each be selected from different RS structure types. A unify RS provides a suitable primary RS, and the secondary RS may include multiple queues. The bypass output enables direct dispatch from the primary RS.
Type: Grant
Filed: October 14, 2015
Date of Patent: March 27, 2018
Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD
Inventors: Qianli Di, Xiaoyuan Yu
-
Patent number: 9891915
Abstract: A microprocessor implemented method for resolving dependencies for a load instruction in a load store queue (LSQ) is disclosed. The method comprises initiating a computation of a virtual address corresponding to the load instruction in a first clock cycle. It also comprises transmitting early calculated lower address bits of the virtual address to a load store queue (LSQ) in the same cycle as the initiating. Finally, it comprises performing a partial match in the LSQ responsive to and using the lower address bits to find a prior aliasing store, wherein the prior aliasing store stores to a same address as the load instruction.
Type: Grant
Filed: May 19, 2014
Date of Patent: February 13, 2018
Assignee: INTEL CORPORATION
Inventors: Mohammad A. Abdallah, Ravishankar Rao
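The partial match on early lower address bits can be sketched as a masked comparison against the addresses of older stores. The 12-bit width, the list layout, and the name `partial_match` are assumptions made for this illustration; the abstract does not specify them.

```python
LOW_BITS = 12  # assumed number of early-available lower address bits
MASK = (1 << LOW_BITS) - 1


def partial_match(store_addresses, load_low_bits):
    """Return the index of the youngest prior store whose low bits match.

    store_addresses is ordered oldest to youngest; returns None when no
    store matches on the lower bits.
    """
    for idx in range(len(store_addresses) - 1, -1, -1):
        if store_addresses[idx] & MASK == load_low_bits & MASK:
            return idx
    return None


stores = [0x10000040, 0x20000A80, 0x30000040]  # oldest to youngest
hit = partial_match(stores, 0x7FFF0040 & MASK)
miss = partial_match(stores, 0x123)
```

Because only the lower bits are compared, a hit identifies a candidate rather than a proven alias. In this sketch a later comparison of the full address is assumed to confirm or reject it.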
-
Patent number: 9880847
Abstract: An apparatus for processing instructions includes a mapping unit comprising a plurality of mappers wherein each mapper of the plurality of mappers maps a logical sub-register reference to a physical sub-register reference, a decoding unit configured to receive an instruction and determine a plurality of logical sub-register references therefrom, and an execution unit. The mapping unit may be configured to distribute the plurality of logical sub-register references amongst the plurality of mappers according to at least one bit in the instruction and provide a corresponding plurality of physical sub-register references. The execution unit may be configured to execute the instruction using the plurality of physical sub-register references. Corresponding methods are also disclosed herein.
Type: Grant
Filed: June 26, 2015
Date of Patent: January 30, 2018
Assignee: International Business Machines Corporation
Inventors: Gregory W. Alexander, Brian D. Barrick, Lee E. Eisen, David A. Schroter
-
Patent number: 9858194
Abstract: Methods and migration units for use in out-of-order processors for migrating data to register file caches associated with functional units of the processor to satisfy register read operations. The migration unit receives register read operations to be executed for a particular functional unit. The migration unit reviews entries in a register renaming table to determine if the particular functional unit has recently accessed the source register and thus is likely to comprise an entry for the source register in its register file cache. In particular, the register renaming table comprises entries for physical registers that indicate what functional units have accessed the physical register. If the particular functional unit has not accessed the particular physical register the migration unit migrates data to the register file cache associated with the particular functional unit.
Type: Grant
Filed: February 21, 2017
Date of Patent: January 2, 2018
Assignee: Imagination Technologies
Inventors: Hugh Jackson, Anand Khot
-
Patent number: 9779038
Abstract: A method includes executing a first memory access operation in a memory. A progress indication, which is indicative of a progress of execution of the first memory access operation, is obtained from the memory. Based on the progress indication, a decision is made whether to suspend the execution of the first memory access operation in order to execute a second memory access operation.
Type: Grant
Filed: January 31, 2013
Date of Patent: October 3, 2017
Assignee: Apple Inc.
Inventors: Yoav Kasorla, Asaf Schushan, Asaf Vega, Eyal Gurgi, Shai Ojalvo
-
Patent number: 9779792
Abstract: A register file includes a substrate, a plurality of entries, and a plurality of read ports. Each entry includes a corresponding subset of a plurality of memory cells defined on the substrate. Each read port includes a plurality of access elements defined on the substrate. Each access element is associated with a particular common bit position of each of the entries. A plurality of entry access groups are disposed in adjacent columns on the substrate. Each entry access group is associated with a corresponding one of the plurality of entries and includes the access elements for all of the read ports for the corresponding entry.
Type: Grant
Filed: June 27, 2013
Date of Patent: October 3, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Eric W. Busta, Karthik Natarajan, Brian M. Lay, Gregory A. Constant
-
Patent number: 9772854
Abstract: Execution of instructions in a transactional environment is selectively controlled. A TRANSACTION BEGIN instruction initiates a transaction and includes controls that selectively indicate whether certain types of instructions are permitted to execute within the transaction. The controls include one or more of an allow access register modification control and an allow floating point operation control.
Type: Grant
Filed: June 15, 2012
Date of Patent: September 26, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dan F. Greiner, Christian Jacobi, Robert R. Rogers, Timothy J. Slegel
-
Patent number: 9710269
Abstract: Delays due to waiting for operands that will not be used by a select operand instruction are alleviated based on an early recognition that such operand data is not required in order to complete the processing of the select operand instruction. At appropriate points prior to execution, determinations are made regarding a selection criterion or criteria specified by the select operand instruction, conditions that affect the selection criteria, and the availability of operands. A hold circuit uses the determinations to control the activation and release of a hold signal that controls processor pipeline stalls. A stall required to wait for operand data is skipped, or a stall is terminated early, if the selected operand is available even though the other operand, which will not be used, is not available. A stall due to waiting for operands is maintained until the selection criteria are met and the selected operand is fetched and made available.
Type: Grant
Filed: January 20, 2006
Date of Patent: July 18, 2017
Assignee: QUALCOMM Incorporated
Inventors: James Norris Dieffenderfer, Jeffrey Todd Bridges, Michael Scott McIlvaine, Thomas Andrew Sartorius
-
Patent number: 9690583
Abstract: A pool of available physical registers are provided for architected registers, wherein operations are performed that activate and deactivate selected architected registers, such that the deactivated selected architected registers need not retain values, and physical registers can be deallocated to the pool, wherein deallocation of physical registers is performed after a last-use by a designated last-use instruction, wherein the last-use information is provided either by the last-use instruction or a prefix instruction, wherein reads to deallocated architecture registers return an architected default value.
Type: Grant
Filed: October 3, 2011
Date of Patent: June 27, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Patent number: 9672298
Abstract: Techniques for executing versioned memory access instructions. In one embodiment, a processor is configured to execute versioned store instructions of a first thread within a first mode of operation. In this embodiment, in the first mode of operation, the processor is configured to retire a versioned store instruction only after a version comparison has been performed for the versioned store instruction. In this embodiment, the processor is configured to suppress retirement of instructions in the first thread that are younger than an oldest versioned store instruction until the oldest versioned store instruction has retired. In some embodiments, the processor is configured to execute versioned store instructions of a given thread within a second mode of operation, in which the processor is configured to retire outstanding versioned store instructions before a version comparison has been performed.
Type: Grant
Filed: May 1, 2014
Date of Patent: June 6, 2017
Assignee: Oracle International Corporation
Inventors: Zoran Radovic, Jared C. Smolens, Robert T. Golla, Paul J. Jordan, Mark A. Luttrell
-
Patent number: 9652246
Abstract: In a method of executing instructions in a processing system, respective global age tags are assigned to each of the one or more instructions fetched for processing by the processing system. Each global age tag indicates an age of the corresponding instruction in the processing system. Respective physical registers in a physical register file are allocated to each destination logical register referenced by each instruction. The respective global age tags are written to the respective physical registers allocated to the destination logical registers of the instructions. The instructions are executed by the processing system. At least some of the instructions are executed in an order different from a program order of the instructions.
Type: Grant
Filed: December 20, 2013
Date of Patent: May 16, 2017
Assignee: Marvell International Ltd.
Inventors: Kit Sang Tam, Winston Lee
-
Patent number: 9612968
Abstract: Methods and migration units for use in out-of-order processors for migrating data to register file caches associated with functional units of the processor to satisfy register read operations. The migration unit receives register read operations to be executed for a particular functional unit. The migration unit reviews entries in a register renaming table to determine if the particular functional unit has recently accessed the source register and thus is likely to comprise an entry for the source register in its register file cache. In particular, the register renaming table comprises entries for physical registers that indicate what functional units have accessed the physical register. If the particular functional unit has not accessed the particular physical register the migration unit migrates data to the register file cache associated with the particular functional unit.
Type: Grant
Filed: February 9, 2016
Date of Patent: April 4, 2017
Assignee: Imagination Technologies Limited
Inventors: Hugh Jackson, Anand Khot
-
Patent number: 9582286
Abstract: A processor includes a physical register file having physical registers and an execution unit to perform an arithmetic operation to generate a result mapped to a physical register, wherein the processor delays a write of the result to the physical register file until the result is qualified as valid. A method includes mapping the same physical register both to store load data of a load-execute operation and to subsequently store a result of an arithmetic operation of the load-execute operation, and writing the load data into the physical register. The method further includes, in a first clock cycle, executing the arithmetic operation to generate the result, and, in a second clock cycle, providing the result as a source operand for a dependent operation. The method includes, in a third clock cycle, enabling a write of the result to the physical register file responsive to the result qualifying as valid.
Type: Grant
Filed: November 9, 2012
Date of Patent: February 28, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Ganesh Venkataramanan, Debjit Das Sarma, Betty A. McDaniel, Gregory W. Smaus, Francesco Spadini
-
Patent number: 9575754
Abstract: A system and method for reducing the latency of data move operations. A register rename unit within a processor determines whether a decoded move instruction is eligible for a zero cycle move operation. If so, control logic assigns a physical register identifier associated with a source operand of the move instruction to the destination operand of the move instruction. Additionally, the register rename unit marks the given move instruction to prevent it from proceeding in the processor pipeline. Further maintenance of the particular physical register identifier may be done by the register rename unit during commit of the given move instruction.
Type: Grant
Filed: April 16, 2012
Date of Patent: February 21, 2017
Assignee: Apple Inc.
Inventors: James B. Keller, John H. Mylius, Conrado Blasco-Allue, Gerard R. Williams, III, Suparn Vats
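A zero cycle move of the kind described in this abstract can be sketched as a pure rename-table update. The class below is a hypothetical model: the reference count stands in for the "further maintenance" the abstract mentions and is an assumption, as are the names `RenameUnit`, `rename_dest`, `rename_move`, and the `skip_execute` flag. Releasing a destination's previous mapping is omitted for brevity.

```python
class RenameUnit:
    """Toy rename unit: architectural register name -> physical register id."""

    def __init__(self, num_physical):
        self.free = list(range(num_physical))
        self.mapping = {}
        self.ref_count = {}  # assumed bookkeeping for shared physical registers

    def rename_dest(self, arch):
        """Ordinary rename: allocate a fresh physical register for arch."""
        phys = self.free.pop(0)
        self.mapping[arch] = phys
        self.ref_count[phys] = 1
        return phys

    def rename_move(self, dst, src):
        """Zero cycle move: dst shares src's physical register, no data copy."""
        phys = self.mapping[src]
        self.mapping[dst] = phys
        self.ref_count[phys] += 1
        # The move is marked so that it does not proceed to execution.
        return {"dst": dst, "phys": phys, "skip_execute": True}


ru = RenameUnit(4)
p = ru.rename_dest("r1")
mov = ru.rename_move("r2", "r1")
```

After the move, both `r1` and `r2` name the same physical register, and no free register was consumed for the copy.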
-
Patent number: 9575762
Abstract: A method for populating a register view data structure by using register template snapshots. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the blocks of instructions; populating a register view data structure, wherein the register view data structure stores destinations corresponding to the instruction blocks as recorded by the plurality of register templates; and using the register view data structure to track a machine state in accordance with the execution of the plurality of instruction blocks.
Type: Grant
Filed: March 14, 2014
Date of Patent: February 21, 2017
Assignee: SOFT MACHINES INC
Inventor: Mohammad Abdallah
-
Patent number: 9569280
Abstract: A storage compute device includes a data storage section that facilitates persistently storing host data as data objects. The storage compute device also includes two or more compute sections that perform computations on the data objects. A controller monitors resource collisions affecting a first of the compute sections. The controller creates a copy of at least one of the data objects to be processed in parallel at a second of the compute sections in response to the resource collisions.
Type: Grant
Filed: September 15, 2014
Date of Patent: February 14, 2017
Assignee: SEAGATE TECHNOLOGY LLC
Inventors: David Scott Ebsen, Ryan James Goss, Jeffrey L. Whaley, Dana Simonson
-
Patent number: 9552386
Abstract: Limiting the number of concurrent requests in a database system. Arranging requests to be handled by the database system in at least one queue. Defining a maximum value (SS) of concurrent requests corresponding to the at least one queue. Monitoring at least one queue utilization parameter corresponding to the at least one queue and calculating a performance value based on the at least one queue utilization parameter. Adapting the maximum value (SS) of concurrent requests of the at least one queue dynamically based on the performance value (PF) in order to improve system performance. Limiting the number of concurrent requests of the at least one queue dynamically based on the dynamically adapted maximum value (SS).
Type: Grant
Filed: December 9, 2015
Date of Patent: January 24, 2017
Assignee: International Business Machines Corporation
Inventors: Pawel Gocek, Grzegorz K. Lech, Bartlomiej T. Malecki, Jan Marszalek, Joanna Wawrzyczek
-
Patent number: 9524162
Abstract: A processor uses a dedicated buffer to reduce the amount of time needed to execute memory copy operations. For each load instruction associated with the memory copy operation, the processor copies the load data from memory to the dedicated buffer. For each store operation associated with the memory copy operation, the processor retrieves the store data from the dedicated buffer and transfers it to memory. The dedicated buffer is separate from a register file and caches of the processor, so that each load operation associated with a memory copy operation does not have to wait for data to be loaded from memory to the register file. Similarly, each store operation associated with a memory copy operation does not have to wait for data to be transferred from the register file to memory.
Type: Grant
Filed: April 25, 2012
Date of Patent: December 20, 2016
Assignee: Freescale Semiconductor, Inc.
Inventors: Thang M. Tran, James Yang
-
Patent number: 9519944
Abstract: Techniques are disclosed relating to dependency resolution among processor pipelines. In one embodiment, an apparatus includes a first special-purpose pipeline configured to execute, in parallel, a first type of graphics instruction for a group of graphics elements and a second special-purpose pipeline configured to execute, in parallel, a second type of graphics instruction for the group of graphics elements. In this embodiment, the apparatus is configured, in response to dispatch of an instruction of the second type, to mark a particular instruction of the first type with information indicative of the dispatched instruction. In this embodiment, the particular instruction and the dispatched instruction correspond to the same group of graphics elements. In this embodiment, the apparatus is configured to stall performance of the dispatched instruction until the first special-purpose pipeline has completed execution of the marked particular instruction.
Type: Grant
Filed: September 2, 2014
Date of Patent: December 13, 2016
Assignee: Apple Inc.
Inventors: Benjiman L. Goodman, Robert D. Kenney, Gregory D. Roberts
-
Patent number: 9489246
Abstract: A method and device for determining parallelism of tasks of a program comprises generating a task data structure to track the tasks and assigning a node of the task data structure to each executing task. Each node includes a task identification number and a wait number. The task identification number uniquely identifies the corresponding task from other currently executing tasks and the wait number corresponds to the task identification number of a node corresponding to the last descendant task of the corresponding task that was executed prior to a wait command. The parallelism of the tasks is determined by comparing the relationship between the tasks.
Type: Grant
Filed: September 30, 2011
Date of Patent: November 8, 2016
Assignee: Intel Corporation
Inventors: Jeffrey V. Olivier, Zhiqiang Ma, Paul M Petersen
-
Patent number: 9483267Abstract: A pool of available physical registers are provided for architected registers, wherein operations are performed that activate and deactivate selected architected registers, such that the deactivated selected architected registers need not retain values, and physical registers can be deallocated to the pool, wherein deallocation of physical registers is performed after a last-use by a designated last-use instruction, wherein the last-use information is provided either by the last-use instruction or a prefix instruction, wherein reads to deallocated architecture registers return an architected default value.Type: GrantFiled: December 23, 2013Date of Patent: November 1, 2016Assignee: International Business Machines CorporationInventors: Michael K Gschwind, Valentina Salapura
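As an illustration of the last-use idea in this abstract, the following is a minimal Python sketch, not the patented implementation. The class name `RenameFile`, the `last_use` flag, and the default value `0` are assumptions made for the example; the abstract only says that a physical register returns to the pool after a designated last use and that reads of a deallocated architected register return an architected default value.

```python
ARCH_DEFAULT = 0  # assumed architected default value (not specified in the abstract)


class RenameFile:
    """Toy model: architected registers backed by a pool of physical registers."""

    def __init__(self, num_physical):
        self.free_pool = list(range(num_physical))  # available physical registers
        self.mapping = {}                           # architected name -> physical index
        self.values = [0] * num_physical

    def write(self, arch_reg, value):
        # A write activates the architected register, allocating from the pool
        # if it has no physical register yet (raises IndexError if the pool is empty).
        if arch_reg not in self.mapping:
            self.mapping[arch_reg] = self.free_pool.pop()
        self.values[self.mapping[arch_reg]] = value

    def read(self, arch_reg, last_use=False):
        phys = self.mapping.get(arch_reg)
        if phys is None:
            # Deactivated register: no value is retained, return the default.
            return ARCH_DEFAULT
        value = self.values[phys]
        if last_use:
            # Designated last use: return the physical register to the pool.
            del self.mapping[arch_reg]
            self.free_pool.append(phys)
        return value
```

In this sketch the last-use marking travels with the read itself; the abstract also allows it to come from a prefix instruction, which is not modelled here.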
-
Patent number: 9448800Abstract: Out-of-order CPUs, devices and methods diminish the time penalty from stalling the pipe to rebuild a rename table, such as due to a misprediction. A microprocessor can include a pipe that has a decoder, a dispatcher, and at least one execution unit. A rename table stores rename data, and a check-point table (“CPT”) stores rename data received from the dispatcher. A Re-Order Buffer (“ROB”) stores ROB data, and has a static mapping relationship with the CPT. If the rename table is flushed, such as due to a misprediction, the rename table is rebuilt at least in part by concurrent copying of rename data stored in the CPT, in coordination with walking the ROB.Type: GrantFiled: March 14, 2013Date of Patent: September 20, 2016Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Ravi Iyengar, Prarthna Santhanakrishnan
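The recovery described in this abstract can be sketched in Python as follows. This is a simplified, sequential model under stated assumptions: the function name `rebuild_rename_table`, the representation of ROB entries as `(architectural destination, physical destination)` pairs, and the index arguments are all hypothetical, and the concurrent copying performed in hardware is not modelled.

```python
def rebuild_rename_table(checkpoint, rob, checkpoint_index, flush_index):
    """Rebuild rename state after a flush (illustrative only).

    checkpoint       -- dict of architectural -> physical mappings, captured
                        just before ROB entry `checkpoint_index`
    rob              -- list of (arch_dest, phys_dest) pairs in program order
    checkpoint_index -- ROB position at which the checkpoint was taken
    flush_index      -- position of the first ROB entry being discarded
    """
    # Step 1: copy the checkpointed rename data back into the rename table.
    table = dict(checkpoint)
    # Step 2: walk the surviving ROB entries younger than the checkpoint and
    # replay their mappings, so the table reflects state at the flush point.
    for arch_dest, phys_dest in rob[checkpoint_index:flush_index]:
        table[arch_dest] = phys_dest
    return table
```

Starting from a checkpoint shortens the ROB walk: only entries between the checkpoint and the flush point need to be replayed, rather than every in-flight instruction.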
-
Patent number: 9448799Abstract: Out-of-order CPUs, devices and methods diminish the time penalty from stalling the pipe to rebuild a rename table, such as due to a misprediction. A microprocessor can include a pipe that has a decoder, a dispatcher, and at least one execution unit. A rename table stores rename data, and a check-point table (“CPT”) stores rename data received from the dispatcher. A Re-Order Buffer (“ROB”) stores ROB data, and has a dynamic mapping relationship with the CPT. If the rename table is flushed, such as due to a misprediction, the rename table is rebuilt at least in part by concurrent copying of rename data stored in the CPT, in coordination with walking the ROB.Type: GrantFiled: March 14, 2013Date of Patent: September 20, 2016Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Ravi Iyengar, Prarthna Santhanakrishnan
-
Patent number: 9436472Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.Type: GrantFiled: October 28, 2013Date of Patent: September 6, 2016Assignee: France BrevetsInventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
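A rough Python sketch of the banked search in this abstract is shown below. It assumes the "first state" means a free physical register and that each bank contributes at most one candidate per rename step; the names `FREE`, `BUSY`, `first_free_in_bank`, and `rename` are hypothetical. The per-bank searches run one after another here, whereas the abstract describes them as happening in parallel.

```python
FREE, BUSY = "free", "busy"  # assumed encodings of the per-register state


def first_free_in_bank(state, bank):
    """Return the index of the first free physical register in one bank, or None."""
    for idx in bank:
        if state[idx] == FREE:
            return idx
    return None


def rename(state, banks, arch_regs):
    """Map architectural registers to free physical registers found per bank.

    state     -- list with one entry per physical register (FREE or BUSY)
    banks     -- non-overlapping index ranges partitioning the mapping table
    arch_regs -- architectural registers that need a physical register
    """
    # Each bank is searched independently (in hardware, simultaneously).
    candidates = [first_free_in_bank(state, bank) for bank in banks]
    found = [c for c in candidates if c is not None]
    mapping = {}
    # Pair each architectural register with one candidate; registers beyond
    # the number of candidates stay unmapped in this step.
    for arch, phys in zip(arch_regs, found):
        state[phys] = BUSY
        mapping[arch] = phys
    return mapping
```

For example, with two banks of four entries each, `rename(state, [range(0, 4), range(4, 8)], ["r1", "r2"])` takes one free register from each bank.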
-
Patent number: 9418402Abstract: The present invention provides a method for providing an image having visibility which is improved by removing fog or smoke from an image of which the quality is reduced due to the fog or the smoke. To this end, the present invention provides an estimation model used for obtaining, from an input image, an image from which the fog is removed, calculates a transmission rate indicating a ratio by which the fog is included in the original image by using the estimation model, and obtains an image of which the fog is removed using the calculated transmission rate. The method proposed in the present invention does not commonly use a filter and only uses calculation of a pixel unit, so that a back lighting effect is not generated, real time processing is possible due to a small amount of calculation, and a good image is obtained even by performing a process using only a luminance image.Type: GrantFiled: November 15, 2013Date of Patent: August 16, 2016Assignee: Industry Foundation of Chonnam National UniversityInventors: Sung Hoon Hong, Jae Won Lee
-
Patent number: 9400650Abstract: A processor executes a mask update instruction to perform updates to a first mask register and a second mask register. A register file within the processor includes the first mask register and the second mask register. The processor includes execution circuitry to execute the mask update instruction. In response to the mask update instruction, the execution circuitry is to invert a given number of mask bits in the first mask register, and also to invert the given number of mask bits in the second mask register.Type: GrantFiled: September 28, 2012Date of Patent: July 26, 2016Assignee: INTEL CORPORATIONInventors: Mikhail Plotnikov, Andrey Naraikin, Christopher Hughes
-
Patent number: 9389869Abstract: A multi-threaded microprocessor for processing instructions in single threaded mode and multithreaded modes. The microprocessor includes instruction dependency scoreboards, instruction input coupling circuits for selectively feeding the first and second instruction dependency scoreboards; output coupling logic having first and second instruction issue outputs; first and second execute pipelines respectively coupled to the instruction issue outputs, the first execute pipeline for executing a first program thread and the second execute pipeline for executing a second program thread, independent of the first program thread; and a control logic circuit for causing dual issue of instructions from the first program thread, by the first dependency scoreboard, to both the first execute pipeline and said second execute pipeline.Type: GrantFiled: January 6, 2011Date of Patent: July 12, 2016Assignee: Texas Instruments IncorporatedInventor: Thang Tran
-
Patent number: 9367297Abstract: An IT system includes at least one first processing unit and one second processing unit. The first and second processing units jointly execute an application program and are each associated with an installation routine designed to control updating of a first or second program part of the application program. A first actual state is associated with the first processing unit and a second actual state is associated with the second processing unit. After system reboot, or as soon as the first and second program part have been successfully stored, or an error is detected when storing the first and/or second program part, predefined processing steps are respectively carried out in a predefined order by the first processing unit and the second processing unit depending on the actual state of the first processing unit and the actual state of the second processing unit.Type: GrantFiled: October 15, 2012Date of Patent: June 14, 2016Assignee: Continental Automotive GmbHInventors: Bernd Meyer, Stefan Pyka, David Von Oheimb
-
Patent number: 9354879Abstract: A free list in processor includes multiple banks for indicating availability of register identifiers used for register renaming. A register rename unit receives one or more destination architectural registers to rename with physical register identifiers. Responsive to determining the multiple banks within the free list are unbalanced with available physical register identifiers, one or more returning physical register identifiers are assigned to the destination architectural registers before assigning any physical register identifiers from any bank of the multiple banks with a lowest number of available physical register identifiers. A returning physical register identifier is a physical register identifier that is available again for assignment to a destination architectural register but not yet indicated in the free list as available. Each of the banks includes a single bit width decoded vector for indicating availability of given physical register identifiers.Type: GrantFiled: July 3, 2012Date of Patent: May 31, 2016Assignee: Apple Inc.Inventors: Suparn Vats, John H. Mylius, Abhijit Radhakrishnan
-
Patent number: 9336004Abstract: The present invention provides a method and apparatus for checkpointing registers for transactional memory. Some embodiments of the apparatus include first rename logic configured to map up to a predetermined number of architectural registers to corresponding first physical registers that hold first values associated with the architectural registers. The mapping is responsive to a transaction modifying one or more of the first values associated with the architectural registers. Some embodiments of the apparatus also include microcode configured to write contents of the first physical registers to a memory in response to the transaction modifying first values associated with a number of the architectural registers that is larger than the predetermined number.Type: GrantFiled: February 28, 2013Date of Patent: May 10, 2016Assignee: Advanced Micro Devices, Inc.Inventor: John M. King
-
Patent number: 9323568Abstract: Accessing at least one memory location by one of a plurality of transactions in a multi-processor transactional execution environment is provided. Included is assigning, by a computer system, a conflict priority to a transaction; based on encountering a conflict with another process for a memory location, comparing, by the computer system, the assigned conflict priority of the transaction with another priority of the another process; and based on the conflict priority of the transaction being the higher priority continuing the transaction; and based on the conflict priority of the transaction being the lower priority, aborting the transaction.Type: GrantFiled: January 24, 2014Date of Patent: April 26, 2016Assignee: International Business Machines CorporationInventors: Fadi Y. Busaba, Michael Karl Gschwind, Eric M. Schwarz
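The decision rule in this abstract can be illustrated with a short Python sketch. The function name `resolve_conflict` and the string results are hypothetical, and the handling of equal priorities is an assumption, since the abstract only covers the higher and lower cases.

```python
def resolve_conflict(txn_priority, other_priority):
    """Decide what a transaction does on a memory-location conflict.

    Higher conflict priority continues; lower aborts. Equal priorities are not
    addressed in the abstract; aborting on a tie is an assumption made here.
    """
    if txn_priority > other_priority:
        return "continue"
    return "abort"
```

For instance, `resolve_conflict(3, 1)` returns `"continue"`, while `resolve_conflict(1, 3)` returns `"abort"`.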
-
Patent number: 9311084Abstract: A system and method for efficiently performing microarchitectural checkpointing. A register rename unit within a processor determines whether a physical register number qualifies to have duplicate mappings. Information for maintenance of the duplicate mappings is stored in a register duplicate array (RDA). To reduce the penalty for misspeculation or exception recovery, control logic in the processor supports multiple checkpoints. The RDA is one of multiple data structures to have checkpoint copies of state. The RDA utilizes a content addressable memory (CAM) to store physical register numbers. The duplicate counts for both the current state and the checkpoint copies for a given physical register number are updated when instructions utilizing the given physical register number are retired. To reduce on-die real estate and power consumption, a single CAM entry stores the physical register number and the other fields are stored in separate storage elements.Type: GrantFiled: July 31, 2013Date of Patent: April 12, 2016Assignee: Apple Inc.Inventors: Shyam Sundar, Conrado Blasco-Allue
-
Patent number: 9311088Abstract: An apparatus and method are provided for performing register renaming. Available register identifying circuitry is provided to identify which physical registers form a pool of physical registers available to be mapped by register renaming circuitry to an architectural register specified by an instruction to be executed. Configuration data whose value is modified during operation of the processing circuitry is stored such that, when the configuration data has a first value, the configuration data identifies at least one architectural register of the architectural register set which does not require mapping to a physical register by the register renaming circuitry. The register identifying circuitry is arranged to reference the modified data value, such that when the configuration data has the first value, the number of physical registers in the pool is increased due to the reduction in the number of architectural registers which require mapping to physical registers.Type: GrantFiled: June 26, 2013Date of Patent: April 12, 2016Assignee: ARM LimitedInventors: Frederic Claude Marie Piry, Louis-Marie Vincent Mouton, Luca Scalabrino, Richard Roy Grisenthwaite, David Hennah Mansell
-
Patent number: 9311243Abstract: A processor with coherency-leveraged support for low latency message signaled interrupt handling includes multiple execution cores and their associated cache memories. A first cache memory associated with a first of the execution cores includes a plurality of cache lines. The first cache memory has a cache controller including hardware logic, microcode, or both to identify a first cache line as an interrupt reserved cache line and map the first cache line to a selected memory address associated with an I/O device. The selected system address may be a portion of configuration data in persistent storage accessible to the processor. The controller may set a coherency state of the first cache line to shared and, in response to detecting an I/O transaction including I/O data from the I/O device and containing a reference to the selected memory address, emulate a first message signaled interrupt identifying the selected memory address.Type: GrantFiled: November 30, 2012Date of Patent: April 12, 2016Assignee: Intel CorporationInventor: Yen Hsiang Chew
-
Patent number: 9292288Abstract: Systems and methods for flag tracking in data manipulation operations involving move elimination. An example processing system comprises a first data structure including a plurality of physical register values; a second data structure including a plurality of pointers referencing elements of the first data structure; a third data structure including a plurality of move elimination sets, each move elimination set comprising two or more bits representing two or more logical data registers, the third data structure further comprising at least one bit associated with each move elimination set, the at least one bit representing one or more logical flag registers; a fourth data structure including an identifier of a data register sharing an element of the first data structure with a flag register; and a move elimination logic configured to perform a move elimination operation.Type: GrantFiled: April 11, 2013Date of Patent: March 22, 2016Assignee: Intel CorporationInventors: Vijaykumar B. Kadgi, Jeremy R. Anderson, James D. Hadley, Tong Li, Matthew C. Merten