Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)
  • Patent number: 11042381
    Abstract: Techniques described herein are directed to ensuring register data consistency between different instruction blocks. For example, a block-based processor renames registers during block decode, but delays the update of a logical register-to-physical register mapping utilized by other instruction blocks until it is determined that a write instruction configured to write to a logical register commits. Alternatively, the processor renames registers during block decode and updates the mapping accordingly. However, the update is negated (e.g., rolled back) if the write instruction is not executed. Still further, the processor may analyze the instructions in the block to determine instructions configured to write to a logical register but that will not execute due to a mismatched predicate. Based on the determination, the block-based processor ensures data consistency by copying data from a previously-assigned register to a newly-assigned register.
    Type: Grant
    Filed: December 8, 2018
    Date of Patent: June 22, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: David T. Harper, III, Gagan Gupta
  • Patent number: 11016773
    Abstract: Embodiments described herein provide for a computing device comprising a hardware processor including a processor trace module to generate trace data indicative of an order of instructions executed by the processor, wherein the processor trace module is configurable to selectively output a processor trace packet associated with execution of a selected non-deterministic control flow transfer instruction.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: May 25, 2021
    Assignee: INTEL CORPORATION
    Inventors: Salmin Sultana, Beeman Strong, Ravi Sahita
  • Patent number: 10992468
    Abstract: Data processing apparatuses and methods for performing an iterative determination of a key schedule are provided. A set of registers initially receives an input data item and data processing is then performed using the content of the set of registers as an input. The result of this data processing is then used to update a value stored in a predetermined register of the set of registers at each iterative round of the determination of the key schedule. Dependent on whether the data processing apparatus is in a reverse key expansion mode or a forwards key expansion mode determines which register in the set of registers is that predetermined register. Further, the set of registers is arranged to shift values contained in the set of registers in a direction which depends on whether the data processing apparatus is in a reverse key expansion mode or a forwards key expansion mode. The directions for the two modes are opposite to one another.
    Type: Grant
    Filed: March 19, 2018
    Date of Patent: April 27, 2021
    Assignee: Arm Limited
    Inventor: Yoav Asher Levy
  • Patent number: 10983799
    Abstract: Techniques are disclosed relating to selection circuitry configured to select instruction operations to issue to one or more execution circuits of a processor. In some embodiments, an apparatus includes a plurality of execution circuits configured to perform one or more instruction operations. The apparatus may further include a plurality of instruction queues configured to store information indicative of the one or more instruction operations. In some embodiments, the apparatus may include a selection circuit configured to select a first plurality of instruction operations from a first instruction queue. The selection circuit may be configured to select a first instruction operation from the first plurality of instruction operations to issue to a first execution circuits.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: April 20, 2021
    Assignee: Apple Inc.
    Inventors: Sean M. Reynolds, Gokul V. Ganesan
  • Patent number: 10983794
    Abstract: An processor to facilitate register sharing is disclosed. The processor includes a plurality of execution units (EUs), each including a General Purpose Register File (GRF) having a plurality of registers; and register sharing hardware to divide the plurality of registers into a first set of registers dedicated for execution of a first set of threads and a second set of registers shared for execution of a second set of threads.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: April 20, 2021
    Assignee: Intel Corporation
    Inventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Weiyu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski
  • Patent number: 10956162
    Abstract: Operand-based reach explicit dataflow processors, and related methods and computer-readable media are disclosed. The operand-based reach explicit dataflow processors support execution of a producer instruction that explicitly names a target consumer operand of a consumer instruction in a consumer operand encoding namespace of the producer instruction. The produced value from execution of the producer instruction is provided or otherwise made available as an input to the named target consumer operand of the consumer instruction as a result of processing the producer instruction. The target consumer operand is encoded in the producer instruction as an operand target distance relative to the producer instruction. Instructions in an instruction stream between the producer instruction and the targeted consumer instruction that have no operands do not consume an operand reach namespace in the producer instructions.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: March 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Robert Douglas Clancy, Melinda Joyce Brown, Yusuf Cagatay Tekmen, Brian Michael Stempel, Michael Scott Mcilvaine, Thomas Philip Speier, Rodney Wayne Smith, Gagan Gupta, David Tennyson Harper, III
  • Patent number: 10956157
    Abstract: A subset of a set of architectural registers in a processing system is marked (or “tainted”) to indicate that speculative use of data in the subset of the architectural registers is constrained based on a taint handling policy. One or more speculation features supported by the processing system are disabled for the instruction so that the one or more speculation features cannot be used on data in the subset. In some cases, values of bits associated with the subset of architectural registers are modified to indicate that the subset is tainted. The taint handling policy can be indicated by values stored in a policy register. Taint markings are tracked in response to values stored in the tainted architectural registers being written to a memory or read from the memory.
    Type: Grant
    Filed: March 5, 2019
    Date of Patent: March 23, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: David Kaplan, Marius Evers
  • Patent number: 10929139
    Abstract: Providing predictive instruction dispatch throttling to prevent resource overflow in out-of-order processor (OOP)-based devices is disclosed. An OOP-based device includes a system resource that may be consumed or otherwise occupied by instructions, as well as an execution pipeline comprising a decode stage and a dispatch stage. The OOP further maintains a running count and a resource usage threshold. Upon receiving an instruction block, the decode stage extracts a proxy value that indicates an approximate predicted count of instructions within the instruction block that will consume a system resource. The decode stage then increments the running count by the proxy value. The dispatch stage compares the running count to the resource usage threshold before dispatching any younger instruction blocks. If the running count exceeds the resource usage threshold, the dispatch stage blocks dispatching of younger instruction blocks until the running count no longer exceeds the resource usage threshold.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: February 23, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Lisa Ru-feng Hsu, Vignyan Reddy Kothinti Naresh, Gregory Michael Wright
  • Patent number: 10922087
    Abstract: Aspects of the invention include tracking relative ages of instructions in an issue queue of an OoO processor. The tracking includes grouping entries in the issue queue into a pool of blocks, each block containing two or more entries that are configured to be allocated and deallocated as a single unit, each entry configured to store an instruction. Blocks are selected in any order from the pool of block for allocation. The selected blocks are allocated and the relative ages of the allocated blocks are tracked based at least in part on an order that the blocks are allocated. Each allocated block is configured as a first-in-first-out (FIFO) queue of entries, configured to add instructions to the block in a sequential order, and configured to remove instructions from the block in any order including a non-sequential order. The relative ages of instructions within each allocated block are tracked.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: February 16, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mohit S. Karve, Joel A. Silberman, Balaram Sinharoy
  • Patent number: 10896041
    Abstract: Enabling early execution of move-immediate instructions having variable immediate value sizes in processor-based devices is disclosed. In one exemplary embodiment, a processor-based device provides a move-immediate logic circuit that detects a move-immediate instruction comprising an immediate value and a destination register. For frequently encountered immediate values, the move-immediate logic circuit allocates a physical register from an immediate physical register file (IPRF), and writes an IPRF tag corresponding to the allocated IPRF register into a most-recent mapping table (MRT) entry for the destination register. Subsequent move-immediate instructions embedding the same immediate value, as well as other dependent instructions, may then obtain the immediate value from the IPRF register by accessing the MRT entry.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: January 19, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shivam Priyadarshi, Arthur Perais, Vignyan Reddy Kothinti Naresh, Yusuf Cagatay Tekmen, Rami Mohammad Al Sheikh, Rodney Wayne Smith
  • Patent number: 10853077
    Abstract: The method can be performed in a processor integrated circuit having an instruction decoder and a plurality of shared resources, a resource tracker circuit having a plurality of credit units associated with corresponding ones of the shared resources in a manner to be updatable based on availability of the shared resources, and a resource matcher connected to receive a resource requirement signal from the decoder and connected to receive a resource availability signal from the resource tracker. The method can include performing a determination of whether or not the resource requirement signal matches the resource availability signal, and, upon a positive determination, dispatching corresponding instruction data, updating the status of a corresponding one or more of the credit units, and preventing the resource matcher from performing a subsequent determination for a given period of time after the positive determination.
    Type: Grant
    Filed: January 25, 2016
    Date of Patent: December 1, 2020
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Louis-Philippe Hamelin, Peter Man-Kin Sinn, Chang Lee, Paul Alepin, Guy-Armand Kamendje Ichokobou, Olivier D'Arcy, John Edward Vincent
  • Patent number: 10838729
    Abstract: A system and method for efficiently reducing the latency and power of memory access operations. A processor includes a stack pointer (SP) load-store dependence (LSD) predictor which predicts whether a memory dependence exists on a store instruction. The processor also includes a register file (RF) LSD predictor which predicts whether a memory dependence exists on a store instruction or a load instruction by a subsequent load instruction in program order. Each of the SP-LSD predictor and the RF-LSD predictor predicts and performs register renaming in a pipeline stage earlier than a renaming pipeline stage. The RF-LSD predictor also determines whether any intervening instructions between a producer memory instruction and a consumer memory instruction modify a predicted dependence.
    Type: Grant
    Filed: March 21, 2018
    Date of Patent: November 17, 2020
    Assignee: Apple Inc.
    Inventors: Muawya M. Al-Otoom, Conrado Blasco, Deepankar Duggal, Kulin N. Kothari, Richard F. Russo
  • Patent number: 10824431
    Abstract: An arithmetic circuit performs a floating-point operation. A floating-point register includes entries each allocated to an architectural register or a renaming register. An operation execution controller circuit issues a floating-point operation instruction and outputs a termination report of the floating-point operation before the floating-point operation is terminated. When exception handling is not performed at the time of instruction completion even when an exception is detected in the operation of the floating-point operation instruction, an instruction completion controller circuit outputs a release instruction that indicates a release of a renaming register when instruction execution is completed after the termination report is received.
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: November 3, 2020
    Assignee: FUJITSU LIMITED
    Inventors: Yasunobu Akizuki, Atushi Fusejima, Norihito Gomyo, Ryohei Okazaki
  • Patent number: 10817302
    Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipeline. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.
    Type: Grant
    Filed: July 7, 2017
    Date of Patent: October 27, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
  • Patent number: 10782979
    Abstract: A reload multiple instruction is used to restore a set of architected registers saved by a spill multiple instruction. The reload multiple instruction is executed, and the executing includes determining the set of architected registers to be restored, which is specified by the reload multiple instruction. The set of architected registers is restored from a selected snapshot that maps architected registers to physical registers. The restoring replaces one or more physical registers currently assigned to one or more architected registers of the set of architected registers with one or more physical registers of the selected snapshot corresponding to the set of architected registers.
    Type: Grant
    Filed: April 18, 2017
    Date of Patent: September 22, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Chung-Lung K. Shum, Timothy J. Slegel
  • Patent number: 10776257
    Abstract: A method includes using a memory address map, locating a first portion of an application in a first memory and loading a second portion of the application from a second memory. The method includes executing in place from the first memory the first portion of the application, during a first period, and by completion of the loading of the second portion of the application from the second memory. The method further includes executing the second portion of the application during a second period, wherein the first period precedes the second period.
    Type: Grant
    Filed: June 15, 2018
    Date of Patent: September 15, 2020
    Assignee: Cypress Semiconductor Corporation
    Inventors: Stephan Rosner, Qamrul Hasan, Venkat Natarajan
  • Patent number: 10761856
    Abstract: Systems, methods, and computer-readable media are described for performing instruction execution using an instruction completion table (ICT) that is configured to accommodate shared ICT entries. A shared ICT entry maps to multiple instructions such as, for example, two instructions. Each shared ICT entry may be referenced by an even instruction tag (ITAG) and an odd ITAG that correspond to respective instructions that have been grouped together in the shared ICT entry. The instructions corresponding to a given shared ICT entry can be executed and finished independently of one another. A shared ICT entry is completed when each execution of each instruction corresponding to the shared ICT entry has finished and when all prior ICT entries have completed. Also described herein are system, methods, and computer-readable media for flushing instructions in shared ICT entries in response to execution of a branch instruction.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: September 1, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kenneth L. Ward, Dung Q. Nguyen, Hung Le, Susan E. Eisen
  • Patent number: 10684861
    Abstract: The present disclosure relates to a method for instruction processing with a processor having multiple execution units. The processor includes a dependency cache containing instructions in association with respective execution unit indicators. The method includes: tracking the number of dependent instructions currently assigned to each execution unit of the processor respectively. In response to receiving an instruction of a dependency chain, the execution unit assigned to a previous instruction of the dependency chain on which depends the received instruction may be identified in the dependency cache. In case more than a predefined maximum number of dependent instructions of at least one dependency chain is currently assigned to the identified execution unit, another execution unit of the processor may be selected for scheduling the received instruction, otherwise the received instruction may be scheduled on the identified execution unit.
    Type: Grant
    Filed: September 25, 2017
    Date of Patent: June 16, 2020
    Assignee: International Business Machines Corporation
    Inventors: Peter Altevogt, Cédric Lichtenau, Thomas Pflueger
  • Patent number: 10684858
    Abstract: Disclosed embodiments relate to an indirect memory fetch (IMF) unit. In one example, an apparatus includes circuitry to fetch and decode an instruction specifying a sparse operand array including N operands, and an index array including N contiguously-addressed indices. The apparatus further includes a processing engine associated with an IMF unit to respond to the decoded instruction by initializing the IMF unit to fetch the N operands in order, probing the IMF unit to determine that a fetched operand is ready to retrieve, retrieving the fetched operand from the IMF unit, and repeating the probing and retrieving until all N operands have been retrieved. The IMF unit, independent of the processing engine, is to fetch the N contiguously-addressed indices from the index array, use the N fetched indices to calculate memory addresses for the N operands, and issue a plurality of read requests to fetch the N operands in order.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: June 16, 2020
    Assignee: Intel Corporation
    Inventors: Stijn Eyerman, Wim Heirman, Kristof Du Bois, Ibrahim Hur, Joshua B. Fryman
  • Patent number: 10664368
    Abstract: A method for modifying a configuration of a storage system. The method includes a computer processor querying a network-accessible computing system to obtain information associated with an executing application that utilizes a storage system for a process of data mirroring. The method further includes identifying a set of parameters associated with a copy program executing within a logical partition (LPAR) of the storage system based on the obtained information, where the set of parameters dictates a number of reader tasks utilized by the copy program, where the copy program is a program associated with the process for data mirroring from the network-accessible computing system to the storage system. The method further includes executing the dictated number of reader tasks for the process of mirroring data associated with the executing application, from the network-accessible computing system to the storage system.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: May 26, 2020
    Assignee: International Business Machines Corporation
    Inventors: Gregory E. McBride, Dash Miller, Miguel Perez, David C. Reed
  • Patent number: 10649781
    Abstract: The present disclosure relates to a method for instruction processing with a processor having multiple execution units. The processor includes a dependency cache containing instructions in association with respective execution unit indicators. The method includes: tracking the number of dependent instructions currently assigned to each execution unit of the processor respectively. In response to receiving an instruction of a dependency chain, the execution unit assigned to a previous instruction of the dependency chain on which depends the received instruction may be identified in the dependency cache. In case more than a predefined maximum number of dependent instructions of at least one dependency chain is currently assigned to the identified execution unit, another execution unit of the processor may be selected for scheduling the received instruction, otherwise the received instruction may be scheduled on the identified execution unit.
    Type: Grant
    Filed: December 15, 2017
    Date of Patent: May 12, 2020
    Assignee: International Business Machines Corporation
    Inventors: Peter Altevogt, Cédric Lichtenau, Thomas Pflueger
  • Patent number: 10628157
    Abstract: A processing pipeline has at least one front end stage for issuing micro-operations for execution in response to program instructions, and an execute stage for performing data processing in response to the micro-operations. At least one predicate register stores at least one predicate value. In response to a predicated vector instruction for triggering execution of two or more lanes of processing, the at least one front end stage issues at least one micro-operation to control the execute stage to mask an effect of a lane of processing indicated as disabled by a target predicate value. One of the front end stages may perform an early predicate lookup of the target predicate value to vary in dependence on the early predicate lookup, which micro-operations are issued to the execute store for a predicated vector instruction.
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: April 21, 2020
    Assignee: ARM Limited
    Inventors: Alejandro Rico Carro, Lee Evan Eisen
  • Patent number: 10613868
    Abstract: Techniques disclosed herein describe a variable latency pipe for interleaving instruction tags in a processor. According to one embodiment presented herein, an instruction tag is associated with an instruction upon issue of the instruction from the issue queue. One of a plurality of positions in the latency pipe is determined. The pipe stores one or more instruction tags, each associated with a respective instruction. The pipe also stores the instruction tags in a respective position based on the latency of each respective instruction. The instruction tag is stored at the determined position in the pipe.
    Type: Grant
    Filed: June 30, 2015
    Date of Patent: April 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Salma Ayub, Josh Bowman, Sundeep Chadha, Dhivya Jeganathan, Cliff Kucharski, Dung Q. Nguyen
  • Patent number: 10599853
    Abstract: A pluggable trust architecture addresses the problem of establishing trust in hardware. The architecture has low impact on system performance and comprises a simple, user-supplied, and pluggable hardware element. The hardware element physically separates the untrusted components of a system from peripheral components that communicate with the external world. The invention only allows results of correct execution of software to be communicated externally.
    Type: Grant
    Filed: October 21, 2015
    Date of Patent: March 24, 2020
    Assignee: Princeton University
    Inventors: David I. August, Soumyadeep Ghosh, Jordan Fix
  • Patent number: 10564975
    Abstract: A global front end scheduler to schedule instruction sequences to a plurality of virtual cores implemented via a plurality of partitionable engines. The global front end scheduler includes a thread allocation array to store a set of allocation thread pointers to point to a set of buckets in a bucket buffer in which execution blocks for respective threads are placed, a bucket buffer to provide a matrix of buckets, the bucket buffer including storage for the execution blocks, and a bucket retirement array to store a set of retirement thread pointers that track a next execution block to retire for a thread.
    Type: Grant
    Filed: January 30, 2018
    Date of Patent: February 18, 2020
    Assignee: INTEL CORPORATION
    Inventor: Mohammad Abdallah
  • Patent number: 10565670
    Abstract: A processing apparatus is described. The apparatus includes a graphics processing unit (GPU), including a plurality of execution units to process graphics context data and a register file having a plurality of registers to store the graphics context data; and register renaming logic to facilitate dynamic renaming of the plurality of registers by logically partitioning the plurality of registers in the register file into a set of fixed registers and a set of shared registers.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: February 18, 2020
    Assignee: INTEL CORPORATION
    Inventors: Kaiyu Chen, Guei-Yuan Lueh, Subramaniam Maiyuran
  • Patent number: 10552164
    Abstract: Sharing snapshots between restoration and recovery. A snapshot to be used for recovery and restoration is obtained. The snapshot includes restoration state for a plurality of architected registers. The plurality of architected registers includes one or more architected registers associated with an instruction to alter an execution path and one or more architected registers associated with a save request. At least one architected register of the plurality of architected registers is restored, based on a request. The request is a recovery request to recover at least one architected register associated with the instruction to alter the execution path or a restoration request to restore at least one architected register associated with the save request.
    Type: Grant
    Filed: April 18, 2017
    Date of Patent: February 4, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura, Chung-Lung K. Shum, Timothy J. Slegel
  • Patent number: 10545763
    Abstract: Detecting data dependencies of instructions associated with threads in a simultaneous multithreading (SMT) scheme is disclosed, including: dividing a plurality of comparators of an SMT-enabled device into groups of comparators corresponding to respective ones of threads associated with the SMT-enabled device; simultaneously distributing a first set of instructions associated with a first thread of the plurality of threads to a corresponding first group of comparators from the plurality of groups of comparators and distributing a second set of instructions associated with a second thread of the plurality of threads to a corresponding second group of comparators from the plurality of groups of comparators; and simultaneously performing data dependency detection on the first set of instructions associated with the first thread using the corresponding first group of comparators and performing data dependency detection on the second set of instructions associated with the second thread using the corresponding seco
    Type: Grant
    Filed: May 6, 2015
    Date of Patent: January 28, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Ling Ma, Sihai Yao, Lei Zhang
  • Patent number: 10545766
    Abstract: Register restoration using transactional memory register snapshots. An indication that a transaction is to be initiated is obtained. Based on obtaining the indication, a determination is made as to whether register restoration is in active use. Based on obtaining the indication and determining register restoration is in active use, register restoration is deactivated. To recover one or more architected registers of the transaction, a transactional rollback snapshot is created.
    Type: Grant
    Filed: April 18, 2017
    Date of Patent: January 28, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10528354
    Abstract: A processor with multiple execution units for instruction processing is provided. The processor comprises an instruction decode and issue logic and a control logic for resolving register access conflicts between subsequent instructions and a dependency cache, which comprises a receiving logic for receiving an execution unit indicator indicative of the execution unit the instruction is planned to be executed on, a storing logic responsive to the receiving logic for storing the received execution unit indicator, and a retrieving logic responsive to a request from the instruction decode and issue logic for providing the stored execution unit indicator for an instruction. The instruction decode and issue logic is adapted for requesting execution unit indicator for an instruction from the dependency cache and to assign the instruction to one respective of the execution units dependent on the execution unit indicator received from the dependency cache.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: January 7, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Peter Altevogt, Cedric Lichtenau, Thomas Pflueger
  • Patent number: 10514921
    Abstract: A method for speeding the re-use of Physical Register Names (PRNs), and hence the processor registers, in a processor. The method involves returning a PRN to a freelist for reuse when it is obsolete even when it is not complete, and blocking writes to the Processor Register File (PRF) by obsolete realms.
    Type: Grant
    Filed: September 5, 2017
    Date of Patent: December 24, 2019
    Assignee: Qualcomm Incorporated
    Inventors: Tejaswi Talluru, Rodney Smith, Yusuf Cagatay Tekmen, Kiran Seth, Daniel Higdon, Jeffery Michael Schottmiller, Andrew Irwin
  • Patent number: 10503505
    Abstract: A processor executes a mask update instruction to perform updates to a first mask register and a second mask register. A register file within the processor includes the first mask register and the second mask register. The processor includes execution circuitry to execute the mask update instruction. In response to the mask update instruction, the execution circuitry is to invert a given number of mask bits in the first mask register, and also to invert the given number of mask bits in the second mask register.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: December 10, 2019
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes
  • Patent number: 10503514
    Abstract: A method of managing a reduced size register view data structure in a processor, where the method includes receiving an incoming instruction sequence using a global front end, grouping instructions from the incoming instruction sequence to form instruction blocks, populating a register view data structure, wherein the register view data structure stores register information references by the instruction blocks as a set of register templates, generating a set of snapshots of the register templates to reduce a size of the register view data structure, and tracking a state of the processor to handle a branch miss-prediction using the register view data structure in accordance with execution of the instruction blocks.
    Type: Grant
    Filed: November 7, 2017
    Date of Patent: December 10, 2019
    Assignee: INTEL CORPORATION
    Inventor: Mohammad Abdallah
  • Patent number: 10488911
    Abstract: A method of allocating registers, includes for each of a plurality of live ranges of variables, calculating an energy saving value of each of the plurality of live ranges of the variables; classifying the plurality of live ranges of the variables into a plurality of queues according to the energy saving values of the plurality of live ranges of the variables; and assigning the plurality of live ranges of the variables in the plurality of queues into a plurality of registers.
    Type: Grant
    Filed: November 1, 2016
    Date of Patent: November 26, 2019
    Assignees: National Taiwan University, MEDIATEK INC.
    Inventors: Yuan-Shin Hwang, Jenq-Kuen Lee, Shao-CHung Wang, Li-Chen Kan
  • Patent number: 10417001
    Abstract: Embodiments of an invention for a physical register table for eliminating move instructions are disclosed. In one embodiment, a processor includes a physical register file, a register allocation table, and a physical register table. The register allocation table is to store mappings of logical registers to physical registers. The physical register table is to store entries including pointers to physical registers in the mappings. The number of entry locations in the physical register table is less than the number of physical registers in the physical register file.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: September 17, 2019
    Assignee: Intel Corporation
    Inventors: Jonathan D. Combs, Venkateswara R. Madduri
  • Patent number: 10394565
    Abstract: Managing an issue queue for fused instructions and paired instructions in a microprocessor including dispatching a fused instruction to a first entry in a double issue queue; dispatching two paired instructions to a second entry in the double issue queue; issuing the fused instruction during a single cycle to an execution unit in response to determining, by the issue queue logic, that the fused instruction is ready to issue; and issuing, by the issue queue logic, the first instruction of the two paired instructions to the execution unit in response to determining, by the issue queue logic, that the first instruction of the two paired instructions is ready to issue.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: August 27, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michael J. Genden, Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto
  • Patent number: 10387147
    Abstract: Managing an issue queue for fused instructions and paired instructions in a microprocessor including dispatching a fused instruction to a first entry in a double issue queue; dispatching two paired instructions to a second entry in the double issue queue; issuing the fused instruction during a single cycle to an execution unit in response to determining, by the issue queue logic, that the fused instruction is ready to issue; and issuing, by the issue queue logic, the first instruction of the two paired instructions to the execution unit in response to determining, by the issue queue logic, that the first instruction of the two paired instructions is ready to issue.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: August 20, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michael J. Genden, Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto
  • Patent number: 10374981
    Abstract: An interface circuit is disclosed for the transfer of data from a synchronous circuit, with multiple source elements, to an asynchronous circuit. Data from the synchronous circuit is received into a memory in the interface circuit. The data in the memory is then sent to the asynchronous circuit based on an instruction in a circular buffer that is part of the interface circuit. Processing elements within the interface circuit execute instructions contained within the circular buffer. The circular buffer rotates to provide new instructions to the processing elements. Flow control paces the data from the synchronous circuit to the asynchronous circuit.
    Type: Grant
    Filed: August 2, 2016
    Date of Patent: August 6, 2019
    Assignee: Wave Computing, Inc.
    Inventor: Christopher John Nicol
  • Patent number: 10365928
    Abstract: Embodiments of the invention are directed to methods for handling scratch registers in a processor. The method includes receiving a cracked instruction in an instruction dispatch unit of the processor. The method further includes decoding the cracked instruction into a group of micro-operations. Based on a determination that the instruction group uses a scratch register, determining if the scratch register is used in other groups of micro-operations. Based on a determination that the scratch register is not used in other instruction groups, allocating a physical register for use as the scratch register without creating a mapper entry for the scratch register.
    Type: Grant
    Filed: November 1, 2017
    Date of Patent: July 30, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gregory W. Alexander, David S. Hutton, Christian Jacobi, Edward T. Malley, Anthony Saporito
  • Patent number: 10353680
    Abstract: A system for an agnostic runtime architecture. The system includes a system emulation/virtualization converter, an application code converter, and a converter wherein a system emulation/virtualization converter and an application code converter implement a system emulation process, and wherein the system converter implements a system and application conversion process for executing code from a guest image, wherein the system converter or the system emulator. The system further includes a run ahead run time guest such an conversion/decoding process, and a prefetching process where guest code is pre-fetched from the target of guest branches in an instruction sequence.
    Type: Grant
    Filed: July 23, 2015
    Date of Patent: July 16, 2019
    Assignee: Intel Corporation
    Inventor: Mohammad Abdallah
  • Patent number: 10346173
    Abstract: An instruction buffer for a processor configured to execute multiple threads is disclosed. The instruction buffer is configured to receive instructions from a fetch unit and provide instructions to a selection unit. The instruction buffer includes one or more memory arrays comprising a plurality of entries configured to store instructions and/or other information (e.g., program counter addresses). One or more indicators are maintained by the processor and correspond to the plurality of threads. The one or more indicators are usable such that for instructions received by the instruction buffer, one or more of the plurality entries of a memory array can be determined as a write destination for the received instructions, and for instructions to be read from the instruction buffer (and sent to a selection unit), one or more entries can be determined as the correct source location from which to read.
    Type: Grant
    Filed: March 7, 2011
    Date of Patent: July 9, 2019
    Assignee: Oracle International Corporation
    Inventors: Jama I. Barreh, Robert T. Golla, Manish K. Shah
  • Patent number: 10346171
    Abstract: A processor of an aspect includes a plurality of physical storage locations, and a register rename unit. The register rename unit includes a first register rename storage structure that is to store a given physical storage location identifier, which is to identify a physical storage location of the plurality of physical storage locations, and that is to store a corresponding given one or more redundant bits. The register rename unit also includes a second register rename storage structure. The register rename unit also includes a first conductive path coupling the first and second register rename storage structures. The first conductive path is to convey the given one or more redundant bits end-to-end from the first register rename storage structure to the second register rename storage structure. Other processors are also disclosed, as well as methods and systems.
    Type: Grant
    Filed: January 10, 2017
    Date of Patent: July 9, 2019
    Assignee: Intel Corporation
    Inventors: Ron Gabor, Yiannakis Sazeides, Arkady Bramnik
  • Patent number: 10331583
    Abstract: A processing device for executing distributed memory operations using spatial processing units (SPU) connected by distributed channels is disclosed. A distributed channel may or may not be associated with memory operations, such as load operations or store operations. Distributed channel information is obtained for an algorithm to be executed by a group of spatially distributed processing elements. The group of spatially distributed processing elements can be connected to a shared memory controller. For each distributed channel in the distributed channel information, one or more of the group of spatially distributed processing elements may be associated with the distributed channel based on the algorithm. By associating the spatially distributed processing elements to a distributed channel, the functionality of the processing element can vary depending on the algorithm mapped onto the SPU.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: June 25, 2019
    Assignee: Intel Corporation
    Inventors: Bushra Ahsan, Michael C. Adler, Neal C. Crago, Joel S. Emer, Aamer Jaleel, Angshuman Parashar, Michael I. Pellauer
  • Patent number: 10331361
    Abstract: Techniques are disclosed relating to self-addressing memory. In one embodiment, an apparatus includes a memory and addressing circuitry coupled to or comprised in the memory. In this embodiment, the addressing circuitry is configured to receive memory access requests corresponding to a specified sequence of memory accesses. In this embodiment, the memory access requests do not include address information. In this embodiment, the addressing circuitry is further configured to assign addresses to the memory access requests for the specified sequence of memory accesses. In some embodiments, the apparatus is configured to perform the memory access requests using the assigned addresses.
    Type: Grant
    Filed: January 3, 2017
    Date of Patent: June 25, 2019
    Assignee: National Instruments Corporation
    Inventors: Tai A. Ly, Swapnil D. Mhaske, Hojin Kee, Adam T. Arnesen, David C. Uliana, Newton G. Petersen
  • Patent number: 10324777
    Abstract: An example device may include processing circuitry and a management controller. The processing circuitry may include a communications interface that includes a first register and a second register. The first register may include a freshness bit and a number of first data bits. The second register may include a number of second data bits that correspond, respectively, to the first data bits. The processing circuitry may write variously to the first data bits in response to detected events, set the freshness bit in response to the management controller reading the first data bits, and reset the freshness bit if any of the first data bits are written to. The management controller may read the first data bits, perform predetermined processing based thereon, write to the second data bits based on the predetermined processing, and request a register transfer.
    Type: Grant
    Filed: October 27, 2016
    Date of Patent: June 18, 2019
    Assignee: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
    Inventors: Christoph L. Schmitz, Thomas Donald Rhodes, Nicholas Mark Hawkins, Binh Nguyen, Wayne Hsu
  • Patent number: 10318289
    Abstract: A compute instruction to be executed is to use a memory operand in a computation. An address associated with the memory operand is to be used to locate a portion of memory from which data is to be obtained and placed in the memory operand. A determination is made as to whether the portion of memory extends across a specified memory boundary. Based on the portion of memory extending across the specified memory boundary, the portion of memory includes a plurality of memory units and a check is made as to whether at least one specified memory unit is accessible and whether at least one specified memory unit is inaccessible.
    Type: Grant
    Filed: November 14, 2015
    Date of Patent: June 11, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Brett Olsson
  • Patent number: 10318432
    Abstract: A technique for operating a lower level cache memory of a data processing system includes receiving an operation that is associated with a first thread. Logical partition (LPAR) information for the operation is used to limit dependencies in a dependency data structure of a store queue of the lower level cache memory that are set and to remove dependencies that are otherwise unnecessary.
    Type: Grant
    Filed: July 9, 2018
    Date of Patent: June 11, 2019
    Assignee: International Business Machines Corporation
    Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams
  • Patent number: 10303481
    Abstract: A processor with multiple execution units for instruction processing is provided. The processor comprises an instruction decode and issue logic and a control logic for resolving register access conflicts between subsequent instructions and a dependency cache, which comprises a receiving logic for receiving an execution unit indicator indicative of the execution unit the instruction is planned to be executed on, a storing logic responsive to the receiving logic for storing the received execution unit indicator, and a retrieving logic responsive to a request from the instruction decode and issue logic for providing the stored execution unit indicator for an instruction. The instruction decode and issue logic is adapted for requesting execution unit indicator for an instruction from the dependency cache and to assign the instruction to one respective of the execution units dependent on the execution unit indicator received from the dependency cache.
    Type: Grant
    Filed: December 2, 2015
    Date of Patent: May 28, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Peter Altevogt, Cedric Lichtenau, Thomas Pflueger
  • Patent number: 10296349
    Abstract: Data processing circuitry comprises allocation circuitry to allocate one or more source and destination processor registers, of a set of processor registers each defined by a respective register index, to a processor instruction for use in execution of that processor instruction and to associate, with the processor instruction, information to indicate the register index of the allocated source and destination processor registers; the avocation circuitry being selectively operable to allocate, to a processor instruction, a group of destination processor registers having a subset of their register indices in common and to associate, with the processor instruction, information to indicate the register index of one processor register of the group and identifying information to identify one or more bits of the register index which differ between the processor registers in the allocated group of processor registers.
    Type: Grant
    Filed: January 7, 2016
    Date of Patent: May 21, 2019
    Assignee: ARM Limited
    Inventors: Vladimir Vasekin, Antony John Penton, Chiloda Ashan Senarath Pathirane, Andrew James Antony Lees
  • Patent number: 10289469
    Abstract: Systems and methods for enhancing reliability are presented. In one embodiment, a system comprises a processor configured to execute program instructions and contemporaneously perform reliability enhancement operations (e.g., fault checking, error mitigation, etc.) incident to executing the program instructions. The fault checking can include: identifying functionality of a particular portion of the program instructions; speculatively executing multiple sets of operations contemporaneously; and comparing execution results from the multiple sets of operations. The multiple sets of operations are functional duplicates of the particular portion of the program instructions. If the execution results have a matching value, then the value can be made architecturally visible. If the execution results do not have a matching value, the system can be put in a safe mode. An error mitigation operation can be performed can include a corrective procedure. The corrective procedure can include rollback to a known valid state.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: May 14, 2019
    Assignee: Nvidia Corporation
    Inventors: Nick Fortino, Fred Gruner, Ben Hertzberg