Patents Examined by Michael Metzger
  • Patent number: 10108581
    Abstract: A vector reduction circuit configured to reduce an input vector of elements comprises a plurality of cells, wherein each of the plurality of cells other than a designated first cell that receives a designated first element of the input vector is configured to receive a particular element of the input vector, receive, from another of the one or more cells, a temporary reduction element, perform a reduction operation using the particular element and the temporary reduction element, and provide, as a new temporary reduction element, a result of performing the reduction operation using the particular element and the temporary reduction element. The vector reduction circuit also comprises an output circuit configured to provide, for output as a reduction of the input vector, a new temporary reduction element corresponding to a result of performing the reduction operation using a last element of the input vector.
    Type: Grant
    Filed: April 3, 2017
    Date of Patent: October 23, 2018
    Assignee: Google LLC
    Inventors: Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam
  • Patent number: 10055227
    Abstract: Systems and methods for tracking and switching between execution modes in processing systems. A processing system is configured to execute instructions in at least two instruction execution triodes including a first and second execution mode chosen from a classic/aligned mode and a compressed/unaligned mode. Target addresses of selected instructions such as calls and returns are forcibly misaligned in the compressed mode, such one or more bits, such as, the least significant bits (alignment bits) of the target address in the compressed mode are different from the corresponding alignment bits in the classic mode. When the selected instructions are encountered during execution in the first mode, a decision to switch operation to the second mode is based on analyzing the alignment bits of the target address of the selected instruction.
    Type: Grant
    Filed: October 19, 2012
    Date of Patent: August 21, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Charles Joseph Tabony, Erich James Plondke, Lucian Codrescu, Suresh K. Venkumahanti, Evandro Carlos Menezes
  • Patent number: 9983878
    Abstract: Branch prediction is provided by generating a first index from a previous instruction address and from a first branch history vector having a first length. A second index is generated from the previous instruction address and from a second branch history vector that is longer than the first vector. Using the first index, a first branch prediction is retrieved from a first branch prediction table. Using the second index, a second branch prediction is retrieved from a second branch prediction table. Based upon additional branch history data, the first branch history vector and the second branch history vector are updated. A first hash value is generated from a current instruction address and the updated first branch history vector. A second hash value is generated from the current instruction address and the updated second branch history vector. One of the branch predictions are selected based upon the hash values.
    Type: Grant
    Filed: May 15, 2014
    Date of Patent: May 29, 2018
    Assignee: International Business Machines Corporation
    Inventors: David S. Levitan, Jose E. Moreira, Mauricio J. Serrano
  • Patent number: 9928067
    Abstract: Systems and methods are provided in example embodiments for performing binary translation. A binary translation system converts, by a translator module, source instructions to target instructions. The binary translation system identifies a condition code block in the source instructions, where the condition code block includes a plurality of condition bits. In response to identifying the condition code block, the binary translation system provides an optimizer module to convert the condition code block. Then, the binary translation system performs a pre-execution on the condition code block to resolve the plurality of condition bits in the condition code block.
    Type: Grant
    Filed: September 21, 2012
    Date of Patent: March 27, 2018
    Assignee: Intel Corporation
    Inventors: Xueliang Zhong, Jianhui Li, Jian Ping Jane Chen, Gang Wang, Yi Qian, Huifeng Gu
  • Patent number: 9910672
    Abstract: A method and load and store buffer for issuing a load instruction to a data cache. The method includes determining whether there are any unresolved store instructions in the store buffer that are older than the load instruction. If there is at least one unresolved store instruction in the store buffer older than the load instruction, it is determined whether the oldest unresolved store instruction in the store buffer is within a speculation window for the load instruction. If the oldest unresolved store instruction is within the speculation window for the load instruction, the load instruction is speculatively issued to the data cache. Otherwise, the load instruction is stalled until any unresolved store instructions outside the speculation window are resolved. The speculation window is a short window that defines a number of instructions or store instructions that immediately precede the load instruction.
    Type: Grant
    Filed: June 15, 2016
    Date of Patent: March 6, 2018
    Assignee: MIPS Tech, LLC
    Inventors: Hugh Jackson, Anand Khot
  • Patent number: 9904551
    Abstract: Branch prediction is provided by generating a first index from a previous instruction address and from a first branch history vector having a first length. A second index is generated from the previous instruction address and from a second branch history vector that is longer than the first vector. Using the first index, a first branch prediction is retrieved from a first branch prediction table. Using the second index, a second branch prediction is retrieved from a second branch prediction table. Based upon additional branch history data, the first branch history vector and the second branch history vector are updated. A first hash value is generated from a current instruction address and the updated first branch history vector. A second hash value is generated from the current instruction address and the updated second branch history vector. One of the branch predictions are selected based upon the hash values.
    Type: Grant
    Filed: November 3, 2016
    Date of Patent: February 27, 2018
    Assignee: International Business Machines Corporation
    Inventors: David S. Levitan, Jose E. Moreira, Mauricio J. Serrano
  • Patent number: 9898295
    Abstract: Branch prediction is provided by generating a first index from a previous instruction address and from a first branch history vector having a first length. A second index is generated from the previous instruction address and from a second branch history vector that is longer than the first vector. Using the first index, a first branch prediction is retrieved from a first branch prediction table. Using the second index, a second branch prediction is retrieved from a second branch prediction table. Based upon additional branch history data, the first branch history vector and the second branch history vector are updated. A first hash value is generated from a current instruction address and the updated first branch history vector. A second hash value is generated from the current instruction address and the updated second branch history vector. One of the branch predictions are selected based upon the hash values.
    Type: Grant
    Filed: November 3, 2016
    Date of Patent: February 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: David S. Levitan, Jose E. Moreira, Mauricio J. Serrano
  • Patent number: 9898287
    Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.
    Type: Grant
    Filed: April 9, 2015
    Date of Patent: February 20, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov
  • Patent number: 9891914
    Abstract: An apparatus and method for performing an efficient scatter operation. For example, one embodiment of a processor comprises: an allocator unit to receive a scatter operation comprising a number of data elements and responsively allocate resources to execute the scatter operation; a memory execution cluster comprising at least a portion of the resources to execute the scatter operation, the resources including one or more store data buffers and one or more store address buffers; and a senior store pipeline to transfer store data elements from the store data buffers to system memory using addresses from the store address buffers prior to retirement of the scatter operation.
    Type: Grant
    Filed: April 10, 2015
    Date of Patent: February 13, 2018
    Assignee: Intel Corporation
    Inventors: Ramon Matas, Alexey P. Suprun, Roger Gramunt, Chung-Lun Chan, Rammohan Padmanabhan
  • Patent number: 9886362
    Abstract: A method for checking the integrity of a program executed by an electronic circuit and including at least one conditional jump, wherein: a first value is updated for any instruction which does not correspond to a jump instruction; a second value is updated with the first value for each conditional jump instruction; and the second value is compared with a third value, calculated according to the performed conditional jumps.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: February 6, 2018
    Assignee: PROTON WORLD INTERNATIONAL N.V.
    Inventors: Gilles Van Assche, Ronny Vankeer
  • Patent number: 9880849
    Abstract: Various aspects provide for detecting ordering violations in a memory system. A system includes a prediction component and an execution component. The prediction component predicts whether a load instruction in the system is associated with an instruction pipeline hazard. The execution component allocates the load instruction to a queue buffer in the system in response to a prediction that the load instruction is not associated with the instruction pipeline hazard.
    Type: Grant
    Filed: December 9, 2013
    Date of Patent: January 30, 2018
    Assignee: MACOM CONNECTIVITY SOLUTIONS, LLC
    Inventors: Matthew Ashcraft, Richard W. Thaik
  • Patent number: 9875105
    Abstract: Embodiments related to re-dispatching an instruction selected for re-execution from a buffer upon a microprocessor re-entering a particular execution location after runahead are provided. In one example, a microprocessor is provided. The example microprocessor includes fetch logic, one or more execution mechanisms for executing a retrieved instruction provided by the fetch logic, and scheduler logic for scheduling the retrieved instruction for execution. The example scheduler logic includes a buffer for storing the retrieved instruction and one or more additional instructions, the scheduler logic being configured, upon the microprocessor re-entering at a particular execution location after runahead, to re-dispatch, from the buffer, an instruction that has been previously dispatched to one of the execution mechanisms.
    Type: Grant
    Filed: May 3, 2012
    Date of Patent: January 23, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Guillermo J. Rozas, Paul Serris, Brad Hoyt, Sridharan Ramakrishnan, Hens Vanderschoot, Ross Segelken, Darrell Boggs, Magnus Ekman
  • Patent number: 9858082
    Abstract: Restricted instructions are prohibited from execution within a transaction. There are classes of instructions that are restricted regardless of type of transaction: constrained or nonconstrained. There are instructions only restricted in constrained transactions, and there are instructions that are selectively restricted for given transactions based on controls specified on instructions used to initiate the transactions.
    Type: Grant
    Filed: August 16, 2016
    Date of Patent: January 2, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
  • Patent number: 9851978
    Abstract: Restricted instructions are prohibited from execution within a transaction. There are classes of instructions that are restricted regardless of type of transaction: constrained or nonconstrained. There are instructions only restricted in constrained transactions, and there are instructions that are selectively restricted for given transactions based on controls specified on instructions used to initiate the transactions.
    Type: Grant
    Filed: August 16, 2016
    Date of Patent: December 26, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
  • Patent number: 9830164
    Abstract: A system and method for efficiently processing instructions in hardware parallel execution lanes within a processor. In response to a given divergent point within an identified loop, a compiler arranges instructions within the identified loop into very large instruction words (VLIW's). At least one VLIW includes instructions intermingled from different basic blocks between the given divergence point and a corresponding convergence point. The compiler generates code wherein when executed assigns at runtime instructions within a given VLIW to multiple parallel execution lanes within a target processor. The target processor includes a single instruction multiple data (SIMD) micro-architecture. The assignment for a given lane is based on branch direction found at runtime for the given lane at the given divergent point. The target processor includes a vector register for storing indications indicating which given instruction within a fetched VLIW for an associated lane to execute.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: November 28, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Reza Yazdani
  • Patent number: 9823929
    Abstract: A processor includes a queue for storing instructions processed within the context of a current value of a register field, where for some embodiments the instruction is undefined or defined, depending upon the register field at time of processing. After a write instruction (an instruction that writes to the register field) executes, the queue is searched for any entries that contain instructions that depend upon the executed write instruction. Each such entry stores the value of the register field at the time the instruction in the entry was processed. If such an entry is found in the queue and its stored value of the register field does not match the value that the write instruction wrote to the register field, then the processor flushes the pipeline and restarts at a state so as to correctly execute the instruction.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: November 21, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Daren Eugene Streett, Brian Michael Stempel, Thomas Philip Speier, Rodney Wayne Smith, Michael Scott McIlvaine, Kenneth Alan Dockser, James Norris Dieffenderfer
  • Patent number: 9817662
    Abstract: The apparatus and method for calculating and retaining a bound on error during floating point operations inserts an additional bounding field into the standard floating-point format that records the retained significant bits of the calculation with notification upon insufficient retention. The bounding field, which accounts for both rounding and cancellation errors, has two parts, the lost bits D Field and the accumulated rounding error R Field. The D Field states the number of bits in the floating point representation that are no longer meaningful. The bounds on the real value represented are determined from the truncated floating point value (first bound) and the addition of the error determined by the number of lost bits (second bound). The true, real value is absolutely contained by the first and second bounds. The allowed loss (optionally programmable) of significant bits provides a fail-safe, real-time notification of loss of significant bits.
    Type: Grant
    Filed: October 23, 2016
    Date of Patent: November 14, 2017
    Inventor: Alan A Jorgensen
  • Patent number: 9817791
    Abstract: Methods and apparatus for a low energy accelerator processor architecture with short parallel instruction word. An integrated circuit includes a system bus having a data width N, where N is a positive integer; a central processor unit coupled to the system bus and configured to execute instructions retrieved from a memory coupled to the system bus; and a low energy accelerator processor coupled to the system bus and configured to execute instruction words retrieved from a low energy accelerator code memory, the low energy accelerator processor having a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, each of the execution units configured to perform operations responsive to op-codes decoded from the retrieved instruction words, wherein the width of the instruction words is equal to the data width N. Additional methods and apparatus are disclosed.
    Type: Grant
    Filed: April 4, 2015
    Date of Patent: November 14, 2017
    Assignees: TEXAS INSTRUMENTS INCORPORATED, TEXAS INSTRUMENTS DEUTSCHLAND GMBH
    Inventors: Srinivas Lingam, Seok-Jun Lee, Johann Zipperer, Manish Goel
  • Patent number: 9811338
    Abstract: In one embodiment, a processor includes an instruction decoder to receive and decode an instruction having a prefix and an opcode, an execution unit to execute the instruction based on the opcode, and flag modification override logic to prevent the execution unit from modifying a flag register of the processor based on the prefix of the instruction.
    Type: Grant
    Filed: November 4, 2011
    Date of Patent: November 7, 2017
    Assignee: Intel Corporation
    Inventors: Jonathan D. Combs, Jason W. Brandt, Robert Valentine
  • Patent number: 9804852
    Abstract: In one embodiment, a processor includes an instruction decoder to receive a first instruction having a prefix and an opcode and to generate, by an instruction decoder of the processor, a second instruction executable based on a condition determined based on the prefix, and an execution unit to conditionally execute the second instruction based on the condition determined based on the prefix.
    Type: Grant
    Filed: November 30, 2011
    Date of Patent: October 31, 2017
    Assignee: Intel Corporation
    Inventors: Jonathan D. Combs, Jason W. Brandt, Robert Valentine, Kevin B. Smith, Zia Ansari, Maxim Loktyukhin