Patents Examined by Keith E Vicary
  • Patent number: 11403110
    Abstract: A method includes receiving an execute packet that includes a first instruction and a second instruction and executing the first instruction and the second instruction using a pipeline. Executing the first and second instructions includes storing a result of the first instruction in a holding register; determining whether an event that interrupts execution of the execute packet occurs prior to completion of the executing of the second instruction; and based on the event not occurring, committing the result of the first instruction after completion of the executing of the second instruction.
    Type: Grant
    Filed: October 23, 2020
    Date of Patent: August 2, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Kai Chirca, Timothy D. Anderson, Paul Daniel Gauvreau
  • Patent number: 11403105
    Abstract: An apparatus has processing circuitry for executing instructions and fetch circuitry for fetching the instructions for execution. When a branch instruction is encountered by the fetch circuitry, it determines subsequent instructions to be fetched in dependence on an initial branch direction prediction for the branch instruction made by branch prediction circuitry. Value prediction circuitry is used to maintain a predicted result value for one or more instructions, and dispatch circuitry maintains a record of pending instructions that have been fetched by the fetch circuitry and are awaiting execution by the processing circuitry, and selects pending instructions from the record for dispatch to the processing circuitry.
    Type: Grant
    Filed: January 26, 2021
    Date of Patent: August 2, 2022
    Assignee: Arm Limited
    Inventors: Vladimir Vasekin, David Michael Bull, Frederic Claude Marie Piry, Alexei Fedorov
  • Patent number: 11392535
    Abstract: A computational array is implemented in which all operands and results are loaded or output from a single side of the array. The computational array comprises a plurality of cells arranged in n rows and m columns, each configured to produce a processed value based upon a weight value and an activation value. The cells receive weight and activation values via colinear weight and activation transmission channels that each extend across a first side edge of the computational array to provide weight values and activation values to the cells of the array. In addition, result values produced at a top cell of each of the m columns of the array are routed through the array to be output from the same first side edge of the array at a same relative timing at which the result values were produced.
    Type: Grant
    Filed: November 25, 2020
    Date of Patent: July 19, 2022
    Assignee: GROQ, INC.
    Inventors: Jonathan Alexander Ross, Tom Hawkins, Dennis Charles Abts
  • Patent number: 11392387
    Abstract: Predicting load-based control independent (CI), register data independent (DI) (CIRDI) instructions as CI memory data dependent (DD) (CIMDD) instructions for replay in speculative misprediction recovery in a processor. The processor predicts if a source of a load-based CIRDI instruction will be forwarded by a store-based instruction (i.e. “store-forwarded”). If a load-based CIRDI instruction is predicted as store-forwarded, the load-based CIRDI instruction is considered a CIMDD instruction and is replayed in misprediction recovery. If a load-based CIRDI instruction is not predicted as store-forwarded, the processor considers such load-based CIRDI instruction as a pending load-based CIRDI instruction. If this pending load-based CIRDI instruction is determined in execution to be store-forwarded, the instruction pipeline is flushed and the pending load-based CIRDI instruction is also replayed in misprediction recovery.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: July 19, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Arthur Perais, Rami Mohammad Al Sheikh, Shivam Priyadarshi
  • Patent number: 11392382
    Abstract: Micro-operations (μops) are allocated into a μop cache by dividing, by a micro branch target buffer (μBTB), instructions into a first basic block in which the instructions are executed by a processing device and the first basic block corresponds to an edge of the instructions being executed by the processing device. The μBTB allocates the first basic block to an inverted basic block queue (IBBQ) and the IBBQ determines that the first basic block fits into the μop cache. The IBBQ allocates the first basic block to the μop cache based on a number of times the edge of the instructions corresponding to the first basic block is repeatedly executed by the processing device.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: July 19, 2022
    Inventor: James David Dundas
  • Patent number: 11372646
    Abstract: A computer-implemented method includes fetching a fetch-packet containing a first hyper-block from a first address of a memory. The fetch-packet contains a bitwise distance from an entry point of the first hyper-block to a predicted exit point. The method further includes executing a first branch instruction of the first hyper-block. The first branch instruction corresponds to a first exit point. The first branch instruction includes an address corresponding to an entry point of a second hyper-block. The method also includes storing, responsive to executing the first branch instruction, a bitwise distance from the entry point of the first hyper-block to the first exit point. The method further includes moving a program counter from the first exit point of the first hyper-block to the entry point of the second hyper-block.
    Type: Grant
    Filed: November 14, 2019
    Date of Patent: June 28, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Kai Chirca, Timothy D. Anderson, David E. Smith, Jr., Paul D. Gauvreau
  • Patent number: 11360769
    Abstract: An instruction to perform scaling, converting and splitting operations is executed. The executing the instruction includes scaling an input value in one format to provide a scaled result. The scaled result is converted from the one format to provide a converted result in another format. The converted result is split into multiple parts, and one or more parts of the multiple parts are placed in a selected location.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: June 14, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric Mark Schwarz, Petra Leber, Kerstin Claudia Schelm, Silvia Melitta Mueller, Reid Copeland, Xin Guo, Cedric Lichtenau
  • Patent number: 11354128
    Abstract: In one embodiment, software executing on a data processing system that is capable of performing dynamic operational mode transitions can realize performance improvements by predicting transitions between modes and/or predicting aspects of a new operational mode. Such prediction can allow the processor to begin an early transition into the target mode. The mode transition prediction principles can be applied for various processor mode transitions including 64-bit to 32-bit mode transitions, interrupts, exceptions, traps, virtualization mode transfers, system management mode transfers, and/or secure execution mode transfers.
    Type: Grant
    Filed: March 4, 2015
    Date of Patent: June 7, 2022
    Assignee: Intel Corporation
    Inventors: Jason W. Brandt, Vedvyas Shanbhogue, Kameswar Subramaniam
  • Patent number: 11347517
    Abstract: A reduced precision based programmable and single instruction multiple data (SIMD) dataflow architecture includes reduced precision execution units, with a majority of the execution units operating at reduced precision and a minority of the execution units capable of operating at higher precision. The execution units operate in parallel within a programmable execution element to share instruction fetch, decode, and issue pipelines and operate on the same instruction in lock-step to minimize instruction-related overhead.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: May 31, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kailash Gopalakrishnan, Sunil Shukla, Jungwook Choi, Silvia Mueller, Bruce Fleischer, Vijayalakshmi Srinivasan, Ankur Agrawal, Jinwook Oh
  • Patent number: 11341086
    Abstract: An array of processing elements are arranged in a three-dimensional array. Each of the processing elements includes or is coupled to a dedicated memory. The processing elements of the array are intercoupled to their nearest neighbor processing elements. A processing element on a first die may be intercoupled to a first processing element on a second die that is located directly above the processing element, a second processing element on a third die that is located directly below the processing element, and the four adjacent processing elements on the first die. This intercoupling allows data to flow from processing element to processing element in the three directions. These dataflows are reconfigurable so that they may be optimized for the task. The data flows of the array may be configured into one or more loops that periodically recycle data in order to accomplish different parts of a calculation.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: May 24, 2022
    Assignee: Rambus Inc.
    Inventors: Amogh Agrawal, Thomas Vogelsang, Steven C. Woo
  • Patent number: 11327757
    Abstract: In at least one embodiment, a processor includes architected and non-architected register files for buffering operands. The processor additionally includes an instruction fetch unit that fetches instructions to be executed and at least one execution unit. The at least one execution unit is configured to execute a first class of instructions that access operands in the architected register file and a second class of instructions that access operands in the non-architected register file. The processor also includes a mapper circuit that assigns physical registers to the instructions for buffering of operands. The processor additionally includes a dispatch circuit configured, based on detection of an instruction in one of the first and second classes of instructions for which correct operands do not reside in a respective one of the architected and non-architected register files, to automatically initiate transfer of operands between the architected and non-architected register files.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: May 10, 2022
    Assignee: International Business Machines Corporation
    Inventors: Steven J. Battle, Kurt A. Feiste, Susan E. Eisen, Dung Q. Nguyen, Christian Gerhard Zoellin, Kent Li, Brian W. Thompto, Dhivya Jeganathan, Kenneth L. Ward, Brian D. Barrick
  • Patent number: 11314517
    Abstract: Methods described herein relate to updating pipeline operations for data processing. The method includes receiving pipeline information for at least one of a plurality of pipelines. The pipeline information includes at least one of an input dataset, output dataset, input model, intermediate model, or output model. The method also includes determining one or more of the plurality of pipelines to update based on similarities with the pipeline information received for at least one of the plurality of pipelines. The method further includes updating the one or more of the plurality of pipelines based on the pipeline information received. Updating the pipeline includes updating at least one of the input model, intermediate model, or output model. The method still further includes storing the one or more updated pipelines.
    Type: Grant
    Filed: June 14, 2019
    Date of Patent: April 26, 2022
    Assignee: HERE GLOBAL B.V.
    Inventor: Tero Juhani Keski-Valkama
  • Patent number: 11269638
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream. Once fetched, data elements in the data stream are disposed in lanes in a stream head register in the fixed order. Some lanes may be invalid, for example when the number of remaining data elements is less than the number of lanes in the stream head register. The streaming engine automatically produces a valid data word stored in a stream valid register indicating lanes holding valid data. The data in the stream valid register may be automatically stored in a predicate register or otherwise made available. This data can be used to control vector SIMD operations or may be combined with other predicate register data.
    Type: Grant
    Filed: June 28, 2017
    Date of Patent: March 8, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Joseph Zbiciak, Son H. Tran
  • Patent number: 11269650
    Abstract: Techniques related to executing a plurality of instructions by a processor, including a method for executing a plurality of instructions by a processor. The method comprises detecting a pipeline hazard based on one or more instructions provided for execution by an instruction execution pipeline, beginning execution of an instruction of the one or more instructions on the instruction execution pipeline, stalling a portion of the instruction execution pipeline based on the detected pipeline hazard, storing a register state associated with the execution of the instruction based on the stalling, determining that the pipeline hazard has been resolved, and restoring the register state to the instruction execution pipeline based on the determination.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 8, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy D. Anderson, Duc Bui, Joseph Zbiciak, Reid E. Tatge
  • Patent number: 11269643
    Abstract: A mechanism is described for facilitating fast data operations and for facilitating a finite state machine for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting input data to be used in computational tasks by a computation component of a processor including a graphics processor. The method may further include determining one or more frequently-used data values (FDVs) from the data, and pushing the one or more frequent data values to bypass the computational tasks.
    Type: Grant
    Filed: April 9, 2017
    Date of Patent: March 8, 2022
    Assignee: Intel Corporation
    Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Abhishek R. Appu, Altug Koker, Kamal Sinha, Joydeep Ray, Balaji Vembu, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11263014
    Abstract: Data processing apparatuses, methods of data processing, and non-transitory computer-readable media on which computer-readable code is stored defining logical configurations of processing devices are disclosed. In an apparatus, fetch circuitry retrieves a sequence of instructions and execution circuitry performs data processing operations with respect to data values in a set of registers. An auxiliary execution circuitry interface and a coprocessor interface to provide a connection to a coprocessor outside the apparatus are provided.
    Type: Grant
    Filed: August 5, 2019
    Date of Patent: March 1, 2022
    Assignee: Arm Limited
    Inventors: Frederic Claude Marie Piry, Thomas Christoper Grocutt, Simon John Craske, Carlo Dario Fanara, Jean Sébastien Leroy
  • Patent number: 11256504
    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
  • Patent number: 11249766
    Abstract: An instruction set architecture including instructions for a processor and instructions for a coprocessor may include synchronizing instructions that may be used to begin and end instruction sequences that include coprocessor instructions (coprocessor sequences). If a terminating synchronizing instruction is followed by an initial synchronizing instruction and the pair are detected in the coprocessor concurrently, the coprocessor may suppress execution of the pair of instructions.
    Type: Grant
    Filed: October 22, 2020
    Date of Patent: February 15, 2022
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, Rajdeep L. Bhuyar, Ran A. Chachick, Andrew J. Beaumont-Smith
  • Patent number: 11216280
    Abstract: Exception control circuitry controls exception handling for processing circuitry. In response to an initial exception occurring when the processing circuitry is in a given exception level, the initial exception to be handled in a target exception level, the exception control circuitry stores exception control information to at least one exception control register associated with the target exception level, indicating at least one property of the initial exception or of processor state at a time the initial exception occurred. When at least one exception intercept configuration parameter stored in a configuration register indicates that exception interception is enabled, after storing the exception control information, and before the processing circuitry starts processing an exception handler for handling the initial exception in the target exception level, the exception control circuitry triggers a further exception to be handled in a predetermined exception level.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: January 4, 2022
    Assignee: Arm Limited
    Inventor: Simon John Craske
  • Patent number: 11216303
    Abstract: A method may include obtaining, for a task of a pipeline of an application: task execution metadata including a set of previous results, and a task image including executable code and an execution environment. The method may further include executing the executable code in the execution environment to generate a set of new results, and controlling execution of the pipeline using the set of new results and the set of previous results.
    Type: Grant
    Filed: July 29, 2019
    Date of Patent: January 4, 2022
    Assignee: Intuit Inc.
    Inventors: Michael Willson, Gennadiy Ziskind
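
Illustrative note on patent 11269638 (listed above): its abstract describes a stream valid register that indicates which lanes of the stream head register hold valid data, for example when fewer data elements remain than there are lanes. The following is only a minimal sketch of that idea in Python, not the patented implementation. The function names `valid_lane_mask` and `masked_add`, the lane counts, and the choice to zero invalid lanes are all assumptions made for illustration and do not come from the patent.

```python
def valid_lane_mask(remaining, lanes):
    """Return one flag per lane; True marks a lane that holds valid data.

    `remaining` is the number of stream elements still to be delivered and
    `lanes` is the (hypothetical) lane count of the stream head register.
    """
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    # Clamp so that an exhausted stream yields no valid lanes and a long
    # stream fills every lane.
    valid = max(0, min(remaining, lanes))
    return [i < valid for i in range(lanes)]


def masked_add(a, b, mask):
    """Lane-wise add under a predicate mask.

    Invalid lanes are set to zero here; that policy is an assumption of
    this sketch, not something the abstract specifies.
    """
    return [x + y if m else 0 for x, y, m in zip(a, b, mask)]


if __name__ == "__main__":
    # Two elements remain in a four-lane register: only the first two
    # lanes take part in the operation.
    mask = valid_lane_mask(remaining=2, lanes=4)
    print(mask)
    print(masked_add([1, 2, 3, 4], [10, 20, 30, 40], mask))
```

The mask plays the role the abstract assigns to the predicate register: it gates a vector operation so that lanes past the end of the stream do not contribute results.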