Patents Examined by Michael Sun
  • Patent number: 11016763
    Abstract: Systems, apparatuses, and methods for compacting multiple groups of micro-operations into individual cache lines of a micro-operation cache are disclosed. A processor includes at least a decode unit and a micro-operation cache. When a new group of micro-operations is decoded and ready to be written to the micro-operation cache, the micro-operation cache determines which set is targeted by the new group of micro-operations. If there is a way in this set that can store the new group without evicting any existing group already stored in the way, then the new group is stored into the way with the existing group(s) of micro-operations. Metadata is then updated to indicate that the new group of micro-operations has been written to the way. Additionally, the micro-operation cache manages eviction and replacement policy at the granularity of micro-operation groups rather than at the granularity of cache lines.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: May 25, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jagadish B. Kotra, John Kalamatianos
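
The placement rule in the abstract for patent 11016763 can be illustrated with a minimal sketch. It is not the patented implementation: the `Way` class, the `WAY_CAPACITY` value, and the fallback choice of evicting the oldest groups from way 0 are all assumptions made for illustration. What it shows is the compaction step (a new group shares a way with existing groups when it fits) and eviction at the granularity of groups rather than whole lines.

```python
from dataclasses import dataclass, field

WAY_CAPACITY = 8  # assumed number of micro-op slots per way (hypothetical)


@dataclass
class Way:
    # Each stored group is a (tag, number_of_micro_ops) pair.
    groups: list = field(default_factory=list)

    def free_slots(self):
        return WAY_CAPACITY - sum(n for _, n in self.groups)


def insert_group(cache_set, tag, n_uops):
    """Store a group in the targeted set and return the index of the way used."""
    if n_uops > WAY_CAPACITY:
        raise ValueError("group larger than a way")
    # Compaction: reuse any way that has room without evicting anything.
    for i, way in enumerate(cache_set):
        if way.free_slots() >= n_uops:
            way.groups.append((tag, n_uops))
            return i
    # No way has room: evict individual groups (oldest first) from way 0.
    # Picking way 0 is a placeholder policy, not taken from the abstract.
    victim = cache_set[0]
    while victim.free_slots() < n_uops:
        victim.groups.pop(0)
    victim.groups.append((tag, n_uops))
    return 0
```

The list of groups held in each way stands in for the metadata that the abstract says is updated when a group is written.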
  • Patent number: 10990393
    Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine if an instruction dependent upon the ordering check triggering transaction has previously had a physical address generated; and in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously had a physical address generated, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: April 27, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Krishnan V. Ramani, Susumu Mashimo
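
The filtering idea in the abstract for patent 10990393 can be sketched as a small table keyed by address range. This is an illustrative model only: the `FilterTable` class, the `RANGE_BITS` granularity, and the `full_lookup` callback standing in for the ordering violation memory structure are hypothetical names and values, not details from the patent.

```python
RANGE_BITS = 12  # assumed range granularity: 4 KiB address ranges (hypothetical)


class FilterTable:
    """Tracks which address ranges have had a physical address generated."""

    def __init__(self):
        self.seen_ranges = set()

    def record_address(self, addr):
        # Called when a tracked instruction generates a physical address.
        self.seen_ranges.add(addr >> RANGE_BITS)

    def has_indication(self, addr):
        return (addr >> RANGE_BITS) in self.seen_ranges


def ordering_check(table, target_addr, full_lookup):
    """Return True if a dependent in-flight instruction is found."""
    if not table.has_indication(target_addr):
        return False  # bypass the expensive lookup entirely
    return full_lookup(target_addr)
```

Because the table is indexed by range rather than by exact address, it can report an indication for an address that was never recorded, in which case the full lookup still runs. It never bypasses the lookup for a range that was recorded.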
  • Patent number: 10990406
    Abstract: An instruction execution device includes a processor. The processor includes an instruction translator, a reorder buffer, an architecture register, and an execution unit. The instruction translator receives a macro-instruction and translates the macro-instruction into a first micro-instruction, a second micro-instruction and a third micro-instruction. The instruction translator marks the first micro-instruction and the second micro-instruction with the same atomic operation flag. The execution unit executes the first micro-instruction to generate a first execution result and to store the first execution result in a temporary register. The execution unit executes the second micro-instruction to generate a second execution result and to store the second execution result in the architecture register. The execution unit executes the third micro-instruction to read the first execution result from the temporary register and to store the first execution result in the architecture register.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: April 27, 2021
    Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
    Inventors: Penghao Zou, Zhi Zhang
  • Patent number: 10983794
    Abstract: A processor to facilitate register sharing is disclosed. The processor includes a plurality of execution units (EUs), each including a General Purpose Register File (GRF) having a plurality of registers; and register sharing hardware to divide the plurality of registers into a first set of registers dedicated for execution of a first set of threads and a second set of registers shared for execution of a second set of threads.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: April 20, 2021
    Assignee: Intel Corporation
    Inventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Weiyu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski
  • Patent number: 10983793
    Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: April 20, 2021
    Assignee: Intel Corporation
    Inventors: Joshua Fryman, Ankit More, Jason Howard, Robert Pawlowski, Yigit Demir, Nick Pepperling, Fabrizio Petrini, Sriram Aananthakrishnan, Shaden Smith
  • Patent number: 10977038
    Abstract: A processing apparatus supporting register renaming is provided with checkpoint circuitry to capture register mapping checkpoints indicative of speculative register mappings between logical registers and physical registers at a given point of speculative execution, and register group tracking circuitry to maintain tracking information for groups of logical registers. The tracking information for a given group indicates whether the given group is a changed group comprising at least one logical register for which a corresponding speculative register mapping has changed since a last checkpoint was captured, or an unchanged group for which none of the logical registers in that group have had their speculative register mappings changed since the last checkpoint was captured. When capturing a new register mapping checkpoint, unchanged groups of logical registers are excluded from the new register mapping checkpoint. This can save power in a register mapping checkpointing scheme.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: April 13, 2021
    Assignee: Arm Limited
    Inventor: William Elton Burky
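
The checkpointing scheme in the abstract for patent 10977038 can be modeled in a few lines. The sketch below is an assumption-laden illustration, not the patented circuitry: the `RenameMap` class, the `GROUP_SIZE` value, and the use of a set for the per-group tracking information are hypothetical choices. It shows a checkpoint that captures only groups whose mappings changed since the last checkpoint.

```python
GROUP_SIZE = 4  # assumed number of logical registers per tracked group


class RenameMap:
    def __init__(self, n_logical):
        # Start with an identity mapping: logical register i -> physical i.
        self.mapping = list(range(n_logical))
        self.changed_groups = set()  # tracking information, one flag per group

    def rename(self, logical, physical):
        self.mapping[logical] = physical
        self.changed_groups.add(logical // GROUP_SIZE)

    def checkpoint(self):
        """Capture mappings for changed groups only, then reset tracking."""
        snapshot = {
            g: self.mapping[g * GROUP_SIZE:(g + 1) * GROUP_SIZE]
            for g in sorted(self.changed_groups)
        }
        self.changed_groups.clear()
        return snapshot
```

With eight logical registers, renaming only register 5 yields a checkpoint that holds group 1 alone, and an immediately following checkpoint is empty.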
  • Patent number: 10970070
    Abstract: An apparatus has processing circuitry to perform, in response to decoding of an iterative-operation instruction by the instruction decoder, an iterative operation comprising at least two iterations of processing where one iteration depends on an operand generated in a previous iteration. Preliminary information generating circuitry performs a preliminary portion of processing for a given iteration to generate preliminary information. Result generating circuitry performs a remaining portion of processing for the given iteration, to generate a result value using the preliminary information. Forwarding circuitry forwards the result value as an operand for a next iteration of the iterative operation, for iterations other than the final iteration. The preliminary information generating circuitry starts performing the preliminary portion for the next iteration in parallel with the result generating circuitry completing the remaining portion for the current iteration, to improve performance.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: April 6, 2021
    Assignee: Arm Limited
    Inventors: Nicholas Andrew Pfister, Srinivas Vemuri, David Raymond Lutz
  • Patent number: 10970108
    Abstract: The present invention discloses a method and an apparatus for executing a non-maskable interrupt. The method includes: obtaining a secure interrupt request in a non-secure mode, and interrupting an operation of an operating system OS, where the secure interrupt request cannot be masked; entering a secure mode by using the secure interrupt request, and saving, in the secure mode, an interrupt context of an OS status when the operation of the OS is interrupted; returning to the non-secure mode to execute user-defined processing; after the user-defined processing is completed, entering the secure mode again, and resuming the OS status in the secure mode according to the interrupt context; and returning to the non-secure mode again, and continuing to execute an operation of the OS. The method and the apparatus for executing a non-maskable interrupt in embodiments of the present invention can easily implement an NMI mechanism without depending on hardware.
    Type: Grant
    Filed: October 3, 2019
    Date of Patent: April 6, 2021
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Jun Ma, Tianhong Ding, Zhaozhe Tong
  • Patent number: 10963404
    Abstract: A DIMM is described. The DIMM includes circuitry to simultaneously transfer data of different ranks of memory chips on the DIMM over a same data bus during a same burst write sequence.
    Type: Grant
    Filed: June 25, 2018
    Date of Patent: March 30, 2021
    Assignee: Intel Corporation
    Inventors: James A. McCall, Rajat Agarwal, George Vergis, Bill Nale
  • Patent number: 10956343
    Abstract: Systems and methods are disclosed and include a processor configured to execute instructions stored in a nontransitory computer-readable medium. The instructions include generating first message authentication code (MAC) bytes based on a shared secret key. The instructions include generating first nonce bytes and an authenticated packet based on the first MAC bytes, the first nonce bytes, and a message byte. The instructions include generating a de-whitened tone byte based on the shared secret key. The instructions include generating a message packet that includes the authenticated packet and the de-whitened tone byte. Generating the message packet includes pseudo-randomly identifying a first location of the authenticated packet and inserting the de-whitened tone byte at the first location.
    Type: Grant
    Filed: October 7, 2019
    Date of Patent: March 23, 2021
    Assignees: DENSO International America, Inc., DENSO CORPORATION
    Inventors: Raymond Michael Stitt, Thomas Peterson, Karl Jager, Kyle Golsch
  • Patent number: 10956168
    Abstract: A computer data processing system includes an instruction pipeline having a front end and a back end, a decoding and dispatch unit to dispatch a current instruction; and a pipeline by-pass unit to invoke an out-of-order pipeline by-pass operation. The pipeline by-pass unit by-passes a section of the instruction pipeline such that the current instruction architecturally completes before initiating instruction execution. The computer data processing system further includes a post-completion execution unit that executes the current instruction after the current instruction architecturally completes.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: March 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Avery Francois, Christian Jacobi, Gregory William Alexander
  • Patent number: 10956160
    Abstract: A processor and method are described for a multi-level reservation station.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: March 23, 2021
    Assignee: Intel Corporation
    Inventors: Mark Dechene, Srikanth Srinivasan, Matthew Merten, Ammon Christiansen
  • Patent number: 10956166
    Abstract: A data processing apparatus includes obtain circuitry that obtains a stream of instructions. The stream of instructions includes a barrier creation instruction and a barrier inhibition instruction. Track circuitry orders sending each instruction in the stream of instructions to processing circuitry based on one or more dependencies. The track circuitry is responsive to the barrier creation instruction to cause the one or more dependencies to include one or more barrier dependencies in which pre-barrier instructions, occurring before the barrier creation instruction in the stream, are sent before post-barrier instructions, occurring after the barrier creation instruction in the stream, are sent. The track circuitry is also responsive to the barrier inhibition instruction to relax the barrier dependencies to permit post-inhibition instructions, occurring after the barrier inhibition instruction in the stream, to be sent before the pre-barrier instructions.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: March 23, 2021
    Assignees: Arm Limited, The Regents of The University of Michigan
    Inventors: Vaibhav Gogte, Wei Wang, Stephan Diestelhorst, Peter M Chen, Satish Narayanasamy, Thomas Friedrich Wenisch
  • Patent number: 10949210
    Abstract: A computing device, having: a processor; memory; a first cache coupled between the memory and the processor; and a second cache coupled between the memory and the processor. During speculative execution of one or more instructions, effects of the speculative execution are contained within the second cache.
    Type: Grant
    Filed: July 6, 2018
    Date of Patent: March 16, 2021
    Assignee: Micron Technology, Inc.
    Inventor: Steven Jeffrey Wallach
  • Patent number: 10949362
    Abstract: Technologies for facilitating remote memory requests in accelerator devices are disclosed. The accelerator device includes circuitry to receive, from a kernel of the present accelerator device, a request through an application programming interface exposed to a high level software language in which the kernel of the present accelerator device is implemented, to establish a logical communication path between the kernel of the present accelerator device and a target accelerator device kernel, based on one or more physical communication paths. The communication protocol supported by the accelerator device may allow kernels operating on the accelerator device to send memory requests for memory locations at remote devices, with the communication protocol performing all of the operations necessary to carry out the memory request.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: March 16, 2021
    Assignee: Intel Corporation
    Inventors: Susanne M. Balle, Evan Custodio, Paul H. Dormitzer, Narayan Ranganathan
  • Patent number: 10942738
    Abstract: The present disclosure is directed to systems and methods for performing one or more operations on a two dimensional tile register using an accelerator that includes a tiled matrix multiplication unit (TMU). The processor circuitry includes reservation station (RS) circuitry to communicatively couple the processor circuitry to the TMU. The RS circuitry coordinates the operations performed by the TMU. TMU dispatch queue (TDQ) circuitry in the TMU maintains the operations received from the RS circuitry in the order that the operations are received from the RS circuitry. Since the duration of each operation is not known prior to execution by the TMU, the RS circuitry maintains shadow dispatch queue (RS-TDQ) circuitry that mirrors the operations in the TDQ circuitry.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: March 9, 2021
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Amit Gradstein, Simon Rubanovich, Igor Yanover, Gavri Berger, Eyal Hadas, Saeed Kharouf, Ron Schneider, Sagi Meller, Jose Yallouz
  • Patent number: 10922081
    Abstract: Establishing a conditional branch frame barrier is described. A conditional branch in a function epilogue is used to provide frame-specific control. The conditional branch evaluates a return condition to determine whether to return from a callee function to a calling function, or to execute a slow path instead. The return condition is evaluated based on a thread local value. The thread local value is set such that returns to potentially unsafe frames in a call stack are prohibited. The prohibition to return to a potentially unsafe frame may be referred to as a “frame barrier.” Additionally, the thread local value may be used to establish safepointing and/or thread local handshakes, both after execution of a function body and after execution of a loop body.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: February 16, 2021
    Assignee: Oracle International Corporation
    Inventor: Erik Österlund
  • Patent number: 10922078
    Abstract: A system includes a host processor and at least one storage device coupled to the host processor. The host processor is configured to execute instructions of an instruction set, the instruction set comprising a first move instruction for moving data identified by at least one operand of the first move instruction into each of multiple distinct storage locations. The host processor, in executing the first move instruction, is configured to store the data in a first one of the storage locations identified by one or more additional operands of the first move instruction, and to store the data in a second one of the storage locations identified based at least in part on the first storage location. The instruction set in some embodiments further comprises a second move instruction for moving the data from the multiple distinct storage locations to another storage location.
    Type: Grant
    Filed: June 18, 2019
    Date of Patent: February 16, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Michael Robillard, Adrian Michaud, Dragan Savic
  • Patent number: 10915322
    Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.
    Type: Grant
    Filed: September 18, 2018
    Date of Patent: February 9, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
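
The two-threshold decision in the abstract for patent 10915322 can be written as a short function. The function name, the string labels, and the `None` result are assumptions for illustration. The abstract does not say what happens when the prediction falls between the two thresholds, so the sketch leaves that case undecided rather than guessing.

```python
def select_execution_mode(predicted_iterations, first_threshold, second_threshold):
    """Pick an execution mode from a predicted loop iteration count."""
    if predicted_iterations > first_threshold:
        # Loop mode: run from the loop buffer with part of the pipeline
        # placed in a low-power state.
        return "loop"
    if predicted_iterations <= second_threshold:
        # Non-loop mode: keep the pipeline powered and fetch normally.
        return "non-loop"
    # Between the thresholds the abstract is silent; left unspecified here.
    return None
```

If the two thresholds are set to the same value, every prediction falls into one of the two named modes.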
  • Patent number: 10915326
    Abstract: A cache system, having a first cache, a second cache, and a logic circuit coupled to control the first cache and the second cache according to an execution type of a processor. When an execution type of a processor is a first type indicating non-speculative execution of instructions and the first cache is configured to service commands from a command bus for accessing a memory system, the logic circuit is configured to copy a portion of content cached in the first cache to the second cache. The cache system can include a configurable data bit. The logic circuit can be coupled to control the caches according to the bit. Alternatively, the caches can include cache sets. The caches can also include registers associated with the cache sets respectively. The logic circuit can be coupled to control the cache sets according to the registers.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: February 9, 2021
    Assignee: Micron Technology, Inc.
    Inventor: Steven Jeffrey Wallach