Patents Examined by Michael Sun

Implementing a micro-operation cache with compaction

Patent number: 11016763

Abstract: Systems, apparatuses, and methods for compacting multiple groups of micro-operations into individual cache lines of a micro-operation cache are disclosed. A processor includes at least a decode unit and a micro-operation cache. When a new group of micro-operations is decoded and ready to be written to the micro-operation cache, the micro-operation cache determines which set is targeted by the new group of micro-operations. If there is a way in this set that can store the new group without evicting any existing group already stored in the way, then the new group is stored into the way with the existing group(s) of micro-operations. Metadata is then updated to indicate that the new group of micro-operations has been written to the way. Additionally, the micro-operation cache manages eviction and replacement policy at the granularity of micro-operation groups rather than at the granularity of cache lines.
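
The compaction step in the abstract above can be sketched roughly as follows. This is an illustrative model only, not the patented design: the names `Way`, `CacheSet`, `insert_group`, the capacity constant `WAY_CAPACITY`, and the oldest-first victim choice in way 0 are all assumptions made for the example.

```python
from dataclasses import dataclass, field

WAY_CAPACITY = 8  # assumed number of micro-op slots per way (one cache line)


@dataclass
class Way:
    # Metadata: group tag -> number of micro-ops that group occupies.
    groups: dict = field(default_factory=dict)

    def free_slots(self):
        return WAY_CAPACITY - sum(self.groups.values())


@dataclass
class CacheSet:
    ways: list


def insert_group(cache_set, tag, num_uops):
    """Store a micro-op group; return (way_index, evicted_tags)."""
    if num_uops > WAY_CAPACITY:
        raise ValueError("group does not fit in a single way")
    # Preferred case: a way with enough free room, so no existing group is evicted.
    for index, way in enumerate(cache_set.ways):
        if way.free_slots() >= num_uops:
            way.groups[tag] = num_uops  # metadata update for the new group
            return index, []
    # Fallback (assumed policy): evict whole groups, oldest first, from way 0.
    # Eviction works on groups, not on the entire line.
    victim = cache_set.ways[0]
    evicted = []
    while victim.free_slots() < num_uops:
        old_tag = next(iter(victim.groups))  # dicts keep insertion order
        del victim.groups[old_tag]
        evicted.append(old_tag)
    victim.groups[tag] = num_uops
    return 0, evicted
```

In this sketch a 3-micro-op group joins a way that already holds a 5-micro-op group, and a later group that fits nowhere displaces only as many older groups as it needs.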

Type: Grant

Filed: March 8, 2019

Date of Patent: May 25, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Jagadish B. Kotra, John Kalamatianos

Address-based filtering for load/store speculation

Patent number: 10990393

Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine if an instruction dependent upon the ordering check triggering transaction has previously been generated a physical address; and in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously been generated a physical address, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.
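
A minimal sketch of the filtering idea described above, under stated assumptions: the 4 KiB range granularity (`RANGE_BITS`), the class `FilterTable`, and the modelling of the ordering violation memory structure as a plain set of in-flight addresses are hypothetical choices for illustration, not details taken from the patent.

```python
RANGE_BITS = 12  # assumed granularity: one table entry per 4 KiB address range


class FilterTable:
    def __init__(self):
        # Address ranges in which a dependent instruction has already
        # had a physical address generated.
        self.ranges = set()

    def record_address_generated(self, addr):
        self.ranges.add(addr >> RANGE_BITS)

    def has_indication(self, addr):
        return (addr >> RANGE_BITS) in self.ranges


def ordering_check(table, in_flight_addrs, target_addr):
    """Return (lookup_performed, violation_found) for a triggering transaction."""
    if not table.has_indication(target_addr):
        # No indication in the filter: bypass the more expensive lookup.
        return False, False
    # Indication present: fall through to the full lookup.
    return True, target_addr in in_flight_addrs
```

Because the table tracks ranges rather than exact addresses, it can report an indication for an address that is not actually in flight; in that case the full lookup still runs and simply finds nothing.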

Type: Grant

Filed: October 21, 2019

Date of Patent: April 27, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: John Kalamatianos, Krishnan V. Ramani, Susumu Mashimo

Instruction execution method and instruction execution device

Patent number: 10990406

Abstract: An instruction execution device includes a processor. The processor includes an instruction translator, a reorder buffer, an architecture register, and an execution unit. The instruction translator receives a macro-instruction and translates the macro-instruction into a first micro-instruction, a second micro-instruction and a third micro-instruction. The instruction translator marks the first micro-instruction and the second micro-instruction with the same atomic operation flag. The execution unit executes the first micro-instruction to generate a first execution result and to store the first execution result in a temporary register. The execution unit executes the second micro-instruction to generate a second execution result and to store the second execution result in the architecture register. The execution unit executes the third micro-instruction to read the first execution result from the temporary register and to store the first execution result in the architecture register.

Type: Grant

Filed: September 26, 2019

Date of Patent: April 27, 2021

Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.

Inventors: Penghao Zou, Zhi Zhang

Register sharing mechanism

Patent number: 10983794

Abstract: A processor to facilitate register sharing is disclosed. The processor includes a plurality of execution units (EUs), each including a General Purpose Register File (GRF) having a plurality of registers; and register sharing hardware to divide the plurality of registers into a first set of registers dedicated for execution of a first set of threads and a second set of registers shared for execution of a second set of threads.

Type: Grant

Filed: June 17, 2019

Date of Patent: April 20, 2021

Assignee: Intel Corporation

Inventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Weiyu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski

Array broadcast and reduction systems and methods

Patent number: 10983793

Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Because the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are enhanced.

Type: Grant

Filed: March 29, 2019

Date of Patent: April 20, 2021

Assignee: Intel Corporation

Inventors: Joshua Fryman, Ankit More, Jason Howard, Robert Pawlowski, Yigit Demir, Nick Pepperling, Fabrizio Petrini, Sriram Aananthakrishnan, Shaden Smith

Checkpointing speculative register mappings

Patent number: 10977038

Abstract: A processing apparatus supporting register renaming is provided with checkpoint circuitry to capture register mapping checkpoints indicative of speculative register mappings between logical registers and physical registers at a given point of speculative execution, and register group tracking circuitry to maintain tracking information for groups of logical registers. The tracking information for a given group indicates whether the given group is a changed group comprising at least one logical register for which a corresponding speculative register mapping has changed since a last checkpoint was captured, or an unchanged group for which none of the logical registers in that group have had their speculative register mappings changed since the last checkpoint was captured. When capturing a new register mapping checkpoint, unchanged groups of logical registers are excluded from the new register mapping checkpoint. This can save power in a register mapping checkpointing scheme.
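
The group-tracking scheme above might be modelled along these lines. This is only a sketch: the group size (`GROUP_SIZE`), the class `RenameMap`, and the representation of a checkpoint as a dictionary keyed by group index are assumptions for illustration.

```python
GROUP_SIZE = 4  # assumed number of logical registers per tracked group


class RenameMap:
    def __init__(self, num_logical):
        # Speculative mapping: index = logical register, value = physical register.
        self.mapping = list(range(num_logical))
        # Groups whose mappings changed since the last checkpoint.
        self.changed = set()

    def rename(self, logical, physical):
        self.mapping[logical] = physical
        self.changed.add(logical // GROUP_SIZE)

    def checkpoint(self):
        """Capture only the changed groups, then reset the tracking state."""
        snapshot = {}
        for group in sorted(self.changed):
            start = group * GROUP_SIZE
            snapshot[group] = tuple(self.mapping[start:start + GROUP_SIZE])
        self.changed.clear()
        return snapshot
```

With eight logical registers, renaming only register 5 yields a checkpoint containing group 1 alone; an immediately following checkpoint is empty because no group has changed.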

Type: Grant

Filed: June 19, 2019

Date of Patent: April 13, 2021

Assignee: Arm Limited

Inventor: William Elton Burky

Processing of iterative operation

Patent number: 10970070

Abstract: An apparatus has processing circuitry to perform, in response to decoding of an iterative-operation instruction by the instruction decoder, an iterative operation comprising at least two iterations of processing where one iteration depends on an operand generated in a previous iteration. Preliminary information generating circuitry performs a preliminary portion of processing for a given iteration to generate preliminary information. Result generating circuitry performs a remaining portion of processing for the given iteration, to generate a result value using the preliminary information. Forwarding circuitry forwards the result value as an operand for a next iteration of the iterative operation, for iterations other than the final iteration. The preliminary information generating circuitry starts performing the preliminary portion for the next iteration in parallel with the result generating circuitry completing the remaining portion for the current iteration, to improve performance.

Type: Grant

Filed: March 29, 2019

Date of Patent: April 6, 2021

Assignee: Arm Limited

Inventors: Nicholas Andrew Pfister, Srinivas Vemuri, David Raymond Lutz

Method and apparatus for executing non-maskable interrupt

Patent number: 10970108

Abstract: The present invention discloses a method and an apparatus for executing a non-maskable interrupt. The method includes: obtaining a secure interrupt request in a non-secure mode, and interrupting an operation of an operating system OS, where the secure interrupt request cannot be masked; entering a secure mode by using the secure interrupt request, and saving, in the secure mode, an interrupt context of an OS status when the operation of the OS is interrupted; returning to the non-secure mode to execute user-defined processing; after the user-defined processing is completed, entering the secure mode again, and resuming the OS status in the secure mode according to the interrupt context; and returning to the non-secure mode again, and continuing to execute an operation of the OS. The method and the apparatus for executing a non-maskable interrupt in embodiments of the present invention can easily implement an NMI mechanism without depending on hardware.

Type: Grant

Filed: October 3, 2019

Date of Patent: April 6, 2021

Assignee: Huawei Technologies Co., Ltd.

Inventors: Jun Ma, Tianhong Ding, Zhaozhe Tong

High bandwidth DIMM

Patent number: 10963404

Abstract: A DIMM is described. The DIMM includes circuitry to simultaneously transfer data of different ranks of memory chips on the DIMM over a same data bus during a same burst write sequence.

Type: Grant

Filed: June 25, 2018

Date of Patent: March 30, 2021

Assignee: Intel Corporation

Inventors: James A. McCall, Rajat Agarwal, George Vergis, Bill Nale

Mobile de-whitening

Patent number: 10956343

Abstract: Systems and methods are disclosed and include a processor configured to execute instructions stored in a nontransitory computer-readable medium. The instructions include generating first message authentication code (MAC) bytes based on a shared secret key. The instructions include generating first nonce bytes and an authenticated packet based on the first MAC bytes, the first nonce bytes, and a message byte. The instructions include generating a de-whitened tone byte based on the shared secret key. The instructions include generating a message packet that includes the authenticated packet and the de-whitened tone byte. Generating the message packet includes pseudo-randomly identifying a first location of the authenticated packet and inserting the de-whitened tone byte at the first location.

Type: Grant

Filed: October 7, 2019

Date of Patent: March 23, 2021

Assignees: DENSO International America, Inc., DENSO CORPORATION

Inventors: Raymond Michael Stitt, Thomas Peterson, Karl Jager, Kyle Golsch

Post completion execution in an out-of-order processor design

Patent number: 10956168

Abstract: A computer data processing system includes an instruction pipeline having a front end and a back end, a decoding and dispatch unit to dispatch a current instruction; and a pipeline by-pass unit to invoke an out-of-order pipeline by-pass operation. The pipeline by-pass unit by-passes a section of the instruction pipeline such that the current instruction architecturally completes before initiating instruction execution. The computer data processing system further includes a post-completion execution unit that executes the current instruction after the current instruction architecturally completes.

Type: Grant

Filed: March 8, 2019

Date of Patent: March 23, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Avery Francois, Christian Jacobi, Gregory William Alexander

Method and apparatus for a multi-level reservation station with instruction recirculation

Patent number: 10956160

Abstract: A processor and method are described for a multi-level reservation station.

Type: Grant

Filed: March 27, 2019

Date of Patent: March 23, 2021

Assignee: Intel Corporation

Inventors: Mark Dechene, Srikanth Srinivasan, Matthew Merten, Ammon Christiansen

Instruction ordering

Patent number: 10956166

Abstract: A data processing apparatus includes obtain circuitry that obtains a stream of instructions. The stream of instructions includes a barrier creation instruction and a barrier inhibition instruction. Track circuitry orders sending each instruction in the stream of instructions to processing circuitry based on one or more dependencies. The track circuitry is responsive to the barrier creation instruction to cause the one or more dependencies to include one or more barrier dependencies in which pre-barrier instructions, occurring before the barrier creation instruction in the stream, are sent before post-barrier instructions, occurring after the barrier creation instruction in the stream, are sent. The track circuitry is also responsive to the barrier inhibition instruction to relax the barrier dependencies to permit post-inhibition instructions, occurring after the barrier inhibition instruction in the stream, to be sent before the pre-barrier instructions.

Type: Grant

Filed: March 8, 2019

Date of Patent: March 23, 2021

Assignees: Arm Limited, The Regents of The University of Michigan

Inventors: Vaibhav Gogte, Wei Wang, Stephan Diestelhorst, Peter M Chen, Satish Narayanasamy, Thomas Friedrich Wenisch

Shadow cache for securing conditional speculative instruction execution

Patent number: 10949210

Abstract: A computing device, having: a processor; memory; a first cache coupled between the memory and the processor; and a second cache coupled between the memory and the processor. During speculative execution of one or more instructions, effects of the speculative execution are contained within the second cache.

Type: Grant

Filed: July 6, 2018

Date of Patent: March 16, 2021

Assignee: Micron Technology, Inc.

Inventor: Steven Jeffrey Wallach

Technologies for facilitating remote memory requests in accelerator devices

Patent number: 10949362

Abstract: Technologies for facilitating remote memory requests in accelerator devices are disclosed. The accelerator device includes circuitry to receive, from a kernel of the present accelerator device, a request through an application programming interface exposed to a high level software language in which the kernel of the present accelerator device is implemented, to establish a logical communication path between the kernel of the present accelerator device and a target accelerator device kernel, based on one or more physical communication paths. The communication protocol supported by the accelerator device may allow kernels operating on the accelerator device to send memory requests for memory locations at remote devices, with the communication protocol performing all of the operations necessary to carry out the memory request.

Type: Grant

Filed: June 28, 2019

Date of Patent: March 16, 2021

Assignee: Intel Corporation

Inventors: Susanne M. Balle, Evan Custodio, Paul H. Dormitzer, Narayan Ranganathan

Accelerator systems and methods for matrix operations

Patent number: 10942738

Abstract: The present disclosure is directed to systems and methods for performing one or more operations on a two dimensional tile register using an accelerator that includes a tiled matrix multiplication unit (TMU). The processor circuitry includes reservation station (RS) circuitry to communicatively couple the processor circuitry to the TMU. The RS circuitry coordinates the operations performed by the TMU. TMU dispatch queue (TDQ) circuitry in the TMU maintains the operations received from the RS circuitry in the order that the operations are received from the RS circuitry. Since the duration of each operation is not known prior to execution by the TMU, the RS circuitry maintains shadow dispatch queue (RS-TDQ) circuitry that mirrors the operations in the TDQ circuitry.

Type: Grant

Filed: March 29, 2019

Date of Patent: March 9, 2021

Assignee: Intel Corporation

Inventors: Zeev Sperber, Amit Gradstein, Simon Rubanovich, Igor Yanover, Gavri Berger, Eyal Hadas, Saeed Kharouf, Ron Schneider, Sagi Meller, Jose Yallouz

Conditional branch frame barrier

Patent number: 10922081

Abstract: Establishing a conditional branch frame barrier is described. A conditional branch in a function epilogue is used to provide frame-specific control. The conditional branch evaluates a return condition to determine whether to return from a callee function to a calling function, or to execute a slow path instead. The return condition is evaluated based on a thread local value. The thread local value is set such that returns to potentially unsafe frames in a call stack are prohibited. The prohibition to return to a potentially unsafe frame may be referred to as a “frame barrier.” Additionally, the thread local value may be used to establish safepointing and/or thread local handshakes, both after execution of a function body and after execution of a loop body.

Type: Grant

Filed: June 19, 2019

Date of Patent: February 16, 2021

Assignee: Oracle International Corporation

Inventor: Erik Österlund

Host processor configured with instruction set comprising resilient data move instructions

Patent number: 10922078

Abstract: A system includes a host processor and at least one storage device coupled to the host processor. The host processor is configured to execute instructions of an instruction set, the instruction set comprising a first move instruction for moving data identified by at least one operand of the first move instruction into each of multiple distinct storage locations. The host processor, in executing the first move instruction, is configured to store the data in a first one of the storage locations identified by one or more additional operands of the first move instruction, and to store the data in a second one of the storage locations identified based at least in part on the first storage location. The instruction set in some embodiments further comprises a second move instruction for moving the data from the multiple distinct storage locations to another storage location.

Type: Grant

Filed: June 18, 2019

Date of Patent: February 16, 2021

Assignee: EMC IP Holding Company LLC

Inventors: Michael Robillard, Adrian Michaud, Dragan Savic

Using loop exit prediction to accelerate or suppress loop mode of a processor

Patent number: 10915322

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.
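
The two-threshold decision in the abstract above can be illustrated with a short sketch. The function name `choose_mode`, the parameter names, and the string labels are hypothetical; the abstract does not say what happens when the prediction falls between the two thresholds, so the sketch reports that case as unspecified rather than guessing.

```python
def choose_mode(predicted_iterations, loop_threshold, non_loop_threshold):
    """Pick an execution mode from a predicted loop iteration count."""
    if predicted_iterations > loop_threshold:
        # Loop mode: execute from the loop buffer and place at least one
        # pipeline component in a low-power state.
        return "loop"
    if predicted_iterations <= non_loop_threshold:
        # Non-loop mode: keep the pipeline powered up and execute from
        # the instruction fetch unit.
        return "non-loop"
    # Between the thresholds: behaviour not described in the abstract.
    return "unspecified"
```

If both thresholds are set to the same value, every prediction maps to either loop mode or non-loop mode and the unspecified case never arises.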

Type: Grant

Filed: September 18, 2018

Date of Patent: February 9, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis

Cache systems and circuits for syncing caches or cache sets

Patent number: 10915326

Abstract: A cache system, having a first cache, a second cache, and a logic circuit coupled to control the first cache and the second cache according to an execution type of a processor. When an execution type of a processor is a first type indicating non-speculative execution of instructions and the first cache is configured to service commands from a command bus for accessing a memory system, the logic circuit is configured to copy a portion of content cached in the first cache to the second cache. The cache system can include a configurable data bit. The logic circuit can be coupled to control the caches according to the bit. Alternatively, the caches can include cache sets. The caches can also include registers associated with the cache sets respectively. The logic circuit can be coupled to control the cache sets according to the registers.

Type: Grant

Filed: July 31, 2019

Date of Patent: February 9, 2021

Assignee: Micron Technology, Inc.

Inventor: Steven Jeffrey Wallach
