Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)

System and method for instruction unwinding in an out-of-order processor

Patent number: 12386625

Abstract: A system and corresponding method unwind instructions in an out-of-order (OoO) processor. The system comprises a mapper. In response to a restart event causing at least one instruction to be unwound, the mapper restores a present integer mapper state and present floating-point (FP) mapper state, used for mapping instructions, to a former integer mapper state and former FP mapper state, respectively. The mapper stores integer snapshots and FP snapshots of the present integer and FP mapper state, respectively, to expedite restoration to the former integer and FP mapper state, respectively. Access to the FP snapshots is blocked, intermittently, as a function of at least one FP present indicator used by the mapper to record presence of FP registers used as destinations in the instructions. Blocking the access, intermittently, improves power efficiency of the OoO processor.

Type: Grant

Filed: January 18, 2023

Date of Patent: August 12, 2025

Assignee: Marvell Asia Pte, Ltd.

Inventor: David A. Carlson
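
The snapshot scheme in the abstract for patent 12386625 can be loosely illustrated with a small Python model. This is only a sketch: the class `Mapper`, its method names, the `fp_accesses` counter, and the exact blocking policy (skip FP snapshot writes and reads while no FP destination has been mapped) are assumptions made for illustration. The abstract says only that access is blocked as a function of an FP present indicator.

```python
class Mapper:
    """Toy rename mapper (architectural -> physical register). Hypothetical model."""

    def __init__(self):
        self.int_map = {}
        self.fp_map = {}
        self.snapshots = []      # list of (int_snapshot, fp_snapshot or None)
        self.fp_present = False  # an FP destination was mapped since the last checkpoint
        self.fp_accesses = 0     # FP snapshot reads and writes actually performed

    def map_dest(self, arch_reg, phys_reg, is_fp=False):
        if is_fp:
            self.fp_map[arch_reg] = phys_reg
            self.fp_present = True
        else:
            self.int_map[arch_reg] = phys_reg

    def checkpoint(self):
        """Snapshot the mapper; write the FP snapshot only if FP state changed."""
        fp_copy = None
        if self.fp_present:
            fp_copy = dict(self.fp_map)
            self.fp_accesses += 1
        self.snapshots.append((dict(self.int_map), fp_copy))
        self.fp_present = False
        return len(self.snapshots) - 1

    def restore(self, snap_id):
        """Unwind to checkpoint snap_id; read FP snapshots only if FP state moved on."""
        self.int_map = dict(self.snapshots[snap_id][0])
        later = self.snapshots[snap_id + 1:]
        fp_dirty = self.fp_present or any(fp is not None for _, fp in later)
        if fp_dirty:
            # FP state at snap_id is the newest FP snapshot at or before it.
            saved = [fp for _, fp in self.snapshots[:snap_id + 1] if fp is not None]
            if saved:
                self.fp_map = dict(saved[-1])
                self.fp_accesses += 1
            else:
                self.fp_map = {}
        del self.snapshots[snap_id + 1:]
        self.fp_present = False


m = Mapper()
m.map_dest("x1", "p10")
s0 = m.checkpoint()
m.map_dest("x1", "p11")
m.restore(s0)  # integer-only unwind: the FP snapshot store is never touched
```

In this model an integer-only instruction stream never reads or writes the FP snapshot store, which is the kind of saving the abstract attributes to intermittent blocking.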

Superscalar delay optimization with divided issue queue

Patent number: 12360772

Abstract: A superscalar delay optimization with a divided issue queue, wherein the issue queue is divided into three types of queue, namely a ready queue, a wait 1 queue, and a wait 2 queue; the length of each type of queue is one-third of the length of the issue queue, so that the delay of scanning the entire issue queue from beginning to end per clock cycle is reduced to one-third.

Type: Grant

Filed: July 13, 2023

Date of Patent: July 15, 2025

Inventor: Xiuquan Xu
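
A minimal Python sketch of the divided issue queue from patent 12360772 follows. The abstract names the three queues but does not say what separates them, so this model assumes that "wait 1" and "wait 2" hold instructions with one and two outstanding source operands. The class `DividedIssueQueue` and its methods are hypothetical.

```python
class DividedIssueQueue:
    """Issue queue split into ready / wait 1 / wait 2 (assumed semantics)."""

    def __init__(self, total_len=12):
        self.cap = total_len // 3    # each sub-queue is one third of the total
        self.queues = [[], [], []]   # index = number of pending source operands

    def insert(self, name, pending_srcs):
        """Place an instruction by its count of pending sources (at most 2)."""
        pending = set(pending_srcs)
        queue = self.queues[len(pending)]
        if len(queue) >= self.cap:
            return False             # sub-queue full, dispatch would stall
        queue.append((name, pending))
        return True

    def wakeup(self, reg):
        """A result for reg is available: promote entries that waited on it.

        Capacity of the destination queue is not checked in this sketch.
        """
        for level in (1, 2):         # wait 1 first, so each entry moves once
            stay = []
            for name, pending in self.queues[level]:
                if reg in pending:
                    self.queues[level - 1].append((name, pending - {reg}))
                else:
                    stay.append((name, pending))
            self.queues[level] = stay

    def select(self):
        """Scan only the ready queue (cap entries), oldest first."""
        return self.queues[0].pop(0)[0] if self.queues[0] else None


q = DividedIssueQueue(total_len=12)
q.insert("add", [])
q.insert("mul", ["r1"])
q.insert("sub", ["r1", "r2"])
first = q.select()   # only "add" is ready at this point
q.wakeup("r1")       # "mul" becomes ready, "sub" moves to wait 1
second = q.select()
```

The selection logic only ever walks `cap` entries, one third of the total, which is the delay reduction the abstract claims.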

Approach for processing near-memory processing commands using near-memory register definition data

Patent number: 12265735

Abstract: An approach is provided for processing near-memory processing commands, e.g., PIM commands, using PIM register definition data that defines multiple combinations of source and/or destination registers to be used to process PIM commands. A particular combination of source and/or destination registers to be used to process a PIM command is specified by the PIM command or determined by a near-memory processing element processing the PIM command. According to another implementation, the PIM register definition data specifies an initial combination of source and/or destination registers and one or more update functions for each PIM command. A near-memory processing element processes a PIM command using the initial combination of source and/or destination registers and uses the one or more update functions to update the combination of source and/or destination registers to be used the next time the PIM command is processed.

Type: Grant

Filed: June 21, 2022

Date of Patent: April 1, 2025

Assignee: Advanced Micro Devices, Inc.

Inventors: Shaizeen Aga, Nuwan Jayasena

Retire queue compression

Patent number: 12204911

Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.

Type: Grant

Filed: October 8, 2021

Date of Patent: January 21, 2025

Assignee: Advanced Micro Devices, Inc.

Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
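
The compression idea in the abstract for patent 12204911 can be sketched as below. The abstract does not list the compression conditions, so the `compressible` flag and the limit of two operations per entry are placeholders assumed for illustration, and `RetireQueue` is a hypothetical name.

```python
class RetireQueue:
    """Retire queue in which adjacent compressible ops may share one entry."""

    def __init__(self, num_entries, max_ops_per_entry=2):
        self.num_entries = num_entries
        self.max_ops = max_ops_per_entry
        self.entries = []  # each entry: {"ops": [...], "compressible": bool}

    def dispatch(self, op, compressible):
        """Record a dispatched op; return False if no entry is available."""
        last = self.entries[-1] if self.entries else None
        if (compressible and last is not None and last["compressible"]
                and len(last["ops"]) < self.max_ops):
            last["ops"].append(op)   # share the newest entry
            return True
        if len(self.entries) == self.num_entries:
            return False
        self.entries.append({"ops": [op], "compressible": compressible})
        return True

    def retire(self):
        """Retire the oldest entry and return every op it holds, in order."""
        return self.entries.pop(0)["ops"] if self.entries else []


rq = RetireQueue(num_entries=2)
results = [rq.dispatch(f"op{i}", compressible=True) for i in range(5)]
```

With two entries and two operations per entry, four compressible operations fit before dispatch stalls, instead of two without compression.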

Suppressing allocation of registers for register renaming

Patent number: 12190117

Abstract: Techniques are provided for allocating registers for a processor. The techniques include identifying a first instruction of an instruction dispatch set that meets all register allocation suppression criteria of a first set of register allocation suppression criteria, suppressing register allocation for the first instruction, identifying a second instruction of the instruction dispatch set that does not meet all register allocation suppression criteria of a second set of register allocation suppression criteria, and allocating a register for the second instruction.

Type: Grant

Filed: November 26, 2019

Date of Patent: January 7, 2025

Assignee: Advanced Micro Devices, Inc.

Inventors: Neil N. Marketkar, Arun A. Nair

Microprocessor with non-cacheable memory load prediction

Patent number: 12141580

Abstract: A processor includes an instruction issue unit that receives a first instruction, and issues the first instruction with a write time, which for a load instruction corresponds to a data cache latency time or to a non-cacheable latency time of a non-cacheable predictor. The non-cacheable predictor includes a tag array and data array with a plurality of entries to predict non-cacheable latency times of non-cacheable load instructions. The non-cacheable predictor can be implemented as a direct map, an N-way associative cache, or a fully associative cache.

Type: Grant

Filed: April 20, 2022

Date of Patent: November 12, 2024

Assignee: Simplex Micro, Inc.

Inventors: David Witt, Thang Minh Tran
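
As an illustration of the direct-mapped option mentioned in the abstract for patent 12141580, the Python sketch below keeps a tag array and a data array of predicted latencies. Indexing by the load's program counter, the default data cache latency of 4 cycles, and all names are assumptions, not details taken from the patent.

```python
class NonCacheablePredictor:
    """Direct-mapped predictor: tag array plus data array of latencies."""

    def __init__(self, num_entries=16):
        self.num_entries = num_entries
        self.tags = [None] * num_entries
        self.latency = [0] * num_entries

    def _split(self, pc):
        # Low bits select the entry, the remaining bits form the tag.
        return pc % self.num_entries, pc // self.num_entries

    def predict(self, pc):
        """Return the predicted latency, or None on a tag mismatch."""
        index, tag = self._split(pc)
        return self.latency[index] if self.tags[index] == tag else None

    def update(self, pc, observed_latency):
        index, tag = self._split(pc)
        self.tags[index] = tag
        self.latency[index] = observed_latency


def load_write_time(now, pc, predictor, dcache_latency=4):
    """Write time for a load: predictor hit, otherwise data cache latency."""
    predicted = predictor.predict(pc)
    return now + (predicted if predicted is not None else dcache_latency)


pred = NonCacheablePredictor()
pred.update(pc=0x120, observed_latency=40)
hit_time = load_write_time(now=10, pc=0x120, predictor=pred)
miss_time = load_write_time(now=10, pc=0x124, predictor=pred)
```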

Device, method and system to predict an address collision by a load and a store

Patent number: 12086591

Abstract: Techniques and mechanisms for determining a relative order in which a load instruction and a store instruction are to be executed. In an embodiment, a processor detects an address collision event wherein two instructions, corresponding to different respective instruction pointer values, target the same memory address. Based on the address collision event, the processor identifies respective instruction types of the two instructions as an aliasing instruction type pair. The processor further determines a count of decisions each to forego a reversal of an order of execution of instructions. Each decision represented in the count is based on instructions which are each of a different respective instruction type of the aliasing instruction type pair. In another embodiment, the processor determines, based on the count of decisions, whether a later load instruction is to be advanced in an order of instruction execution.

Type: Grant

Filed: March 26, 2021

Date of Patent: September 10, 2024

Assignee: Intel Corporation

Inventors: Sudhanshu Shukla, Jayesh Gaur, Stanislav Shwartsman, Pavel I. Kryukov

Memory array data structure for posit operations

Patent number: 12079589

Abstract: Systems, apparatuses, and methods related to a memory array data structure for posit operations are described. Universal number (unum) bit strings, such as posit bit string operands and posit bit strings representing results of arithmetic and/or logical operations performed using the posit bit string operands may be stored in a memory array. Circuitry deployed in a memory device may access the memory array to retrieve the unum bit string operands and/or the results of the arithmetic and/or logical operations performed using the unum bit string operands from the memory array. For instance, an arithmetic operation and/or a logical operation may be performed using a first unum bit string stored in the memory array and a second unum bit string stored in the memory array. The result of the arithmetic operation and/or the logical operation may be stored in the memory array and subsequently retrieved.

Type: Grant

Filed: July 7, 2022

Date of Patent: September 3, 2024

Assignee: Micron Technology, Inc.

Inventor: Vijay S. Ramesh

Transformations in fused multiply-add instructions

Patent number: 12061905

Abstract: Transformations in fused multiply-add instructions, including: receiving an extended fused multiply-add (FMA) instruction comprising: a first subset of bits corresponding to each operand of a fused multiply-add (FMA) operation, and a second subset of bits comprising an opcode for the extended FMA instruction, wherein the opcode identifies the extended FMA instruction from an instruction set of a plurality of extended FMA instructions each having a different predefined opcode corresponding to a different combination of transformation-operand groupings; and performing, based on the opcode of the extended FMA instruction, the one or more transformations and the FMA operation.

Type: Grant

Filed: June 20, 2023

Date of Patent: August 13, 2024

Assignee: GHOST AUTONOMY INC.

Inventors: John Hayes, Volkmar Uhlig

Move elimination

Patent number: 12045620

Abstract: A data processing apparatus is provided that comprises rename circuitry for performing a register rename stage of a pipeline in respect of a stream of operations. Move elimination circuitry performs a move elimination operation on the stream of operations in which a move operation is eliminated and the register rename stage performs an adjustment of an identity of registers in the stream of operations to compensate for the move operation being eliminated and demotion circuitry reverses or inhibits the adjustment in response to one or more conditions being met.

Type: Grant

Filed: December 17, 2021

Date of Patent: July 23, 2024

Assignee: Arm Limited

Inventors: Yasuo Ishii, Muhammad Umar Farooq, William Elton Burky, Michael Brian Schinzler, Jason Lee Setter, David Gum Lim

Circuitry and method

Patent number: 11989583

Abstract: Circuitry comprises two or more clusters of execution units, each cluster comprising one or more execution units to execute processing instructions; and scheduler circuitry to maintain one or more queues of processing instructions, the scheduler circuitry comprising picker circuitry to select a queued processing instruction for issue to an execution unit of one of the clusters of execution units for execution; in which: the scheduler circuitry is configured to maintain dependency data associated with each queued processing instruction, the dependency data for a queued processing instruction indicating any source operands which are required to be available for use in execution of that queued processing instruction and to inhibit issue of that queued processing instruction until all of the required source operands for that queued processing instruction are available and is configured to be responsive to an indication to the scheduler circuitry of the availability of the given operand as a source operand for use

Type: Grant

Filed: March 31, 2021

Date of Patent: May 21, 2024

Assignee: Arm Limited

Inventors: Chris Abernathy, Eric Charles Quinnell, Abhishek Raja, Michael David Achenbach

Method and device for executing instructions to perform artificial intelligence

Patent number: 11989560

Abstract: The present disclosure provides an instruction execution method, device, and electronic equipment. In the instruction execution method described above, after obtaining an exceptional signal generated by a neural network processor during an operation, the electronic equipment determines an exception processing instruction corresponding to the exceptional signal according to the exceptional signal, then it determines a first instruction queue needed to be executed by the neural network processor, and then it generates a second instruction queue based on the exception processing instruction and the first instruction queue, and finally it controls the neural network processor to execute the second instruction queue, so that errors encountered by the neural network processor can be timely processed, thereby shortening the error processing delay and improving the data processing efficiency of the hardware system in the electronic equipment.

Type: Grant

Filed: April 7, 2021

Date of Patent: May 21, 2024

Assignee: BEIJING HORIZON ROBOTICS TECHNOLOGY RESEARCH AND DEVELOPMENT CO., LTD.

Inventors: Yitong Zhao, Xing Wei

Near-memory determination of registers

Patent number: 11966328

Abstract: A memory module includes register selection logic to select alternate local source and/or destination registers to process PIM commands. The register selection logic uses an address-based register selection approach to select an alternate local source and/or destination register based upon address data specified by a PIM command and a split address maintained by a memory module. The register selection logic may alternatively use a register data-based approach to select an alternate local source and/or destination register based upon data stored in one or more local registers. A PIM-enabled memory module configured with the register selection logic described herein is capable of selecting an alternate local source and/or destination register to process PIM commands at or near the PIM execution unit where the PIM commands are executed.

Type: Grant

Filed: December 18, 2020

Date of Patent: April 23, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Onur Kayiran, Mohamed Assem Ibrahim, Shaizeen Aga

Permutation instruction

Patent number: 11900111

Abstract: A device includes a vector register file, a memory, and a processor. The vector register file includes a plurality of vector registers. The memory is configured to store a permutation instruction. The processor is configured to access a periodicity parameter of the permutation instruction. The periodicity parameter indicates a count of a plurality of data sources that contain source data for the permutation instruction. The processor is also configured to execute the permutation instruction to, for each particular element of multiple elements of a first permutation result register of the plurality of vector registers, select a data source of the plurality of data sources based at least in part on the count of the plurality of data sources and populate the particular element based on a value in a corresponding element of the selected data source.

Type: Grant

Filed: September 24, 2021

Date of Patent: February 13, 2024

Assignee: QUALCOMM Incorporated

Inventors: Srijesh Sudarsanan, Deepak Mathew, Marc Hoffman, Gerald Sweeney, Sundar Rajan Balasubramanian, Hongfeng Dong, Yurong Sun, Seyedmehdi Sadeghzadeh

Methods and systems for inter-pipeline data hazard avoidance

Patent number: 11900122

Abstract: Methods and parallel processing units for avoiding inter-pipeline data hazards identified at compile time. For each identified inter-pipeline data hazard, the primary instruction and secondary instruction(s) thereof are identified as such and are linked by a counter which is used to track that inter-pipeline data hazard. When a primary instruction is output by the instruction decoder for execution, the value of the counter associated therewith is adjusted to indicate that there is a hazard related to the primary instruction, and when the primary instruction has been resolved by one of multiple parallel processing pipelines, the value of the counter associated therewith is adjusted to indicate that the hazard related to the primary instruction has been resolved.

Type: Grant

Filed: July 10, 2023

Date of Patent: February 13, 2024

Assignee: Imagination Technologies Limited

Inventors: Luca Iuliano, Simon Nield, Yoong-Chert Foo, Ollie Mower
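
The counter mechanism in the abstract for patent 11900122 can be sketched in a few lines of Python. The abstract says only that the counter is "adjusted"; incrementing on issue, decrementing on resolution, and letting a secondary instruction proceed at zero are assumptions, as are the names used here.

```python
class HazardCounters:
    """One counter per compile-time hazard, shared by primary and secondaries."""

    def __init__(self, num_counters):
        self.counters = [0] * num_counters

    def issue_primary(self, counter_id):
        # Primary instruction leaves the decoder: the hazard is now live.
        self.counters[counter_id] += 1

    def resolve_primary(self, counter_id):
        # A pipeline has resolved the primary instruction.
        self.counters[counter_id] -= 1

    def secondary_may_issue(self, counter_id):
        # A secondary instruction waits while any linked primary is in flight.
        return self.counters[counter_id] == 0


hc = HazardCounters(num_counters=4)
hc.issue_primary(2)
blocked = not hc.secondary_may_issue(2)
hc.resolve_primary(2)
released = hc.secondary_may_issue(2)
```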

Arithmetic processing device and semiconductor device with improved instruction retry

Patent number: 11842192

Abstract: An arithmetic processing device includes a memory and a processor coupled to the memory, the processor configured to: execute arithmetic processing which executes a plurality of instructions issued out of order, execute control processing which commits the plurality of instructions for which execution has been completed in order, identify, for each instruction included in the plurality of instructions, a count value which indicates a number of cycles from when execution of the instruction has been completed, and identify, among a plurality of uncommitted instructions, the instruction with the count value which matches the number of cycles in which an error is detected in the arithmetic processing as a specific instruction to be retried.

Type: Grant

Filed: March 16, 2021

Date of Patent: December 12, 2023

Assignee: FUJITSU LIMITED

Inventors: Yuhei Takata, Shiro Kamoshida
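
To illustrate the count-matching step in the abstract for patent 11842192, the Python sketch below tracks cycles since completion for each uncommitted instruction and picks the one whose count equals the error detection latency. `RetryTracker`, its methods, and the fixed latency of two cycles in the usage lines are hypothetical.

```python
class RetryTracker:
    """Tracks cycles since completion for each uncommitted instruction."""

    def __init__(self, error_detect_cycles):
        # Cycles between an instruction completing and its error being flagged.
        self.error_detect_cycles = error_detect_cycles
        self.counts = {}

    def complete(self, insn):
        self.counts[insn] = 0

    def tick(self):
        for insn in self.counts:
            self.counts[insn] += 1

    def commit(self, insn):
        self.counts.pop(insn, None)

    def instruction_to_retry(self):
        """On an error signal, return the instruction whose count matches."""
        for insn, count in self.counts.items():
            if count == self.error_detect_cycles:
                return insn
        return None


tracker = RetryTracker(error_detect_cycles=2)
tracker.complete("a")
tracker.tick()
tracker.complete("b")
tracker.tick()          # "a" completed 2 cycles ago, "b" 1 cycle ago
culprit = tracker.instruction_to_retry()
```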

Processor graph execution using interrupt conservation

Patent number: 11836518

Abstract: Techniques for data manipulation using processor graph execution using interrupt conservation are disclosed. Processing elements are configured to implement a data flow graph. The processing elements comprise a multilayer graph execution engine. A data engine is loaded with computational parameters for the multilayer graph execution engine. The data engine is coupled to the multilayer graph execution engine, and the computational parameters supply layer-by-layer execution data to the multilayer graph execution engine for data flow graph execution. A first command FIFO is used for loading the data engine with computational parameters, and a second command FIFO is used for loading the multilayer graph execution engine with layer definition data. An input image is provided for a first layer of the multilayer graph execution engine. The data flow graph is executed using the input image and the computational parameters.

Type: Grant

Filed: December 10, 2021

Date of Patent: December 5, 2023

Inventor: David John Simpson

Register scoreboard for a microprocessor with a time counter for statically dispatching instructions

Patent number: 11829767

Abstract: A processor includes a time counter and a register scoreboard and operates to statically dispatch instructions with preset execution times based on a write time of a register in the register scoreboard and a time count of the time counter provided to an execution pipeline.

Type: Grant

Filed: February 15, 2022

Date of Patent: November 28, 2023

Assignee: Simplex Micro, Inc.

Inventor: Thang Minh Tran
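
One way to picture the time-based scoreboard in the abstract for patent 11829767 is the Python sketch below: each register records the time count at which it will be written, and an instruction's execution time is preset at issue from those write times. The class `TimeScoreboard`, the unbounded counter, and the fixed latencies are assumptions; real hardware would use a wrapping counter of limited width.

```python
class TimeScoreboard:
    """Register scoreboard holding a future write time per register."""

    def __init__(self):
        self.time = 0          # free-running time counter
        self.write_time = {}   # register -> time count when its value is written

    def tick(self, cycles=1):
        self.time += cycles

    def issue(self, dest, srcs, latency):
        """Preset the execution time of an instruction at issue.

        The instruction starts once every source has been written, and its
        own destination write time is recorded for later consumers.
        """
        start = max([self.time] + [self.write_time.get(s, 0) for s in srcs])
        self.write_time[dest] = start + latency
        return start


sb = TimeScoreboard()
t_load = sb.issue("r1", [], latency=3)      # starts now, writes r1 at time 3
t_add = sb.issue("r2", ["r1"], latency=1)   # must wait for r1
```

Because the start time is fixed at issue, no wakeup broadcast is needed in this model; the execution pipeline simply compares the preset time with the time counter.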

Techniques to manage execution of divergent shaders

Patent number: 11776195

Abstract: Examples are described here that can be used to enable a main routine to request subroutines or other related code to be executed with other instantiations of the same subroutine or other related code for parallel execution. A sorting unit can be used to accumulate requests to execute instantiations of the subroutine. The sorting unit can request execution of a number of multiple instantiations of the subroutine corresponding to a number of lanes in a SIMD unit. A call stack can be used to share information to be accessed by a main routine after execution of the subroutine completes.

Type: Grant

Filed: August 31, 2021

Date of Patent: October 3, 2023

Assignee: Intel Corporation

Inventors: John G. Gierach, Karthik Vaidyanathan, Thomas F. Raoux

Microprocessor that fuses load and compare instructions

Patent number: 11748104

Abstract: Technology for fusing certain load instructions and compare-immediate instructions in a computer processor having a load-store architecture with respect to transferring data between memory and registers of the computer processor. In some embodiments the load and compare-immediate instructions are consecutive. In some embodiments, the instructions are only merged if: (i) the respective RA and RT fields of the two instructions match; (ii) the immediate field of the compare-immediate instruction has a certain value, or falls within a range of certain values; and/or (iii) the instructions are received in a consecutive manner.

Type: Grant

Filed: July 29, 2020

Date of Patent: September 5, 2023

Assignee: International Business Machines Corporation

Inventors: Bryan Lloyd, David A. Hrusecky, Sundeep Chadha, Dung Q. Nguyen, Christian Gerhard Zoellin, Brian W. Thompto, Sheldon Bernard Levenstein, Phillip G. Williams
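
The fusion conditions listed in the abstract for patent 11748104 can be approximated with the Python sketch below. Several points are assumptions: that the register match means the load's target (RT) equals the compare's source (RA), that the permitted immediates are 0 and 1, and the names `Insn`, `can_fuse`, and `fuse`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Insn:
    op: str
    rt: Optional[int] = None   # target register
    ra: Optional[int] = None   # source / base register
    imm: int = 0


FUSIBLE_IMMEDIATES = range(0, 2)   # hypothetical permitted immediate values


def can_fuse(first, second):
    """Check a consecutive pair against the assumed fusion conditions."""
    return (first.op == "load" and second.op == "cmpi"
            and first.rt == second.ra
            and second.imm in FUSIBLE_IMMEDIATES)


def fuse(stream):
    """Merge each qualifying consecutive load / compare-immediate pair."""
    out, i = [], 0
    while i < len(stream):
        if i + 1 < len(stream) and can_fuse(stream[i], stream[i + 1]):
            out.append(("load+cmpi", stream[i], stream[i + 1]))
            i += 2
        else:
            out.append(stream[i])
            i += 1
    return out


prog = [
    Insn("load", rt=3, ra=1),
    Insn("cmpi", ra=3, imm=0),   # compares the loaded register: fusible
    Insn("load", rt=4, ra=1),
    Insn("cmpi", ra=5, imm=0),   # different register: left alone
]
fused = fuse(prog)
```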

Instruction storage

Patent number: 11727530

Abstract: Techniques are disclosed relating to low-level instruction storage in a processing unit. In some embodiments, a graphics unit includes execution circuitry, decode circuitry, hazard circuitry, and caching circuitry. In some embodiments the execution circuitry is configured to execute clauses of graphics instructions. In some embodiments, the decode circuitry is configured to receive graphics instructions and a clause identifier for each received graphics instruction and to decode the received graphics instructions. In some embodiments, the caching circuitry includes a plurality of entries each configured to store a set of decoded instructions in the same clause. A given clause may be fetched and executed multiple times, e.g., for different SIMD groups, while stored in the caching circuitry.

Type: Grant

Filed: May 28, 2021

Date of Patent: August 15, 2023

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Dzung Q. Vu, Liang Kai Wang

Out-of-order block-based processors and instruction schedulers using ready state data indexed by instruction position identifiers

Patent number: 11687345

Abstract: Apparatus and methods are disclosed for implementing block-based processors including field programmable gate-array implementations. In one example of the disclosed technology, a block-based processor includes an instruction decoder configured to generate decoded ready dependencies for a transactional block of instructions, where each of the instructions is associated with a different instruction identifier encoded in the transactional block. The processor further includes an instruction scheduler configured to issue an instruction from a set of instructions of the transactional block of instructions. The instruction is issued based on determining that decoded ready state dependencies for an instruction are satisfied. The determining includes accessing storage with the decoded ready dependencies indexed with a respective instruction identifier that is encoded in the transactional block of instructions.

Type: Grant

Filed: July 29, 2016

Date of Patent: June 27, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Aaron L. Smith, Jan S. Gray

Microprocessor and method for speculatively issuing load/store instruction with non-deterministic access time using scoreboard

Patent number: 11687347

Abstract: A microprocessor and a method for issuing a load/store instruction is introduced. The microprocessor includes a decode/issue unit, a load/store queue, a scoreboard, and a load/store unit. The scoreboard includes a plurality of scoreboard entries, in which each scoreboard entry includes an unknown bit value and a count value, wherein the unknown bit value or the count value is set when instructions are issued. The decode/issue unit checks for WAR, WAW, and RAW data dependencies from the scoreboard and dispatches load/store instructions to the load/store queue with the recorded scoreboard values. The load/store queue is configured to resolve the data dependencies and dispatch the load/store instructions to the load/store unit for execution.

Type: Grant

Filed: May 25, 2021

Date of Patent: June 27, 2023

Assignee: ANDES TECHNOLOGY CORPORATION

Inventor: Thang Minh Tran

Methods and systems for utilizing a master-shadow physical register file based on verified activation

Patent number: 11599359

Abstract: A processor in a data processing system includes a master-shadow physical register file and a renaming unit. The master-shadow physical register file has a master storage coupled to shadow storage. The renaming unit is coupled to the master-shadow physical register file. Based on an occurrence of shadow transfer activation conditions verified by the renaming unit, data in the master storage is transferred from the master storage to the shadow storage for storage. Data is transferred from the shadow storage back to the master storage based on the occurrence of a shadow-to-master transfer event, which includes, for example, a flush of the master storage by the processor.

Type: Grant

Filed: May 18, 2020

Date of Patent: March 7, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Arun A. Nair, Ashok T. Venkatachar, Emil Talpes, Srikanth Arekapudi, Rajesh Kumar Arunachalam

Processor, device, and method for executing instructions

Patent number: 11593115

Abstract: The present disclosure discloses an instruction execution device, a processor including the instruction execution device, a system on chip, and a method for executing a data storage instruction in the processor. The method includes: splitting the data storage instruction into a first split instruction and a second split instruction, wherein the first split instruction is associated with an address operand of the data storage instruction, and the second split instruction is associated with a data operand of the data storage instruction; executing the first split instruction to determine a data storage address corresponding to the address operand; executing the second split instruction to acquire data content corresponding to the data operand; and storing the acquired data content to the determined data storage address in a data storage region. The present disclosure further discloses a corresponding instruction execution device, a processor including the execution device and a system on chip.

Type: Grant

Filed: March 20, 2020

Date of Patent: February 28, 2023

Assignee: Alibaba Group Holding Limited

Inventors: Yimin Lu, Xiaoyan Xiang

System and method for instruction unwinding in an out-of-order processor

Patent number: 11593116

Abstract: A system and corresponding method unwind instructions in an out-of-order (OoO) processor. The system comprises a mapper. In response to a restart event causing at least one instruction to be unwound, the mapper restores a present integer mapper state and present floating-point (FP) mapper state, used for mapping instructions, to a former integer mapper state and former FP mapper state, respectively. The mapper stores integer snapshots and FP snapshots of the present integer and FP mapper state, respectively, to expedite restoration to the former integer and FP mapper state, respectively. Access to the FP snapshots is blocked, intermittently, as a function of at least one FP present indicator used by the mapper to record presence of FP registers used as destinations in the instructions. Blocking the access, intermittently, improves power efficiency of the OoO processor.

Type: Grant

Filed: April 30, 2021

Date of Patent: February 28, 2023

Assignee: Marvell Asia Pte, Ltd.

Inventor: David A. Carlson

Evicting and restoring information using a single port of a logical register mapper and history buffer in a microprocessor comprising multiple main register file entries mapped to one accumulator register file entry

Patent number: 11561794

Abstract: A computer system, processor, programming instructions and/or method of processing data that includes a main register file having a plurality of entries for storing data; an accumulator register file having a plurality of entries for storing data wherein multiple main register file entries are mapped to one accumulator register file entry in the at least one accumulator register file; a logical register mapper to track and map logical registers to main register file entries, and a history buffer. Processing wide data width instructions includes evicting and restoring information from a single primary entry in the logical register mapper through a single read or write port in the logical register mapper without evicting or restoring the remaining other multiple main register file entries mapped in the accumulator register.

Type: Grant

Filed: May 26, 2021

Date of Patent: January 24, 2023

Assignee: International Business Machines Corporation

Inventors: Steven J. Battle, Brian W. Thompto, Dung Q. Nguyen, Cliff Kucharski, Susan E. Eisen, Salma Ayub

Data processing

Patent number: 11531547

Abstract: Data processing circuitry comprises out-of-order instruction execution circuitry; register mapping circuitry to map zero or more architectural processor registers relating to execution of that program instruction to respective ones of a set of physical processor registers; commit circuitry to commit, in a program code order, the results of executed program instructions, the commit circuitry being configured to access a data store which stores register tag data to indicate which physical registers mapped by the register mapping circuitry relate to a given program instruction; fault detection circuitry to detect a memory access fault in respect of a vector memory access operation and to generate fault indication data indicative of an element earliest in the element order for which a memory access fault was detected; a fault indication register to store the fault indication data, in which the register mapping circuitry is configured to generate a register mapping for a program instruction for any architectural p

Type: Grant

Filed: May 21, 2021

Date of Patent: December 20, 2022

Assignee: Arm Limited

Inventors: Damian Maiorano, Luca Nassi, Cédric Denis Robert Airaud, Christophe Laurent Carbonne, Jocelyn François Orion Jaubert, Pasquale Ranone

System and method for generating a binary patch file for live patching of an application

Patent number: 11507362

Abstract: A system and method for generating a binary patch file for live patching of an application are disclosed. In one exemplary aspect, the method comprises creating a shared object by compiling a source code patch file that contains source code of a new function corresponding to an old function, a global external symbol referenced in the source code of the new function, and at least one link to a symbol in an application binary code corresponding to the global external symbol, wherein the shared object contains binary code of the new function for replacing the old function during the live patching, and the result of a compilation of the link; generating metadata usable to facilitate the live patching; creating bindings between calculated relative addresses and the global external symbol referenced by the shared object; and creating the binary patch file by adding the metadata to the shared object.

Type: Grant

Filed: October 5, 2020

Date of Patent: November 22, 2022

Assignee: Virtuozzo International GmbH

Inventors: Stanislav Kinsburskiy, Alexey Kobets, Eugene Kolomeetz

Circuitry and method for controlling a generated association of a physical register with a predicated processing operation based on predicate data state

Patent number: 11494190

Abstract: Instruction decoder circuitry decodes processing instructions each generating an output multi-bit data item in a destination architectural register by applying a processing operation to source data item(s) in respective source architectural register(s). The decoder circuitry detects whether an instruction defines a predicated merge operation that propagates a set of zero or more portions of the prevailing contents of the destination architectural register as respective portions of the output multi-bit data item. The portions are defined by predicate data. Register allocation circuitry associates physical registers with the destination architectural register and the source architectural register(s). When detector circuitry detects that an instruction defines a predicated merge operation, the register allocation circuitry associates a further physical register with that instruction to store a copy of the prevailing contents.

Type: Grant

Filed: March 31, 2021

Date of Patent: November 8, 2022

Assignee: Arm Limited

Inventors: Zachary Allen Kingsbury, Kurt Matthew Fellows, Thomas Gilles Tarridec
Booting an application from multiple memories

Patent number: 11481315

Abstract: A method includes using a memory address map, locating a first portion of an application in a first memory and loading a second portion of the application from a second memory. The method includes executing in place from the first memory the first portion of the application, during a first period, and by completion of the loading of the second portion of the application from the second memory. The method further includes executing the second portion of the application during a second period, wherein the first period precedes the second period.

Type: Grant

Filed: September 4, 2020

Date of Patent: October 25, 2022

Assignee: INFINEON TECHNOLOGIES LLC

Inventors: Stephan Rosner, Qamrul Hasan, Venkat Natarajan
Microthreading for accelerated deep learning

Patent number: 11475282

Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of compute elements and routers performs flow-based computations on wavelets of data. Some instructions are performed in iterations, such as one iteration per element of a fabric vector or FIFO. When sources for an iteration of an instruction are unavailable, and/or there is insufficient space to store results of the iteration, indicators associated with operands of the instruction are checked to determine whether other work can be performed. In some scenarios, other work cannot be performed and processing stalls. Alternatively, information about the instruction is saved, the other work is performed, and sometime after the sources become available and/or sufficient space to store the results becomes available, the iteration is performed using the saved information.

Type: Grant

Filed: April 17, 2018

Date of Patent: October 18, 2022

Assignee: Cerebras Systems Inc.

Inventors: Sean Lie, Michael Morrison, Michael Edwin James, Gary R. Lauterbach, Srikanth Arekapudi
Generating tie code fragments for binary translation

Patent number: 11455156

Abstract: Systems and methods for binary translation of executable code.

Type: Grant

Filed: May 11, 2020

Date of Patent: September 27, 2022

Assignee: Parallels International GMBH

Inventors: Alexey Koryakin, Nikolay Dobrovolskiy, Serguei M. Beloussov
Electronic device and method for data processing using virtual register mode

Patent number: 11449341

Abstract: The invention relates to an electronic device for data processing, which includes an execution unit with a temporary register, a register file, a first feedback path from the data output of the execution unit to the register file, a second feedback path from the data output of the execution unit to the temporary register, a switch configured to connect the first feedback path and/or the second feedback path, and a logic stage coupled to control the switch. The logic stage is configured to control the switch to connect the second feedback path if the data output of the execution unit is used as an operand in the subsequent operation of the execution unit.

Type: Grant

Filed: September 9, 2019

Date of Patent: September 20, 2022

Assignee: Texas Instruments Incorporated

Inventors: Marko Krüger, Steven Bartling, Markus Kösler
Speculative execution of correlated memory access instruction methods, apparatuses and systems

Patent number: 11429391

Abstract: A processor core, a processor, an apparatus, and an instruction processing method are disclosed. The processor core includes: an instruction fetch unit, where the instruction fetch unit includes a speculative execution predictor and the speculative execution predictor compares a program counter of a memory access instruction with a table entry stored in the speculative execution predictor and marks the memory access instruction; a scheduler unit adapted to adjust a send order of marked memory access instructions and send the marked memory access instructions according to the send order; an execution unit adapted to execute the memory access instructions according to the send order. In the instruction fetch unit, a memory access instruction is marked according to a speculative execution prediction result. In the scheduler unit, a send order of memory access instructions is determined according to the marked memory access instruction and the memory access instructions are sent.

Type: Grant

Filed: August 27, 2020

Date of Patent: August 30, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Dongqi Liu, Chang Liu, Yimin Lu, Tao Jiang, Chaojun Zhao
Program instruction fusion

Patent number: 11416252

Abstract: A data processing system includes an instruction pipeline containing instruction queue circuitry, fusion circuitry and decoder circuitry. The fusion circuitry serves to identify fusible groups of program instructions within a Y-wide window of program instructions and supply a stream of program instructions including such replacement fused program instructions to X-wide decoder circuitry which decodes X program instructions in parallel using parallel decoders.

Type: Grant

Filed: December 27, 2017

Date of Patent: August 16, 2022

Assignee: Arm Limited

Inventors: Vladimir Vasekin, Chiloda Ashan Senarath Pathirane, Jungsoo Kim, Alexei Fedorov
Zero cycle load bypass in a decode group

Patent number: 11416254

Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.

Type: Grant

Filed: December 5, 2019

Date of Patent: August 16, 2022

Assignee: Apple Inc.

Inventors: Deepankar Duggal, Kulin N. Kothari, Conrado Blasco, Muawya M. Al-Otoom
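
The renaming step described in the zero cycle load bypass abstract above can be sketched roughly as follows. This is a minimal model under stated assumptions, not the patented implementation: the function `rename_group`, the tuple encoding of operations, and the use of a plain address string for the "same address" check are all hypothetical.

```python
from collections import deque

def rename_group(ops, map_table, free_list):
    """Rename one decode group (hypothetical model).

    ops: list of ("store", src_arch_reg, addr) or ("load", dst_arch_reg, addr).
    map_table: dict mapping architectural to physical registers (mutated).
    free_list: deque of available physical registers (mutated).
    Returns a list of (kind, physical_reg, bypassed) tuples.
    """
    renamed = []
    stores_in_group = {}  # addr -> physical register holding the store data
    for kind, reg, addr in ops:
        if kind == "store":
            preg = map_table[reg]
            stores_in_group[addr] = preg  # the youngest older store wins
            renamed.append((kind, preg, False))
        elif kind == "load":
            if addr in stores_in_group:
                # Qualifying load: no free-list lookup, reuse the physical
                # register that is the source operand of the older store.
                preg = stores_in_group[addr]
                bypassed = True
            else:
                preg = free_list.popleft()
                bypassed = False
            map_table[reg] = preg
            renamed.append((kind, preg, bypassed))
        else:
            raise ValueError(f"unknown operation kind: {kind}")
    return renamed
```

In this sketch a bypassed load consumes no entry from the free list, which is the effect the abstract attributes to preventing the free-list lookup.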
Protection against timing-based security attacks by randomly adjusting reorder buffer capacity

Patent number: 11403107

Abstract: Methods, systems, and apparatuses related to re-order buffers and for protection from timing-based security attacks are described. A processor may have functional units configured to execute instructions out of order, a re-order buffer configured to buffer the execution results of instructions for output in order, and a controller configured to randomize data timing in the re-order buffer. For example, the controller can make random adjustments to the capacity of the re-order buffer in buffering and/or sorting execution results and thus randomize data timing in the re-order buffer.

Type: Grant

Filed: December 5, 2018

Date of Patent: August 2, 2022

Assignee: Micron Technology, Inc.

Inventor: Steven Jeffrey Wallach
Microprocessor with multi-step ahead branch predictor and having a fetch-target queue between the branch predictor and instruction cache

Patent number: 11403103

Abstract: A microprocessor is shown, in which a branch predictor and an instruction cache are decoupled by a fetch-target queue (FTQ). The branch predictor performs branch prediction for N instruction addresses in parallel in the same cycle, wherein N is an integer greater than 1. In the current cycle, the branch predictor finishes branch prediction for N instruction addresses in parallel and, among the N instruction addresses with finished branch prediction, those that are not bypassed and do not overlap previously-predicted instruction addresses are pushed into the fetch-target queue, to be read out later as an instruction-fetching address for the instruction cache. The previously-predicted instruction addresses are pushed into the fetch-target queue in a previous cycle.

Type: Grant

Filed: October 13, 2020

Date of Patent: August 2, 2022

Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.

Inventors: Fangong Gong, Mengchen Yang
Method and apparatus for instruction prefetching with alternating buffers and sequential instruction address matching

Patent number: 11327762

Abstract: An instruction prefetching method, a device and a medium are provided. The method includes the following: instructions in a target buffer are precompiled before a processor core fetches a required instruction from the target buffer corresponding to the processor core; if it is determined that a jump instruction exists in the target buffer and a jump target instruction corresponding to the jump instruction is not cached in the target buffer according to a precompiled result, the jump target instruction is prefetched from an icache into a candidate buffer corresponding to the processor core to wait for the processor core to fetch the jump target instruction from the candidate buffer; the target buffer and the candidate buffer are alternately reused during instruction prefetching.

Type: Grant

Filed: September 29, 2020

Date of Patent: May 10, 2022

Assignee: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED

Inventors: Chao Tang, Xueliang Du, Yingnan Xu
Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor

Patent number: 11327763

Abstract: Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor is disclosed. A processor provides producer instructions and consumer instructions to a steering circuit that steers the program instructions to clusters of instruction execution circuits. An input value provided to a consumer instruction may be a produced value of a producer instruction, creating a dependency. The steering circuit steers a producer instruction to a first cluster and, in response to receiving the consumer instruction and the predicted value of the producer instruction, provides the predicted value to at least a second cluster and steers the consumer instruction to the second cluster for execution with the predicted value as the input value.

Type: Grant

Filed: June 11, 2020

Date of Patent: May 10, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Arthur Perais, Shivam Priyadarshi, Yusuf Cagatay Tekmen, Rami Mohammad Al Sheikh, Vignyan Reddy Kothinti Naresh
Compiler assisted register file write reduction

Patent number: 11321799

Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as single write instead of multiple separate partial writes.

Type: Grant

Filed: December 24, 2019

Date of Patent: May 3, 2022

Assignee: Intel Corporation

Inventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
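
The combining of grouped partial writes described in the abstract above can be illustrated with a small sketch. It assumes a register modeled as a list of lanes and partial writes given as `(offset, values)` pairs; the names `combine_partial_writes` and `apply_write` are hypothetical and do not come from the patent.

```python
def combine_partial_writes(width, partial_writes):
    """Merge a group of partial writes into one (mask, data) pair.

    width: number of lanes in the destination register.
    partial_writes: list of (offset, values) pairs in program order.
    """
    mask = [False] * width
    data = [0] * width
    for offset, values in partial_writes:
        for i, value in enumerate(values):
            mask[offset + i] = True
            data[offset + i] = value  # a later write in the group overrides an earlier one
    return mask, data

def apply_write(register, mask, data):
    """Perform a single masked write; unmasked lanes keep their old contents."""
    return [d if m else r for r, m, d in zip(register, mask, data)]
```

For example, two partial writes covering lanes 0 to 1 and lane 2 of a four-lane register become one masked write that leaves lane 3 untouched.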
Duplicate detection for register renaming

Patent number: 11294683

Abstract: Systems and methods are disclosed for duplicate detection for register renaming. For example, a method includes checking a map table for duplicates of a first physical register, wherein the map table stores entries that each map an architectural register of an instruction set architecture to a physical register of a microarchitecture and a duplicate is two or more architectural registers that are mapped to a same physical register; and, responsive to a duplicate of the first physical register in the map table, preventing the first physical register from being added to a free list upon retirement of an instruction that renames an architectural register that was previously mapped to the first physical register to a different physical register, wherein the free list stores entries that indicate which physical registers are available for renaming.

Type: Grant

Filed: April 17, 2020

Date of Patent: April 5, 2022

Assignee: SiFive, Inc.

Inventor: Joshua Smith
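
The duplicate check described in the abstract above can be sketched as follows. This is an illustrative model only: the helper names `has_duplicate`, `rename_dest`, and `retire`, and the choice to evaluate the check at rename time before the mapping is overwritten, are assumptions rather than details from the patent.

```python
from collections import deque

def has_duplicate(map_table, preg):
    """True if two or more architectural registers map to `preg`."""
    return sum(1 for p in map_table.values() if p == preg) >= 2

def rename_dest(map_table, free_list, arch_reg):
    """Rename `arch_reg` to a fresh physical register.

    Returns (old_preg, new_preg, keep_old), where keep_old records whether
    old_preg was still shared with another architectural register.
    """
    old = map_table[arch_reg]
    keep_old = has_duplicate(map_table, old)  # checked before the overwrite
    new = free_list.popleft()
    map_table[arch_reg] = new
    return old, new, keep_old

def retire(free_list, old_preg, keep_old):
    """On retirement, free the old register only if it was not a duplicate."""
    if not keep_old:
        free_list.append(old_preg)
```

With `x1` and `x2` both mapped to `p1`, renaming `x1` keeps `p1` off the free list; only when `x2` is later renamed and retired does `p1` become available again.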
Extending fused multiply-add instructions

Patent number: 11269631

Abstract: Extending fused multiply-add instructions, the method comprising: receiving an extended fused multiply-add (FMA) instruction indicating one or more operands of a fused multiply-add (FMA) operation and one or more transformations to be applied to the one or more operands; and performing, based on the extended FMA instruction, the one or more transformations and the FMA operation.

Type: Grant

Filed: July 29, 2020

Date of Patent: March 8, 2022

Assignee: GHOST LOCOMOTION INC.

Inventors: John Hayes, Volkmar Uhlig
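
The idea of applying operand transformations before the fused multiply-add can be sketched as below. The abstract does not say which transformations are supported, so the `neg` and `abs` entries, the `TRANSFORMS` table, and the `extended_fma` signature are hypothetical. The single rounding of a fused operation is emulated here with exact rational arithmetic, which is valid for finite inputs.

```python
from fractions import Fraction

# Hypothetical operand transformations selectable per operand.
TRANSFORMS = {
    "none": lambda x: x,
    "neg": lambda x: -x,
    "abs": abs,
}

def extended_fma(a, b, c, ta="none", tb="none", tc="none"):
    """Apply the selected transformations, then compute a * b + c.

    The product and sum are formed exactly as Fractions and rounded to a
    float once, mimicking the single rounding of a fused multiply-add.
    """
    a, b, c = TRANSFORMS[ta](a), TRANSFORMS[tb](b), TRANSFORMS[tc](c)
    return float(Fraction(a) * Fraction(b) + Fraction(c))
```

Selecting `tc="neg"`, for instance, turns the operation into a fused multiply-subtract without needing a separate instruction in this model.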
Processor with variable pre-fetch threshold

Patent number: 11231933

Abstract: A method and apparatus for controlling pre-fetching in a processor. A processor includes an execution pipeline and an instruction pre-fetch unit. The execution pipeline is configured to execute instructions. The instruction pre-fetch unit is coupled to the execution pipeline. The instruction pre-fetch unit includes instruction storage to store pre-fetched instructions, and pre-fetch control logic. The pre-fetch control logic is configured to fetch instructions from memory and store the fetched instructions in the instruction storage. The pre-fetch control logic is also configured to provide instructions stored in the instruction storage to the execution pipeline for execution. The pre-fetch control logic is further configured to set a maximum number of instruction words to be pre-fetched for execution subsequent to execution of an instruction currently being executed in the execution pipeline.

Type: Grant

Filed: April 9, 2020

Date of Patent: January 25, 2022

Assignee: Texas Instruments Incorporated

Inventors: Christian Wiencke, Johann Zipperer
Arithmetic processing device and control method implemented by arithmetic processing device

Patent number: 11210101

Abstract: An arithmetic processing device includes: a decoding circuit configured to decode a command; a command execution circuit configured to execute the command decoded by the decoding circuit; a register circuit configured to include a plurality of registers for holding data used by the command execution circuit; an identification information holding circuit configured to store identification information for identifying a register for writing a specific value when the command is a register writing command; a setting circuit configured to hold the specific value; and an operation control circuit configured to execute inhibiting processing when the command is a register reading command, the inhibiting processing including inhibiting an access of the register by the register reading command and selecting the specific value held in the setting circuit.

Type: Grant

Filed: August 22, 2019

Date of Patent: December 28, 2021

Assignee: FUJITSU LIMITED

Inventors: Ryohei Okazaki, Sota Sakashita, Atushi Fusejima
Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines

Patent number: 11204769

Abstract: A global front end scheduler to schedule instruction sequences to a plurality of virtual cores implemented via a plurality of partitionable engines. The global front end scheduler includes a thread allocation array to store a set of allocation thread pointers to point to a set of buckets in a bucket buffer in which execution blocks for respective threads are placed, a bucket buffer to provide a matrix of buckets, the bucket buffer including storage for the execution blocks, and a bucket retirement array to store a set of retirement thread pointers that track a next execution block to retire for a thread.

Type: Grant

Filed: January 2, 2020

Date of Patent: December 21, 2021

Assignee: Intel Corporation

Inventor: Mohammad Abdallah
Processing unit and operating method therefor

Patent number: 11163562

Abstract: A processing unit, in particular, a microcontroller for a control unit, which includes at least one processing core, one primary memory device, and at least one main connection unit for connecting the at least one processing core to the primary memory device, the processing unit including at least two functional units, the processing unit including at least one functional unit designed as a data flow control unit, which is designed to receive input data, to evaluate the input data, and to generate output data as a function of the evaluation.

Type: Grant

Filed: October 9, 2018

Date of Patent: November 2, 2021

Assignee: Robert Bosch GmbH

Inventor: Nico Bannow
Vector compare and store instruction that stores index values to memory

Patent number: 11163564

Abstract: The present disclosure is directed to methods to generate a packed result array from an input array and a comparison operation using parallel vector processing. In one aspect, an additive scan operation can be used to generate memory offsets for each successful comparison operation of the input array and to generate a count of the number of data elements satisfying the comparison operation. In another aspect, the input array can be segmented to allow more efficient processing using the vector registers. In another aspect, a vector processing system is disclosed that is operable to receive a data array, a comparison operation, and threshold criteria, and output a packed array, at a specified memory address, comprising the data elements satisfying the comparison operation.

Type: Grant

Filed: October 8, 2018

Date of Patent: November 2, 2021

Assignees: VeriSilicon Microelectronics (Shanghai) Co., Ltd., VeriSilicon Holdings Co., Ltd.

Inventors: Charles H. Stewart, Charles R. Bezet
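
The additive-scan approach in the abstract above can be illustrated with a scalar sketch. It assumes an inclusive prefix sum over comparison flags, computed here with `itertools.accumulate`; the function name `pack_matching` is hypothetical, and a real vector implementation would process the lanes in parallel rather than in a loop.

```python
from itertools import accumulate

def pack_matching(data, predicate):
    """Pack the elements of `data` that satisfy `predicate`.

    Returns (packed, count). The output offset of each matching element
    comes from an additive scan of the comparison flags.
    """
    flags = [1 if predicate(x) else 0 for x in data]
    inclusive = list(accumulate(flags))         # additive (prefix-sum) scan
    count = inclusive[-1] if inclusive else 0   # the last scan value is the match count
    packed = [None] * count
    for x, flag, incl in zip(data, flags, inclusive):
        if flag:
            packed[incl - 1] = x  # offset = inclusive scan value minus one
    return packed, count
```

Because every offset is known from the scan alone, each matching element can be written independently, which is what makes the scheme suitable for parallel lanes.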
Store hit multiple load side register for preventing a subsequent store memory violation

Patent number: 11144321

Abstract: Examples of techniques for store hit multiple load side register for operand store compare are described herein. An aspect includes, based on detecting a store hit multiple load condition in the processor, updating a register of the processor to hold information corresponding to a first store instruction that triggered the detected store hit multiple load condition. Another aspect includes, based on issuing a second store instruction in the processor, determining whether the second store instruction corresponds to the information in the register. Another aspect includes, based on determining that the second store instruction corresponds to the information in the register, tagging the second store instruction with an operand store compare mark.

Type: Grant

Filed: February 20, 2019

Date of Patent: October 12, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Yair Fried, Jonathan Hsieh, Eyal Naor, James Bonanno, Gregory William Alexander
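
The three aspects of the abstract above (record, compare, tag) can be sketched as a small model. It assumes the recorded information is the instruction address of the triggering store and that "corresponds" means an exact match; both are assumptions, and the `SideRegister` class and its method names are hypothetical.

```python
class SideRegister:
    """Hypothetical model of a store-hit-multiple-load side register."""

    def __init__(self):
        self.info = None  # nothing recorded yet

    def record(self, store_pc):
        """Called when a store hit multiple load condition is detected."""
        self.info = store_pc

    def issue_store(self, store_pc):
        """Issue a store; tag it with an operand store compare (OSC) mark
        if it corresponds to the recorded information."""
        marked = self.info is not None and store_pc == self.info
        return {"pc": store_pc, "osc_mark": marked}
```

In this model a store is tagged only after an earlier store at the same address has triggered the condition; other stores issue unmarked.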
