Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)

Near-memory determination of registers

Patent number: 11966328

Abstract: A memory module includes register selection logic to select alternate local source and/or destination registers to process PIM commands. The register selection logic uses an address-based register selection approach to select an alternate local source and/or destination register based upon address data specified by a PIM command and a split address maintained by a memory module. The register selection logic may alternatively use a register data-based approach to select an alternate local source and/or destination register based upon data stored in one or more local registers. A PIM-enabled memory module configured with the register selection logic described herein is capable of selecting an alternate local source and/or destination register to process PIM commands at or near the PIM execution unit where the PIM commands are executed.

Type: Grant

Filed: December 18, 2020

Date of Patent: April 23, 2024

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Onur Kayiran, Mohamed Assem Ibrahim, Shaizeen Aga
Methods and systems for inter-pipeline data hazard avoidance

Patent number: 11900122

Abstract: Methods and parallel processing units for avoiding inter-pipeline data hazards identified at compile time. For each identified inter-pipeline data hazard, the primary instruction and secondary instruction(s) thereof are identified as such and are linked by a counter which is used to track that inter-pipeline data hazard. When a primary instruction is output by the instruction decoder for execution, the value of the counter associated therewith is adjusted to indicate that there is a hazard related to the primary instruction, and when the primary instruction has been resolved by one of multiple parallel processing pipelines, the value of the counter associated therewith is adjusted to indicate that the hazard related to the primary instruction has been resolved.

Type: Grant

Filed: July 10, 2023

Date of Patent: February 13, 2024

Assignee: Imagination Technologies Limited

Inventors: Luca Iuliano, Simon Nield, Yoong-Chert Foo, Ollie Mower
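
Illustrative sketch (not taken from the patent): the counter scheme in the abstract above can be modelled as one counter per compile-time-identified hazard. The class name `HazardCounters`, the increment/decrement direction, and the rule that a secondary instruction waits for a zero count are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class HazardCounters:
    """Hypothetical model: one counter per hazard identified at compile time."""
    counts: dict = field(default_factory=dict)

    def issue_primary(self, counter_id):
        # The decoder outputs a primary instruction: record an outstanding hazard.
        self.counts[counter_id] = self.counts.get(counter_id, 0) + 1

    def resolve_primary(self, counter_id):
        # A pipeline has resolved the primary instruction: clear one outstanding hazard.
        if self.counts.get(counter_id, 0) > 0:
            self.counts[counter_id] -= 1

    def secondary_may_issue(self, counter_id):
        # Assumption: a linked secondary instruction may issue only at a zero count.
        return self.counts.get(counter_id, 0) == 0


hc = HazardCounters()
hc.issue_primary(0)
blocked = not hc.secondary_may_issue(0)   # hazard outstanding
hc.resolve_primary(0)
released = hc.secondary_may_issue(0)      # hazard resolved
```

A real design would bound the counter width and the number of counters; the dictionary here only keeps the sketch short.
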
Permutation instruction

Patent number: 11900111

Abstract: A device includes a vector register file, a memory, and a processor. The vector register file includes a plurality of vector registers. The memory is configured to store a permutation instruction. The processor is configured to access a periodicity parameter of the permutation instruction. The periodicity parameter indicates a count of a plurality of data sources that contain source data for the permutation instruction. The processor is also configured to execute the permutation instruction to, for each particular element of multiple elements of a first permutation result register of the plurality of vector registers, select a data source of the plurality of data sources based at least in part on the count of the plurality of data sources and populate the particular element based on a value in a corresponding element of the selected data source.

Type: Grant

Filed: September 24, 2021

Date of Patent: February 13, 2024

Assignee: QUALCOMM Incorporated

Inventors: Srijesh Sudarsanan, Deepak Mathew, Marc Hoffman, Gerald Sweeney, Sundar Rajan Balasubramanian, Hongfeng Dong, Yurong Sun, Seyedmehdi Sadeghzadeh
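
Illustrative sketch (not taken from the patent): one possible reading of the abstract above is that the periodicity parameter gives the number of data sources and each result element picks its source from that count. The round-robin rule `i % count` and the function name `permute` are assumptions; the patent may define the selection differently.

```python
def permute(sources, num_elements):
    """Fill a result vector by choosing, per element, one of several sources.

    sources: list of equal-length vectors (the data sources).
    num_elements: number of result elements to populate.
    """
    count = len(sources)  # plays the role of the periodicity parameter
    result = []
    for i in range(num_elements):
        selected = sources[i % count]   # assumed selection rule
        result.append(selected[i])      # corresponding element of that source
    return result


interleaved = permute([[0, 1, 2, 3], [10, 11, 12, 13]], 4)
```

With two sources this selection rule takes even-indexed elements from the first source and odd-indexed elements from the second.
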
Arithmetic processing device and semiconductor device with improved instruction retry

Patent number: 11842192

Abstract: An arithmetic processing device includes a memory and a processor coupled to the memory, the processor configured to: execute arithmetic processing which executes a plurality of instructions issued out of order, execute control processing which commits the plurality of instructions for which execution has been completed in order, identify, for each instruction included in the plurality of instructions, a count value which indicates a number of cycles from when execution of the instruction has been completed, and identify, among a plurality of uncommitted instructions, the instruction with the count value which matches the number of cycles in which an error is detected in the arithmetic processing as a specific instruction to be retried.

Type: Grant

Filed: March 16, 2021

Date of Patent: December 12, 2023

Assignee: FUJITSU LIMITED

Inventors: Yuhei Takata, Shiro Kamoshida
Processor graph execution using interrupt conservation

Patent number: 11836518

Abstract: Techniques for data manipulation using processor graph execution using interrupt conservation are disclosed. Processing elements are configured to implement a data flow graph. The processing elements comprise a multilayer graph execution engine. A data engine is loaded with computational parameters for the multilayer graph execution engine. The data engine is coupled to the multilayer graph execution engine, and the computational parameters supply layer-by-layer execution data to the multilayer graph execution engine for data flow graph execution. A first command FIFO is used for loading the data engine with computational parameters, and a second command FIFO is used for loading the multilayer graph execution engine with layer definition data. An input image is provided for a first layer of the multilayer graph execution engine. The data flow graph is executed using the input image and the computational parameters.

Type: Grant

Filed: December 10, 2021

Date of Patent: December 5, 2023

Inventor: David John Simpson
Register scoreboard for a microprocessor with a time counter for statically dispatching instructions

Patent number: 11829767

Abstract: A processor includes a time counter and a register scoreboard and operates to statically dispatch instructions with preset execution times based on a write time of a register in the register scoreboard and a time count of the time counter provided to an execution pipeline.

Type: Grant

Filed: February 15, 2022

Date of Patent: November 28, 2023

Assignee: Simplex Micro, Inc.

Inventor: Thang Minh Tran
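
Illustrative sketch (not taken from the patent): a register scoreboard that records, for each register, the time count at which its pending write completes, so that an instruction can be given a fixed start time at dispatch. The class name `TimeScoreboard`, the fixed per-instruction latency, and the "start at the latest source write time" rule are assumptions for this example.

```python
class TimeScoreboard:
    """Hypothetical scoreboard holding one write time per register."""

    def __init__(self, num_regs):
        self.write_time = [0] * num_regs

    def dispatch(self, now, srcs, dst, latency):
        # The instruction can start no earlier than the current time count
        # and no earlier than the write time of any source register.
        start = max([now] + [self.write_time[r] for r in srcs])
        # Record when the destination register will be written.
        self.write_time[dst] = start + latency
        return start


sb = TimeScoreboard(4)
first_start = sb.dispatch(now=0, srcs=[], dst=1, latency=3)
second_start = sb.dispatch(now=1, srcs=[1], dst=2, latency=1)
```

Here the second instruction reads register 1, so its preset start time is pushed out to the time at which the first instruction writes that register.
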
Techniques to manage execution of divergent shaders

Patent number: 11776195

Abstract: Examples are described here that can be used to enable a main routine to request subroutines or other related code to be executed with other instantiations of the same subroutine or other related code for parallel execution. A sorting unit can be used to accumulate requests to execute instantiations of the subroutine. The sorting unit can request execution of a number of multiple instantiations of the subroutine corresponding to a number of lanes in a SIMD unit. A call stack can be used to share information to be accessed by a main routine after execution of the subroutine completes.

Type: Grant

Filed: August 31, 2021

Date of Patent: October 3, 2023

Assignee: Intel Corporation

Inventors: John G. Gierach, Karthik Vaidyanathan, Thomas F. Raoux
Microprocessor that fuses load and compare instructions

Patent number: 11748104

Abstract: Technology for fusing certain load instructions and compare-immediate instructions in a computer processor having a load-store architecture with respect to transferring data between memory and registers of the computer processor. In some embodiments the load and compare-immediate instructions are consecutive. In some embodiments, the instructions are only merged if: (i) the respective RA and RT fields of the two instructions match; (ii) the immediate field of the compare-immediate instruction has a certain value, or falls within a range of certain values; and/or (iii) the instructions are received in a consecutive manner.

Type: Grant

Filed: July 29, 2020

Date of Patent: September 5, 2023

Assignee: International Business Machines Corporation

Inventors: Bryan Lloyd, David A. Hrusecky, Sundeep Chadha, Dung Q. Nguyen, Christian Gerhard Zoellin, Brian W. Thompto, Sheldon Bernard Levenstein, Phillip G. Williams
Instruction storage

Patent number: 11727530

Abstract: Techniques are disclosed relating to low-level instruction storage in a processing unit. In some embodiments, a graphics unit includes execution circuitry, decode circuitry, hazard circuitry, and caching circuitry. In some embodiments the execution circuitry is configured to execute clauses of graphics instructions. In some embodiments, the decode circuitry is configured to receive graphics instructions and a clause identifier for each received graphics instruction and to decode the received graphics instructions. In some embodiments, the caching circuitry includes a plurality of entries each configured to store a set of decoded instructions in the same clause. A given clause may be fetched and executed multiple times, e.g., for different SIMD groups, while stored in the caching circuitry.

Type: Grant

Filed: May 28, 2021

Date of Patent: August 15, 2023

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Dzung Q. Vu, Liang Kai Wang
Out-of-order block-based processors and instruction schedulers using ready state data indexed by instruction position identifiers

Patent number: 11687345

Abstract: Apparatus and methods are disclosed for implementing block-based processors including field programmable gate-array implementations. In one example of the disclosed technology, a block-based processor includes an instruction decoder configured to generate decoded ready dependencies for a transactional block of instructions, where each of the instructions is associated with a different instruction identifier encoded in the transactional block. The processor further includes an instruction scheduler configured to issue an instruction from a set of instructions of the transactional block of instructions. The instruction is issued based on determining that decoded ready state dependencies for an instruction are satisfied. The determining includes accessing storage with the decoded ready dependencies indexed with a respective instruction identifier that is encoded in the transactional block of instructions.

Type: Grant

Filed: July 29, 2016

Date of Patent: June 27, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Aaron L. Smith, Jan S. Gray
Microprocessor and method for speculatively issuing load/store instruction with non-deterministic access time using scoreboard

Patent number: 11687347

Abstract: A microprocessor and a method for issuing a load/store instruction are introduced. The microprocessor includes a decode/issue unit, a load/store queue, a scoreboard, and a load/store unit. The scoreboard includes a plurality of scoreboard entries, in which each scoreboard entry includes an unknown bit value and a count value, wherein the unknown bit value or the count value is set when instructions are issued. The decode/issue unit checks for WAR, WAW, and RAW data dependencies from the scoreboard and dispatches load/store instructions to the load/store queue with the recorded scoreboard values. The load/store queue is configured to resolve the data dependencies and dispatch the load/store instructions to the load/store unit for execution.

Type: Grant

Filed: May 25, 2021

Date of Patent: June 27, 2023

Assignee: ANDES TECHNOLOGY CORPORATION

Inventor: Thang Minh Tran
Methods and systems for utilizing a master-shadow physical register file based on verified activation

Patent number: 11599359

Abstract: A processor in a data processing system includes a master-shadow physical register file and a renaming unit. The master-shadow physical register file has a master storage coupled to shadow storage. The renaming unit is coupled to the master-shadow physical register file. Based on an occurrence of shadow transfer activation conditions verified by the renaming unit, data in the master storage is transferred from the master storage to the shadow storage for storage. Data is transferred from the shadow storage back to the master storage based on the occurrence of a shadow-to-master transfer event, which includes, for example, a flush of the master storage by the processor.

Type: Grant

Filed: May 18, 2020

Date of Patent: March 7, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Arun A. Nair, Ashok T. Venkatachar, Emil Talpes, Srikanth Arekapudi, Rajesh Kumar Arunachalam
Processor, device, and method for executing instructions

Patent number: 11593115

Abstract: The present disclosure discloses an instruction execution device, a processor including the instruction execution device, a system on chip, and a method for executing a data storage instruction in the processor. The method includes: splitting the data storage instruction into a first split instruction and a second split instruction, wherein the first split instruction is associated with an address operand of the data storage instruction, and the second split instruction is associated with a data operand of the data storage instruction; executing the first split instruction to determine a data storage address corresponding to the address operand; executing the second split instruction to acquire data content corresponding to the data operand; and storing the acquired data content to the determined data storage address in a data storage region. The present disclosure further discloses a corresponding instruction execution device, a processor including the execution device and a system on chip.

Type: Grant

Filed: March 20, 2020

Date of Patent: February 28, 2023

Assignee: Alibaba Group Holding Limited

Inventors: Yimin Lu, Xiaoyan Xiang
System and method for instruction unwinding in an out-of-order processor

Patent number: 11593116

Abstract: A system and corresponding method unwind instructions in an out-of-order (OoO) processor. The system comprises a mapper. In response to a restart event causing at least one instruction to be unwound, the mapper restores a present integer mapper state and present floating-point (FP) mapper state, used for mapping instructions, to a former integer mapper state and former FP mapper state, respectively. The mapper stores integer snapshots and FP snapshots of the present integer and FP mapper state, respectively, to expedite restoration to the former integer and FP mapper state, respectively. Access to the FP snapshots is blocked, intermittently, as a function of at least one FP present indicator used by the mapper to record presence of FP registers used as destinations in the instructions. Blocking the access, intermittently, improves power efficiency of the OoO processor.

Type: Grant

Filed: April 30, 2021

Date of Patent: February 28, 2023

Assignee: Marvell Asia Pte, Ltd.

Inventor: David A. Carlson
Evicting and restoring information using a single port of a logical register mapper and history buffer in a microprocessor comprising multiple main register file entries mapped to one accumulator register file entry

Patent number: 11561794

Abstract: A computer system, processor, programming instructions and/or method of processing data that includes a main register file having a plurality of entries for storing data; an accumulator register file having a plurality of entries for storing data wherein multiple main register file entries are mapped to one accumulator register file entry in the at least one accumulator register file; a logical register mapper to track and map logical registers to main register file entries, and a history buffer. Processing wide data width instructions includes evicting and restoring information from a single primary entry in the logical register mapper through a single read or write port in the logical register mapper without evicting or restoring the remaining other multiple main register file entries mapped in the accumulator register.

Type: Grant

Filed: May 26, 2021

Date of Patent: January 24, 2023

Assignee: International Business Machines Corporation

Inventors: Steven J. Battle, Brian W. Thompto, Dung Q. Nguyen, Cliff Kucharski, Susan E. Eisen, Salma Ayub
Data processing

Patent number: 11531547

Abstract: Data processing circuitry comprises out-of-order instruction execution circuitry; register mapping circuitry to map zero or more architectural processor registers relating to execution of that program instruction to respective ones of a set of physical processor registers; commit circuitry to commit, in a program code order, the results of executed program instructions, the commit circuitry being configured to access a data store which stores register tag data to indicate which physical registers mapped by the register mapping circuitry relate to a given program instruction; fault detection circuitry to detect a memory access fault in respect of a vector memory access operation and to generate fault indication data indicative of an element earliest in the element order for which a memory access fault was detected; a fault indication register to store the fault indication data, in which the register mapping circuitry is configured to generate a register mapping for a program instruction for any architectural p

Type: Grant

Filed: May 21, 2021

Date of Patent: December 20, 2022

Assignee: Arm Limited

Inventors: Damian Maiorano, Luca Nassi, Cédric Denis Robert Airaud, Christophe Laurent Carbonne, Jocelyn François Orion Jaubert, Pasquale Ranone
System and method for generating a binary patch file for live patching of an application

Patent number: 11507362

Abstract: A system and method for generating a binary patch file for live patching of an application are disclosed. In one exemplary aspect, the method comprises creating a shared object by compiling a source code patch file that contains source code of a new function corresponding to an old function, a global external symbol referenced in the source code of the new function, and at least one link to a symbol in an application binary code corresponding to the global external symbol, wherein the shared object contains binary code of the new function for replacing the old function during the live patching, and the result of a compilation of the link; generating metadata usable to facilitate the live patching; creating bindings between calculated relative addresses and the global external symbol referenced by the shared object; and creating the binary patch file by adding the metadata to the shared object.

Type: Grant

Filed: October 5, 2020

Date of Patent: November 22, 2022

Assignee: Virtuozzo International GmbH

Inventors: Stanislav Kinsburskiy, Alexey Kobets, Eugene Kolomeetz
Circuitry and method for controlling a generated association of a physical register with a predicated processing operation based on predicate data state

Patent number: 11494190

Abstract: Instruction decoder circuitry decodes processing instructions each generating an output multi-bit data item in a destination architectural register by applying a processing operation to source data item(s) in respective source architectural register(s). The decoder circuitry detects whether an instruction defines a predicated merge operation that propagates a set of zero or more portions of the prevailing contents of the destination architectural register as respective portions of the output multi-bit data item. The portions are defined by predicate data. Register allocation circuitry associates physical registers with the destination architectural register and the source architectural register(s). When detector circuitry detects that an instruction defines a predicated merge operation, the register allocation circuitry associates a further physical register with that instruction to store a copy of the prevailing contents.

Type: Grant

Filed: March 31, 2021

Date of Patent: November 8, 2022

Assignee: Arm Limited

Inventors: Zachary Allen Kingsbury, Kurt Matthew Fellows, Thomas Gilles Tarridec
Booting an application from multiple memories

Patent number: 11481315

Abstract: A method includes using a memory address map, locating a first portion of an application in a first memory and loading a second portion of the application from a second memory. The method includes executing in place from the first memory the first portion of the application, during a first period, and by completion of the loading of the second portion of the application from the second memory. The method further includes executing the second portion of the application during a second period, wherein the first period precedes the second period.

Type: Grant

Filed: September 4, 2020

Date of Patent: October 25, 2022

Assignee: INFINEON TECHNOLOGIES LLC

Inventors: Stephan Rosner, Qamrul Hasan, Venkat Natarajan
Microthreading for accelerated deep learning

Patent number: 11475282

Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of compute elements and routers performs flow-based computations on wavelets of data. Some instructions are performed in iterations, such as one iteration per element of a fabric vector or FIFO. When sources for an iteration of an instruction are unavailable, and/or there is insufficient space to store results of the iteration, indicators associated with operands of the instruction are checked to determine whether other work can be performed. In some scenarios, other work cannot be performed and processing stalls. Alternatively, information about the instruction is saved, the other work is performed, and sometime after the sources become available and/or sufficient space to store the results becomes available, the iteration is performed using the saved information.

Type: Grant

Filed: April 17, 2018

Date of Patent: October 18, 2022

Assignee: Cerebras Systems Inc.

Inventors: Sean Lie, Michael Morrison, Michael Edwin James, Gary R. Lauterbach, Srikanth Arekapudi
Generating tie code fragments for binary translation

Patent number: 11455156

Abstract: Systems and methods for binary translation of executable code.

Type: Grant

Filed: May 11, 2020

Date of Patent: September 27, 2022

Assignee: Parallels International GMBH

Inventors: Alexey Koryakin, Nikolay Dobrovolskiy, Serguei M. Beloussov
Electronic device and method for data processing using virtual register mode

Patent number: 11449341

Abstract: The invention relates to an electronic device for data processing, which includes an execution unit with a temporary register, a register file, a first feedback path from the data output of the execution unit to the register file, a second feedback path from the data output of the execution unit to the temporary register, a switch configured to connect the first feedback path and/or the second feedback path, and a logic stage coupled to control the switch. The logic stage is configured to control the switch to connect the second feedback path if the data output of an execution unit is used as an operand in the subsequent operation of an execution unit.

Type: Grant

Filed: September 9, 2019

Date of Patent: September 20, 2022

Assignee: Texas Instruments Incorporated

Inventors: Marko Krüger, Steven Bartling, Markus Kösler
Speculative execution of correlated memory access instruction methods, apparatuses and systems

Patent number: 11429391

Abstract: A processor core, a processor, an apparatus, and an instruction processing method are disclosed. The processor core includes: an instruction fetch unit, where the instruction fetch unit includes a speculative execution predictor and the speculative execution predictor compares a program counter of a memory access instruction with a table entry stored in the speculative execution predictor and marks the memory access instruction; a scheduler unit adapted to adjust a send order of marked memory access instructions and send the marked memory access instructions according to the send order; an execution unit adapted to execute the memory access instructions according to the send order. In the instruction fetch unit, a memory access instruction is marked according to a speculative execution prediction result. In the scheduler unit, a send order of memory access instructions is determined according to the marked memory access instruction and the memory access instructions are sent.

Type: Grant

Filed: August 27, 2020

Date of Patent: August 30, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Dongqi Liu, Chang Liu, Yimin Lu, Tao Jiang, Chaojun Zhao
Program instruction fusion

Patent number: 11416252

Abstract: A data processing system includes an instruction pipeline containing instruction queue circuitry, fusion circuitry and decoder circuitry. The fusion circuitry serves to identify fusible groups of program instructions within a Y-wide window of program instructions and supply a stream of program instructions including such replacement fused program instructions to an X-wide decoder circuitry which decodes X program instructions in parallel using parallel decoders.

Type: Grant

Filed: December 27, 2017

Date of Patent: August 16, 2022

Assignee: Arm Limited

Inventors: Vladimir Vasekin, Chiloda Ashan Senarath Pathirane, Jungsoo Kim, Alexei Fedorov
Zero cycle load bypass in a decode group

Patent number: 11416254

Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.

Type: Grant

Filed: December 5, 2019

Date of Patent: August 16, 2022

Assignee: Apple Inc.

Inventors: Deepankar Duggal, Kulin N. Kothari, Conrado Blasco, Muawya M. Al-Otoom
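
Illustrative sketch (not taken from the patent): renaming one decode group so that a load following an older store to the same address reuses the physical register that holds the store's data and skips the free-list lookup. The dictionary-based op format, the symbolic `(base, offset)` address key, and the function name `rename_group` are assumptions; real hardware would also have to check that the address operands are not redefined between the store and the load.

```python
def rename_group(group, map_table, free_list):
    """Rename the ops of one decode group, given in program order.

    map_table: architectural register -> physical register id.
    free_list: list of available physical register ids.
    """
    store_data_reg = {}  # symbolic address -> physical reg holding the store data
    for op in group:
        if op["kind"] == "store":
            store_data_reg[op["addr"]] = map_table[op["src"]]
        elif op["kind"] == "load":
            if op["addr"] in store_data_reg:
                # Bypass case: share the store's source register, no free-list lookup.
                map_table[op["dst"]] = store_data_reg[op["addr"]]
            else:
                # Ordinary load: allocate a new physical register.
                map_table[op["dst"]] = free_list.pop(0)
    return map_table


regs = {"x1": 3}
free = [10, 11]
ops = [
    {"kind": "store", "src": "x1", "addr": ("sp", 8)},
    {"kind": "load", "dst": "x2", "addr": ("sp", 8)},   # same address as the store
    {"kind": "load", "dst": "x3", "addr": ("sp", 16)},  # different address
]
rename_group(ops, regs, free)
```

After the call, `x2` shares physical register 3 with `x1`, while `x3` takes a register from the free list.
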
Microprocessor with multi-step ahead branch predictor and having a fetch-target queue between the branch predictor and instruction cache

Patent number: 11403103

Abstract: A microprocessor is shown, in which a branch predictor and an instruction cache are decoupled by a fetch-target queue (FTQ). The branch predictor performs branch prediction for N instruction addresses in parallel in the same cycle, wherein N is an integer greater than 1. In the current cycle, the branch predictor finishes branch prediction for N instruction addresses in parallel and, among the N instruction addresses with finished branch prediction, those that are not bypassed and do not overlap previously-predicted instruction addresses are pushed into the fetch-target queue, to be read out later as an instruction-fetching address for the instruction cache. The previously-predicted instruction addresses are pushed into the fetch-target queue in a previous cycle.

Type: Grant

Filed: October 13, 2020

Date of Patent: August 2, 2022

Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.

Inventors: Fangong Gong, Mengchen Yang
Protection against timing-based security attacks by randomly adjusting reorder buffer capacity

Patent number: 11403107

Abstract: Methods, systems, and apparatuses related to re-order buffers and for protection from timing-based security attacks are described. A processor may have functional units configured to execute instructions out of order, a re-order buffer configured to buffer the execution results of instructions for output in order, and a controller configured to randomize data timing in the re-order buffer. For example, the controller can make random adjustments to the capacity of the re-order buffer in buffering and/or sorting execution results and thus randomize data timing in the re-order buffer.

Type: Grant

Filed: December 5, 2018

Date of Patent: August 2, 2022

Assignee: Micron Technology, Inc.

Inventor: Steven Jeffrey Wallach
Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor

Patent number: 11327763

Abstract: Opportunistic consumer instruction steering based on producer instruction value prediction in a multi-cluster processor is disclosed. A processor provides producer instructions and consumer instructions to a steering circuit that steers the program instructions to clusters of instruction execution circuits. An input value provided to a consumer instruction may be a produced value of a producer instruction, creating a dependency. The steering circuit steers a producer instruction to a first cluster and, in response to receiving the consumer instruction and the predicted value of the producer instruction, provides the predicted value to at least a second cluster and steers the consumer instruction to the second cluster for execution with the predicted value as the input value.

Type: Grant

Filed: June 11, 2020

Date of Patent: May 10, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Arthur Perais, Shivam Priyadarshi, Yusuf Cagatay Tekmen, Rami Mohammad Al Sheikh, Vignyan Reddy Kothinti Naresh
Method and apparatus for instruction prefetching with alternating buffers and sequential instruction address matching

Patent number: 11327762

Abstract: An instruction prefetching method, a device and a medium are provided. The method includes the following: instructions in a target buffer are precompiled before a processor core fetches a required instruction from the target buffer corresponding to the processor core; if it is determined that a jump instruction exists in the target buffer and a jump target instruction corresponding to the jump instruction is not cached in the target buffer according to a precompiled result, the jump target instruction is prefetched from an icache into a candidate buffer corresponding to the processor core to wait for the processor core to fetch the jump target instruction from the candidate buffer; the target buffer and the candidate buffer are alternately reused during instruction prefetching.

Type: Grant

Filed: September 29, 2020

Date of Patent: May 10, 2022

Assignee: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED

Inventors: Chao Tang, Xueliang Du, Yingnan Xu
Compiler assisted register file write reduction

Patent number: 11321799

Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as single write instead of multiple separate partial writes.

Type: Grant

Filed: December 24, 2019

Date of Patent: May 3, 2022

Assignee: Intel Corporation

Inventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
Duplicate detection for register renaming

Patent number: 11294683

Abstract: Systems and methods are disclosed for duplicate detection for register renaming. For example, a method includes checking a map table for duplicates of a first physical register, wherein the map table stores entries that each map an architectural register of an instruction set architecture to a physical register of a microarchitecture and a duplicate is two or more architectural registers that are mapped to a same physical register; and, responsive to a duplicate of the first physical register in the map table, preventing the first physical register from being added to a free list upon retirement of an instruction that renames an architectural register that was previously mapped to the first physical register to a different physical register, wherein the free list stores entries that indicate which physical registers are available for renaming.

Type: Grant

Filed: April 17, 2020

Date of Patent: April 5, 2022

Assignee: SiFive, Inc.

Inventor: Joshua Smith
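
The duplicate check described in the abstract of patent 11294683 above can be illustrated with a small sketch. It is not the patented implementation: the class and method names are hypothetical, the `alias` operation (one way two architectural registers could come to share a physical register) is an assumption, and rename and retirement are collapsed into immediate steps for brevity.

```python
class RenameMap:
    """Toy map table plus free list (hypothetical names, illustration only)."""

    def __init__(self, num_arch, num_phys):
        # Architectural register i starts mapped to physical register i.
        self.map_table = list(range(num_arch))
        # The remaining physical registers are available for renaming.
        self.free_list = list(range(num_arch, num_phys))

    def rename(self, arch):
        """Map arch to a fresh physical register; return the old mapping."""
        old = self.map_table[arch]
        self.map_table[arch] = self.free_list.pop(0)
        return old

    def alias(self, dst, src):
        """Assumed source of duplicates: dst shares src's physical register."""
        old = self.map_table[dst]
        self.map_table[dst] = self.map_table[src]
        return old

    def retire(self, old_phys):
        """Free old_phys unless another architectural register still maps to it."""
        if old_phys in self.map_table:
            return False  # duplicate found: keep it off the free list
        self.free_list.append(old_phys)
        return True


m = RenameMap(num_arch=4, num_phys=6)
freed_after_alias = m.retire(m.alias(2, 1))   # r2 now shares p1; p2 is freed
freed_first = m.retire(m.rename(1))           # r1 leaves p1, but r2 still uses it
freed_second = m.retire(m.rename(2))          # last user of p1 leaves; p1 is freed
```

In this sketch the second call to `retire` returns `False` because the map table still holds a duplicate of the old physical register, which is the situation the abstract describes.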
Extending fused multiply-add instructions

Patent number: 11269631

Abstract: Extending fused multiply-add instructions, the method comprising: receiving an extended fused multiply-add (FMA) instruction indicating one or more operands of a fused multiply-add (FMA) operation and one or more transformations to be applied to the one or more operands; and performing, based on the extended FMA instruction, the one or more transformations and the FMA operation.

Type: Grant

Filed: July 29, 2020

Date of Patent: March 8, 2022

Assignee: GHOST LOCOMOTION INC.

Inventors: John Hayes, Volkmar Uhlig
Processor with variable pre-fetch threshold

Patent number: 11231933

Abstract: A method and apparatus for controlling pre-fetching in a processor. A processor includes an execution pipeline and an instruction pre-fetch unit. The execution pipeline is configured to execute instructions. The instruction pre-fetch unit is coupled to the execution pipeline. The instruction pre-fetch unit includes instruction storage to store pre-fetched instructions, and pre-fetch control logic. The pre-fetch control logic is configured to fetch instructions from memory and store the fetched instructions in the instruction storage. The pre-fetch control logic is also configured to provide instructions stored in the instruction storage to the execution pipeline for execution. The pre-fetch control logic is further configured to set a maximum number of instruction words to be pre-fetched for execution subsequent to execution of an instruction currently being executed in the execution pipeline.

Type: Grant

Filed: April 9, 2020

Date of Patent: January 25, 2022

Assignee: Texas Instruments Incorporated

Inventors: Christian Wiencke, Johann Zipperer
Arithmetic processing device and control method implemented by arithmetic processing device

Patent number: 11210101

Abstract: An arithmetic processing device includes: a decoding circuit configured to decode a command; a command execution circuit configured to execute the command decoded by the decoding circuit; a register circuit configured to include a plurality of registers for holding data used by the command execution circuit; an identification information holding circuit configured to store identification information for identifying a register for writing a specific value when the command is a register writing command; a setting circuit configured to hold the specific value; and an operation control circuit configured to execute inhibiting processing when the command is a register reading command, the inhibiting processing including inhibiting an access of the register by the register reading command and selecting the specific value held in the setting circuit.

Type: Grant

Filed: August 22, 2019

Date of Patent: December 28, 2021

Assignee: FUJITSU LIMITED

Inventors: Ryohei Okazaki, Sota Sakashita, Atushi Fusejima
Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines

Patent number: 11204769

Abstract: A global front end scheduler to schedule instruction sequences to a plurality of virtual cores implemented via a plurality of partitionable engines. The global front end scheduler includes a thread allocation array to store a set of allocation thread pointers to point to a set of buckets in a bucket buffer in which execution blocks for respective threads are placed, a bucket buffer to provide a matrix of buckets, the bucket buffer including storage for the execution blocks, and a bucket retirement array to store a set of retirement thread pointers that track a next execution block to retire for a thread.

Type: Grant

Filed: January 2, 2020

Date of Patent: December 21, 2021

Assignee: Intel Corporation

Inventor: Mohammad Abdallah
Processing unit and operating method therefor

Patent number: 11163562

Abstract: A processing unit, in particular, a microcontroller for a control unit, which includes at least one processing core, one primary memory device, and at least one main connection unit for connecting the at least one processing core to the primary memory device, the processing unit including at least two functional units, the processing unit including at least one functional unit designed as a data flow control unit, which is designed to receive input data, to evaluate the input data, and to generate output data as a function of the evaluation.

Type: Grant

Filed: October 9, 2018

Date of Patent: November 2, 2021

Assignee: Robert Bosch GmbH

Inventor: Nico Bannow
Vector compare and store instruction that stores index values to memory

Patent number: 11163564

Abstract: The present disclosure is directed to methods to generate a packed result array, using parallel vector processing, from an input array and a comparison operation. In one aspect, an additive scan operation can be used to generate memory offsets for each successful comparison operation of the input array and to generate a count of the number of data elements satisfying the comparison operation. In another aspect, the input array can be segmented to allow more efficient processing using the vector registers. In another aspect, a vector processing system is disclosed that is operable to receive a data array, a comparison operation, and threshold criteria, and output a packed array, at a specified memory address, comprising the data elements satisfying the comparison operation.

Type: Grant

Filed: October 8, 2018

Date of Patent: November 2, 2021

Assignees: VeriSilicon Microelectronics (Shanghai) Co., Ltd., VeriSilicon Holdings Co., Ltd.

Inventors: Charles H. Stewart, Charles R. Bezet
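
The additive-scan approach mentioned in the abstract of patent 11163564 above can be sketched in scalar Python. This is only an illustration of the general technique (often called stream compaction), not the patented vector instruction: the function name `pack` and its parameters are hypothetical, and each list comprehension stands in for what would be a vector operation in hardware.

```python
from itertools import accumulate
import operator


def pack(values, compare, threshold):
    """Pack the elements that satisfy compare(v, threshold) into a dense list."""
    # 1 where the comparison succeeds, 0 elsewhere.
    flags = [1 if compare(v, threshold) else 0 for v in values]
    # Inclusive additive scan; its last element is the number of matches.
    inclusive = list(accumulate(flags))
    count = inclusive[-1] if inclusive else 0
    # Exclusive scan: the output offset for each matching element.
    offsets = [s - f for s, f in zip(inclusive, flags)]
    out = [None] * count
    for v, f, off in zip(values, flags, offsets):
        if f:
            out[off] = v  # every match has a unique, precomputed slot
    return out, count


packed, count = pack([5, 1, 8, 3, 9], operator.gt, 4)
```

Because every matching element knows its output offset before any store happens, the stores are independent of each other, which is what makes the scan formulation suitable for parallel hardware. Storing the loop index instead of `v` would produce packed index values rather than data elements.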
Store hit multiple load side register for preventing a subsequent store memory violation

Patent number: 11144321

Abstract: Examples of techniques for store hit multiple load side register for operand store compare are described herein. An aspect includes, based on detecting a store hit multiple load condition in the processor, updating a register of the processor to hold information corresponding to a first store instruction that triggered the detected store hit multiple load condition. Another aspect includes, based on issuing a second store instruction in the processor, determining whether the second store instruction corresponds to the information in the register. Another aspect includes, based on determining that the second store instruction corresponds to the information in the register, tagging the second store instruction with an operand store compare mark.

Type: Grant

Filed: February 20, 2019

Date of Patent: October 12, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Yair Fried, Jonathan Hsieh, Eyal Naor, James Bonanno, Gregory William Alexander
Cache control circuitry and methods

Patent number: 11132202

Abstract: An apparatus comprises execution circuitry to perform operations on source data values and to generate result data values; issue circuitry comprising one or more issue queues identifying pending operations awaiting performance by the execution circuitry, and selection circuitry to select pending operations to issue to the execution circuitry; data value cache storage comprising first and second cache regions; and cache control circuitry to control the storing to the first cache region of result data values generated by the execution circuitry and the eviction of stored result data values from the first cache region in response to newly generated result data values being stored in the first cache region; the cache control circuitry being configured to store to the second cache region result data values required as source data values for one or more oldest pending operations identified by the one or more issue queues and to inhibit eviction of a given result data value stored in the second cache region until in

Type: Grant

Filed: September 24, 2019

Date of Patent: September 28, 2021

Assignee: Arm Limited

Inventors: Luca Nassi, Rémi Marius Teyssier, Cédric Denis Robert Airaud, Albin Pierrick Tonnerre, Francois Donati, Christophe Carbonne, Damian Maiorano
Performing flush recovery using parallel walks of sliced reorder buffers (SROBs)

Patent number: 11113068

Abstract: Performing flush recovery using parallel walks of sliced reorder buffers (SROBs) is disclosed herein. In one exemplary embodiment, a register mapping circuit provides a rename mapping table (RMT) comprising RMT entries representing logical register number (LRN) to physical register number (PRN) mappings. The register mapping circuit also provides an SROB comprising multiple SROB slices that each corresponds to a respective LRN. Each SROB slice tracks uncommitted instructions that write to the LRN corresponding to that SROB slice, and maintains those instructions in program order with respect to each other. Upon detecting an uncommitted instruction writing to an LRN, the register mapping circuit allocates an SROB slice entry in the SROB slice corresponding to the LRN. When a pipeline flush from a target instruction occurs, the register mapping circuit restores RMT entries of the RMT to their prior mapping states based on parallel walks of the SROB slices of the SROB.

Type: Grant

Filed: August 6, 2020

Date of Patent: September 7, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yusuf Cagatay Tekmen, Rodney Wayne Smith, Kiran Ravi Seth, Shivam Priyadarshi
Circuitry and methods

Patent number: 11086626

Abstract: Circuitry comprises decode circuitry to decode program instructions including producer instructions and consumer instructions, a consumer instruction requiring, as an input operand, a result generated by execution of a producer instruction; and execution circuitry to execute the program instructions; in which: the decode circuitry is configured to control operation of the execution circuitry in response to hint data associated with a given producer instruction and indicating, for the given producer instruction, a number of consumer instructions which require, as an input operand, a result generated by the given producer instruction.

Type: Grant

Filed: October 24, 2019

Date of Patent: August 10, 2021

Assignee: Arm Limited

Inventors: Roko Grubisic, Giacomo Gabrielli, Matthew James Horsnell, Syed Ali Mustafa Zaidi
Finish exception handling of an instruction completion table

Patent number: 11086630

Abstract: A computer system includes a dispatch stage configured to dispatch a plurality of instructions in a program order, and an issue stage configured to issue at least one instruction among the plurality of instructions. The computer system further includes an execution stage configured to execute the at least one instruction to generate a finish report and to determine the at least one instruction is one of an exception-free instruction or an exception instruction. In response to determining the exception-free instruction, a first finish report associated with the exception-free instruction is output to a completion stage. In response to determining the exception instruction, a second finish report associated with the exception instruction is output to an exception unit so as to halt output of the second finish report to the completion stage.

Type: Grant

Filed: February 27, 2020

Date of Patent: August 10, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Kenneth L. Ward, Susan Eisen, Christopher M. Mueller, Glenn O. Kincaid, Dhivya Jeganathan
Addition instructions with independent carry chains

Patent number: 11080045

Abstract: A number of addition instructions are provided that have no data dependency between each other. A first addition instruction stores its carry output in a first flag of a flags register without modifying a second flag in the flags register. A second addition instruction stores its carry output in the second flag of the flags register without modifying the first flag in the flags register.

Type: Grant

Filed: December 22, 2011

Date of Patent: August 3, 2021

Assignee: Intel Corporation

Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Matthew C. Merten, Tong Li, Bret T. Toll, I
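
The idea in the abstract of patent 11080045 above, two addition instructions that each keep their carry in a different flag, can be modeled with a short sketch. The flag names `"first"` and `"second"`, the 64-bit word size, and the helper functions are all assumptions made for illustration; the abstract does not specify them.

```python
MASK64 = (1 << 64) - 1


def add_carry(flags, flag, a, b):
    """Add a + b + flags[flag]; write the carry-out to that flag only."""
    total = a + b + flags[flag]
    flags[flag] = total >> 64  # the other flag is left untouched
    return total & MASK64


def add128_interleaved(x, y, u, v):
    """Compute x + y and u + v on (low, high) 64-bit word pairs, interleaved."""
    flags = {"first": 0, "second": 0}
    s_lo = add_carry(flags, "first", x[0], y[0])
    t_lo = add_carry(flags, "second", u[0], v[0])
    s_hi = add_carry(flags, "first", x[1], y[1])
    t_hi = add_carry(flags, "second", u[1], v[1])
    return (s_lo, s_hi), (t_lo, t_hi)


s, t = add128_interleaved((MASK64, 0), (1, 0), (1, 2), (3, 4))
```

With a single shared carry flag, the carry produced by the low word of `x + y` would leak into the low word of `u + v`. Keeping each chain's carry in its own flag lets the two chains be interleaved freely, since neither instruction depends on the other's flag.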
Register renaming-based techniques for block-based processors

Patent number: 11042381

Abstract: Techniques described herein are directed to ensuring register data consistency between different instruction blocks. For example, a block-based processor renames registers during block decode, but delays the update of a logical register-to-physical register mapping utilized by other instruction blocks until it is determined that a write instruction configured to write to a logical register commits. Alternatively, the processor renames registers during block decode and updates the mapping accordingly. However, the update is negated (e.g., rolled back) if the write instruction is not executed. Still further, the processor may analyze the instructions in the block to determine instructions configured to write to a logical register but that will not execute due to a mismatched predicate. Based on the determination, the block-based processor ensures data consistency by copying data from a previously-assigned register to a newly-assigned register.

Type: Grant

Filed: December 8, 2018

Date of Patent: June 22, 2021

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: David T. Harper, III, Gagan Gupta
Processor trace extensions to facilitate real-time security monitoring

Patent number: 11016773

Abstract: Embodiments described herein provide for a computing device comprising a hardware processor including a processor trace module to generate trace data indicative of an order of instructions executed by the processor, wherein the processor trace module is configurable to selectively output a processor trace packet associated with execution of a selected non-deterministic control flow transfer instruction.

Type: Grant

Filed: September 27, 2019

Date of Patent: May 25, 2021

Assignee: INTEL CORPORATION

Inventors: Salmin Sultana, Beeman Strong, Ravi Sahita
Key schedule determination

Patent number: 10992468

Abstract: Data processing apparatuses and methods for performing an iterative determination of a key schedule are provided. A set of registers initially receives an input data item and data processing is then performed using the content of the set of registers as an input. The result of this data processing is then used to update a value stored in a predetermined register of the set of registers at each iterative round of the determination of the key schedule. Which register in the set of registers is that predetermined register depends on whether the data processing apparatus is in a reverse key expansion mode or a forwards key expansion mode. Further, the set of registers is arranged to shift values contained in the set of registers in a direction which depends on whether the data processing apparatus is in a reverse key expansion mode or a forwards key expansion mode. The directions for the two modes are opposite to one another.

Type: Grant

Filed: March 19, 2018

Date of Patent: April 27, 2021

Assignee: Arm Limited

Inventor: Yoav Asher Levy
Selection of instructions to issue in a processor

Patent number: 10983799

Abstract: Techniques are disclosed relating to selection circuitry configured to select instruction operations to issue to one or more execution circuits of a processor. In some embodiments, an apparatus includes a plurality of execution circuits configured to perform one or more instruction operations. The apparatus may further include a plurality of instruction queues configured to store information indicative of the one or more instruction operations. In some embodiments, the apparatus may include a selection circuit configured to select a first plurality of instruction operations from a first instruction queue. The selection circuit may be configured to select a first instruction operation from the first plurality of instruction operations to issue to a first execution circuit.

Type: Grant

Filed: December 19, 2017

Date of Patent: April 20, 2021

Assignee: Apple Inc.

Inventors: Sean M. Reynolds, Gokul V. Ganesan
Register sharing mechanism

Patent number: 10983794

Abstract: A processor to facilitate register sharing is disclosed. The processor includes a plurality of execution units (EUs), each including a General Purpose Register File (GRF) having a plurality of registers; and register sharing hardware to divide the plurality of registers into a first set of registers dedicated for execution of a first set of threads and a second set of registers shared for execution of a second set of threads.

Type: Grant

Filed: June 17, 2019

Date of Patent: April 20, 2021

Assignee: Intel Corporation

Inventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Weiyu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski
Taint protection during speculative execution

Patent number: 10956157

Abstract: A subset of a set of architectural registers in a processing system is marked (or “tainted”) to indicate that speculative use of data in the subset of the architectural registers is constrained based on a taint handling policy. One or more speculation features supported by the processing system are disabled for the instruction so that the one or more speculation features cannot be used on data in the subset. In some cases, values of bits associated with the subset of architectural registers are modified to indicate that the subset is tainted. The taint handling policy can be indicated by values stored in a policy register. Taint markings are tracked in response to values stored in the tainted architectural registers being written to a memory or read from the memory.

Type: Grant

Filed: March 5, 2019

Date of Patent: March 23, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: David Kaplan, Marius Evers
Operand-based reach explicit dataflow processors, and related methods and computer-readable media

Patent number: 10956162

Abstract: Operand-based reach explicit dataflow processors, and related methods and computer-readable media are disclosed. The operand-based reach explicit dataflow processors support execution of a producer instruction that explicitly names a target consumer operand of a consumer instruction in a consumer operand encoding namespace of the producer instruction. The produced value from execution of the producer instruction is provided or otherwise made available as an input to the named target consumer operand of the consumer instruction as a result of processing the producer instruction. The target consumer operand is encoded in the producer instruction as an operand target distance relative to the producer instruction. Instructions in an instruction stream between the producer instruction and the targeted consumer instruction that have no operands do not consume an operand reach namespace in the producer instructions.

Type: Grant

Filed: June 28, 2019

Date of Patent: March 23, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Robert Douglas Clancy, Melinda Joyce Brown, Yusuf Cagatay Tekmen, Brian Michael Stempel, Michael Scott Mcilvaine, Thomas Philip Speier, Rodney Wayne Smith, Gagan Gupta, David Tennyson Harper, III
