Patents Examined by Keith E Vicary

Data value prediction and pre-alignment based on prefetched predicted memory access address

Patent number: 12293189

Abstract: An apparatus with prefetching capabilities is provided in order to produce predictions of a memory address to be accessed by a load instruction in the future. An additional special cache is provided where pre-aligned data can be stored based on that prediction. When that load instruction is eventually received, the prediction can be confirmed and the pre-aligned data returned and loaded into a register file. In accordance with these techniques, the load instruction does not need to access the memory system nor perform alignment of the data before loading it into the register file. Hence the load instruction is performed faster than when loading data via a memory access. Further precautionary functionalities are also provided to manage the pre-aligned data to avoid the possibility of data corruption after a substantive change occurs to the state of memory.

Type: Grant

Filed: May 4, 2023

Date of Patent: May 6, 2025

Assignee: Arm Limited

Inventors: Kim Richard Schuttenberg, Richard F Bryant

Programmable control of micro-operations cache resources of a processor

Patent number: 12282777

Abstract: Various example embodiments of a processor are presented. Various example embodiments of a processor may be configured to support split programmability of resources of a processor frontend of the processor. Various example embodiments of a processor are configured to support split programmability of resources of a processor frontend of the processor in a manner enabling assignment of split programmable resources of the frontend of the processor to control blocks of a program being executed by the processor. Various example embodiments of a processor are configured to support split programmability of micro-operations (UOPs) cache (UC) resources of the frontend of the processor (which may then be referred to as a split programmable (SP) UC (SP-UC), where it may be referred to as “split” since there are multiple UCs and may be referred to as “programmable” since selection of the active UC from the set of multiple UCs is controllable by the program executed by the processor).

Type: Grant

Filed: February 13, 2019

Date of Patent: April 22, 2025

Assignee: NOKIA TECHNOLOGIES OY

Inventor: Pranjal Kumar Dutta

Masked-vector-comparison instruction

Patent number: 12277420

Abstract: A masked-vector-comparison instruction specifies a source vector operand comprising a plurality of source data elements, a mask value, and a comparison target operand. In response to the masked-vector-comparison instruction, an instruction decoder 10 controls processing circuitry 16 to: for each active source data element of the source vector operand, determine whether the active source data element satisfies a comparison condition, based on a masked comparison between one or more compared bits of the active source data element and one or more compared bits of the comparison target operand, the mask value specifying a pattern of compared bits and non-compared bits within the comparison target operand and the active source data element; and generate a result value indicative of which of the source data elements of the source vector operand, if any, is an active source data element satisfying the comparison condition. This instruction is useful for variable length decoding operations.

Type: Grant

Filed: August 17, 2021

Date of Patent: April 15, 2025

Assignee: Arm Limited

Inventors: Jacob Eapen, Matthias Lothar Boettcher, Balaji Venu, François Christopher Jacques Botman
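The masked comparison described in the abstract of patent 12277420 can be illustrated with a minimal Python sketch. It assumes that a set mask bit marks a compared bit, that the comparison condition is equality on the compared bits, and that element activity is supplied as a list of booleans. The function names and these choices are illustrative assumptions, not details taken from the patent.

```python
from typing import List, Optional


def masked_vector_compare(source: List[int], active: List[bool],
                          mask: int, target: int) -> List[bool]:
    """Flag each active element whose compared bits equal those of the target.

    Assumption: a 1 bit in `mask` marks a compared bit; 0 bits are ignored.
    """
    wanted = target & mask
    return [is_active and (elem & mask) == wanted
            for elem, is_active in zip(source, active)]


def first_match(flags: List[bool]) -> Optional[int]:
    """Return the index of the first matching element, or None if none match."""
    for index, flag in enumerate(flags):
        if flag:
            return index
    return None


# Compare only the high nibble of each 8-bit element against 0xA0.
flags = masked_vector_compare(
    source=[0xA1, 0xC1, 0xAF, 0xA7],
    active=[True, True, True, False],  # the last element is inactive
    mask=0xF0,
    target=0xA0,
)
```

Here `flags` marks elements 0 and 2 as matches. Element 3 has the same high nibble but is inactive, so it is not reported.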

Compressing instructions for machine-learning accelerators

Patent number: 12242854

Abstract: In one embodiment, a method for accessing an instruction that is to be executed by a control agent within a computing system may include decompressing the instruction by replacing each of one or more zero-symbol run-length fields in the instruction with as many continuous zero symbols as its corresponding value and removing one or more non-zero-symbol run-length fields from the instruction. The method may also include determining that the instruction is spatial-delta-encoded based on a compression data header associated with the instruction, performing spatial-delta decoding on the instruction in response to the determination by orderly determining a spatial-delta-decoded value of each bit in the instruction, and causing the instruction to be sent to the control agent.

Type: Grant

Filed: February 21, 2023

Date of Patent: March 4, 2025

Assignee: Meta Platforms, Inc.

Inventors: Kyong Ho Lee, Miguel Angel Guerrero, Varun Agarwal
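The zero-run expansion step in the abstract of patent 12242854 can be sketched in Python as follows. The abstract does not give the field layout, so the sketch assumes a hypothetical representation in which the compressed instruction is a list of `(kind, value)` pairs with kinds `"zero_run"`, `"nonzero_run"`, and `"symbol"`. The spatial-delta decoding step is not sketched because the abstract does not specify how it works.

```python
from typing import List, Tuple

# Hypothetical field representation (an assumption, not the patent's format):
#   ("zero_run", n)     stands for n consecutive zero symbols
#   ("nonzero_run", n)  announces that n literal symbols follow
#   ("symbol", v)       is one literal symbol with value v
Field = Tuple[str, int]


def expand_zero_runs(fields: List[Field]) -> List[int]:
    """Expand zero-run fields into zeros and drop non-zero-run length fields."""
    symbols: List[int] = []
    for kind, value in fields:
        if kind == "zero_run":
            symbols.extend([0] * value)   # replace the field with `value` zeros
        elif kind == "nonzero_run":
            continue                      # the length field itself is removed
        elif kind == "symbol":
            symbols.append(value)         # literal symbols pass through
        else:
            raise ValueError(f"unknown field kind: {kind!r}")
    return symbols


decoded = expand_zero_runs([
    ("zero_run", 3),
    ("nonzero_run", 2), ("symbol", 7), ("symbol", 9),
    ("zero_run", 2),
])
```

In this example `decoded` is `[0, 0, 0, 7, 9, 0, 0]`: both zero runs are expanded and the non-zero run-length field is dropped while its literals are kept.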

Data processing apparatus with selectively delayed transmission of operands

Patent number: 12236241

Abstract: A data processing apparatus comprises operand routing circuitry configured to prepare operands for processing, and a plurality of processing elements. Each processing element comprises receiving circuitry, processing circuitry, and transmitting circuitry. A group of coupled processing elements comprises a first processing element configured to receive operands from the operand routing circuitry and one or more further processing elements for which the receiving circuitry is coupled to the transmitting circuitry of another processing element in the group. The apparatus also comprises timing circuitry, configured to selectively delay transmission of operands within the group of coupled processing elements to cause operations performed by the group of coupled processing elements to be staggered.

Type: Grant

Filed: February 24, 2023

Date of Patent: February 25, 2025

Assignee: Arm Limited

Inventors: Xiaoyang Shen, Zichao Xie, Cédric Denis Robert Airaud, Grégorie Martin

Apparatuses and methods for speculative execution side channel mitigation

Patent number: 12236243

Abstract: Methods and apparatuses relating to mitigations for speculative execution side channels are described. Speculative execution hardware and environments that utilize the mitigations are also described. For example, three indirect branch control mechanisms and their associated hardware are discussed herein: (i) indirect branch restricted speculation (IBRS) to restrict speculation of indirect branches, (ii) single thread indirect branch predictors (STIBP) to prevent indirect branch predictions from being controlled by a sibling thread, and (iii) indirect branch predictor barrier (IBPB) to prevent indirect branch predictions after the barrier from being controlled by software executed before the barrier.

Type: Grant

Filed: April 24, 2023

Date of Patent: February 25, 2025

Assignee: Intel Corporation

Inventors: Jason W. Brandt, Deepak K. Gupta, Rodrigo Branco, Joseph Nuzman, Robert S. Chappell, Sergiu Ghetie, Wojciech Powiertowski, Jared W. Stark, IV, Ariel Sabba, Scott J. Cape, Hisham Shafi, Lihu Rappoport, Yair Berger, Scott P. Bobholz, Gilad Holzstein, Sagar V. Dalvi, Yogesh Bijlani

CPUs with capture queues to save and restore intermediate results and out-of-order results

Patent number: 12223327

Abstract: Techniques related to executing a plurality of instructions by a processor comprising a method for executing a plurality of instructions by a processor. The method comprises detecting a pipeline hazard based on one or more instructions provided for execution by an instruction execution pipeline, beginning execution of an instruction, of the one or more instructions on the instruction execution pipeline, stalling a portion of the instruction execution pipeline based on the detected pipeline hazard, storing a register state associated with the execution of the instruction based on the stalling, determining that the pipeline hazard has been resolved, and restoring the register state to the instruction execution pipeline based on the determination.

Type: Grant

Filed: October 16, 2023

Date of Patent: February 11, 2025

Assignee: Texas Instruments Incorporated

Inventors: Timothy D. Anderson, Duc Bui, Joseph Zbiciak, Reid E. Tatge

System having a hybrid threading processor, a hybrid threading fabric having configurable computing elements, and a hybrid interconnection network

Patent number: 12204363

Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. In a representative embodiment, a system includes an interconnection network, a processor, a host interface, and a configurable circuit cluster. The configurable circuit cluster may include a plurality of configurable circuits arranged in an array; an asynchronous packet network and a synchronous network coupled to each configurable circuit of the array; and a memory interface circuit and a dispatch interface circuit coupled to the asynchronous packet network and to the interconnection network. Each configurable circuit includes instruction or configuration memories for selection of a current data path configuration, a master synchronous network input, and a data path configuration for a next configurable circuit.

Type: Grant

Filed: January 15, 2024

Date of Patent: January 21, 2025

Assignee: Micron Technology, Inc.

Inventor: Tony M. Brewer

Exit history based branch prediction

Patent number: 12197917

Abstract: A computer-implemented method includes fetching a fetch-packet containing a first hyper-block from a first address of a memory. The fetch-packet contains a bitwise distance from an entry point of the first hyper-block to a predicted exit point. A first branch instruction of the first hyper-block is executed that corresponds to a first exit point. The first branch instruction includes an address corresponding to an entry point of a second hyper-block. Responsive to executing the first branch instruction, a bitwise distance from the entry point of the first hyper-block to the first exit point is stored. A program counter is moved from the first exit point of the first hyper-block to the entry point of the second hyper-block.

Type: Grant

Filed: June 27, 2022

Date of Patent: January 14, 2025

Assignee: Texas Instruments Incorporated

Inventors: Kai Chirca, Timothy D. Anderson, David E. Smith, Jr., Paul D. Gauvreau

Parallel instruction demarcator

Patent number: 12197915

Abstract: Parallel instruction demarcators and methods for parallel instruction demarcation are included, wherein an instruction syllable sequence comprising a plurality of instruction syllables is received and stored at an instruction buffer. It is determined, using one or more logic blocks arranged in a sequence, a size of an instruction and at least one boundary at which the instruction is demarcated. Additionally, using a controlling logic block a restart point is determined from where the sequence of instruction syllables is examined and demarcated into individual instructions.

Type: Grant

Filed: November 15, 2021

Date of Patent: January 14, 2025

Inventor: Sitaram Yadavalli

Suppressing allocation of registers for register renaming

Patent number: 12190117

Abstract: Techniques are provided for allocating registers for a processor. The techniques include identifying a first instruction of an instruction dispatch set that meets all register allocation suppression criteria of a first set of register allocation suppression criteria, suppressing register allocation for the first instruction, identifying a second instruction of the instruction dispatch set that does not meet all register allocation suppression criteria of a second set of register allocation suppression criteria, and allocating a register for the second instruction.

Type: Grant

Filed: November 26, 2019

Date of Patent: January 7, 2025

Assignee: Advanced Micro Devices, Inc.

Inventors: Neil N. Marketkar, Arun A. Nair

Microprocessor with time count based instruction execution and replay

Patent number: 12190116

Abstract: A processor includes a time counter and a time-resource matrix and provides a method for statically dispatching instructions if the resources are available based on data stored in the time-resource matrix, and wherein execution times for the instructions use a time count from the time counter to specify when the instructions may be provided to an execution pipeline. The execution times are based on fixed latency times of instructions with exception of the load instruction which is based on the data cache hit latency time. A data cache miss causes the load instruction and subsequent dependent instructions to be statically replayed at a later time using the same time count.

Type: Grant

Filed: April 5, 2022

Date of Patent: January 7, 2025

Assignee: Simplex Micro, Inc.

Inventor: Thang Minh Tran

Performance monitoring information informed register renaming

Patent number: 12182575

Abstract: A data processing apparatus comprises: a physical register array, prediction circuitry, register rename circuitry, and hardware execution circuitry. The physical register array comprises a plurality of sectors having one or more different access properties, each of the plurality of sectors having one or more different access properties compared to other sectors of the plurality of sectors, each sector of the plurality of sectors comprising at least one physical register. The prediction circuitry to predict, for a given instruction, a sector identifier identifying one of the sectors of the physical register array to be used for a destination register of the given instruction. The prediction circuitry is configured to select the sector identifier in dependence on prediction information learnt from performance monitoring information indicative of performance achieved for a sequence of instructions when using different sector identifiers for the given instruction.

Type: Grant

Filed: December 12, 2022

Date of Patent: December 31, 2024

Assignee: Arm Limited

Inventor: Mbou Eyole

Thread channel deactivation based on instruction cache misses

Patent number: 12164927

Abstract: Techniques are disclosed relating to instruction scheduling in the context of instruction cache misses. In some embodiments, first-stage scheduler circuitry is configured to assign threads to channels and second-stage scheduler circuitry is configured to assign an operation from a given channel to a given execution pipeline based on decode of an operation for that channel. In some embodiments, thread replacement circuitry is configured to, in response to an instruction cache miss for an operation of a first thread assigned to a first channel, deactivate the first thread from the first channel.

Type: Grant

Filed: November 10, 2022

Date of Patent: December 10, 2024

Assignee: Apple Inc.

Inventors: Justin Friesenhahn, Benjiman L. Goodman

Instruction set architecture for neural network quantization and packing

Patent number: 12159140

Abstract: An electronic device receives a single instruction to apply a neural network operation to a set of M-bit elements stored in one or more input vector registers to initiate a sequence of computational operations related to a neural network. In response to the single instruction, the electronic device implements the neural network operation on the set of M-bit elements to generate a set of P-bit elements by obtaining the set of M-bit elements from the one or more input vector registers, quantizing each of the set of M-bit elements from M bits to P bits, and packing the set of P-bit elements into an output vector register. P is smaller than M. In some embodiments, the neural network operation is a quantization operation including at least a multiplication with a quantization factor and an addition with a zero point.

Type: Grant

Filed: April 28, 2022

Date of Patent: December 3, 2024

Assignee: QUALCOMM Incorporated

Inventors: Srijesh Sudarsanan, Deepak Mathew, Marc Hoffman, Sundar Rajan Balasubramanian, Mansi Jain, James Lee, Gerald Sweeney
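The quantize-and-pack operation in the abstract of patent 12159140 can be illustrated with a short Python sketch. The abstract mentions a multiplication with a quantization factor and an addition with a zero point. The rounding mode, the saturation to an unsigned P-bit range, and the lane order (element 0 in the least significant bits) are assumptions made here for illustration, as are the function and parameter names.

```python
from typing import List


def quantize_and_pack(elements: List[int], scale: float,
                      zero_point: int, p_bits: int) -> int:
    """Quantize wide elements to P bits each and pack them into one integer.

    Assumptions: Python's round() is used for rounding, results saturate to
    the unsigned range [0, 2**p_bits - 1], and element i occupies bits
    [i * p_bits, (i + 1) * p_bits) of the packed value.
    """
    max_value = (1 << p_bits) - 1
    packed = 0
    for lane, value in enumerate(elements):
        quantized = int(round(value * scale)) + zero_point
        quantized = max(0, min(max_value, quantized))  # saturate to P bits
        packed |= quantized << (lane * p_bits)
    return packed


def unpack(packed: int, count: int, p_bits: int) -> List[int]:
    """Split a packed integer back into `count` P-bit lanes."""
    lane_mask = (1 << p_bits) - 1
    return [(packed >> (lane * p_bits)) & lane_mask for lane in range(count)]


# Four wide inputs quantized to 8 bits each with scale 0.25 and zero point 8.
packed = quantize_and_pack([0, 100, 200, 1000], scale=0.25,
                           zero_point=8, p_bits=8)
lanes = unpack(packed, count=4, p_bits=8)
```

Here the lanes come out as `[8, 33, 58, 255]`: the last input quantizes to 258 and saturates to the 8-bit maximum.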

Processor with macro-instruction achieving zero-latency data movement

Patent number: 12153921

Abstract: An apparatus includes an array processor to process array data in response to a set of macro-instructions. A macro-instruction in the set of macro-instructions performs loop operations, array iteration operations, and/or arithmetic logic unit (ALU) operations.

Type: Grant

Filed: June 28, 2021

Date of Patent: November 26, 2024

Assignee: Silicon Laboratories Inc.

Inventors: Matthew Brandon Gately, Eric Jonathan Deal, Mark Willard Johnson, Daniel Thomas Riedler

Processor that executes instruction that specifies instruction concatenation and atomicity

Patent number: 12153929

Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to determine, based on a first field of a first instruction, a number of additional instructions to execute in conjunction with the first instruction and prior to execution of the first instruction. The at least one of the execution units is further configured to determine, based on a second field of the first instruction, a subset of the additional instructions to execute atomically.

Type: Grant

Filed: November 17, 2021

Date of Patent: November 26, 2024

Assignee: Texas Instruments Incorporated

Inventors: Horst Diewald, Johann Zipperer

Register reorganisation by changing a mapping between logical and physical registers based on upcoming operations and an incomplete set of connections between the physical registers and execution units

Patent number: 12141583

Abstract: An apparatus has processing circuitry with execution units to perform operations, physical registers to store data, and forwarding circuitry to forward the data from the physical registers to the execution units. The forwarding circuitry provides an incomplete set of connections between the physical registers and the execution units such that, for each of at least some of the physical registers, the physical register is connected to only a subset of the execution units. The apparatus also has register renaming circuitry to map logical registers identified by the operations to respective physical registers and register reorganisation circuitry to monitor upcoming operations and to determine, based on the upcoming operations and the connections provided by the forwarding circuitry, whether to perform a register reorganisation procedure to change a mapping between the logical registers and the physical registers.

Type: Grant

Filed: September 13, 2022

Date of Patent: November 12, 2024

Assignee: Arm Limited

Inventors: Xiaoyang Shen, Zichao Xie

Processing-in-memory (PIM) system that changes between multiplication/accumulation (MAC) and memory modes and operating methods of the PIM system

Patent number: 12136470

Abstract: A processing-in-memory (PIM) system includes a host and a PIM controller. The host is configured to generate a request for a memory access operation or a multiplication/accumulation (MAC) operation of a PIM device and also to generate a mode definition signal defining an operation mode of the PIM device. The PIM controller is configured to generate a command corresponding to the request to control the memory access operation or the MAC operation of the PIM device. When the operation mode of the PIM device is inconsistent with a mode set defined by the mode definition signal, the PIM controller controls the memory access operation or the MAC operation of the PIM device after changing the operation mode of the PIM device.

Type: Grant

Filed: January 7, 2021

Date of Patent: November 5, 2024

Assignee: SK hynix Inc.

Inventor: Choung Ki Song

One-dimensional zero padding in a stream of matrix elements

Patent number: 12118358

Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a specified width for a selected dimension of the array. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. When the selected dimension in the stream of vectors exceeds the specified width, the streaming engine inserts null elements into each portion of a respective vector for the selected dimension that exceeds the specified width in the stream of vectors. Stream vectors that are completely null are formed by the streaming engine without accessing the system memory for respective data.

Type: Grant

Filed: January 25, 2022

Date of Patent: October 15, 2024

Assignee: Texas Instruments Incorporated

Inventors: Son Hung Tran, Shyam Jagannathan, Timothy David Anderson
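The zero-padding behaviour in the abstract of patent 12118358 can be approximated in Python for a single row of one dimension. This is a simplified sketch: memory is modelled as a plain list, the vector length is a free parameter, and the function name, parameter names, and fetch counter are hypothetical additions used only to show that fully null vectors need no memory access.

```python
from typing import List, Tuple


def stream_row(memory: List[int], base: int, dim_size: int,
               width: int, vec_len: int) -> Tuple[List[List[int]], int]:
    """Form vectors for one row, zero-filling positions at or beyond `width`.

    Returns the vectors and the number of elements fetched from `memory`.
    Assumption: positions past the specified width become 0 and are never
    read from memory.
    """
    vectors: List[List[int]] = []
    fetches = 0
    for start in range(0, dim_size, vec_len):
        vector: List[int] = []
        for position in range(start, min(start + vec_len, dim_size)):
            if position < width:
                vector.append(memory[base + position])  # real data element
                fetches += 1
            else:
                vector.append(0)                        # inserted null element
        vectors.append(vector)
    return vectors, fetches


# A row of 8 elements with a specified width of 3, streamed as 4-element vectors.
vectors, fetches = stream_row(memory=list(range(1, 11)), base=0,
                              dim_size=8, width=3, vec_len=4)
```

In this example the first vector is `[1, 2, 3, 0]` and the second is entirely null, so only three elements are fetched from memory.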
