Instruction Issuing Patents (Class 712/214)

Simultaneous issuance of multiple instructions (Class 712/215)

Latency management in synchronization events

Patent number: 11914524

Abstract: An electronic device includes one or more processors for executing one or more virtual machines. In response to a request for initiating a synchronization event, a processor identifies a subset of speculative memory access requests in one or more memory access request queues. Automatically and in accordance with the identifying, the processor purges translations associated with the subset of speculative memory access requests. Subsequent to the purging, the processor initiates the synchronization event. In some implementations, memory access completion is forced in response to a context synchronization event that corresponds to a termination of a first application, a termination of a first virtual machine, or a system call for updating a system register. Alternatively, in some implementations, memory access completion is forced in an operating system level or an application level in response to a data synchronization event that is initiated on a hypervisor layer or a firmware layer.

Type: Grant

Filed: March 1, 2022

Date of Patent: February 27, 2024

Assignee: QUALCOMM Incorporated

Inventors: Adrian Montero, Huzefa Sanjeliwala, Paul Kitchin, Prarthna Santhanakrishnan, Conrado Blasco, Pradeep Kanapathipillai
Pipelines for secure multithread execution

Patent number: 11886882

Abstract: Described herein are systems and methods for secure multithread execution. For example, some methods include fetching an instruction of a first thread from a memory into a processor pipeline that is configured to execute instructions from two or more threads in parallel using execution units of the processor pipeline; detecting that the instruction has been designated as a sensitive instruction; responsive to detection of the sensitive instruction, disabling execution of instructions of threads other than the first thread in the processor pipeline during execution of the sensitive instruction by an execution unit of the processor pipeline; executing the sensitive instruction using an execution unit of the processor pipeline; and, responsive to completion of execution of the sensitive instruction, enabling execution of instructions of threads other than the first thread in the processor pipeline.

Type: Grant

Filed: April 5, 2022

Date of Patent: January 30, 2024

Assignee: Marvell Asia Pte, Ltd.

Inventor: Shubhendu Sekhar Mukherjee
Delayed snoop for improved multi-process false sharing parallel thread performance

Patent number: 11822786

Abstract: Techniques for maintaining cache coherency comprising storing data blocks associated with a main process in a cache line of a main cache memory, storing a first local copy of the data blocks in a first local cache memory of a first processor, storing a second local copy of the set of data blocks in a second local cache memory of a second processor executing a first child process of the main process to generate first output data, writing the first output data to the first data block of the first local copy as a write through, writing the first output data to the first data block of the main cache memory as a part of the write through, transmitting an invalidate request to the second local cache memory, marking the second local copy of the set of data blocks as delayed, and transmitting an acknowledgment to the invalidate request.

Type: Grant

Filed: February 1, 2022

Date of Patent: November 21, 2023

Assignee: Texas Instruments Incorporated

Inventors: Kai Chirca, Timothy David Anderson
Instruction format and instruction set architecture for tensor streaming processor

Patent number: 11822510

Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.

Type: Grant

Filed: March 1, 2022

Date of Patent: November 21, 2023

Assignee: Groq, Inc.

Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
Dynamic allocation of arithmetic logic units for vectorized operations

Patent number: 11816061

Abstract: A system includes a processing device that includes a vector arithmetic logic unit comprising a plurality of arithmetic logic units (ALUs), and a first processor core operatively coupled to the vector arithmetic logic unit, the processing device to receive a first vector instruction from the first processor core, wherein the first vector instruction specifies at least one first input vector having a first vector length, identify a first subset of the ALUs in view of the first vector length and one or more allocation criteria, execute, using the first subset of the set of ALUs, one or more first ALU operations specified by the first vector instruction, wherein the vector arithmetic logic unit executes the first ALU operations in parallel with one or more second ALU operations specified by a second vector instruction received from a second processor core.

Type: Grant

Filed: December 18, 2020

Date of Patent: November 14, 2023

Assignee: Red Hat, Inc.

Inventor: Ulrich Drepper
Reach matrix scheduler circuit for scheduling instructions to be executed in a processor

Patent number: 11803389

Abstract: A reach matrix scheduler circuit for scheduling instructions to be executed in a processor is disclosed. The scheduler circuit includes an N×R matrix wake-up circuit, where ‘N’ is the instruction window size of the scheduler circuit, and ‘R’ is the “reach” within the instruction window of the matrix wake-up circuit, with ‘R’ being less than ‘N’. A grant line associated with each instruction request entry in the N×R matrix wake-up circuit is coupled to ‘R’ other instruction entries among the ‘N’ instruction entries. When a producer instruction in an instruction request entry is ready for issuance, the grant line associated with the instruction request entry is activated so that any other instruction entries coupled to the grant line (i.e., within the “reach” of the instruction request entry) that consume the produced value generated by the producer instruction are “woken-up” and subsequently indicated as ready to be issued.
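
The wake-up behavior described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the patented circuit: the function name `wake_up`, the `consumers_of` mapping, and the wiring rule (each grant line reaches the R entries that follow it, wrapping around the window) are assumptions, since the abstract only states that each grant line is coupled to R other entries.

```python
from __future__ import annotations


def wake_up(issued: int, consumers_of: dict[int, set[int]], n: int, r: int) -> set[int]:
    """Return the window entries woken when entry `issued` is granted.

    Assumption: the grant line of entry i is wired to entries
    i+1 .. i+R (modulo N). Consumers outside that reach are not woken
    by this grant line.
    """
    if not 0 < r < n:
        raise ValueError("reach R must be positive and less than window size N")
    reach = {(issued + k) % n for k in range(1, r + 1)}
    return reach & consumers_of.get(issued, set())


# Entry 0 produces a value consumed by entries 1, 2 and 5 in an 8-entry window.
woken = wake_up(0, {0: {1, 2, 5}}, n=8, r=3)
```

With a reach of 3, entry 5 is outside the grant line of entry 0, so under these assumptions it would have to be woken by some other mechanism.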

Type: Grant

Filed: January 9, 2020

Date of Patent: October 31, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yusuf Cagatay Tekmen, Rodney Wayne Smith, Douglas C. Burger, Gagan Gupta, Kiran Ravi Seth
Apparatuses and methods for ordering bits in a memory device

Patent number: 11782721

Abstract: Systems, apparatuses, and methods for organizing bits in a memory device are described. In a number of embodiments, an apparatus can include an array of memory cells, a data interface, a multiplexer coupled between the array of memory cells and the data interface, and a controller coupled to the array of memory cells, the controller configured to cause the apparatus to latch bits associated with a row of memory cells in the array in a number of sense amplifiers in a prefetch operation and send the bits from the sense amplifiers, through a multiplexer, to a data interface, which may include or be referred to as DQs. The bits may be sent to the DQs in a particular order that may correspond to a particular matrix configuration and may thus facilitate or reduce the complexity of arithmetic operations performed on the data.

Type: Grant

Filed: February 25, 2022

Date of Patent: October 10, 2023

Assignee: Micron Technology, Inc.

Inventors: Glen E. Hush, Aaron P. Boehm, Fa-Long Luo
Scheduling tasks in a processor

Patent number: 11755365

Abstract: A method of scheduling tasks in a processor comprises receiving a plurality of tasks that are ready to be executed, i.e. all their dependencies have been met and all the resources required to execute the task are available, and adding the received tasks to a task queue (or “task pool”). The number of tasks that are executing is monitored and in response to determining that an additional task can be executed by the processor, a task is selected from the task pool based at least in part on a comparison of indications of resources used by tasks being executed and indications of resources used by individual tasks in the task pool and the selected task is then sent for execution.
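
A rough sketch of resource-aware selection from a task pool follows. The abstract does not say how the resource indications are compared, so the policy here (prefer the pooled task whose resource is currently used by the fewest executing tasks) and the names `select_task`, `pool`, and `running` are assumptions made for illustration.

```python
from collections import Counter


def select_task(pool, running):
    """Remove and return one task from `pool`.

    pool:    list of (task_id, resource_tag) tuples, all ready to run
    running: list of resource_tag values for tasks already executing

    Assumed policy: pick the task whose resource is least used right now;
    ties go to the task that entered the pool first.
    """
    in_use = Counter(running)
    best = min(pool, key=lambda task: in_use[task[1]])
    pool.remove(best)
    return best


pool = [("a", "alu"), ("b", "tex"), ("c", "alu")]
chosen = select_task(pool, running=["alu", "alu", "tex"])
```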

Type: Grant

Filed: December 23, 2019

Date of Patent: September 12, 2023

Assignee: Imagination Technologies Limited

Inventors: Isuru Herath, Richard Broadhurst
Counting elements in neural network input data

Patent number: 11734002

Abstract: The present disclosure provides a counting device and counting method. The device includes a storage unit, a counting unit, and a register unit, where the storage unit may be connected to the counting unit for storing input data to be counted and storing a number of elements satisfying a given condition in the input data after counting; the register unit may be configured to store an address where input data to be counted is stored in the storage unit; and the counting unit may be connected to the register unit, and may be configured to acquire a counting instruction, read a storage address of the input data to be counted in the register unit according to the counting instruction, acquire corresponding input data to be counted in the storage unit, perform statistical counting on a number of elements in the input data to be counted that satisfy the given condition, and obtain a counting result.

Type: Grant

Filed: November 27, 2019

Date of Patent: August 22, 2023

Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD

Inventors: Tianshi Chen, Jie Wei, Tian Zhi, Zai Wang
Method and processor system for executing a TELT instruction to access a data item during execution of an atomic primitive

Patent number: 11681567

Abstract: The present disclosure relates to a method for a computer system comprising a plurality of processor cores including a first processor core and a second processor core, wherein a data item is exclusively assigned to the first processor core, of the plurality of processor cores, for executing an atomic primitive by the first processor core. The method includes receiving by the first processor core, from the second processor core, a request for accessing the data item, and in response to determining by the first processor core that the executing of the atomic primitive is not completed by the first processor core, returning a rejection message to the second processor core.

Type: Grant

Filed: May 9, 2019

Date of Patent: June 20, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ralf Winkelmann, Michael Fee, Matthias Klein, Carsten Otte, Edward W. Chencinski, Hanno Eichelberger
Method, apparatus, and system for reducing live readiness calculations in reservation stations

Patent number: 11669333

Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.
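
The common picker can be illustrated with a small model. The abstract leaves the picking policy open, so this sketch assumes an oldest-first policy; the function name `pick` and the `(sequence_number, name)` entry layout are hypothetical.

```python
from collections import deque


def pick(fast_pool: deque, slow_pool: deque):
    """Choose one instruction for dispatch from two scheduling pools.

    Each pool holds (sequence_number, name) tuples, oldest first.
    The head of each non-empty pool is a candidate; the assumed policy
    dispatches the candidate with the lower sequence number.
    Returns None when both pools are empty.
    """
    candidates = [pool for pool in (fast_pool, slow_pool) if pool]
    if not candidates:
        return None
    winner = min(candidates, key=lambda pool: pool[0][0])
    return winner.popleft()


fast = deque([(3, "add")])
slow = deque([(1, "load"), (4, "mul")])
first = pick(fast, slow)
```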

Type: Grant

Filed: April 26, 2018

Date of Patent: June 6, 2023

Assignee: Qualcomm Incorporated

Inventors: Rodney Wayne Smith, Raghavan Madhavan, Luke Yen, Shivam Priyadarshi, Yusuf Cagatay Tekmen
Cache replacement mechanisms for speculative execution

Patent number: 11663130

Abstract: Described herein are systems and methods for cache replacement mechanisms for speculative execution. For example, some systems include a buffer comprising entries that are each configured to store a cache line of data and a tag that includes an indication of a status of the cache line stored in the entry, in an integrated circuit that is configured to: responsive to a cache miss caused by a load instruction that is speculatively executed by a processor pipeline, load a cache line of data corresponding to the cache miss into a first entry of the buffer and update the tag of the first entry to indicate the status is speculative; responsive to the load instruction being retired by the processor pipeline, update the tag to indicate the status is validated; and, responsive to the load instruction being flushed from the processor pipeline, update the tag to indicate the status is cancelled.
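
The three tag states named in this abstract form a small state machine, sketched below. The class and method names (`SpecBuffer`, `on_speculative_miss`, `on_retire`, `on_flush`) are hypothetical, and the sketch ignores capacity and replacement, which the abstract does not detail.

```python
from enum import Enum


class Status(Enum):
    SPECULATIVE = "speculative"
    VALIDATED = "validated"
    CANCELLED = "cancelled"


class SpecBuffer:
    """Toy model of a buffer whose entries carry a status tag."""

    def __init__(self):
        self.entries = {}  # line address -> [data, Status]

    def on_speculative_miss(self, addr, data):
        # A speculatively executed load missed: fill an entry, tag it speculative.
        self.entries[addr] = [data, Status.SPECULATIVE]

    def on_retire(self, addr):
        # The load retired, so the line is safe to keep.
        self.entries[addr][1] = Status.VALIDATED

    def on_flush(self, addr):
        # The load was flushed, so the line is marked cancelled.
        self.entries[addr][1] = Status.CANCELLED

    def status(self, addr):
        return self.entries[addr][1]


buf = SpecBuffer()
buf.on_speculative_miss(0x40, b"line-data")
buf.on_retire(0x40)
```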

Type: Grant

Filed: April 30, 2021

Date of Patent: May 30, 2023

Assignee: Marvell Asia Pte, Ltd.

Inventor: Rabin Sugumar
Pre-staged instruction registers for variable length instruction set machine

Patent number: 11599358

Abstract: Methods and systems relating to improved processing architectures with pre-staged instructions are disclosed herein. A disclosed processor includes an instruction memory, at least one functional processing unit, a bus, a set of instruction registers configured to be loaded, using the bus, with a set of pre-staged instructions from the instruction memory, and a logic circuit configured to provide the set of pre-staged instructions from the set of instruction registers to the at least one functional processing unit in response to receiving an instruction from the instruction memory.

Type: Grant

Filed: August 12, 2021

Date of Patent: March 7, 2023

Assignee: Tenstorrent Inc.

Inventors: Miles Robert Dooley, Milos Trajkovic, Rakesh Shaji Lal, Stanislav Sokorac
Fine resolution on-chip voltage simulation to prevent under voltage conditions

Patent number: 11586267

Abstract: Embodiments of the present disclosure relate to managing power provided to a semiconductor circuit to prevent undervoltage conditions. A measured voltage value describing a measured supply voltage at a first subcircuit of a semiconductor circuit can be received, the measured voltage value having a first resolution. A selected metric indicative of a supply voltage present at the first subcircuit can be received, the selected metric having a second resolution higher than the first resolution. The selected metric is calibrated to obtain a calibrated metric when a transition of the measured voltage value occurs.

Type: Grant

Filed: December 19, 2018

Date of Patent: February 21, 2023

Assignee: International Business Machines Corporation

Inventors: Thomas Strach, Preetham M. Lobo, Tobias Webel
System, apparatus and method for configurable control of asymmetric multi-threading (SMT) on a per core basis

Patent number: 11579944

Abstract: In one embodiment, a processor includes: a plurality of cores each comprising a multi-threaded core to concurrently execute a plurality of threads; and a control circuit to concurrently enable at least one of the plurality of cores to operate in a single-threaded mode and at least one other of the plurality of cores to operate in a multi-threaded mode. Other embodiments are described and claimed.

Type: Grant

Filed: November 14, 2018

Date of Patent: February 14, 2023

Assignee: Intel Corporation

Inventors: Daniel J. Ragland, Guy M. Therien, Ankush Varma, Eric J. DeHaemer, David T. Mayo, Ariel Gur, Yoav Ben-Raphael, Mark P. Seconi
Pre-staged instruction registers for variable length instruction set machine

Patent number: 11567764

Abstract: Methods and systems relating to improved processing architectures with pre-staged instructions are disclosed herein. A disclosed processor includes an instruction memory, at least one functional processing unit, a bus, a set of instruction registers configured to be loaded, using the bus, with a set of pre-staged instructions from the instruction memory, and a logic circuit configured to provide the set of pre-staged instructions from the set of instruction registers to the at least one functional processing unit in response to receiving an instruction from the instruction memory.

Type: Grant

Filed: August 12, 2021

Date of Patent: January 31, 2023

Assignee: Tenstorrent Inc.

Inventors: Miles Robert Dooley, Milos Trajkovic, Rakesh Shaji Lal, Stanislav Sokorac
Apparatus and method for generating and processing a trace stream indicative of instruction execution by processing circuitry

Patent number: 11561882

Abstract: An apparatus and method are provided for generating and processing a trace stream indicative of instruction execution by processing circuitry. An apparatus has an input interface for receiving instruction execution information from the processing circuitry indicative of a sequence of instructions executed by the processing circuitry, and trace generation circuitry for generating from the instruction execution information a trace stream comprising a plurality of trace elements indicative of execution by the processing circuitry of instruction flow changing instructions within the sequence.

Type: Grant

Filed: August 9, 2017

Date of Patent: January 24, 2023

Assignee: Arm Limited

Inventors: François Christopher Jacques Botman, Thomas Christopher Grocutt, John Michael Horley, Michael John Williams, Michael John Gibbs
Variable pipeline length in a barrel-multithreaded processor

Patent number: 11526361

Abstract: Devices and techniques for variable pipeline length in a barrel-multithreaded processor are described herein. A completion time for an instruction can be determined prior to insertion into a pipeline of a processor. A conflict between the instruction and a different instruction based on the completion time can be detected. Here, the different instruction is already in the pipeline and the conflict detected when the completion time equals the previously determined completion time for the different instruction. A difference between the completion time and an unconflicted completion time can then be calculated and completion of the instruction delayed by the difference.
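
The conflict check and delay calculation can be sketched as follows. The abstract does not say how the unconflicted completion time is found, so searching forward one cycle at a time is an assumption, as are the names `schedule_completion` and `reserved`.

```python
from __future__ import annotations


def schedule_completion(completion_time: int, reserved: set[int]) -> int:
    """Reserve a completion cycle and return the delay applied, in cycles.

    reserved: completion cycles already claimed by instructions in the pipeline.
    A conflict exists when `completion_time` equals a reserved cycle.
    Assumption: the unconflicted completion time is the next free cycle.
    """
    slot = completion_time
    while slot in reserved:
        slot += 1
    reserved.add(slot)
    return slot - completion_time


in_flight = {10, 11}
delay = schedule_completion(10, in_flight)  # cycles 10 and 11 are taken
```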

Type: Grant

Filed: October 20, 2020

Date of Patent: December 13, 2022

Assignee: Micron Technology, Inc.

Inventor: Tony Brewer
Compressing micro-operations in scheduler entries in a processor

Patent number: 11513802

Abstract: An electronic device includes a processor having a micro-operation queue, multiple scheduler entries, and scheduler compression logic. When a pair of micro-operations in the micro-operation queue is compressible in accordance with one or more compressibility rules, the scheduler compression logic acquires the pair of micro-operations from the micro-operation queue and stores information from both micro-operations of the pair of micro-operations into different portions in a single scheduler entry. In this way, the scheduler compression logic compresses the pair of micro-operations into the single scheduler entry.
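
A minimal sketch of pairing micro-operations into scheduler entries is shown below. The abstract does not list the compressibility rules, so the rule used in the example (a pair may share an entry if together they need at most three source operands) is purely hypothetical, as are the names `fill_scheduler` and `compressible`.

```python
def fill_scheduler(uop_queue, compressible):
    """Pack micro-ops into scheduler entries, two per entry when allowed.

    uop_queue:    list of micro-ops in program order
    compressible: predicate taking two adjacent micro-ops
    Returns a list of (first, second_or_None) entries.
    """
    entries = []
    i = 0
    while i < len(uop_queue):
        if i + 1 < len(uop_queue) and compressible(uop_queue[i], uop_queue[i + 1]):
            # Both micro-ops occupy different portions of one entry.
            entries.append((uop_queue[i], uop_queue[i + 1]))
            i += 2
        else:
            entries.append((uop_queue[i], None))
            i += 1
    return entries


# Micro-ops as (name, number_of_source_operands); the rule is an assumption.
uops = [("add", 2), ("not", 1), ("mul", 2), ("sub", 2)]
packed = fill_scheduler(uops, lambda a, b: a[1] + b[1] <= 3)
```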

Type: Grant

Filed: September 27, 2020

Date of Patent: November 29, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael W. Boyer, John Kalamatianos, Pritam Majumder
Memory management method and apparatus

Patent number: 11507412

Abstract: A disclosed example apparatus includes memory; and processor circuitry to: identify a lock-protected section of instructions in the memory; replace lock/unlock instructions with transactional lock acquire and transactional lock release instructions to form a transactional process; and execute the transactional process in a speculative execution.

Type: Grant

Filed: April 28, 2020

Date of Patent: November 22, 2022

Assignee: Intel Corporation

Inventors: Keqiang Wu, Jiwei Lu, Koichi Yamada, Yong-Fong Lee
Processor device for executing SIMD instructions

Patent number: 11500632

Abstract: In a processor device according to the present invention, a memory access unit reads data to be processed from an external memory and writes the data to a first register group that a plurality of processors does not access among a plurality of register groups. A control unit sequentially makes each of the plurality of processors implement a same instruction, in parallel with changing an address of a register group that stores the data to be processed. A scheduler, based on specified scenario information, specifies an instruction to be implemented and a register group to be accessed for the plurality of processors, and specifies a register group to be written to among the plurality of register groups and data to be processed that is to be written for the memory access unit.

Type: Grant

Filed: April 23, 2019

Date of Patent: November 15, 2022

Assignee: ArchiTek Corporation

Inventor: Shuichi Takada
Atomic operations in a large scale distributed computing network

Patent number: 11481216

Abstract: Techniques for executing an atomic command in a distributed computing network are provided. A core cluster, including a plurality of processing cores that do not natively issue atomic commands to the distributed computing network, is coupled to a translation unit. To issue an atomic command, a core requests a location in the translation unit to write an opcode and operands for the atomic command. The translation unit identifies a location (a “window”) that is not in use by another atomic command and indicates the location to the processing core. The processing core writes the opcode and operands into the window and indicates to the translation unit that the atomic command is ready. The translation unit generates an atomic command and issues the command to the distributed computing network for execution. After execution, the distributed computing network provides a response to the translation unit, which provides that response to the core.

Type: Grant

Filed: September 10, 2018

Date of Patent: October 25, 2022

Assignee: Advanced Micro Devices, Inc.

Inventor: Stanley Ames Lackey, Jr.
Method and apparatus for executing instructions including a blocking instruction generated in response to determining that there is data dependence between instructions

Patent number: 11422817

Abstract: A method and apparatus for executing an instruction are provided. In the method, an instruction queue is first generated, and an instruction from the instruction queue in preset order is acquired. Then, a sending step including: determining a type of the acquired instruction; determining, in response to determining that the acquired instruction is an arithmetic instruction, an executing component for executing the arithmetic instruction from an executing component set; and sending the arithmetic instruction to the determined executing component is executed. Last, in response to determining that the acquired instruction is a blocking instruction, a next instruction is acquired after receiving a signal for instructing an instruction associated with the blocking instruction being completely executed.
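
The dispatch loop described in this abstract can be modeled as below. The instruction encoding (`("arith", name)` and `("block", name)` tuples), the round-robin choice of executing component, and the `is_done` callback standing in for the completion signal are all assumptions for illustration.

```python
def dispatch(queue, n_components, is_done):
    """Walk the instruction queue in order until it ends or a block stalls.

    queue:        list of ("arith", name) or ("block", awaited_name) tuples
    n_components: size of the executing component set
    is_done:      callable reporting whether the awaited instruction finished
    Returns (sent, next_index), where sent lists (name, component) pairs and
    next_index is where acquisition stopped.
    """
    sent = []
    for index, (kind, arg) in enumerate(queue):
        if kind == "arith":
            # Assumed selection rule: round-robin over the component set.
            sent.append((arg, len(sent) % n_components))
        elif kind == "block" and not is_done(arg):
            return sent, index  # stall: the next instruction is not acquired
    return sent, len(queue)


program = [("arith", "a"), ("arith", "b"), ("block", "a"), ("arith", "c")]
sent, stopped_at = dispatch(program, 2, is_done=lambda name: False)
```

Calling `dispatch` again with an `is_done` that reports completion lets the walk continue past the blocking instruction.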

Type: Grant

Filed: July 1, 2019

Date of Patent: August 23, 2022

Assignee: Kunlunxin Technology (Beijing) Company Limited

Inventors: Jing Wang, Wei Qi, Yupeng Li, Xiaozhang Gong
Reach-based explicit dataflow processors, and related computer-readable media and methods

Patent number: 11392537

Abstract: Exemplary reach-based explicit dataflow processors and related computer-readable media and methods. The reach-based explicit dataflow processors are configured to support execution of producer instructions encoded with explicit naming of consumer instructions intended to consume the values produced by the producer instructions. The reach-based explicit dataflow processors are configured to make available produced values as inputs to explicitly named consumer instructions as a result of processing producer instructions. The reach-based explicit dataflow processors support execution of a producer instruction that explicitly names a consumer instruction based on using the producer instruction as a relative reference point from the producer instruction.

Type: Grant

Filed: March 18, 2019

Date of Patent: July 19, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gagan Gupta, Michael Scott McIlvaine, Rodney Wayne Smith, Thomas Philip Speier, David Tennyson Harper, III
Predicting load-based control independent (CI) register data independent (DI) (CIRDI) instructions as CI memory data dependent (DD) (CIMDD) instructions for replay in speculative misprediction recovery in a processor

Patent number: 11392387

Abstract: Predicting load-based control independent (CI), register data independent (DI) (CIRDI) instructions as CI memory data dependent (DD) (CIMDD) instructions for replay in speculative misprediction recovery in a processor. The processor predicts if a source of a load-based CIRDI instruction will be forwarded by a store-based instruction (i.e. “store-forwarded”). If a load-based CIRDI instruction is predicted as store-forwarded, the load-based CIRDI instruction is considered a CIMDD instruction and is replayed in misprediction recovery. If a load-based CIRDI instruction is not predicted as store-forwarded, the processor considers such load-based CIRDI instruction as a pending load-based CIRDI instruction. If this pending load-based CIRDI instruction is determined in execution to be store-forwarded, the instruction pipeline is flushed and the pending load-based CIRDI instruction is also replayed in misprediction recovery.

Type: Grant

Filed: November 4, 2020

Date of Patent: July 19, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Vignyan Reddy Kothinti Naresh, Arthur Perais, Rami Mohammad Al Sheikh, Shivam Priyadarshi
Scheduling tasks using targeted pipelines

Patent number: 11366691

Abstract: A method of scheduling instructions within a parallel processing unit is described. The method comprises decoding, in an instruction decoder, an instruction in a scheduled task in an active state, and checking, by an instruction controller, if an ALU targeted by the decoded instruction is a primary instruction pipeline. If the targeted ALU is a primary instruction pipeline, a list associated with the primary instruction pipeline is checked to determine whether the scheduled task is already included in the list. If the scheduled task is already included in the list, the decoded instruction is sent to the primary instruction pipeline.

Type: Grant

Filed: December 1, 2020

Date of Patent: June 21, 2022

Assignee: Imagination Technologies Limited

Inventors: Simon Nield, Yoong-Chert Foo, Adam de Grasse, Luca Iuliano
Controlling the number of powered vector lanes via a register field

Patent number: 11360536

Abstract: The vector data path is divided into smaller vector lanes. A register such as a memory mapped control register stores a vector lane number (VLX) indicating the number of vector lanes to be powered. A decoder converts this VLX into a vector lane control word, each bit controlling the ON or OFF state of the corresponding vector lane. This number of contiguous least significant vector lanes are powered. In the preferred embodiment the stored data VLX indicates that 2^VLX contiguous least significant vector lanes are to be powered. Thus the number of vector lanes powered is limited to an integral power of 2. This manner of coding produces a very compact controlling bit field while obtaining substantially all the power saving advantage of individually controlling the power of all vector lanes.
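
The decoding step can be shown with a short worked example: a VLX value selects two to the power VLX powered lanes, and the control word has one bit per lane with the least significant lanes switched on. The function name `lane_control_word` and the default of eight lanes are assumptions; the abstract does not give a lane count.

```python
def lane_control_word(vlx: int, total_lanes: int = 8) -> int:
    """Decode a VLX field into a per-lane power control word.

    Bit i of the result is 1 when vector lane i is powered.
    2**vlx contiguous least significant lanes are powered.
    """
    powered = 1 << vlx
    if vlx < 0 or powered > total_lanes:
        raise ValueError("VLX selects more lanes than exist")
    return (1 << powered) - 1


word = lane_control_word(2)  # 4 lanes powered -> 0b00001111
```

Under the eight-lane assumption, VLX needs only two bits to cover 1, 2, 4, or 8 powered lanes, which is the compact coding the abstract refers to.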

Type: Grant

Filed: August 3, 2020

Date of Patent: June 14, 2022

Assignee: Texas Instruments Incorporated

Inventors: Timothy David Anderson, Duc Quang Bui
Method and apparatus for balancing binary instruction burstization and chaining

Patent number: 11327760

Abstract: A method for grouping computer instructions includes receiving a set of computer instructions, grouping the set of computer instructions by register dependencies, identifying a plurality of single-definition-use flow (SDF) bundles based on a burstization criteria and a chaining criteria; and based on the SDF bundles, transforming the set of computer instructions. The transformation may include splitting one of the set of computer instructions and setting a burst parameter for the one of the set of computer instruction. The transformation may include grouping a plurality of the set of computer instructions and replacing a pair of register file accesses with a pair of temporary register accesses.

Type: Grant

Filed: April 9, 2020

Date of Patent: May 10, 2022

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Andrew Siu Doug Lee, Ahmed Mohammed Elshafiey Mohammed Eltantawy
Apparatus and method for operating an issue queue

Patent number: 11327791

Abstract: An apparatus provides an issue queue having a first section and a second section. Each entry in each section stores operation information identifying an operation to be performed. Allocation circuitry allocates each item of received operation information to an entry in the first section or the second section. Selection circuitry selects from the issue queue, during a given selection iteration, an operation from amongst the operations whose required source operands are available. Availability update circuitry updates source operand availability for each entry whose operation information identifies as a source operand a destination operand of the selected operation in the given selection iteration. A deferral mechanism inhibits from selection, during a next selection iteration, any operation associated with an entry in the second section whose source operands are now available due to that operation having as a source operand the destination operand of the selected operation in the given selection iteration.

Type: Grant

Filed: August 21, 2019

Date of Patent: May 10, 2022

Assignee: Arm Limited

Inventors: Michael David Achenbach, Robert Greg McDonald, Nicholas Andrew Pfister, Kelvin Domnic Goveas, Michael Filippo, Abhishek Raja, Zachary Allen Kingsbury
Instruction dispatch routing

Patent number: 11327766

Abstract: A method of instruction dispatch routing comprises receiving an instruction for dispatch to one of a plurality of issue queues; determining a priority status of the instruction; selecting a rotation order based on the priority status, wherein a first rotation order is associated with priority instructions and a second rotation order, different from the first rotation order, is associated with non-priority instructions; selecting an issue queue of the plurality of issue queues based on the selected rotation order; and dispatching the instruction to the selected issue queue.
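
The two-rotation scheme can be sketched as follows. The class name `DispatchRouter` and the specific queue orders in the example are hypothetical; the abstract only requires that priority and non-priority instructions follow different rotation orders.

```python
from itertools import cycle


class DispatchRouter:
    """Route instructions to issue queues using two rotation orders."""

    def __init__(self, priority_order, normal_order):
        # Each order is a sequence of issue queue indices, walked cyclically.
        self._orders = {True: cycle(priority_order), False: cycle(normal_order)}

    def route(self, is_priority: bool) -> int:
        """Return the issue queue index for the next instruction."""
        return next(self._orders[is_priority])


router = DispatchRouter(priority_order=[0, 2, 1, 3], normal_order=[3, 1, 2, 0])
queue_for_priority = router.route(True)
queue_for_normal = router.route(False)
```

Keeping a separate rotation position per priority class means a burst of non-priority instructions does not disturb where the next priority instruction lands.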

Type: Grant

Filed: July 31, 2020

Date of Patent: May 10, 2022

Assignee: International Business Machines Corporation

Inventors: Eric Mark Schwarz, Brian W. Thompto, Kurt A. Feiste, Michael Joseph Genden, Dung Q. Nguyen, Susan E. Eisen
Event processing

Patent number: 11321019

Abstract: An event-processing unit for processing tokens associated with a state or state transition, herein also referred to as an event, of an external device is disclosed. The EPU allows token-processing schemes, in which the processing of incoming tokens and the further handling of a processing result by the EPU are determined not only by the token identifier, but also by the payload data of the incoming token or by data in the data memory. A flag-processing capability of a processing-control stage allows applying flag-processing operations such as logical operations to data obtained as a processing result of an ALU-processing operation. The result of these operations determines a subsequent handling of ALU-result data by the EPU. Thus, whether or not the ALU-result data is written to the data memory also influences the processing of any subsequent incoming tokens for which that data is used in the ALU-processing operation.

Type: Grant

Filed: September 11, 2020

Date of Patent: May 3, 2022

Assignee: ACCEMIC TECHNOLOGIES GMBH

Inventor: Alexander Weiss
Executing mutually exclusive vector instructions according to a vector predicate instruction

Patent number: 11301252

Abstract: A data processing apparatus is provided comprising: a plurality of input lanes and a plurality of corresponding output lanes. Processing circuitry executes a first vector instruction and a second vector instruction. The first vector instruction specifies a target of output data from the corresponding output lanes that is specified as a source of input data to the input lanes by the second vector instruction. Mask circuitry stores a first mask that defines a first set of the output lanes that are valid for the first vector instruction, and stores a second mask that defines a second set of the output lanes that are valid for the second vector instruction. The first set and the second set are mutually exclusive. Issue circuitry begins processing of the second vector instruction at a lane index prior to completion of the first vector instruction at the lane index.

Type: Grant

Filed: January 15, 2020

Date of Patent: April 12, 2022

Assignee: Arm Limited

Inventor: Kim Richard Schuttenberg
Proactive voltage droop reduction and/or mitigation in a processor core

Patent number: 11275644

Abstract: Techniques facilitating voltage droop reduction and/or mitigation in a processor core are provided. In one example, a system can comprise a memory that stores, and a processor that executes, computer executable components. The computer executable components can comprise an observation component that detects one or more events at a first stage of a processor pipeline. An event of the one or more events can be a defined event determined to increase a level of power consumed during a second stage of the processor pipeline. The computer executable components can also comprise an instruction component that applies a voltage droop mitigation countermeasure prior to the increase of the level of power consumed during the second stage of the processor pipeline and a feedback component that provides a notification to the instruction component that indicates a success or a failure of a result of the voltage droop mitigation countermeasure.

Type: Grant

Filed: December 6, 2019

Date of Patent: March 15, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Giora Biran, Pradip Bose, Alper Buyuktosunoglu, Pierce I-Jen Chuang, Preetham M. Lobo, Ramon Bertran Monfort, Phillip John Restle, Christos Vezyrtzis, Tobias Webel
Instruction scheduling patterns on decoupled systems

Patent number: 11269646

Abstract: Apparatuses and methods for instruction scheduling in an out-of-order decoupled access-execute processor are disclosed. The instructions for the decoupled access-execute processor comprise access instructions and execute instructions, where access instructions comprise load instructions and instructions which provide operand values to load instructions. Schedule patterns of groups of linked execute instructions are monitored, where the execute instructions in a group of linked execute instructions are linked by data dependencies. On the basis of an identified repeating schedule pattern, configurable execution circuitry adopts a configuration to perform the operations defined by the group of linked execute instructions of the repeating schedule pattern.

Type: Grant

Filed: March 29, 2021

Date of Patent: March 8, 2022

Assignee: Arm Limited

Inventors: Mbou Eyole, Michiel Willem Van Tol
Multiple core software forwarding

Patent number: 11212590

Abstract: Approaches for performing all DOCSIS downstream and upstream data forwarding functions using executable software. DOCSIS data forwarding functions may be performed by classifying one or more packets, of a plurality of received packets, to a particular DOCSIS system component, and then processing the one or more packets classified to the same DOCSIS system component on a single CPU core. The one or more packets may be forwarded between a sequence of one or more software stages. The software stages may each be configured to execute on separate logical cores or on a single logical core.

Type: Grant

Filed: July 10, 2017

Date of Patent: December 28, 2021

Assignee: Harmonic, Inc.

Inventors: Adam Levy, Pavlo Shcherbyna, Alex Muller, Vladyslav Buslov, Victoria Sinitsky, Michael W. Patrick, Nitin Sasi Kumar
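The classification step in the abstract above could be sketched roughly as follows in Python. The `component` field, the core count, and the use of a CRC32 hash to choose a core are assumptions for this example and are not taken from the patent; the point is only that every packet classified to the same component ends up on the same core.

```python
import zlib
from collections import defaultdict

NUM_CORES = 4  # assumed core count for the example


def core_for(component: str) -> int:
    """Map a component name to a core index with a stable hash."""
    return zlib.crc32(component.encode("utf-8")) % NUM_CORES


def classify(packets):
    """Group packets by target core, preserving arrival order."""
    per_core = defaultdict(list)
    for packet in packets:
        per_core[core_for(packet["component"])].append(packet)
    return per_core
```

Because the mapping depends only on the component name, packets for one component are never split across cores, so their relative order is preserved without cross-core locking.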
System, apparatus and method for symbolic store address generation for data-parallel processor

Patent number: 11188341

Abstract: In one embodiment, an apparatus includes: a plurality of execution lanes to perform parallel execution of instructions; and a unified symbolic store address buffer coupled to the plurality of execution lanes, the unified symbolic store address buffer comprising a plurality of entries each to store a symbolic store address for a store instruction to be executed by at least some of the plurality of execution lanes. Other embodiments are described and claimed.

Type: Grant

Filed: March 26, 2019

Date of Patent: November 30, 2021

Assignee: Intel Corporation

Inventors: Jeffrey J. Cook, Srikanth T. Srinivasan, Jonathan D. Pearce, David B. Sheffield
Malware resistant computer

Patent number: 11188681

Abstract: An approach is provided in which an information handling system loads a set of encrypted binary code into a processor that has been encrypted based upon a unique key of the processor. The processor includes an instruction decoder that transforms the set of encrypted binary code into a set of instruction control signals using the unique key. In turn, the processor executes a set of instructions based on the set of instruction control signals.

Type: Grant

Filed: April 8, 2019

Date of Patent: November 30, 2021

Assignee: International Business Machines Corporation

Inventors: Guy M. Cohen, Shai Halevi, Lior Horesh
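To make the idea in the abstract above concrete, here is a deliberately simplified Python sketch. The opcode table, the one-byte keys, and the XOR transform are all assumptions for illustration; XOR with a single byte is not real encryption and is not claimed to be the scheme used in the patent.

```python
# Hypothetical opcode table mapping decoded bytes to control signals.
OPCODES = {0x01: "load", 0x02: "add", 0x03: "store"}


def encrypt(program: bytes, key: int) -> bytes:
    """Toy per-processor 'encryption': XOR every byte with the key."""
    return bytes(b ^ key for b in program)


def decode(encrypted: bytes, key: int):
    """Decoder stage: undo the XOR with the processor key, then look up
    the control signal; unknown bytes decode as 'illegal'."""
    return [OPCODES.get(b ^ key, "illegal") for b in encrypted]


key_a, key_b = 0x5A, 0x3C            # assumed unique keys of two processors
binary_for_a = encrypt(bytes([0x01, 0x02, 0x03]), key_a)
```

In this toy model, `decode(binary_for_a, key_a)` recovers the original instruction sequence, while decoding the same bytes with `key_b` yields only illegal instructions.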
Method to determine the oldest instruction in an instruction queue of a processor with multiple instruction threads

Patent number: 11182167

Abstract: A method to determine an oldest instruction in an instruction queue of a processor with multiple instruction threads, wherein each of the multiple instruction threads has a unique thread identifier. The method includes tagging each instruction thread, of the multiple instruction threads, in the instruction queue with a unique tag number according to a round-robin scheme, wherein the unique tag number includes the unique thread identifier for each instruction thread and a round number in the round-robin scheme. The method further includes selecting, for each instruction thread, of the multiple instruction threads, the instruction thread with a lowest tag number from the multiple instruction threads in the instruction queue that are tagged with an oldest round number from the round-robin scheme.

Type: Grant

Filed: March 15, 2019

Date of Patent: November 23, 2021

Assignee: International Business Machines Corporation

Inventors: Arni Ingimundarson, Maarten J. Boersma, Niels Fricke
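The selection rule in the abstract above can be sketched in Python as a two-step minimum. The `Entry` type and the assumption that a smaller round number always means an older round (no wrap-around) are simplifications introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    round_number: int  # round of the round-robin scheme at tagging time
    thread_id: int     # unique thread identifier
    name: str          # label used only for this example


def oldest(queue):
    """Pick the entry with the lowest thread id within the oldest round."""
    # Assumption: round numbers never wrap, so min() finds the oldest round.
    oldest_round = min(e.round_number for e in queue)
    candidates = [e for e in queue if e.round_number == oldest_round]
    return min(candidates, key=lambda e: e.thread_id)
```

A hardware implementation would have to handle round-number wrap-around, which this sketch ignores.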
System and method for a lightweight fencing operation

Patent number: 11175916

Abstract: A system and method for a lightweight fence is described. In particular, micro-operations including a fencing micro-operation are dispatched to a load queue. The fencing micro-operation allows micro-operations younger than the fencing micro-operation to execute, where the micro-operations are related to a type of fencing micro-operation. The fencing micro-operation is executed if the fencing micro-operation is the oldest memory access micro-operation, where the oldest memory access micro-operation is related to the type of fencing micro-operation. The fencing micro-operation determines whether micro-operations younger than the fencing micro-operation have load ordering violations and if load ordering violations are detected, the fencing micro-operation signals the retire queue that instructions younger than the fencing micro-operation should be flushed. The instructions to be flushed should include all micro-operations with load ordering violations.

Type: Grant

Filed: December 19, 2017

Date of Patent: November 16, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Gregory W. Smaus, John M. King
Systems and methods for invisible speculative execution

Patent number: 11163576

Abstract: A system and method for efficiently preventing visible side-effects in the memory hierarchy during speculative execution is disclosed. Hiding the side-effects of executed instructions in the whole memory hierarchy is both expensive, in terms of performance and energy, and complicated. A system and method is disclosed to hide the side-effects of speculative loads in the cache(s) until the earliest time these speculative loads become non-speculative. A refinement is disclosed where loads that hit in the L1 cache are allowed to proceed by keeping their side-effects on the L1 cache hidden until these loads become non-speculative, and all other speculative loads that miss in the cache(s) are prevented from executing until they become non-speculative. To limit the performance deterioration caused by these delayed loads, a system and method is disclosed that augments the cache(s) with a value predictor or a re-computation engine that supplies predicted or recomputed values to the loads that missed in the cache(s).

Type: Grant

Filed: March 20, 2020

Date of Patent: November 2, 2021

Assignee: ETA SCALE AB

Inventors: Christos Sakalis, Stefanos Kaxiras, Alberto Ros, Alexandra Jimborean, Magnus Själander
Accelerated operation of a graph streaming processor

Patent number: 11150961

Abstract: Methods, systems and apparatuses for graph processing are disclosed. One graph streaming processor includes a thread manager, wherein the thread manager is operative to dispatch operation of the plurality of threads of a plurality of thread processors before dependencies of the dependent threads have been resolved, maintain a scorecard of operation of the plurality of threads of the plurality of thread processors, and provide an indication to at least one of the plurality of thread processors when a dependency between the at least one of the plurality of threads that a request has or has not been satisfied. Further, a producer thread provides a response to the dependency when the dependency has been satisfied, and each of the plurality of thread processors is operative to provide processing updates to the thread manager, and provide queries to the thread manager upon reaching a dependency.

Type: Grant

Filed: February 8, 2019

Date of Patent: October 19, 2021

Assignee: Blaize, Inc.

Inventors: Lokesh Agarwal, Sarvendra Govindammagari, Venkata Ganapathi Puppala, Satyaki Koneru
AC parallelization circuit, AC parallelization method, and parallel information processing device

Patent number: 11144317

Abstract: An AC parallelization circuit includes a transmitting circuit configured to transmit a stop signal to instruct a device for executing calculation in an iteration immediately preceding an iteration for which a concerned device is responsible to stop the calculation in loop-carried dependency calculation; and an estimating circuit configured to generate, as a result of executing the calculation in the preceding iteration, an estimated value to be provided to an arithmetic circuit when the transmitting circuit transmits the stop signal.

Type: Grant

Filed: August 20, 2020

Date of Patent: October 12, 2021

Assignee: FUJITSU LIMITED

Inventor: Hisanao Akima
Slice-target register file for microprocessor

Patent number: 11119774

Abstract: A system and/or method for processing information is disclosed that has at least one processor; a register file associated with the processor, the register file sliced into a plurality of STF blocks having a plurality of STF entries, and in an embodiment, each STF block is further partitioned into a plurality of sub-blocks, each sub-block having a different portion of the plurality of STF entries; and a plurality of execution units configured to read data from and write data to the register file, where the plurality of execution units are arranged in one or more execution slices. In one or more embodiments, the system is configured so that each execution slice has a plurality of STF blocks, and alternatively or additionally, each of the plurality of execution units in a single execution slice is assigned to write to one, and preferably only one, of the plurality of STF blocks.

Type: Grant

Filed: September 6, 2019

Date of Patent: September 14, 2021

Assignee: International Business Machines Corporation

Inventors: Brian W. Thompto, Dung Q. Nguyen, Hung Q. Le, Sam Gat-Shang Chu
System and method for auto-detection of WLAN packets using header

Patent number: 11115964

Abstract: A system and method of auto-detection of WLAN packets includes transmitting in a 60 GHz frequency band a wireless packet comprising a first header, a second header, a payload, and a training field, the first header carrying a plurality of bits, a logical value of a subset of the plurality of bits in the first header indicating the presence of the second header in the wireless packet.

Type: Grant

Filed: February 12, 2016

Date of Patent: September 7, 2021

Assignee: Huawei Technologies Co., Ltd.

Inventors: Yan Xin, Osama Aboul-Magd, Jung Hoon Suh
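The detection rule in the abstract above amounts to testing a subset of bits in the first header. A minimal Python sketch follows; the bit positions and the flag value are hypothetical, since the abstract does not say which bits are used.

```python
# Hypothetical layout: bits 6 and 7 of the first header, both set,
# indicate that a second header follows.
FLAG_MASK = 0b1100_0000
FLAG_VALUE = 0b1100_0000


def has_second_header(first_header: int) -> bool:
    """Return True if the flag bits signal the presence of a second header."""
    return (first_header & FLAG_MASK) == FLAG_VALUE
```

A receiver could run such a check on the first header before deciding how to parse the rest of the packet.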
Predictive on-chip voltage simulation to detect near-future under voltage conditions

Patent number: 11112846

Abstract: Embodiments of the present disclosure relate to detecting undervoltage conditions at a subcircuit. A power supply current of a first subcircuit is determined over a first number of previous clock cycles. A cross current flowing between the first subcircuit and a second subcircuit is determined over the first number of previous clock cycles. An estimated momentary supply voltage present at the first subcircuit is then determined based on the power supply current of the first subcircuit over the first number of previous clock cycles and the cross current flowing between the first subcircuit and the second subcircuit over the first number of previous clock cycles.

Type: Grant

Filed: December 19, 2018

Date of Patent: September 7, 2021

Assignee: International Business Machines Corporation

Inventors: Thomas Strach, Preetham M. Lobo, Tobias Webel
Power-efficient deep neural network module configured for layer and operation fencing and dependency management

Patent number: 11100390

Abstract: A deep neural network (DNN) processor is configured to execute layer descriptors in layer descriptor lists. The descriptors define instructions for performing a forward pass of a DNN by the DNN processor. The layer descriptors can also be utilized to manage the flow of descriptors through the DNN module. For example, layer descriptors can define dependencies upon other descriptors. Descriptors defining a dependency will not execute until the descriptors upon which they are dependent have completed. Layer descriptors can also define a “fence,” or barrier, function that can be used to prevent the processing of upstream layer descriptors until the processing of all downstream layer descriptors is complete. The fence bit guarantees that there are no other layer descriptors in the DNN processing pipeline before the layer descriptor that has the fence to be asserted is processed.

Type: Grant

Filed: April 11, 2018

Date of Patent: August 24, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Chad Balling McBride, Amol Ashok Ambardekar, Kent D. Cedola, George Petre, Larry Marvin Wall, Boris Bobrov
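One way to picture the dependency and fence rules in the abstract above is an issue check like the Python sketch below. The `Descriptor` fields and the `can_issue` helper are assumptions for this example; in particular, the fence is modeled simply as "issue only when nothing else is in the pipeline", which is one reading of the abstract's last sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    name: str
    depends_on: tuple = ()  # names of descriptors that must complete first
    fence: bool = False     # assumed: wait until the pipeline is empty


def can_issue(desc, completed, in_flight):
    """Decide whether a layer descriptor may start processing.

    completed: set of names of descriptors that have finished
    in_flight: set of names of descriptors currently in the pipeline
    """
    # A descriptor waits for every descriptor it depends on.
    if any(dep not in completed for dep in desc.depends_on):
        return False
    # A fenced descriptor waits until no other descriptor is in the pipeline.
    if desc.fence and in_flight:
        return False
    return True
```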
Computer system and memory access technology

Patent number: 11093245

Abstract: A computer system and a memory access technology are provided. In the computer system, when load/store instructions having a dependency relationship are processed, dependency information between a producer load/store instruction and a consumer load/store instruction can be obtained from a processor. A consumer load/store request is sent to a memory controller in the computer system based on the obtained dependency information, so that the memory controller can terminate a dependency relationship between load/store requests in the memory controller locally based on the dependency information in the received consumer load/store request, and execute the consumer load/store request.

Type: Grant

Filed: June 12, 2019

Date of Patent: August 17, 2021

Assignee: Huawei Technologies Co., Ltd.

Inventors: Lei Fang, Xi Chen, Weiguang Cai
System and method for load and store queue allocations at address generation time

Patent number: 11086628

Abstract: A system and method for load queue (LDQ) and store queue (STQ) entry allocations at address generation time that maintains age-order of instructions is described. In particular, writing LDQ and STQ entries is delayed until address generation time. This allows the load and store operations to dispatch, and younger operations (which may not be store and load operations) to also dispatch and execute their instructions. The address generation of the load or store operation is held at an address generation scheduler queue (AGSQ) until a load or store queue entry is available for the operation. The tracking of load queue entries or store queue entries is effectively being done in the AGSQ instead of at the decode engine. The LDQ and STQ depth is not visible from a decode engine's perspective, and increases the effective processing and queue depth.

Type: Grant

Filed: August 15, 2016

Date of Patent: August 10, 2021

Assignee: Advanced Micro Devices, Inc.

Inventor: John M. King
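The deferred allocation described in the abstract above can be modeled with a small Python sketch. It tracks loads only, and the `AddressGenQueue` class and its method names are hypothetical; the sketch just shows dispatch succeeding regardless of load queue occupancy, with the load queue entry claimed later in age order.

```python
from collections import deque


class AddressGenQueue:
    """Toy AGSQ: loads dispatch freely and get an LDQ entry only at
    address generation time, oldest first."""

    def __init__(self, ldq_capacity):
        self.waiting = deque()  # dispatched loads without an LDQ entry yet
        self.ldq = deque()      # loads that hold an LDQ entry
        self.ldq_capacity = ldq_capacity

    def dispatch(self, op):
        # Dispatch never stalls on LDQ occupancy in this model.
        self.waiting.append(op)

    def address_generate(self):
        """Allocate an LDQ entry for the oldest waiting load, if one is free."""
        if self.waiting and len(self.ldq) < self.ldq_capacity:
            op = self.waiting.popleft()
            self.ldq.append(op)
            return op
        return None  # held in the AGSQ until an entry frees up

    def retire_oldest_load(self):
        """Free the oldest LDQ entry."""
        return self.ldq.popleft()
```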
Register file arbitration

Patent number: 11080055

Abstract: Techniques are disclosed relating to arbitration among register file accesses. In some embodiments, an apparatus includes a register file configured to store operands for multiple client circuits and arbitration circuitry configured to select from among multiple received requests to access the register file. In some embodiments, the apparatus includes first interface circuitry configured to provide access requests from a first client circuit to the arbitration circuitry and supplemental interface circuitry configured to receive unsuccessful requests from the first client circuit and provide the received unsuccessful requests to the arbitration circuitry. The supplemental interface circuitry may provide additional catch-up bandwidth to clients that lose arbitration, which may result in fairness during bandwidth shortages.

Type: Grant

Filed: August 22, 2019

Date of Patent: August 3, 2021

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Terence M. Potter
Event handling instruction processing

Patent number: 11074079

Abstract: A method of providing instructions to computer processing apparatus for improved event handling comprises the following. Instructions for execution on the computer processing apparatus are provided to an event processor generator. These instructions comprise a plurality of functional steps, a set of dependencies between the functional steps, and configuration data. The event processor generator creates instances of the functional steps from the instructions and represents the instances as directed acyclic graphs. The event processor generator identifies a plurality of event types and topologically sorts the directed acyclic graphs to determine a topologically ordered event path for each event type. The event processor generator then provides a revised set of instructions for execution on the computer processing apparatus in which original instructions have been replaced by instructions requiring each event type to be executed according to its topologically ordered event path.

Type: Grant

Filed: November 17, 2017

Date of Patent: July 27, 2021

Inventor: Greg Higgins
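The topological ordering step in the abstract above can be illustrated with Python's standard `graphlib` module (Python 3.9 or later). The step names, the dependency table, and the mapping from event types to target steps are invented for this example and do not come from the patent.

```python
from graphlib import TopologicalSorter

# Hypothetical functional steps; each key lists the steps it depends on.
dependencies = {
    "parse": [],
    "validate": ["parse"],
    "price": ["validate"],
    "audit": ["validate"],
    "publish": ["price", "audit"],
}

# Hypothetical mapping from event type to the steps that event must reach.
event_targets = {"order": ["publish"], "heartbeat": ["validate"]}


def event_path(target_steps):
    """Return a topologically ordered path covering the target steps and
    everything they transitively depend on."""
    needed, stack = set(), list(target_steps)
    while stack:
        step = stack.pop()
        if step not in needed:
            needed.add(step)
            stack.extend(dependencies[step])
    subgraph = {step: dependencies[step] for step in needed}
    return list(TopologicalSorter(subgraph).static_order())
```

Calling `event_path(event_targets["heartbeat"])` returns `["parse", "validate"]`; for the `"order"` event the relative order of `"price"` and `"audit"` is not fixed, since neither depends on the other.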
