Patents Examined by Corey S Faherty

Efficient mitigation of side-channel based attacks against speculative execution processing architectures

Patent number: 11544070

Abstract: The present disclosure is directed to systems and methods for mitigating or eliminating the effectiveness of a side-channel based attack, such as one or more classes of an attack commonly known as Spectre. Novel instruction prefixes, and in certain embodiments one or more corresponding instruction prefix parameters, may be provided to enforce a serialized order of execution for particular instructions without serializing an entire instruction flow, thereby improving performance and mitigation reliability over existing solutions. In addition, improved mitigation of such attacks is provided by randomizing both the execution branch history as well as the source address of each vulnerable indirect branch, thereby eliminating the conditions required for such attacks.

Type: Grant

Filed: July 28, 2021

Date of Patent: January 3, 2023

Assignee: Intel Corporation

Inventors: Rodrigo Branco, Kekai Hu, Ke Sun, Henrique Kawakami
Method for optimizing performance of algorithm using precision scaling

Patent number: 11537395

Abstract: This application relates to a method for optimizing algorithm performance using precision scaling, wherein the method according to an embodiment of present invention comprises obtaining a number of iterations of a unit operation according to precisions of the algorithm including the unit operation that is repeatedly performed, wherein the precisions include a first precision and a second precision, and the number of iterations include a first number of iterations corresponding to the first precision and a second number of iterations corresponding to the second precision; inspecting available precisions of a device on which the algorithm is to be executed, wherein the available precisions include a first available precision corresponding to the first precision and a second available precision corresponding to the second precision; determining an optimal precision by repeatedly performing the unit operation corresponding to an initial operation of the algorithm using the inspected available precision; and repea

Type: Grant

Filed: November 4, 2021

Date of Patent: December 27, 2022

Assignee: Industry-University Cooperation Foundation Hanyang University

Inventors: Yongjun Park, Seokwon Kang, Sang Wook Kim, Hong Kyun Bae, Jae Seo Yu, Kyunghwan Choi
Compiler-assisted inter-SIMD-group register sharing

Patent number: 11537397

Abstract: Systems, apparatuses, and methods for efficiently sharing registers among threads are disclosed. A system includes at least a processor, control logic, and a register file with a plurality of registers. The processor assigns a base set of registers to each thread of a plurality of threads executing on the processor. When a given thread needs more than the base set of registers to execute a given phase of program code, the given thread executes an acquire instruction to acquire exclusive access to an extended set of registers from a shared resource pool. When the given thread no longer needs additional registers, the given thread executes a release instruction to release the extended set of registers back into the shared register pool for other threads to use. In one implementation, the compiler inserts acquire and release instructions into the program code based on a register liveness analysis performed during compilation.
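
The acquire/release protocol in this abstract can be illustrated with a minimal sketch. The class name `RegisterPool`, its methods, and the choice to return `None` when the pool is empty are assumptions made for illustration; the abstract does not say how a thread behaves when no extended set is free.

```python
class RegisterPool:
    """Hypothetical model of a shared pool of extended register sets."""

    def __init__(self, num_extended_sets):
        # Each extended set is identified by an index; all start out free.
        self.free_sets = list(range(num_extended_sets))
        self.owner = {}  # extended set index -> owning thread id

    def acquire(self, thread_id):
        """Grant exclusive access to one extended set, or None if none is free."""
        if not self.free_sets:
            return None  # assumption: a real design might stall the thread instead
        set_id = self.free_sets.pop(0)
        self.owner[set_id] = thread_id
        return set_id

    def release(self, thread_id, set_id):
        """Return an extended set to the pool; only its owner may release it."""
        if self.owner.get(set_id) != thread_id:
            raise ValueError("thread does not own this extended set")
        del self.owner[set_id]
        self.free_sets.append(set_id)


pool = RegisterPool(num_extended_sets=1)
first = pool.acquire(thread_id=0)    # thread 0 enters a register-heavy phase
second = pool.acquire(thread_id=1)   # thread 1 finds the pool empty
pool.release(thread_id=0, set_id=first)
third = pool.acquire(thread_id=1)    # the released set is now available
```

In the compiler-assisted implementation the abstract mentions, the `acquire` and `release` calls would correspond to instructions placed around the program phases where register liveness exceeds the base set.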

Type: Grant

Filed: March 26, 2018

Date of Patent: December 27, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Farzad Khorasani, Amin Farmahini-Farahani, Nuwan S. Jayasena
Apparatus and method for scalable qubit addressing

Patent number: 11531922

Abstract: An apparatus and method for scalable qubit addressing. For example, one embodiment of a processor comprises: a decoder comprising quantum instruction decode circuitry to decode quantum instructions to generate quantum microoperations (uops) and non-quantum decode circuitry to decode non-quantum instructions to generate non-quantum uops; execution circuitry comprising: an address generation unit (AGU) to generate a system memory address responsive to execution of one or more of the non-quantum uops; and quantum index generation circuitry to generate quantum index values responsive to execution of one or more of the quantum uops, each quantum index value uniquely identifying a quantum bit (qubit) in a quantum processor; wherein to generate a first quantum index value for a first quantum uop, the quantum index generation circuitry is to read the first quantum index value from a first architectural register identified by the first quantum uop.

Type: Grant

Filed: September 27, 2018

Date of Patent: December 20, 2022

Assignee: Intel Corporation

Inventor: Xiang Zou
Method and apparatus for renaming source operands of instructions

Patent number: 11520586

Abstract: A renaming unit configured to rename source operands of instructions in a group. A renaming register maintains architectural to physical register mappings. Architectural to physical register mappings propagate from the renaming register through a chain of update units (U) over bus lines denoted with the architectural registers 0 to L. Update units (U) sequentially, in program order, insert physical register identifiers PR(i) allocated to instructions I(i) with destination operands DOP(i) on bus lines denoted with the destination operands DOP(i). Source operands of an instruction I(i) may be renamed to physical register identifiers after physical register identifiers allocated to instructions older than I(i) are sequentially, in program order, inserted on the bus lines, but before physical register identifiers allocated to I(i) and younger instructions are inserted on the bus lines. A source operand SOP(i) is renamed to a physical register identifier that propagates on a bus line denoted with SOP(i).
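
The ordering rule in this abstract (sources are renamed after older destinations are inserted, but before the instruction's own destination is) can be modeled in software. The following is a sketch only: the function `rename_group`, the tuple layout, and the register names are hypothetical, and a dictionary stands in for the bus lines of the hardware update chain.

```python
def rename_group(mapping, instructions, free_physical):
    """Rename a group of instructions in program order.

    mapping:       dict, architectural register -> physical register
    instructions:  list of (dest, sources) tuples; dest may be None
    free_physical: list of free physical register identifiers
    """
    current = dict(mapping)      # copy, so the caller's mapping is untouched
    free = list(free_physical)
    renamed = []
    for dest, sources in instructions:
        # Sources see only the mappings inserted by older instructions.
        renamed_sources = [current[s] for s in sources]
        new_phys = None
        if dest is not None:
            new_phys = free.pop(0)
            # Inserted after the sources are read, so it is visible
            # to younger instructions only.
            current[dest] = new_phys
        renamed.append((new_phys, renamed_sources))
    return renamed, current


start = {"r0": "p0", "r1": "p1", "r2": "p2"}
group = [
    ("r1", ["r0", "r1"]),   # I(0): r1 = r0 op r1
    ("r2", ["r1", "r2"]),   # I(1): r2 = r1 op r2, must read I(0)'s new r1
]
renamed, final = rename_group(start, group, ["p3", "p4"])
```

Here I(0) reads the old mapping of `r1` (`p1`) even though it also writes `r1`, while I(1) reads the physical register allocated to I(0) (`p3`).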

Type: Grant

Filed: July 8, 2021

Date of Patent: December 6, 2022

Inventor: Dejan Spasov
Tracking streaming engine vector predicates to control processor execution

Patent number: 11507520

Abstract: In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.
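
The loop control described here can be sketched as follows. The generator `stream_vectors` is a hypothetical stand-in for the streaming engine, and the vector width of 4 and zero padding are assumptions; only the exit condition (no valid elements in the current predicate) comes from the abstract.

```python
def stream_vectors(data, width):
    """Yield (vector, predicate) pairs; a stand-in for a streaming engine."""
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        pad = width - len(chunk)
        # Pad the last vector and mark the padding lanes as invalid.
        yield chunk + [0] * pad, [True] * len(chunk) + [False] * pad
    # Once the data runs out, deliver a vector with no valid elements.
    yield [0] * width, [False] * width


def sum_stream(data, width=4):
    """Sum a stream without knowing its length; the predicate ends the loop."""
    total = 0
    stream = stream_vectors(data, width)
    while True:
        vector, predicate = next(stream)
        if not any(predicate):
            break  # exit: the current vector contains no valid elements
        total += sum(v for v, valid in zip(vector, predicate) if valid)
    return total
```

The loop body never consults an element count, which is the point of the technique: the predicate alone decides whether another iteration runs.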

Type: Grant

Filed: March 1, 2021

Date of Patent: November 22, 2022

Assignee: Texas Instruments Incorporated

Inventors: Duc Quang Bui, Joseph Raymond Michael Zbiciak
Assignment of microprocessor register tags at issue time

Patent number: 11500642

Abstract: Provided is a method for assigning register tags to instructions at issue time. The method comprises receiving an instruction for execution by a microprocessor. The method further comprises dispatching the instruction to an issue queue without assigning a register tag to the instruction. The method further comprises determining that the instruction is ready to issue. In response to determining that the instruction is ready to issue, the method comprises assigning an available register tag to the instruction. The method further comprises issuing the instruction.
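
The sequence of steps in this abstract can be sketched as a small model. The class `IssueQueue`, its method names, and the oldest-first selection policy are assumptions for illustration; the abstract specifies only that the tag is assigned at issue time rather than at dispatch.

```python
from collections import deque


class IssueQueue:
    """Hypothetical issue queue that defers register tag assignment."""

    def __init__(self, num_tags):
        self.free_tags = deque(range(num_tags))
        self.entries = []  # dispatched instructions, none holding a tag yet

    def dispatch(self, name):
        # Dispatch without assigning a register tag.
        self.entries.append({"name": name, "ready": False, "tag": None})

    def mark_ready(self, name):
        for entry in self.entries:
            if entry["name"] == name:
                entry["ready"] = True

    def issue(self):
        """Issue the oldest ready instruction, assigning its tag only now."""
        for entry in self.entries:
            if entry["ready"] and self.free_tags:
                entry["tag"] = self.free_tags.popleft()
                self.entries.remove(entry)
                return entry["name"], entry["tag"]
        return None  # nothing is ready, or no tag is available


queue = IssueQueue(num_tags=2)
queue.dispatch("add")
queue.dispatch("mul")
queue.mark_ready("mul")
first = queue.issue()     # "mul" is ready first and takes the first free tag
second = queue.issue()    # "add" is not ready, so nothing issues
queue.mark_ready("add")
third = queue.issue()
```

Because waiting instructions hold no tag, tags are consumed only by instructions that are actually issuing.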

Type: Grant

Filed: November 10, 2020

Date of Patent: November 15, 2022

Assignee: International Business Machines Corporation

Inventors: Steven J. Battle, Jentje Leenstra, Brian D. Barrick, Dung Q. Nguyen, Brian W. Thompto
Spatial and temporal merging of remote atomic operations

Patent number: 11500636

Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.

Type: Grant

Filed: February 24, 2020

Date of Patent: November 15, 2022

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson
Systems, apparatuses, and methods for chained fused multiply add

Patent number: 11487541

Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.

Type: Grant

Filed: November 30, 2020

Date of Patent: November 1, 2022

Assignee: Intel Corporation

Inventors: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson
Calculation method and related product

Patent number: 11481215

Abstract: The present disclosure provides a computing method that is applied to a computing device. The computing device includes: a memory, a register unit, and a matrix computing unit. The method includes the following steps: controlling, by the computing device, the matrix computing unit to obtain a first operation instruction, where the first operation instruction includes a matrix reading instruction for a matrix required for executing the instruction; controlling, by the computing device, an operating unit to send a reading command to the memory according to the matrix reading instruction; and controlling, by the computing device, the operating unit to read a matrix corresponding to the matrix reading instruction in a batch reading manner, and executing the first operation instruction on the matrix. The technical solutions in the present disclosure have the advantages of fast computing speed and high efficiency.

Type: Grant

Filed: January 17, 2020

Date of Patent: October 25, 2022

Assignee: Cambricon (Xi'an) Semiconductor Co., Ltd.

Inventors: Tianshi Chen, Shaoli Liu, Zai Wang, Shuai Hu
Apparatus and method for performing operations on capability metadata

Patent number: 11481384

Abstract: An apparatus is provided comprising storage elements to store data blocks, where each data block has capability metadata associated therewith identifying whether the data block specifies a capability, at least one capability type being a bounded pointer. Processing circuitry is then arranged to be responsive to a bulk capability metadata operation identifying a plurality of the storage elements, to perform an operation on the capability metadata associated with each data block stored in the plurality of storage elements. Via a single specified operation, this hence enables query and/or modification operations to be performed on multiple items of capability metadata, hence providing more efficient access to such capability metadata.

Type: Grant

Filed: March 29, 2017

Date of Patent: October 25, 2022

Assignee: Arm Limited

Inventors: Graeme Peter Barnes, Stuart David Biles
Methods and apparatus for deep learning network execution pipeline on multi-processor platform

Patent number: 11461105

Abstract: Methods and systems are disclosed using an execution pipeline on a multi-processor platform for deep learning network execution. In one example, a network workload analyzer receives a workload, analyzes a computation distribution of the workload, and groups the network nodes into groups. A network executor assigns each group to a processing core of the multi-core platform so that the respective processing core handle computation tasks of the received workload for the respective group.

Type: Grant

Filed: April 7, 2017

Date of Patent: October 4, 2022

Assignee: Intel Corporation

Inventors: Liu Yang, Anbang Yao
Method and device for floating point representation with variable precision

Patent number: 11461095

Abstract: The present disclosure relates to a method of storing, by a load and store circuit or other processing means, a variable precision floating point value to a memory address of a memory, the method comprising: reducing the bit length of the variable precision floating point value to no more than a size limit, and storing the variable precision floating point value to one of a plurality of storage zones in the memory, each of the plurality of storage zones having a storage space equal to or greater than the size limit (MBB).

Type: Grant

Filed: March 6, 2020

Date of Patent: October 4, 2022

Assignees: Commissariat à l'Energie Atomique et aux Energies Alternatives, Institut National des Sciences Appliquées de Lyon

Inventors: Andrea Bocco, Florent Dupont De Dinechin, Yves Durand
Efficient implementation of complex vector fused multiply add and complex vector multiply

Patent number: 11455167

Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
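
The two-step scheme in this abstract can be checked numerically. The sketch below assumes an interleaved layout in which even elements hold real parts and odd elements hold imaginary parts; that layout, the helper names `fma` and `complex_fma`, and the use of plain Python arithmetic (which is not a fused, single-rounding operation) are assumptions for illustration.

```python
def fma(a, b, c, negate_even_product=False):
    """Element-wise a*b + c, optionally negating the product in even lanes."""
    out = []
    for i, (x, y, z) in enumerate(zip(a, b, c)):
        product = x * y
        if negate_even_product and i % 2 == 0:
            product = -product
        out.append(product + z)
    return out


def complex_fma(multiplier, multiplicand, summand):
    """Compute multiplier * multiplicand + summand on interleaved complex vectors."""
    n = len(multiplicand)
    # First operation: duplicate even (real) elements of the multiplicand.
    double_even = [multiplicand[i - i % 2] for i in range(n)]
    temp = fma(multiplier, double_even, summand)
    # Second operation: duplicate odd (imaginary) elements, swap the
    # multiplier's even/odd pairs, and negate the even-lane product.
    double_odd = [multiplicand[i - i % 2 + 1] for i in range(n)]
    swapped = [multiplier[i ^ 1] for i in range(n)]
    return fma(swapped, double_odd, temp, negate_even_product=True)


# (1 + 2j) * (3 + 4j) + (5 + 6j) = 0 + 16j
result = complex_fma([1, 2], [3, 4], [5, 6])
```

For one complex element with multiplier a, multiplicand b, and summand c, the first step yields (a_r·b_r + c_r, a_i·b_r + c_i) and the second adds (−a_i·b_i, a_r·b_i), which together give the real and imaginary parts of a·b + c.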

Type: Grant

Filed: December 2, 2019

Date of Patent: September 27, 2022

Assignee: Intel Corporation

Inventors: Raanan Sade, Thierry Pons, Amit Gradstein, Zeev Sperber, Mark J. Charney, Robert Valentine, Eyal Oz-Sinay
Electronic device and method for data processing using virtual register mode

Patent number: 11449341

Abstract: The invention relates to an electronic device for data processing, which includes an execution unit with a temporary register, a register file, a first feedback path from the data output of the execution unit to the register file, a second feedback path from the data output of the execution unit to the temporary register, a switch configured to connect the first feedback path and/or the second feedback path, and a logic stage coupled to control the switch. The control stage is configured to control the switch to connect the second feedback path if the data output of an execution unit is used as an operand in the subsequent operation of an execution unit.

Type: Grant

Filed: September 9, 2019

Date of Patent: September 20, 2022

Assignee: Texas Instruments Incorporated

Inventors: Marko Krüger, Steven Bartling, Markus Kösler
Look-up table read

Patent number: 11436015

Abstract: A digital data processor includes a multi-stage butterfly network, which is configured to, in response to a look up table read instruction, receive look up table data from an intermediate register, reorder the look up table data based on control signals comprising look up table configuration register data, and write the reordered look up table data to a destination register specified by the look up table read instruction.

Type: Grant

Filed: September 13, 2019

Date of Patent: September 6, 2022

Assignee: Texas Instruments Incorporated

Inventors: Naveen Bhoria, Duc Bui, Dheera Balasubramanian Samudrala, Rama Venkatasubramanian
High throughput processors

Patent number: 11436186

Abstract: An algorithmic matching pipelined compiler and a reusable algorithmic pipelined core comprise a high throughput processor system. The reusable algorithmic pipelined core is a reconfigurable processing core with a pipelined structure comprising a processor with a setup interface for programming any of a plurality of operations as determined by setup data, a logic decision processor for programming a look up table, a loop counter and a constant register, and a block of memory. This can be used to perform functions. A reconfigurable, programmable circuit routes data and results from one core to another core and/or IO controller and/or interrupt generator, as required to complete an algorithm without further intervention from a central or peripheral processor during processing of an algorithm.

Type: Grant

Filed: June 22, 2018

Date of Patent: September 6, 2022

Assignee: ICAT LLC

Inventors: Robert D Catiller, Daniel Roig, Gnanashanmugam Elumalai
Distributed processor system

Patent number: 11422969

Abstract: This disclosure relates to a distributed processing system for configuring multiple processing channels. The distributed processing system includes a main processor, such as an ARM processor, communicatively coupled to a plurality of co-processors, such as stream processors. The co-processors can execute instructions in parallel with each other and interrupt the ARM processor. Longer latency instructions can be executed by the main processor and lower latency instructions can be executed by the co-processors. There are several ways that a stream can be triggered in the distributed processing system. In an embodiment, the distributed processing system is a stream processor system that includes an ARM processor and stream processors configured to access different register sets. The stream processors can include a main stream processor and stream processors in respective transmit and receive channels. The stream processor system can be implemented in a radio system to configure the radio for operation.

Type: Grant

Filed: June 26, 2020

Date of Patent: August 23, 2022

Assignee: Analog Devices, Inc.

Inventors: Manish J. Manglani, Shipra Bhal, Christopher Mayer
Look-ahead staging for time-travel reconstruction

Patent number: 11416259

Abstract: Disclosed herein are system, method, and computer program product embodiments for utilizing look-ahead-staging (LAS) to guarantee the ability to rollback and reconstruct a package while minimizing locking duration and enabling multiple packages to be processed in a data pipeline simultaneously. An embodiment operates by receiving a package from a source system for processing through a data pipeline. The embodiment stores the package in a persistent storage together with a respective package status. The embodiment transmits the package to the data pipeline in response to the storing. The embodiment receives a commit notification for the package from a target system in response to the transmitting. The embodiment then removes the package from the persistent storage in response to receiving the commit notification for the package.
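
The store, transmit, commit, and remove sequence in this abstract can be sketched as below. The class `LookAheadStage`, the status value `"staged"`, and the use of an in-memory dictionary in place of real persistent storage are all assumptions for illustration.

```python
class LookAheadStage:
    """Hypothetical staging area placed in front of a data pipeline."""

    def __init__(self, pipeline):
        self.persistent = {}      # stand-in for persistent storage
        self.pipeline = pipeline  # callable that forwards a package downstream

    def receive(self, package_id, payload):
        # Store the package together with its status before transmitting,
        # so it can be reconstructed if the pipeline fails.
        self.persistent[package_id] = {"payload": payload, "status": "staged"}
        self.pipeline(package_id, payload)

    def on_commit(self, package_id):
        # The target system has committed the package; the staged copy
        # is no longer needed for rollback.
        self.persistent.pop(package_id, None)

    def pending(self):
        """Package ids that could still be replayed."""
        return sorted(self.persistent)


sent = []
stage = LookAheadStage(lambda package_id, payload: sent.append(package_id))
stage.receive("a", {"rows": 10})
stage.receive("b", {"rows": 20})   # a second package is in flight concurrently
stage.on_commit("a")
```

After this run, package `"b"` remains staged and could be retransmitted, while package `"a"` has been removed following its commit notification.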

Type: Grant

Filed: December 11, 2020

Date of Patent: August 16, 2022

Assignee: SAP SE

Inventors: Daniel Bos, Dan Liu, Tobias Karpstein
Multiple-table branch target buffer

Patent number: 11416253

Abstract: A processor includes two or more branch target buffer (BTB) tables for branch prediction, each BTB table storing entries of a different target size or width or storing entries of a different branch type. Each BTB entry includes at least a tag and a target address. For certain branch types that only require a few target address bits, the respective BTB tables are narrower thereby allowing for more BTB entries in the processor separated into respective BTB tables by branch instruction type. An increased number of available BTB entries are stored in a same or a less space in the processor thereby increasing a speed of instruction processing. BTB tables can be defined that do not store any target address and rely on a decode unit to provide it. High value BTB entries have dedicated storage and are therefore less likely to be evicted than low value BTB entries.
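
The idea of separating entries by target width can be sketched as follows. This model assumes that a narrow entry stores only the low-order target bits and that the remaining upper bits are taken from the branch address; that reconstruction rule, the class `MultiTableBTB`, and the use of the full branch address as the tag are assumptions, not details from the abstract.

```python
class MultiTableBTB:
    """Hypothetical BTB with one table per stored target width (in bits)."""

    def __init__(self, widths):
        # Narrowest table first, so short targets use the cheapest entry.
        self.tables = [(width, {}) for width in sorted(widths)]

    def insert(self, branch_pc, target):
        """Store the target in the narrowest table that can hold it."""
        for _, table in self.tables:
            table.pop(branch_pc, None)  # drop any stale entry for this branch
        # Number of low-order bits in which target and branch address differ.
        needed = (branch_pc ^ target).bit_length()
        for width, table in self.tables:
            if needed <= width:
                table[branch_pc] = target & ((1 << width) - 1)
                return width
        return None  # target does not fit in any table

    def lookup(self, branch_pc):
        """Rebuild the predicted target from the stored low-order bits."""
        for width, table in self.tables:
            if branch_pc in table:
                mask = (1 << width) - 1
                return (branch_pc & ~mask) | table[branch_pc]
        return None


btb = MultiTableBTB(widths=[8, 32])
near = btb.insert(0x1000, 0x1040)        # nearby target fits the 8-bit table
near_target = btb.lookup(0x1000)
far = btb.insert(0x1000, 0x90000000)     # distant target needs the wide table
far_target = btb.lookup(0x1000)
```

Under these assumptions, short-range branches consume 8 bits of target storage instead of 32, which is how narrower tables allow more entries in the same area.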

Type: Grant

Filed: July 10, 2020

Date of Patent: August 16, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Thomas Clouqueur, Anthony Jarvis
