Patents Examined by Corey S Faherty
  • Patent number: 11544070
    Abstract: The present disclosure is directed to systems and methods for mitigating or eliminating the effectiveness of a side-channel based attack, such as one or more classes of an attack commonly known as Spectre. Novel instruction prefixes, and in certain embodiments one or more corresponding instruction prefix parameters, may be provided to enforce a serialized order of execution for particular instructions without serializing an entire instruction flow, thereby improving performance and mitigation reliability over existing solutions. In addition, improved mitigation of such attacks is provided by randomizing both the execution branch history as well as the source address of each vulnerable indirect branch, thereby eliminating the conditions required for such attacks.
    Type: Grant
    Filed: July 28, 2021
    Date of Patent: January 3, 2023
    Assignee: Intel Corporation
    Inventors: Rodrigo Branco, Kekai Hu, Ke Sun, Henrique Kawakami
  • Patent number: 11537395
    Abstract: This application relates to a method for optimizing algorithm performance using precision scaling, wherein the method according to an embodiment of present invention comprises obtaining a number of iterations of a unit operation according to precisions of the algorithm including the unit operation that is repeatedly performed, wherein the precisions include a first precision and a second precision, and the number of iterations include a first number of iterations corresponding to the first precision and a second number of iterations corresponding to the second precision; inspecting available precisions of a device on which the algorithm is to be executed, wherein the available precisions include a first available precision corresponding to the first precision and a second available precision corresponding to the second precision; determining an optimal precision by repeatedly performing the unit operation corresponding to an initial operation of the algorithm using the inspected available precision; and repea
    Type: Grant
    Filed: November 4, 2021
    Date of Patent: December 27, 2022
    Assignee: Industry-University Cooperation Foundation Hanyang University
    Inventors: Yongjun Park, Seokwon Kang, Sang Wook Kim, Hong Kyun Bae, Jae Seo Yu, Kyunghwan Choi
  • Patent number: 11537397
    Abstract: Systems, apparatuses, and methods for efficiently sharing registers among threads are disclosed. A system includes at least a processor, control logic, and a register file with a plurality of registers. The processor assigns a base set of registers to each thread of a plurality of threads executing on the processor. When a given thread needs more than the base set of registers to execute a given phase of program code, the given thread executes an acquire instruction to acquire exclusive access to an extended set of registers from a shared resource pool. When the given thread no longer needs additional registers, the given thread executes a release instruction to release the extended set of registers back into the shared register pool for other threads to use. In one implementation, the compiler inserts acquire and release instructions into the program code based on a register liveness analysis performed during compilation.
    Type: Grant
    Filed: March 26, 2018
    Date of Patent: December 27, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Farzad Khorasani, Amin Farmahini-Farahani, Nuwan S. Jayasena
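The acquire/release scheme in the abstract above can be sketched as a small software model. This is only an illustration under stated assumptions: the class `RegisterPool`, its method names, and the choice to return `None` when no extended set is free are hypothetical and are not taken from the patent, which describes hardware instructions rather than a Python API.

```python
from typing import Dict, List, Optional


class RegisterPool:
    """Toy model of a shared pool of extended register sets (hypothetical names)."""

    def __init__(self, num_extended_sets: int) -> None:
        self.free: List[int] = list(range(num_extended_sets))
        self.owner: Dict[int, int] = {}  # thread id -> extended set id

    def acquire(self, thread_id: int) -> Optional[int]:
        """Give the thread exclusive access to one extended set, if any is free."""
        if thread_id in self.owner:
            return self.owner[thread_id]
        if not self.free:
            return None  # assumption: the caller retries or stalls
        set_id = self.free.pop(0)
        self.owner[thread_id] = set_id
        return set_id

    def release(self, thread_id: int) -> None:
        """Return the thread's extended set to the pool for other threads."""
        set_id = self.owner.pop(thread_id, None)
        if set_id is not None:
            self.free.append(set_id)
```

In this model a thread that finds the pool empty simply gets `None`; what the hardware does in that case is not stated in the abstract.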
  • Patent number: 11531922
    Abstract: An apparatus and method for scalable qubit addressing. For example, one embodiment of a processor comprises: a decoder comprising quantum instruction decode circuitry to decode quantum instructions to generate quantum microoperations (uops) and non-quantum decode circuitry to decode non-quantum instructions to generate non-quantum uops; execution circuitry comprising: an address generation unit (AGU) to generate a system memory address responsive to execution of one or more of the non-quantum uops; and quantum index generation circuitry to generate quantum index values responsive to execution of one or more of the quantum uops, each quantum index value uniquely identifying a quantum bit (qubit) in a quantum processor; wherein to generate a first quantum index value for a first quantum uop, the quantum index generation circuitry is to read the first quantum index value from a first architectural register identified by the first quantum uop.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: December 20, 2022
    Assignee: Intel Corporation
    Inventor: Xiang Zou
  • Patent number: 11520586
    Abstract: A renaming unit configured to rename source operands of instructions in a group. A renaming register maintains architectural to physical register mappings. Architectural to physical register mappings propagate from the renaming register through a chain of update units (U) over bus lines denoted with the architectural registers 0 to L. Update units (U) sequentially, in program order, insert physical register identifiers PR(i) allocated to instructions I(i) with destination operands DOP(i) on bus lines denoted with the destination operands DOP(i). Source operands of an instruction I(i) may be renamed to physical register identifiers after physical register identifiers allocated to instructions older than I(i) are sequentially, in program order, inserted on the bus lines, but before physical register identifiers allocated to I(i) and younger instructions are inserted on the bus lines. A source operand SOP(i) is renamed to a physical register identifier that propagates on a bus line denoted with SOP(i).
    Type: Grant
    Filed: July 8, 2021
    Date of Patent: December 6, 2022
    Inventor: Dejan Spasov
  • Patent number: 11507520
    Abstract: In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.
    Type: Grant
    Filed: March 1, 2021
    Date of Patent: November 22, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Duc Quang Bui, Joseph Raymond Michael Zbiciak
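The predicate-controlled loop described in the abstract above can be modeled roughly as follows. The helper `vector_stream`, the function `sum_valid`, the lane width of 4, and the zero padding of invalid lanes are all assumptions made for illustration; the abstract only states that the loop repeats while the predicate marks at least one valid element and exits when it marks none.

```python
from typing import Iterator, List, Tuple

Vector = List[int]
Predicate = List[bool]


def vector_stream(data: List[int], width: int) -> Iterator[Tuple[Vector, Predicate]]:
    """Yield (data vector, predicate) pairs; lanes past the end of the data are invalid."""
    pos = 0
    while True:
        chunk = data[pos:pos + width]
        valid = len(chunk)
        vector = chunk + [0] * (width - valid)  # pad invalid lanes (assumption)
        predicate = [True] * valid + [False] * (width - valid)
        yield vector, predicate
        pos += width


def sum_valid(data: List[int], width: int = 4) -> Tuple[int, int]:
    """Sum all valid elements; return (total, number of loop iterations executed)."""
    total = 0
    iterations = 0
    for vector, predicate in vector_stream(data, width):
        if not any(predicate):
            break  # no valid elements: exit the loop
        total += sum(v for v, p in zip(vector, predicate) if p)
        iterations += 1  # at least one valid element: repeat
    return total, iterations
```

The point of the design is that the loop needs no separate trip counter: the predicate that masks the last partial vector also signals termination.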
  • Patent number: 11500642
    Abstract: Provided is a method for assigning register tags to instructions at issue time. The method comprises receiving an instruction for execution by a microprocessor. The method further comprises dispatching the instruction to an issue queue without assigning a register tag to the instruction. The method further comprises determining that the instruction is ready to issue. In response to determining that the instruction is ready to issue, the method comprises assigning an available register tag to the instruction. The method further comprises issuing the instruction.
    Type: Grant
    Filed: November 10, 2020
    Date of Patent: November 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Steven J. Battle, Jentje Leenstra, Brian D. Barrick, Dung Q. Nguyen, Brian W. Thompto
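The steps in the abstract above (dispatch without a tag, detect readiness, assign a tag, issue) can be sketched as a toy queue. The names `Instruction`, `IssueQueue`, and `issue_ready`, and the rule that a ready instruction waits when no tag is free, are hypothetical choices for this sketch rather than details from the patent.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Instruction:
    name: str
    ready: bool = False
    tag: Optional[int] = None  # no register tag is assigned at dispatch


class IssueQueue:
    """Toy issue queue that assigns register tags only at issue time."""

    def __init__(self, num_tags: int) -> None:
        self.free_tags: Deque[int] = deque(range(num_tags))
        self.entries: List[Instruction] = []

    def dispatch(self, instr: Instruction) -> None:
        # The instruction enters the queue without a register tag.
        self.entries.append(instr)

    def issue_ready(self) -> List[Instruction]:
        """Assign a free tag to each ready instruction and issue it."""
        issued: List[Instruction] = []
        waiting: List[Instruction] = []
        for instr in self.entries:
            if instr.ready and self.free_tags:
                instr.tag = self.free_tags.popleft()
                issued.append(instr)
            else:
                waiting.append(instr)  # not ready, or no tag available (assumption)
        self.entries = waiting
        return issued
```

Because instructions that are still waiting hold no tag, the tag pool in this model is consumed only by instructions that can actually execute.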
  • Patent number: 11500636
    Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations.
    Type: Grant
    Filed: February 24, 2020
    Date of Patent: November 15, 2022
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Joseph Nuzman, Jonas Svennebring, Doddaballapur N. Jayasimha, Samantika S. Sury, David A. Koufaty, Niall D. McDonnell, Yen-Cheng Liu, Stephen R. Van Doren, Stephen J. Robinson
  • Patent number: 11487541
    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: November 1, 2022
    Assignee: Intel Corporation
    Inventors: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson
  • Patent number: 11481215
    Abstract: The present disclosure provides a computing method that is applied to a computing device. The computing device includes: a memory, a register unit, and a matrix computing unit. The method includes the following steps: controlling, by the computing device, the matrix computing unit to obtain a first operation instruction, where the first operation instruction includes a matrix reading instruction for a matrix required for executing the instruction; controlling, by the computing device, an operating unit to send a reading command to the memory according to the matrix reading instruction; and controlling, by the computing device, the operating unit to read a matrix corresponding to the matrix reading instruction in a batch reading manner, and executing the first operation instruction on the matrix. The technical solutions in the present disclosure have the advantages of fast computing speed and high efficiency.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: October 25, 2022
    Assignee: Cambricon (Xi'an) Semiconductor Co., Ltd.
    Inventors: Tianshi Chen, Shaoli Liu, Zai Wang, Shuai Hu
  • Patent number: 11481384
    Abstract: An apparatus is provided comprising storage elements to store data blocks, where each data block has capability metadata associated therewith identifying whether the data block specifies a capability, at least one capability type being a bounded pointer. Processing circuitry is then arranged to be responsive to a bulk capability metadata operation identifying a plurality of the storage elements, to perform an operation on the capability metadata associated with each data block stored in the plurality of storage elements. Via a single specified operation, this hence enables query and/or modification operations to be performed on multiple items of capability metadata, hence providing more efficient access to such capability metadata.
    Type: Grant
    Filed: March 29, 2017
    Date of Patent: October 25, 2022
    Assignee: Arm Limited
    Inventors: Graeme Peter Barnes, Stuart David Biles
  • Patent number: 11461105
    Abstract: Methods and systems are disclosed using an execution pipeline on a multi-processor platform for deep learning network execution. In one example, a network workload analyzer receives a workload, analyzes a computation distribution of the workload, and groups the network nodes into groups. A network executor assigns each group to a processing core of the multi-core platform so that the respective processing core handle computation tasks of the received workload for the respective group.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: October 4, 2022
    Assignee: Intel Corporation
    Inventors: Liu Yang, Anbang Yao
  • Patent number: 11461095
    Abstract: The present disclosure relates to a method of storing, by a load and store circuit or other processing means, a variable precision floating point value to a memory address of a memory, the method comprising: reducing the bit length of the variable precision floating point value to no more than a size limit, and storing the variable precision floating point value to one of a plurality of storage zones in the memory, each of the plurality of storage zones having a storage space equal to or greater than the size limit (MBB).
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: October 4, 2022
    Assignees: Commissariat à l'Energie Atomique et aux Energies Alternatives, Institut National des Sciences Appliquées de Lyon
    Inventors: Andrea Bocco, Florent Dupont De Dinechin, Yves Durand
  • Patent number: 11455167
    Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: September 27, 2022
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Thierry Pons, Amit Gradstein, Zeev Sperber, Mark J. Charney, Robert Valentine, Eyal Oz-Sinay
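The two-step scheme in the abstract above can be checked with a short numerical sketch. It assumes complex vectors are stored interleaved, with real parts in even positions and imaginary parts in odd positions, and that vectors have even length; the function name `complex_fma` is hypothetical. Each step here uses an ordinary multiply followed by an add, so it does not reproduce the single rounding of a hardware FMA.

```python
from typing import List


def complex_fma(a: List[float], b: List[float], c: List[float]) -> List[float]:
    """Compute a*b + c on interleaved complex vectors using two FMA-style passes."""
    n = len(a)

    # Pass 1: duplicate even elements of the multiplicand, then temp = a * b_even + c.
    b_even = [b[i - (i % 2)] for i in range(n)]
    temp = [a[i] * b_even[i] + c[i] for i in range(n)]

    # Pass 2: duplicate odd elements of the multiplicand and swap even/odd
    # elements of the multiplier; negate the product in even positions.
    b_odd = [b[i - (i % 2) + 1] for i in range(n)]
    a_swapped = [a[i ^ 1] for i in range(n)]
    result = []
    for i in range(n):
        product = a_swapped[i] * b_odd[i]
        if i % 2 == 0:
            product = -product  # even product negated
        result.append(product + temp[i])
    return result
```

For one complex element with multiplier `ar + ai*j` and multiplicand `br + bi*j`, pass 1 yields `ar*br + cr` and `ai*br + ci`, and pass 2 adds `-ai*bi` and `ar*bi`, which together form the usual complex multiply-add.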
  • Patent number: 11449341
    Abstract: The invention relates to an electronic device for data processing, which includes an execution unit with a temporary register, a register file, a first feedback path from the data output of the execution unit to the register file, a second feedback path from the data output of the execution unit to the temporary register, a switch configured to connect the first feedback path and/or the second feedback path, and a logic stage coupled to control the switch. The control stage is configured to control the switch to connect the second feedback path if the data output of an execution unit is used as an operand in the subsequent operation of an execution unit.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: September 20, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Marko Krüger, Steven Bartling, Markus Kösler
  • Patent number: 11436015
    Abstract: A digital data processor includes a multi-stage butterfly network, which is configured to, in response to a look up table read instruction, receive look up table data from an intermediate register, reorder the look up table data based on control signals comprising look up table configuration register data, and write the reordered look up table data to a destination register specified by the look up table read instruction.
    Type: Grant
    Filed: September 13, 2019
    Date of Patent: September 6, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Naveen Bhoria, Duc Bui, Dheera Balasubramanian Samudrala, Rama Venkatasubramanian
  • Patent number: 11436186
    Abstract: An algorithmic matching pipelined compiler and a reusable algorithmic pipelined core comprise a high throughput processor system. The reusable algorithmic pipelined core is a reconfigurable processing core with a pipelined structure comprising a processor with a setup interface for programming any of a plurality of operations as determined by setup data, a logic decision processor for programming a look up table, a loop counter and a constant register, and a block of memory. This can be used to perform functions. A reconfigurable, programmable circuit routes data and results from one core to another core and/or IO controller and/or interrupt generator, as required to complete an algorithm without further intervention from a central or peripheral processor during processing of an algorithm.
    Type: Grant
    Filed: June 22, 2018
    Date of Patent: September 6, 2022
    Assignee: ICAT LLC
    Inventors: Robert D Catiller, Daniel Roig, Gnanashanmugam Elumalai
  • Patent number: 11422969
    Abstract: This disclosure relates to a distributed processing system for configuring multiple processing channels. The distributed processing system includes a main processor, such as an ARM processor, communicatively coupled to a plurality of co-processors, such as stream processors. The co-processors can execute instructions in parallel with each other and interrupt the ARM processor. Longer latency instructions can be executed by the main processor and lower latency instructions can be executed by the co-processors. There are several ways that a stream can be triggered in the distributed processing system. In an embodiment, the distributed processing system is a stream processor system that includes an ARM processor and stream processors configured to access different register sets. The stream processors can include a main stream processor and stream processors in respective transmit and receive channels. The stream processor system can be implemented in a radio system to configure the radio for operation.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: August 23, 2022
    Assignee: Analog Devices, Inc.
    Inventors: Manish J. Manglani, Shipra Bhal, Christopher Mayer
  • Patent number: 11416259
    Abstract: Disclosed herein are system, method, and computer program product embodiments for utilizing look-ahead-staging (LAS) to guarantee the ability to rollback and reconstruct a package while minimizing locking duration and enabling multiple packages to be processed in a data pipeline simultaneously. An embodiment operates by receiving a package from a source system for processing through a data pipeline. The embodiment stores the package in a persistent storage together with a respective package status. The embodiment transmits the package to the data pipeline in response to the storing. The embodiment receives a commit notification for the package from a target system in response to the transmitting. The embodiment then removes the package from the persistent storage in response to receiving the commit notification for the package.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: August 16, 2022
    Assignee: SAP SE
    Inventors: Daniel Bos, Dan Liu, Tobias Karpstein
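The store, transmit, commit, remove sequence in the abstract above can be sketched as follows. This is a minimal model, not the described system: an in-memory dictionary stands in for the persistent storage, and the class `StagingArea`, the status strings `"staged"` and `"in_flight"`, and the `uncommitted` helper are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class StagingArea:
    """Toy look-ahead staging: keep each package until the target commits it."""

    def __init__(self, send_to_pipeline: Callable[[str, bytes], None]) -> None:
        # A dict stands in for persistent storage (assumption for this sketch).
        self._store: Dict[str, Tuple[bytes, str]] = {}
        self._send = send_to_pipeline

    def receive(self, package_id: str, payload: bytes) -> None:
        """Store the package with a status first, then pass it to the pipeline."""
        self._store[package_id] = (payload, "staged")
        self._send(package_id, payload)
        self._store[package_id] = (payload, "in_flight")

    def on_commit(self, package_id: str) -> None:
        """Remove the package once the target system confirms the commit."""
        self._store.pop(package_id, None)

    def uncommitted(self) -> List[str]:
        """Packages that could still be reconstructed after a rollback."""
        return sorted(self._store)
```

Since a package is written to the store before it is transmitted, several packages can be in flight at once while each remains recoverable until its commit notification arrives.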
  • Patent number: 11416253
    Abstract: A processor includes two or more branch target buffer (BTB) tables for branch prediction, each BTB table storing entries of a different target size or width or storing entries of a different branch type. Each BTB entry includes at least a tag and a target address. For certain branch types that only require a few target address bits, the respective BTB tables are narrower thereby allowing for more BTB entries in the processor separated into respective BTB tables by branch instruction type. An increased number of available BTB entries are stored in a same or a less space in the processor thereby increasing a speed of instruction processing. BTB tables can be defined that do not store any target address and rely on a decode unit to provide it. High value BTB entries have dedicated storage and are therefore less likely to be evicted than low value BTB entries.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: August 16, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Thomas Clouqueur, Anthony Jarvis