Patents Examined by Eric Coleman
  • Patent number: 11132195
    Abstract: The present application discloses a computing device and a neural network processor including the computing device. The computing device includes one or more columns of computing units arranged in an array, wherein at least one computing unit in each column comprises: an arithmetic parameter memory for storing one or more arithmetic parameters; an arithmetic logical unit (ALU) for receiving input data and performing computation on the input data using the one or more arithmetic parameters stored in the arithmetic parameter memory; and an address controller for providing an address control signal to the arithmetic parameter memory to control the storage and output of the one or more arithmetic parameters.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: September 28, 2021
    Assignee: MONTAGE TECHNOLOGY CO., LTD.
    Inventors: Peng Wang, Chunyi Li
  • Patent number: 11132199
    Abstract: A processor that includes a register file, a latency shifter, a decode unit and a plurality of functional units is introduced. The register file includes a write port. The latency shifter includes a plurality of shifter entries and shifts out a shifter entry among the shifter entries every clock cycle. Each of the shifter entries is associated with a clock cycle and each of shifter entries includes a writeback value that indicates whether the write port of the register file is available for a writeback operation in the associated clock cycles. The decode unit is configured to decode an instruction and issue the instruction according to the writeback value of the latency shifter. The functional units are coupled to the decode unit and the register file and are configured to execute the instruction issued by the decode unit and perform writeback operation to the write port of the register file.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: September 28, 2021
    Assignee: ANDES TECHNOLOGY CORPORATION
    Inventor: Thang Minh Tran
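
The latency-shifter idea in the abstract above can be illustrated with a minimal Python sketch. It assumes a single write port, one shifter entry per future clock cycle, and a Boolean flag per entry where `True` means the port is already reserved. The class name `LatencyShifter`, its methods, and the stall-on-conflict policy are hypothetical illustrations and are not taken from the patent.

```python
from collections import deque


class LatencyShifter:
    """Toy model: entry i says whether the write port is reserved i cycles from now."""

    def __init__(self, depth):
        # False = write port free in that cycle, True = already reserved.
        self.entries = deque([False] * depth)

    def tick(self):
        # One clock cycle passes: shift out the oldest entry, append a free one.
        self.entries.popleft()
        self.entries.append(False)

    def try_issue(self, latency):
        # Issue only if the write port is free in the cycle the result arrives.
        if self.entries[latency]:
            return False  # conflict: the decode stage would hold the instruction
        self.entries[latency] = True
        return True
```

In this sketch, two instructions with the same latency cannot issue in the same cycle, because the second one finds the writeback slot already reserved.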
  • Patent number: 11126430
    Abstract: A vector processor includes a grouping memory functional unit coupled to grouping memory having multiple bins. The vector processor also includes a bitformatting functional unit that performs bit-level data arrangements using any suitable technique or network, such as a Benes network. The vector processor receives and reads an input vector of data that includes portions (e.g., bits) of multiple data streams, and writes each portion corresponding to a respective data stream to a respective bin in parallel using the bitformatting functional unit to align the data. The vector processor also or alternatively receives and reads multiple outgoing data streams, writes portions of the data streams in respective bins of the grouping memory, and intersperses the portions in an outgoing vector of data in parallel, using the bitformatting functional unit to align the data.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: September 21, 2021
    Assignee: Intel Corporation
    Inventors: Parakalan Venkataraghavan, Thomas W. Smith, Silpa Naidu Chirumavilla, Ravi Shekhar
  • Patent number: 11126691
    Abstract: An apparatus is provided that receives a scalar start value, an adjust amount and wrapping control information, and includes vector generating circuitry for generating a vector comprising a plurality of elements such that a value of a first element is dependent on the scalar start value, and values of the plurality of elements follow a regularly progressing sequence that is constrained to wrap as required to ensure that each value is within bounds determined from the wrapping control information. The adjust amount is used to determine a difference between values of adjacent elements in the regularly progressing sequence. The vector generating circuitry has first adder circuitry for generating a plurality of first candidate values for the plurality of elements, assuming absence of a wrapping condition, and second adder circuitry for generating a plurality of second candidate values for the plurality of elements, assuming presence of a wrapping condition.
    Type: Grant
    Filed: June 23, 2020
    Date of Patent: September 21, 2021
    Assignee: Arm Limited
    Inventor: Jack William Derek Andrew
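
As a rough illustration of the wrapping sequence described in the abstract above, the following Python sketch computes both candidate values for each element and selects one. It assumes an incrementing sequence, bounds of the form `[0, buffer_size)`, and at most one wrap within the vector. The function name `wrapping_sequence` and its parameters are hypothetical and do not come from the patent.

```python
def wrapping_sequence(start, adjust, buffer_size, num_elements):
    """Return start, start+adjust, ... with each value wrapped into [0, buffer_size)."""
    if not 0 <= start < buffer_size:
        raise ValueError("start must lie within [0, buffer_size)")
    if adjust < 0 or (num_elements - 1) * adjust > buffer_size:
        raise ValueError("sketch assumes an incrementing sequence that wraps at most once")
    result = []
    for i in range(num_elements):
        no_wrap = start + i * adjust      # first candidate: assumes no wrap
        wrapped = no_wrap - buffer_size   # second candidate: assumes a wrap occurred
        result.append(no_wrap if no_wrap < buffer_size else wrapped)
    return result
```

For example, `wrapping_sequence(6, 2, 8, 4)` yields `[6, 0, 2, 4]`: the second element would reach 8, so the wrapped candidate is chosen from that point on.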
  • Patent number: 11126588
    Abstract: A processor circuit is disclosed. The processor circuit includes a data path block circuit configured to perform a data path operation to generate one or more results. The processor circuit also includes a data register files circuit, having a first register file, where the first register file has a first quantity of read and write ports. The data register files circuit also includes a second register file, where the second register file has a second different quantity of read and write ports. The processor circuit also includes an instruction decoder circuit configured to provide an operation signal to the data path block circuit, where the operation signal identifies a particular data path operation to be performed by the data path block circuit and identifies one or more read ports of the data register files circuit for retrieving data encoding the first and second operands.
    Type: Grant
    Filed: July 28, 2020
    Date of Patent: September 21, 2021
    Assignee: SHENZHEN GOODIX TECHNOLOGY CO., LTD.
    Inventor: Jaehoon Heo
  • Patent number: 11119770
    Abstract: Performing atomic store-and-invalidate operations in processor-based devices is disclosed. In this regard, a processing element (PE) of one or more PEs of a processor-based device includes a store-and-invalidate logic circuit used by a memory access stage of an execution pipeline of the PE to perform an atomic store-and-invalidate operation. Upon receiving an indication to perform a store-and-invalidate operation (e.g., in response to a store-and-invalidate instruction execution) comprising a store address and store data, the memory access stage uses the store-and-invalidate logic circuit to write the store data to a memory location indicated by the store address, and to invalidate an instruction cache line corresponding to the store address in an instruction cache of the PE.
    Type: Grant
    Filed: July 26, 2019
    Date of Patent: September 14, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Thomas Philip Speier, Eric Francis Robinson
  • Patent number: 11120357
    Abstract: In a general aspect, a computing system is configured to execute a quantum approximate optimization algorithm. In some aspects, a control system identifies a pair of qubit devices in a quantum processor. The quantum processor includes a connection that provides coupling between the pair of qubit devices. ZZ coupling between the pair of qubit devices is activated to execute a cost function defined in the quantum approximate optimization algorithm. The cost function is associated with a maximum cut problem, and the ZZ coupling is activated by allowing the pair of qubits to evolve under a natural Hamiltonian for a time period τ. One or more of the pair of qubit devices is measured to obtain an output from an execution of the quantum approximate optimization algorithm.
    Type: Grant
    Filed: March 7, 2018
    Date of Patent: September 14, 2021
    Assignee: Rigetti & Co, Inc.
    Inventors: William J. Zeng, Nicholas C. Rubin, Matthew J. Reagor, Michael Justin Gerchick Scheer
  • Patent number: 11119773
    Abstract: A method for performing a quantum-logic operation on a quantum computer. The method includes enacting classical pebbling on an initial computation graph G defining the quantum-logic operation; extracting a quantum circuit B based on a sequence of steps obtained from the classical pebbling, that sequence including at least one computation step and at least one measurement-based uncomputation step; executing the quantum circuit B on a qubit register of the quantum computer; recording at least one measurement result of the at least one measurement-based uncomputation step of the quantum circuit B as executed on the qubit register; constructing a clean-up computation graph G′ based on the at least one measurement result; enacting reversible pebbling on the clean-up computation graph G′; extracting a quantum circuit B′ based on a sequence of steps obtained from the reversible pebbling, that sequence including computation and uncomputation steps; and executing the quantum circuit B′ on the qubit register.
    Type: Grant
    Filed: April 9, 2020
    Date of Patent: September 14, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mathias Soeken, Martin Henri Roetteler, Krysta Marie Svore
  • Patent number: 11119769
    Abstract: A method for changing a processor instruction randomly, covertly, and uniquely, so that the reverse process can restore it faithfully to its original form, making it virtually impossible for a malicious user to know how the bits are changed, preventing them from using a buffer overflow attack to write code with the same processor instruction changes into said processor's memory with the goal of taking control of the processor. When the changes are reversed prior to the instruction being executed, reverting the instruction back to its original value, malicious code placed in memory will be randomly altered so that when it is executed by the processor it produces chaotic, random behavior that will not allow control of the processor to be compromised, eventually producing a processing error that will cause the processor to either shut down the software process where the code exists to reload, or reset.
    Type: Grant
    Filed: February 17, 2020
    Date of Patent: September 14, 2021
    Inventor: Forrest L. Pierson
  • Patent number: 11113233
    Abstract: Systems and methods are provided to enable parallelized multiply-accumulate operations in a systolic array. Each row of the systolic array can include multiple busses enabling independent transmission of inputs along the respective bus. Each processing element of a given row-oriented bus can receive an input from a prior element of the given row-oriented bus, and perform arithmetic operations on the input. The systolic array can be divided into a plurality of sub-arrays corresponding to a row-oriented bus where each sub-array is separated by a shifter. Each shifter can shift a row-oriented bus into the active bus position for a given sub-array. Use of row-oriented busses can enable parallelization to increase speed or enable increased latency at individual processing elements.
    Type: Grant
    Filed: June 29, 2020
    Date of Patent: September 7, 2021
    Assignee: Amazon Technologies, Inc.
    Inventor: Thomas A Volpe
  • Patent number: 11106462
    Abstract: A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values are sorted in an order indicated by the vector sort instruction, and storing the sorted vector in a storage location.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: August 31, 2021
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Timothy David Anderson, Mujibur Rahman
  • Patent number: 11106465
    Abstract: Vector add-with-carry instructions are described which use some elements of a destination vector register, or corresponding fields of a predicate register, to provide the carry information corresponding to results of an add-with-carry operation. This is useful for accelerating computations involving multiplications of long integer values.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: August 31, 2021
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Nigel John Stephens, Neil Burgess, Grigorios Magklis
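
To show why carry information matters for long integer arithmetic, as the abstract above notes, here is a minimal Python sketch of multi-limb addition built from an add-with-carry step. It assumes 64-bit limbs stored least-significant first and processes them one at a time. It does not model the vector or predicate register layout described in the patent, and the names `add_with_carry` and `add_long` are hypothetical.

```python
LIMB_BITS = 64
MASK = (1 << LIMB_BITS) - 1


def add_with_carry(a, b, carry_in=0):
    """Add two limbs plus a carry-in; return (sum limb, carry-out)."""
    total = a + b + carry_in
    return total & MASK, total >> LIMB_BITS


def add_long(x_limbs, y_limbs):
    """Add two equal-length long integers given as little-endian limb lists."""
    if len(x_limbs) != len(y_limbs):
        raise ValueError("sketch assumes operands with the same number of limbs")
    carry = 0
    out = []
    for a, b in zip(x_limbs, y_limbs):
        limb, carry = add_with_carry(a, b, carry)
        out.append(limb)
    return out, carry
```

Long multiplication accumulates many partial products with chains of additions like this one, which is where a hardware add-with-carry operation would be expected to help.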
  • Patent number: 11093243
    Abstract: Vector interleaving techniques in a data processing apparatus are disclosed, comprising apparatuses, instructions, methods of operating the apparatuses, and simulator implementations. A vector interleaving instruction specifies a first source register, second source register, and destination register. A first set of input data items is retrieved from the first source register and a second set of input data items from the second source register. A data processing operation is performed on selected input data item pairs taken from the first and second set of input data items to generate a set of result data items, which are stored as a result data vector in the destination register. First source register dependent result data items are stored in a first set of alternating positions in the destination data vector and second source register dependent result data items are stored in a second set of alternating positions in the destination data vector.
    Type: Grant
    Filed: July 2, 2018
    Date of Patent: August 17, 2021
    Assignee: ARM Limited
    Inventors: Mbou Eyole, Nigel John Stephens
  • Patent number: 11086574
    Abstract: A circuit that includes a plurality of array cores, each array core of the plurality of array cores comprising: a plurality of distinct data processing circuits; and a data queue register file; a plurality of border cores, each border core of the plurality of border cores comprising: at least a register file, wherein: [i] at least a subset of the plurality of border cores encompasses a periphery of a first subset of the plurality of array cores; and [ii] a combination of the plurality of array cores and the plurality of border cores define an integrated circuit array.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: August 10, 2021
    Assignee: quadric.io, Inc.
    Inventors: Nigel Drego, Aman Sikka, Mrinalini Ravichandran, Ananth Durbha, Robert Daniel Firu, Veerbhan Kheterpal
  • Patent number: 11086627
    Abstract: A system is provided that includes an instruction buffer that stores bytes representative of one or more macroinstructions and instruction length decoder circuitry. The instruction length decoder circuitry includes a non-sequential first multiplexer circuitry having first input lines receiving a first input data representative of a speculative length of a first macroinstruction of the macroinstructions, and a first selector that selects from the first input lines via a one-hot selector vector. The instruction length decoder circuitry also includes a first output line communicatively coupled to a second selector, wherein the first output line causes the second selector to select from a second input data representative of a first location of a first ending byte for the first macroinstruction with respect to a value x. The first multiplexer circuitry and the second selector may output start and end byte locations for the macroinstructions.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: August 10, 2021
    Assignee: Intel Corporation
    Inventors: Nir Tell, Shahar Sandor, Amotz Yagev, Michael Hermony, Sagie Yakov Goldenberg, Lihu Rappoport
  • Patent number: 11080048
    Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade
  • Patent number: 11068269
    Abstract: Systems and methods for instruction decoding using hash tables. An example method of constructing a decoding tree comprises: generating an aggregated vector of differentiating bit scores representing at least a subset of a set of processor instructions; identifying, based on the aggregated vector of differentiating bit scores, one or more opcode bit positions; and constructing a hash table implementing a current level of a decoding tree representing the subset of the set of processor instructions, wherein the hash table is indexed by one or more opcode bits identified by the one or more opcode bit positions.
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: July 20, 2021
    Assignee: Parallels International GmbH
    Inventors: Alexey Koryakin, Nikolay Dobrovolskiy
  • Patent number: 11068271
    Abstract: A system and method for reducing the latency of data move operations. A register rename unit within a processor determines whether a decoded move instruction qualifies for a zero cycle move operation. If so, control logic assigns a physical register identifier associated with a source operand of the move instruction to the destination operand of the move instruction. Additionally, the register rename unit marks the given move instruction to prevent it from proceeding in the processor pipeline. Further maintenance of the particular physical register identifier may be done by the register rename unit during commit of the given move instruction.
    Type: Grant
    Filed: July 28, 2014
    Date of Patent: July 20, 2021
    Assignee: Apple Inc.
    Inventor: Shyam Sundar
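
The zero-cycle move described in the abstract above can be sketched as a rename-map update in Python. This is a simplified model under stated assumptions: reference counts track how many architectural registers share a physical register, and registers are released at rename time rather than at commit, which real hardware would not do. The class `RenameMap` and its methods are hypothetical names, not the patent's design.

```python
class RenameMap:
    """Toy architectural-to-physical register map with sharing for moves."""

    def __init__(self, num_arch_regs, num_phys_regs):
        self.mapping = {r: r for r in range(num_arch_regs)}
        self.refcount = {p: 1 for p in range(num_arch_regs)}
        self.free_list = list(range(num_arch_regs, num_phys_regs))

    def _release(self, phys):
        # Drop one reference; return the register to the free list when unused.
        self.refcount[phys] -= 1
        if self.refcount[phys] == 0:
            del self.refcount[phys]
            self.free_list.append(phys)

    def rename_write(self, dst):
        # Ordinary instruction: the destination gets a fresh physical register.
        phys = self.free_list.pop(0)
        old = self.mapping[dst]
        self.mapping[dst] = phys
        self.refcount[phys] = 1
        self._release(old)
        return phys

    def rename_move(self, dst, src):
        # Zero-cycle move: dst shares the physical register of src, no execution.
        phys = self.mapping[src]
        old = self.mapping[dst]
        if old == phys:
            return phys
        self.refcount[phys] += 1
        self.mapping[dst] = phys
        self._release(old)
        return phys
```

After `rename_move(1, 0)`, architectural registers 0 and 1 name the same physical register, and a later write to register 0 leaves register 1 still pointing at the old value.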
  • Patent number: 11054890
    Abstract: Techniques to control power and processing among a plurality of asymmetric processing elements are disclosed. In one embodiment, one or more asymmetric processing elements are power managed to migrate processes or threads among a plurality of processing elements according to the performance and power needs of the system.
    Type: Grant
    Filed: July 31, 2013
    Date of Patent: July 6, 2021
    Assignee: Intel Corporation
    Inventors: Herbert Hum, Eric Sprangle, Doug Carmean, Rajesh Kumar
  • Patent number: 11048510
    Abstract: Systems, methods, and apparatuses for executing an instruction are described. In some embodiments, the instruction includes at least an opcode, a field for a packed data source operand, and a field for a packed data destination operand. When executed, the instruction causes, for each data element position of the source operand, the value stored in that data element position to be multiplied by all values stored in preceding data element positions of the packed data source operand, and the result of the multiplication to be stored in the corresponding data element position of the packed data destination operand.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: June 29, 2021
    Assignee: Intel Corporation
    Inventors: William M. Brown, Elmoustapha Ould-Ahmed-Vall
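
The operation in the abstract above amounts to an inclusive prefix product over the packed elements. A minimal Python sketch follows, assuming unbounded integers, so lane width and overflow behavior are not modeled. The function name `packed_prefix_product` is hypothetical.

```python
from itertools import accumulate
from operator import mul


def packed_prefix_product(src):
    """Element i of the result is the product of src[0] through src[i]."""
    return list(accumulate(src, mul))
```

For a source of `[1, 2, 3, 4]` the destination would hold `[1, 2, 6, 24]`.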