Patents Examined by Jyoti Mehta
  • Patent number: 11604852
    Abstract: A signal processing apparatus comprises an operation processing part that performs operation processing on data represented in the two's complement representation and a storage processing part that performs storage processing on data represented in a second representation format. In the second representation format, a data value is identical to its two's complement representation when the value is positive or zero; when the value is negative, all the bits lower than the most significant bit, which indicates the sign in the two's complement representation, are inverted.
    Type: Grant
    Filed: December 26, 2018
    Date of Patent: March 14, 2023
    Assignee: NEC CORPORATION
    Inventor: Atsufumi Shibayama
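    Illustrative sketch: a minimal Python model (not from the patent; the 4-bit width and function names are assumptions) of converting between two's complement and the second representation format described in the abstract above.
      def twos_complement_bits(value: int, width: int) -> int:
          """Return the unsigned bit pattern of value in width-bit two's complement."""
          return value & ((1 << width) - 1)

      def to_second_format(value: int, width: int) -> int:
          """Positive or zero values are unchanged; for negative values, every bit below the sign bit is inverted."""
          pattern = twos_complement_bits(value, width)
          if value < 0:
              pattern ^= (1 << (width - 1)) - 1   # invert all bits below the sign bit
          return pattern

      def from_second_format(pattern: int, width: int) -> int:
          """Convert a second-representation bit pattern back to a signed integer."""
          sign_bit = 1 << (width - 1)
          if pattern & sign_bit:
              pattern ^= sign_bit - 1             # undo the inversion of the lower bits
          return pattern - (1 << width) if pattern & sign_bit else pattern

      for v in (5, 0, -1, -3, -8):
          p = to_second_format(v, 4)
          print(f"{v:3d} -> {p:04b} -> {from_second_format(p, 4)}")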
  • Patent number: 11593111
    Abstract: An apparatus and method are provided for inhibiting instruction manipulation. The apparatus has execution circuitry for performing data processing operations in response to a sequence of instructions from an instruction set, and decoder circuitry for decoding each instruction in the sequence in order to generate control signals for the execution circuitry. Each instruction comprises a plurality of instruction bits, and the decoder circuitry is arranged to perform a decode operation on each instruction to determine from the value of each instruction bit, and knowledge of the instruction set, the control signals to be issued to the execution circuitry in response to that instruction. An input path to the decoder circuitry comprises a set of wires over which the instruction bits of each instruction are provided.
    Type: Grant
    Filed: January 27, 2020
    Date of Patent: February 28, 2023
    Assignee: Arm Limited
    Inventors: Frederic Jean Denis Arsanto, Carlo Dario Fanara, Luca Scalabrino, Jean Sébastien Leroy
  • Patent number: 11544065
    Abstract: A processor includes a front-end with an instruction set that operates at a first bit width and a floating point unit coupled to receive the instruction set in the processor that operates at the first bit width. The floating point unit operates at a second bit width and, based upon a bit width assessment of the instruction set provided to the floating point unit, the floating point unit employs a shadow-latch configured floating point register file to perform bit width reconfiguration. The shadow-latch configured floating point register file includes a plurality of regular latches and a plurality of shadow latches for storing data that is to be either read from or written to the shadow latches. The bit width reconfiguration enables the floating point unit that operates at the second bit width to operate on the instruction set received at the first bit width.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: January 3, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Arun A. Nair, Todd Baumgartner, Michael Estlick, Erik Swanson
  • Patent number: 11526360
    Abstract: A processor comprises a processor pipeline comprising one or more execution units configured to execute branch instructions, a branch predictor associated with the processor pipeline and configured to predict a branch instruction outcome, and a branch prediction unit. The branch predictor is turned off to save power and avoid mispredictions when the accuracy of the branch predictor and/or branch prediction unit is lower than expected.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: December 13, 2022
    Assignee: International Business Machines Corporation
    Inventors: Naga P. Gorti, Dave S. Levitan
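    Illustrative sketch: a hypothetical Python model of the gating idea in the abstract above; the accuracy window and threshold values are assumptions, not taken from the patent.
      from typing import Optional

      class GatedBranchPredictor:
          def __init__(self, accuracy_threshold: float = 0.6, window: int = 64):
              self.threshold = accuracy_threshold
              self.window = window
              self.outcomes = []        # 1 = correct prediction, 0 = misprediction
              self.enabled = True

          def record_resolution(self, predicted_taken: bool, actually_taken: bool) -> None:
              self.outcomes.append(1 if predicted_taken == actually_taken else 0)
              if len(self.outcomes) > self.window:
                  self.outcomes.pop(0)
              accuracy = sum(self.outcomes) / len(self.outcomes)
              # Turn the predictor off (saving power and avoiding mispredictions) when
              # measured accuracy drops below the threshold; re-enable when it recovers.
              self.enabled = accuracy >= self.threshold

          def predict(self) -> Optional[bool]:
              # None means "no prediction": the pipeline falls back to its default
              # behaviour (e.g. fetch sequentially) while the predictor is off.
              return True if self.enabled else None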
  • Patent number: 11513806
    Abstract: Techniques are provided for vectorizing Heapsort. A K-heap is used as the underlying data structure for indexing values being sorted. The K-heap is vectorized by storing values in a contiguous memory array with a beginning-most side and an end-most side. The vectorized Heapsort utilizes horizontal aggregation SIMD instructions for comparisons, shuffling, and moving data. Thus, the number of comparisons required to find the maximum or minimum key value within a single node of the K-heap is reduced, resulting in faster retrieval operations.
    Type: Grant
    Filed: April 9, 2021
    Date of Patent: November 29, 2022
    Assignee: Oracle International Corporation
    Inventors: Benjamin Schlegel, Pit Fender, Harshad Kasture, Matthias Brantner, Hassan Chafi
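    Illustrative sketch: a scalar Python Heapsort over a K-ary heap in one contiguous array. The patent accelerates the per-node maximum/minimum search with horizontal-aggregation SIMD instructions; here that search is an ordinary max() over each node's K children.
      def k_heapsort(values, k=4):
          a = list(values)
          n = len(a)

          def sift_down(i, end):
              while True:
                  first_child = k * i + 1
                  if first_child >= end:
                      return
                  # Find the largest child: the comparison step the patent vectorizes.
                  largest = max(range(first_child, min(first_child + k, end)),
                                key=a.__getitem__)
                  if a[largest] <= a[i]:
                      return
                  a[i], a[largest] = a[largest], a[i]
                  i = largest

          # Build a max K-heap, then repeatedly move the maximum to the end.
          for i in range((n - 2) // k, -1, -1):
              sift_down(i, n)
          for end in range(n - 1, 0, -1):
              a[0], a[end] = a[end], a[0]
              sift_down(0, end)
          return a

      print(k_heapsort([7, 3, 9, 1, 4, 8, 2, 6, 5, 0]))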
  • Patent number: 11507416
    Abstract: A computer system comprising: (i) a computer subsystem configured to act as a work accelerator, and (ii) a gateway connected to the computer subsystem, the gateway enabling the transfer of data to the computer subsystem from external storage at pre-compiled data exchange synchronization points attained by the computer subsystem, which act as a barrier between a compute phase and an exchange phase of the computer subsystem, wherein the computer subsystem is configured to pull data from a gateway transfer memory of the gateway in response to the pre-compiled data exchange synchronization point attained by the subsystem, wherein the gateway comprises at least one processor configured to perform at least one operation to pre-load at least some of the data from a first memory of the gateway to the gateway transfer memory in advance of the pre-compiled data exchange synchronization point attained by the subsystem.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: November 22, 2022
    Assignee: GRAPHCORE LIMITED
    Inventor: Brian Manula
  • Patent number: 11507307
    Abstract: A storage system includes a plurality of storage controllers and a drive box including one or more non-volatile storage devices. The drive box includes a memory on which reading and writing are performed in a unit different from a unit for reading and writing the one or more non-volatile storage devices, and which stores control information to be used by the plurality of storage controllers, and a memory controller that enables each storage controller of the plurality of storage controllers to exclusively read and write the control information of the memory by arbitrating accesses to the memory from the plurality of storage controllers.
    Type: Grant
    Filed: February 26, 2020
    Date of Patent: November 22, 2022
    Assignee: HITACHI, LTD.
    Inventors: Kentaro Shimada, Akira Yamamoto, Katsuya Tanaka
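    Illustrative sketch: a highly simplified Python analogue (an assumption, not the patented design) in which a lock stands in for the memory controller's arbitration that gives each storage controller exclusive access to the shared control information.
      import threading

      class ControlMemory:
          """Shared control-information memory inside the drive box."""
          def __init__(self):
              self._arbiter = threading.Lock()   # stands in for the hardware arbitration
              self._control_info = {}

          def read(self, key):
              with self._arbiter:                # only one storage controller at a time
                  return self._control_info.get(key)

          def write(self, key, value):
              with self._arbiter:
                  self._control_info[key] = value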
  • Patent number: 11507379
    Abstract: A front-end portion of a pipeline includes a stage that speculatively issues at least some instructions out-of-order. A back-end portion of the pipeline includes one or more stages that access a processor memory system. In the front-end (back-end), execution of instructions is managed based on information available in the front-end (back-end). Managing execution of a first memory barrier instruction includes preventing speculative out-of-order issuance of store instructions. The back-end control circuitry provides information accessible to the front-end control circuitry indicating that one or more particular memory instructions have completed handling by the processor memory system.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: November 22, 2022
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Shubhendu Sekhar Mukherjee, Michael Bertone, David Albert Carlson
  • Patent number: 11487023
    Abstract: The disclosure relates to a method for measuring the variance in a measurement signal, comprising the following steps: filtering the measurement signal by means of a high-pass filter in order to obtain a filtered measurement signal; determining the variance by using the filtered measurement signal.
    Type: Grant
    Filed: December 12, 2016
    Date of Patent: November 1, 2022
    Assignee: Robert Bosch GmbH
    Inventor: Aaron Troost
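    Illustrative sketch: the two steps named in the abstract, in Python; the specific first-order high-pass filter and its coefficient are assumptions for illustration.
      def high_pass(signal, alpha=0.9):
          """First-order high-pass filter: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
          filtered = [0.0]
          for prev_x, x in zip(signal, signal[1:]):
              filtered.append(alpha * (filtered[-1] + x - prev_x))
          return filtered

      def variance(values):
          mean = sum(values) / len(values)
          return sum((v - mean) ** 2 for v in values) / len(values)

      # Removing slow drift first keeps the variance estimate from being dominated
      # by the trend in the measurement signal rather than by its fluctuations.
      measurement = [0.1 * n + 0.5 * (n % 2) for n in range(100)]   # drift plus alternating noise
      print(variance(high_pass(measurement)))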
  • Patent number: 11487845
    Abstract: A convolutional operation device for performing convolutional neural network processing includes an input sharing network including first and second input feature map registers configured to shift each input feature map, which is inputted in row units, in a row or column direction and output the shifted input feature map and arranged in rows and columns, a first MAC array connected to the first input feature map registers, an input feature map switching network configured to select one of the first and second input feature map registers, a second MAC array connected to one selected by the input feature map switching network among the first and second input feature map registers, and an output shift network configured to shift the output feature map from the first MAC array and the second MAC array to transmit the shifted output feature map to an output memory.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: November 1, 2022
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Jung Hee Suk, Chun-Gi Lyuh
  • Patent number: 11481223
    Abstract: Methods, systems and apparatuses for reducing operations of Sum-Of-Multiply-Accumulate (SOMAC) instructions are disclosed. One method includes scheduling, by a scheduler, a thread for execution, executing, by a processor of a plurality of processors, the thread, fetching, by the processor, a plurality of instructions for the thread from a memory, selecting, by a thread arbiter of the processor, an instruction of the plurality of instructions for execution in an arithmetic logic unit (ALU) pipeline of the processor, reading the instruction, and determining, by a macro-instruction iterator of the processor, whether the instruction is a Sum-Of-Multiply-Accumulate (SOMAC) instruction with an instruction size, wherein the instruction size indicates a number of iterations that the SOMAC instruction is to be executed.
    Type: Grant
    Filed: August 8, 2019
    Date of Patent: October 25, 2022
    Assignee: Blaize, Inc.
    Inventors: Kamaraj Thangam, Palaparthy Venkata Divya Bharathi, Satyaki Koneru
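    Illustrative sketch: a software-level Python rendering (hypothetical; the encoding is not from the patent) of a SOMAC instruction whose instruction size tells the macro-instruction iterator how many multiply-accumulate iterations to run.
      def execute_somac(accumulator, inputs, weights, instruction_size):
          """Run instruction_size multiply-accumulate iterations for one SOMAC instruction."""
          for i in range(instruction_size):
              accumulator += inputs[i] * weights[i]
          return accumulator

      # An iterator that sees instruction_size == 0 can skip the instruction entirely,
      # which is the kind of reduction in ALU work the abstract describes.
      print(execute_somac(0, [1, 2, 3, 4], [5, 6, 7, 8], instruction_size=3))   # 1*5 + 2*6 + 3*7 = 38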
  • Patent number: 11481220
    Abstract: An apparatus comprises instruction fetch circuitry to retrieve instructions from storage and branch target storage to store entries comprising source and target addresses for branch instructions. A confidence value is stored with each entry and when a current address matches a source address in an entry, and the confidence value exceeds a confidence threshold, instruction fetch circuitry retrieves a predicted next instruction from a target address in the entry. Branch confidence update circuitry increases the confidence value of the entry on receipt of a confirmation of the target address and decreases the confidence value on receipt of a non-confirmation of the target address. When the confidence value meets a confidence lock threshold below the confidence threshold and non-confirmation of the target address is received, a locking mechanism with respect to the entry is triggered. A corresponding method is also provided.
    Type: Grant
    Filed: October 26, 2016
    Date of Patent: October 25, 2022
    Assignee: Arm Limited
    Inventors: Alexander Alfred Hornung, Adrian Viorel Popescu
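    Illustrative sketch: one branch-target entry in Python with the confidence behaviour described in the abstract above; the counter values and both thresholds are assumptions.
      class BranchTargetEntry:
          CONFIDENCE_THRESHOLD = 3   # predict only at or above this confidence
          LOCK_THRESHOLD = 1         # lock threshold sits below the confidence threshold

          def __init__(self, source_address, target_address):
              self.source = source_address
              self.target = target_address
              self.confidence = 2
              self.locked = False

          def predict(self, current_address):
              if current_address == self.source and self.confidence >= self.CONFIDENCE_THRESHOLD:
                  return self.target          # fetch the predicted next instruction from here
              return None

          def update(self, target_confirmed):
              if target_confirmed:
                  self.confidence += 1        # confirmation raises confidence
                  return
              self.confidence -= 1            # non-confirmation lowers confidence
              if self.confidence <= self.LOCK_THRESHOLD:
                  self.locked = True          # trigger the locking mechanism for this entry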
  • Patent number: 11461074
    Abstract: Multi-digit binary in-memory multiplication devices are disclosed. The multi-digit binary in-memory multiplication devices of the invention can dramatically reduce the operational steps in comparison with a conventional binary multiplier device. In one embodiment, at the expense of more hardware, the in-memory multiplication device can achieve single-step operation. Consequently, the multi-digit binary in-memory multiplication device can improve computation efficiency and save computation power by eliminating the data transfers between the Arithmetic Logic Unit (ALU), registers, and memory units.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: October 4, 2022
    Assignee: FLASHSILICON INCORPORATION
    Inventor: Lee Wang
  • Patent number: 11449336
    Abstract: A method of storing register data elements to interleave with data elements of a different register, a processor thereof, and a system thereof, wherein non-consecutive data elements of a register are retrieved to be stored interleaved with non-consecutive data elements of a different register upon execution of an interleaving store instruction, wherein a mask instruction directing the lanes of a storage space in which the non-consecutive data elements are stored is executed in conjunction with the interleaving store instruction, and wherein a processor of a second type is configured to emulate a processor of a first type to store the non-consecutive data elements the same way they are stored in the first-type processor.
    Type: Grant
    Filed: February 10, 2020
    Date of Patent: September 20, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Duc Quang Bui, Alan L. Davis, Dheera Balasubramanian Samudrala, Timothy David Anderson
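    Illustrative sketch: a Python rendering of the interleaving store described above, with even-numbered elements chosen as the non-consecutive elements and a lane mask deciding which destination lanes are written (both choices are assumptions for illustration).
      def interleave_store(dest, reg_a, reg_b, mask, stride=2):
          lanes = []
          for a, b in zip(reg_a[::stride], reg_b[::stride]):
              lanes.extend((a, b))                 # interleave element by element
          for i, (value, write) in enumerate(zip(lanes, mask)):
              if write:                            # mask directs which lanes are stored
                  dest[i] = value
          return dest

      memory = [0] * 8
      reg_a = [10, 11, 12, 13, 14, 15, 16, 17]
      reg_b = [20, 21, 22, 23, 24, 25, 26, 27]
      print(interleave_store(memory, reg_a, reg_b, mask=[1, 1, 1, 1, 0, 0, 1, 1]))
      # -> [10, 20, 12, 22, 0, 0, 16, 26]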
  • Patent number: 11429387
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. Stream metadata is stored in response to a stream store instruction. Stored stream metadata is restored to the stream engine in response to a stream restore instruction. An interrupt changes an open stream to a frozen state discarding stored stream data. A return from interrupt changes a frozen stream to an active state.
    Type: Grant
    Filed: September 3, 2020
    Date of Patent: August 30, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Joseph Zbiciak, Timothy D. Anderson
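    Illustrative sketch: a hypothetical Python state machine for the stream life cycle in the abstract above: metadata can be stored and restored, an interrupt freezes an open stream and discards its buffered data, and a return from interrupt reactivates it.
      from enum import Enum

      class StreamState(Enum):
          ACTIVE = 1
          FROZEN = 2

      class Stream:
          def __init__(self, metadata):
              self.metadata = metadata     # loop counts, base address, element size, ...
              self.buffered = []           # data elements fetched ahead of the functional units
              self.state = StreamState.ACTIVE

          def save_metadata(self):             # stream store instruction
              return dict(self.metadata)

          def restore_metadata(self, saved):   # stream restore instruction
              self.metadata = dict(saved)

          def on_interrupt(self):
              self.buffered.clear()            # stored stream data is discarded
              self.state = StreamState.FROZEN

          def on_return_from_interrupt(self):
              self.state = StreamState.ACTIVE  # refetch resumes from the metadata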
  • Patent number: 11429392
    Abstract: Systems and methods are disclosed for secure predictors for speculative execution. Some implementations may eliminate or mitigate side-channel attacks, such as the Spectre-class of attacks, in a processor. For example, an integrated circuit (e.g., a processor) for executing instructions includes a predictor circuit that, when operating in a first mode, uses data stored in a set of predictor entries to generate predictions. For example, the integrated circuit may be configured to: detect a security domain transition for software being executed by the integrated circuit; responsive to the security domain transition, change a mode of the predictor circuit from the first mode to a second mode and invoke a reset of the set of predictor entries, wherein the second mode prevents the use of a first subset of the predictor entries of the set of predictor entries; and, after completion of the reset, change the mode back to the first mode.
    Type: Grant
    Filed: March 22, 2019
    Date of Patent: August 30, 2022
    Assignee: SiFive, Inc.
    Inventors: Krste Asanovic, Andrew Waterman
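    Illustrative sketch: a minimal Python model (table size, reset timing, and the restricted subset are all assumptions) of the mode switch described above: a security domain transition moves the predictor to a restricted second mode and invokes a reset of its entries, and completion of the reset returns it to the first mode.
      class SecurePredictor:
          def __init__(self, num_entries=8, restricted_fraction=0.5):
              self.entries = [0] * num_entries
              self.restricted_mode = False
              self.reset_pending = False
              # In the second mode, only the tail of the table may be used.
              self.first_allowed = int(num_entries * restricted_fraction)

          def on_security_domain_transition(self):
              self.restricted_mode = True        # second mode: a subset of entries is unusable
              self.reset_pending = True          # invoke a reset of the predictor entries

          def on_reset_complete(self):
              self.entries = [0] * len(self.entries)
              self.reset_pending = False
              self.restricted_mode = False       # reset finished: back to the first mode

          def predict(self, index):
              if self.restricted_mode and index < self.first_allowed:
                  return None                    # this entry may not be used in the second mode
              return self.entries[index]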
  • Patent number: 11422821
    Abstract: A system and method for efficiently handling instruction execution ordering. In various embodiments, a processor includes multiple execution lanes, each executing instructions of a particular type, which are not executed by one or more of the other execution lanes. The instruction queue includes one queue for each particular execution lane. Control logic identifies a current youngest age used in allocated entries of the multiple queues, and determines a starting age based on the identified current youngest age and the number of instructions to be issued. Beginning with the determined starting age, ages (in program order) are assigned to a group of instructions being allocated in the multiple queues. Ages of entries in the multiple queues are updated for instructions not being issued based on the number of instructions being issued. Instructions being issued have age differences between them below a threshold.
    Type: Grant
    Filed: September 4, 2018
    Date of Patent: August 23, 2022
    Assignee: Apple Inc.
    Inventors: James N. Hardage, Jr., Christopher M. Tsay, Mahesh K. Reddy
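    Illustrative sketch: a simplified Python model of the age bookkeeping described above. The starting-age computation is reduced to "youngest allocated age plus one"; the patent also folds the number of instructions being issued into that calculation.
      def allocate_ages(queues, new_instructions):
          # Identify the current youngest age across all allocated entries.
          youngest = max((e["age"] for q in queues for e in q), default=-1)
          start_age = youngest + 1
          for offset, inst in enumerate(new_instructions):   # ages assigned in program order
              queues[inst["lane"]].append({"op": inst["op"], "age": start_age + offset})

      def update_after_issue(queues, num_issued):
          # Entries that did not issue move toward the head of the age window.
          for q in queues:
              for entry in q:
                  entry["age"] -= num_issued

      queues = [[], []]                          # one queue per execution lane
      allocate_ages(queues, [{"lane": 0, "op": "add"}, {"lane": 1, "op": "mul"}])
      allocate_ages(queues, [{"lane": 0, "op": "sub"}])
      print(queues)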
  • Patent number: 11416261
    Abstract: Methods, systems and apparatuses for graph streaming processing are disclosed. One method includes loading, by a group load register, a subset of an input tensor from a data cache, wherein the group load register provides the subset of the input tensor to all of a plurality of processors, loading, by a plurality of weight data registers, a plurality of weights of a weight tensor, wherein each of the weight data registers provides a weight to a single one of the plurality of processors, and performing, by the plurality of processors, a SOMAC (Sum-Of-Multiply-Accumulate) instruction, including simultaneously determining, by each of the plurality of processors, an instruction size of the SOMAC instruction, wherein the instruction size indicates a number of iterations that the SOMAC instruction is to be executed and is equal to a number of outputs within a subset of a plurality of output tensors.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: August 16, 2022
    Assignee: Blaize, Inc.
    Inventors: Satyaki Koneru, Kamaraj Thangam, Sruthikesh Surineni
  • Patent number: 11409692
    Abstract: A microprocessor system comprises a computational array and a vector computational unit. The computational array includes a plurality of computation units. The vector computational unit is in communication with the computational array and includes a plurality of processing elements. The processing elements are configured to receive output data elements from the computational array and process in parallel the received output data elements.
    Type: Grant
    Filed: March 13, 2018
    Date of Patent: August 9, 2022
    Assignee: Tesla, Inc.
    Inventors: Debjit Das Sarma, Emil Talpes, Peter Joseph Bannon
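    Illustrative sketch: a purely functional Python analogue (the matrix multiply and ReLU are illustrative stand-ins, not from the patent) of a computational array whose outputs are then processed, lane by lane, by the vector computational unit's processing elements.
      def computational_array(row, weight_columns):
          # Each computation unit of the array forms one dot product.
          return [sum(a * w for a, w in zip(row, col)) for col in weight_columns]

      def vector_computational_unit(outputs):
          # Processing elements operate on the received output data elements in parallel.
          return [max(v, 0) for v in outputs]

      activations = [1.0, -2.0, 3.0]
      weights = [[0.5, 2.0, -1.0], [1.0, 1.0, 1.0]]      # two output channels
      print(vector_computational_unit(computational_array(activations, weights)))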
  • Patent number: 11392537
    Abstract: Exemplary reach-based explicit dataflow processors and related computer-readable media and methods. The reach-based explicit dataflow processors are configured to support execution of producer instructions encoded with explicit naming of the consumer instructions intended to consume the values produced by the producer instructions. The reach-based explicit dataflow processors make produced values available as inputs to the explicitly named consumer instructions as a result of processing the producer instructions. The reach-based explicit dataflow processors support execution of a producer instruction that explicitly names a consumer instruction by using the producer instruction itself as a relative reference point.
    Type: Grant
    Filed: March 18, 2019
    Date of Patent: July 19, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Gagan Gupta, Michael Scott McIlvaine, Rodney Wayne Smith, Thomas Philip Speier, David Tennyson Harper, III
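    Illustrative sketch: a miniature Python interpreter (encoding and opcodes are assumptions for illustration) in which each producer instruction names its consumers by a forward distance from itself, the "reach", plus the operand slot to fill, instead of writing to a shared register file.
      def run(program):
          # program: list of (opcode, literal_or_None, targets); each target is a
          # (reach, operand_slot) pair naming a consumer relative to this instruction.
          operands = [[None, None] for _ in program]
          result = None
          for pc, (opcode, literal, targets) in enumerate(program):
              if opcode == "const":
                  result = literal
              elif opcode == "add":
                  result = operands[pc][0] + operands[pc][1]
              elif opcode == "mul":
                  result = operands[pc][0] * operands[pc][1]
              # Forward the produced value to each explicitly named consumer.
              for reach, slot in targets:
                  operands[pc + reach][slot] = result
          return result

      # (2 + 3) * 4, with every value routed by reach rather than by register name.
      program = [
          ("const", 2, [(2, 0)]),    # 2 -> the add, left operand
          ("const", 3, [(1, 1)]),    # 3 -> the add, right operand
          ("add",  None, [(2, 0)]),  # 2 + 3 -> the mul, left operand
          ("const", 4, [(1, 1)]),    # 4 -> the mul, right operand
          ("mul",  None, []),        # (2 + 3) * 4
      ]
      print(run(program))            # -> 20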