Patents Examined by Jyoti Mehta
-
Patent number: 11604852Abstract: A signal processing apparatus comprises an operation processing part that performs operation processing on data represented in the two's complement representation and a storage processing part that performs storage processing on data represented in a second representation format as a data representation format, and in the second representation format, a data value is identical to one in the two's complement representation when the value is positive or zero, and all the bits lower than the most significant bit that indicates the sign in the two's complement representation are inverted when a data value is negative.Type: GrantFiled: December 26, 2018Date of Patent: March 14, 2023Assignee: NEC CORPORATIONInventor: Atsufumi Shibayama
-
Patent number: 11593111Abstract: An apparatus and method are provided for inhibiting instruction manipulation. The apparatus has execution circuitry for performing data processing operations in response to a sequence of instructions from an instruction set, and decoder circuitry for decoding each instruction in the sequence in order to generate control signals for the execution circuitry. Each instruction comprises a plurality of instruction bits, and the decoder circuitry is arranged to perform a decode operation on each instruction to determine from the value of each instruction bit, and knowledge of the instruction set, the control signals to be issued to the execution circuitry in response to that instruction. An input path to the decoder circuitry comprises a set of wires over which the instruction bits of each instruction are provided.Type: GrantFiled: January 27, 2020Date of Patent: February 28, 2023Assignee: Arm LimitedInventors: Frederic Jean Denis Arsanto, Carlo Dario Fanara, Luca Scalabrino, Jean Sébastien Leroy
-
Patent number: 11544065Abstract: A processor includes a front-end with an instruction set that operates at a first bit width and a floating point unit coupled to receive the instruction set in the processor that operates at the first bit width. The floating point unit operates at a second bit width and, based upon a bit width assessment of the instruction set provided to the floating point unit, the floating point unit employs a shadow-latch configured floating point register file to perform bit width reconfiguration. The shadow-latch configured floating point register file includes a plurality of regular latches and a plurality of shadow latches for storing data that is to be either read from or written to the shadow latches. The bit width reconfiguration enables the floating point unit that operates at the second bit width to operate on the instruction set received at the first bit width.Type: GrantFiled: September 27, 2019Date of Patent: January 3, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Arun A. Nair, Todd Baumgartner, Michael Estlick, Erik Swanson
-
Patent number: 11526360Abstract: A processor comprising a processor pipeline comprising one or more execution units configured to execute branch instructions, a branch predictor associated with the processor pipeline and configured to predict a branch instruction prediction outcome, and the branch prediction unit. The branch predictor is turned off to save power and avoid miss-predictions when the branch predictor and/or branch prediction unit accuracy is lower than expected.Type: GrantFiled: November 20, 2018Date of Patent: December 13, 2022Assignee: International Business Machines CorporationInventors: Naga P. Gorti, Dave S. Levitan
-
Patent number: 11513806Abstract: Techniques are provided for vectorizing Heapsort. A K-heap is used as the underlying data structure for indexing values being sorted. The K-heap is vectorized by storing values in a contiguous memory array containing a beginning-most side and end-most side. The vectorized Heapsort utilizes horizontal aggregation SIMD instructions for comparisons, shuffling, and moving data. Thus, the number of comparisons required in order to find the maximum or minimum key value within a single node of the K-heap is reduced resulting in faster retrieval operations.Type: GrantFiled: April 9, 2021Date of Patent: November 29, 2022Assignee: Oracle International CorporationInventors: Benjamin Schlegel, Pit Fender, Harshad Kasture, Matthias Brantner, Hassan Chafi
-
Patent number: 11507416Abstract: A computer system comprising: (i) a computer subsystem configured to act as a work accelerator, and (ii) a gateway connected to the computer subsystem, the gateway enabling the transfer of data to the computer subsystem from external storage at pre-compiled data exchange synchronization points attained by the computer subsystem, which act as a barrier between a compute phase and an exchange phase of the computer subsystem, wherein the computer subsystem is configured to pull data from a gateway transfer memory of the gateway in response to the pre-compiled data exchange synchronization point attained by the subsystem, wherein the gateway comprises at least one processor configured to perform at least one operation to pre-load at least some of the data from a first memory of the gateway to the gateway transfer memory in advance of the pre-compiled data exchange synchronization point attained by the subsystem.Type: GrantFiled: May 31, 2019Date of Patent: November 22, 2022Assignee: GRAPHCORE LIMITEDInventor: Brian Manula
-
Patent number: 11507307Abstract: A storage system includes a plurality of storage controllers and a drive box including one or more non-volatile storage devices. The drive box includes a memory on which reading and writing are performed in a unit different from a unit for reading and writing the one or more non-volatile storage devices, and which stores control information to be used by the plurality of storage controllers, and a memory controller that enables each storage controller of the plurality of storage controllers to exclusively read and write the control information of the memory by arbitrating accesses to the memory from the plurality of storage controllers.Type: GrantFiled: February 26, 2020Date of Patent: November 22, 2022Assignee: HITACHI, LTD.Inventors: Kentaro Shimada, Akira Yamamoto, Katsuya Tanaka
-
Patent number: 11507379Abstract: A front-end portion of a pipeline includes a stage that speculatively issues at least some instructions out-of-order. A back-end portion of the pipeline includes one or more stages that access a processor memory system. In the front-end (back-end), execution of instructions is managed based on information available in the front-end (back-end). Managing execution of a first memory barrier instruction includes preventing speculative out-of-order issuance of store instructions. The back-end control circuitry provides information accessible to the front-end control circuitry indicating that one or more particular memory instructions have completed handling by the processor memory system.Type: GrantFiled: May 31, 2019Date of Patent: November 22, 2022Assignee: Marvell Asia Pte, Ltd.Inventors: Shubhendu Sekhar Mukherjee, Michael Bertone, David Albert Carlson
-
Patent number: 11487023Abstract: The disclosure relates to a method for measuring the variance in a measurement signal, comprising the following steps: filtering the measurement signal by means of a high-pass filter in order to obtain a filtered measurement signal; determining the variance by using the filtered measurement signal.Type: GrantFiled: December 12, 2016Date of Patent: November 1, 2022Assignee: Robert Bosch GmbHInventor: Aaron Troost
-
Patent number: 11487845Abstract: A convolutional operation device for performing convolutional neural network processing includes an input sharing network including first and second input feature map registers configured to shift each input feature map, which is inputted in row units, in a row or column direction and output the shifted input feature map and arranged in rows and columns, a first MAC array connected to the first input feature map registers, an input feature map switching network configured to select one of the first and second input feature map registers, a second MAC array connected to one selected by the input feature map switching network among the first and second input feature map registers, and an output shift network configured to shift the output feature map from the first MAC array and the second MAC array to transmit the shifted output feature map to an output memory.Type: GrantFiled: November 13, 2019Date of Patent: November 1, 2022Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTEInventors: Jung Hee Suk, Chun-Gi Lyuh
-
Patent number: 11481223Abstract: Methods, systems and apparatuses for reducing operations of Sum-Of-Multiply-Accumulate (SOMAC) instructions are disclosed. One method includes scheduling, by a scheduler, a thread for execution, executing, by a processor of a plurality of processors, the thread, fetching, by the processor, a plurality of instructions for the thread from a memory, selecting, by a thread arbiter of the processor, an instruction of the plurality of instructions for execution in an arithmetic logic unit (ALU) pipeline of the processor, and reading the instruction, and determining, by a macro-instruction iterator of the processor, whether the instruction is a Sum-Of-Multiply-Accumulate (SOMAC) instruction with an instruction size, wherein the instruction size indicates a number of iterations that the SOMAC instruction is to be executed.Type: GrantFiled: August 8, 2019Date of Patent: October 25, 2022Assignee: Blaize, Inc.Inventors: Kamaraj Thangam, Palaparthy Venkata Divya Bharathi, Satyaki Koneru
-
Patent number: 11481220Abstract: An apparatus comprises instruction fetch circuitry to retrieve instructions from storage and branch target storage to store entries comprising source and target addresses for branch instructions. A confidence value is stored with each entry and when a current address matches a source address in an entry, and the confidence value exceeds a confidence threshold, instruction fetch circuitry retrieves a predicted next instruction from a target address in the entry. Branch confidence update circuitry increases the confidence value of the entry on receipt of a confirmation of the target address and decreases the confidence value on receipt of a non-confirmation of the target address. When the confidence value meets a confidence lock threshold below the confidence threshold and non-confirmation of the target address is received, a locking mechanism with respect to the entry is triggered. A corresponding method is also provided.Type: GrantFiled: October 26, 2016Date of Patent: October 25, 2022Assignee: Arm LimitedInventors: Alexander Alfred Hornung, Adrian Viorel Popescu
-
Patent number: 11461074Abstract: The multi-digit binary in-memory multiplication devices are disclosed. The multi-digit binary in-memory multiplication devices of the invention can dramatically reduce the operational steps in comparison with the conventional binary multiplier device. In one embodiment with the expense of more hardware, the in-memory multiplication device can achieve one single step operation. Consequently, the multi-digit binary in-memory multiplication device can improve the computation efficiency and save the computation power by eliminating the data transportations between Arithmetic Logic Unit (ALU), registers, and memory units.Type: GrantFiled: July 10, 2020Date of Patent: October 4, 2022Assignee: FLASHSILICON INCORPORATIONInventor: Lee Wang
-
Patent number: 11449336Abstract: A method of storing register data elements to interleave with data elements of a different register, a processor thereof, and a system thereof, wherein each non-consecutive data elements of a register is retrieved to be stored to interleave with each non-consecutive data elements of a different register upon an executive of an interleaving store instruction, wherein a mask instruction directing a lane of a storage space in which the non-consecutive data elements are stored is executed in conjunction with the interleaving store instruction, and wherein a processor of a second type is configured to emulate a processor of a first type to store the non-consecutive data elements the same as non-consecutive data elements stored in the first type processor.Type: GrantFiled: February 10, 2020Date of Patent: September 20, 2022Assignee: Texas Instmments IncorporatedInventors: Duc Quang Bui, Alan L. Davis, Dheera Balasubramanian Samudrala, Timothy David Anderson
-
Patent number: 11429387Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. Stream metadata is stored in response to a stream store instruction. Stored stream metadata is restored to the stream engine in response to a stream restore instruction. An interrupt changes an open stream to a frozen state discarding stored stream data. A return from interrupt changes a frozen stream to an active state.Type: GrantFiled: September 3, 2020Date of Patent: August 30, 2022Assignee: Texas Instruments IncorporatedInventors: Joseph Zbiciak, Timothy D. Anderson
-
Patent number: 11429392Abstract: Systems and methods are disclosed for secure predictors for speculative execution. Some implementations may eliminate or mitigate side-channel attacks, such as the Spectre-class of attacks, in a processor. For example, an integrated circuit (e.g., a processor) for executing instructions includes a predictor circuit that, when operating in a first mode, uses data stored in a set of predictor entries to generate predictions. For example, the integrated circuit may be configured to: detect a security domain transition for software being executed by the integrated circuit; responsive to the security domain transition, change a mode of the predictor circuit from the first mode to a second mode and invoke a reset of the set of predictor entries, wherein the second mode prevents the use of a first subset of the predictor entries of the set of predictor entries; and, after completion of the reset, change the mode back to the first mode.Type: GrantFiled: March 22, 2019Date of Patent: August 30, 2022Assignee: SiFive, Inc.Inventors: Krste Asanovic, Andrew Waterman
-
Patent number: 11422821Abstract: A system and method for efficiently handling instruction execution ordering. In various embodiments, a processor includes multiple execution lanes, each executing instructions of a particular type, which are not executed by one or more of the other execution lanes. The instruction queue includes one queue for each particular execution lane. Control logic identifies a current youngest age used in allocated entries of the multiple queues, and determines a starting age based on the identified current youngest age and the number of instructions to be issued. Beginning with the determined starting age, ages (in program order) are assigned to a group of instructions being allocated in the multiple queues. Ages of entries in the multiple queues are updated for instructions not being issued based on the number of instructions being issued. Instructions being issued have age differences between them below a threshold.Type: GrantFiled: September 4, 2018Date of Patent: August 23, 2022Assignee: Apple Inc.Inventors: James N. Hardage, Jr., Christopher M. Tsay, Mahesh K. Reddy
-
Patent number: 11416261Abstract: Methods, systems and apparatuses for graph streaming processing are disclosed. One method includes loading, by a group load register, a subset of a an input tensor from a data cache, wherein the group load register provides the subset of the input tensor to all of a plurality of processors, loading, by a plurality of weight data registers, a plurality of weights of a weight tensor, wherein each of the weight data registers provide an weight to a single of the plurality of processors, and performing, by the plurality of processors, a SOMAC (Sum-Of-Multiply-Accumulate) instruction, including simultaneously determining, by each of the plurality of processors, an instruction size of the SOMAC instruction, wherein the instruction size indicates a number of iterations that the SOMAC instruction is to be executed and is equal to a number of outputs within a subset of a plurality of output tensors.Type: GrantFiled: July 15, 2020Date of Patent: August 16, 2022Assignee: Blaize, Inc.Inventors: Satyaki Koneru, Kamaraj Thangam, Sruthikesh Surineni
-
Patent number: 11409692Abstract: A microprocessor system comprises a computational array and a vector computational unit. The computational array includes a plurality of computation units. The vector computational unit is in communication with the computational array and includes a plurality of processing elements. The processing elements are configured to receive output data elements from the computational array and process in parallel the received output data elements.Type: GrantFiled: March 13, 2018Date of Patent: August 9, 2022Assignee: Tesla, Inc.Inventors: Debjit Das Sarma, Emil Talpes, Peter Joseph Bannon
-
Patent number: 11392537Abstract: Exemplary reach-based explicit dataflow processors and related computer-readable media and methods. The reach-based explicit dataflow processors are configured to support execution of producer instructions encoded with explicit naming of consumer instructions intended to consume the values produced by the producer instructions. The reach-based explicit dataflow processors are configured to make available produced values as inputs to explicitly named consumer instructions as a result of processing producer instructions. The reach-based explicit dataflow processors support execution of a producer instruction that explicitly names a consumer instruction based on using the producer instruction as a relative reference point from the producer instruction.Type: GrantFiled: March 18, 2019Date of Patent: July 19, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Gagan Gupta, Michael Scott McIlvaine, Rodney Wayne Smith, Thomas Philip Speier, David Tennyson Harper, III