Patents by Inventor Jesse Garrett BEU
Jesse Garrett BEU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240320005
Abstract: A data processing apparatus includes first vector registers and second vector registers, both dynamically spatially and dynamically temporally dividable. Decode circuitry receives one or more matrix multiplication instructions that indicate a set of first elements in the first vector registers and a set of second elements in the second vector registers and, in response to receiving the matrix multiplication instructions, generates a matrix multiplication operation. The matrix multiplication operation causes one or more execution units to perform a matrix multiplication of the set of first elements by the set of second elements, and an average bit width of the first elements is different to an average bit width of the second elements.
Type: Application
Filed: March 23, 2023
Publication date: September 26, 2024
Inventors: Jesse Garrett BEU, Thomas Christopher GROCUTT
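
The mixed element widths described in this abstract can be pictured with a short software analogue. The sketch below is illustrative only, assuming 8-bit first elements, 4-bit second elements, and 32-bit accumulation; it is not the claimed register or decode circuitry.

```python
# Rough software analogue (not the claimed hardware): a matrix multiply whose
# two operands use different element widths, e.g. 8-bit activations against
# 4-bit weights, both widened into a 32-bit accumulator.
import numpy as np

def mixed_width_matmul(a_int8: np.ndarray, b_int4: np.ndarray) -> np.ndarray:
    """a_int8: (M, K) int8 values; b_int4: (K, N) values already unpacked
    into the 4-bit range [-8, 7]. Returns an (M, N) int32 result."""
    assert a_int8.dtype == np.int8
    assert b_int4.min() >= -8 and b_int4.max() <= 7   # 4-bit range check
    return a_int8.astype(np.int32) @ b_int4.astype(np.int32)

a = np.random.randint(-128, 128, size=(4, 8), dtype=np.int8)
b = np.random.randint(-8, 8, size=(8, 3), dtype=np.int8)   # 4-bit values held in int8
print(mixed_width_matmul(a, b))
```
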
-
Publication number: 20240320292
Abstract: A data processing apparatus includes input circuitry that receives a matrix having values in a first format. Output circuitry outputs the matrix having the values in a second format, while adjustment circuitry performs a modification of the matrix from the first format to the second format. The second format is computationally contiguous with respect to a data processing apparatus that performs a matrix multiplication using first and second vector registers, both configured to be dynamically spatially and dynamically temporally divided.
Type: Application
Filed: March 23, 2023
Publication date: September 26, 2024
Inventors: Jesse Garrett BEU, Thomas Christopher GROCUTT
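
One way to picture a "computationally contiguous" second format is a tiled repacking of a row-major matrix. The sketch below is an illustration only; the 4x4 tile sizes are assumptions, not the format defined by the application.

```python
# Illustrative only: repack a row-major matrix into tiles so that the values
# consumed together by a tiled matrix multiply sit next to each other in
# memory ("computationally contiguous"). Tile sizes are arbitrary choices here.
import numpy as np

def to_tiled_layout(m: np.ndarray, tile_rows: int = 4, tile_cols: int = 4) -> np.ndarray:
    rows, cols = m.shape
    assert rows % tile_rows == 0 and cols % tile_cols == 0
    # (R, C) -> (R/tr, tr, C/tc, tc) -> tiles laid out one after another
    tiled = m.reshape(rows // tile_rows, tile_rows, cols // tile_cols, tile_cols)
    return tiled.transpose(0, 2, 1, 3).reshape(-1, tile_rows * tile_cols)

m = np.arange(64, dtype=np.int32).reshape(8, 8)
print(to_tiled_layout(m)[0])   # first tile: rows 0-3, columns 0-3, now contiguous
```
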
-
Patent number: 12067373
Abstract: The present disclosure advantageously provides a system including a memory, a processor, and circuitry to execute one or more mixed-precision layers of an artificial neural network (ANN), each mixed-precision layer including high-precision weight filters and low-precision weight filters. The circuitry is configured to perform one or more calculations on an input feature map having a plurality of input channels (cin) using the high-precision weight filters to create a high-precision output feature map having a first number of output channels (k), perform one or more calculations on the input feature map using the low-precision weight filters to create a low-precision output feature map having a second number of output channels (cout-k), and concatenate the high-precision output feature map and the low-precision output feature map to create a unified output feature map having a plurality of output channels (cout).
Type: Grant
Filed: March 31, 2020
Date of Patent: August 20, 2024
Assignee: Arm Limited
Inventors: Dibakar Gope, Jesse Garrett Beu, Paul Nicholas Whatmough, Matthew Mattina
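
A minimal numpy sketch of the channel split described above, using a dense layer in place of the convolution and a simple symmetric int8 quantization for the low-precision filters; both substitutions are assumptions made for brevity, not the patented circuitry.

```python
# Illustrative sketch: k output channels come from high-precision weights, the
# remaining cout - k channels from quantized low-precision weights, and the two
# partial feature maps are concatenated into a single cout-channel output.
import numpy as np

cin, cout, k = 16, 32, 8
x = np.random.randn(cin).astype(np.float32)                  # input feature vector
w_hi = np.random.randn(k, cin).astype(np.float32)            # high-precision filters
w_lo_fp = np.random.randn(cout - k, cin).astype(np.float32)

scale = np.abs(w_lo_fp).max() / 127.0                        # symmetric int8 quantization
w_lo = np.round(w_lo_fp / scale).astype(np.int8)             # low-precision filters

out_hi = w_hi @ x                                            # k channels
out_lo = (w_lo.astype(np.float32) @ x) * scale               # cout - k channels
out = np.concatenate([out_hi, out_lo])                       # unified cout-channel output
print(out.shape)                                             # (32,)
```
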
-
Patent number: 11934307
Abstract: An apparatus and method are provided for receiving a request from a plurality of processing units, where multiple of those processing units have associated cache storage. A snoop unit is used to implement a cache coherency protocol when a request is received that identifies a cacheable memory address. The snoop unit has snoop filter storage comprising a plurality of snoop filter tables organized in a hierarchical arrangement. The snoop filter tables comprise a primary snoop filter table at a highest level in the hierarchy, and each snoop filter table at a lower level in the hierarchy forms a backup snoop filter table for an adjacent snoop filter table at a higher level in the hierarchy. Each snoop filter table is arranged as a multi-way set associative storage structure, and each backup snoop filter table has a different number of sets than are provided in the adjacent snoop filter table.
Type: Grant
Filed: January 18, 2021
Date of Patent: March 19, 2024
Assignee: Arm Limited
Inventors: Joshua Randall, Jesse Garrett Beu
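
A behavioural sketch of the hierarchical arrangement, assuming a primary table backed by a smaller table with a different set count; the dict-per-set model and the eviction policy are simplifications, not the claimed storage structure.

```python
# Behavioural sketch only (not RTL): a primary set-associative snoop filter
# table backed by a second table with a different number of sets. An entry that
# cannot be held in the primary table is demoted to the backup table.

class SnoopFilterTable:
    def __init__(self, num_sets: int, num_ways: int):
        self.num_sets, self.num_ways = num_sets, num_ways
        self.sets = [dict() for _ in range(num_sets)]    # tag -> sharer bitmap

    def lookup(self, addr: int):
        return self.sets[addr % self.num_sets].get(addr)

    def insert(self, addr: int, sharers: int):
        """Insert an entry; return an evicted (addr, sharers) pair, if any."""
        s = self.sets[addr % self.num_sets]
        victim = None
        if addr not in s and len(s) >= self.num_ways:
            victim = s.popitem()                         # simplistic eviction policy
        s[addr] = sharers
        return victim

primary = SnoopFilterTable(num_sets=1024, num_ways=8)
backup = SnoopFilterTable(num_sets=256, num_ways=8)      # different number of sets

def track(addr: int, sharers: int):
    evicted = primary.insert(addr, sharers)
    if evicted is not None:
        backup.insert(*evicted)                          # demote to backup table

def which_caches_to_snoop(addr: int):
    hit = primary.lookup(addr)
    if hit is None:
        hit = backup.lookup(addr)
    return hit                                           # None: no snoops needed
```
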
-
Publication number: 20240028877
Abstract: There is provided a neural processing unit for calculating an attention matrix during machine learning inference. The neural processing unit is configured to calculate: a first score matrix based on differences between a query matrix and a key matrix; a second score matrix based on differences between the key matrix and a learned key matrix; a similarity matrix based on a combination of the first score matrix and the second score matrix; and an attention matrix obtained by applying a normalisation function to the similarity matrix. Also provided is an apparatus comprising at least one said neural processing unit and at least one memory, the memory configured to pass, on demand, a learned key matrix to the neural processing unit. Also provided is a computer program product having computer readable program code stored thereon which, when executed by said neural processing unit, causes the unit to perform said calculations.
Type: Application
Filed: July 21, 2022
Publication date: January 25, 2024
Inventors: Shounak DATTA, Dibakar GOPE, Jesse Garrett BEU, Mark John O'CONNOR
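
A numpy sketch of the dataflow in this abstract. The negative squared distance used as the difference-based score, and the simple sum used to combine the two scores, are assumptions for illustration; the application does not fix those details here.

```python
# Illustrative sketch: a difference-based score between queries and keys, a
# second difference-based score between the keys and a learned key matrix, a
# combined similarity matrix, and a softmax normalisation.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, k_learned):
    # scores based on differences rather than dot products
    s1 = -((q[:, None, :] - k[None, :, :]) ** 2).sum(-1)          # query vs key
    s2 = -((k[:, None, :] - k_learned[None, :, :]) ** 2).sum(-1)  # key vs learned key
    similarity = s1 + s2.sum(-1)[None, :]                         # combine the two scores
    return softmax(similarity, axis=-1)                           # attention matrix

q = np.random.randn(4, 8)          # 4 queries, dimension 8
k = np.random.randn(6, 8)          # 6 keys
k_learned = np.random.randn(6, 8)  # learned key matrix passed in on demand
print(attention(q, k, k_learned).shape)   # (4, 6)
```
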
-
Publication number: 20230367843
Abstract: A data processing method and processor instructions are provided that leverage scatter operations to efficiently merge vector and matrix indices, as compared to standard matrix and vector operations, as well as merge other arithmetic results, lists of numbers, etc.
Type: Application
Filed: May 13, 2022
Publication date: November 16, 2023
Applicant: Arm Limited
Inventors: Joshua Randall, Jesse Garrett Beu, Krishnendra Nathella, Tuan Quang Ta
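
A small sketch of merging two index/value lists with a scatter-accumulate instead of an explicit sorted-merge loop. The dense scratch buffer is an assumption made for brevity; it stands in for whatever destination the scatter targets.

```python
# Illustrative sketch: use a scatter-accumulate to merge two sparse vectors
# given as (index, value) pairs.
import numpy as np

def scatter_merge(idx_a, val_a, idx_b, val_b, length):
    dense = np.zeros(length, dtype=val_a.dtype)
    np.add.at(dense, idx_a, val_a)          # scatter-accumulate first operand
    np.add.at(dense, idx_b, val_b)          # scatter-accumulate second operand
    merged_idx = np.flatnonzero(dense)      # union of the two index sets
    return merged_idx, dense[merged_idx]

ia, va = np.array([1, 4, 9]), np.array([1.0, 2.0, 3.0])
ib, vb = np.array([4, 7]), np.array([10.0, 20.0])
print(scatter_merge(ia, va, ib, vb, length=12))
# (array([1, 4, 7, 9]), array([ 1., 12., 20.,  3.]))
```
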
-
Publication number: 20230195419
Abstract: A neural network system, method and apparatus are provided. A truth table matrix, an index vector and an input data tensor are read from a memory. At least a portion of the input data tensor is flattened into an input data vector. A scatter accumulate instruction is executed on the index vector and the input data vector to generate an intermediate vector. The truth table matrix and the intermediate vector are then multiplied to generate an output data vector.
Type: Application
Filed: December 17, 2021
Publication date: June 22, 2023
Applicant: Arm Limited
Inventors: Dibakar Gope, Jesse Garrett Beu, Milos Milosavljevic
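
The abstract describes a short, concrete dataflow, sketched below with numpy. All shapes and the random contents are arbitrary examples, not values from the application.

```python
# Illustrative sketch: flatten part of the input tensor, scatter-accumulate it
# through an index vector, then multiply the result by a truth-table matrix.
import numpy as np

rng = np.random.default_rng(0)
input_tensor = rng.standard_normal((2, 3, 4))
truth_table = rng.integers(0, 2, size=(5, 8)).astype(np.float64)   # truth table matrix
index_vec = rng.integers(0, 8, size=input_tensor.size)             # maps inputs to bins

input_vec = input_tensor.reshape(-1)                 # flatten the input data tensor
intermediate = np.zeros(8)
np.add.at(intermediate, index_vec, input_vec)        # scatter accumulate instruction
output_vec = truth_table @ intermediate              # multiply by the truth table matrix
print(output_vec.shape)                              # (5,)
```
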
-
Patent number: 11663814
Abstract: The present disclosure advantageously provides a system and a method for skipping recurrent neural network (RNN) state updates using a skip predictor. Sequential input data are received and divided into sequences of input data values, each input data value being associated with a different time step for a pre-trained RNN model. At each time step, the hidden state vector for a prior time step is received from the pre-trained RNN model, and a determination is made, based on the input data value and the hidden state vector for at least one prior time step, whether or not to provide the input data value associated with the time step to the pre-trained RNN model for processing. When the input data value is not provided, the pre-trained RNN model does not update its hidden state vector. Importantly, the skip predictor is trained without retraining the pre-trained RNN model.
Type: Grant
Filed: April 22, 2020
Date of Patent: May 30, 2023
Assignee: Arm Limited
Inventors: Urmish Ajit Thakker, Jin Tao, Ganesh Suryanarayan Dasika, Jesse Garrett Beu
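
A behavioural sketch of the skipping scheme. The threshold-based skip predictor and the tanh cell are stand-ins chosen for illustration; the patent's skip predictor is trained separately, which is not shown here.

```python
# Illustrative sketch: a skip predictor decides, per time step, whether to feed
# the input to the pre-trained RNN cell; skipped steps leave the hidden state unchanged.
import numpy as np

def rnn_cell(x, h, w_x, w_h):                 # stand-in for the pre-trained RNN cell
    return np.tanh(w_x @ x + w_h @ h)

def skip_predictor(x, h, threshold=0.5):      # stand-in predictor: skip "small" inputs
    return np.linalg.norm(x) < threshold      # True means skip this time step

def run(sequence, h, w_x, w_h):
    for x in sequence:
        if skip_predictor(x, h):
            continue                          # hidden state is NOT updated
        h = rnn_cell(x, h, w_x, w_h)          # normal state update
    return h

dim = 4
w_x, w_h = np.eye(dim) * 0.1, np.eye(dim) * 0.9
h = np.zeros(dim)
seq = [np.random.randn(dim) * s for s in (1.0, 0.01, 1.0)]   # middle step gets skipped
print(run(seq, h, w_x, w_h))
```
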
-
Publication number: 20230139212
Abstract: An apparatus and method are provided for receiving a request from a plurality of processing units, where multiple of those processing units have associated cache storage. A snoop unit is used to implement a cache coherency protocol when a request is received that identifies a cacheable memory address. The snoop unit has snoop filter storage comprising a plurality of snoop filter tables organized in a hierarchical arrangement. The snoop filter tables comprise a primary snoop filter table at a highest level in the hierarchy, and each snoop filter table at a lower level in the hierarchy forms a backup snoop filter table for an adjacent snoop filter table at a higher level in the hierarchy. Each snoop filter table is arranged as a multi-way set associative storage structure, and each backup snoop filter table has a different number of sets than are provided in the adjacent snoop filter table.
Type: Application
Filed: January 18, 2021
Publication date: May 4, 2023
Inventors: Joshua RANDALL, Jesse Garrett BEU
-
Patent number: 11640533
Abstract: A system, an apparatus and methods are provided for utilizing software and hardware portions of a neural network to fix, or hardwire, certain portions while modifying other portions. A first set of weights for layers of a first neural network is established, and selected weights are modified to generate a second set of weights, based on a second dataset. The second set of weights is then used to train a second neural network.
Type: Grant
Filed: August 3, 2018
Date of Patent: May 2, 2023
Assignee: Arm Limited
Inventors: Paul Nicholas Whatmough, Matthew Mattina, Jesse Garrett Beu
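
A minimal sketch of keeping some weights fixed while adapting the rest on a second dataset. The single linear layer, the random mask, and the plain SGD step are simplifications for illustration, not the patented system.

```python
# Illustrative sketch: a mask keeps ("hardwires") part of the first network's
# weights while the remaining weights are updated on a second dataset.
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal((4, 8))            # first set of weights
frozen = rng.random((4, 8)) < 0.7          # True where a weight is fixed/hardwired

def sgd_step(w, x, y, lr=0.01):
    err = w @ x - y                        # simple linear model, squared error
    grad = np.outer(err, x)
    grad[frozen] = 0.0                     # fixed weights receive no update
    return w - lr * grad

x2, y2 = rng.standard_normal(8), rng.standard_normal(4)    # sample from the second dataset
w2 = sgd_step(w, x2, y2)                   # second set of weights
print(np.allclose(w[frozen], w2[frozen]))  # True: the hardwired portion is unchanged
```
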
-
Patent number: 11567870
Abstract: An apparatus comprises snoop filter storage circuitry to store snoop filter entries corresponding to addresses and comprising sharer information. Control circuitry selects which sharers, among a plurality of sharers capable of holding cached data, should be issued with snoop requests corresponding to a target address, based on the sharer information of the snoop filter entry corresponding to the target address. The control circuitry is capable of setting a given snoop filter entry corresponding to a given address to an imprecise encoding in which the sharer information provides an imprecise description of which sharers hold cached data corresponding to the given address, and the given snoop filter entry comprises at least one sharer count value indicative of a number of sharers holding cached data corresponding to the given address.
Type: Grant
Filed: March 29, 2021
Date of Patent: January 31, 2023
Assignee: Arm Limited
Inventors: Joshua Randall, Jamshed Jalal, Tushar P. Ringe, Jesse Garrett Beu
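
A behavioural sketch of one possible imprecise encoding: a coarse sharer bitmap per group of caches plus a sharer count. The grouping of four sharers per bit is an arbitrary illustration, not the encoding claimed by the patent.

```python
# Illustrative sketch: a snoop filter entry that can fall back to an imprecise
# encoding while still tracking how many sharers actually hold the line.

NUM_SHARERS, GROUP = 16, 4

class SnoopEntry:
    def __init__(self):
        self.precise = True
        self.sharer_bits = 0      # precise: one bit per sharer; imprecise: one bit per group
        self.sharer_count = 0     # imprecise: number of sharers holding the line

    def make_imprecise(self):
        count = bin(self.sharer_bits).count("1")
        coarse = 0
        for s in range(NUM_SHARERS):
            if self.sharer_bits >> s & 1:
                coarse |= 1 << (s // GROUP)
        self.precise, self.sharer_bits, self.sharer_count = False, coarse, count

    def targets(self):
        """Which sharers should receive a snoop request for this address."""
        if self.precise:
            return [s for s in range(NUM_SHARERS) if self.sharer_bits >> s & 1]
        return [s for s in range(NUM_SHARERS) if self.sharer_bits >> (s // GROUP) & 1]

e = SnoopEntry()
e.sharer_bits = 0b0000_0000_0001_0010        # sharers 1 and 4 hold the line
e.make_imprecise()
print(e.targets(), e.sharer_count)           # sharers 0-7 are snooped; count is still 2
```
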
-
Patent number: 11561767
Abstract: The present disclosure advantageously provides a mixed precision computation (MPC) unit for executing one or more mixed-precision layers of an artificial neural network (ANN). The MPC unit includes a multiplier circuit configured to input a pair of operands and output a product, a first adder circuit coupled to the multiplier circuit, a second adder circuit, coupled to the first adder circuit, configured to input a pair of operands, an accumulator circuit, coupled to the multiplier circuit and the first adder circuit, configured to output an accumulated value, and a controller, coupled to the multiplier circuit, the first adder circuit, the second adder circuit and the accumulator circuit, configured to input a mode control signal. The controller has a plurality of operating modes including a high precision mode, a low precision add mode and a low precision multiply mode.
Type: Grant
Filed: March 31, 2020
Date of Patent: January 24, 2023
Assignee: Arm Limited
Inventors: Dibakar Gope, Jesse Garrett Beu, Paul Nicholas Whatmough, Matthew Mattina
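
A behavioural sketch of the mode selection only. The datapath widths and the float/int choices below are illustrative assumptions; the patent describes a circuit, not this software model.

```python
# Illustrative sketch: a controller with a high precision mode, a low precision
# add mode and a low precision multiply mode, all feeding one accumulator.

HIGH_PRECISION, LOW_PRECISION_ADD, LOW_PRECISION_MULTIPLY = range(3)

class MPCUnit:
    def __init__(self):
        self.acc = 0.0                                  # accumulator circuit

    def issue(self, mode, a, b):
        if mode == HIGH_PRECISION:
            self.acc += float(a) * float(b)             # wide multiply-accumulate
        elif mode == LOW_PRECISION_ADD:
            self.acc += int(a) + int(b)                 # narrow add path
        elif mode == LOW_PRECISION_MULTIPLY:
            self.acc += int(a) * int(b)                 # narrow multiply path
        else:
            raise ValueError("unknown mode control signal")
        return self.acc

mpc = MPCUnit()
mpc.issue(HIGH_PRECISION, 1.5, 2.0)
mpc.issue(LOW_PRECISION_MULTIPLY, 3, 4)
print(mpc.acc)    # 15.0
```
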
-
Publication number: 20220405597Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to adapt a computing device to classify physical features in a deployment environment. In a particular implementation, computing resources may be selectively de-allocated from at least one of one or more elements of a computing architecture based, at least in part, on assessed impacts to the one or more elements of the computing architecture.Type: ApplicationFiled: June 16, 2021Publication date: December 22, 2022Inventors: Urmish Ajit Thakker, Jesse Garrett Beu, Dibakar Gope, Mark John O'Connor
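
One way to read "de-allocating resources based on assessed impacts" is pruning the lowest-impact elements of a small model. The sketch below is a loose illustration under that assumption; the impact metric, the tiny model, and the pruning threshold are all invented for the example.

```python
# Illustrative sketch: assess the accuracy impact of each hidden unit of a tiny
# model, then de-allocate (mask off) the lowest-impact units.
import numpy as np

rng = np.random.default_rng(0)
w1, w2 = rng.standard_normal((8, 4)), rng.standard_normal((2, 8))   # tiny 2-layer model
x, y = rng.standard_normal((32, 4)), rng.integers(0, 2, 32)         # example data

def accuracy(mask):
    h = np.maximum((x @ w1.T) * mask, 0.0)      # hidden units can be masked off
    return ((h @ w2.T).argmax(1) == y).mean()

base = accuracy(np.ones(8))
impact = []
for unit in range(8):
    mask = np.ones(8)
    mask[unit] = 0.0
    impact.append(base - accuracy(mask))        # assessed impact of this element

keep = np.argsort(impact)[2:]                   # de-allocate the two lowest-impact units
print(sorted(impact), keep)
```
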
-
Publication number: 20220308999
Abstract: An apparatus comprises snoop filter storage circuitry to store snoop filter entries corresponding to addresses and comprising sharer information. Control circuitry selects which sharers, among a plurality of sharers capable of holding cached data, should be issued with snoop requests corresponding to a target address, based on the sharer information of the snoop filter entry corresponding to the target address. The control circuitry is capable of setting a given snoop filter entry corresponding to a given address to an imprecise encoding in which the sharer information provides an imprecise description of which sharers hold cached data corresponding to the given address, and the given snoop filter entry comprises at least one sharer count value indicative of a number of sharers holding cached data corresponding to said address.
Type: Application
Filed: March 29, 2021
Publication date: September 29, 2022
Inventors: Joshua Randall, Jamshed Jalal, Tushar P. Ringe, Jesse Garrett Beu
-
Patent number: 11392376
Abstract: A data processor receives a first set of processor instructions for combining a first matrix with a second matrix to produce a third matrix and generates a second set of processor instructions therefrom by identifying values of non-zero elements of the first matrix stored in a memory of the data processor and determining memory locations of elements of the second matrix. An instruction of the second set of processor instructions includes a determined memory location and/or an explicit value of an identified non-zero element. The second set of processor instructions is executed by the data processor. The second set of processor instructions may be generated by just-in-time compilation of the first set of processor instructions and may include instructions of a custom instruction set architecture.
Type: Grant
Filed: April 11, 2019
Date of Patent: July 19, 2022
Assignee: Arm Limited
Inventors: Zhigang Liu, Matthew Mattina, Paul Nicholas Whatmough, Jesse Garrett Beu
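
A sketch of the specialization step in plain Python: scan the sparse first matrix once, emit a flat list of multiply-accumulate "instructions" with the non-zero values and operand locations baked in, then execute that list against any second matrix. The tuple encoding stands in for the custom instruction set architecture mentioned above and is purely illustrative.

```python
# Illustrative sketch of generating a second, specialized instruction stream
# from a sparse first matrix, then executing it.
import numpy as np

def compile_sparse(a):
    """First pass: turn matrix A into (row, value, k) multiply-accumulate ops."""
    program = []
    for i, j in zip(*np.nonzero(a)):
        program.append((int(i), float(a[i, j]), int(j)))   # explicit value + location
    return program

def execute(program, b, m):
    c = np.zeros((m, b.shape[1]))
    for row, value, k in program:            # "second set of processor instructions"
        c[row] += value * b[k]               # MAC with an immediate operand
    return c

a = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 0.0]])
b = np.random.randn(3, 4)
prog = compile_sparse(a)
print(np.allclose(execute(prog, b, m=2), a @ b))   # True
```
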
-
Patent number: 11249657
Abstract: Non-volatile storage circuitry is provided as primary storage accessible to processing circuitry, e.g. as registers, a cache, scratchpad memory, TLB or on-chip RAM. Power control circuitry powers down a given region of the non-volatile storage circuitry when information stored in said given region is not being used. This provides opportunities for more frequent power savings than would be possible if primary storage was implemented using volatile storage.
Type: Grant
Filed: July 10, 2019
Date of Patent: February 15, 2022
Assignee: Arm Limited
Inventors: Christopher Neal Hinds, Jesse Garrett Beu, Alejandro Rico Carro, Jose Alberto Joao
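
A behavioural sketch of the power-control idea only: regions of a non-volatile register file are switched off when their contents are not being used, and switched back on (contents intact) on the next access. The region count and idle threshold are arbitrary assumptions.

```python
# Illustrative sketch: per-region idle tracking, power-down on inactivity, and
# data retention across power-down because the storage is non-volatile.

class NVRegion:
    def __init__(self):
        self.data = {}            # contents survive power-down (non-volatile)
        self.powered = True
        self.idle_cycles = 0

class NVStorage:
    def __init__(self, regions=4, idle_limit=100):
        self.regions = [NVRegion() for _ in range(regions)]
        self.idle_limit = idle_limit

    def access(self, region, key, value=None):
        r = self.regions[region]
        r.powered, r.idle_cycles = True, 0      # power the region back up on use
        if value is not None:
            r.data[key] = value
        return r.data.get(key)

    def tick(self):                             # called every cycle by power control
        for r in self.regions:
            r.idle_cycles += 1
            if r.powered and r.idle_cycles > self.idle_limit:
                r.powered = False               # power down; data is retained

store = NVStorage()
store.access(0, "r1", 42)
for _ in range(200):
    store.tick()                                # region 0 powers down while idle
print(store.regions[0].powered, store.access(0, "r1"))   # False 42 (data retained)
```
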
-
Publication number: 20210389948
Abstract: A mixed-element-size instruction is described, which specifies a first operand and a second operand stored in registers. In response to the mixed-element-size instruction, an instruction decoder controls processing circuitry to perform an arithmetic/logical operation on two or more first data elements of the first operand and two or more second data elements of the second operand, where the first data elements have a larger data element size than the second data elements. This is particularly useful for machine learning applications to improve processing throughput and memory bandwidth utilisation.
Type: Application
Filed: June 10, 2020
Publication date: December 16, 2021
Inventors: Jesse Garrett BEU, Dibakar GOPE, David Hennah MANSELL
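
A small emulation of the idea of reading the two operands at different element sizes. The 16-bit/8-bit split, the lane pairing, and the little-endian byte view are assumptions chosen for the example; the application does not fix them here.

```python
# Illustrative sketch: one "instruction" whose first operand is read as 16-bit
# lanes and whose second operand is read as 8-bit lanes of the same 64-bit
# register width, emulated with numpy views (assumes a little-endian host).
import numpy as np

def mixed_size_mul(reg_a: np.uint64, reg_b: np.uint64) -> np.ndarray:
    a16 = np.frombuffer(np.uint64(reg_a).tobytes(), dtype=np.int16)   # 4 x 16-bit lanes
    b8 = np.frombuffer(np.uint64(reg_b).tobytes(), dtype=np.int8)     # 8 x 8-bit lanes
    # pair each 16-bit lane with the matching low 8-bit lane (one of several
    # plausible lane correspondences)
    return a16.astype(np.int32) * b8[::2].astype(np.int32)

print(mixed_size_mul(np.uint64(0x0004_0003_0002_0001),
                     np.uint64(0x0807_0605_0403_0201)))   # [ 1  6 15 28]
```
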
-
Patent number: 11151039
Abstract: An apparatus is provided for receiving requests from a plurality of processing units, at least some of which may have associated cache storage. A snoop unit implements a cache coherency protocol when a request received by the apparatus identifies a cacheable memory address. Snoop filter storage is provided comprising an N-way set associative storage structure with a plurality of entries. Each entry stores coherence data for an associated address range identifying a memory block, and the coherence data is used to determine which cache storages need to be subjected to a snoop operation when implementing the cache coherency protocol in response to a received request. The snoop filter storage stores coherence data for memory blocks of at least a plurality P of different size granularities, and is organised as a plurality of at least P banks that are accessible in parallel, where each bank has entries within each of the N-ways of the snoop filter storage.
Type: Grant
Filed: March 17, 2020
Date of Patent: October 19, 2021
Assignee: Arm Limited
Inventors: Joshua Randall, Jesse Garrett Beu
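
A behavioural sketch of the banked organisation, assuming P = 2 granularities (64 B and 4 KB blocks); the dict-per-set model and the sequential loop standing in for parallel bank probes are simplifications, not the claimed structure.

```python
# Illustrative sketch: one snoop filter bank per tracked block-size granularity,
# all banks probed on every lookup.

class Bank:
    def __init__(self, block_size, num_sets, num_ways):
        self.block_size, self.num_sets, self.num_ways = block_size, num_sets, num_ways
        self.sets = [dict() for _ in range(num_sets)]      # tag -> coherence data

    def _index(self, addr):
        block = addr // self.block_size
        return block % self.num_sets, block

    def lookup(self, addr):
        s, tag = self._index(addr)
        return self.sets[s].get(tag)

    def insert(self, addr, coherence):
        s, tag = self._index(addr)
        if tag not in self.sets[s] and len(self.sets[s]) >= self.num_ways:
            self.sets[s].popitem()                         # simplistic eviction
        self.sets[s][tag] = coherence

banks = [Bank(64, 1024, 8), Bank(4096, 1024, 8)]           # P = 2 granularities

def lookup_all(addr):
    # in hardware the banks are probed in parallel; the loop stands in for that
    return [b.lookup(addr) for b in banks]

banks[1].insert(0x1234, {"sharers": {2, 5}})               # tracked at 4 KB granularity
print(lookup_all(0x1FF0))                                  # [None, {'sharers': {2, 5}}]
```
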
-
Publication number: 20210294743
Abstract: An apparatus is provided for receiving requests from a plurality of processing units, at least some of which may have associated cache storage. A snoop unit implements a cache coherency protocol when a request received by the apparatus identifies a cacheable memory address. Snoop filter storage is provided comprising an N-way set associative storage structure with a plurality of entries. Each entry stores coherence data for an associated address range identifying a memory block, and the coherence data is used to determine which cache storages need to be subjected to a snoop operation when implementing the cache coherency protocol in response to a received request. The snoop filter storage stores coherence data for memory blocks of at least a plurality P of different size granularities, and is organised as a plurality of at least P banks that are accessible in parallel, where each bank has entries within each of the N-ways of the snoop filter storage.
Type: Application
Filed: March 17, 2020
Publication date: September 23, 2021
Inventors: Joshua RANDALL, Jesse Garrett BEU
-
Patent number: 11042375
Abstract: An apparatus and method of operating the apparatus are provided for performing a count operation. Instruction decoder circuitry is responsive to a count instruction specifying an input data item to generate control signals to control the data processing circuitry to perform a count operation. The count operation determines a count value indicative of a number of input elements of a subset of elements in the specified input data item which have a value which matches a reference value in a reference element in a reference data item. A plurality of count operations may be performed to determine a count data item corresponding to the input data item. A register scatter storage instruction, a gather index generation instruction, and respective apparatuses responsive to them, as well as simulator implementations, are also provided.
Type: Grant
Filed: August 1, 2017
Date of Patent: June 22, 2021
Assignee: ARM Limited
Inventors: Mbou Eyole, Jesse Garrett Beu, Alejandro Martinez Vicente, Timothy Hayes
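
A numpy sketch of the count operation itself: for each reference element, count how many elements of the corresponding subset of the input data item match it. The subset size and vector lengths are arbitrary examples, and the related scatter/gather instructions from the abstract are not modelled.

```python
# Illustrative sketch: per-reference-element match counts over subsets of the
# input data item, producing the count data item.
import numpy as np

def count_matches(input_item: np.ndarray, reference_item: np.ndarray, subset: int) -> np.ndarray:
    """input_item: (len(reference_item) * subset,) values; returns one count
    per reference element, i.e. the count data item."""
    lanes = input_item.reshape(len(reference_item), subset)
    return (lanes == reference_item[:, None]).sum(axis=1)

data = np.array([3, 1, 3, 3,   7, 7, 0, 2])    # two subsets of four elements
refs = np.array([3, 7])                        # reference data item
print(count_matches(data, refs, subset=4))     # [3 2]
```
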