Patents by Inventor Matthew Mattina

Matthew Mattina has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Time domain unrolling sparse matrix multiplication system and method

Patent number: 11928176

Abstract: A system and method for multiplying matrices are provided. The system includes a processor coupled to a memory and a matrix multiply accelerator (MMA) coupled to the processor. The MMA is configured to multiply, based on a bitmap, a compressed first matrix and a second matrix to generate an output matrix including, for each element i,j of the output matrix, a calculation of a dot product of an ith row of the compressed first matrix and a jth column of the second matrix based on the bitmap. Or, the MMA is configured to multiply, based on the bitmap, the second matrix and the compressed first matrix and to generate the output matrix including, for each element i,j of the output matrix, a calculation of a dot product of an ith row of the second matrix and a jth column of the compressed first matrix based on the bitmap.

Type: Grant

Filed: November 24, 2020

Date of Patent: March 12, 2024

Assignee: Arm Limited

Inventors: Zhi-Gang Liu, Paul Nicholas Whatmough, Matthew Mattina
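The bitmap-based multiplication described in the abstract above can be illustrated in software. The Python sketch below is only an approximation of the idea: it stores the non-zero values of each row together with a 0/1 bitmap, then uses the bitmap to decide which rows of the second matrix contribute to each dot product. The storage layout and the names `compress` and `bitmap_matmul` are assumptions made for this example and are not taken from the patent.

```python
def compress(matrix):
    """Return (values, bitmap): per-row non-zero values and a 0/1 mask."""
    values = [[x for x in row if x != 0] for row in matrix]
    bitmap = [[1 if x != 0 else 0 for x in row] for row in matrix]
    return values, bitmap


def bitmap_matmul(values, bitmap, dense):
    """Multiply a bitmap-compressed matrix by a dense matrix."""
    n_cols = len(dense[0])
    out = []
    for row_vals, row_bits in zip(values, bitmap):
        out_row = [0] * n_cols
        nonzeros = iter(row_vals)
        for k, bit in enumerate(row_bits):
            if bit:  # zero positions are skipped entirely
                a = next(nonzeros)
                for j in range(n_cols):
                    out_row[j] += a * dense[k][j]
        out.append(out_row)
    return out


a = [[0, 2, 0], [1, 0, 3]]
b = [[1, 2], [3, 4], [5, 6]]
values, bitmap = compress(a)
result = bitmap_matmul(values, bitmap, b)
print(result)
```

Only the positions flagged in the bitmap trigger a multiplication, which is the source of the savings when the first matrix is sparse.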

Refactoring mac operations

Patent number: 11922169

Abstract: A method and apparatus for performing refactored multiply-and-accumulate operations is provided. A summing array includes a plurality of non-volatile memory elements arranged in columns. Each non-volatile memory element in the summing array is programmed to a high resistance state or a low resistance state based on weights of a neural network. The summing array is configured to generate a summed signal for each column based, at least in part, on a plurality of input signals. A multiplying array is coupled to the summing array, and includes a plurality of non-volatile memory elements. Each non-volatile memory element in the multiplying array is programmed to a different conductance level based on the weights of the neural network. The multiplying array is configured to generate an output signal based, at least in part, on the summed signals from the summing array.

Type: Grant

Filed: February 17, 2022

Date of Patent: March 5, 2024

Assignee: Arm Limited

Inventors: Matthew Mattina, Shidhartha Das, Glen Arnold Rosendale, Fernando Garcia Redondo
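The arithmetic behind the refactoring described in the abstract above can be sketched in Python. The example assumes the weights take only a few distinct levels, so inputs that share a weight level can be summed first and each level multiplied once. It models the arithmetic identity only, not the non-volatile memory arrays, and the function names are hypothetical.

```python
from collections import defaultdict


def direct_mac(weights, inputs):
    """Conventional multiply-accumulate: one multiply per input."""
    return sum(w * x for w, x in zip(weights, inputs))


def refactored_mac(weights, inputs):
    """Sum inputs per weight level first, then multiply once per level."""
    sums = defaultdict(int)
    for w, x in zip(weights, inputs):
        sums[w] += x  # summing stage: group inputs by weight level
    # multiplying stage: one multiplication per distinct level
    return sum(level * total for level, total in sums.items())


weights = [3, 1, 3, 2, 1, 3]
inputs = [4, 5, 6, 7, 8, 9]
print(direct_mac(weights, inputs), refactored_mac(weights, inputs))
```

With three distinct weight levels, the refactored form needs three multiplications instead of six while producing the same result.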

SYSTEM, DEVICES AND/OR PROCESSES FOR DEFINING A SEARCH SPACE FOR NEURAL NETWORK PROCESSING DEVICE ARCHITECTURES

Publication number: 20240046065

Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to determine options for decisions in connection with design features of a computing device. In a particular implementation, design options for two or more design decisions of a neural network processing device may be identified based, at least in part, on a combination of a definition of available computing resources and one or more predefined performance constraints.

Type: Application

Filed: August 3, 2022

Publication date: February 8, 2024

Inventors: Hokchhay Tann, Ramon Matas Navarro, Igor Fedorov, Chuteng Zhou, Paul Nicholas Whatmough, Matthew Mattina

Non-volatile memory-based compact mixed-signal multiply-accumulate engine

Patent number: 11886987

Abstract: A multiply-accumulate method and architecture are disclosed. The architecture includes a plurality of networks of non-volatile memory elements arranged in tiled columns. Logic digitally modulates the equivalent conductance of individual networks among the plurality of networks to map the equivalent conductance of each individual network to a single weight within the neural network. A first partial selection of weights within the neural network is mapped into the equivalent conductances of the networks in the columns to enable the computation of multiply-and-accumulate operations by mixed-signal computation. The logic updates the mappings to select a second partial selection of weights to compute additional multiply-and-accumulate operations and repeats the mapping and computation operations until all computations for the neural network are completed.

Type: Grant

Filed: June 25, 2019

Date of Patent: January 30, 2024

Assignee: Arm Limited

Inventors: Shidhartha Das, Matthew Mattina, Glen Arnold Rosendale, Fernando Garcia Redondo

Hardware accelerator for IM2COL operation

Patent number: 11783163

Abstract: The present disclosure advantageously provides a matrix expansion unit that includes an input data selector, a first register set, a second register set, and an output data selector. The input data selector is configured to receive first matrix data in a columnwise format. The first register set is coupled to the input data selector, and includes a plurality of data selectors and a plurality of registers arranged in a first shift loop. The second register set is coupled to the input data selector, and includes a plurality of data selectors and a plurality of registers arranged in a second shift loop. The output data selector is coupled to the first register set and the second register set, and is configured to output second matrix data in a rowwise format.

Type: Grant

Filed: June 15, 2020

Date of Patent: October 10, 2023

Assignee: Arm Limited

Inventors: Zhi-Gang Liu, Paul Nicholas Whatmough, Matthew Mattina
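For readers unfamiliar with the IM2COL operation named in the title above, the Python sketch below shows the transformation in software: each kernel-sized patch of an input is unrolled into one row, so that a convolution becomes a set of dot products. It assumes stride 1 and no padding, and it does not model the register sets or shift loops of the patented unit. The function names are hypothetical.

```python
def im2col(image, kh, kw):
    """Unroll each kh x kw patch (stride 1, no padding) into one row."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def conv2d_via_im2col(image, kernel):
    """Convolve (cross-correlate) by taking dot products with patch rows."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [k for row in kernel for k in row]
    patches = im2col(image, kh, kw)
    out_w = len(image[0]) - kw + 1
    dots = [sum(p * k for p, k in zip(patch, flat_kernel))
            for patch in patches]
    # Reshape the flat list of dot products into output rows.
    return [dots[r:r + out_w] for r in range(0, len(dots), out_w)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
patches = im2col(image, 2, 2)
output = conv2d_via_im2col(image, kernel)
print(patches)
print(output)
```

The expanded patch matrix repeats input elements, which is why a dedicated expansion unit can be attractive: the duplication can happen on the fly rather than in memory.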

Multi-dimensional data path architecture

Patent number: 11693796

Abstract: Various implementations described herein are directed to a device having a multi-layered logic structure with a first logic layer and a second logic layer arranged vertically in a stacked configuration. The device may have a memory array that provides data, and also, the device may have an inter-layer data bus that vertically couples the memory array to the multi-layered logic structure. The inter-layer data bus may provide multiple data paths to the first logic layer and the second logic layer for reuse of the data provided by the memory array.

Type: Grant

Filed: May 31, 2021

Date of Patent: July 4, 2023

Assignee: Arm Limited

Inventors: Paul Nicholas Whatmough, Zhi-Gang Liu, Supreet Jeloka, Saurabh Pijuskumar Sinha, Matthew Mattina

System, method and apparatus for training neural networks using multiple datasets

Patent number: 11640533

Abstract: A system, an apparatus and methods for utilizing software and hardware portions of a neural network to fix, or hardwire, certain portions, while modifying other portions are provided. A first set of weights for layers of a first neural network is established, and selected weights are modified to generate a second set of weights, based on a second dataset. The second set of weights is then used to train a second neural network.

Type: Grant

Filed: August 3, 2018

Date of Patent: May 2, 2023

Assignee: Arm Limited

Inventors: Paul Nicholas Whatmough, Matthew Mattina, Jesse Garrett Beu

Matrix Multiply Accelerator for Variable Bitwidth Operands

Publication number: 20230103312

Abstract: A processor, computer-based method and apparatus for performing matrix multiplication are provided. The processor obtains a first bit-slice vector comprising m elements, obtains a second bit-slice vector comprising n elements, provides at least one element of the first bit-slice vector as a first input to a single-bit dot product unit, provides at least one element of the second bit-slice vector as a second input to the single-bit dot product unit, and obtains, from the single-bit dot product unit, an output comprising at least a partial dot product of the first and second bit-slice vectors.

Type: Application

Filed: March 30, 2022

Publication date: April 6, 2023

Applicant: Arm Limited

Inventors: Zhi-Gang Liu, Paul Nicholas Whatmough, Matthew Mattina, John Fremont Brown, III

Matrix Multiply Accelerator For Variable Bitwidth Operands

Publication number: 20230108629

Abstract: A system and method for multiplying first and second matrices are provided. For the first matrix, a number of bit slice vectors for each row are generated based on the bit resolution, and a first bit slice tensor is generated based on the bit slice vectors for each row. For the second matrix, a number of bit slice vectors for each column are generated based on the bit resolution, and a second bit slice tensor is generated based on the bit slice vectors for each column. The first and second bit slice tensors are multiplied by a matrix multiply accelerator (MMA) to generate an output matrix.

Type: Application

Filed: October 4, 2021

Publication date: April 6, 2023

Applicant: Arm Limited

Inventors: Zhi-Gang Liu, Paul Nicholas Whatmough, Matthew Mattina, John Fremont Brown, III
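The bit-slice decomposition described in the abstract above can be checked with a small Python sketch. It assumes unsigned integer operands of a fixed bit resolution, splits each row of the first matrix and each column of the second into single-bit vectors, and recombines the single-bit dot products with shifts. The function names and the least-significant-bit-first slice order are assumptions made for this example, not details from the application.

```python
def bit_slices(vector, bits):
    """Split unsigned integers into `bits` 0/1 vectors, LSB first."""
    return [[(x >> b) & 1 for x in vector] for b in range(bits)]


def single_bit_dot(u, v):
    """Dot product of two 0/1 vectors: AND each pair, then count ones."""
    return sum(a & b for a, b in zip(u, v))


def bitslice_matmul(a, b, bits):
    """Multiply matrices of unsigned `bits`-bit integers via bit slices."""
    row_slices = [bit_slices(row, bits) for row in a]
    col_slices = [bit_slices(col, bits) for col in zip(*b)]
    out = []
    for rs in row_slices:
        out_row = []
        for cs in col_slices:
            acc = 0
            for p, u in enumerate(rs):
                for q, v in enumerate(cs):
                    # Slice p of a row and slice q of a column carry
                    # a combined weight of 2**(p + q).
                    acc += single_bit_dot(u, v) << (p + q)
            out_row.append(acc)
        out.append(out_row)
    return out


a = [[3, 1], [2, 0]]
b = [[1, 2], [3, 1]]
product = bitslice_matmul(a, b, bits=2)
print(product)
```

Because every partial product involves only single bits, the same datapath can serve different operand widths by changing the number of slices.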

Nibble Block Format

Publication number: 20230076138

Abstract: A matrix multiplication system and method are provided. The system includes a memory that stores one or more weight tensors, a processor and a matrix multiply accelerator (MMA). The processor converts each weight tensor into an encoded block set that is stored in the memory. Each encoded block set includes a number of encoded blocks, and each encoded block includes a data field and an index field. The MMA converts each encoded block set into a reconstructed weight tensor, and convolves each reconstructed weight tensor and an input data tensor to generate an output data matrix.

Type: Application

Filed: September 9, 2021

Publication date: March 9, 2023

Applicant: Arm Limited

Inventors: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina

SYSTEM, DEVICES AND/OR PROCESSES FOR DESIGNING NEURAL NETWORK PROCESSING DEVICES

Publication number: 20230042271

Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to select options for decisions in connection with design features of a computing device. In a particular implementation, design options for two or more design decisions of a neural network processing device may be selected based, at least in part, on a combination of function values that are computed based, at least in part, on a tensor expressing sample neural network weights.

Type: Application

Filed: August 4, 2021

Publication date: February 9, 2023

Inventors: Igor Fedorov, Ramon Matas Navarro, Chuteng Zhou, Hokchhay Tann, Paul Nicholas Whatmough, Matthew Mattina

SYSTEM, CIRCUIT, DEVICE AND/OR PROCESSES FOR ACCUMULATING NEURAL NETWORK SIGNALS

Publication number: 20230026113

Abstract: Example methods, devices and/or circuits to be implemented in a processing device to perform neural network-based computing operations are disclosed. According to an embodiment, an accumulation of weighted activation input values may be computed on accumulation cycles at least in part by multiplying and/or scaling accumulated activation input values by an associated neural network weight.

Type: Application

Filed: July 21, 2021

Publication date: January 26, 2023

Inventors: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina

Mixed-precision computation unit

Patent number: 11561767

Abstract: The present disclosure advantageously provides a mixed precision computation (MPC) unit for executing one or more mixed-precision layers of an artificial neural network (ANN). The MPC unit includes a multiplier circuit configured to input a pair of operands and output a product, a first adder circuit coupled to the multiplier circuit, a second adder circuit, coupled to the first adder circuit, configured to input a pair of operands, an accumulator circuit, coupled to the multiplier circuit and the first adder circuit, configured to output an accumulated value, and a controller, coupled to the multiplier circuit, the first adder circuit, the second adder circuit and the accumulator circuit, configured to input a mode control signal. The controller has a plurality of operating modes including a high precision mode, a low precision add mode and a low precision multiply mode.

Type: Grant

Filed: March 31, 2020

Date of Patent: January 24, 2023

Assignee: Arm Limited

Inventors: Dibakar Gope, Jesse Garrett Beu, Paul Nicholas Whatmough, Matthew Mattina

Artificial neural network optical hardware accelerator

Patent number: 11526743

Abstract: The present disclosure advantageously provides an Optical Hardware Accelerator (OHA) for an Artificial Neural Network (ANN) that includes a communication bus interface, a memory, a controller, and an optical computing engine (OCE). The OCE is configured to execute an ANN model with ANN weights. Each ANN weight includes a quantized phase shift value θi and a phase shift value φi. The OCE includes a digital-to-optical (D/O) converter configured to generate input optical signals based on the input data, an optical neural network (ONN) configured to generate output optical signals based on the input optical signals, and an optical-to-digital (O/D) converter configured to generate the output data based on the output optical signals. The ONN includes a plurality of optical units (OUs), and each OU includes an optical multiply and accumulate (OMAC) module.

Type: Grant

Filed: March 13, 2020

Date of Patent: December 13, 2022

Assignee: Arm Limited

Inventors: Zhi-Gang Liu, Matthew Mattina, John Fremont Brown, III

Multi-Dimensional Data Path Architecture

Publication number: 20220382690

Abstract: Various implementations described herein are directed to a device having a multi-layered logic structure with a first logic layer and a second logic layer arranged vertically in a stacked configuration. The device may have a memory array that provides data, and also, the device may have an inter-layer data bus that vertically couples the memory array to the multi-layered logic structure. The inter-layer data bus may provide multiple data paths to the first logic layer and the second logic layer for reuse of the data provided by the memory array.

Type: Application

Filed: May 31, 2021

Publication date: December 1, 2022

Inventors: Paul Nicholas Whatmough, Zhi-Gang Liu, Supreet Jeloka, Saurabh Pijuskumar Sinha, Matthew Mattina

Modulo operation unit

Patent number: 11507813

Abstract: The present disclosure advantageously provides a modulo operation unit that includes a first input configured to receive operand data, a second input configured to receive modulus data, an initial modulo stage, a sequence of intermediate modulo stages, and a final modulo stage.

Type: Grant

Filed: June 1, 2020

Date of Patent: November 22, 2022

Assignee: Arm Limited

Inventors: Zhi-Gang Liu, Matthew Mattina

Pipelined accumulator

Patent number: 11501151

Abstract: The present disclosure advantageously provides a pipelined accumulator that includes a data selector configured to receive a sequence of operands to be summed, an input register coupled to the data selector, an output register, coupled to the data selector, configured to store a sequence of partial sums and output a final sum, and a multi-stage add module coupled to the input register and the output register. The multi-stage add module is configured to store a sequence of partial sums and a final sum in a redundant format, and perform back-to-back accumulation into the output register.

Type: Grant

Filed: May 28, 2020

Date of Patent: November 15, 2022

Assignee: Arm Limited

Inventors: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina
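The abstract above mentions keeping partial sums in a redundant format without naming one. As one possible reading, the Python sketch below uses the well-known carry-save representation, in which a running total is held as a separate sum word and carry word so that each accumulation step needs no carry propagation. Carry-save is an assumption made for this example, the operands are assumed to be non-negative integers, and the function names are hypothetical.

```python
def carry_save_add(s, c, x):
    """3:2 compression: return (s2, c2) with s2 + c2 == s + c + x."""
    new_sum = s ^ c ^ x                             # bitwise sum, no carries
    new_carry = ((s & c) | (s & x) | (c & x)) << 1  # majority bits, shifted
    return new_sum, new_carry


def accumulate(operands):
    """Accumulate in carry-save form; resolve carries once at the end."""
    s, c = 0, 0
    for x in operands:
        s, c = carry_save_add(s, c, x)
    return s + c  # the only carry-propagating addition


total = accumulate([5, 9, 12, 7])
print(total)
```

Deferring carry propagation to a single final addition is what lets a new operand be accepted on every step in this style of accumulator.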

Processor for sparse matrix computation

Patent number: 11392376

Abstract: A data processor receives a first set of processor instructions for combining a first matrix with a second matrix to produce a third matrix and generates a second set of processor instructions therefrom by identifying values of non-zero elements of the first matrix stored in a memory of the data processor and determining memory locations of elements of the second matrix. An instruction of the second set of processor instructions includes a determined memory location and/or an explicit value of an identified non-zero element. The second set of processor instructions is executed by the data processor. The second set of processor instructions may be generated by just-in-time compilation of the first set of processor instructions and may include instructions of a custom instruction set architecture.

Type: Grant

Filed: April 11, 2019

Date of Patent: July 19, 2022

Assignee: Arm Limited

Inventors: Zhigang Liu, Matthew Mattina, Paul Nicholas Whatmough, Jesse Garrett Beu

Apparatus and method for matrix operations

Patent number: 11379556

Abstract: There is provided a data processing apparatus to perform an operation on a first matrix and a second matrix. The data processing apparatus includes receiver circuitry to receive elements of the first matrix, elements of the second matrix, and correspondence data to indicate where the elements of the first matrix are located in the first matrix. Determination circuitry performs, using the correspondence data, a determination of whether, for a given element of the first matrix in column i of the first matrix, a given element of the second matrix occurs in row i of the second matrix. Aggregation circuitry calculates an aggregation between a given row in the first matrix and a given column in the second matrix and includes: functional circuitry to perform, in dependence on the determination, a function on the given element of the first matrix and the given element of the second matrix to produce a partial result.

Type: Grant

Filed: May 21, 2019

Date of Patent: July 5, 2022

Assignee: Arm Limited

Inventors: Matthew Mattina, Zhigang Liu, Paul Nicholas Whatmough, David Hennah Mansell

Refactoring Mac Operations

Publication number: 20220179658

Abstract: A method and apparatus for performing refactored multiply-and-accumulate operations is provided. A summing array includes a plurality of non-volatile memory elements arranged in columns. Each non-volatile memory element in the summing array is programmed to a high resistance state or a low resistance state based on weights of a neural network. The summing array is configured to generate a summed signal for each column based, at least in part, on a plurality of input signals. A multiplying array is coupled to the summing array, and includes a plurality of non-volatile memory elements. Each non-volatile memory element in the multiplying array is programmed to a different conductance level based on the weights of the neural network. The multiplying array is configured to generate an output signal based, at least in part, on the summed signals from the summing array.

Type: Application

Filed: February 17, 2022

Publication date: June 9, 2022

Applicant: Arm Limited

Inventors: Matthew Mattina, Shidhartha Das, Glen Arnold Rosendale, Fernando Garcia Redondo
