Patents by Inventor Alexander Matveev

Alexander Matveev has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEM AND METHOD OF NEURAL NETWORK PROCESSING REDUCING INSTRUCTION USAGE

Publication number: 20240249124

Abstract: A system and method for executing or training a neural network (NN) may, using a computer processor, for a matrix A, for each row in A, and for each unique value z appearing in one or more locations in that row: sum the set of rows in a matrix B that correspond to the indices of z in the row in A, the summing producing a vector; multiply the vector by the unique value z to produce a product vector; and add the product vector to the row in an output matrix C which corresponds to the row in A.

Type: Application

Filed: January 23, 2024

Publication date: July 25, 2024

Applicant: Neuralmagic Inc.

Inventors: Nir SHAVIT, Alexander MATVEEV, Tyler Michael SMITH
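As an illustration only, the arithmetic described in the abstract of publication 20240249124 can be sketched in plain Python. This is a minimal sketch of the grouping idea, not the claimed implementation: the function names, the list-of-lists matrix layout, and the choice to skip zero entries are all assumptions made for this example.

```python
from collections import defaultdict


def matmul_by_unique_values(A, B):
    """Compute C = A @ B by grouping each row of A by its unique values."""
    n_cols = len(B[0])
    C = [[0.0] * n_cols for _ in A]
    for i, row in enumerate(A):
        # Map each unique value z in this row to the indices where it appears.
        positions = defaultdict(list)
        for k, z in enumerate(row):
            if z != 0:  # assumption: zeros are skipped, they contribute nothing
                positions[z].append(k)
        for z, idxs in positions.items():
            # Sum the rows of B selected by the indices of z (additions only).
            summed = [sum(B[k][j] for k in idxs) for j in range(n_cols)]
            # Multiply once per unique value and accumulate into row i of C.
            for j in range(n_cols):
                C[i][j] += z * summed[j]
    return C


def matmul_reference(A, B):
    """Ordinary row-by-column product, used only to check the result."""
    return [[sum(a * B[k][j] for k, a in enumerate(row))
             for j in range(len(B[0]))] for row in A]


A = [[2, 2, 0, 3], [1, 1, 1, 1]]
B = [[1, 2], [3, 4], [5, 6], [7, 8]]
C = matmul_by_unique_values(A, B)
print(C)  # [[29.0, 36.0], [16.0, 20.0]]
```

In the second row of A every entry is 1, so the sketch performs one multiplication per output element instead of four, which is the kind of saving the abstract points to when a row contains few distinct values.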
System and method of determining and executing deep tensor columns in neural networks

Patent number: 12033053

Abstract: Embodiments of the invention may execute a NN by executing sub-tensor columns, each sub-tensor column including computations from portions of layers of the NN, and each sub-tensor column performing computations entirely within a first layer of cache (e.g. L2 in one embodiment) and saving its output entirely within a second layer of cache (e.g. L3 in one embodiment). Embodiments may include partitioning the execution of a NN by partitioning the execution of the NN into sub-tensor columns, each sub-tensor column including computations from portions of layers of the NN, each sub-tensor column performing computations entirely within a first layer of cache and saving its output entirely within a second layer of cache.

Type: Grant

Filed: November 23, 2022

Date of Patent: July 9, 2024

Assignee: NEURALMAGIC, INC.

Inventors: Alexander Matveev, Nir Shavit, Govind Ramnarayan
Systems and methods for improved neural network execution

Patent number: 11960934

Abstract: A method and system for computing one or more outputs of a neural network having a plurality of layers is provided. The method and system can include determining a plurality of sub-computations from total computations of the neural network to execute in parallel wherein the computations to execute in parallel involve computations from multiple layers. The method and system can also include avoiding repeating overlapped computations and/or multiple memory reads and writes during execution.

Type: Grant

Filed: August 8, 2022

Date of Patent: April 16, 2024

Assignee: NEURALMAGIC, INC.

Inventors: Alexander Matveev, Nir Shavit
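To make the idea in the abstract of patent 11960934 more concrete, the following toy sketch splits a two-layer computation into tiles that each span both layers, and shares intermediate results so that values needed by neighbouring tiles are computed once. Everything here is an assumption for illustration: the 3-tap sum "layer", the tile size, and the dictionary cache are hypothetical, and the tiles run sequentially rather than in parallel.

```python
def layer(values, i):
    """One output of a 3-tap 'valid' sum filter over values[i:i+3]."""
    return values[i] + values[i + 1] + values[i + 2]


def run_layer_by_layer(x):
    """Baseline: finish all of layer 1, then all of layer 2."""
    h = [layer(x, i) for i in range(len(x) - 2)]
    return [layer(h, i) for i in range(len(h) - 2)]


def run_in_tiles(x, tile):
    """Compute layer-2 outputs tile by tile, reusing overlapping layer-1 values."""
    n_out = len(x) - 4
    h_cache = {}  # layer-1 results shared between neighbouring tiles
    h_evals = 0   # how many layer-1 outputs were actually computed
    y = []
    for start in range(0, n_out, tile):
        for j in range(start, min(start + tile, n_out)):
            # Output j of layer 2 needs layer-1 outputs j, j+1 and j+2.
            for k in (j, j + 1, j + 2):
                if k not in h_cache:
                    h_cache[k] = layer(x, k)
                    h_evals += 1
            y.append(h_cache[j] + h_cache[j + 1] + h_cache[j + 2])
    return y, h_evals


x = [1, 2, 3, 4, 5, 6, 7, 8]
y, h_evals = run_in_tiles(x, tile=2)
print(y, h_evals)  # [27, 36, 45, 54] 6
```

With two tiles of two outputs each, a tile computed in isolation would need four layer-1 values, eight in total; sharing the overlap brings that down to the six distinct values.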
System and method of determining and executing deep tensor columns in neural networks

Patent number: 11960982

Abstract: A system and method may partition and/or execute a NN, by, for a graph including nodes and hyper edges, each node representing a data item in the NN and each hyper edge representing an operation in the NN, identifying a deep tensor column comprising a subset of the nodes and a subset of the hyper edges, such that the operations in the deep tensor column, when executed, use only data which fits within a preselected cache.

Type: Grant

Filed: October 21, 2022

Date of Patent: April 16, 2024

Assignee: NEURALMAGIC, INC.

Inventors: Alexander Matveev, Nir Shavit, Govind Ramnarayan, Tyler Michael Smith, Sage Moore
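The graph formulation in the abstract of patent 11960982 (nodes as data items, hyper edges as operations) can be illustrated with a small sketch. The greedy selection below is an assumption chosen for brevity, not the method the patent describes; the class name `HyperEdge`, the function `grow_column`, the operation names, and the byte sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperEdge:
    """An operation connecting the data nodes it reads to those it writes."""
    name: str
    inputs: tuple
    outputs: tuple


def grow_column(ops, sizes, cache_bytes):
    """Greedily take operations in order while all touched data fits in cache."""
    chosen_ops, chosen_nodes, used = [], set(), 0
    for op in ops:
        new_nodes = set(op.inputs + op.outputs) - chosen_nodes
        extra = sum(sizes[n] for n in new_nodes)
        if used + extra > cache_bytes:
            break  # adding this operation would overflow the chosen cache
        chosen_ops.append(op.name)
        chosen_nodes |= new_nodes
        used += extra
    return chosen_ops, chosen_nodes, used


# Hypothetical three-operation chain with made-up data sizes in bytes.
sizes = {"x": 4, "w1": 2, "h": 4, "w2": 2, "y": 4, "w3": 8, "z": 4}
ops = [
    HyperEdge("conv1", ("x", "w1"), ("h",)),
    HyperEdge("conv2", ("h", "w2"), ("y",)),
    HyperEdge("conv3", ("y", "w3"), ("z",)),
]
column_ops, column_nodes, used = grow_column(ops, sizes, cache_bytes=16)
print(column_ops, used)  # ['conv1', 'conv2'] 16
```

The returned pair of operation names and node names plays the role of the subset of hyper edges and subset of nodes; the third operation is left out because its weights would push the working set past the 16-byte budget.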
System and method of accelerating execution of a neural network

Patent number: 11797855

Abstract: A system and method of accelerating execution of a NN model, by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model and a second matrix B, representing elements of an input I to kernel K; producing from matrix A, a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to a number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I, by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix.

Type: Grant

Filed: November 4, 2021

Date of Patent: October 24, 2023

Assignee: Neuralmagic, Inc.

Inventors: Alexander Matveev, Dan Alistarh, Justin Kopinsky, Rati Gelashvili, Mark Kurtz, Nir Shavit
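The group-sparse structure described in the abstract of patent 11797855 can be pictured with a short sketch: keep G groups of a fixed size and set everything else to zero. The abstract does not say how the groups are chosen, so selecting contiguous row segments by largest absolute sum is purely an assumption here, as are the function name and the sample values.

```python
def group_sparsify(A, group_size, G):
    """Return a copy of A that keeps only G contiguous groups per matrix.

    Assumptions for this sketch: groups are contiguous runs of `group_size`
    entries within a row, and the G groups with the largest absolute sum win.
    """
    n_rows, n_cols = len(A), len(A[0])
    if n_cols % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    groups = []
    for i in range(n_rows):
        for start in range(0, n_cols, group_size):
            score = sum(abs(v) for v in A[i][start:start + group_size])
            groups.append((score, i, start))
    # Stable sort: ties are broken by position in the matrix.
    keep = sorted(groups, key=lambda g: -g[0])[:G]
    A_prime = [[0] * n_cols for _ in range(n_rows)]
    for _, i, start in keep:
        A_prime[i][start:start + group_size] = A[i][start:start + group_size]
    return A_prime


A = [[1, -1, 5, 4],
     [0, 3, 0.5, 0.5]]
A_prime = group_sparsify(A, group_size=2, G=2)
print(A_prime)  # [[0, 0, 5, 4], [0, 3, 0, 0]]
```

In a real setting `group_size` would be tied to the number of entries a SIMD register holds, so that each surviving group can be loaded and multiplied as a unit; the value 2 above is only for readability.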
Systems and methods for exchange of data in distributed training of machine learning algorithms

Patent number: 11715287

Abstract: Systems and methods may make exchanging data in a neural network (NN) during training more efficient. Exchanging weights among a number of processors training a NN across iterations may include sorting generated weights, compressing the sorted weights, and transmitting the compressed sorted weights. On each Kth iteration a sort order of the sorted weights may be created and transmitted. Exchanging weights among processors training a NN may include executing a forward pass to produce a set of loss values for processors, transmitting loss values to other processors, and at each of the processors, performing backpropagation on at least one layer of the NN using loss values received from other processors.

Type: Grant

Filed: November 16, 2018

Date of Patent: August 1, 2023

Inventors: Alexander Matveev, Nir Shavit
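The sort, compress, and transmit sequence in the abstract of patent 11715287 can be sketched as follows. This is a hedged illustration, not the patented scheme: integer (quantized) weights, delta encoding, `zlib`, the refresh interval `K = 3`, and the simulated weight update are all assumptions, and the functions assume a non-empty weight list.

```python
import struct
import zlib


def make_sort_order(weights):
    """Indices that arrange the weights in ascending order."""
    return sorted(range(len(weights)), key=weights.__getitem__)


def encode(weights, order):
    """Arrange weights by `order`, delta-encode, and compress."""
    arranged = [weights[i] for i in order]
    deltas = [arranged[0]] + [b - a for a, b in zip(arranged, arranged[1:])]
    return zlib.compress(struct.pack(f"<{len(deltas)}i", *deltas))


def decode(payload, order):
    """Invert encode(): decompress, undo the deltas, undo the permutation."""
    deltas = struct.unpack(f"<{len(order)}i", zlib.decompress(payload))
    arranged, total = [], 0
    for d in deltas:
        total += d
        arranged.append(total)
    weights = [0] * len(order)
    for pos, i in enumerate(order):
        weights[i] = arranged[pos]
    return weights


K = 3  # hypothetical: refresh the sort order on every Kth iteration
weights = [5, -3, 12, 0, 7]
order = None
for step in range(6):
    if step % K == 0:
        order = make_sort_order(weights)  # created and sent only on these steps
    payload = encode(weights, order)
    assert decode(payload, order) == weights  # receiver recovers the weights
    # Simulated small training update (made up for this example).
    weights = [w + (i - 2) for i, w in enumerate(weights)]
```

Between refreshes the stale order leaves the weights only approximately sorted, so the deltas stay small and compress well while the cost of sending a new permutation is paid once every K iterations.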
System and method of executing deep tensor columns in neural networks

Patent number: 11556757

Abstract: Embodiments of the invention may execute a NN by executing sub-tensor columns, each sub-tensor column including computations from portions of layers of the NN, and each sub-tensor column performing computations entirely within a first layer of cache (e.g. L2 in one embodiment) and saving its output entirely within a second layer of cache (e.g. L3 in one embodiment). Embodiments may include partitioning the execution of a NN by partitioning the execution of the NN into sub-tensor columns, each sub-tensor column including computations from portions of layers of the NN, each sub-tensor column performing computations entirely within a first layer of cache and saving its output entirely within a second layer of cache.

Type: Grant

Filed: December 10, 2021

Date of Patent: January 17, 2023

Assignee: Neuralmagic Ltd.

Inventors: Alexander Matveev, Nir Shavit, Govind Ramnarayan
SYSTEMS AND METHODS FOR IMPROVED NEURAL NETWORK EXECUTION

Publication number: 20220383068

Abstract: A method and system for computing one or more outputs of a neural network having a plurality of layers is provided. The method and system can include determining a plurality of sub-computations from total computations of the neural network to execute in parallel wherein the computations to execute in parallel involve computations from multiple layers. The method and system can also include avoiding repeating overlapped computations and/or multiple memory reads and writes during execution.

Type: Application

Filed: August 8, 2022

Publication date: December 1, 2022

Applicant: Neuralmagic Inc.

Inventors: Alexander MATVEEV, Nir SHAVIT
Systems and methods for improved neural network execution

Patent number: 11449363

Abstract: A method and system for computing one or more outputs of a neural network having a plurality of layers is provided. The method and system can include determining a plurality of sub-computations from total computations of the neural network to execute in parallel wherein the computations to execute in parallel involve computations from multiple layers. The method and system can also include avoiding repeating overlapped computations and/or multiple memory reads and writes during execution.

Type: Grant

Filed: May 30, 2019

Date of Patent: September 20, 2022

Assignee: Neuralmagic Inc.

Inventors: Alexander Matveev, Nir Shavit
SYSTEM AND METHOD OF ACCELERATING EXECUTION OF A NEURAL NETWORK

Publication number: 20220058486

Abstract: A system and method of accelerating execution of a NN model, by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model and a second matrix B, representing elements of an input I to kernel K; producing from matrix A, a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to a number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I, by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix.

Type: Application

Filed: November 4, 2021

Publication date: February 24, 2022

Applicant: Neuralmagic Inc.

Inventors: Alexander MATVEEV, Dan ALISTARH, Justin KOPINSKY, Rati GELASHVILI, Mark KURTZ, Nir SHAVIT
System and method of accelerating execution of a neural network

Patent number: 11195095

Abstract: A system and method of accelerating execution of a NN model, by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model and a second matrix B, representing elements of an input I to kernel K; producing from matrix A, a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to a number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I, by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix.

Type: Grant

Filed: August 5, 2020

Date of Patent: December 7, 2021

Assignee: NEURALMAGIC INC.

Inventors: Alexander Matveev, Dan Alistarh, Justin Kopinsky, Rati Gelashvili, Mark Kurtz, Nir Shavit
SYSTEM AND METHOD OF ACCELERATING EXECUTION OF A NEURAL NETWORK

Publication number: 20210042624

Abstract: A system and method of accelerating execution of a NN model, by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model and a second matrix B, representing elements of an input I to kernel K; producing from matrix A, a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to a number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I, by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix.

Type: Application

Filed: August 5, 2020

Publication date: February 11, 2021

Applicant: Neuralmagic Inc.

Inventors: Alexander MATVEEV, Dan ALISTARH, Justin KOPINSKY, Rati GELASHVILI, Mark KURTZ, Nir SHAVIT
System and method of executing neural networks

Patent number: 10915816

Abstract: A system and method of inferring a neural network (NN) on one or more target computing devices. The NN may include a plurality of layers, where at least one layer includes one or more kernels. Embodiments may include: receiving a data structure representing the NN; analyzing the data structure to produce one or more tasks, where each task may include computations pertaining to a kernel of the NN; selecting a sparse version of at least one kernel and replacing the at least one kernel with the sparse version; and compiling the one or more tasks to produce one or more respective tensor columns. The one or more tensor columns are adapted to fit in respective one or more cache memories of the one or more target computing devices, and include task instruction code that represents at least one computation of the kernel of the NN.

Type: Grant

Filed: September 18, 2020

Date of Patent: February 9, 2021

Assignee: NEURALMAGIC INC.

Inventors: Alexander Matveev, Nir Shavit
Methods and systems for improved transforms in convolutional neural networks

Patent number: 10902318

Abstract: A system and method for convolutional layer in convolutional neural networks is provided. The convolution is performed via a transformation that includes relocating input, relocating convolution filters and performing an aggregate matrix multiply.

Type: Grant

Filed: November 6, 2018

Date of Patent: January 26, 2021

Assignee: NEURALMAGIC INC.

Inventors: Alexander Matveev, Nir Shavit
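The abstract of patent 10902318 describes convolution as relocating the input, relocating the filters, and performing one aggregate matrix multiply. A well-known transform with that general shape is im2col, sketched below; it is used here as a stand-in for illustration and may differ from the transformation the patent actually describes. The function names and sample data are assumptions, there is no padding or stride, and the operation is the cross-correlation that CNN layers conventionally call convolution.

```python
def im2col(image, kh, kw):
    """Relocate every kh x kw patch of the image into one row of a matrix."""
    H, W = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(kh) for dc in range(kw)]
        for r in range(H - kh + 1)
        for c in range(W - kw + 1)
    ]


def conv2d_as_matmul(image, filters):
    """Apply all filters with a single patches-by-filters matrix product."""
    kh, kw = len(filters[0]), len(filters[0][0])
    patches = im2col(image, kh, kw)                                # P x (kh*kw)
    flat_filters = [[v for row in f for v in row] for f in filters]  # F x (kh*kw)
    # Aggregate multiply: entry [p][f] is patch p dotted with filter f.
    return [[sum(a * b for a, b in zip(patch, f)) for f in flat_filters]
            for patch in patches]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
filters = [
    [[1, 0], [0, 1]],  # sums the main diagonal of each 2 x 2 patch
    [[0, 1], [1, 0]],  # sums the anti-diagonal of each 2 x 2 patch
]
out = conv2d_as_matmul(image, filters)
print(out)  # [[6, 6], [8, 8], [12, 12], [14, 14]]
```

Each output row corresponds to one spatial position (in row-major order) and each column to one filter, so the whole layer reduces to a single dense product that an optimized matrix-multiply routine could handle.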
SYSTEM AND METHOD OF EXECUTING NEURAL NETWORKS

Publication number: 20210004684

Abstract: A system and method of inferring a neural network (NN) on one or more target computing devices. The NN may include a plurality of layers, where at least one layer includes one or more kernels. Embodiments may include: receiving a data structure representing the NN; analyzing the data structure to produce one or more tasks, where each task may include computations pertaining to a kernel of the NN; selecting a sparse version of at least one kernel and replacing the at least one kernel with the sparse version; and compiling the one or more tasks to produce one or more respective tensor columns. The one or more tensor columns are adapted to fit in respective one or more cache memories of the one or more target computing devices, and include task instruction code that represents at least one computation of the kernel of the NN.

Type: Application

Filed: September 18, 2020

Publication date: January 7, 2021

Applicant: Neuralmagic Inc.

Inventors: Alexander MATVEEV, Nir SHAVIT
System and method of executing neural networks

Patent number: 10832133

Abstract: A system and method of inferring a neural network (NN) on one or more target computing devices. The NN may include a plurality of layers, where at least one layer includes one or more kernels. Embodiments may include: receiving a data structure representing the NN; analyzing the data structure to produce one or more tasks, where each task may include computations pertaining to a kernel of the NN; selecting a sparse version of at least one kernel and replacing the at least one kernel with the sparse version; and compiling the one or more tasks to produce one or more respective tensor columns. The one or more tensor columns are adapted to fit in respective one or more cache memories of the one or more target computing devices, and include task instruction code that represents at least one computation of the kernel of the NN.

Type: Grant

Filed: January 24, 2020

Date of Patent: November 10, 2020

Assignee: NEURALMAGIC INC.

Inventors: Alexander Matveev, Nir Shavit
SYSTEM AND METHOD OF EXECUTING NEURAL NETWORKS

Publication number: 20200160182

Abstract: A system and method of inferring a neural network (NN) on one or more target computing devices. The NN may include a plurality of layers, where at least one layer includes one or more kernels. Embodiments may include: receiving a data structure representing the NN; analyzing the data structure to produce one or more tasks, where each task may include computations pertaining to a kernel of the NN; selecting a sparse version of at least one kernel and replacing the at least one kernel with the sparse version; and compiling the one or more tasks to produce one or more respective tensor columns. The one or more tensor columns are adapted to fit in respective one or more cache memories of the one or more target computing devices, and include task instruction code that represents at least one computation of the kernel of the NN.

Type: Application

Filed: January 24, 2020

Publication date: May 21, 2020

Applicant: Neuralmagic Inc.

Inventors: Alexander MATVEEV, Nir Shavit, Aleksandar Zlateski
SYSTEMS AND METHODS FOR IMPROVED NEURAL NETWORK EXECUTION

Publication number: 20190370071

Abstract: A method and system for computing one or more outputs of a neural network having a plurality of layers is provided. The method and system can include determining a plurality of sub-computations from total computations of the neural network to execute in parallel wherein the computations to execute in parallel involve computations from multiple layers. The method and system can also include avoiding repeating overlapped computations and/or multiple memory reads and writes during execution.

Type: Application

Filed: May 30, 2019

Publication date: December 5, 2019

Applicant: Neuralmagic Inc.

Inventors: Alexander Matveev, Nir Shavit
SYSTEMS AND METHODS FOR EXCHANGE OF DATA IN DISTRIBUTED TRAINING OF MACHINE LEARNING ALGORITHMS

Publication number: 20190156215

Abstract: Systems and methods may make exchanging data in a neural network (NN) during training more efficient. Exchanging weights among a number of processors training a NN across iterations may include sorting generated weights, compressing the sorted weights, and transmitting the compressed sorted weights. On each Kth iteration a sort order of the sorted weights may be created and transmitted. Exchanging weights among processors training a NN may include executing a forward pass to produce a set of loss values for processors, transmitting loss values to other processors, and at each of the processors, performing backpropagation on at least one layer of the NN using loss values received from other processors.

Type: Application

Filed: November 16, 2018

Publication date: May 23, 2019

Applicant: Neuralmagic Inc.

Inventors: Alexander MATVEEV, Nir SHAVIT
SYSTEMS AND METHODS FOR EXCHANGE OF DATA IN DISTRIBUTED TRAINING OF MACHINE LEARNING ALGORITHMS

Publication number: 20190156214

Abstract: Systems and methods may make exchanging data in a neural network (NN) during training more efficient. Exchanging weights among a number of processors training a NN across iterations may include sorting generated weights, compressing the sorted weights, and transmitting the compressed sorted weights. On each Kth iteration a sort order of the sorted weights may be created and transmitted. Exchanging weights among processors training a NN may include executing a forward pass to produce a set of loss values for processors, transmitting loss values to other processors, and at each of the processors, performing backpropagation on at least one layer of the NN using loss values received from other processors.

Type: Application

Filed: November 16, 2018

Publication date: May 23, 2019

Inventors: Alexander MATVEEV, Nir Shavit
