Patents by Inventor Rati GELASHVILI

Rati GELASHVILI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11797855
    Abstract: A system and method of accelerating execution of an NN model by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model, and a second matrix B, representing elements of an input I to kernel K; and producing from matrix A a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to, the number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix. (A brief code sketch of this group-sparse multiply appears after the listing.)
    Type: Grant
    Filed: November 4, 2021
    Date of Patent: October 24, 2023
    Assignee: Neuralmagic, Inc.
    Inventors: Alexander Matveev, Dan Alistarh, Justin Kopinsky, Rati Gelashvili, Mark Kurtz, Nir Shavit
  • Publication number: 20220058486
    Abstract: A system and method of accelerating execution of an NN model by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model, and a second matrix B, representing elements of an input I to kernel K; and producing from matrix A a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to, the number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix.
    Type: Application
    Filed: November 4, 2021
    Publication date: February 24, 2022
    Applicant: Neuralmagic Inc.
    Inventors: Alexander MATVEEV, Dan ALISTARH, Justin KOPINSKY, Rati GELASHVILI, Mark KURTZ, Nir SHAVIT
  • Patent number: 11195095
    Abstract: A system and method of accelerating execution of an NN model by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model, and a second matrix B, representing elements of an input I to kernel K; and producing from matrix A a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to, the number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix.
    Type: Grant
    Filed: August 5, 2020
    Date of Patent: December 7, 2021
    Assignee: NEURALMAGIC INC.
    Inventors: Alexander Matveev, Dan Alistarh, Justin Kopinsky, Rati Gelashvili, Mark Kurtz, Nir Shavit
  • Publication number: 20210201124
    Abstract: A computer processor may include a number of cores, a cache shared among the cores, and a local cache associated with each core and used by that core only. Input data for a neural network (NN) layer may be partitioned into a set of tiles of size T×T, and the tile set may be partitioned into blocks of R tiles. For each block, a core may perform a transform operation on the tiles to produce transformed data matrices fitting in a local cache, and a set of multiply operations, each multiply operation using a transformed data matrix and a transformed kernel matrix from a set of transformed kernel matrices. The set of transformed kernel matrices may fit in the shared cache. The result of at least one of the multiply operations may be stored in a location used to store a transformed data matrix. (A brief code sketch of this tiling scheme appears after the listing.)
    Type: Application
    Filed: August 27, 2019
    Publication date: July 1, 2021
    Applicant: Neuralmagic Inc.
    Inventor: Rati GELASHVILI
  • Publication number: 20210042624
    Abstract: A system and method of accelerating execution of an NN model by at least one processor may include: receiving a first matrix A, representing elements of a kernel K of the NN model, and a second matrix B, representing elements of an input I to kernel K; and producing from matrix A a group-sparse matrix A′, comprising G tensors of elements. The number of elements in each tensor is defined by, or equal to, the number of entries in each index of an input tensor register used for a specific Single Instruction Multiple Data (SIMD) tensor operation, and all elements of A′ outside said G tensors are null. The system and method may further include executing kernel K on input I by performing at least one computation of the SIMD tensor operation, having as operands elements of a tensor of the G tensors and corresponding elements of the B matrix.
    Type: Application
    Filed: August 5, 2020
    Publication date: February 11, 2021
    Applicant: Neuralmagic Inc.
    Inventors: Alexander MATVEEV, Dan ALISTARH, Justin KOPINSKY, Rati GELASHVILI, Mark KURTZ, Nir SHAVIT
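 
The group-sparse abstracts above (patent numbers 11797855 and 11195095 and the related publications) can be illustrated with a small sketch. The Python/NumPy example below is not the patented implementation: the group width GROUP, the matrix shapes, and the number of retained groups per row are illustrative assumptions. It only shows the structural idea, namely that the kernel matrix A′ keeps its nonzero elements in aligned groups sized to a SIMD register, so the multiply against B touches one group per operation while all other elements stay null.

```python
import numpy as np

# Minimal sketch, assuming a group width of 8 (e.g., an 8-lane SIMD register).
# Not the patented implementation; shapes and group counts are made up.

GROUP = 8            # hypothetical SIMD register width (lanes per tensor op)
M, K, N = 4, 64, 16  # hypothetical shapes: A is M x K, B is K x N

rng = np.random.default_rng(0)

# Dense kernel A and input B (stand-ins for kernel K and input I).
A = rng.standard_normal((M, K))
B = rng.standard_normal((K, N))

# Build A_prime: keep only a few aligned groups of GROUP elements per row and
# zero everything else ("all elements outside said G tensors are null").
groups_per_row = 2
A_prime = np.zeros_like(A)
group_index = []  # (row, start) of each retained group
for i in range(M):
    starts = rng.choice(K // GROUP, size=groups_per_row, replace=False) * GROUP
    for s in starts:
        A_prime[i, s:s + GROUP] = A[i, s:s + GROUP]
        group_index.append((i, s))

# Group-sparse multiply: each retained group is one SIMD-style multiply-
# accumulate of GROUP kernel elements with GROUP rows of B.
C = np.zeros((M, N))
for i, s in group_index:
    C[i] += A_prime[i, s:s + GROUP] @ B[s:s + GROUP, :]

# The result matches a dense multiply by A_prime, but only G groups were read.
assert np.allclose(C, A_prime @ B)
print("groups processed:", len(group_index), "of", M * (K // GROUP))
```

In a real kernel each retained group would map to one SIMD fused multiply-accumulate over the register lanes; the NumPy slice multiply merely stands in for that instruction.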
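 
Publication 20210201124 above describes a cache-aware tiling of an NN layer. The sketch below is a schematic of that blocking structure only: T, R, the tile and kernel counts, and the stand-in transform matrix are all assumptions, and the NumPy arrays merely model the local-cache and shared-cache buffers named in the abstract.

```python
import numpy as np

# Schematic sketch of the tiling described in publication 20210201124; not the
# patented kernel. T, R, and the array shapes are illustrative assumptions, and
# "transform" is an arbitrary T x T matrix standing in for the real transform.

T, R = 4, 8                      # tile size and tiles per block (assumed)
NUM_TILES, NUM_KERNELS = 32, 3   # assumed problem size

rng = np.random.default_rng(1)
tiles = rng.standard_normal((NUM_TILES, T, T))       # partitioned layer input
transform = rng.standard_normal((T, T))              # stand-in tile transform
kernels = rng.standard_normal((NUM_KERNELS, T, T))   # pre-transformed kernels,
                                                     # shared across cores
outputs = np.empty((NUM_KERNELS, NUM_TILES, T, T))

for block_start in range(0, NUM_TILES, R):
    block = tiles[block_start:block_start + R]

    # Transform the R tiles of this block once; this buffer models the
    # per-core local-cache copy of the transformed data matrices.
    transformed = transform @ block @ transform.T    # shape (R, T, T)

    for k, kernel in enumerate(kernels):
        # Multiply every transformed data matrix by a transformed kernel
        # matrix drawn from the shared (shared-cache resident) set.
        product = transformed @ kernel               # shape (R, T, T)
        outputs[k, block_start:block_start + R] = product

    # As in the description, the result of at least one multiply can be
    # written back over a buffer that held a transformed data matrix,
    # reusing the local-cache space; here the last product overwrites it.
    transformed[:] = product
```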