Patents by Inventor Venmugil Elango

Venmugil Elango has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Hierarchical and shared exponent floating point data types

Patent number: 11886833

Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.
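The scheme in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how the shared exponents are chosen, so the sketch assumes each group's shared exponent is its largest element exponent and that the third (top-level) exponent is the larger of the two. The names `encode`, `decode`, and `mantissa_bits` are hypothetical.

```python
import math

def encode(group_a, group_b, mantissa_bits=7):
    """Pack two groups of floats into a two-level shared-exponent structure."""
    def max_exp(values):
        # math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
        # Assumption: the shared exponent is the largest exponent in the group.
        return max((math.frexp(v)[1] for v in values if v != 0.0), default=0)

    e1, e2 = max_exp(group_a), max_exp(group_b)   # first and second shared exponents
    e3 = max(e1, e2)                              # third (top-level) shared exponent
    d1, d2 = e3 - e1, e3 - e2                     # per-group difference values

    scale = 2 ** mantissa_bits

    def quantize(values, shared):
        packed = []
        for v in values:
            sign = 1 if v < 0 else 0
            # Mantissa relative to the group's shared exponent, clamped to fit.
            mant = min(scale - 1, round(abs(v) / 2.0 ** shared * scale))
            packed.append((sign, mant))
        return packed

    # Only e3, the two differences, and the sign/mantissa pairs are stored.
    return {
        "shared_exp": e3,
        "diffs": (d1, d2),
        "groups": (quantize(group_a, e1), quantize(group_b, e2)),
    }

def decode(packed, mantissa_bits=7):
    """Rebuild approximate floats from the packed structure."""
    scale = 2 ** mantissa_bits
    result = []
    for diff, group in zip(packed["diffs"], packed["groups"]):
        exp = packed["shared_exp"] - diff         # recover the group exponent
        result.append([(-1 if s else 1) * m / scale * 2.0 ** exp for s, m in group])
    return result

packed = encode([1.0, 0.5], [8.0, -2.0])
print(packed["shared_exp"], packed["diffs"])      # 4 (3, 0)
print(decode(packed))                             # [[1.0, 0.5], [8.0, -2.0]]
```

Storing one top-level exponent plus small differences, rather than a full exponent per group, is what makes the layout hierarchical; the differences typically need fewer bits than the exponents themselves.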

Type: Grant

Filed: June 28, 2021

Date of Patent: January 30, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung

SYSTEMS AND METHODS FOR SPARSE MATRIX MULTIPLICATION

Publication number: 20230385374

Abstract: A method for sparse matrix multiplication comprises receiving a first block having M elements in a first dimension, and parsing the first block of M elements into a first set of B sub-blocks including M/B elements in the first dimension. A first sparsity mask having S% sparsity is applied to the first block of elements, such that each of the first set of B sub-blocks has S% sparsity. A second block is received having M elements in a second dimension, and is parsed into a second set of B sub-blocks that include M/B elements in the second dimension. A second sparsity mask having S′% sparsity is applied to the second block of elements, such that S′% of the second set of B sub-blocks have 100% sparsity and (100−S′)% of the second set of B sub-blocks have 0% sparsity. The first and second blocks are then matrix multiplied.
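The two mask shapes described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed method: the blocks are reduced to one-dimensional vectors, sparsity is given as a fraction rather than a percentage, the block length is assumed to be divisible by the number of sub-blocks, and the choice of which elements or sub-blocks to zero (smallest magnitude) is an assumption the abstract does not state. The function names are hypothetical.

```python
def fine_mask(block, num_sub, sparsity):
    """Mask in which every sub-block has the same fraction of zeros."""
    size = len(block) // num_sub              # elements per sub-block
    drop = round(size * sparsity)             # zeros per sub-block
    mask = []
    for start in range(0, len(block), size):
        sub = block[start:start + size]
        # Assumption: zero the smallest-magnitude elements of each sub-block.
        order = sorted(range(size), key=lambda i: abs(sub[i]))
        dropped = set(order[:drop])
        mask.extend(0 if i in dropped else 1 for i in range(size))
    return mask

def coarse_mask(block, num_sub, sparsity):
    """Mask in which whole sub-blocks are either fully zero or fully dense."""
    size = len(block) // num_sub
    drop = round(num_sub * sparsity)          # number of sub-blocks zeroed
    # Assumption: zero the sub-blocks with the smallest sum of magnitudes.
    norms = [sum(abs(x) for x in block[b * size:(b + 1) * size])
             for b in range(num_sub)]
    dropped = set(sorted(range(num_sub), key=lambda b: norms[b])[:drop])
    mask = []
    for b in range(num_sub):
        mask.extend([0 if b in dropped else 1] * size)
    return mask

def masked_dot(a, mask_a, b, mask_b):
    """Multiply the two masked blocks (a dot product in this 1-D sketch)."""
    return sum(x * mx * y * my for x, mx, y, my in zip(a, mask_a, b, mask_b))

a = [1, -4, 3, 2, 5, 0.5, -6, 7]
b = [1, 1, 1, 1, 2, 2, 2, 2]
mask_a = fine_mask(a, num_sub=2, sparsity=0.5)    # [0, 1, 1, 0, 0, 0, 1, 1]
mask_b = coarse_mask(b, num_sub=2, sparsity=0.5)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(masked_dot(a, mask_a, b, mask_b))           # 2
```

In this example the zeroed sub-block of the second operand lets the whole first half of the multiplication be skipped, while the fine-grained mask on the first operand halves the work in the remaining sub-block.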

Type: Application

Filed: April 4, 2022

Publication date: November 30, 2023

Applicant: Microsoft Technology Licensing, LLC

Inventors: Venmugil ELANGO, Bita DARVISH ROUHANI, Eric S CHUNG, Douglas Christopher BURGER

ACCELERATING LINEAR ALGEBRA KERNELS FOR ANY PROCESSOR ARCHITECTURE

Publication number: 20230251861

Abstract: Systems and methods for obtaining a set of instructions for executing a computer program and generating executable code for the computer program based, at least in part, on scheduling operations associated with the executable code according to a polyhedral representation of a directed acyclic graph. The set of instructions may be represented as a domain-specific language. The executable code may be executable code for a specific processor architecture.

Type: Application

Filed: April 18, 2023

Publication date: August 10, 2023

Inventors: Venmugil Elango, Norman Rubin, Mahesh Ravishankar, Vinod Grover

SPARSIFYING NARROW DATA FORMATS FOR NEURAL NETWORKS

Publication number: 20220405571

Abstract: Embodiments of the present disclosure include systems and methods for sparsifying narrow data formats for neural networks. A plurality of activation values in a neural network are provided to a muxing unit. A set of sparsification operations are performed on a plurality of weight values to generate a subset of the plurality of weight values and mask values associated with the plurality of weight values. The subset of the plurality of weight values are provided to a matrix multiplication unit. The muxing unit generates a subset of the plurality of activation values based on the mask values and provides the subset of the plurality of activation values to the matrix multiplication unit. The matrix multiplication unit performs a set of matrix multiplication operations on the subset of the plurality of weight values and the subset of the plurality of activation values to generate a set of outputs.
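The data flow in this abstract (sparsify the weights, use the resulting mask to select matching activations, then multiply only the surviving pairs) can be sketched in a few lines. This is an illustration under assumptions: the abstract does not define the sparsification operations, so the sketch keeps the largest-magnitude weights, and `sparsify`, `mux`, and `matmul_unit` are hypothetical names standing in for the hardware units.

```python
def sparsify(weights, keep):
    """Return the kept weights and a 0/1 mask over the original positions."""
    # Assumption: keep the `keep` largest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = sorted(order[:keep])                   # preserve original ordering
    kept_set = set(kept)
    mask = [1 if i in kept_set else 0 for i in range(len(weights))]
    return [weights[i] for i in kept], mask

def mux(activations, mask):
    """Muxing step: pass through only activations whose mask bit is set."""
    return [a for a, m in zip(activations, mask) if m]

def matmul_unit(weight_subset, activation_subset):
    """Multiply-accumulate over the compacted operands."""
    return sum(w * a for w, a in zip(weight_subset, activation_subset))

weights = [0.1, -2.0, 0.3, 4.0]
activations = [10.0, 20.0, 30.0, 40.0]

w_subset, mask = sparsify(weights, keep=2)        # [-2.0, 4.0], [0, 1, 0, 1]
a_subset = mux(activations, mask)                 # [20.0, 40.0]
print(matmul_unit(w_subset, a_subset))            # 120.0
```

Because the mask travels with the compacted weights, the multiplication unit only ever sees dense operands of reduced length.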

Type: Application

Filed: June 16, 2021

Publication date: December 22, 2022

Inventors: Bita DARVISH ROUHANI, Venmugil Elango, Eric S. Chung, Douglas C Burger, Mattheus C. Heddes, Nishit Shah, Rasoul Shafipour, Ankit More

DATA-AWARE MODEL PRUNING FOR NEURAL NETWORKS

Publication number: 20220383123

Abstract: Embodiments of the present disclosure include systems and methods for performing data-aware model pruning for neural networks. During a training phase, a neural network is trained with a first set of data. During a validation phase, inference with the neural network is performed using a second set of data that causes the neural network to generate a first set of outputs at a layer in the neural network. During the validation phase, a plurality of mean values and a plurality of variance values are calculated based on the first set of outputs. A plurality of entropy values are calculated based on the plurality of mean values and the plurality of variance values. A second set of outputs are pruned based on the plurality of entropy values. The second set of outputs are generated by the layer of the neural network using a third set of data.
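The validation-phase statistics and the pruning step in this abstract can be sketched roughly as below. The abstract does not give the entropy formula or the pruning rule, so this sketch makes two loud assumptions: each output channel is treated as Gaussian, whose differential entropy is 0.5 * ln(2 * pi * e * variance) (the mean enters only through the variance computation), and the lowest-entropy channels are the ones pruned. All function names are hypothetical.

```python
import math

def channel_stats(outputs):
    """Per-channel mean and variance over a list of validation samples."""
    n = len(outputs)
    channels = list(zip(*outputs))                # transpose: one tuple per channel
    means = [sum(c) / n for c in channels]
    variances = [sum((x - m) ** 2 for x in c) / n
                 for c, m in zip(channels, means)]
    return means, variances

def gaussian_entropy(variance, eps=1e-12):
    # Assumption: Gaussian differential entropy; eps avoids log(0).
    return 0.5 * math.log(2 * math.pi * math.e * (variance + eps))

def prune_mask(validation_outputs, num_prune):
    """0/1 mask that zeros the `num_prune` lowest-entropy channels."""
    _, variances = channel_stats(validation_outputs)
    entropies = [gaussian_entropy(v) for v in variances]
    order = sorted(range(len(entropies)), key=lambda i: entropies[i])
    pruned = set(order[:num_prune])
    return [0 if i in pruned else 1 for i in range(len(entropies))]

# Layer outputs observed during validation (3 samples, 3 channels).
validation_outputs = [[1.0, 5.0, 0.0],
                      [1.0, -5.0, 2.0],
                      [1.0, 5.0, 4.0]]
mask = prune_mask(validation_outputs, num_prune=1)
print(mask)                                       # [0, 1, 1]

# Later outputs from the same layer are pruned with the stored mask.
new_outputs = [3.0, 3.0, 3.0]
print([x * m for x, m in zip(new_outputs, mask)]) # [0.0, 3.0, 3.0]
```

The first channel is constant across the validation samples, so under the Gaussian assumption it has the lowest entropy and is the one removed.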

Type: Application

Filed: May 28, 2021

Publication date: December 1, 2022

Inventors: Venmugil ELANGO, Bita DARVISH ROUHANI, Eric S. CHUNG, Douglas C. BURGER, Maximilian GOLUB

HIERARCHICAL AND SHARED EXPONENT FLOATING POINT DATA TYPES

Publication number: 20220253281

Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.

Type: Application

Filed: June 28, 2021

Publication date: August 11, 2022

Inventors: Bita DARVISH ROUHANI, Venmugil ELANGO, Rasoul SHAFIPOUR, Jeremy FOWERS, Ming Gang LIU, Jinwen XI, Douglas C. BURGER, Eric S. CHUNG

REDUCING TRAINING TIMES OF DEEP NEURAL NETWORKS THROUGH EFFICIENT HYBRID PARALLELISM

Publication number: 20210133591

Abstract: Presented are systems and methods to automatically find efficient parallelization strategies for deep neural networks (DNNs). A computation graph comprising an efficiently ordered sequence of vertices aids in computing the best parallelizing strategy in a relatively short time. Effectiveness of the parallelization strategies is evaluated on various DNNs, and the performance of the strategies proposed by various embodiments is compared against data parallelism, expert-designed strategies, and other state-of-the-art approaches. Experimental results demonstrate that the proposed strategies outperform a baseline data parallelism strategy and achieve better performance than expert-designed strategies and state-of-the-art approaches.

Type: Application

Filed: August 4, 2020

Publication date: May 6, 2021

Applicant: Baidu USA LLC

Inventor: Venmugil ELANGO

ACCELERATING LINEAR ALGEBRA KERNELS FOR ANY PROCESSOR ARCHITECTURE

Publication number: 20190278593

Abstract: Systems and methods for obtaining a set of instructions for executing a computer program and generating executable code for the computer program based, at least in part, on scheduling operations associated with the executable code according to a polyhedral representation of a directed acyclic graph. The set of instructions may be represented as a domain-specific language. The executable code may be executable code for a specific processor architecture.

Type: Application

Filed: February 15, 2019

Publication date: September 12, 2019

Inventors: Venmugil Elango, Norman Rubin, Mahesh Ravishankar, Vinod K. Grover