Patents by Inventor Saurabh Manish Raje

Saurabh Manish Raje has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Accelerating inference of transformer-based models

Patent number: 11763082

Abstract: Methods, systems, and computer program products for accelerating inference of transformer-based models are provided herein. A computer-implemented method includes obtaining a machine learning model comprising a plurality of transformer blocks, a task, and a natural language dataset; generating a compressed version of the machine learning model based on the task and the natural language dataset, wherein the generating comprises: obtaining at least one set of tokens, wherein each token in the set corresponds to one of the items in the natural language dataset, identifying and removing one or more redundant output activations of different ones of the plurality of transformer blocks for the at least one set of tokens, and adding one or more input activations corresponding to the one or more removed output activations into the machine learning model at subsequent ones of the plurality of the transformer blocks; and outputting the compressed version of the machine learning model to at least one user.

Type: Grant

Filed: July 12, 2021

Date of Patent: September 19, 2023

Assignee: International Business Machines Corporation

Inventors: Saurabh Goyal, Anamitra Roy Choudhury, Saurabh Manish Raje, Venkatesan T. Chakaravarthy, Yogish Sabharwal, Ashish Verma

MULTI-OBJECTIVE MACHINE LEARNING WITH MODEL AND HYPERPARAMETER OPTIMIZATION FUSION

Publication number: 20230069913

Abstract: Techniques for utilizing model and hyperparameter optimization for multi-objective machine learning are disclosed. In one example, a method comprises the following steps. One of a plurality of hyperparameter optimization operations and a plurality of model parameter optimization operations are performed to generate a first solution set. The other of the plurality of hyperparameter optimization operations and the plurality of model parameter optimization operations are performed to generate a second solution set. At least a portion of the first solution set and at least a portion of the second solution set are combined to generate a third solution set.

Type: Application

Filed: September 9, 2021

Publication date: March 9, 2023

Inventors: Aswin Kannan, Vaibhav Saxena, Anamitra Roy Choudhury, Yogish Sabharwal, Parikshit Ram, Ashish Verma, Saurabh Manish Raje

ACCELERATING INFERENCE OF TRANSFORMER-BASED MODELS

Publication number: 20230015895

Abstract: Methods, systems, and computer program products for accelerating inference of transformer-based models are provided herein. A computer-implemented method includes obtaining a machine learning model comprising a plurality of transformer blocks, a task, and a natural language dataset; generating a compressed version of the machine learning model based on the task and the natural language dataset, wherein the generating comprises: obtaining at least one set of tokens, wherein each token in the set corresponds to one of the items in the natural language dataset, identifying and removing one or more redundant output activations of different ones of the plurality of transformer blocks for the at least one set of tokens, and adding one or more input activations corresponding to the one or more removed output activations into the machine learning model at subsequent ones of the plurality of the transformer blocks; and outputting the compressed version of the machine learning model to at least one user.

Type: Application

Filed: July 12, 2021

Publication date: January 19, 2023

Inventors: Saurabh Goyal, Anamitra Roy Choudhury, Saurabh Manish Raje, Venkatesan T. Chakaravarthy, Yogish Sabharwal, Ashish Verma

ACCELERATING INFERENCE OF NEURAL NETWORK MODELS VIA DYNAMIC EARLY EXITS

Publication number: 20220358358

Abstract: Methods, systems, and computer program products for accelerating inference of neural network models via dynamic early exits are provided herein. A computer-implemented method includes determining a plurality of candidate exit points of a neural network model; obtaining a plurality of outputs of the neural network model for data samples in a target dataset, wherein the plurality of outputs comprises early outputs of the neural network model from the plurality of candidate exit points and regular outputs of the neural network model; and determining a set of one or more exit points from the plurality of candidate exit points that are dependent on the target dataset based at least in part on the plurality of outputs.

Type: Application

Filed: May 4, 2021

Publication date: November 10, 2022

Inventors: Saurabh Manish Raje, Saurabh Goyal, Anamitra Roy Choudhury, Yogish Sabharwal, Ashish Verma
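The abstract above does not say how exit points are chosen from the early and regular outputs. As a minimal sketch of one possible rule, the Python below keeps a candidate exit when its early predictions agree with the model's regular (final) predictions on a large enough share of the target dataset. The function name `select_exit_points`, the `min_agreement` threshold, and the block names are hypothetical and are not taken from the patent application.

```python
from typing import Dict, List, Sequence


def select_exit_points(
    early_preds: Dict[str, Sequence[int]],
    final_preds: Sequence[int],
    min_agreement: float = 0.9,
) -> List[str]:
    """Keep candidate exits whose early predictions match the final ones often enough.

    Assumption: agreement with the regular output is the selection criterion.
    The source abstract only says the choice depends on the collected outputs.
    """
    selected = []
    for exit_name, preds in early_preds.items():
        # Count samples where the early exit already gives the final answer.
        matches = sum(p == f for p, f in zip(preds, final_preds))
        if matches / len(final_preds) >= min_agreement:
            selected.append(exit_name)
    return selected


# Illustrative (made-up) predictions for five samples of a target dataset.
final_preds = [1, 0, 1, 1, 0]
early_preds = {
    "block_2": [1, 1, 0, 1, 0],  # agrees on 3 of 5 samples
    "block_4": [1, 0, 1, 1, 1],  # agrees on 4 of 5 samples
    "block_6": [1, 0, 1, 1, 0],  # agrees on all samples
}
chosen = select_exit_points(early_preds, final_preds, min_agreement=0.8)
print(chosen)
```

Because the agreement rates are measured on the target dataset, a different dataset can yield a different set of exits, which is the dataset dependence the abstract describes.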

MULTI-OBJECTIVE AUTOMATED MACHINE LEARNING

Publication number: 20220180146

Abstract: A system, computer program product, and method are presented for performing multi-objective automated machine learning, and, more specifically, to identifying a plurality of machine learning pipelines as Pareto-optimal solutions to optimize a plurality of objectives. The method includes receiving input data directed toward one or more subjects of interest and determining a plurality of objectives to be optimized. The method also includes ingesting at least a portion of the input data through one or more machine learning (ML) models. The method further includes aggregating the plurality of objectives into one or more aggregated single objectives. The method also includes determining a plurality of Pareto-optimal solutions, thereby defining a plurality of ML pipelines that optimize the one or more aggregated single objectives. The method further includes selecting one ML pipeline from the plurality of ML pipelines.

Type: Application

Filed: December 8, 2020

Publication date: June 9, 2022

Inventors: Vaibhav Saxena, Aswin Kannan, Saurabh Manish Raje, Parikshit Ram, Yogish Sabharwal, Ashish Verma
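The two ideas named in the abstract above, Pareto-optimal solutions and aggregating several objectives into a single one, can be illustrated with a short Python sketch. It assumes every objective is minimized and that aggregation is a weighted sum; both are assumptions made for illustration, and the pipeline scores are made-up numbers, not results from the application.

```python
from typing import List, Sequence, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: Sequence[Scores]) -> List[Scores]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


def weighted_sum_best(candidates: Sequence[Scores], weights: Sequence[float]) -> Scores:
    """Aggregate the objectives into one score and return the minimizer."""
    return min(candidates, key=lambda c: sum(w * v for w, v in zip(weights, c)))


# Hypothetical pipelines scored as (error rate, latency in ms); lower is better.
pipelines = [(0.10, 5.0), (0.20, 2.0), (0.15, 6.0), (0.30, 1.0), (0.25, 2.5)]

front = pareto_front(pipelines)
accuracy_first = weighted_sum_best(pipelines, weights=(10.0, 0.1))
latency_first = weighted_sum_best(pipelines, weights=(1.0, 1.0))
print(front, accuracy_first, latency_first)
```

With strictly positive weights, the minimizer of a weighted sum always lies on the Pareto front, so sweeping the weights is one common way to collect several Pareto-optimal pipelines before selecting one.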

INPUT ORDERING NEURAL NETWORK DECOMPOSITION

Publication number: 20220092423

Abstract: One or more computer processors decompose a weight matrix associated with a neural network utilizing a permutation dependent decomposition. The one or more computer processors regenerate a recovered matrix utilizing the decomposed weight matrix. The one or more computer processors reduce an error between the decomposed weight matrix and regenerated recovered matrix.

Type: Application

Filed: September 21, 2020

Publication date: March 24, 2022

Inventors: Venkatesan T. Chakaravarthy, Anamitra Roy Choudhury, Saurabh Goyal, Saurabh Manish Raje, Yogish Sabharwal, Ashish Verma
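The abstract above does not spell out the permutation-dependent decomposition, so the sketch below swaps in a plain truncated SVD to show the general decompose, regenerate, and measure-error loop. It assumes NumPy is available; the function names, the matrix shape, and the choice of the Frobenius norm against the original weights as the error are illustrative assumptions, not the method of the application.

```python
import numpy as np


def low_rank_decompose(weight: np.ndarray, rank: int):
    """Factor weight into left @ right using a truncated SVD (a stand-in technique)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    left = u[:, :rank] * s[:rank]   # scale the kept columns by their singular values
    right = vt[:rank, :]
    return left, right


def reconstruction_error(weight: np.ndarray, left: np.ndarray, right: np.ndarray) -> float:
    """Frobenius norm between the original weights and the regenerated matrix."""
    recovered = left @ right
    return float(np.linalg.norm(weight - recovered))


# A small random matrix standing in for a layer's weights.
rng = np.random.default_rng(0)
weight = rng.standard_normal((6, 4))

errors = []
for rank in range(1, 5):
    left, right = low_rank_decompose(weight, rank)
    errors.append(reconstruction_error(weight, left, right))
print(errors)
```

The error shrinks as the kept rank grows and is near zero at full rank; a compression method trades that error against the smaller parameter count of the two factors.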