Patents by Inventor Sharan NARANG

Sharan NARANG has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230316055
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the systems comprises an attention neural network configured to perform the machine learning task, the attention neural network comprising a plurality of attention layers, each attention layer comprising an attention sub-layer that is arranged in parallel with a feed-forward sub-layer. (A minimal illustrative sketch of this parallel arrangement appears after this entry.)
    Type: Application
    Filed: April 3, 2023
    Publication date: October 5, 2023
    Inventors: Aakanksha Chowdhery, Jacob Daniel Devlin, Sharan Narang, Jr.
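The parallel layout described in publication 20230316055 can be pictured with a small sketch: the attention sub-layer and the feed-forward sub-layer read the same (normalized) layer input, and their outputs are summed into the residual stream rather than being chained one after the other. The NumPy code below is a minimal, single-head sketch under assumed details; the pre-layer-norm placement, the two-layer ReLU feed-forward network, and the weight shapes are illustrative choices, not details taken from the patent.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token vector to zero mean and unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv, wo):
    # Single-head scaled dot-product self-attention (illustrative only).
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return scores @ v @ wo

def feed_forward(x, w1, w2):
    # Two-layer ReLU feed-forward sub-layer.
    return np.maximum(x @ w1, 0.0) @ w2

def parallel_block(x, params):
    # Both sub-layers consume the same normalized input; their outputs
    # are summed into the residual rather than applied sequentially.
    h = layer_norm(x)
    return x + self_attention(h, *params["attn"]) + feed_forward(h, *params["ffn"])

# Tiny usage example with random weights (hypothetical sizes).
d, seq = 16, 4
rng = np.random.default_rng(0)
params = {
    "attn": [0.1 * rng.normal(size=(d, d)) for _ in range(4)],
    "ffn": [0.1 * rng.normal(size=(d, 4 * d)), 0.1 * rng.normal(size=(4 * d, d))],
}
x = rng.normal(size=(seq, d))
print(parallel_block(x, params).shape)  # (4, 16)
```

A commonly cited practical benefit of such a parallel arrangement is that the attention and feed-forward projections can be computed concurrently or fused, although the abstract itself does not state this.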
  • Patent number: 11651223
    Abstract: Described herein are systems and methods to prune deep neural network models, reducing the overall memory and compute requirements of these models. It is demonstrated that, by using block pruning and group lasso regularization combined with pruning during training, block-sparse recurrent neural networks (RNNs) may be built that are as accurate as dense baseline models. Two different approaches are disclosed to induce block sparsity in neural network models: pruning blocks of weights in a layer and using group lasso regularization to create blocks of weights with zeros. Using these techniques, it is demonstrated that block-sparse RNNs with high sparsity can be created with small loss in accuracy. Block-sparse RNNs eliminate overheads related to data storage and irregular memory accesses while increasing hardware efficiency compared to unstructured sparsity. (A minimal illustrative sketch of block pruning and the group-lasso penalty appears after this entry.)
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: May 16, 2023
    Assignee: Baidu USA LLC
    Inventors: Sharan Narang, Eric Undersander, Gregory Diamos
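As a rough illustration of the two mechanisms named in patent 11651223, the sketch below (a) zeroes every weight block whose largest-magnitude entry falls below a threshold and (b) computes a group-lasso penalty, the sum of per-block L2 norms, that could be added to a training loss to push whole blocks toward zero. The 4x4 block shape, the threshold value, and the max-magnitude pruning criterion are illustrative assumptions, not the claimed procedure.

```python
import numpy as np

def block_prune(weights, block=(4, 4), threshold=0.1):
    # Zero out every block whose largest-magnitude weight is below the
    # threshold, yielding a block-sparse matrix.
    pruned = weights.copy()
    rows, cols = pruned.shape
    br, bc = block
    for i in range(0, rows, br):
        for j in range(0, cols, bc):
            blk = pruned[i:i + br, j:j + bc]
            if np.abs(blk).max() < threshold:
                blk[...] = 0.0
    return pruned

def group_lasso_penalty(weights, block=(4, 4)):
    # Sum of per-block L2 norms; adding this term to a training loss
    # encourages entire blocks to shrink toward zero before pruning.
    rows, cols = weights.shape
    br, bc = block
    return sum(
        np.linalg.norm(weights[i:i + br, j:j + bc])
        for i in range(0, rows, br)
        for j in range(0, cols, bc)
    )

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(16, 16))
pruned = block_prune(w, block=(4, 4), threshold=0.08)
print(f"block sparsity: {(pruned == 0).mean():.2f}")
print(f"group-lasso penalty: {group_lasso_penalty(w):.3f}")
```

Because entire blocks are removed rather than scattered individual weights, the resulting sparsity pattern maps more directly onto hardware-friendly block storage, which is the efficiency argument made in the abstract.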
  • Patent number: 11593655
    Abstract: As deep learning application domains grow, a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements is extremely beneficial. Presented herein is a large-scale empirical study of error and model-size growth as training sets grow. Embodiments of a methodology for this measurement are introduced herein, as well as embodiments for predicting other metrics, such as compute-related metrics. It is shown herein that a power law may be used to represent deep model relationships, such as that between error and training data size. It is also shown that model size scales sublinearly with data size. These scaling relationships have significant implications for deep learning research, practice, and systems. They can assist with model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling. (A minimal illustrative sketch of a power-law fit appears after this entry.)
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: February 28, 2023
    Assignee: Baidu USA LLC
    Inventors: Joel Hestness, Gregory Diamos, Hee Woo Jun, Sharan Narang, Newsha Ardalani, Md Mostofa Ali Patwary, Yanqi Zhou
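The power-law relationship between training-set size and error discussed in patent 11593655 can be illustrated by fitting error(m) ≈ a · m^b with a linear regression in log-log space. The measurements below are synthetic, and the fitting recipe is only a sketch of the general idea, not the patented methodology.

```python
import numpy as np

# Synthetic (training-set size, validation error) pairs, for illustration only.
data_sizes = np.array([1e5, 3e5, 1e6, 3e6, 1e7])
errors = np.array([0.42, 0.33, 0.25, 0.20, 0.155])

# Fit error(m) ~= a * m**b by linear regression in log-log space.
b, log_a = np.polyfit(np.log(data_sizes), np.log(errors), deg=1)
a = np.exp(log_a)
print(f"error(m) ~ {a:.2f} * m^({b:.3f})")

# Extrapolate the fitted power law to a larger data set, for example
# to set an accuracy target before collecting more data.
m_new = 1e8
print(f"predicted error at m = {m_new:.0e}: {a * m_new ** b:.3f}")
```

Extrapolating such a fit is one way to set accuracy targets or to decide whether growing the data set is worthwhile, in line with the uses mentioned in the abstract.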
  • Patent number: 10796686
    Abstract: Described herein are embodiments of a fully-convolutional attention-based neural text-to-speech (TTS) system, embodiments of which may generally be referred to as Deep Voice 3. Embodiments of Deep Voice 3 match state-of-the-art neural speech synthesis systems in naturalness while training ten times faster. Deep Voice 3 embodiments were scaled to data set sizes unprecedented for TTS, training on more than eight hundred hours of audio from over two thousand speakers. In addition, common error modes of attention-based speech synthesis networks were identified and mitigated, and several different waveform synthesis methods were compared. Also presented are embodiments that describe how to scale inference to ten million queries per day on one single-GPU server. (A minimal illustrative sketch of a convolutional encoder with attention appears after this entry.)
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: October 6, 2020
    Assignee: Baidu USA LLC
    Inventors: Sercan O. Arik, Wei Ping, Kainan Peng, Sharan Narang, Ajay Kannan, Andrew Gibiansky, Jonathan Raiman, John Miller
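As a loose, hypothetical sketch of the two ingredients named in patent 10796686, a fully-convolutional text encoder and attention from decoder frames onto encoder outputs, the NumPy code below builds a single 'same'-padded 1-D convolution layer and a dot-product attention step. The single-layer encoder, the shapes, and the random stand-in inputs are illustrative assumptions; the actual Deep Voice 3 system is substantially larger and includes components (multi-speaker support, waveform synthesis) not shown here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conv1d(x, kernel):
    # 'Same'-padded 1-D convolution over the time axis.
    # x: (time, channels_in), kernel: (width, channels_in, channels_out).
    width = kernel.shape[0]
    pad = width // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([
        np.tensordot(xp[t:t + width], kernel, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0])
    ])

def encode(char_embeddings, kernel):
    # Fully-convolutional text encoder: here a single conv layer with ReLU.
    return np.maximum(conv1d(char_embeddings, kernel), 0.0)

def attend(decoder_queries, encoder_keys, encoder_values):
    # Dot-product attention from each decoder frame onto the encoder outputs.
    scores = softmax(decoder_queries @ encoder_keys.T / np.sqrt(encoder_keys.shape[-1]))
    return scores @ encoder_values

rng = np.random.default_rng(0)
n_chars, n_frames, d = 12, 20, 8
text = rng.normal(size=(n_chars, d))                 # character embeddings
enc = encode(text, 0.1 * rng.normal(size=(3, d, d)))
queries = rng.normal(size=(n_frames, d))             # stand-in decoder states
context = attend(queries, enc, enc)
print(context.shape)  # (20, 8): one attention context per decoder frame
```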
  • Publication number: 20200175374
    Abstract: As deep learning application domains grow, a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements is extremely beneficial. Presented herein is a large-scale empirical study of error and model-size growth as training sets grow. Embodiments of a methodology for this measurement are introduced herein, as well as embodiments for predicting other metrics, such as compute-related metrics. It is shown herein that a power law may be used to represent deep model relationships, such as that between error and training data size. It is also shown that model size scales sublinearly with data size. These scaling relationships have significant implications for deep learning research, practice, and systems. They can assist with model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
    Type: Application
    Filed: November 30, 2018
    Publication date: June 4, 2020
    Applicant: Baidu USA LLC
    Inventors: Joel HESTNESS, Gregory DIAMOS, Hee Woo JUN, Sharan NARANG, Newsha ARDALANI, Md Mostofa Ali PATWARY, Yanqi ZHOU
  • Publication number: 20190130271
    Abstract: Described herein are systems and methods to prune deep neural network models, reducing the overall memory and compute requirements of these models. It is demonstrated that, by using block pruning and group lasso regularization combined with pruning during training, block-sparse recurrent neural networks (RNNs) may be built that are as accurate as dense baseline models. Two different approaches are disclosed to induce block sparsity in neural network models: pruning blocks of weights in a layer and using group lasso regularization to create blocks of weights with zeros. Using these techniques, it is demonstrated that block-sparse RNNs with high sparsity can be created with small loss in accuracy. Block-sparse RNNs eliminate overheads related to data storage and irregular memory accesses while increasing hardware efficiency compared to unstructured sparsity.
    Type: Application
    Filed: October 4, 2018
    Publication date: May 2, 2019
    Applicant: Baidu USA LLC
    Inventors: Sharan NARANG, Eric UNDERSANDER, Gregory DIAMOS
  • Publication number: 20190122651
    Abstract: Described herein are embodiments of a fully-convolutional attention-based neural text-to-speech (TTS) system, embodiments of which may generally be referred to as Deep Voice 3. Embodiments of Deep Voice 3 match state-of-the-art neural speech synthesis systems in naturalness while training ten times faster. Deep Voice 3 embodiments were scaled to data set sizes unprecedented for TTS, training on more than eight hundred hours of audio from over two thousand speakers. In addition, common error modes of attention-based speech synthesis networks were identified and mitigated, and several different waveform synthesis methods were compared. Also presented are embodiments that describe how to scale inference to ten million queries per day on one single-GPU server.
    Type: Application
    Filed: August 8, 2018
    Publication date: April 25, 2019
    Applicant: Baidu USA LLC
    Inventors: Sercan O. ARIK, Wei PING, Kainan PENG, Sharan NARANG, Ajay KANNAN, Andrew GIBIANSKY, Jonathan RAIMAN, John MILLER