Patents by Inventor Sharan NARANG

Sharan NARANG has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230316055
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the systems comprises an attention neural network configured to perform the machine learning task, the attention neural network comprising a plurality of attention layers, each attention layer comprising an attention sub-layer that is arranged in parallel with a feed-forward sub-layer. (A minimal illustrative sketch of this parallel arrangement appears after this entry.)
    Type: Application
    Filed: April 3, 2023
    Publication date: October 5, 2023
    Inventors: Aakanksha Chowdhery, Jacob Daniel Devlin, Sharan Narang, Jr.
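The parallel layout described in publication 20230316055 can be pictured with a small sketch: the attention sub-layer and the feed-forward sub-layer read the same (normalized) layer input, and their outputs are summed into the residual stream rather than being chained one after the other. The NumPy code below is a minimal, single-head sketch under assumed details; the pre-layer-norm placement, the two-layer ReLU feed-forward network, and the weight shapes are illustrative choices, not details taken from the patent.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token vector to zero mean and unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv, wo):
    # Single-head scaled dot-product self-attention (illustrative only).
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return scores @ v @ wo

def feed_forward(x, w1, w2):
    # Two-layer ReLU feed-forward sub-layer.
    return np.maximum(x @ w1, 0.0) @ w2

def parallel_block(x, params):
    # Both sub-layers consume the same normalized input; their outputs
    # are summed into the residual rather than applied sequentially.
    h = layer_norm(x)
    return x + self_attention(h, *params["attn"]) + feed_forward(h, *params["ffn"])

# Tiny usage example with random weights (hypothetical sizes).
d, seq = 16, 4
rng = np.random.default_rng(0)
params = {
    "attn": [0.1 * rng.normal(size=(d, d)) for _ in range(4)],
    "ffn": [0.1 * rng.normal(size=(d, 4 * d)), 0.1 * rng.normal(size=(4 * d, d))],
}
x = rng.normal(size=(seq, d))
print(parallel_block(x, params).shape)  # (4, 16)
```

A commonly cited practical benefit of such a parallel arrangement is that the attention and feed-forward projections can be computed concurrently or fused, although the abstract itself does not state this.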
  • Patent number: 11651223
    Abstract: Described herein are systems and methods to prune deep neural network models, reducing the overall memory and compute requirements of these models. It is demonstrated that, by using block pruning and group lasso regularization combined with pruning during training, block-sparse recurrent neural networks (RNNs) may be built that are as accurate as dense baseline models. Two different approaches are disclosed to induce block sparsity in neural network models: pruning blocks of weights in a layer and using group lasso regularization to create blocks of weights with zeros. Using these techniques, it is demonstrated that block-sparse RNNs with high sparsity can be created with small loss in accuracy. Block-sparse RNNs eliminate overheads related to data storage and irregular memory accesses while increasing hardware efficiency compared to unstructured sparsity. (A minimal illustrative sketch of block pruning and the group-lasso penalty appears after this entry.)
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: May 16, 2023
    Assignee: Baidu USA LLC
    Inventors: Sharan Narang, Eric Undersander, Gregory Diamos
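As a rough illustration of the two mechanisms named in patent 11651223, the sketch below (a) zeroes every weight block whose largest-magnitude entry falls below a threshold and (b) computes a group-lasso penalty, the sum of per-block L2 norms, that could be added to a training loss to push whole blocks toward zero. The 4x4 block shape, the threshold value, and the max-magnitude pruning criterion are illustrative assumptions, not the claimed procedure.

```python
import numpy as np

def block_prune(weights, block=(4, 4), threshold=0.1):
    # Zero out every block whose largest-magnitude weight is below the
    # threshold, yielding a block-sparse matrix.
    pruned = weights.copy()
    rows, cols = pruned.shape
    br, bc = block
    for i in range(0, rows, br):
        for j in range(0, cols, bc):
            blk = pruned[i:i + br, j:j + bc]
            if np.abs(blk).max() < threshold:
                blk[...] = 0.0
    return pruned

def group_lasso_penalty(weights, block=(4, 4)):
    # Sum of per-block L2 norms; adding this term to a training loss
    # encourages entire blocks to shrink toward zero before pruning.
    rows, cols = weights.shape
    br, bc = block
    return sum(
        np.linalg.norm(weights[i:i + br, j:j + bc])
        for i in range(0, rows, br)
        for j in range(0, cols, bc)
    )

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(16, 16))
pruned = block_prune(w, block=(4, 4), threshold=0.08)
print(f"block sparsity: {(pruned == 0).mean():.2f}")
print(f"group-lasso penalty: {group_lasso_penalty(w):.3f}")
```

Because entire blocks are removed rather than scattered individual weights, the resulting sparsity pattern maps more directly onto hardware-friendly block storage, which is the efficiency argument made in the abstract.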
  • Patent number: 11593655
    Abstract: As deep learning application domains grow, a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements is extremely beneficial. Presented herein is a large-scale empirical study of error and model-size growth as training sets grow. Embodiments of a methodology for this measurement are introduced herein, as well as embodiments for predicting other metrics, such as compute-related metrics. It is shown herein that a power law may be used to represent deep model relationships, such as that between error and training data size. It is also shown that model size scales sublinearly with data size. These scaling relationships have significant implications for deep learning research, practice, and systems. They can assist with model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling. (A minimal illustrative sketch of a power-law fit appears after this entry.)
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: February 28, 2023
    Assignee: Baidu USA LLC
    Inventors: Joel Hestness, Gregory Diamos, Hee Woo Jun, Sharan Narang, Newsha Ardalani, Md Mostofa Ali Patwary, Yanqi Zhou
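The power-law relationship between training-set size and error discussed in patent 11593655 can be illustrated by fitting error(m) ≈ a · m^b with a linear regression in log-log space. The measurements below are synthetic, and the fitting recipe is only a sketch of the general idea, not the patented methodology.

```python
import numpy as np

# Synthetic (training-set size, validation error) pairs, for illustration only.
data_sizes = np.array([1e5, 3e5, 1e6, 3e6, 1e7])
errors = np.array([0.42, 0.33, 0.25, 0.20, 0.155])

# Fit error(m) ~= a * m**b by linear regression in log-log space.
b, log_a = np.polyfit(np.log(data_sizes), np.log(errors), deg=1)
a = np.exp(log_a)
print(f"error(m) ~ {a:.2f} * m^({b:.3f})")

# Extrapolate the fitted power law to a larger data set, for example
# to set an accuracy target before collecting more data.
m_new = 1e8
print(f"predicted error at m = {m_new:.0e}: {a * m_new ** b:.3f}")
```

Extrapolating such a fit is one way to set accuracy targets or to decide whether growing the data set is worthwhile, in line with the uses mentioned in the abstract.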
  • Patent number: 10796686
    Abstract: Described herein are embodiments of a fully-convolutional attention-based neural text-to-speech (TTS) system, embodiments of which may generally be referred to as Deep Voice 3. Embodiments of Deep Voice 3 match state-of-the-art neural speech synthesis systems in naturalness while training ten times faster. Deep Voice 3 embodiments were scaled to data set sizes unprecedented for TTS, training on more than eight hundred hours of audio from over two thousand speakers. In addition, common error modes of attention-based speech synthesis networks were identified and mitigated, and several different waveform synthesis methods were compared. Also presented are embodiments that describe how to scale inference to ten million queries per day on one single-GPU server. (A minimal illustrative sketch of a convolutional encoder with attention appears after this entry.)
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: October 6, 2020
    Assignee: Baidu USA LLC
    Inventors: Sercan O. Arik, Wei Ping, Kainan Peng, Sharan Narang, Ajay Kannan, Andrew Gibiansky, Jonathan Raiman, John Miller
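As a loose, hypothetical sketch of the two ingredients named in patent 10796686, a fully-convolutional text encoder and attention from decoder frames onto encoder outputs, the NumPy code below builds a single 'same'-padded 1-D convolution layer and a dot-product attention step. The single-layer encoder, the shapes, and the random stand-in inputs are illustrative assumptions; the actual Deep Voice 3 system is substantially larger and includes components (multi-speaker support, waveform synthesis) not shown here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conv1d(x, kernel):
    # 'Same'-padded 1-D convolution over the time axis.
    # x: (time, channels_in), kernel: (width, channels_in, channels_out).
    width = kernel.shape[0]
    pad = width // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([
        np.tensordot(xp[t:t + width], kernel, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0])
    ])

def encode(char_embeddings, kernel):
    # Fully-convolutional text encoder: here a single conv layer with ReLU.
    return np.maximum(conv1d(char_embeddings, kernel), 0.0)

def attend(decoder_queries, encoder_keys, encoder_values):
    # Dot-product attention from each decoder frame onto the encoder outputs.
    scores = softmax(decoder_queries @ encoder_keys.T / np.sqrt(encoder_keys.shape[-1]))
    return scores @ encoder_values

rng = np.random.default_rng(0)
n_chars, n_frames, d = 12, 20, 8
text = rng.normal(size=(n_chars, d))                 # character embeddings
enc = encode(text, 0.1 * rng.normal(size=(3, d, d)))
queries = rng.normal(size=(n_frames, d))             # stand-in decoder states
context = attend(queries, enc, enc)
print(context.shape)  # (20, 8): one attention context per decoder frame
```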
  • Publication number: 20200175374
    Abstract: As deep learning application domains grow, a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements is extremely beneficial. Presented herein is a large-scale empirical study of error and model-size growth as training sets grow. Embodiments of a methodology for this measurement are introduced herein, as well as embodiments for predicting other metrics, such as compute-related metrics. It is shown herein that a power law may be used to represent deep model relationships, such as that between error and training data size. It is also shown that model size scales sublinearly with data size. These scaling relationships have significant implications for deep learning research, practice, and systems. They can assist with model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
    Type: Application
    Filed: November 30, 2018
    Publication date: June 4, 2020
    Applicant: Baidu USA LLC
    Inventors: Joel HESTNESS, Gregory DIAMOS, Hee Woo JUN, Sharan NARANG, Newsha ARDALANI, Md Mostofa Ali PATWARY, Yanqi ZHOU
  • Publication number: 20190130271
    Abstract: Described herein are systems and methods to prune deep neural network models, reducing the overall memory and compute requirements of these models. It is demonstrated that, by using block pruning and group lasso regularization combined with pruning during training, block-sparse recurrent neural networks (RNNs) may be built that are as accurate as dense baseline models. Two different approaches are disclosed to induce block sparsity in neural network models: pruning blocks of weights in a layer and using group lasso regularization to create blocks of weights with zeros. Using these techniques, it is demonstrated that block-sparse RNNs with high sparsity can be created with small loss in accuracy. Block-sparse RNNs eliminate overheads related to data storage and irregular memory accesses while increasing hardware efficiency compared to unstructured sparsity.
    Type: Application
    Filed: October 4, 2018
    Publication date: May 2, 2019
    Applicant: Baidu USA LLC
    Inventors: Sharan NARANG, Eric UNDERSANDER, Gregory DIAMOS
  • Publication number: 20190122651
    Abstract: Described herein are embodiments of a fully-convolutional attention-based neural text-to-speech (TTS) system, embodiments of which may generally be referred to as Deep Voice 3. Embodiments of Deep Voice 3 match state-of-the-art neural speech synthesis systems in naturalness while training ten times faster. Deep Voice 3 embodiments were scaled to data set sizes unprecedented for TTS, training on more than eight hundred hours of audio from over two thousand speakers. In addition, common error modes of attention-based speech synthesis networks were identified and mitigated, and several different waveform synthesis methods were compared. Also presented are embodiments that describe how to scale inference to ten million queries per day on one single-GPU server.
    Type: Application
    Filed: August 8, 2018
    Publication date: April 25, 2019
    Applicant: Baidu USA LLC
    Inventors: Sercan O. ARIK, Wei PING, Kainan PENG, Sharan NARANG, Ajay KANNAN, Andrew GIBIANSKY, Jonathan RAIMAN, John MILLER