Patents by Inventor Christopher Fougner

Christopher Fougner has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods for efficient neural network deployments

Patent number: 10769533

Abstract: Disclosed are systems and methods that implement efficient engines for computation-intensive tasks such as neural network deployment. Various embodiments of the invention provide for high-throughput batching that increases throughput of streaming data in high-traffic applications, such as real-time speech transcription. In embodiments, throughput is increased by dynamically assembling into batches and processing together user requests that randomly arrive at unknown timing such that not all the data is present at once at the time of batching. Some embodiments allow for performing streaming classification using pre-processing. The gains in performance allow for more efficient use of a compute engine and drastically reduce the cost of deploying large neural networks at scale, while meeting strict application requirements and adding relatively little computational latency so as to maintain a satisfactory application experience.

Type: Grant

Filed: July 13, 2016

Date of Patent: September 8, 2020

Assignee: Baidu USA LLC

Inventors: Christopher Fougner, Bryan Catanzaro
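
Illustrative note: the abstract above describes assembling requests that arrive at unpredictable times into batches. The following is a minimal sketch of that general idea, not the patented method. The function name `assemble_batches`, the `max_batch_size` and `max_wait` parameters, and the offline simulation over timestamped arrivals are all assumptions made for illustration.

```python
from typing import List, Tuple


def assemble_batches(
    arrivals: List[Tuple[float, str]],
    max_batch_size: int,
    max_wait: float,
) -> List[List[str]]:
    """Group (arrival_time, request_id) pairs into batches.

    Hypothetical policy: the open batch is dispatched when it is full, or
    when the next request arrives more than max_wait seconds after the
    first request in the open batch.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    batch_start = 0.0
    for arrival_time, request_id in sorted(arrivals):
        is_full = len(current) >= max_batch_size
        waited_too_long = arrival_time - batch_start > max_wait
        if current and (is_full or waited_too_long):
            batches.append(current)
            current = []
        if not current:
            # This request opens a new batch.
            batch_start = arrival_time
        current.append(request_id)
    if current:
        batches.append(current)
    return batches


arrivals = [(0.00, "a"), (0.01, "b"), (0.02, "c"), (0.50, "d"), (0.51, "e")]
batches = assemble_batches(arrivals, max_batch_size=2, max_wait=0.1)
print(batches)
```

In this sketch a larger `max_wait` tends to produce fuller batches at the cost of added latency for the first request in each batch, which is the kind of throughput versus latency tradeoff the abstract refers to.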

Systems and methods for principled bias reduction in production speech models

Patent number: 10657955

Abstract: Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward-only recurrences may be removed in a deployed model.

Type: Grant

Filed: January 30, 2018

Date of Patent: May 19, 2020

Assignee: Baidu USA LLC

Inventors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
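
Illustrative note: the abstract above mentions training with the CTC loss to predict character sequences. As a hedged aside, the standard CTC collapsing rule (merge consecutive repeated labels, then drop blanks) shows how per-frame outputs map to a character sequence. The function name `ctc_greedy_collapse` and the use of `_` as the blank symbol are assumptions for illustration; this is not taken from the patent.

```python
def ctc_greedy_collapse(frame_labels: str, blank: str = "_") -> str:
    """Collapse a per-frame label sequence using the standard CTC rule.

    Consecutive repeats are merged first, then blank symbols are removed.
    A blank between two identical labels keeps them as separate characters.
    """
    output = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return "".join(output)


decoded = ctc_greedy_collapse("hh_e_ll_ll_oo")
print(decoded)
```

The blank between the two runs of `l` is what allows a doubled letter to survive the merge step.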

Convolutional recurrent neural networks for small-footprint keyword spotting

Patent number: 10540961

Abstract: Described herein are systems and methods for creating and using Convolutional Recurrent Neural Networks (CRNNs) for small-footprint keyword spotting (KWS) systems. Inspired by the large-scale state-of-the-art speech recognition systems, in embodiments, the strengths of convolutional layers to utilize the structure in the data in time and frequency domains are combined with recurrent layers to utilize context for the entire processed frame. The effects of architecture parameters were examined to determine preferred model embodiments given the performance versus model size tradeoff. Various training strategies are provided to improve performance. In embodiments, using only ~230k parameters and yielding acceptably low latency, a CRNN model embodiment demonstrated high accuracy and robust performance in a wide range of environments.

Type: Grant

Filed: August 28, 2017

Date of Patent: January 21, 2020

Assignee: Baidu USA LLC

Inventors: Sercan Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Christopher Fougner, Ryan Prenger, Adam Coates

End-to-end speech recognition

Patent number: 10332509

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Grant

Filed: November 21, 2016

Date of Patent: June 25, 2019

Assignee: Baidu USA, LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei

Deployed end-to-end speech recognition

Patent number: 10319374

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Grant

Filed: November 21, 2016

Date of Patent: June 11, 2019

Assignee: Baidu USA, LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei

Convolutional recurrent neural networks for small-footprint keyword spotting

Publication number: 20180261213

Abstract: Described herein are systems and methods for creating and using Convolutional Recurrent Neural Networks (CRNNs) for small-footprint keyword spotting (KWS) systems. Inspired by the large-scale state-of-the-art speech recognition systems, in embodiments, the strengths of convolutional layers to utilize the structure in the data in time and frequency domains are combined with recurrent layers to utilize context for the entire processed frame. The effects of architecture parameters were examined to determine preferred model embodiments given the performance versus model size tradeoff. Various training strategies are provided to improve performance. In embodiments, using only ~230k parameters and yielding acceptably low latency, a CRNN model embodiment demonstrated high accuracy and robust performance in a wide range of environments.

Type: Application

Filed: August 28, 2017

Publication date: September 13, 2018

Applicant: Baidu USA LLC

Inventors: Sercan Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Christopher Fougner, Ryan Prenger, Adam Coates

Systems and methods for principled bias reduction in production speech models

Publication number: 20180247643

Abstract: Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward-only recurrences may be removed in a deployed model.

Type: Application

Filed: January 30, 2018

Publication date: August 30, 2018

Applicant: Baidu USA LLC

Inventors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu

Deployed end-to-end speech recognition

Publication number: 20170148433

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Application

Filed: November 21, 2016

Publication date: May 25, 2017

Applicant: Baidu USA LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei

End-to-end speech recognition

Publication number: 20170148431

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Application

Filed: November 21, 2016

Publication date: May 25, 2017

Applicant: Baidu USA LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei

Systems and methods for efficient neural network deployments

Publication number: 20170068889

Abstract: Disclosed are systems and methods that implement efficient engines for computation-intensive tasks such as neural network deployment. Various embodiments of the invention provide for high-throughput batching that increases throughput of streaming data in high-traffic applications, such as real-time speech transcription. In embodiments, throughput is increased by dynamically assembling into batches and processing together user requests that randomly arrive at unknown timing such that not all the data is present at once at the time of batching. Some embodiments allow for performing streaming classification using pre-processing. The gains in performance allow for more efficient use of a compute engine and drastically reduce the cost of deploying large neural networks at scale, while meeting strict application requirements and adding relatively little computational latency so as to maintain a satisfactory application experience.

Type: Application

Filed: July 13, 2016

Publication date: March 9, 2017

Applicant: Baidu USA LLC

Inventors: Christopher Fougner, Bryan Catanzaro