Patents by Inventor Shubhabrata Sengupta

Shubhabrata Sengupta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Deep learning models for speech recognition

Patent number: 11562733

Abstract: Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary, nor even the concept of a “phoneme,” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained.

Type: Grant

Filed: August 15, 2019

Date of Patent: January 24, 2023

Assignee: BAIDU USA LLC

Inventors: Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Gregory Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Adam Coates, Andrew Ng
Systems and methods for a multi-core optimized recurrent neural network

Patent number: 10832120

Abstract: Systems and methods for a multi-core optimized Recurrent Neural Network (RNN) architecture are disclosed. The various architectures affect communication and synchronization operations according to the Multi-Bulk-Synchronous-Parallel (MBSP) model for a given processor. The resulting family of network architectures, referred to as MBSP-RNNs, perform similarly to conventional RNNs having the same number of parameters, but are substantially more efficient when mapped onto a modern general-purpose processor. Due to the large gain in computational efficiency, for a fixed computational budget, MBSP-RNNs outperform RNNs at applications such as end-to-end speech recognition.

Type: Grant

Filed: April 5, 2016

Date of Patent: November 10, 2020

Assignee: Baidu USA LLC

Inventors: Gregory Diamos, Awni Hannun, Bryan Catanzaro, Dario Amodei, Erich Elsen, Jesse Engel, Shubhabrata Sengupta
Systems and methods for speech transcription

Patent number: 10540957

Abstract: Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary, nor even the concept of a “phoneme,” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained.

Type: Grant

Filed: June 9, 2015

Date of Patent: January 21, 2020

Assignee: BAIDU USA LLC

Inventors: Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Gregory Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Adam Coates, Andrew Y. Ng
DEEP LEARNING MODELS FOR SPEECH RECOGNITION

Publication number: 20190371298

Abstract: Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary, nor even the concept of a “phoneme,” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained.

Type: Application

Filed: August 15, 2019

Publication date: December 5, 2019

Applicant: BAIDU USA LLC

Inventors: Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Gregory Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Adam Coates, Andrew Ng
End-to-end speech recognition

Patent number: 10332509

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Grant

Filed: November 21, 2016

Date of Patent: June 25, 2019

Assignee: Baidu USA, LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
Deployed end-to-end speech recognition

Patent number: 10319374

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Grant

Filed: November 21, 2016

Date of Patent: June 11, 2019

Assignee: Baidu USA, LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
SYSTEMS AND METHODS FOR A MULTI-CORE OPTIMIZED RECURRENT NEURAL NETWORK

Publication number: 20170169326

Abstract: Systems and methods for a multi-core optimized Recurrent Neural Network (RNN) architecture are disclosed. The various architectures affect communication and synchronization operations according to the Multi-Bulk-Synchronous-Parallel (MBSP) model for a given processor. The resulting family of network architectures, referred to as MBSP-RNNs, perform similarly to conventional RNNs having the same number of parameters, but are substantially more efficient when mapped onto a modern general-purpose processor. Due to the large gain in computational efficiency, for a fixed computational budget, MBSP-RNNs outperform RNNs at applications such as end-to-end speech recognition.

Type: Application

Filed: April 5, 2016

Publication date: June 15, 2017

Applicant: Baidu USA LLC

Inventors: Gregory Diamos, Awni Hannun, Bryan Catanzaro, Dario Amodei, Erich Elsen, Jesse Engel, Shubhabrata Sengupta
DEPLOYED END-TO-END SPEECH RECOGNITION

Publication number: 20170148433

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Application

Filed: November 21, 2016

Publication date: May 25, 2017

Applicant: Baidu USA LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
END-TO-END SPEECH RECOGNITION

Publication number: 20170148431

Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Type: Application

Filed: November 21, 2016

Publication date: May 25, 2017

Applicant: Baidu USA LLC

Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
SYSTEMS AND METHODS FOR SPEECH TRANSCRIPTION

Publication number: 20160171974

Abstract: Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary, nor even the concept of a “phoneme,” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained.

Type: Application

Filed: June 9, 2015

Publication date: June 16, 2016

Applicant: BAIDU USA LLC

Inventors: Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Gregory Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Adam Coates, Andrew Y. Ng
System, method, and computer program product for grouping linearly ordered primitives

Patent number: 8773422

Abstract: A system, method, and computer program product are provided for grouping linearly ordered primitives. In operation, a plurality of primitives are linearly ordered. Additionally, the primitives are grouped. Furthermore, at least one intersection query is performed, utilizing the grouping.

Type: Grant

Filed: December 4, 2007

Date of Patent: July 8, 2014

Assignee: NVIDIA Corporation

Inventors: Michael J. Garland, Timo O. Aila, Shubhabrata Sengupta
System, method, and computer program product for converting a reduction algorithm to a segmented reduction algorithm

Patent number: 8321492

Abstract: A system, method, and computer program product are provided for converting a reduction algorithm to a segmented reduction algorithm. In operation, a reduction algorithm is identified. Additionally, the reduction algorithm is converted to a segmented reduction algorithm. Furthermore, the segmented reduction algorithm is performed to produce an output.

Type: Grant

Filed: December 11, 2008

Date of Patent: November 27, 2012

Assignee: NVIDIA Corporation

Inventors: Shubhabrata Sengupta, Michael J. Garland
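
For readers unfamiliar with the term used in the abstract above, the following is a minimal sequential sketch of what a segmented reduction computes. It is a generic illustration only, not the conversion method claimed in the patent; the function name `segmented_reduce` and the head-flag representation of segments are assumptions made for this example.

```python
from operator import add
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def segmented_reduce(
    values: Sequence[T],
    head_flags: Sequence[bool],
    op: Callable[[T, T], T],
) -> List[T]:
    """Reduce each segment of `values` independently with `op`.

    A True entry in `head_flags` marks the first element of a new segment
    (hypothetical representation chosen for this sketch).
    """
    if len(values) != len(head_flags):
        raise ValueError("values and head_flags must have the same length")

    results: List[T] = []
    for value, is_head in zip(values, head_flags):
        if is_head or not results:
            # Start a new segment with its first element.
            results.append(value)
        else:
            # Fold the element into the running result of the current segment.
            results[-1] = op(results[-1], value)
    return results


# Two segments: [1, 2] and [3, 4, 5].
sums = segmented_reduce([1, 2, 3, 4, 5], [True, False, True, False, False], add)
```

An ordinary reduction would collapse the whole input to a single value; the segmented form yields one value per segment, so many small reductions can be expressed as one pass over a flat array.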
System, method, and computer program product for converting a scan algorithm to a segmented scan algorithm in an operator-independent manner

Patent number: 8243083

Abstract: A system, method, and computer program product are provided for converting a scan algorithm to a segmented scan algorithm in an operator-independent manner. In operation, a scan algorithm and a limit index data structure are identified. Utilizing the limit index data structure, the scan algorithm is converted to a segmented scan algorithm in an operator-independent manner. Additionally, the segmented scan algorithm is performed to produce an output.

Type: Grant

Filed: December 11, 2008

Date of Patent: August 14, 2012

Assignee: NVIDIA Corporation

Inventors: Michael J. Garland, Shubhabrata Sengupta
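
As a generic illustration of the segmented scan named in the abstract above, the sketch below computes an inclusive segmented scan sequentially. It does not use the limit index data structure described in the patent and should not be read as the patented method; the name `segmented_scan` and the head-flag encoding of segment boundaries are assumptions for this example. The binary operator is passed in as a parameter, so the same code works for any associative operator.

```python
from operator import add, mul
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def segmented_scan(
    values: Sequence[T],
    head_flags: Sequence[bool],
    op: Callable[[T, T], T],
) -> List[T]:
    """Inclusive scan that restarts at every segment head.

    A True entry in `head_flags` marks the first element of a new segment
    (hypothetical representation chosen for this sketch).
    """
    if len(values) != len(head_flags):
        raise ValueError("values and head_flags must have the same length")

    out: List[T] = []
    for value, is_head in zip(values, head_flags):
        if is_head or not out:
            # The running value restarts at the head of each segment.
            out.append(value)
        else:
            # Otherwise combine with the previous output of the same segment.
            out.append(op(out[-1], value))
    return out


flags = [True, False, True, False, False]
prefix_sums = segmented_scan([1, 2, 3, 4, 5], flags, add)
prefix_products = segmented_scan([1, 2, 3, 4, 5], flags, mul)
```

Because nothing in the function depends on which operator is supplied, swapping `add` for `mul` or `max` changes only the call site.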
Call center application data and interoperation architecture for a telecommunication service center

Patent number: 8068599

Abstract: A call center application data and interoperation architecture provides a centralized design for managing applications providing call center functionality. The architecture integrates information flow using a master data repository for all applications for all aspects of a call center operation. The architecture provides employee information at defined levels through the complete employment life cycle, including the initial hiring and termination. The architecture provides the employee information by integrating human resources information with call center applications such as Employee attendance and Leave management, ID management, Transport management, Commitment logs, and Movement management, or any other application.

Type: Grant

Filed: March 21, 2008

Date of Patent: November 29, 2011

Assignee: Accenture Global Services Limited

Inventors: Amit Sarin, Shubhabrata Sengupta, Sunandita Ganguly, Amit Kumar Tewari
Call center application data and interoperation architecture for a telecommunication service center

Publication number: 20090175436

Abstract: A call center application data and interoperation architecture provides a centralized design for managing applications providing call center functionality. The architecture integrates information flow using a master data repository for all applications for all aspects of a call center operation. The architecture provides employee information at defined levels through the complete employment life cycle, including the initial hiring and termination. The architecture provides the employee information by integrating human resources information with call center applications such as Employee attendance and Leave management, ID management, Transport management, Commitment logs, and Movement management, or any other application.

Type: Application

Filed: March 21, 2008

Publication date: July 9, 2009

Applicant: Accenture Global Services GmbH

Inventors: Amit Sarin, Shubhabrata Sengupta, Sunandita Ganguly, Amit Kumar Tewari