Patents by Inventor Joel Hestness
Joel Hestness has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11593655
Abstract: As deep learning application domains grow, a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements is extremely beneficial. Presented herein is a large-scale empirical study of error and model size growth as training sets grow. Embodiments of a methodology for this measurement are introduced herein, as well as embodiments for predicting other metrics, such as compute-related metrics. It is shown herein that power laws may be used to represent deep model relationships, such as that between error and training data size. It is also shown that model size scales sublinearly with data size. These scaling relationships have significant implications for deep learning research, practice, and systems. They can assist with model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
Type: Grant
Filed: November 30, 2018
Date of Patent: February 28, 2023
Assignee: Baidu USA LLC
Inventors: Joel Hestness, Gregory Diamos, Hee Woo Jun, Sharan Narang, Newsha Ardalani, Md Mostofa Ali Patwary, Yanqi Zhou
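The abstract's central claim is that error follows a power law in training set size. Below is a minimal sketch of that idea, assuming an error model of the form eps(m) = alpha * m^beta + gamma; the measurements, initial guesses, and function names are hypothetical illustrations, not values from the patent.

```python
# Fit a power-law learning curve: validation error vs. training set size.
# All numbers below are hypothetical placeholders for illustration.
import numpy as np
from scipy.optimize import curve_fit

def power_law(m, alpha, beta, gamma):
    """eps(m) = alpha * m**beta + gamma, where m is the number of training
    samples, beta < 0 is the power-law exponent, and gamma is an
    irreducible error floor."""
    return alpha * np.power(m, beta) + gamma

# Hypothetical validation errors measured at increasing data sizes.
train_sizes = np.array([1e4, 3e4, 1e5, 3e5, 1e6])
val_errors = np.array([0.42, 0.33, 0.25, 0.19, 0.15])

params, _ = curve_fit(power_law, train_sizes, val_errors,
                      p0=(10.0, -0.3, 0.05), maxfev=10000)
alpha, beta, gamma = params
print(f"fitted exponent beta = {beta:.3f}")

# Extrapolate the curve to judge whether growing the training set to
# 10M samples is worth the collection and compute cost.
print(f"predicted error at 1e7 samples: {power_law(1e7, *params):.3f}")
```

A fit like this supports exactly the decisions the abstract lists: setting accuracy targets and deciding whether further data set growth will pay off.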
-
Patent number: 11373042
Abstract: Described herein are systems and methods for generating word embeddings that avoid the need to throw out rare words appearing fewer than a certain number of times in a corpus. Embodiments of the present disclosure involve grouping words into clusters/classes multiple times, using different assignments of the vocabulary words to a number of classes. Multiple copies of the training corpus are then generated, using the assignments to replace each word with the appropriate class. A word embedding model is run on the multiple class corpora to generate multiple class embeddings. An estimate of the gold word embedding matrix is then reconstructed from the multiple assignments, class embeddings, and covariances. Test results show the effectiveness of embodiments of the present disclosure.
Type: Grant
Filed: December 3, 2019
Date of Patent: June 28, 2022
Assignee: Baidu USA LLC
Inventors: Kenneth Church, Hee Woo Jun, Joel Hestness
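Here is a minimal sketch of the reconstruction step, under my own assumptions: the trained class embeddings are simulated by averaging "gold" word vectors within each class, and recovery uses plain least squares rather than the covariance-weighted estimate the abstract mentions. All sizes and names (`V`, `C`, `blocks_P`, etc.) are illustrative.

```python
# Recover per-word embeddings from several random word->class assignments.
import numpy as np

rng = np.random.default_rng(0)
V, C, D, R = 500, 64, 16, 16   # vocab size, classes, embedding dim, repeats

# "Gold" embeddings, used here only to simulate the class embeddings a
# real training run on each class-replaced corpus would produce.
gold = rng.normal(size=(V, D))

blocks_P, blocks_Z = [], []
for _ in range(R):
    classes = rng.integers(0, C, size=V)    # one random word -> class map
    P = np.zeros((C, V))
    P[classes, np.arange(V)] = 1.0
    counts = P.sum(axis=1, keepdims=True)
    P /= np.maximum(counts, 1.0)            # class embedding = mean of its words
    blocks_P.append(P)
    blocks_Z.append(P @ gold)               # stand-in for a trained class embedding

# Stack all assignments and solve min_W sum_i ||P_i W - Z_i||^2.
P_all = np.vstack(blocks_P)
Z_all = np.vstack(blocks_Z)
W_hat, *_ = np.linalg.lstsq(P_all, Z_all, rcond=None)

err = np.linalg.norm(W_hat - gold) / np.linalg.norm(gold)
print(f"relative reconstruction error: {err:.3e}")
```

With enough independent assignments (here R * C >= V), the stacked system becomes overdetermined and every word, however rare, receives its own embedding even though each training run only ever saw class tokens.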
-
Publication number: 20200193093
Abstract: Described herein are systems and methods for generating word embeddings that avoid the need to throw out rare words appearing fewer than a certain number of times in a corpus. Embodiments of the present disclosure involve grouping words into clusters/classes multiple times, using different assignments of the vocabulary words to a number of classes. Multiple copies of the training corpus are then generated, using the assignments to replace each word with the appropriate class. A word embedding model is run on the multiple class corpora to generate multiple class embeddings. An estimate of the gold word embedding matrix is then reconstructed from the multiple assignments, class embeddings, and covariances. Test results show the effectiveness of embodiments of the present disclosure.
Type: Application
Filed: December 3, 2019
Publication date: June 18, 2020
Applicant: Baidu USA LLC
Inventors: Kenneth Church, Hee Woo Jun, Joel Hestness
-
Publication number: 20200175374
Abstract: As deep learning application domains grow, a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements is extremely beneficial. Presented herein is a large-scale empirical study of error and model size growth as training sets grow. Embodiments of a methodology for this measurement are introduced herein, as well as embodiments for predicting other metrics, such as compute-related metrics. It is shown herein that power laws may be used to represent deep model relationships, such as that between error and training data size. It is also shown that model size scales sublinearly with data size. These scaling relationships have significant implications for deep learning research, practice, and systems. They can assist with model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
Type: Application
Filed: November 30, 2018
Publication date: June 4, 2020
Applicant: Baidu USA LLC
Inventors: Joel Hestness, Gregory Diamos, Hee Woo Jun, Sharan Narang, Newsha Ardalani, Md Mostofa Ali Patwary, Yanqi Zhou
-
Patent number: 10540961
Abstract: Described herein are systems and methods for creating and using Convolutional Recurrent Neural Networks (CRNNs) for small-footprint keyword spotting (KWS) systems. Inspired by large-scale, state-of-the-art speech recognition systems, in embodiments, the strengths of convolutional layers in exploiting the structure of the data in the time and frequency domains are combined with recurrent layers that utilize context across the entire processed frame. The effects of architecture parameters were examined to determine preferred model embodiments, given the performance-versus-model-size tradeoff. Various training strategies are provided to improve performance. In embodiments, using only ~230K parameters and yielding acceptably low latency, a CRNN model embodiment demonstrated high accuracy and robust performance in a wide range of environments.
Type: Grant
Filed: August 28, 2017
Date of Patent: January 21, 2020
Assignee: Baidu USA LLC
Inventors: Sercan Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Christopher Fougner, Ryan Prenger, Adam Coates
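Below is a minimal PyTorch sketch of the architecture the abstract describes: a convolution over the time-frequency input feeding a recurrent layer, followed by a classifier. The kernel sizes, strides, and hidden sizes are my own guesses, chosen only to land near the ~230K-parameter figure; they are not the patented configuration.

```python
# A small-footprint CRNN for keyword spotting (illustrative sizes only).
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, n_mels: int = 40, n_keywords: int = 2):
        super().__init__()
        # Convolution exploits local time-frequency structure in the spectrogram.
        self.conv = nn.Conv2d(1, 32, kernel_size=(20, 5), stride=(8, 2))
        conv_freq = (n_mels - 20) // 8 + 1          # frequency bins after conv
        # Recurrent layers carry context across the entire processed frame.
        self.gru = nn.GRU(32 * conv_freq, 144, num_layers=2, batch_first=True)
        self.fc = nn.Linear(144, n_keywords)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_mels, time)
        h = torch.relu(self.conv(x))                # (B, 32, F', T')
        h = h.permute(0, 3, 1, 2).flatten(2)        # (B, T', 32 * F')
        out, _ = self.gru(h)
        return self.fc(out[:, -1])                  # classify from the last frame

model = CRNN()
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params / 1e3:.0f}k")         # roughly 233k with these sizes
dummy = torch.randn(1, 1, 40, 101)                  # ~1 s of mel features
print(model(dummy).shape)                           # torch.Size([1, 2])
```

The strided convolution also shrinks the sequence the GRU must process, which is one way such a model can keep latency low enough for always-on keyword spotting.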
-
Publication number: 20180261213
Abstract: Described herein are systems and methods for creating and using Convolutional Recurrent Neural Networks (CRNNs) for small-footprint keyword spotting (KWS) systems. Inspired by large-scale, state-of-the-art speech recognition systems, in embodiments, the strengths of convolutional layers in exploiting the structure of the data in the time and frequency domains are combined with recurrent layers that utilize context across the entire processed frame. The effects of architecture parameters were examined to determine preferred model embodiments, given the performance-versus-model-size tradeoff. Various training strategies are provided to improve performance. In embodiments, using only ~230K parameters and yielding acceptably low latency, a CRNN model embodiment demonstrated high accuracy and robust performance in a wide range of environments.
Type: Application
Filed: August 28, 2017
Publication date: September 13, 2018
Applicant: Baidu USA LLC
Inventors: Sercan Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Christopher Fougner, Ryan Prenger, Adam Coates