Patents by Inventor Jesse Engel
Jesse Engel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240079001
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.
Type: Application
Filed: September 7, 2023
Publication date: March 7, 2024
Inventors: Andrea Agostinelli, Timo Immanuel Denk, Antoine Caillon, Neil Zeghidour, Jesse Engel, Mauro Verzetti, Christian Frank, Zalán Borsos, Matthew Sharifi, Adam Joseph Roberts
-
Patent number: 11915689
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.
Type: Grant
Filed: September 7, 2023
Date of Patent: February 27, 2024
Assignee: Google LLC
Inventors: Andrea Agostinelli, Timo Immanuel Denk, Antoine Caillon, Neil Zeghidour, Jesse Engel, Mauro Verzetti, Christian Frank, Zalán Borsos, Matthew Sharifi, Adam Joseph Roberts, Marco Tagliasacchi
-
Publication number: 20230343348
Abstract: Systems and methods of the present disclosure are directed toward digital signal processing using machine-learned differentiable digital signal processors. For example, embodiments of the present disclosure may include differentiable digital signal processors within the training loop of a machine-learned model (e.g., for gradient-based training). Advantageously, systems and methods of the present disclosure provide high quality signal processing using smaller models than prior systems, thereby reducing energy costs (e.g., storage and/or processing costs) associated with performing digital signal processing.
Type: Application
Filed: June 29, 2023
Publication date: October 26, 2023
Inventors: Jesse Engel, Adam Roberts, Chenjie Gu, Lamtharn Hantrakul
-
Patent number: 11735197
Abstract: Systems and methods of the present disclosure are directed toward digital signal processing using machine-learned differentiable digital signal processors. For example, embodiments of the present disclosure may include differentiable digital signal processors within the training loop of a machine-learned model (e.g., for gradient-based training). Advantageously, systems and methods of the present disclosure provide high quality signal processing using smaller models than prior systems, thereby reducing energy costs (e.g., storage and/or processing costs) associated with performing digital signal processing.
Type: Grant
Filed: July 7, 2020
Date of Patent: August 22, 2023
Assignee: GOOGLE LLC
Inventors: Jesse Engel, Adam Roberts, Chenjie Gu, Lamtharn Hantrakul
-
Publication number: 20230244907
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a sequence of data elements that includes a respective data element at each position in a sequence of positions. In one aspect, a method includes: for each position after a first position in the sequence of positions: obtaining a current sequence of data element embeddings that includes a respective data element embedding of each data element at a position that precedes the current position, obtaining a sequence of latent embeddings, and processing: (i) the current sequence of data element embeddings, and (ii) the sequence of latent embeddings, using a neural network to generate the data element at the current position. The neural network includes a sequence of neural network blocks including: (i) a cross-attention block, (ii) one or more self-attention blocks, and (iii) an output block.
Type: Application
Filed: January 30, 2023
Publication date: August 3, 2023
Inventors: Curtis Glenn-Macway Hawthorne, Andrew Coulter Jaegle, Catalina-Codruta Cangea, Sebastian Borgeaud Dit Avocat, Charlie Thomas Curtis Nash, Mateusz Malinowski, Sander Etienne Lea Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Stuart Simon, Hannah Rachel Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, Joao Carreira, Jesse Engel
-
Publication number: 20220013132
Abstract: Systems and methods of the present disclosure are directed toward digital signal processing using machine-learned differentiable digital signal processors. For example, embodiments of the present disclosure may include differentiable digital signal processors within the training loop of a machine-learned model (e.g., for gradient-based training). Advantageously, systems and methods of the present disclosure provide high quality signal processing using smaller models than prior systems, thereby reducing energy costs (e.g., storage and/or processing costs) associated with performing digital signal processing.
Type: Application
Filed: July 7, 2020
Publication date: January 13, 2022
Inventors: Jesse Engel, Adam Roberts, Chenjie Gu, Lamtharn Hantrakul
-
Patent number: 10832120
Abstract: Systems and methods for a multi-core optimized Recurrent Neural Network (RNN) architecture are disclosed. The various architectures affect communication and synchronization operations according to the Multi-Bulk-Synchronous-Parallel (MBSP) model for a given processor. The resulting family of network architectures, referred to as MBSP-RNNs, perform similarly to a conventional RNNs having the same number of parameters, but are substantially more efficient when mapped onto a modern general purpose processor. Due to the large gain in computational efficiency, for a fixed computational budget, MBSP-RNNs outperform RNNs at applications such as end-to-end speech recognition.
Type: Grant
Filed: April 5, 2016
Date of Patent: November 10, 2020
Assignee: Baidu USA LLC
Inventors: Gregory Diamos, Awni Hannun, Bryan Catanzaro, Dario Amodei, Erich Elsen, Jesse Engel, Shubhabrata Sengupta
-
Patent number: 10332509
Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
Type: Grant
Filed: November 21, 2016
Date of Patent: June 25, 2019
Assignee: Baidu USA, LLC
Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
-
Patent number: 10319374
Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
Type: Grant
Filed: November 21, 2016
Date of Patent: June 11, 2019
Assignee: Baidu USA, LLC
Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
-
Patent number: 10068557
Abstract: The present disclosure provides systems and methods that include or otherwise leverage a machine-learned neural synthesizer model. Unlike a traditional synthesizer which generates audio from hand-designed components like oscillators and wavetables, the neural synthesizer model can use deep neural networks to generate sounds at the level of individual samples. Learning directly from data, the neural synthesizer model can provide intuitive control over timbre and dynamics and enable exploration of new sounds that would be difficult or impossible to produce with a hand-tuned synthesizer. As one example, the neural synthesizer model can be a neural synthesis autoencoder that includes an encoder model that learns embeddings descriptive of musical characteristics and an autoregressive decoder model that is conditioned on the embedding to autoregressively generate musical waveforms that have the musical characteristics one audio sample at a time.
Type: Grant
Filed: August 23, 2017
Date of Patent: September 4, 2018
Assignee: Google LLC
Inventors: Jesse Engel, Mohammad Norouzi, Karen Simonyan, Adam Roberts, Cinjon Resnick, Sander Etienne Lea Dieleman, Douglas Eck
-
Publication number: 20170169326
Abstract: Systems and methods for a multi-core optimized Recurrent Neural Network (RNN) architecture are disclosed. The various architectures affect communication and synchronization operations according to the Multi-Bulk-Synchronous-Parallel (MBSP) model for a given processor. The resulting family of network architectures, referred to as MBSP-RNNs, perform similarly to a conventional RNNs having the same number of parameters, but are substantially more efficient when mapped onto a modern general purpose processor. Due to the large gain in computational efficiency, for a fixed computational budget, MBSP-RNNs outperform RNNs at applications such as end-to-end speech recognition.
Type: Application
Filed: April 5, 2016
Publication date: June 15, 2017
Applicant: Baidu USA LLC
Inventors: Gregory Diamos, Awni Hannun, Bryan Catanzaro, Dario Amodei, Erich Elsen, Jesse Engel, Shubhabrata Sengupta
-
Publication number: 20170148433
Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
Type: Application
Filed: November 21, 2016
Publication date: May 25, 2017
Applicant: Baidu USA LLC
Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei
-
Publication number: 20170148431
Abstract: Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
Type: Application
Filed: November 21, 2016
Publication date: May 25, 2017
Applicant: Baidu USA LLC
Inventors: Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Erich Elsen, Jesse Engel, Christopher Fougner, Xu Han, Awni Hannun, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Dani Yogatama, Chong Wang, Jun Zhan, Zhenyao Zhu, Dario Amodei