Patents by Inventor Samuel Bengio

Samuel Bengio has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DEVICE PLACEMENT OPTIMIZATION WITH REINFORCEMENT LEARNING

Publication number: 20200279163

Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described.

Type: Application

Filed: May 20, 2020

Publication date: September 3, 2020

Inventors: Samuel Bengio, Mohammad Norouzi, Benoit Steiner, Jeffrey Adgate Dean, Hieu Hy Pham, Azalia Mirhoseini, Quoc V. Le, Naveen Kumar, Yuefeng Zhou, Rasmus Munk Larsen

Generating Target Sequences From Input Sequences Using Partial Conditioning

Publication number: 20200251099

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

Type: Application

Filed: February 4, 2020

Publication date: August 6, 2020

Inventors: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Samuel Bengio, Ilya Sutskever

Linear transformation for speech recognition modeling

Patent number: 10714078

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

Type: Grant

Filed: October 26, 2018

Date of Patent: July 14, 2020

Assignee: Google LLC

Inventors: Samuel Bengio, Mirkó Visontai, Christopher Walter George Thornton, Michiel A. U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

Device placement optimization with reinforcement learning

Patent number: 10692003

Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described.

Type: Grant

Filed: June 19, 2019

Date of Patent: June 23, 2020

Assignee: Google LLC

Inventors: Samuel Bengio, Mohammad Norouzi, Benoit Steiner, Jeffrey Adgate Dean, Hieu Hy Pham, Azalia Mirhoseini, Quoc V. Le, Naveen Kumar, Yuefeng Zhou, Rasmus Munk Larsen

Neural Networks For Speaker Verification

Publication number: 20200160869

Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.

Type: Application

Filed: January 24, 2020

Publication date: May 21, 2020

Applicant: Google LLC

Inventors: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno

Neural Networks with Area Attention

Publication number: 20200104681

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing an area attention layer in a neural network system. The area attention layer implements a way for a neural network model to attend to areas in the memory, where each area contains a group of items that are structurally adjacent.

Type: Application

Filed: September 27, 2019

Publication date: April 2, 2020

Inventors: Yang Li, Lukasz Mieczyslaw Kaiser, Samuel Bengio, Si Si

END-TO-END TEXT-TO-SPEECH CONVERSION

Publication number: 20200098350

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

Type: Application

Filed: November 26, 2019

Publication date: March 26, 2020

Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

Neural networks for speaker verification

Patent number: 10586542

Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.

Type: Grant

Filed: April 30, 2018

Date of Patent: March 10, 2020

Assignee: Google LLC

Inventors: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno

End-to-end text-to-speech conversion

Patent number: 10573293

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

Type: Grant

Filed: June 20, 2019

Date of Patent: February 25, 2020

Assignee: Google LLC

Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

Generating target sequences from input sequences using partial conditioning

Patent number: 10559300

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

Type: Grant

Filed: August 6, 2018

Date of Patent: February 11, 2020

Assignee: Google LLC

Inventors: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Samuel Bengio, Ilya Sutskever

Generating Natural Language Descriptions of Images

Publication number: 20200042866

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating descriptions of input images. One of the methods includes obtaining an input image; processing the input image using a first neural network to generate an alternative representation for the input image; and processing the alternative representation for the input image using a second neural network to generate a sequence of a plurality of words in a target natural language that describes the input image.

Type: Application

Filed: August 12, 2019

Publication date: February 6, 2020

Inventors: Samuel Bengio, Oriol Vinyals, Alexander Toshkov Toshev, Dumitru Erhan

LABEL CONSISTENCY FOR IMAGE ANALYSIS

Publication number: 20200012905

Abstract: Systems and techniques are disclosed for labeling objects within an image. The objects may be labeled by selecting an option from a plurality of options such that each option is a potential label for the object. An option may have an option score associated with it. Additionally, a relation score may be calculated for a first option and a second option corresponding to a second object in an image. The relation score may be based on a frequency, probability, or observance corresponding to the co-occurrence of text associated with the first option and the second option in a text corpus such as the World Wide Web. An option may be selected as a label for an object based on a global score calculated based at least on an option score and relation score associated with the option.

Type: Application

Filed: September 19, 2019

Publication date: January 9, 2020

Inventors: Samuel Bengio, Jeffrey Adgate Dean, Quoc V. Le, Jonathon Shlens, Yoram Singer

REWARD AUGMENTED MODEL TRAINING

Publication number: 20190188566

Abstract: A method includes obtaining data identifying a machine learning model to be trained to perform a machine learning task, the machine learning model being configured to receive an input example and to process the input example in accordance with current values of a plurality of model parameters to generate a model output for the input example; obtaining initial training data for training the machine learning model, the initial training data comprising a plurality of training examples and, for each training example, a ground truth output that should be generated by the machine learning model by processing the training example; generating modified training data from the initial training data; and training the machine learning model on the modified training data.

Type: Application

Filed: August 25, 2017

Publication date: June 20, 2019

Inventors: Michael Schuster, Samuel Bengio, Navdeep Jaitly, Zhifeng Chen, Dale Eric Schuurmans, Mohammad Norouzi, Yonghui Wu

COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING

Publication number: 20190115013

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

Type: Application

Filed: October 26, 2018

Publication date: April 18, 2019

Inventors: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A. U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

GENERATING TARGET SEQUENCES FROM INPUT SEQUENCES USING PARTIAL CONDITIONING

Publication number: 20180342238

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

Type: Application

Filed: August 6, 2018

Publication date: November 29, 2018

Inventors: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Samuel Bengio, Ilya Sutskever

Complex linear projection for acoustic modeling

Patent number: 10140980

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

Type: Grant

Filed: December 21, 2016

Date of Patent: November 27, 2018

Assignee: Google LLC

Inventors: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A. U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

Neural Networks For Speaker Verification

Publication number: 20180315430

Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.

Type: Application

Filed: April 30, 2018

Publication date: November 1, 2018

Inventors: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno

Generating target sequences from input sequences using partial conditioning

Patent number: 10043512

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

Type: Grant

Filed: November 11, 2016

Date of Patent: August 7, 2018

Assignee: Google LLC

Inventors: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Samuel Bengio, Ilya Sutskever

COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING

Publication number: 20180174575

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

Type: Application

Filed: December 21, 2016

Publication date: June 21, 2018

Inventors: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A. U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

PROCESSING AND GENERATING SETS USING RECURRENT NEURAL NETWORKS

Publication number: 20170200076

Abstract: In one aspect, this specification describes a recurrent neural network system implemented by one or more computers that is configured to process input sets to generate neural network outputs for each input set. The input set can be a collection of multiple inputs for which the recurrent neural network should generate the same neural network output regardless of the order in which the inputs are arranged in the collection. The recurrent neural network system can include a read neural network, a process neural network, and a write neural network. In another aspect, this specification describes a system implemented as computer programs on one or more computers in one or more locations that is configured to train a recurrent neural network that receives a neural network input and sequentially emits outputs to generate an output sequence for the neural network input.

Type: Application

Filed: January 13, 2017

Publication date: July 13, 2017

Inventors: Oriol Vinyals, Samuel Bengio
