Patents by Inventor Quoc V. Le

Quoc V. Le has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210232929
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network that should be active during the processing of inputs by the large neural network; for each output sequence in the batch: determining a performance metric of the large neural network on the particular neural network task (i) in accordance with current values of the large network parameters and (ii) with only the subset of components specified by the output sequence active; and using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters of the controller neural network.
    Type: Application
    Filed: April 16, 2021
    Publication date: July 29, 2021
    Inventors: Barret Zoph, Yun Jia Guan, Hieu Hy Pham, Quoc V. Le
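The sample-evaluate-update loop in the abstract above can be sketched in miniature. Everything below is illustrative: the `Controller` class, `evaluate_with_mask`, the per-component Bernoulli parameterization, and the simplified baseline-centered update are stand-ins for the controller neural network, the large network, and the actual policy-gradient update; none of these names come from the patent.

```python
import random

class Controller:
    """Samples, for each of n components, whether it should be active."""
    def __init__(self, n_components, seed=0):
        self.n = n_components
        # One Bernoulli probability per component; these play the role of
        # the "controller parameters" that get adjusted.
        self.probs = [0.5] * n_components
        self.rng = random.Random(seed)

    def sample_batch(self, batch_size):
        # Each output sequence is a binary mask over the components.
        return [[1 if self.rng.random() < p else 0 for p in self.probs]
                for _ in range(batch_size)]

    def update(self, masks, rewards, lr=0.1):
        # Nudge each component's probability toward masks that scored well:
        # a crude, baseline-centered stand-in for the policy-gradient step.
        baseline = sum(rewards) / len(rewards)
        for mask, r in zip(masks, rewards):
            for i, bit in enumerate(mask):
                direction = 1.0 if bit else -1.0
                self.probs[i] = min(0.99, max(0.01,
                    self.probs[i] + lr * (r - baseline) * direction))

def evaluate_with_mask(mask):
    # Placeholder performance metric: pretend components 0 and 2 matter,
    # with a small cost for every active component.
    return mask[0] + mask[2] - 0.1 * sum(mask)

controller = Controller(n_components=4)
for _ in range(50):
    batch = controller.sample_batch(8)
    rewards = [evaluate_with_mask(m) for m in batch]
    controller.update(batch, rewards)
```

Under this toy metric the controller learns to keep the useful components (0 and 2) active far more often than the others.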
  • Patent number: 11048875
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.
    Type: Grant
    Filed: May 4, 2020
    Date of Patent: June 29, 2021
    Assignee: Google LLC
    Inventors: Quoc V. Le, Hongrae Lee, Wei Yu
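The conditional update described in the abstract above amounts to a loop that advances the recurrent state only on tokens not designated as skipped. A minimal sketch follows, with a toy `rnn_step` (in the described system the recurrent update is a neural network, not a running sum):

```python
def run_with_skips(tokens, skip, rnn_step, initial_state):
    """Process `tokens`, updating the state only for tokens not in `skip`."""
    state = initial_state
    for token in tokens:
        if token in skip:
            continue  # designated tokens leave the internal state unchanged
        state = rnn_step(state, token)
    return state  # the system output is generated from this final state

# Toy recurrent update: a running sum of token values.
final = run_with_skips(
    tokens=[1, 2, 3, 4],
    skip={2, 4},
    rnn_step=lambda s, t: s + t,
    initial_state=0,
)
```

Because tokens 2 and 4 are skipped, only tokens 1 and 3 reach the recurrent update, so the final state reflects just those two.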
  • Patent number: 11030523
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch defining a respective architecture of a child neural network that is configured to perform a particular neural network task; for each output sequence in the batch: training a respective instance of the child neural network having the architecture defined by the output sequence; evaluating a performance of the trained instance of the child neural network on the particular neural network task to determine a performance metric for the trained instance of the child neural network on the particular neural network task; and using the performance metrics for the trained instances of the child neural network to adjust the current values of the controller parameters of the controller neural network.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: June 8, 2021
    Assignee: Google LLC
    Inventors: Barret Zoph, Quoc V. Le
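The loop in the abstract above (propose an architecture, train a child, score it, adjust the proposer) can be sketched with a tiny discrete search space. The space, the `train_child` stand-in, and the reduction of "adjusting the controller" to per-decision mean-metric bookkeeping are all illustrative assumptions:

```python
import random

SEARCH_SPACE = {"filters": [16, 32], "kernel": [3, 5]}

def train_child(arch):
    # Stand-in for training a child instance with this architecture and
    # measuring its validation performance on the task.
    return 0.5 + 0.1 * (arch["filters"] == 32) + 0.05 * (arch["kernel"] == 3)

rng = random.Random(0)
# Running (total metric, count) per (decision, option) pair.
stats = {(n, o): [0.0, 0] for n, opts in SEARCH_SPACE.items() for o in opts}

for _ in range(200):
    # The controller proposes an output sequence defining an architecture
    # (reduced here to uniform random sampling of each decision).
    arch = {n: rng.choice(opts) for n, opts in SEARCH_SPACE.items()}
    metric = train_child(arch)
    # Credit the performance metric to each choice in the sequence.
    for n in SEARCH_SPACE:
        stats[(n, arch[n])][0] += metric
        stats[(n, arch[n])][1] += 1

# Pick the best-scoring option per decision by mean metric.
best = {n: max(opts, key=lambda o: stats[(n, o)][0] / stats[(n, o)][1])
        for n, opts in SEARCH_SPACE.items()}
```

With this fake metric the search reliably recovers the best combination of choices; the real system replaces the bookkeeping with gradient updates to a recurrent controller.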
  • Publication number: 20210133578
    Abstract: A method for determining a final architecture for a neural network to perform a particular machine learning task is described.
    Type: Application
    Filed: January 8, 2021
    Publication date: May 6, 2021
    Inventors: Mingxing Tan, Quoc V. Le
  • Patent number: 10997503
    Abstract: A method for receiving training data for training a neural network to perform a machine learning task and for searching for, using the training data, an optimized neural network architecture for performing the machine learning task is described. Searching for the optimized neural network architecture includes: maintaining population data; maintaining threshold data; and repeatedly performing the following operations: selecting one or more candidate architectures from the population data; generating a new architecture from the one or more selected candidate architectures; for the new architecture: training a neural network having the new architecture until termination criteria for the training are satisfied; and determining a final measure of fitness of the neural network having the new architecture after the training; and adding data defining the new architecture and the final measure of fitness for the neural network having the new architecture to the population data.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: May 4, 2021
    Assignee: Google LLC
    Inventors: David Martin Dohan, David Richard So, Chen Liang, Quoc V. Le
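The population loop in the abstract above (select candidates, generate a new architecture, train and score it, add it to the population) reads naturally as an evolutionary search. A toy rendering, where the "architecture" is a bit string, `fitness` stands in for training and evaluating a network, and the tournament/mutation/aging details are assumptions not taken from the patent:

```python
import random

def evolve(fitness, genome_length=8, population_size=10, steps=100, seed=0):
    rng = random.Random(seed)
    # Population data: (architecture, final measure of fitness) pairs.
    population = []
    for _ in range(population_size):
        arch = [rng.randint(0, 1) for _ in range(genome_length)]
        population.append((arch, fitness(arch)))
    best = max(population, key=lambda p: p[1])
    for _ in range(steps):
        # Select a candidate architecture (tournament over a random sample).
        parent = max(rng.sample(population, 3), key=lambda p: p[1])[0]
        # Generate a new architecture by mutating the selected candidate.
        child = list(parent)
        i = rng.randrange(genome_length)
        child[i] = 1 - child[i]
        # "Training until termination criteria are satisfied" is collapsed
        # into the fitness callback here.
        f = fitness(child)
        population.append((child, f))
        if f > best[1]:
            best = (child, f)
        population.pop(0)  # age out the oldest population entry
    return best

# Toy fitness: count of ones in the genome (maximum is genome_length).
best_arch, best_fit = evolve(fitness=sum)
```

The aging step (dropping the oldest entry rather than the worst) keeps the population turning over, which is one common way to regularize this kind of search.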
  • Patent number: 10984319
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network that should be active during the processing of inputs by the large neural network; for each output sequence in the batch: determining a performance metric of the large neural network on the particular neural network task (i) in accordance with current values of the large network parameters and (ii) with only the subset of components specified by the output sequence active; and using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters of the controller neural network.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: April 20, 2021
    Assignee: Google LLC
    Inventors: Barret Zoph, Yun Jia Guan, Hieu Hy Pham, Quoc V. Le
  • Publication number: 20210097348
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes obtaining a training data set for training a machine learning model, the training data set comprising a plurality of training inputs; determining a plurality of data augmentation policies, wherein each data augmentation policy defines a procedure for processing a training input to generate a transformed training input; for each data augmentation policy, training the machine learning model using the data augmentation policy; determining, for each data augmentation policy, a quality measure of the machine learning model that has been trained using the data augmentation policy; and selecting a final data augmentation policy based on the quality measures of the machine learning models.
    Type: Application
    Filed: March 27, 2020
    Publication date: April 1, 2021
    Inventors: Jonathon Shlens, Quoc V. Le, Ekin Dogus Cubuk, Barret Zoph
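The selection procedure in the abstract above (train once per candidate policy, score each trained model, keep the best policy) can be sketched directly. The transforms, policies, and `quality_measure` below are all illustrative stand-ins; in particular, the fake quality score is not how the real system evaluates a trained model:

```python
def rotate(x): return x[::-1]    # placeholder "transform" on a string input
def flip(x): return x.swapcase() # placeholder transform

# Each data augmentation policy is a procedure: a list of transforms
# applied in order to a training input.
policies = {
    "rotate-then-flip": [rotate, flip],
    "flip-only": [flip],
    "no-op": [],
}

def apply_policy(policy, x):
    for op in policy:
        x = op(x)
    return x

def quality_measure(policy):
    # Stand-in for "train the model using the policy, then evaluate it";
    # here richer policies simply score a bit higher.
    return 0.85 + 0.03 * len(policy)

# Select the final policy by the quality measure of its trained model.
final_policy = max(policies, key=lambda name: quality_measure(policies[name]))
```

The structure is the point: one full train-and-evaluate cycle per candidate policy, then an argmax over the resulting quality measures.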
  • Patent number: 10963779
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing operations using data from a data source. In one aspect, a method includes a neural network system including a controller neural network configured to: receive a controller input for a time step and process the controller input and a representation of a system input to generate: an operation score distribution that assigns a respective operation score to an operation and a data score distribution that assigns a respective data score to data in the data source. The neural network system can also include an operation subsystem configured to: perform operations to generate operation outputs, wherein at least one of the operations is performed on data in the data source, and combine the operation outputs in accordance with the operation score distribution and the data score distribution to generate a time step output for the time step.
    Type: Grant
    Filed: November 11, 2016
    Date of Patent: March 30, 2021
    Assignee: Google LLC
    Inventors: Quoc V. Le, Ilya Sutskever, Arvind Neelakantan
  • Publication number: 20210089724
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Application
    Filed: September 21, 2020
    Publication date: March 25, 2021
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
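The data-construction step described in the abstract above (mask a subset of tokens, replace them with generator samples, label each position as original or replaced) can be sketched as follows. The uniform random sampler below is a deliberate simplification: in the described system the generator is a small masked language model, and the function name and rates are illustrative:

```python
import random

def make_rtd_example(tokens, vocab, mask_rate=0.15, seed=0):
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            # The "generator" here is a uniform sampler over the vocabulary;
            # in the described system it is a small masked language model.
            sample = rng.choice(vocab)
            corrupted.append(sample)
            # A sampled token that happens to equal the original counts as
            # "original", not "replaced".
            labels.append(0 if sample == tok else 1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

tokens = ["the", "cat", "sat", "on", "the", "mat"]
corrupted, labels = make_rtd_example(tokens, vocab=["dog", "ran", "the"],
                                     mask_rate=0.3)
```

The encoder is then trained as a binary classifier over `(corrupted, labels)` pairs, predicting at every position whether the token is original or a replacement.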
  • Patent number: 10936828
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural translation systems with rare word processing. One of the methods trains a neural network translation system to track the sources, in source-language sentences, of unknown words in target-language sentences. It includes deriving alignment data from a parallel corpus, the alignment data identifying, in each pair of source and target language sentences in the parallel corpus, aligned source and target words; annotating the sentences in the parallel corpus according to the alignment data and a rare word model to generate a training dataset of paired source and target language sentences; and training a neural network translation model on the training dataset.
    Type: Grant
    Filed: November 16, 2018
    Date of Patent: March 2, 2021
    Assignee: Google LLC
    Inventors: Quoc V. Le, Minh-Thang Luong, Ilya Sutskever, Oriol Vinyals, Wojciech Zaremba
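The annotation step described in the abstract above can be sketched as tagging each out-of-vocabulary target word with the position of its aligned source word, so the system can later recover the word by copying or dictionary lookup. The tag format (`<unk_pos_d>` for a relative offset d) and the function names are illustrative assumptions:

```python
def annotate(source, target, alignment, vocab):
    """Annotate unknown target words using `alignment`: target index -> source index."""
    annotated = []
    for t_idx, word in enumerate(target):
        if word in vocab:
            annotated.append(word)
        else:
            s_idx = alignment.get(t_idx)
            if s_idx is None:
                annotated.append("<unk>")  # unknown word with no aligned source
            else:
                # Encode the aligned source position relative to t_idx, so a
                # post-processing step can find the source word to copy.
                annotated.append(f"<unk_pos_{s_idx - t_idx}>")
    return annotated

out = annotate(
    source=["le", "chat", "Kyoto"],
    target=["the", "cat", "Kyoto"],
    alignment={0: 0, 1: 1, 2: 2},
    vocab={"the", "cat"},
)
```

Here "Kyoto" is out of vocabulary, so it is replaced by a positional tag pointing back at its aligned source word; the translation model is then trained on the annotated corpus.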
  • Patent number: 10922611
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining update rules for training neural networks. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch defining a respective update rule; for each output sequence in the batch: training a respective instance of a child neural network using the update rule defined by the output sequence; evaluating a performance of the trained instance of the child neural network on the particular neural network task to determine a performance metric for the trained instance of the child neural network on the particular neural network task; and using the performance metrics for the trained instances of the child neural network to adjust the current values of the controller parameters of the controller neural network.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: February 16, 2021
    Assignee: Google LLC
    Inventors: Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le
  • Patent number: 10909457
    Abstract: A method for determining a final architecture for a neural network to perform a particular machine learning task is described.
    Type: Grant
    Filed: January 23, 2020
    Date of Patent: February 2, 2021
    Assignee: Google LLC
    Inventors: Mingxing Tan, Quoc V. Le
  • Publication number: 20210019658
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for learning a data augmentation policy for training a machine learning model. In one aspect, a method includes: receiving training data for training a machine learning model to perform a particular machine learning task; determining multiple data augmentation policies, comprising, at each of multiple time steps: generating a current data augmentation policy based on quality measures of data augmentation policies generated at previous time steps; training a machine learning model on the training data using the current data augmentation policy; and determining a quality measure of the current data augmentation policy using the machine learning model after it has been trained using the current data augmentation policy; and selecting a final data augmentation policy based on the quality measures of the determined data augmentation policies.
    Type: Application
    Filed: October 1, 2020
    Publication date: January 21, 2021
    Inventors: Vijay Vasudevan, Barret Zoph, Ekin Dogus Cubuk, Quoc V. Le
  • Publication number: 20200410396
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing machine learning tasks. One method includes receiving (i) a model input, and (ii) data identifying a first machine learning task to be performed on the model input to generate a first type of model output for the model input; augmenting the model input with an identifier for the first machine learning task to generate an augmented model input; and processing the augmented model input using a machine learning model, wherein the machine learning model has been trained on training data to perform a plurality of machine learning tasks including the first machine learning task, and wherein the machine learning model has been configured through training to process the augmented model input to generate a machine learning model output of the first type for the model input.
    Type: Application
    Filed: July 13, 2020
    Publication date: December 31, 2020
    Inventors: Zhifeng Chen, Michael Schuster, Melvin Jose Johnson Premkumar, Yonghui Wu, Quoc V. Le, Maxim Krikun, Thorsten Brants
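The input augmentation described in the abstract above is mechanically simple: prepend an identifier for the requested task to the model input. The `<2xx>` token format below mirrors the convention used in multilingual translation systems but is an assumption here, and `multitask_model` is a stub, not the trained model:

```python
def augment_with_task(model_input_tokens, task_id):
    # Prepend a task-identifier token to form the augmented model input.
    return ["<2" + task_id + ">"] + list(model_input_tokens)

def multitask_model(augmented_input):
    # Stub for a single model trained on many tasks: it reads the
    # identifier and produces output of the type that task requires.
    task_token, payload = augmented_input[0], augmented_input[1:]
    return {"task": task_token, "payload": payload}

augmented = augment_with_task(["Hello", "world"], "fr")
result = multitask_model(augmented)
```

Because the task is carried in-band as an ordinary token, one trained model can serve many tasks without any architectural switch per task.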
  • Publication number: 20200401899
    Abstract: A method for receiving training data for training a neural network to perform a machine learning task and for searching for, using the training data, an optimized neural network architecture for performing the machine learning task is described. Searching for the optimized neural network architecture includes: maintaining population data; maintaining threshold data; and repeatedly performing the following operations: selecting one or more candidate architectures from the population data; generating a new architecture from the one or more selected candidate architectures; for the new architecture: training a neural network having the new architecture until termination criteria for the training are satisfied; and determining a final measure of fitness of the neural network having the new architecture after the training; and adding data defining the new architecture and the final measure of fitness for the neural network having the new architecture to the population data.
    Type: Application
    Filed: June 20, 2019
    Publication date: December 24, 2020
    Inventors: David Martin Dohan, David Richard So, Chen Liang, Quoc V. Le
  • Publication number: 20200372076
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining, for each of one or more categorical features, a respective vocabulary of categorical feature values of the categorical feature that should be active during processing of inputs by a machine learning model. In one aspect, a method comprises: generating a batch of output sequences, each output sequence in the batch specifying, for each of the categorical features, a respective vocabulary of categorical feature values of the categorical feature that should be active; for each output sequence in the batch, determining a performance metric of the machine learning model on a machine learning task after the machine learning model has been trained to perform the machine learning task with only the respective vocabulary of categorical feature values of each categorical feature specified by the output sequence being active.
    Type: Application
    Filed: May 20, 2020
    Publication date: November 26, 2020
    Inventors: Cong Li, Jay Adams, Manas Joglekar, Pranav Khaitan, Quoc V. Le, Mei Chen
  • Publication number: 20200364540
    Abstract: Generally, the present disclosure is directed to novel machine-learned classification models that operate with hard attention, taking discrete attention actions. The present disclosure also provides a self-supervised pre-training procedure that initializes the model to a state with more frequent rewards. Given only the ground truth classification labels for a set of training inputs (e.g., images), the proposed models are able to learn a policy over discrete attention locations that identifies certain portions of the input (e.g., patches of the images) that are relevant to the classification. In such fashion, the models are able to provide high accuracy classifications while also providing an explicit and interpretable basis for the decision.
    Type: Application
    Filed: May 13, 2020
    Publication date: November 19, 2020
    Inventors: Gamaleldin Elsayed, Simon Kornblith, Quoc V. Le
  • Publication number: 20200364543
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for incorporating a computationally efficient expressive output layer in a neural network. The output layer is configured to map a received hidden state to a probability distribution over a vocabulary of possible outputs by generating, from the hidden state, a respective context embedding for each of a plurality of gates; computing a weighted sum of the context embeddings; for each of the possible outputs in the vocabulary, computing a gated logit for the possible output by applying an output embedding for the possible output to the weighted sum; and generating the probability distribution over the vocabulary of possible outputs by applying a softmax to the gated logits for the possible outputs in the vocabulary.
    Type: Application
    Filed: May 13, 2020
    Publication date: November 19, 2020
    Inventors: Thang Minh Luong, Quoc V. Le, Zhilin Yang
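The pipeline in the abstract above (context embedding per gate, weighted sum, output-embedding dot products, softmax) can be traced numerically. Uniform gate weights and the tiny matrices below are simplifying assumptions; the patent's gating computation is more involved than this:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def output_layer(hidden, gate_mats, output_embeddings):
    # A context embedding per gate: one linear map of the hidden state each.
    contexts = [[sum(w * h for w, h in zip(row, hidden)) for row in mat]
                for mat in gate_mats]
    # Weighted sum of the context embeddings (uniform gate weights here).
    k = len(contexts)
    combined = [sum(c[d] for c in contexts) / k
                for d in range(len(contexts[0]))]
    # Gated logit per vocabulary item: apply its output embedding
    # (a dot product) to the weighted sum.
    logits = [sum(e * c for e, c in zip(emb, combined))
              for emb in output_embeddings]
    # Softmax over the gated logits gives the output distribution.
    return softmax(logits)

probs = output_layer(
    hidden=[1.0, 0.0],
    gate_mats=[[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]],
    output_embeddings=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
)
```

With these inputs the combined context is aligned with the first output embedding and anti-aligned with the third, so the distribution puts most of its mass on the first vocabulary item.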
  • Publication number: 20200364617
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model using teacher annealing.
    Type: Application
    Filed: May 11, 2020
    Publication date: November 19, 2020
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Patent number: 10817805
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for learning a data augmentation policy for training a machine learning model. In one aspect, a method includes: receiving training data for training a machine learning model to perform a particular machine learning task; determining multiple data augmentation policies, comprising, at each of multiple time steps: generating a current data augmentation policy based on quality measures of data augmentation policies generated at previous time steps; training a machine learning model on the training data using the current data augmentation policy; and determining a quality measure of the current data augmentation policy using the machine learning model after it has been trained using the current data augmentation policy; and selecting a final data augmentation policy based on the quality measures of the determined data augmentation policies.
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: October 27, 2020
    Assignee: Google LLC
    Inventors: Vijay Vasudevan, Barret Zoph, Ekin Dogus Cubuk, Quoc V. Le