Patents by Inventor Quoc V. Le

Quoc V. Le has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230015737
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Application
    Filed: September 19, 2022
    Publication date: January 19, 2023
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
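    A minimal sketch of the replaced-token-detection setup this abstract describes, assuming a toy vocabulary and a random-sampling stand-in for the masked-language-model "generator"; only the ~15% masking rate and the original-vs-replaced labels come from the abstract.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, mask_rate = 1000, 0.15

    # Original token ids for one training example.
    tokens = rng.integers(0, vocab_size, size=20)

    # Mask out ~15% of the positions, as in the abstract.
    mask = rng.random(tokens.shape) < mask_rate

    # Stand-in "generator": random samples; the patent uses a small
    # masked language model to propose plausible replacements.
    replacements = rng.integers(0, vocab_size, size=tokens.shape)
    corrupted = np.where(mask, replacements, tokens)

    # Discriminator targets: 1 where the token is a replacement,
    # 0 where it comes from the original data.
    labels = (corrupted != tokens).astype(np.int64)
    print(corrupted)
    print(labels)
    ```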
  • Patent number: 11556690
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a computer chip placement. One of the methods includes obtaining netlist data for a computer chip; and generating a computer chip placement, comprising placing a respective macro node at each time step in a sequence comprising a plurality of time steps, the placing comprising, for each time step: generating an input representation for the time step; processing the input representation using a node placement neural network having a plurality of network parameters, wherein the node placement neural network is configured to process the input representation in accordance with current values of the network parameters to generate a score distribution over a plurality of positions on the surface of the computer chip; and assigning the macro node to be placed at the time step to a position from the plurality of positions using the score distribution.
    Type: Grant
    Filed: December 17, 2021
    Date of Patent: January 17, 2023
    Assignee: Google LLC
    Inventors: Anna Darling Goldie, Azalia Mirhoseini, Ebrahim Songhori, Wenjie Jiang, Shen Wang, Roger David Carpenter, Young-Joon Lee, Mustafa Nazim Yazgan, Chian-min Richard Ho, Quoc V. Le, James Laudon, Jeffrey Adgate Dean, Kavya Srinivasa Setty, Omkar Pathak
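    A toy sketch of the sequential placement loop in this abstract: at each time step a stand-in for the node placement neural network scores every grid position, occupied cells are masked out, and the macro is assigned by sampling the score distribution. The grid size, macro count, and random scoring are assumptions.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    grid_h, grid_w, num_macros = 8, 8, 5
    occupied = np.zeros(grid_h * grid_w, dtype=bool)

    def policy_scores(step):
        # Stand-in for the node placement neural network: it would map an
        # input representation of the netlist and the placement so far to
        # a score per chip position; random scores are used here.
        return rng.normal(size=grid_h * grid_w)

    placement = []
    for step in range(num_macros):
        scores = policy_scores(step)
        scores[occupied] = -np.inf              # never reuse a filled cell
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        pos = rng.choice(len(probs), p=probs)   # sample the score distribution
        occupied[pos] = True
        placement.append(divmod(pos, grid_w))   # (row, col) on the chip

    print(placement)
    ```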
  • Patent number: 11537664
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining, for each of one or more categorical features, a respective vocabulary of categorical feature values of the categorical feature that should be active during processing of inputs by a machine learning model. In one aspect, a method comprises: generating a batch of output sequences, each output sequence in the batch specifying, for each of the categorical features, a respective vocabulary of categorical feature values of the categorical feature that should be active; for each output sequence in the batch, determining a performance metric of the machine learning model on a machine learning task after the machine learning model has been trained to perform the machine learning task with only the respective vocabulary of categorical feature values of each categorical feature specified by the output sequence being active.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: December 27, 2022
    Assignee: Google LLC
    Inventors: Cong Li, Jay Adams, Manas Joglekar, Pranav Khaitan, Quoc V. Le, Mei Chen
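    A sketch of the search loop this abstract outlines, assuming a single categorical feature with made-up values and a trivial stand-in for training and evaluating the downstream model: a batch of candidate vocabularies is generated, each is scored, and the best is kept.
    ```python
    import random

    random.seed(0)

    # Hypothetical categorical feature with its full set of values.
    all_values = ["US", "UK", "DE", "FR", "JP", "BR", "IN", "CN"]

    def train_and_evaluate(active_vocab):
        # Stand-in for training the model with only `active_vocab` active
        # and returning a validation metric; this toy score just rewards
        # small vocabularies that keep "US" and "CN".
        keep_bonus = ("US" in active_vocab) + ("CN" in active_vocab)
        return keep_bonus - 0.05 * len(active_vocab)

    # One batch of candidate vocabularies (the "output sequences").
    batch = [random.sample(all_values, k=random.randint(1, len(all_values)))
             for _ in range(16)]
    best = max(batch, key=train_and_evaluate)
    print(best)
    ```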
  • Publication number: 20220405579
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting a neural network to perform a particular machine learning task while satisfying a set of constraints.
    Type: Application
    Filed: March 3, 2021
    Publication date: December 22, 2022
    Inventors: Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Mintzer Bender, Pieter-Jan Kindermans, Mingxing Tan, Xiaodan Song, Ruoming Pang, Quoc V. Le
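    The abstract gives few specifics, so the sketch below shows one plausible reading: pick the highest-accuracy candidate network that satisfies every constraint in the set. All names and numbers are hypothetical.
    ```python
    # Candidate networks as (accuracy, latency_ms, params_m) tuples.
    candidates = {
        "net_a": (0.76, 12.0, 5.3),
        "net_b": (0.79, 25.0, 12.1),
        "net_c": (0.81, 48.0, 30.0),
    }
    max_latency_ms, max_params_m = 30.0, 20.0   # assumed constraint set

    feasible = {name: stats for name, stats in candidates.items()
                if stats[1] <= max_latency_ms and stats[2] <= max_params_m}
    best = max(feasible, key=lambda name: feasible[name][0])
    print(best)  # highest-accuracy network that satisfies both constraints
    ```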
  • Publication number: 20220391687
    Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for generating and searching reinforcement learning algorithms. In some implementations, a computer-implemented system generates a sequence of candidate reinforcement learning algorithms. Each candidate reinforcement learning algorithm in the sequence is configured to receive an input environment state characterizing a state of an environment and to generate an output that specifies an action to be performed by an agent interacting with the environment. For each candidate reinforcement learning algorithm in the sequence, the system performs a performance evaluation for a set of a plurality of training environments. For each training environment, the system adjusts a set of environment-specific parameters of the candidate reinforcement learning algorithm by performing training of the candidate reinforcement learning algorithm to control a corresponding agent in the training environment.
    Type: Application
    Filed: June 3, 2021
    Publication date: December 8, 2022
    Inventors: John Dalton Co-Reyes, Yingjie Miao, Daiyi Peng, Sergey Vladimir Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust
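    A toy version of the evaluation loop in this abstract: each candidate algorithm is trained per environment (standing in for adjusting its environment-specific parameters) and its returns are aggregated across the training environments. The environment names, candidate representation, and scoring are all placeholders.
    ```python
    import random

    random.seed(0)
    train_envs = ["cartpole", "gridworld", "pendulum"]  # hypothetical names

    def make_candidate(i):
        # Stand-in for a generated candidate RL algorithm; a scalar
        # "quality" replaces its learned state-to-action mapping here.
        return {"id": i, "quality": random.random()}

    def train_in_env(candidate, env):
        # Stand-in for adjusting the candidate's environment-specific
        # parameters by training its agent in `env`, then returning the
        # resulting episode return.
        return candidate["quality"] + random.uniform(-0.1, 0.1)

    scores = {}
    for i in range(8):                              # sequence of candidates
        cand = make_candidate(i)
        returns = [train_in_env(cand, env) for env in train_envs]
        scores[i] = sum(returns) / len(returns)     # aggregate evaluation

    print(max(scores, key=scores.get))
    ```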
  • Publication number: 20220383069
    Abstract: A computer-implemented method for performing computer vision with reduced computational cost and improved accuracy can include obtaining, by a computing system including one or more computing devices, input data comprising an input tensor having one or more dimensions, providing, by the computing system, the input data to a machine-learned convolutional attention network, the machine-learned convolutional attention network including two or more network stages, and, in response to providing the input data to the machine-learned convolutional attention network, receiving, by the computing system, a machine-learning prediction from the machine-learned convolutional attention network. The convolutional attention network can include at least one attention block, wherein the attention block includes a relative attention mechanism, the relative attention mechanism including the sum of a static convolution kernel with an adaptive attention matrix.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: Zihang Dai, Hanxiao Liu, Mingxing Tan, Quoc V. Le
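    A numeric sketch of the relative attention mechanism named in the abstract: the attention logits are the sum of an adaptive, input-dependent attention matrix (query-key dot products) and a static kernel indexed by relative position. Sequence length, width, and random weights are arbitrary.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, dim = 6, 4
    q = rng.normal(size=(seq_len, dim))
    k = rng.normal(size=(seq_len, dim))

    # Static convolution kernel, indexed by relative position i - j.
    rel_bias = rng.normal(size=2 * seq_len - 1)
    idx = np.arange(seq_len)
    rel = idx[:, None] - idx[None, :] + seq_len - 1   # shift to 0..2L-2

    # Sum of the adaptive attention matrix and the static kernel,
    # followed by a softmax over each row.
    logits = q @ k.T / np.sqrt(dim) + rel_bias[rel]
    attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    print(attn.shape)  # (6, 6); each row sums to 1
    ```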
  • Publication number: 20220383195
    Abstract: A method for searching for an output machine learning (ML) algorithm to perform an ML task is described. The method includes: receiving a set of training examples and a set of validation examples, and generating a sequence of candidate ML algorithms to perform the task. For each candidate ML algorithm in the sequence, the method includes: setting up one or more training parameters for the candidate ML algorithm by executing a respective candidate setup function, training the candidate ML algorithm by processing the set of training examples using a respective candidate predict function and a respective candidate learn function, and evaluating a performance of the trained candidate ML algorithm by executing the respective candidate predict function on the set of validation examples to determine a performance metric. The method includes selecting a trained candidate ML algorithm with the best performance metric as the output ML algorithm for the task.
    Type: Application
    Filed: February 8, 2021
    Publication date: December 1, 2022
    Inventors: Chen Liang, David Richard So, Esteban Alberto Real, Quoc V. Le
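    A toy rendering of the setup/predict/learn structure this abstract describes, with a one-parameter linear model standing in for the searched programs and a made-up regression task as the training and validation sets.
    ```python
    import random

    random.seed(0)
    train_set = [(i / 10, 2 * (i / 10) + 1) for i in range(10)]
    valid_set = [(i / 10, 2 * (i / 10) + 1) for i in range(10, 15)]

    def make_candidate():
        # One candidate ML algorithm as its (setup, predict, learn)
        # functions; only the hyperparameter choice varies here.
        state = {}
        lr = random.choice([0.01, 0.05, 0.1])
        def setup():
            state.update(w=0.0, b=0.0)       # set up training parameters
        def predict(x):
            return state["w"] * x + state["b"]
        def learn(x, y):
            err = predict(x) - y
            state["w"] -= lr * err * x
            state["b"] -= lr * err
        return setup, predict, learn

    best_err, best_predict = float("inf"), None
    for _ in range(5):                        # the sequence of candidates
        setup, predict, learn = make_candidate()
        setup()                               # candidate setup function
        for x, y in train_set * 200:          # train via predict + learn
            learn(x, y)
        err = sum((predict(x) - y) ** 2 for x, y in valid_set)
        if err < best_err:                    # keep the best metric
            best_err, best_predict = err, predict

    print(best_err)
    ```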
  • Publication number: 20220383119
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the systems includes an attention neural network configured to perform the machine learning task. The attention neural network includes one or more attentions layers that each include a squared ReLU activation layer, a depth-wise convolution layer, or both.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: David Richard So, Quoc V. Le, Hanxiao Liu, Wojciech Andrzej Manke, Zihang Dai, Noam M. Shazeer
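    The two layer types the abstract names are simple to state concretely; below is a sketch of a squared ReLU activation and a causal depthwise 1-D convolution over a (sequence, channels) array, with shapes and weights chosen arbitrarily.
    ```python
    import numpy as np

    def squared_relu(x):
        # ReLU followed by squaring.
        return np.maximum(x, 0.0) ** 2

    def depthwise_conv1d(x, kernels):
        # x: (seq_len, channels); one small kernel per channel.
        seq_len, channels = x.shape
        k = kernels.shape[1]
        pad = np.pad(x, ((k - 1, 0), (0, 0)))   # causal left-padding
        out = np.zeros_like(x)
        for c in range(channels):
            out[:, c] = np.convolve(pad[:, c], kernels[c][::-1], mode="valid")
        return out

    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 4))
    print(squared_relu(x).shape)                       # (8, 4)
    print(depthwise_conv1d(x, rng.normal(size=(4, 3))).shape)  # (8, 4)
    ```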
  • Publication number: 20220383206
    Abstract: Systems and methods can leverage task-specific unlabeled data to improve downstream performance in data-constrained scenarios. Given a target task, a first technique proposed herein, which can be referred to as task augmentation, uses unlabeled text from the target domain to synthesize a large amount of in-domain training data for an auxiliary task. A second technique provides a self-training algorithm, where a model learns to improve itself using its predictions on unlabeled examples.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: Thang Minh Luong, Tu Thanh Vu, Quoc V. Le, Grady Hayes Simon
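    A minimal sketch of the self-training loop from the second technique in the abstract, with a majority-class "classifier" standing in for a real model and made-up sentences as the labeled and unlabeled data.
    ```python
    import random

    random.seed(0)

    def train(examples):
        # Stand-in for fitting a text classifier; returns a "model" that
        # always predicts the majority label of its training data.
        labels = [y for _, y in examples]
        majority = max(set(labels), key=labels.count)
        return lambda x: majority

    labeled = [("great movie", 1), ("terrible plot", 0), ("loved it", 1)]
    unlabeled = ["best film this year", "not worth watching"]

    teacher = train(labeled)

    # Self-training: pseudo-label the unlabeled in-domain text with the
    # current model's predictions, then retrain on the union.
    pseudo = [(x, teacher(x)) for x in unlabeled]
    student = train(labeled + pseudo)
    print([student(x) for x in unlabeled])
    ```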
  • Publication number: 20220367052
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input to generate a network output. In one aspect, one of the systems includes a neural network configured to perform the machine learning task, the neural network including one or more blocks that each include a feedforward spatial transformation unit.
    Type: Application
    Filed: May 16, 2022
    Publication date: November 17, 2022
    Inventors: Hanxiao Liu, David Richard So, Quoc V. Le, Zihang Dai
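    A sketch of one plausible "feedforward spatial transformation unit": half the channels are transformed along the sequence axis by a learned matrix and used to gate the other half. The sizes, weights, and near-identity initialization are assumptions.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, dim = 16, 8

    def spatial_gating_unit(x, w_spatial, b_spatial):
        # Split channels; transform one half along the *sequence* axis
        # (the feedforward spatial transformation), then use it to gate
        # the other half elementwise.
        u, v = np.split(x, 2, axis=-1)
        v = w_spatial @ v + b_spatial    # mixes information across positions
        return u * v

    x = rng.normal(size=(seq_len, dim))
    w = rng.normal(size=(seq_len, seq_len)) * 0.01
    b = np.ones((seq_len, 1))            # near-identity initialization
    print(spatial_gating_unit(x, w, b).shape)  # (16, 4)
    ```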
  • Patent number: 11501168
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for structuring and training a recurrent neural network. This describes a technique that improves the ability to capture long term dependencies in recurrent neural networks by adding an unsupervised auxiliary loss at one or more anchor points to the original objective. This auxiliary loss forces the network to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full backpropagation through time.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: November 15, 2022
    Assignee: Google LLC
    Inventors: Andrew M. Dai, Quoc V. Le, Hoang Trieu Trinh, Thang Minh Luong
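    The loss structure the abstract describes reduces to adding a weighted unsupervised term at sampled anchor points; the sketch below shows that combination with random numbers standing in for both losses and with the anchor count and weight chosen arbitrarily.
    ```python
    import random

    random.seed(0)

    def main_loss(model, sequence):
        # Stand-in for the original supervised objective on the sequence.
        return random.random()

    def reconstruction_loss(model, sequence, anchor):
        # Stand-in for the unsupervised auxiliary objective: from the
        # state at `anchor`, reconstruct previous events or predict the
        # next events in the sequence.
        return random.random()

    def total_loss(model, sequence, num_anchors=2, aux_weight=0.5):
        anchors = random.sample(range(len(sequence)), k=num_anchors)
        aux = sum(reconstruction_loss(model, sequence, a) for a in anchors)
        return main_loss(model, sequence) + aux_weight * aux

    print(total_loss(model=None, sequence=list(range(1000))))
    ```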
  • Patent number: 11488067
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model using teacher annealing.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: November 1, 2022
    Assignee: Google LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
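    The abstract names the technique without detail; the sketch below shows the usual form of teacher annealing, mixing teacher predictions with gold labels and shifting the weight toward the gold labels over training. The schedule and numbers are assumptions.
    ```python
    import numpy as np

    def distillation_target(gold, teacher_probs, step, total_steps):
        # Teacher annealing: early in training the target is mostly the
        # teacher's prediction; by the end it is mostly the gold label.
        lam = step / total_steps        # 0 -> all teacher, 1 -> all gold
        return lam * gold + (1.0 - lam) * teacher_probs

    gold = np.array([0.0, 1.0, 0.0])        # one-hot gold label
    teacher = np.array([0.2, 0.6, 0.2])     # teacher's soft prediction
    for step in (0, 500, 1000):
        print(step, distillation_target(gold, teacher, step, total_steps=1000))
    ```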
  • Patent number: 11481609
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for incorporating a computationally efficient expressive output layer in a neural network. The output layer is configured to map a received hidden state to a probability distribution over a vocabulary of possible outputs by generating, from the hidden state, a respective context embedding for each of a plurality of gates; for each of the possible outputs in the vocabulary, computing a gated logit for the possible output by applying an output embedding for the possible output to a weighted sum of the context embeddings; and generating the probability distribution over the vocabulary of possible outputs by applying a softmax to the gated logits for the possible outputs in the vocabulary.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: October 25, 2022
    Assignee: Google LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Zhilin Yang
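    A simplified numeric sketch of this gated output layer: per-gate context embeddings are computed from the hidden state, combined by softmax gate weights into a weighted sum, and projected onto the output embeddings. For brevity this shares one mixture across the whole vocabulary, whereas the patent's gating can vary per output; all sizes and weights are arbitrary.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    dim, vocab, num_gates = 8, 12, 4

    hidden = rng.normal(size=dim)
    W_ctx = rng.normal(size=(num_gates, dim, dim))  # per-gate context maps
    W_gate = rng.normal(size=(num_gates, dim))      # gate scoring weights
    out_emb = rng.normal(size=(vocab, dim))         # output embeddings

    # One context embedding per gate, computed from the hidden state.
    contexts = np.tanh(W_ctx @ hidden)              # (num_gates, dim)

    # Gate weights, then a weighted sum of the context embeddings.
    g = W_gate @ hidden
    gates = np.exp(g - g.max()); gates /= gates.sum()
    mixed = gates @ contexts                        # (dim,)

    # Gated logits and the final distribution over the vocabulary.
    logits = out_emb @ mixed
    probs = np.exp(logits - logits.max()); probs /= probs.sum()
    print(probs.shape, probs.sum())
    ```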
  • Patent number: 11475277
    Abstract: Generally, the present disclosure is directed to novel machine-learned classification models that operate with hard attention to make discrete attention actions. The present disclosure also provides a self-supervised pre-training procedure that initializes the model to a state with more frequent rewards. Given only the ground truth classification labels for a set of training inputs (e.g., images), the proposed models are able to learn a policy over discrete attention locations that identifies certain portions of the input (e.g., patches of the images) that are relevant to the classification. In such fashion, the models are able to provide high accuracy classifications while also providing an explicit and interpretable basis for the decision.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: October 18, 2022
    Assignee: GOOGLE LLC
    Inventors: Gamaleldin Elsayed, Simon Kornblith, Quoc V. Le
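    A toy sketch of the hard-attention policy described above: a discrete patch location is sampled, a stand-in classifier labels the glimpse, and a REINFORCE (score-function) update raises the log-probability of the sampled location when the label is correct. The image, grid, classifier, and step size are all placeholders.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.normal(size=(28, 28))
    num_patches, patch = 16, 7               # 4x4 grid of 7x7 patches

    def classify_from_patch(p):
        # Stand-in classifier: the sign of the patch mean picks the class.
        return int(p.mean() > 0)

    # Policy over discrete attention locations (one logit per patch).
    policy_logits = np.zeros(num_patches)
    probs = np.exp(policy_logits); probs /= probs.sum()
    loc = rng.choice(num_patches, p=probs)   # hard, discrete action

    r, c = divmod(loc, 4)
    glimpse = image[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
    pred = classify_from_patch(glimpse)

    # REINFORCE: reward 1 for a correct label; grad of log p(loc) w.r.t.
    # the softmax logits is onehot(loc) - probs.
    label = 1
    reward = float(pred == label)
    grad = np.zeros(num_patches)
    grad[loc] = 1.0
    policy_logits += 0.1 * reward * (grad - probs)
    print(loc, pred, reward)
    ```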
  • Patent number: 11455514
    Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices includes receiving data specifying machine learning operations, and determining a placement that assigns each of the operations specified by the data to a respective device from the multiple hardware devices.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: September 27, 2022
    Assignee: Google LLC
    Inventors: Benoit Steiner, Anna Darling Goldie, Jeffrey Adgate Dean, Hieu Hy Pham, Azalia Mirhoseini, Quoc V. Le
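    A minimal sketch of the placement problem in this abstract, with random search standing in for the learned placer and a random stub standing in for actually timing the graph; the op and device names are hypothetical.
    ```python
    import random

    random.seed(0)
    ops = ["conv1", "conv2", "matmul", "softmax"]   # hypothetical op names
    devices = ["gpu:0", "gpu:1"]

    def measure_runtime(placement):
        # Stand-in for executing the operations under `placement` and
        # timing them; the patent optimizes against such measurements.
        return random.uniform(1.0, 2.0)

    # Each candidate placement assigns every op to a device; keep the
    # fastest assignment seen.
    best_time, best = float("inf"), None
    for _ in range(20):
        placement = {op: random.choice(devices) for op in ops}
        t = measure_runtime(placement)
        if t < best_time:
            best_time, best = t, placement

    print(best_time, best)
    ```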
  • Publication number: 20220301298
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an image representation neural network.
    Type: Application
    Filed: March 17, 2022
    Publication date: September 22, 2022
    Inventors: Tsung-Yi Lin, Barret Zoph, Ekin Dogus Cubuk, Golnaz Ghiasi, Quoc V. Le
  • Patent number: 11449684
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: September 20, 2022
    Assignee: GOOGLE LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Patent number: 11450096
    Abstract: Systems and methods of the present disclosure can include a computer-implemented method for efficient machine-learned model training. The method can include obtaining a plurality of training samples for a machine-learned model. The method can include, for one or more first training iterations, training, based at least in part on a first regularization magnitude configured to control a relative effect of one or more regularization techniques, the machine-learned model using one or more respective first training samples of the plurality of training samples. The method can include, for one or more second training iterations, training, based at least in part on a second regularization magnitude greater than the first regularization magnitude, the machine-learned model using one or more respective second training samples of the plurality of training samples.
    Type: Grant
    Filed: December 29, 2021
    Date of Patent: September 20, 2022
    Assignee: GOOGLE LLC
    Inventors: Mingxing Tan, Quoc V. Le
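    A small sketch of the schedule this abstract (and the related application below) describes: later training iterations use a larger regularization magnitude than earlier ones. Dropout is used as the example regularizer, and the stage count and magnitudes are assumptions.
    ```python
    def regularization_magnitude(stage, num_stages,
                                 min_dropout=0.1, max_dropout=0.3):
        # First iterations train with weak regularization; second and
        # later iterations train with progressively stronger values.
        frac = stage / max(num_stages - 1, 1)
        return min_dropout + frac * (max_dropout - min_dropout)

    for stage in range(4):
        print(stage, round(regularization_magnitude(stage, 4), 3))
    # 0.1 for the earliest samples, rising to 0.3 for the latest ones.
    ```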
  • Publication number: 20220245928
    Abstract: Systems and methods of the present disclosure can include a computer-implemented method for efficient machine-learned model training. The method can include obtaining a plurality of training samples for a machine-learned model. The method can include, for one or more first training iterations, training, based at least in part on a first regularization magnitude configured to control a relative effect of one or more regularization techniques, the machine-learned model using one or more respective first training samples of the plurality of training samples. The method can include, for one or more second training iterations, training, based at least in part on a second regularization magnitude greater than the first regularization magnitude, the machine-learned model using one or more respective second training samples of the plurality of training samples.
    Type: Application
    Filed: December 29, 2021
    Publication date: August 4, 2022
    Inventors: Mingxing Tan, Quoc V. Le
  • Publication number: 20220230048
    Abstract: Methods, systems, and apparatus, including computer-readable media, for scaling neural network architectures on hardware accelerators. A method includes receiving training data and information specifying target computing resources, and performing using the training data, a neural architecture search over a search space to identify an architecture for a base neural network. A plurality of scaling parameter values for scaling the base neural network can be identified, which can include repeatedly selecting a plurality of candidate scaling parameter values, and determining a measure of performance for the base neural network scaled according to the plurality of candidate scaling parameter values, in accordance with a plurality of second objectives including a latency objective. An architecture for a scaled neural network can be determined using the architecture of the base neural network scaled according to the plurality of scaling parameter values.
    Type: Application
    Filed: February 12, 2021
    Publication date: July 21, 2022
    Inventors: Andrew Li, Sheng Li, Mingxing Tan, Ruoming Pang, Liqun Cheng, Quoc V. Le, Norman Paul Jouppi
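    A toy sketch of the scaling-parameter search stage from this abstract: candidate scaling values are repeatedly sampled, each scaled network is "measured" (a synthetic accuracy/latency stand-in here), and the best accuracy within the latency objective is kept. All numbers are made up.
    ```python
    import random

    random.seed(0)

    def measure(depth_mult, width_mult):
        # Stand-in for training/benchmarking the scaled base network and
        # returning (accuracy, latency_ms); real values would come from
        # the target hardware accelerator.
        acc = 0.7 + 0.05 * depth_mult + 0.04 * width_mult + random.gauss(0, 0.01)
        latency = 10.0 * depth_mult * width_mult ** 2
        return acc, latency

    latency_budget_ms = 40.0
    best_acc, best = 0.0, None
    for _ in range(30):   # repeatedly sample candidate scaling parameters
        d = random.uniform(1.0, 2.0)
        w = random.uniform(1.0, 2.0)
        acc, lat = measure(d, w)
        if lat <= latency_budget_ms and acc > best_acc:  # latency objective
            best_acc, best = acc, (d, w)

    print(best_acc, best)
    ```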