Patents by Inventor Thang Minh Luong

Thang Minh Luong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240160857
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, for each training example, the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Application
    Filed: January 25, 2024
    Publication date: May 16, 2024
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
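
The replaced-token detection objective in the entry above lends itself to a compact illustration. The following is a minimal sketch, assuming PyTorch; the model sizes, the fixed 15% masking rate, and the tiny generator/encoder stand-ins are illustrative assumptions, not the patent's specification (which also trains the generator with a masked-language-modeling loss, omitted here).

```python
import torch
import torch.nn as nn

VOCAB, DIM, MASK_ID = 1000, 64, 0  # hypothetical vocabulary size, width, mask token

# Tiny stand-ins for the "generator" (a small masked language model) and the
# encoder being pre-trained; real models would be Transformers.
generator = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, VOCAB))
encoder = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, 1))

def replaced_token_detection_loss(tokens):
    # 1) Mask out a random subset (here ~15%) of the original input tokens.
    masked = torch.rand(tokens.shape) < 0.15
    corrupted = tokens.masked_fill(masked, MASK_ID)
    # 2) Replace the masked positions with samples from the generator.
    with torch.no_grad():
        samples = torch.distributions.Categorical(logits=generator(corrupted)).sample()
    corrupted = torch.where(masked, samples, tokens)
    # 3) Train the encoder to predict, per token, whether it comes from the
    #    original data or is a replacement produced by the generator.
    is_replaced = (corrupted != tokens).float()
    logits = encoder(corrupted).squeeze(-1)
    return nn.functional.binary_cross_entropy_with_logits(logits, is_replaced)

loss = replaced_token_detection_loss(torch.randint(1, VOCAB, (8, 128)))
```
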
  • Publication number: 20240112088
    Abstract: Systems and methods are provided for vector-quantized image modeling using vision transformers and improved codebook handling. In particular, the present disclosure provides a Vector-quantized Image Modeling (VIM) approach that involves pretraining a machine learning model (e.g., Transformer model) to predict rasterized image tokens autoregressively. The discrete image tokens can be encoded from a learned Vision-Transformer-based VQGAN (example implementations of which can be referred to as ViT-VQGAN). The present disclosure proposes multiple improvements over vanilla VQGAN from architecture to codebook learning, yielding better efficiency and reconstruction fidelity. The improved ViT-VQGAN further improves vector-quantized image modeling tasks, including unconditional image generation, conditioned image generation (e.g., class-conditioned image generation), and unsupervised representation learning.
    Type: Application
    Filed: November 27, 2023
    Publication date: April 4, 2024
    Inventors: Jiahui Yu, Xin Li, Han Zhang, Vijay Vasudevan, Alexander Yeong-Shiuh Ku, Jason Michael Baldridge, Yuanzhong Xu, Jing Yu Koh, Thang Minh Luong, Gunjan Baid, Zirui Wang, Yonghui Wu
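
At the core of the entry above is vector quantization: continuous patch encodings from a ViT encoder are snapped to the nearest entries of a learned codebook, producing the discrete image tokens that the Transformer then predicts autoregressively. A minimal sketch of that lookup, assuming PyTorch; shapes and names are illustrative, and the patent's "improved codebook handling" is not reproduced here.

```python
import torch

def quantize(encodings, codebook):
    """Snap continuous encodings to their nearest codebook entries.

    encodings: (num_patches, dim) outputs of a ViT encoder (assumed)
    codebook:  (codebook_size, dim) learned code vectors
    """
    dists = torch.cdist(encodings, codebook)  # pairwise L2 distances
    token_ids = dists.argmin(dim=-1)          # discrete image tokens
    return token_ids, codebook[token_ids]     # ids + quantized vectors

ids, quantized = quantize(torch.randn(256, 32), torch.randn(1024, 32))
```
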
  • Patent number: 11922281
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model using teacher annealing.
    Type: Grant
    Filed: October 31, 2022
    Date of Patent: March 5, 2024
    Assignee: Google LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
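
The abstract above gives no mechanics, but teacher annealing, as described in the inventors' related research, interpolates the distillation target between the teacher's prediction and the gold label, shifting weight toward the gold label as training progresses. A hedged sketch assuming PyTorch; the linear schedule and soft-target cross-entropy are assumptions, not the claims.

```python
import torch

def annealed_target(gold_onehot, teacher_probs, step, total_steps):
    # Weight moves from the teacher's soft prediction toward the gold label.
    lam = min(step / total_steps, 1.0)
    return lam * gold_onehot + (1.0 - lam) * teacher_probs

def distillation_loss(student_logits, target_probs):
    # Cross-entropy of the student against the annealed soft target.
    log_probs = torch.log_softmax(student_logits, dim=-1)
    return -(target_probs * log_probs).sum(dim=-1).mean()
```
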
  • Patent number: 11914969
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, for each training example, the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Grant
    Filed: September 19, 2022
    Date of Patent: February 27, 2024
    Assignee: Google LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Publication number: 20230049747
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model using teacher annealing.
    Type: Application
    Filed: October 31, 2022
    Publication date: February 16, 2023
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Publication number: 20230015737
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, for each training example, the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Application
    Filed: September 19, 2022
    Publication date: January 19, 2023
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Publication number: 20220383206
    Abstract: Systems and methods can leverage task-specific unlabeled data to improve downstream performance in data-constrained scenarios. Given a target task, a first technique proposed herein, which can be referred to as task augmentation, uses unlabeled text from the target domain to synthesize a large amount of in-domain training data for an auxiliary task. A second technique provides a self-training algorithm, where a model learns to improve itself using its predictions on unlabeled examples.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: Thang Minh Luong, Tu Thanh Vu, Quoc V. Le, Grady Hayes Simon
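
The second technique in the entry above (self-training) can be sketched as a loop in which the model pseudo-labels unlabeled examples with its own confident predictions and retrains on them. This is a sketch under assumptions: `predict_proba` and `fit` are hypothetical model methods, and the confidence threshold is illustrative.

```python
def self_train(model, labeled, unlabeled, rounds=3, threshold=0.9):
    for _ in range(rounds):
        pseudo = []
        for x in unlabeled:
            probs = model.predict_proba(x)  # hypothetical model API
            label = max(range(len(probs)), key=probs.__getitem__)
            if probs[label] >= threshold:
                pseudo.append((x, label))   # keep only confident predictions
        model = model.fit(labeled + pseudo) # retrain on labeled + pseudo-labeled
    return model
```
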
  • Patent number: 11501168
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for structuring and training a recurrent neural network. This specification describes a technique that improves the ability of recurrent neural networks to capture long-term dependencies by adding an unsupervised auxiliary loss at one or more anchor points to the original objective. This auxiliary loss forces the network to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full backpropagation through time.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: November 15, 2022
    Assignee: Google LLC
    Inventors: Andrew M. Dai, Quoc V. Le, Hoang Trieu Trinh, Thang Minh Luong
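
The auxiliary loss in the entry above can be illustrated with a small reconstruction head: at a randomly chosen anchor point, the recurrent network's state must reproduce the subsequence leading up to that anchor. A minimal sketch assuming PyTorch; the LSTM sizes, segment length, and mean-squared-error objective are illustrative assumptions.

```python
import random
import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)  # main network
aux_decoder = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
readout = nn.Linear(64, 32)

def reconstruction_auxiliary_loss(sequence, segment_len=8):
    # sequence: (batch, time, 32); pick a random anchor point.
    anchor = random.randrange(segment_len, sequence.size(1))
    _, state = rnn(sequence[:, :anchor])                # encode up to the anchor
    segment = sequence[:, anchor - segment_len:anchor]  # events before the anchor
    decoded, _ = aux_decoder(segment, state)            # decode from anchor state
    return nn.functional.mse_loss(readout(decoded), segment)  # reconstruct them

loss = reconstruction_auxiliary_loss(torch.randn(4, 100, 32))
```
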
  • Patent number: 11488067
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model using teacher annealing.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: November 1, 2022
    Assignee: Google LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Patent number: 11481609
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for incorporating a computationally efficient expressive output layer in a neural network. The output layer is configured to map a received hidden state to a probability distribution over a vocabulary of possible outputs by generating, from the hidden state, a respective context embedding for each of a plurality of gates; for each of the possible outputs in the vocabulary, computing a gated logit for the possible output by applying an output embedding for the possible output to a gate-weighted sum of the context embeddings; and generating the probability distribution over the vocabulary of possible outputs by applying a softmax to the gated logits for the possible outputs in the vocabulary.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: October 25, 2022
    Assignee: Google LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Zhilin Yang
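
A hedged sketch of the output layer in the entry above, assuming PyTorch: the hidden state yields one context embedding per gate, the context embeddings are combined by a gate-weighted sum, and each vocabulary item's logit is the dot product of its output embedding with that sum. The gate computation is simplified to a single shared softmax over gates; the patent's per-output gating is not reproduced.

```python
import torch
import torch.nn as nn

K, DIM, VOCAB = 4, 64, 1000            # gates, hidden width, vocabulary (assumed)
to_contexts = nn.Linear(DIM, K * DIM)  # hidden state -> K context embeddings
to_gates = nn.Linear(DIM, K)           # hidden state -> gate logits
output_emb = nn.Embedding(VOCAB, DIM)  # one output embedding per vocabulary item

def output_distribution(hidden):       # hidden: (batch, DIM)
    contexts = to_contexts(hidden).view(-1, K, DIM)
    gates = torch.softmax(to_gates(hidden), dim=-1)      # (batch, K)
    mixed = (gates.unsqueeze(-1) * contexts).sum(dim=1)  # gate-weighted sum
    logits = mixed @ output_emb.weight.T                 # gated logits
    return torch.softmax(logits, dim=-1)                 # distribution over vocab
```
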
  • Patent number: 11449684
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, for each training example, the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: September 20, 2022
    Assignee: Google LLC
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Publication number: 20220215209
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving training data comprising a plurality of unlabeled training inputs and a plurality of labeled training inputs; generating augmented training data, comprising generating, for each of the plurality of unlabeled training inputs, a respective augmented training input by applying a data augmentation technique to the unlabeled training input; and training the machine learning model on the augmented training data. In particular, but not exclusively, the model may be trained for perceptual tasks (e.g., tasks relating to vision or speech).
    Type: Application
    Filed: April 24, 2020
    Publication date: July 7, 2022
    Inventors: Thang Minh Luong, Quoc V. Le, Qizhe Xie, Zihang Dai
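
One common way to use such augmented data, in the spirit of the consistency-training literature this entry belongs to, is to ask the model to make the same prediction on an unlabeled input and on its augmented copy, alongside the ordinary supervised loss. A sketch assuming PyTorch; the `augment` function, the KL-divergence form, and the loss weight are assumptions, not the claims.

```python
import torch
import torch.nn.functional as F

def consistency_training_loss(model, labeled_x, labeled_y, unlabeled_x,
                              augment, weight=1.0):
    supervised = F.cross_entropy(model(labeled_x), labeled_y)
    with torch.no_grad():
        target = F.softmax(model(unlabeled_x), dim=-1)         # original input
    pred = F.log_softmax(model(augment(unlabeled_x)), dim=-1)  # augmented input
    consistency = F.kl_div(pred, target, reduction="batchmean")
    return supervised + weight * consistency
```
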
  • Publication number: 20220083840
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, used to implement a self-training technique for generating neural network (NN) models. A first model is generated in response to training a first NN using labeled data. A respective pseudo label is generated for each item of unlabeled data when items of unlabeled data are processed using the first model. A second NN is used to process each item of a combined dataset to train the second NN. The combined dataset includes items of labeled data and a corresponding item for each respective pseudo label. Attributes of items in the combined dataset are modified to inject noise into the combined dataset when the second NN is trained. A second model is generated after the second NN is trained by processing items in the combined dataset, including processing items that represent the noise injected into the combined dataset.
    Type: Application
    Filed: September 11, 2020
    Publication date: March 17, 2022
    Inventors: Thang Minh Luong, Quoc V. Le, Qizhe Xie
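
The recipe in the entry above reduces to two steps: pseudo-label the unlabeled items with the first model, then train the second network on the combined dataset with noise injected. A minimal sketch assuming PyTorch, with mini-batch tensors and a caller-supplied `noise` function standing in for the attribute modifications the abstract describes.

```python
import torch
import torch.nn.functional as F

def pseudo_label(first_model, unlabeled_batches):
    # Use the first model's predictions as pseudo labels.
    with torch.no_grad():
        return [(x, first_model(x).argmax(dim=-1)) for x in unlabeled_batches]

def train_second_model(second_model, labeled_batches, pseudo_batches,
                       optimizer, noise):
    # Train on labeled + pseudo-labeled data, injecting noise into the inputs.
    for x, y in labeled_batches + pseudo_batches:
        loss = F.cross_entropy(second_model(noise(x)), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return second_model
```
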
  • Publication number: 20220067304
    Abstract: Systems and methods are provided for training and using energy-based language models such as cloze language models. In particular, one aspect of the present disclosure is directed to an energy-based cloze language model for representation learning over text. In some instances, the models provided herein can be referred to as the “Electric” model. Like the BERT model, the example models proposed herein are conditional generative models of tokens given their contexts. However, the example models proposed herein do not mask text or output a full distribution over tokens that could occur in a context. Instead, they assign a scalar energy score to each input token. Another aspect of the present disclosure provides techniques to train the proposed models to assign low energies to data tokens and high energies to other tokens using an algorithm based on noise-contrastive estimation.
    Type: Application
    Filed: August 27, 2021
    Publication date: March 3, 2022
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
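
The training idea in the entry above can be sketched as a binary discrimination between data tokens and samples from a noise distribution, with the negated energy serving as the classifier logit. This simplifies the noise-contrastive estimation objective (the correction by the noise distribution's log-probability is omitted); assuming PyTorch, with `energy_fn` a hypothetical stand-in for the model.

```python
import torch
import torch.nn.functional as F

def nce_style_loss(energy_fn, data_tokens, noise_tokens):
    # energy_fn maps a batch of tokens-in-context to scalar energies; lower
    # energy should mean "more plausible under the model".
    e_data = energy_fn(data_tokens)    # (batch,)
    e_noise = energy_fn(noise_tokens)  # (batch,)
    logits = torch.cat([-e_data, -e_noise])  # low energy -> "real" class
    labels = torch.cat([torch.ones_like(e_data), torch.zeros_like(e_noise)])
    return F.binary_cross_entropy_with_logits(logits, labels)
```
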
  • Patent number: 11080589
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence including a respective output at each of multiple output time steps from respective encoded representations of inputs in an input sequence. The method includes, for each output time step, starting from the position, in the input order, of the encoded representation that was selected as the preceding context vector at the preceding output time step, traversing the encoded representations until an encoded representation is selected as the current context vector at the output time step. A decoder recurrent neural network processes the current context vector and the preceding output at the preceding output time step to generate a respective output score for each possible output and to update its hidden state. An output is selected for the output time step using the output scores.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: August 3, 2021
    Assignee: Google LLC
    Inventors: Ron J. Weiss, Thang Minh Luong, Peter J. Liu, Colin Abraham Raffel, Douglas Eck
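
The selection loop in the entry above can be sketched as follows, assuming PyTorch: at each output step the decoder scans the encoded inputs forward from the previously selected position and stops at the first one whose selection score clears a threshold; that encoding becomes the current context vector. The scoring network, the 0.5 threshold, and the greedy readout are illustrative assumptions, not the patent's exact mechanism.

```python
import torch
import torch.nn as nn

DIM, VOCAB = 64, 100
cell = nn.GRUCell(DIM + VOCAB, DIM)  # decoder recurrent cell (assumed GRU)
select = nn.Linear(2 * DIM, 1)       # scores "select this encoding as context?"
readout = nn.Linear(DIM, VOCAB)      # hidden state -> output scores

def decode(encodings, steps):
    # encodings: list of (1, DIM) encoded input representations, in input order
    hidden, prev_out = torch.zeros(1, DIM), torch.zeros(1, VOCAB)
    pos, outputs = 0, []
    for _ in range(steps):
        # Traverse forward from the previously selected position until an
        # encoding is selected as the current context vector.
        while pos < len(encodings) - 1 and torch.sigmoid(
                select(torch.cat([hidden, encodings[pos]], dim=-1))) < 0.5:
            pos += 1
        context = encodings[pos]
        hidden = cell(torch.cat([context, prev_out], dim=-1), hidden)
        token = readout(hidden).argmax(dim=-1)           # pick via output scores
        prev_out = nn.functional.one_hot(token, VOCAB).float()
        outputs.append(int(token))
    return outputs

out = decode([torch.randn(1, DIM) for _ in range(10)], steps=5)
```
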
  • Publication number: 20210089724
    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, for each training example, the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
    Type: Application
    Filed: September 21, 2020
    Publication date: March 25, 2021
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Publication number: 20200364543
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for incorporating a computationally efficient expressive output layer in a neural network. The output layer is configured to map a received hidden state to a probability distribution over a vocabulary of possible outputs by generating, from the hidden state, a respective context embedding for each of a plurality of gates; for each of the possible outputs in the vocabulary, computing a gated logit for the possible output by applying an output embedding for the possible output to a gate-weighted sum of the context embeddings; and generating the probability distribution over the vocabulary of possible outputs by applying a softmax to the gated logits for the possible outputs in the vocabulary.
    Type: Application
    Filed: May 13, 2020
    Publication date: November 19, 2020
    Inventors: Thang Minh Luong, Quoc V. Le, Zhilin Yang
  • Publication number: 20200364617
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model using teacher annealing.
    Type: Application
    Filed: May 11, 2020
    Publication date: November 19, 2020
    Inventors: Thang Minh Luong, Quoc V. Le, Kevin Stefan Clark
  • Publication number: 20190332919
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence including a respective output at each of multiple output time steps from respective encoded representations of inputs in an input sequence. The method includes, for each output time step, starting from the position, in the input order, of the encoded representation that was selected as the preceding context vector at the preceding output time step, traversing the encoded representations until an encoded representation is selected as the current context vector at the output time step. A decoder recurrent neural network processes the current context vector and the preceding output at the preceding output time step to generate a respective output score for each possible output and to update its hidden state. An output is selected for the output time step using the output scores.
    Type: Application
    Filed: July 8, 2019
    Publication date: October 31, 2019
    Inventors: Ron J. Weiss, Thang Minh Luong, Peter J. Liu, Colin Abraham Raffel, Douglas Eck
  • Publication number: 20190251449
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for structuring and training a recurrent neural network. This specification describes a technique that improves the ability of recurrent neural networks to capture long-term dependencies by adding an unsupervised auxiliary loss at one or more anchor points to the original objective. This auxiliary loss forces the network to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full backpropagation through time.
    Type: Application
    Filed: February 11, 2019
    Publication date: August 15, 2019
    Inventors: Andrew M. Dai, Quoc V. Le, Hoang Trieu Trinh, Thang Minh Luong