Patents by Inventor Noam M. Shazeer

Noam M. Shazeer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS

Publication number: 20200342316

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

Type: Application

Filed: October 29, 2018

Publication date: October 29, 2020

Inventors: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
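
The generation loop described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: `generate`, `toy_copy_scores`, the `<sep>` and `<eos>` tokens, and the greedy (arg-max) selection rule are all assumptions made for the example, and the toy scoring function merely stands in for a self-attention decoder neural network.

```python
from typing import Callable, Dict, List

VOCAB = ["a", "b", "c", "<eos>"]  # hypothetical output vocabulary


def generate(
    input_tokens: List[str],
    score_fn: Callable[[List[str]], Dict[str, float]],
    eos: str = "<eos>",
    max_steps: int = 20,
) -> List[str]:
    """Generate output tokens one time step at a time."""
    output: List[str] = []
    for _ in range(max_steps):
        # Combined sequence: the input followed by the tokens generated so far.
        combined = list(input_tokens) + output
        # The decoder (here a stand-in) scores every possible output token.
        scores = score_fn(combined)
        # Assumption: greedy selection of the highest-scoring token.
        next_token = max(scores, key=scores.get)
        output.append(next_token)
        if next_token == eos:
            break
    return output


def toy_copy_scores(combined: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in model: copies the input, then emits <eos>."""
    sep = combined.index("<sep>")
    n_generated = len(combined) - sep - 1
    target = combined[n_generated] if n_generated < sep else "<eos>"
    return {tok: (1.0 if tok == target else 0.0) for tok in VOCAB}


print(generate(["a", "b", "<sep>"], toy_copy_scores))
```

Because the whole combined sequence is rescored at every step, the same network handles both the input and the partial output; a real system would likely cache intermediate activations rather than recompute them.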
Distributing tensor computations across computing devices

Patent number: 10796225

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing tensor computations across computing devices. One of the methods includes: receiving specification data that specifies a distribution of tensor computations among a plurality of computing devices, wherein each tensor computation (i) is defined to receive, as input, one or more respective input tensors each having one or more respective input dimensions, (ii) is defined to generate, as output, one or more respective output tensors each having one or more respective output dimensions, or both, wherein the specification data specifies a respective layout for each input and output tensor that assigns each dimension of the input or output tensor to one or more of the plurality of computing devices; assigning, based on the layouts for the input and output tensors, respective device-local operations to each of the computing devices; and causing the tensor computations to be executed.

Type: Grant

Filed: August 5, 2019

Date of Patent: October 6, 2020

Assignee: Google LLC

Inventor: Noam M. Shazeer
Multi-task multi-modal machine learning system

Patent number: 10789427

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for training a machine learning model to perform multiple machine learning tasks from multiple machine learning domains. One system includes a machine learning model that includes multiple input modality neural networks corresponding to respective different modalities and being configured to map received data inputs of the corresponding modality to mapped data inputs from a unified representation space; an encoder neural network configured to process mapped data inputs from the unified representation space to generate respective encoder data outputs; a decoder neural network configured to process encoder data outputs to generate respective decoder data outputs from the unified representation space; and multiple output modality neural networks corresponding to respective different modalities and being configured to map decoder data outputs to data outputs of the corresponding modality.

Type: Grant

Filed: November 19, 2019

Date of Patent: September 29, 2020

Assignee: Google LLC

Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Ashish Teku Vaswani
MIXTURE OF EXPERTS NEURAL NETWORKS

Publication number: 20200279150

Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.

Type: Application

Filed: May 20, 2020

Publication date: September 3, 2020

Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
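
The gating behavior described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the linear gate, the top-k selection, and the softmax over the selected experts are choices made for the example and are not stated in the abstract, and `moe_layer`, `gate_weights`, and `k` are hypothetical names.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def moe_layer(
    x: Vector,
    experts: Sequence[Callable[[Vector], Vector]],
    gate_weights: Sequence[Vector],
    k: int = 2,
) -> Vector:
    """Route the first-layer output x through the k highest-scoring experts."""
    # Assumption: one gate logit per expert, computed as a dot product with x.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Select the k experts with the largest logits.
    selected = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Assumption: weights are a softmax over the selected logits only.
    peak = max(logits[i] for i in selected)
    exps = [math.exp(logits[i] - peak) for i in selected]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Only the selected experts are evaluated on x.
    outputs = [experts[i](x) for i in selected]
    # Combine the expert outputs according to their weights.
    dim = len(outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, outputs)) for d in range(dim)]


experts = [lambda x: [1.0], lambda x: [3.0], lambda x: [100.0]]
gate = [[1.0], [1.0], [-5.0]]
print(moe_layer([1.0], experts, gate, k=2))
```

In this toy run the third expert receives a low gate score and is never evaluated, which is the source of the compute savings in a sparsely gated layer.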
Attention-based sequence transduction neural networks

Patent number: 10719764

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.

Type: Grant

Filed: September 3, 2019

Date of Patent: July 21, 2020

Assignee: Google LLC

Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
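
The encoder self-attention sub-layer described in the abstract above can be illustrated with a small sketch. The abstract does not specify the attention function; scaled dot-product attention with a single head is assumed here, and `self_attention`, `wq`, `wk`, and `wv` are hypothetical names for the example.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def self_attention(inputs: List[Vector], wq: Matrix, wk: Matrix, wv: Matrix) -> List[Vector]:
    """Single-head self-attention over the inputs at all positions."""
    # Queries, keys, and values are all derived from the same sub-layer inputs.
    queries = [matvec(wq, x) for x in inputs]
    keys = [matvec(wk, x) for x in inputs]
    values = [matvec(wv, x) for x in inputs]
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:  # one query per input position
        # Assumption: scaled dot-product compatibility scores.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # The output at this position is a weighted sum of the values.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


identity = [[1.0, 0.0], [0.0, 1.0]]
print(self_attention([[1.0, 0.0], [0.0, 1.0]], identity, identity, identity))
```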
Mixture of experts neural networks

Patent number: 10719761

Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.

Type: Grant

Filed: April 24, 2019

Date of Patent: July 21, 2020

Assignee: Google LLC

Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
Generating feature embeddings from a co-occurrence matrix

Patent number: 10685012

Abstract: Methods and systems, including computer programs encoded on computer storage media, for generating compressed representations from a co-occurrence matrix. A method includes obtaining a set of submatrices of a co-occurrence matrix, where each row of the co-occurrence matrix corresponds to a feature from a first feature vocabulary and each column of the co-occurrence matrix corresponds to a feature from a second feature vocabulary; selecting a submatrix, wherein the submatrix is associated with a particular row block and column block of the co-occurrence matrix; assigning respective d-dimensional initial row and column embedding vectors to each row and column from the particular row and column blocks, respectively; and determining a final row embedding vector and a final column embedding vector by iteratively adjusting the initial row embedding vectors and the initial column embedding vectors using the co-occurrence matrix.

Type: Grant

Filed: February 3, 2017

Date of Patent: June 16, 2020

Assignee: Google LLC

Inventors: Noam M. Shazeer, Colin Hearne Evans, Christopher Robert Waterson, Ryan P. Doherty
SUGGESTING AND/OR PROVIDING TARGETING CRITERIA FOR ADVERTISEMENTS

Publication number: 20200151760

Abstract: Keyword suggestions that are category-aware (and field-proven) may be used to help advertisers better target the serving of their ads, and may reduce unused ad spot inventory. The advertiser can enter ad information, such as a creative, a landing Webpage, other keywords, etc. for example. A keyword facility may use this entered ad information as seed information to infer one or more categories. It may then request that the advertiser confirm or deny some basic feedback information (e.g., categories, Webpage information, etc.). For example, an advertiser may be provided with candidate categories and may be asked to confirm (e.g., using checkboxes) which of the categories are relevant to their ad. Keywords may be determined using at least the categories. The determined keywords may be provided to the advertiser as suggested keywords, or may automatically populate ad serving constraint information as targeting keywords.

Type: Application

Filed: January 10, 2020

Publication date: May 14, 2020

Inventors: Ross Koningstein, Valentin Spitkovsky, Georges Harik, Noam M. Shazeer
SPEECH RECOGNITION WITH ATTENTION-BASED RECURRENT NEURAL NETWORKS

Publication number: 20200118554

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.

Type: Application

Filed: December 13, 2019

Publication date: April 16, 2020

Applicant: Google LLC

Inventors: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer
MULTI-TASK MULTI-MODAL MACHINE LEARNING SYSTEM

Publication number: 20200089755

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for training a machine learning model to perform multiple machine learning tasks from multiple machine learning domains. One system includes a machine learning model that includes multiple input modality neural networks corresponding to respective different modalities and being configured to map received data inputs of the corresponding modality to mapped data inputs from a unified representation space; an encoder neural network configured to process mapped data inputs from the unified representation space to generate respective encoder data outputs; a decoder neural network configured to process encoder data outputs to generate respective decoder data outputs from the unified representation space; and multiple output modality neural networks corresponding to respective different modalities and being configured to map decoder data outputs to data outputs of the corresponding modality.

Type: Application

Filed: November 19, 2019

Publication date: March 19, 2020

Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Ashish Teku Vaswani
PARALLEL DECODING USING TRANSFORMER MODELS

Publication number: 20200082226

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing parallel generation of output from an autoregressive sequence to sequence model. In one aspect, a blockwise parallel decoding method takes advantage of the fact that some architectures can score sequences in sublinear time. By generating predictions for multiple time steps at once then backing off to a longest prefix validated by the scoring model, the methods can substantially improve the speed of greedy decoding without compromising performance.

Type: Application

Filed: November 13, 2019

Publication date: March 12, 2020

Inventors: Noam M. Shazeer, Jakob D. Uszkoreit, Mitchell Thomas Stern
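
The propose-then-verify idea in the abstract above can be sketched as follows. This is a simplified illustration, not the claimed method: `blockwise_decode`, `propose`, and `greedy_next` are hypothetical names, the toy models are made up for the example, and verification runs in a sequential loop here, whereas the speedup described in the abstract depends on scoring the whole proposed block at once.

```python
from typing import Callable, List

TARGET = ["the", "cat", "sat", "on", "the", "mat", "<eos>"]  # toy data


def target_at(i: int) -> str:
    return TARGET[i] if i < len(TARGET) else "<eos>"


def greedy_next(context: List[str]) -> str:
    """Hypothetical scoring model: the token greedy decoding would pick next."""
    return target_at(len(context) - 1)  # context starts with "<bos>"


def propose(context: List[str], block_size: int) -> List[str]:
    """Hypothetical proposal model: first two guesses are right, the rest wrong."""
    pos = len(context) - 1
    return [target_at(pos + i) if i < 2 else "???" for i in range(block_size)]


def blockwise_decode(
    prompt: List[str],
    propose: Callable[[List[str], int], List[str]],
    greedy_next: Callable[[List[str]], str],
    eos: str = "<eos>",
    block_size: int = 4,
    max_new: int = 32,
) -> List[str]:
    out: List[str] = []
    while len(out) < max_new:
        context = list(prompt) + out
        # Predict several future tokens at once.
        for tok in propose(context, block_size):
            # Keep only the longest prefix the scoring model agrees with.
            if greedy_next(context) != tok:
                break
            context.append(tok)
            out.append(tok)
            if tok == eos or len(out) >= max_new:
                return out
        # Back off: take one token from the scoring model to guarantee progress.
        tok = greedy_next(context)
        out.append(tok)
        if tok == eos:
            return out
    return out


print(blockwise_decode(["<bos>"], propose, greedy_next))
```

Since every accepted token is one the scoring model would have chosen anyway, the result matches plain greedy decoding while needing fewer decoding iterations when proposals are accurate.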
Suggesting and/or providing targeting criteria for advertisements

Patent number: 10580033

Abstract: Keyword suggestions that are category-aware (and field-proven) may be used to help advertisers better target the serving of their ads, and may reduce unused ad spot inventory. The advertiser can enter ad information, such as a creative, a landing Webpage, other keywords, etc. for example. A keyword facility may use this entered ad information as seed information to infer one or more categories. It may then request that the advertiser confirm or deny some basic feedback information (e.g., categories, Webpage information, etc.). For example, an advertiser may be provided with candidate categories and may be asked to confirm (e.g., using checkboxes) which of the categories are relevant to their ad. Keywords may be determined using at least the categories. The determined keywords may be provided to the advertiser as suggested keywords, or may automatically populate ad serving constraint information as targeting keywords.

Type: Grant

Filed: May 16, 2017

Date of Patent: March 3, 2020

Assignee: Google LLC

Inventors: Ross Koningstein, Valentin Spitkovsky, Georges Harik, Noam M. Shazeer
DISTRIBUTING TENSOR COMPUTATIONS ACROSS COMPUTING DEVICES

Publication number: 20200042875

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing tensor computations across computing devices. One of the methods includes: receiving specification data that specifies a distribution of tensor computations among a plurality of computing devices, wherein each tensor computation (i) is defined to receive, as input, one or more respective input tensors each having one or more respective input dimensions, (ii) is defined to generate, as output, one or more respective output tensors each having one or more respective output dimensions, or both, wherein the specification data specifies a respective layout for each input and output tensor that assigns each dimension of the input or output tensor to one or more of the plurality of computing devices; assigning, based on the layouts for the input and output tensors, respective device-local operations to each of the computing devices; and causing the tensor computations to be executed.

Type: Application

Filed: August 5, 2019

Publication date: February 6, 2020

Inventor: Noam M. Shazeer
Speech recognition with attention-based recurrent neural networks

Patent number: 10540962

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.

Type: Grant

Filed: May 3, 2018

Date of Patent: January 21, 2020

Inventors: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer
Parallel decoding using autoregressive machine learning models

Patent number: 10521701

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing parallel generation of output from an autoregressive sequence to sequence model. In one aspect, a blockwise parallel decoding method takes advantage of the fact that some architectures can score sequences in sublinear time. By generating predictions for multiple time steps at once then backing off to a longest prefix validated by the scoring model, the methods can substantially improve the speed of greedy decoding without compromising performance.

Type: Grant

Filed: May 20, 2019

Date of Patent: December 31, 2019

Assignee: Google LLC

Inventors: Noam M. Shazeer, Jakob D. Uszkoreit, Mitchell Thomas Stern
ATTENTION-BASED SEQUENCE TRANSDUCTION NEURAL NETWORKS

Publication number: 20190392319

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.

Type: Application

Filed: September 3, 2019

Publication date: December 26, 2019

Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
Training recurrent neural networks to generate sequences

Patent number: 10504023

Abstract: This document generally describes a neural network training system, including one or more computers, that trains a recurrent neural network (RNN) to receive an input, e.g., an input sequence, and to generate a sequence of outputs from the input sequence. In some implementations, training can include, for each position after an initial position in a training target sequence, selecting a preceding output of the RNN to provide as input to the RNN at the position, including determining whether to select as the preceding output (i) a true output in a preceding position in the output order or (ii) a value derived from an output of the RNN for the preceding position in an output order generated in accordance with current values of the parameters of the recurrent neural network.

Type: Grant

Filed: June 6, 2016

Date of Patent: December 10, 2019

Assignee: Google LLC

Inventors: Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam M. Shazeer
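
The input-selection step described in the abstract above can be illustrated with a short sketch. The abstract only says that the system determines whether to feed the true preceding output or a value derived from the RNN's own preceding output; choosing between them at random with probability `p_truth` is an assumption for this example, as are the names `scheduled_inputs`, `predict`, and the `<s>` start token. The `predict` callable is a stateless stand-in for one RNN step.

```python
import random
from typing import Callable, List


def scheduled_inputs(
    targets: List[str],
    predict: Callable[[str], str],
    p_truth: float,
    rng: random.Random,
    start: str = "<s>",
) -> List[str]:
    """Return the token fed to the model at each position of a training sequence."""
    fed: List[str] = []
    predictions: List[str] = []
    for pos in range(len(targets)):
        if pos == 0:
            token = start  # the initial position has no preceding output
        elif rng.random() < p_truth:
            token = targets[pos - 1]  # (i) true output at the preceding position
        else:
            token = predictions[pos - 1]  # (ii) the model's own preceding output
        fed.append(token)
        predictions.append(predict(token))
    return fed


rng = random.Random(0)
print(scheduled_inputs(["a", "b", "c"], lambda tok: "x", p_truth=1.0, rng=rng))
print(scheduled_inputs(["a", "b", "c"], lambda tok: "x", p_truth=0.0, rng=rng))
```

With `p_truth=1.0` the sketch reduces to ordinary teacher forcing; lowering it exposes the model during training to the kind of inputs it will see at inference time.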
PARALLEL DECODING USING AUTOREGRESSIVE MACHINE LEARNING MODELS

Publication number: 20190354812

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing parallel generation of output from an autoregressive sequence to sequence model. In one aspect, a blockwise parallel decoding method takes advantage of the fact that some architectures can score sequences in sublinear time. By generating predictions for multiple time steps at once then backing off to a longest prefix validated by the scoring model, the methods can substantially improve the speed of greedy decoding without compromising performance.

Type: Application

Filed: May 20, 2019

Publication date: November 21, 2019

Inventors: Noam M. Shazeer, Jakob D. Uszkoreit, Mitchell Thomas Stern
Attention-based sequence transduction neural networks

Patent number: 10452978

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.

Type: Grant

Filed: June 28, 2018

Date of Patent: October 22, 2019

Assignee: Google LLC

Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
MIXTURE OF EXPERTS NEURAL NETWORKS

Publication number: 20190251423

Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.

Type: Application

Filed: April 24, 2019

Publication date: August 15, 2019

Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
