Patents by Inventor Krzysztof Stanislaw Maziarz

Krzysztof Stanislaw Maziarz has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO). Hedged code sketches illustrating the two techniques that recur in the abstracts below (straight-through Gumbel-softmax routing for multi-task parameter sharing, and mixture-of-experts gating) follow the listing.

  • Patent number: 11915120
    Abstract: Systems and methods for flexible parameter sharing for multi-task learning are provided. A training method can include obtaining a test input, selecting a particular task from one or more tasks, and training a multi-task machine-learned model for the particular task by performing a forward pass using the test input and one or more connection probability matrices to generate a sample distribution of test outputs, training the components of the machine-learned model based at least in part on the sample distribution, and performing a backward pass to train a connection probability matrix of the multi-task machine-learned model using a straight-through Gumbel-softmax approximation.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: February 27, 2024
    Assignee: Google LLC
    Inventors: Effrosyni Kokiopoulou, Krzysztof Stanislaw Maziarz, Andrea Gesmundo, Luciano Sbaiz, Gábor Bartók, Jesse Berent
  • Publication number: 20230419079
    Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
    Type: Application
    Filed: September 8, 2023
    Publication date: December 28, 2023
    Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
  • Patent number: 11790214
    Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: October 17, 2023
    Assignee: Google LLC
    Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
  • Publication number: 20210232895
    Abstract: Systems and methods for flexible parameter sharing for multi-task learning are provided. A training method can include obtaining a test input, selecting a particular task from one or more tasks, and training a multi-task machine-learned model for the particular task by performing a forward pass using the test input and one or more connection probability matrices to generate a sample distribution of test outputs, training the components of the machine-learned model based at least in part on the sample distribution, and performing a backward pass to train a connection probability matrix of the multi-task machine-learned model using a straight-through Gumbel-softmax approximation.
    Type: Application
    Filed: March 17, 2020
    Publication date: July 29, 2021
    Inventors: Effrosyni Kokiopoulou, Krzysztof Stanislaw Maziarz, Andrea Gesmundo, Luciano Sbaiz, Gábor Bartók, Jesse Berent
  • Publication number: 20200279150
    Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
    Type: Application
    Filed: May 20, 2020
    Publication date: September 3, 2020
    Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
  • Patent number: 10719761
    Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: July 21, 2020
    Assignee: Google LLC
    Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
  • Publication number: 20190251423
    Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
    Type: Application
    Filed: April 24, 2019
    Publication date: August 15, 2019
    Inventors: Noam M. Shazeer, Azalia Mirhoseini, Krzysztof Stanislaw Maziarz
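
The abstract of patent 11915120 (and publication 20210232895) describes learning connection probability matrices for flexible multi-task parameter sharing, trained with a straight-through Gumbel-softmax approximation so that discrete connection choices remain differentiable. The sketch below is a minimal, hedged illustration of only that straight-through sampling step, written in PyTorch; the shapes and names (num_tasks, num_components, connection_logits, the toy "usefulness" objective) are assumptions made for illustration and are not taken from the patent text.

```python
# Minimal sketch of straight-through Gumbel-softmax sampling of a connection
# matrix, the differentiable-routing trick named in the abstract above.
# All shapes, names, and the toy objective are illustrative assumptions.
import torch
import torch.nn.functional as F

num_tasks, num_components = 3, 5

# Learnable logits for the connection probability matrix:
# one row per task, one column per candidate component.
connection_logits = torch.randn(num_tasks, num_components, requires_grad=True)

# Forward pass: hard=True yields one-hot (discrete) connection choices, while
# the straight-through estimator routes gradients through the soft relaxation.
connections = F.gumbel_softmax(connection_logits, tau=1.0, hard=True, dim=-1)

# Toy objective standing in for the real task loss: reward connecting each
# task to components with a high (random) "usefulness" score.
usefulness = torch.randn(num_tasks, num_components)
loss = -(connections * usefulness).sum()

# Backward pass: gradients reach the logits even though the forward pass
# used discrete one-hot choices.
loss.backward()
print(connections)               # one-hot rows: the sampled connections
print(connection_logits.grad)    # non-zero thanks to the straight-through estimator
```

The patents describe a fuller training loop (per-task forward passes, a sample distribution of outputs, and training of the model components themselves); the snippet isolates only the gradient-estimation trick they name.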
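
The remaining entries (patents 10719761 and 11790214 and their related publications) describe a mixture-of-experts subnetwork in which a gating subsystem selects a subset of expert networks, weights each selected expert, and combines the weighted expert outputs before passing the result to the next layer. The sketch below is a minimal top-k gating layer in PyTorch that follows that description; the class name, expert architecture, and sizes (SimpleMoELayer, two-layer feed-forward experts, d_model, num_experts, k) are illustrative assumptions, not the patented implementation or any library API.

```python
# Minimal sketch of a Mixture-of-Experts (MoE) layer with top-k gating,
# loosely following the gating subsystem described in the abstracts above.
# Class name, expert architecture, and sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMoELayer(nn.Module):
    """Selects k experts per input, weights their outputs, and sums them."""

    def __init__(self, d_model: int, num_experts: int = 4, k: int = 2):
        super().__init__()
        self.k = k
        # Each "expert" is a small feed-forward network (a hypothetical choice).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )
        # Gating network: maps the first layer's output to one score per expert.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, first_layer_output: torch.Tensor) -> torch.Tensor:
        # Score every expert, keep only the top-k, and normalize their weights.
        scores = self.gate(first_layer_output)             # (batch, num_experts)
        top_scores, top_idx = scores.topk(self.k, dim=-1)  # (batch, k)
        weights = F.softmax(top_scores, dim=-1)

        # Combine the selected experts' outputs using the gating weights.
        moe_output = torch.zeros_like(first_layer_output)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                routed = top_idx[:, slot] == e              # inputs routed to expert e
                if routed.any():
                    expert_out = expert(first_layer_output[routed])
                    moe_output[routed] += weights[routed, slot].unsqueeze(-1) * expert_out
        return moe_output                                   # passed to the second layer


if __name__ == "__main__":
    layer = SimpleMoELayer(d_model=16, num_experts=4, k=2)
    x = torch.randn(8, 16)     # stand-in for the first neural network layer's output
    print(layer(x).shape)      # torch.Size([8, 16])
```

The double loop over slots and experts is written for readability; a batched implementation would group inputs by expert and keep the combination sparse so that unselected experts do no work.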