Patents Assigned to DeepMind Technologies
-
Patent number: 11714996
Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also described is a neural network system, implemented by one or more computers, for representing a space of probabilistic motor primitives. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.
Type: Grant
Filed: July 25, 2022
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov
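A minimal Python sketch of the linear-feedback idea in the abstract: the executed action is the expert's recorded action plus a correction that is linear in the deviation from the expert's state, with the gain derived from a state-action Jacobian. The pseudo-inverse gain and the toy dynamics below are assumptions for illustration, not the patented construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_feedback_action(expert_action, expert_state, current_state, jacobian):
    """Stabilise an expert's recorded action with linear state feedback.

    The gain K = -pinv(J) is chosen so that the induced state change opposes
    the deviation from the expert's state; this choice is an assumption made
    for illustration.
    """
    K = -np.linalg.pinv(jacobian)                  # feedback gain from the Jacobian
    return expert_action + K @ (current_state - expert_state)

# Toy setting: the state responds to actions through J, so J is the state-action Jacobian.
J = rng.normal(size=(4, 2)) * 0.5
expert_state, expert_action = rng.normal(size=4), rng.normal(size=2)
perturbed_state = expert_state + 0.1 * rng.normal(size=4)
print(linear_feedback_action(expert_action, expert_state, perturbed_state, J))
```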
-
Patent number: 11714993
Abstract: Methods, systems, and apparatus for classifying a new example using a comparison set of comparison examples. One method includes maintaining a comparison set, the comparison set including comparison examples and a respective label vector for each of the comparison examples, each label vector including a respective score for each label in a predetermined set of labels; receiving a new example; determining a respective attention weight for each comparison example by applying a neural network attention mechanism to the new example and to the comparison examples; and generating a respective label score for each label in the predetermined set of labels from, for each of the comparison examples, the respective attention weight for the comparison example and the respective label vector for the comparison example, in which the respective label score for each of the labels represents a likelihood that the label is a correct label for the new example.
Type: Grant
Filed: April 6, 2021
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Charles Blundell, Oriol Vinyals
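A hedged sketch of the comparison-set classification the abstract describes: attention weights come from a similarity between embeddings (cosine similarity and a softmax are common choices assumed here, not fixed by the abstract), and label scores are the attention-weighted sum of the comparison label vectors. The `embed` function and toy data are placeholders.

```python
import numpy as np

def attention_classify(new_example, comparison_examples, label_vectors, embed):
    """Label a new example from a comparison set via attention weights."""
    q = embed(new_example)                      # embedding of the new example
    keys = np.stack([embed(x) for x in comparison_examples])
    # Cosine similarity between the new example and each comparison example.
    sims = keys @ q / (np.linalg.norm(keys, axis=1) * np.linalg.norm(q) + 1e-8)
    # Softmax over similarities gives one attention weight per comparison example.
    w = np.exp(sims - sims.max())
    w /= w.sum()
    # Label scores: attention-weighted combination of the comparison label vectors.
    return w @ np.stack(label_vectors)

# Toy usage with an identity "embedding" and one-hot label vectors.
rng = np.random.default_rng(0)
comparison = [rng.normal(size=4) for _ in range(6)]
labels = [np.eye(3)[i % 3] for i in range(6)]
scores = attention_classify(rng.normal(size=4), comparison, labels, embed=lambda x: x)
print(scores, scores.argmax())
```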
-
Patent number: 11704541
Abstract: There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.
Type: Grant
Filed: October 29, 2018
Date of Patent: July 18, 2023
Assignee: DeepMind Technologies Limited
Inventors: Yujia Li, Christopher James Dyer, Oriol Vinyals
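A small sketch of sampling a graph through a sequence of node- and edge-generation decisions, as the abstract describes. The decision probabilities are stubbed out; in the patented system they would come from the neural networks representing the distribution over decision sequences.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_node_prob(nodes, edges):
    # Placeholder for a learned network; a real system conditions on the partial graph.
    return 0.9 if len(nodes) < 8 else 0.1

def add_edge_prob(nodes, edges, new_node, candidate):
    # Placeholder edge probability; a trained network would score this node pair.
    return 0.4

def sample_graph(max_nodes=12):
    """Sample a graph by a sequence of node- and edge-generation decisions."""
    nodes, edges = [], set()
    while len(nodes) < max_nodes and rng.random() < add_node_prob(nodes, edges):
        v = len(nodes)
        nodes.append(v)
        for u in nodes[:-1]:  # decide, for each existing node, whether to connect it
            if rng.random() < add_edge_prob(nodes, edges, v, u):
                edges.add((u, v))
    return nodes, edges

print(sample_graph())
```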
-
Patent number: 11693627
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using neural networks having contiguous sparsity patterns. One of the methods includes storing a first parameter matrix of a neural network having a contiguous sparsity pattern in storage associated with a computing device. The computing device performs an inference pass of the neural network to generate an output vector, including reading, from the storage associated with the computing device, one or more activation values from the input vector, reading, from the storage associated with the computing device, a block of non-zero parameter values, and multiplying each of the one or more activation values by one or more of the block of non-zero parameter values.
Type: Grant
Filed: February 11, 2019
Date of Patent: July 4, 2023
Assignee: DeepMind Technologies Limited
Inventors: Karen Simonyan, Nal Emmerich Kalchbrenner, Erich Konrad Elsen
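A sketch of why contiguous (block) sparsity helps at inference time: only dense blocks of non-zero parameters are stored, and each block multiplies a contiguous slice of activations. The storage layout and block shapes below are illustrative assumptions, not the patented kernel.

```python
import numpy as np

def block_sparse_matvec(blocks, x):
    """Multiply a block-sparse parameter matrix by an activation vector.

    `blocks` is a list of (row, col, dense_block) triples: only contiguous
    blocks of non-zero parameters are stored, so the inference pass reads a
    slice of activations and one dense block of weights at a time.
    """
    n_rows = max(r + b.shape[0] for r, _, b in blocks)
    y = np.zeros(n_rows)
    for row, col, block in blocks:
        h, w = block.shape
        # Read a contiguous slice of activations and one contiguous block of weights.
        y[row:row + h] += block @ x[col:col + w]
    return y

# An 8x8 matrix stored as two 4x4 non-zero blocks on the diagonal.
rng = np.random.default_rng(0)
blocks = [(0, 0, rng.normal(size=(4, 4))), (4, 4, rng.normal(size=(4, 4)))]
x = rng.normal(size=8)
dense = np.zeros((8, 8))
dense[:4, :4], dense[4:, 4:] = blocks[0][2], blocks[1][2]
assert np.allclose(block_sparse_matvec(blocks, x), dense @ x)
```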
-
Patent number: 11676035
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. The neural network has a plurality of differentiable weights and a plurality of non-differentiable weights. One of the methods includes determining trained values of the plurality of differentiable weights and the non-differentiable weights by repeatedly performing operations that include determining an update to the current values of the plurality of differentiable weights using a machine learning gradient-based training technique and determining, using an evolution strategies (ES) technique, an update to the current values of a plurality of distribution parameters.
Type: Grant
Filed: January 23, 2020
Date of Patent: June 13, 2023
Assignee: DeepMind Technologies Limited
Inventors: Karel Lenc, Karen Simonyan, Tom Schaul, Erich Konrad Elsen
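A toy illustration of the hybrid training scheme in the abstract: differentiable weights take gradient steps, while non-differentiable weights are handled by an evolution strategies update to the parameters (here, the mean) of a Gaussian search distribution. The objective, population size, and learning rates are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(diff_w, nondiff_w):
    # Toy objective: differentiable in diff_w; nondiff_w enters through a step function.
    return float(np.sum(diff_w ** 2) + np.sum(np.sign(nondiff_w)))

# Differentiable weights: plain gradient descent (analytic gradient of the toy loss).
diff_w = rng.normal(size=5)
# Non-differentiable weights: evolution strategies on the mean of a Gaussian
# search distribution (the "distribution parameters" of the abstract).
mu, sigma, pop, lr_es = rng.normal(size=5), 0.1, 16, 0.05

for step in range(200):
    diff_w -= 0.1 * 2 * diff_w                      # gradient-based update
    eps = rng.normal(size=(pop, mu.size))           # ES perturbations
    returns = np.array([-loss(diff_w, mu + sigma * e) for e in eps])
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    mu += lr_es / (pop * sigma) * eps.T @ returns   # vanilla ES gradient estimate

print(loss(diff_w, mu))
```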
-
Patent number: 11675855
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for re-ranking a collection of documents according to a first metric and subject to a constraint on a function of one or more second metrics. One of the methods includes: obtaining, for each document in the first collection of documents, a respective first metric value corresponding to the first metric and respective one or more second metric values corresponding to the one or more second metrics; re-ranking the first collection of documents, comprising: determining the constraint on the function of one or more second metrics by computing a first threshold value using a variable threshold function that takes as input second metric values for the documents in the first collection of documents; and determining the re-ranking for the first collection of documents by solving a constrained optimization for the first metric constrained by the first threshold value.
Type: Grant
Filed: November 18, 2020
Date of Patent: June 13, 2023
Assignee: DeepMind Technologies Limited
Inventors: Anton Zhernov, Krishnamurthy Dvijotham, Xiaohong Gong, Amogh S. Asgekar
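A rough sketch of re-ranking under a variable threshold: the constraint threshold is computed from the collection's own second-metric values (the median here, as an assumed choice of threshold function), and the constrained optimization over the first metric is solved greedily rather than exactly.

```python
import numpy as np

def rerank(first_metric, second_metric, threshold_fn, k):
    """Pick a top-k re-ranking by the first metric subject to a second-metric constraint.

    The constraint: the mean second-metric value of the selected documents must
    reach a threshold computed by `threshold_fn` from the collection itself.
    The greedy feasibility check below is a sketch, not an exact solver.
    """
    threshold = threshold_fn(second_metric)
    order = list(np.argsort(-first_metric))          # best first-metric value first
    chosen = []
    for idx in order:
        candidate = chosen + [idx]
        remaining = k - len(candidate)
        # Optimistically fill the rest with the best remaining second-metric values.
        pool = [i for i in order if i not in candidate]
        best_rest = sorted(second_metric[pool], reverse=True)[:remaining]
        if (sum(second_metric[candidate]) + sum(best_rest)) / k >= threshold:
            chosen = candidate
        if len(chosen) == k:
            break
    return [int(i) for i in chosen]

rng = np.random.default_rng(0)
relevance, quality = rng.random(10), rng.random(10)
print(rerank(relevance, quality, threshold_fn=lambda v: float(np.median(v)), k=5))
```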
-
Patent number: 11662210
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment. The action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.
Type: Grant
Filed: May 18, 2022
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
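A forward-pass sketch of the two-network arrangement in the abstract: a grid cell network maps velocity to a grid cell representation and a position estimate, and an action selection network consumes the grid cell representation together with an observation. The tiny MLPs, sizes, and random inputs are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    return [(rng.normal(size=(m, n)) * 0.1, np.zeros(n)) for m, n in zip(sizes, sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# Grid cell network: velocity -> grid cell representation -> position estimate.
grid_net = mlp([3, 64, 32])        # input: (dx, dy, angular velocity); output: grid code
pos_head = mlp([32, 2])            # linear readout of the agent's (x, y) position
# Action selection network: grid code + observation -> action selection output.
policy_net = mlp([32 + 16, 64, 4])

velocity = rng.normal(size=3)
observation = rng.normal(size=16)
grid_code = forward(grid_net, velocity)            # grid cell representation
position_estimate = forward(pos_head, grid_code)   # estimate of the agent's position
action_scores = forward(policy_net, np.concatenate([grid_code, observation]))
print(position_estimate, action_scores.argmax())
```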
-
Patent number: 11663441
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network, wherein the action selection policy neural network is configured to process an observation characterizing a state of an environment to generate an action selection policy output, wherein the action selection policy output is used to select an action to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining an observation characterizing a state of the environment subsequent to the agent performing a selected action; generating a latent representation of the observation; processing the latent representation of the observation using a discriminator neural network to generate an imitation score; determining a reward from the imitation score; and adjusting the current values of the action selection policy neural network parameters based on the reward using a reinforcement learning training technique.
Type: Grant
Filed: September 27, 2019
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: Scott Ellison Reed, Yusuf Aytar, Ziyu Wang, Tom Paine, Sergio Gomez Colmenarejo, David Budden, Tobias Pfaff, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Alexander Novikov
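An illustrative sketch of turning a discriminator's imitation score into a reward, as the abstract outlines. The encoder, discriminator, and the log-score reward mapping are assumptions; only the latent-to-score-to-reward flow is taken from the abstract, and a one-sample REINFORCE-style surrogate stands in for the reinforcement learning technique.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in networks: an encoder producing a latent representation of the
# observation, and a discriminator scoring how "expert-like" that latent is.
W_enc = rng.normal(size=(16, 8)) * 0.1
w_disc = rng.normal(size=8)

def imitation_reward(observation):
    latent = np.tanh(observation @ W_enc)          # latent representation of the observation
    score = sigmoid(latent @ w_disc)               # discriminator's imitation score
    return float(np.log(score + 1e-8))             # one common score -> reward mapping (assumed)

log_prob_selected_action = -1.2                    # log pi(a | s) for the action that was taken
reward = imitation_reward(rng.normal(size=16))
surrogate_loss = -log_prob_selected_action * reward   # minimised to adjust the policy parameters
print(reward, surrogate_loss)
```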
-
Patent number: 11663475
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
Type: Grant
Filed: September 15, 2022
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
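A minimal sketch of pairing an action selection network for continuous actions with a distribution Q network: the critic outputs a categorical distribution over return atoms, and the actor is nudged towards actions with higher expected return under that distribution. Linear stand-in networks, the atom support, and a finite-difference gradient are assumptions to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

atoms = np.linspace(-1.0, 1.0, 11)                 # support of the return distribution
W_actor = rng.normal(size=(8, 2)) * 0.1            # observation -> continuous action
W_critic = rng.normal(size=(8 + 2, atoms.size)) * 0.1

def actor(obs):
    return np.tanh(obs @ W_actor)

def critic_distribution(obs, action):
    logits = np.concatenate([obs, action]) @ W_critic
    p = np.exp(logits - logits.max())
    return p / p.sum()

def expected_return(obs, action):
    return float(critic_distribution(obs, action) @ atoms)

# Training signal for the actor: push the action towards higher expected return
# under the distributional critic (finite-difference gradient for brevity).
obs = rng.normal(size=8)
a = actor(obs)
eps = 1e-4
grad = np.array([(expected_return(obs, a + eps * np.eye(2)[i]) -
                  expected_return(obs, a - eps * np.eye(2)[i])) / (2 * eps)
                 for i in range(2)])
print(a, expected_return(obs, a), grad)
```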
-
Patent number: 11651208
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. A reinforcement learning neural network selects actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The reinforcement learning neural network has at least one input to receive an input observation characterizing a state of the environment and at least one output for determining an action to be performed by the agent in response to the input observation. The system includes a reward function network coupled to the reinforcement learning neural network. The reward function network has an input to receive reward data characterizing a reward provided by one or more states of the environment and is configured to determine a reward function to provide one or more target values for training the reinforcement learning neural network.
Type: Grant
Filed: May 22, 2018
Date of Patent: May 16, 2023
Assignee: DeepMind Technologies Limited
Inventors: Zhongwen Xu, Hado Phillip van Hasselt, Joseph Varughese Modayil, Andre da Motta Salles Barreto, David Silver
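A toy sketch of the coupling the abstract describes: a reward function network maps reward data to a learned reward, which then supplies target values (here a one-step TD target, as an assumed example) for training the reinforcement learning network. The linear reward network and feature sizes are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "reward function network": a linear map from reward data (the raw
# environment reward plus a state feature) to a scalar learned reward.
w_reward = rng.normal(size=3) * 0.1

def reward_fn(raw_reward, state_features):
    return float(np.concatenate([[raw_reward], state_features]) @ w_reward)

# The learned reward then supplies the target values used to train the RL
# network, e.g. a one-step temporal-difference target for a value estimate.
def td_target(raw_reward, state_features, next_value, gamma=0.99):
    return reward_fn(raw_reward, state_features) + gamma * next_value

print(td_target(raw_reward=1.0, state_features=rng.normal(size=2), next_value=0.5))
```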
-
Patent number: 11636283
Abstract: A variational autoencoder (VAE) neural network system, comprising an encoder neural network to encode an input data item to define a posterior distribution for a set of latent variables, and a decoder neural network to generate an output data item representing values of a set of latent variables sampled from the posterior distribution. The system is configured for training with an objective function including a term dependent on a difference between the posterior distribution and a prior distribution. The prior and posterior distributions are arranged so that they cannot be matched to one another. The VAE system may be used for compressing and decompressing data.
Type: Grant
Filed: June 1, 2020
Date of Patent: April 25, 2023
Assignee: DeepMind Technologies Limited
Inventors: Benjamin Poole, Aaron Gerard Antonius van den Oord, Ali Razavi-Nematollahi, Oriol Vinyals
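A sketch of the key property in the abstract: if the posterior is constrained so that it can never equal the prior, the KL term of the VAE objective has a strictly positive floor. Squashing the posterior standard deviation into a range that excludes the prior's value is one way to arrange this, shown here as an assumption rather than the patented construction.

```python
import numpy as np

def gaussian_kl(mu_q, sigma_q, mu_p=0.0, sigma_p=1.0):
    """KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ) per latent dimension."""
    return (np.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2) - 0.5)

# Constrain the posterior standard deviation to a range excluding the prior's
# value of 1, so the posterior can never match the prior and the KL term in the
# training objective keeps a strictly positive floor.
def constrained_sigma(raw, low=0.2, high=0.8):
    return low + (high - low) / (1.0 + np.exp(-raw))   # sigmoid squashed into [low, high]

rng = np.random.default_rng(0)
mu_q = rng.normal(size=8) * 0.01          # encoder mean output (nearly zero)
sigma_q = constrained_sigma(rng.normal(size=8))
kl_term = gaussian_kl(mu_q, sigma_q).sum()
print(kl_term)                            # strictly > 0 even if the encoder "collapses"
```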
-
Patent number: 11636347
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining a graph of nodes and edges that represents an interaction history of the agent with the environment; generating an encoded representation of the graph representing the interaction history of the agent with the environment; processing an input based on the encoded representation of the graph using an action selection neural network, in accordance with current values of action selection neural network parameters, to generate an action selection output; and selecting an action from a plurality of possible actions to be performed by the agent using the action selection output generated by the action selection neural network.
Type: Grant
Filed: January 22, 2020
Date of Patent: April 25, 2023
Assignee: DeepMind Technologies Limited
Inventors: Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli
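A small sketch of encoding a graph-structured interaction history and feeding the encoding to an action selection head. Sum-style message passing with mean pooling is a common encoder choice assumed for illustration; the weights, sizes, and example graph are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_graph(node_feats, edges, W_msg, rounds=2):
    """Encode a graph of the agent's interaction history.

    Each round sums a node's neighbours' features through a shared linear map;
    the graph encoding is the mean of the final node states.
    """
    h = node_feats.copy()
    for _ in range(rounds):
        msgs = np.zeros_like(h)
        for u, v in edges:                      # undirected message passing
            msgs[v] += h[u] @ W_msg
            msgs[u] += h[v] @ W_msg
        h = np.tanh(h + msgs)
    return h.mean(axis=0)                       # encoded representation of the graph

n_nodes, dim, n_actions = 5, 8, 4
node_feats = rng.normal(size=(n_nodes, dim))
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
W_msg = rng.normal(size=(dim, dim)) * 0.1
W_policy = rng.normal(size=(dim, n_actions)) * 0.1

encoding = encode_graph(node_feats, edges, W_msg)
action_scores = encoding @ W_policy             # action selection output
print(int(action_scores.argmax()))
```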
-
Patent number: 11625604
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.
Type: Grant
Filed: October 29, 2018
Date of Patent: April 11, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan
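A single-process sketch of the actor/learner split in the abstract: several actor replicas of the policy write experience into a shared buffer, a learner consumes batches and produces new parameters, and actors periodically copy those parameters back. Real systems run these as separate computing units; the toy policy and update rule are stand-ins.

```python
import random
from collections import deque

def actor_step(params, env_state):
    action = (params + env_state) % 3              # stand-in for the policy network
    reward = 1.0 if action == env_state % 3 else 0.0
    return (env_state, action, reward)

def learner_update(params, batch):
    mean_reward = sum(r for _, _, r in batch) / len(batch)
    return params + 0.1 * mean_reward              # stand-in for a gradient update

replay = deque(maxlen=1000)                        # experience shared by actors and learner
learner_params = 0.0
actor_replicas = [learner_params] * 4              # each actor keeps its own parameter replica

for step in range(200):
    for i in range(len(actor_replicas)):           # actor operations: generate experience
        replay.append(actor_step(actor_replicas[i], env_state=random.randrange(10)))
    if len(replay) >= 32:                          # learner operations: sample and update
        learner_params = learner_update(learner_params, random.sample(list(replay), 32))
    if step % 10 == 0:                             # actors periodically sync parameters
        actor_replicas = [learner_params] * len(actor_replicas)

print(learner_params)
```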
-
Patent number: 11627165
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network having a plurality of policy parameters and used to select actions to be performed by an agent to control the agent to perform a particular task while interacting with one or more other agents in an environment. In one aspect, the method includes: maintaining data specifying a pool of candidate action selection policies; maintaining data specifying a respective matchmaking policy; and training the policy neural network using a reinforcement learning technique to update the policy parameters. The policy parameters define policies to be used in controlling the agent to perform the particular task.
Type: Grant
Filed: January 24, 2020
Date of Patent: April 11, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Silver, Oriol Vinyals, Maxwell Elliot Jaderberg
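An illustrative sketch of training against a maintained pool of candidate policies with a matchmaking rule: opponents the learner does not yet beat reliably are sampled more often, and snapshots of the learner are periodically added to the pool. The scalar skill model, the logistic matchmaking weighting, and the update rule are all assumptions, not the patented training scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pool of candidate action selection policies, represented here by scalar "skills".
pool = [{"skill": s} for s in rng.normal(size=5)]
learner_skill = 0.0

def win_prob(a, b):
    return 1.0 / (1.0 + np.exp(-(a - b)))          # logistic model of the match outcome

def pick_opponent(pool, learner_skill):
    probs = np.array([1.0 - win_prob(learner_skill, c["skill"]) for c in pool])
    probs = probs / probs.sum()                     # matchmaking distribution over the pool
    return int(rng.choice(len(pool), p=probs))

for step in range(500):
    opp = pool[pick_opponent(pool, learner_skill)]
    won = rng.random() < win_prob(learner_skill, opp["skill"])
    learner_skill += 0.05 * (1.0 if won else -0.2)  # stand-in for the RL policy update
    if step % 100 == 99:                            # periodically add a snapshot to the pool
        pool.append({"skill": learner_skill})

print(learner_skill, len(pool))
```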
-
Patent number: 11615310
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training machine learning models. One method includes obtaining a machine learning model, wherein the machine learning model comprises one or more model parameters, and the machine learning model is trained using gradient descent techniques to optimize an objective function; determining an update rule for the model parameters using a recurrent neural network (RNN); and applying a determined update rule for a final time step in a sequence of multiple time steps to the model parameters.
Type: Grant
Filed: May 19, 2017
Date of Patent: March 28, 2023
Assignee: DeepMind Technologies Limited
Inventors: Misha Man Ray Denil, Tom Schaul, Marcin Andrychowicz, Joao Ferdinando Gomes de Freitas, Sergio Gomez Colmenarejo, Matthew William Hoffman, David Benjamin Pfau
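A sketch of the interface the abstract describes: a recurrent update rule consumes the gradient of the objective (plus its own hidden state) and emits the parameter update applied over a sequence of time steps. Here the recurrent weights are fixed so the rule mimics gradient descent with momentum; in the patented setting those weights would themselves be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Objective to be optimized by the learned update rule: f(theta) = ||theta - target||^2.
target = rng.normal(size=4)
grad_f = lambda theta: 2 * (theta - target)

# A tiny recurrent "optimizer": per-parameter hidden state, with the update a
# linear function of [gradient, hidden state]. The weights are hand-picked here
# purely to illustrate the interface.
W_update = np.array([-0.1, -0.05])      # contribution of gradient and hidden state to the update
W_hidden = np.array([0.9, 1.0])         # hidden-state decay and gradient accumulation

theta = rng.normal(size=4)
hidden = np.zeros(4)
for step in range(100):                  # apply the update rule over a sequence of time steps
    g = grad_f(theta)
    update = W_update[0] * g + W_update[1] * hidden
    hidden = W_hidden[0] * hidden + W_hidden[1] * g
    theta = theta + update

print(np.abs(theta - target).max())      # theta has moved close to the target
```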
-
Patent number: 11610118
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
Type: Grant
Filed: February 11, 2019
Date of Patent: March 21, 2023
Assignee: DeepMind Technologies Limited
Inventors: Georg Ostrovski, William Clinton Dabney
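A compact sketch of the selection rule in the abstract: sample several probability values, query a quantile function network for the corresponding return quantiles per action, average them, and pick the best action. The quantile "network" below is a closed-form stand-in with arbitrary parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantile_value(action, observation, tau, params):
    """Stand-in for the quantile function network: maps (action, observation,
    probability value tau) to an estimated quantile of the return distribution."""
    w_a, w_o = params
    return w_a[action] + observation @ w_o + 0.5 * np.log(tau / (1 - tau))

def select_action(observation, n_actions, params, n_taus=8):
    """Sample probability values, estimate the corresponding return quantiles per
    action, and pick the action with the best measure of central tendency (the mean)."""
    taus = rng.uniform(0.01, 0.99, size=n_taus)       # randomly sampled probability values
    means = [np.mean([quantile_value(a, observation, t, params) for t in taus])
             for a in range(n_actions)]
    return int(np.argmax(means))

params = (rng.normal(size=4), rng.normal(size=16) * 0.1)
print(select_action(rng.normal(size=16), n_actions=4, params=params))
```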
-
Patent number: 11604985
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes training a neural network having multiple network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having multiple hyperparameters. The method includes: maintaining multiple candidate neural networks and, for each of the multiple candidate neural networks, data specifying: (i) respective values of network parameters for the candidate neural network, (ii) respective values of hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and, for each of the multiple candidate neural networks, repeatedly performing additional training operations.
Type: Grant
Filed: November 22, 2018
Date of Patent: March 14, 2023
Assignee: DeepMind Technologies Limited
Inventors: Maxwell Elliot Jaderberg, Wojciech Czarnecki, Timothy Frederick Goldie Green, Valentin Clement Dalibard
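A population-based-training style sketch of the abstract: each candidate carries network parameters, hyperparameter values, and a quality measure; candidates train for a while, then weaker candidates copy the best one's weights and perturb its hyperparameters. The exploit/explore rule, the toy task, and the single tuned hyperparameter (learning rate) are assumed details.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_step(weights, lr):
    # Stand-in for gradient training on the real task: minimise ||weights||^2.
    return weights - lr * 2 * weights

def evaluate(weights):
    return -float(np.sum(weights ** 2))              # quality measure (higher is better)

# Population of candidates, each with network parameters, a hyperparameter, and a quality measure.
population = [{"weights": rng.normal(size=8),
               "lr": 10 ** rng.uniform(-3, -1),
               "quality": -np.inf} for _ in range(6)]

for generation in range(20):
    for cand in population:
        for _ in range(10):                          # additional training operations
            cand["weights"] = train_step(cand["weights"], cand["lr"])
        cand["quality"] = evaluate(cand["weights"])
    best = max(population, key=lambda c: c["quality"])
    for cand in population:
        if cand["quality"] < best["quality"]:        # exploit: copy the better candidate
            cand["weights"] = best["weights"].copy()
            cand["lr"] = best["lr"] * 10 ** rng.uniform(-0.2, 0.2)   # explore: perturb hyperparameter

print(best["quality"], best["lr"])
```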
-
Patent number: 11604941
Abstract: A method of training an action selection neural network to perform a demonstrated task using a supervised learning technique. The action selection neural network is configured to receive demonstration data comprising actions to perform the task and rewards received for performing the actions. The action selection neural network has auxiliary prediction task neural networks on one or more of its intermediate outputs. The action selection policy neural network is trained using multiple combined losses, concurrently with the auxiliary prediction task neural networks.
Type: Grant
Filed: October 29, 2018
Date of Patent: March 14, 2023
Assignee: DeepMind Technologies Limited
Inventor: Todd Andrew Hester
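A sketch of the combined loss the abstract mentions: a supervised imitation loss on the demonstrated action plus an auxiliary prediction loss attached to an intermediate output of the network. The auxiliary target (predicting the demonstration reward) and the loss weight are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

W1 = rng.normal(size=(16, 32)) * 0.1      # first layer of the action selection network
W_act = rng.normal(size=(32, 4)) * 0.1    # action head
W_aux = rng.normal(size=(32, 1)) * 0.1    # auxiliary head on the intermediate output

def combined_loss(observation, demo_action, demo_reward, aux_weight=0.5):
    hidden = np.tanh(observation @ W1)                       # intermediate output
    action_probs = softmax(hidden @ W_act)
    bc_loss = -np.log(action_probs[demo_action] + 1e-8)      # supervised imitation loss
    reward_pred = float(hidden @ W_aux)
    aux_loss = (reward_pred - demo_reward) ** 2              # auxiliary prediction loss
    return bc_loss + aux_weight * aux_loss

print(combined_loss(rng.normal(size=16), demo_action=2, demo_reward=1.0))
```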
-
Patent number: 11604997
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network. The policy neural network is used to select actions to be performed by an agent that interacts with an environment by receiving an observation characterizing a state of the environment and performing an action from a set of actions in response to the received observation. A trajectory is obtained from a replay memory, and a final update to current values of the policy network parameters is determined for each training observation in the trajectory. The final updates to the current values of the policy network parameters are determined from selected action updates and leave-one-out updates.
Type: Grant
Filed: June 11, 2018
Date of Patent: March 14, 2023
Assignee: DeepMind Technologies Limited
Inventors: Marc Gendron-Bellemare, Mohammad Gheshlaghi Azar, Audrunas Gruslys, Remi Munos
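A hedged sketch of a leave-one-out baseline for policy updates: each sample's baseline is the mean return of the other samples, so its update is centred without using its own return. This is a generic leave-one-out estimator shown for illustration; the patent's exact combination of selected-action and leave-one-out updates is more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def loo_policy_surrogate(log_probs, returns):
    """Surrogate objective whose gradient is a policy update with a leave-one-out baseline."""
    returns = np.asarray(returns, dtype=float)
    n = returns.size
    baselines = (returns.sum() - returns) / (n - 1)         # leave-one-out mean returns
    advantages = returns - baselines
    return np.mean(advantages * np.asarray(log_probs))      # differentiate wrt policy parameters

log_probs = rng.normal(size=16)         # log pi(a_t | s_t) for the selected actions
returns = rng.normal(size=16) + 1.0     # observed returns from a replay trajectory
print(loo_policy_surrogate(log_probs, returns))
```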
-
Patent number: 11593640
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from the neural network output for the time step as a system output for the time step; maintaining a current state of the external memory; determining, from the neural network output for the time step, memory state parameters for the time step; updating the current state of the external memory using the memory state parameters for the time step; reading data from the external memory in accordance with the updated state of the external memory; and combining the data read from the external memory with a system input for the next time step to generate the neural network input for the next time step.
Type: Grant
Filed: September 9, 2019
Date of Patent: February 28, 2023
Assignee: DeepMind Technologies Limited
Inventors: Edward Thomas Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Philip Blunsom
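A sketch of one time step of the external-memory loop in the abstract: the controller output is split into memory state parameters, the memory state is updated, data is read back, and the read data is combined with the next system input to form the next network input. The soft-write parameterisation and sizes are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def memory_step(memory, controller_output, system_input):
    """One external-memory time step: update, read, and build the next network input."""
    n_slots, width = memory.shape
    # Memory state parameters for the time step: a write vector and soft write/read weights.
    write_vec = controller_output[:width]
    write_w = softmax(controller_output[width:width + n_slots])
    read_w = softmax(controller_output[width + n_slots:width + 2 * n_slots])
    # Update the current state of the external memory with a soft write.
    memory = memory + np.outer(write_w, write_vec)
    # Read from the updated memory.
    read_data = read_w @ memory
    # Combine the read data with the next system input to form the next network input.
    next_input = np.concatenate([system_input, read_data])
    return memory, next_input

rng = np.random.default_rng(0)
memory = np.zeros((4, 8))                           # 4 memory slots of width 8
controller_output = rng.normal(size=8 + 4 + 4)      # write vector + write/read weights
memory, next_input = memory_step(memory, controller_output, system_input=rng.normal(size=3))
print(memory.shape, next_input.shape)
```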