Patents Assigned to DeepMind Technologies Limited
  • Patent number: 11727281
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent that interacts with an environment. In one aspect, a system comprises: an action selection subsystem that selects actions to be performed by the agent using an action selection policy generated using an action selection neural network; a reward subsystem that is configured to: receive an observation characterizing a current state of the environment and an observation characterizing a goal state of the environment; generate a reward using an embedded representation of the observation characterizing the current state of the environment and an embedded representation of the observation characterizing the goal state of the environment; and a training subsystem that is configured to train the action selection neural network based on the rewards generated by the reward subsystem using reinforcement learning techniques.
    Type: Grant
    Filed: January 27, 2022
    Date of Patent: August 15, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: David Constantine Patrick Warde-Farley, Volodymyr Mnih
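    A minimal sketch of the reward subsystem this entry describes, assuming a toy linear-plus-tanh embedding network and cosine similarity between the current and goal embeddings as the reward; the function names and shapes are illustrative, not the patented implementation.

    ```python
    import numpy as np

    def embed(observation, weights):
        """Toy embedding network: a single linear layer followed by tanh."""
        return np.tanh(weights @ observation)

    def goal_reward(current_obs, goal_obs, weights):
        """Reward = cosine similarity between current and goal embeddings."""
        e_cur, e_goal = embed(current_obs, weights), embed(goal_obs, weights)
        return float(e_cur @ e_goal /
                     (np.linalg.norm(e_cur) * np.linalg.norm(e_goal) + 1e-8))

    # Usage with random observations and embedding weights.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((16, 32))
    print(goal_reward(rng.standard_normal(32), rng.standard_normal(32), W))
    ```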
  • Patent number: 11720781
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for interleaving matrix operations of a gated activation unit. One of the methods includes receiving a plurality of weight matrices of a gated activation unit of the neural network, the gated activation unit having two or more layers, each layer defining operations comprising: (i) a matrix operation between a weight matrix for the layer and concatenated input vectors and (ii) a nonlinear activation operation using a result of the matrix operation. Rows of the plurality of weight matrices are interleaved by assigning groups of corresponding rows to respective thread blocks, each thread block being a computation unit for execution by an independent processing unit of a plurality of independent processing units of a parallel processing device.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: August 8, 2023
    Assignee: DeepMind Technologies Limited
    Inventor: Erich Konrad Elsen
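    A minimal sketch of row interleaving for a gated unit, assuming a two-matrix gate/candidate unit and a fixed block size; each "thread block" here is just a Python group holding the corresponding rows of every weight matrix, standing in for an independent processing unit.

    ```python
    import numpy as np

    def interleave_rows(weight_matrices, rows_per_block):
        """Group corresponding row slices of all matrices into blocks."""
        n_rows = weight_matrices[0].shape[0]
        return [[w[start:start + rows_per_block] for w in weight_matrices]
                for start in range(0, n_rows, rows_per_block)]

    def gated_unit_from_blocks(blocks, x):
        """Each block computes its partial matmuls; results are concatenated,
        then the sigmoid gate multiplies the tanh candidate."""
        gate = np.concatenate([blk[0] @ x for blk in blocks])
        cand = np.concatenate([blk[1] @ x for blk in blocks])
        return 1.0 / (1.0 + np.exp(-gate)) * np.tanh(cand)

    rng = np.random.default_rng(0)
    W_gate, W_cand = rng.standard_normal((8, 4)), rng.standard_normal((8, 4))
    x = rng.standard_normal(4)
    print(gated_unit_from_blocks(interleave_rows([W_gate, W_cand], rows_per_block=2), x))
    ```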
  • Patent number: 11720796
    Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.
    Type: Grant
    Filed: April 23, 2020
    Date of Patent: August 8, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adrià Puigdomènech Badia
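    A minimal sketch of the per-action episodic lookup described above: for each action, find the p nearest stored key embeddings to the current key and combine their stored returns into a Q value. The inverse-distance weighting is an illustrative assumption.

    ```python
    import numpy as np

    def q_from_memory(memory_keys, memory_returns, current_key, p):
        """memory_keys: (n, d) stored embeddings; memory_returns: (n,) return estimates."""
        dists = np.linalg.norm(memory_keys - current_key, axis=1)
        nearest = np.argsort(dists)[:p]
        weights = 1.0 / (dists[nearest] + 1e-3)   # closer keys count more
        return float(np.sum(weights * memory_returns[nearest]) / np.sum(weights))

    def select_action(episodic_memory, current_key, p=5):
        """episodic_memory: {action: (keys, returns)}; pick the argmax-Q action."""
        q_values = {a: q_from_memory(keys, rets, current_key, p)
                    for a, (keys, rets) in episodic_memory.items()}
        return max(q_values, key=q_values.get), q_values
    ```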
  • Patent number: 11715009
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network including a first subnetwork followed by a second subnetwork on training inputs by optimizing an objective function. In one aspect, a method includes processing a training input using the neural network to generate a training model output, including processing a subnetwork input for the training input using the first subnetwork to generate a subnetwork activation for the training input in accordance with current values of parameters of the first subnetwork, and providing the subnetwork activation as input to the second subnetwork; determining a synthetic gradient of the objective function for the first subnetwork by processing the subnetwork activation using a synthetic gradient model in accordance with current values of parameters of the synthetic gradient model; and updating the current values of the parameters of the first subnetwork using the synthetic gradient.
    Type: Grant
    Filed: May 19, 2017
    Date of Patent: August 1, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Oriol Vinyals, Alexander Benjamin Graves, Wojciech Czarnecki, Koray Kavukcuoglu, Simon Osindero, Maxwell Elliot Jaderberg
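    A minimal sketch of a synthetic-gradient update, assuming a linear first subnetwork and a linear synthetic-gradient model: the model predicts the gradient of the objective with respect to the subnetwork activation, and that prediction, rather than the true backpropagated gradient, updates the first subnetwork.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((8, 4)) * 0.1   # first subnetwork (linear, for illustration)
    M = rng.standard_normal((8, 8)) * 0.1    # synthetic gradient model
    lr = 0.01

    x = rng.standard_normal(4)
    h = W1 @ x                               # subnetwork activation
    synthetic_grad_h = M @ h                 # predicted dL/dh
    # Chain rule through the linear first subnetwork using the predicted gradient;
    # M itself would later be trained toward the true gradient once it is available.
    W1 -= lr * np.outer(synthetic_grad_h, x)
    ```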
  • Patent number: 11714994
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning from delayed outcomes using neural networks. One of the methods includes receiving an input observation; generating, from the input observation, an output label distribution over possible labels for the input observation at a final time, comprising: processing the input observation using a first neural network configured to process the input observation to generate a distribution over possible values for an intermediate indicator at a first time earlier than the final time; generating, from the distribution, an input value for the intermediate indicator; and processing the input value for the intermediate indicator using a second neural network configured to process the input value for the intermediate indicator to determine the output label distribution over possible values for the input observation at the final time; and providing an output derived from the output label distribution.
    Type: Grant
    Filed: March 11, 2019
    Date of Patent: August 1, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Huiyi Hu, Ray Jiang, Timothy Arthur Mann, Sven Adrian Gowal, Balaji Lakshminarayanan, András György
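    A minimal sketch of the two-stage prediction above, assuming two small softmax networks: the first predicts a distribution over an intermediate indicator at the early time, a value of the indicator is sampled from it, and the second maps that value to the label distribution at the final time.

    ```python
    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def predict_final_labels(observation, W_indicator, W_label, rng):
        indicator_dist = softmax(W_indicator @ observation)        # early-time distribution
        indicator = rng.choice(len(indicator_dist), p=indicator_dist)
        one_hot = np.eye(len(indicator_dist))[indicator]           # input value for the indicator
        return softmax(W_label @ one_hot)                          # final-time label distribution
    ```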
  • Patent number: 11714993
    Abstract: Methods, systems, and apparatus for classifying a new example using a comparison set of comparison examples. One method includes maintaining a comparison set, the comparison set including comparison examples and a respective label vector for each of the comparison examples, each label vector including a respective score for each label in a predetermined set of labels; receiving a new example; determining a respective attention weight for each comparison example by applying a neural network attention mechanism to the new example and to the comparison examples; and generating a respective label score for each label in the predetermined set of labels from, for each of the comparison examples, the respective attention weight for the comparison example and the respective label vector for the comparison example, in which the respective label score for each of the labels represents a likelihood that the label is a correct label for the new example.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: August 1, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Charles Blundell, Oriol Vinyals
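    A minimal sketch of the attention-based classification above: attention weights come from a softmax over similarities between the new example and each comparison example, and label scores are the attention-weighted sum of the comparison label vectors. Cosine similarity is an illustrative choice of attention mechanism.

    ```python
    import numpy as np

    def label_scores(new_example, comparison_examples, label_vectors):
        """comparison_examples: (n, d); label_vectors: (n, num_labels)."""
        sims = comparison_examples @ new_example / (
            np.linalg.norm(comparison_examples, axis=1) * np.linalg.norm(new_example) + 1e-8)
        attn = np.exp(sims - sims.max())
        attn /= attn.sum()                  # attention weights over comparison examples
        return attn @ label_vectors         # (num_labels,) likelihood-like scores
    ```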
  • Patent number: 11714996
    Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also described is a neural network system, implemented by one or more computers, for representing a space of probabilistic motor primitives. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.
    Type: Grant
    Filed: July 25, 2022
    Date of Patent: August 1, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov
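    A minimal sketch of a linear-feedback-stabilized target action, assuming the feedback gain is taken directly from the state-action Jacobian evaluated along the expert trajectory; this simplification is illustrative, not the patented construction.

    ```python
    import numpy as np

    def stabilized_action(expert_action, expert_state, student_state, state_action_jacobian):
        """state_action_jacobian: d(action)/d(state) along the expert trajectory.
        The student's training target is the expert action plus a linear
        feedback correction for its deviation from the expert state."""
        return expert_action + state_action_jacobian @ (student_state - expert_state)
    ```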
  • Patent number: 11712799
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.
    Type: Grant
    Filed: September 14, 2020
    Date of Patent: August 1, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Serkan Cabi, Ziyu Wang, Alexander Novikov, Ksenia Konyushkova, Sergio Gomez Colmenarejo, Scott Ellison Reed, Misha Man Ray Denil, Jonathan Karl Scholz, Oleg O. Sushkov, Rae Chan Jeong, David Barker, David Budden, Mel Vecerik, Yusuf Aytar, Joao Ferdinando Gomes de Freitas
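    A minimal sketch of the reward-relabelling step above, assuming a linear reward model fitted by least squares: the model is trained on the small annotated subset and then predicts rewards for the remaining stored experiences, yielding task-specific training data for policy learning.

    ```python
    import numpy as np

    def train_reward_model(annotated_obs, annotated_rewards, lr=0.1, steps=200):
        """Fit a linear reward predictor to the annotated experiences."""
        w = np.zeros(annotated_obs.shape[1])
        for _ in range(steps):
            pred = annotated_obs @ w
            w -= lr * annotated_obs.T @ (pred - annotated_rewards) / len(annotated_rewards)
        return w

    def relabel_experiences(experiences, reward_model_w):
        """Attach a predicted reward to every stored (observation, action) pair."""
        return [(obs, act, float(obs @ reward_model_w)) for obs, act in experiences]
    ```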
  • Patent number: 11714990
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.
    Type: Grant
    Filed: May 22, 2020
    Date of Patent: August 1, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Guo, Bilal Piot, Steven James Kapturowski, Olivier Tieleman, Charles Blundell
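    A minimal sketch of the reward combination above, assuming a simple additive mixing of the task reward with the exploration reward scaled by the exploration importance factor; the additive form is an illustrative assumption.

    ```python
    def overall_reward(extrinsic_reward, exploration_reward, exploration_importance):
        """Mix the task reward with the exploration reward, scaled by the
        same importance factor that conditions the action selection network."""
        return extrinsic_reward + exploration_importance * exploration_reward

    # An agent conditioned on a higher importance factor values novelty more.
    print(overall_reward(extrinsic_reward=1.0, exploration_reward=0.5, exploration_importance=0.3))
    ```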
  • Patent number: 11704541
    Abstract: There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.
    Type: Grant
    Filed: October 29, 2018
    Date of Patent: July 18, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Yujia Li, Christopher James Dyer, Oriol Vinyals
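    A minimal sketch of sequential graph generation: at each step a decision ("add node", "add edge", or "stop") is sampled and the graph grows accordingly. The fixed stand-in probabilities replace the learned neural network distribution described in the patent.

    ```python
    import numpy as np

    def sample_graph(rng, max_steps=20):
        nodes, edges = [0], []
        for _ in range(max_steps):
            decision = rng.choice(["add_node", "add_edge", "stop"], p=[0.5, 0.4, 0.1])
            if decision == "stop":
                break
            if decision == "add_node":
                nodes.append(len(nodes))
            elif len(nodes) > 1:
                u, v = rng.choice(len(nodes), size=2, replace=False)
                edges.append((int(u), int(v)))
        return nodes, edges

    print(sample_graph(np.random.default_rng(0)))
    ```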
  • Patent number: 11693627
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using neural networks having contiguous sparsity patterns. One of the methods includes storing a first parameter matrix of a neural network having a contiguous sparsity pattern in storage associated with a computing device. The computing device performs an inference pass of the neural network to generate an output vector, including reading, from the storage associated with the computing device, one or more activation values from the input vector, reading, from the storage associated with the computing device, a block of non-zero parameter values, and multiplying each of the one or more activation values by one or more of the block of non-zero parameter values.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: July 4, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Karen Simonyan, Nal Emmerich Kalchbrenner, Erich Konrad Elsen
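    A minimal sketch of an inference pass over a block-sparse parameter matrix: only contiguous blocks of non-zero parameters are stored, each tagged with its row and column offset, and each block multiplies the matching slice of the input activations. The block layout is an illustrative assumption.

    ```python
    import numpy as np

    def block_sparse_matvec(blocks, input_vector, output_dim):
        """blocks: list of (row_offset, col_offset, dense_block) for the non-zero regions."""
        output = np.zeros(output_dim)
        for row, col, block in blocks:
            activations = input_vector[col:col + block.shape[1]]   # read only the needed activations
            output[row:row + block.shape[0]] += block @ activations
        return output

    rng = np.random.default_rng(0)
    blocks = [(0, 0, rng.standard_normal((2, 3))), (4, 3, rng.standard_normal((2, 3)))]
    print(block_sparse_matvec(blocks, rng.standard_normal(6), output_dim=6))
    ```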
  • Patent number: 11676035
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. The neural network has a plurality of differentiable weights and a plurality of non-differentiable weights. One of the methods includes determining trained values of the plurality of differentiable weights and the non-differentiable weights by repeatedly performing operations that include determining an update to the current values of the plurality of differentiable weights using a machine learning gradient-based training technique and determining, using an evolution strategies (ES) technique, an update to the current values of a plurality of distribution parameters.
    Type: Grant
    Filed: January 23, 2020
    Date of Patent: June 13, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Karel Lenc, Karen Simonyan, Tom Schaul, Erich Konrad Elsen
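    A minimal sketch of the hybrid update above on a toy quadratic loss: the differentiable weights get a gradient step, while the non-differentiable weights are drawn from a search distribution whose mean gets an evolution-strategies (ES) update from sampled perturbations. The toy loss and fixed standard deviation are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    w_diff, mu, sigma, lr = np.ones(3), np.zeros(2), 0.5, 0.05

    def loss(w_diff, w_nondiff):
        return float(np.sum(w_diff ** 2) + np.sum((w_nondiff - 1.0) ** 2))

    for _ in range(100):
        # Gradient step on the differentiable weights (gradient of sum(w^2) is 2w).
        w_diff -= lr * 2 * w_diff
        # ES step on the distribution parameters of the non-differentiable weights.
        eps = rng.standard_normal((8, 2))
        scores = np.array([loss(w_diff, mu + sigma * e) for e in eps])
        advantage = (scores - scores.mean()) / (scores.std() + 1e-8)
        mu -= lr * (advantage[:, None] * eps).mean(axis=0) / sigma

    print(w_diff, mu)   # w_diff decays toward 0, mu moves toward 1
    ```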
  • Patent number: 11675855
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for re-ranking a collection of documents according to a first metric and subject to a constraint on a function of one or more second metrics. One of the methods includes: obtaining, for each document in the first collection of documents, a respective first metric value corresponding to the first metric and respective one or more second metric values corresponding to the one or more second metrics; re-ranking the first collection of documents, comprising: determining the constraint on the function of one or more second metrics by computing a first threshold value using a variable threshold function that takes as input second metric values for the documents in the first collection of documents; and determining the re-ranking for the first collection of documents by solving a constrained optimization for the first metric constrained by the first threshold value.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: June 13, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Anton Zhernov, Krishnamurthy Dvijotham, Xiaohong Gong, Amogh S. Asgekar
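    A minimal sketch of the constrained re-ranking above: a variable threshold is computed from the second-metric values (here, a quantile), and documents are ordered by the first metric among those whose second metric clears the threshold. The quantile threshold and greedy selection are illustrative stand-ins for the patent's constrained optimization.

    ```python
    import numpy as np

    def rerank(first_metric, second_metric, quantile=0.5):
        threshold = np.quantile(second_metric, quantile)    # variable threshold function
        eligible = [i for i in range(len(first_metric)) if second_metric[i] >= threshold]
        return sorted(eligible, key=lambda i: -first_metric[i])

    print(rerank(np.array([0.9, 0.2, 0.7, 0.4]), np.array([0.1, 0.8, 0.6, 0.9])))
    ```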
  • Patent number: 11663441
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network, wherein the action selection policy neural network is configured to process an observation characterizing a state of an environment to generate an action selection policy output, wherein the action selection policy output is used to select an action to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining an observation characterizing a state of the environment subsequent to the agent performing a selected action; generating a latent representation of the observation; processing the latent representation of the observation using a discriminator neural network to generate an imitation score; determining a reward from the imitation score; and adjusting the current values of the action selection policy neural network parameters based on the reward using a reinforcement learning training technique.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: May 30, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Scott Ellison Reed, Yusuf Aytar, Ziyu Wang, Tom Paine, Sergio Gomez Colmenarejo, David Budden, Tobias Pfaff, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Alexander Novikov
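    A minimal sketch of the reward step above, assuming a tanh encoder, a logistic discriminator, and a log-score reward: the post-action observation is embedded, the discriminator scores how demonstration-like the embedding is, and that score becomes the reward used for the reinforcement learning update.

    ```python
    import numpy as np

    def imitation_reward(observation, encoder_w, discriminator_w):
        latent = np.tanh(encoder_w @ observation)                  # latent representation
        score = 1.0 / (1.0 + np.exp(-discriminator_w @ latent))    # imitation score in (0, 1)
        return float(np.log(score + 1e-8))                         # reward for the RL update
    ```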
  • Patent number: 11663475
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distributional Q network that is used to update the parameters of the action selection neural network.
    Type: Grant
    Filed: September 15, 2022
    Date of Patent: May 30, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
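    A minimal sketch of the joint setup above, assuming linear networks and a fixed set of return atoms: the action network maps observations to continuous actions, and the distributional Q network outputs a categorical distribution over returns whose expectation scores the chosen action.

    ```python
    import numpy as np

    ATOMS = np.linspace(-10.0, 10.0, 51)        # support of the return distribution

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def expected_q(observation, action, critic_w):
        """Distributional critic: logits over return atoms -> expected return."""
        logits = critic_w @ np.concatenate([observation, action])
        return float(softmax(logits) @ ATOMS)

    def select_action(observation, actor_w):
        return np.tanh(actor_w @ observation)   # continuous action in [-1, 1]^d
    ```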
  • Patent number: 11662210
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.
    Type: Grant
    Filed: May 18, 2022
    Date of Patent: May 30, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
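    A minimal sketch of the grid-cell pathway above, assuming a linear recurrence and linear decoders: velocity inputs are integrated into a recurrent "grid cell" representation, a position estimate is decoded from it, and the representation is concatenated with the observation for action selection.

    ```python
    import numpy as np

    def grid_cell_step(grid_state, velocity, W_rec, W_vel):
        return np.tanh(W_rec @ grid_state + W_vel @ velocity)      # grid cell representation

    def decode_position(grid_state, W_pos):
        return W_pos @ grid_state                                  # estimated position

    def action_logits(grid_state, observation, W_act):
        return W_act @ np.concatenate([grid_state, observation])   # action selection output
    ```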
  • Patent number: 11651208
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. A reinforcement learning neural network selects actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The reinforcement learning neural network has at least one input to receive an input observation characterizing a state of the environment and at least one output for determining an action to be performed by the agent in response to the input observation. The system includes a reward function network coupled to the reinforcement learning neural network. The reward function network has an input to receive reward data characterizing a reward provided by one or more states of the environment and is configured to determine a reward function to provide one or more target values for training the reinforcement learning neural network.
    Type: Grant
    Filed: May 22, 2018
    Date of Patent: May 16, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Zhongwen Xu, Hado Phillip van Hasselt, Joseph Varughese Modayil, Andre da Motta Salles Barreto, David Silver
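    A minimal sketch of the coupling above, assuming a linear reward-function network and a one-step bootstrapped target: the network maps the raw reward data for a state into a shaped reward, which then forms the target value used to train the reinforcement learning network.

    ```python
    import numpy as np

    def shaped_reward(reward_data, reward_net_w):
        return float(reward_net_w @ reward_data)    # learned reward function

    def td_target(reward_data, next_q_values, reward_net_w, discount=0.99):
        """One-step target for the reinforcement learning network."""
        return shaped_reward(reward_data, reward_net_w) + discount * float(np.max(next_q_values))
    ```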
  • Patent number: 11636347
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining a graph of nodes and edges that represents an interaction history of the agent with the environment; generating an encoded representation of the graph representing the interaction history of the agent with the environment; processing an input based on the encoded representation of the graph using an action selection neural network, in accordance with current values of action selection neural network parameters, to generate an action selection output; and selecting an action from a plurality of possible actions to be performed by the agent using the action selection output generated by the action selection neural network.
    Type: Grant
    Filed: January 22, 2020
    Date of Patent: April 25, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli
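    A minimal sketch of the pipeline above, assuming a single round of message passing and mean aggregation: the interaction-history graph is encoded into a single vector, and that encoding feeds an action-selection layer.

    ```python
    import numpy as np

    def encode_graph(node_features, edges, W_msg):
        """node_features: (n, d); edges: list of (src, dst) index pairs."""
        messages = np.zeros_like(node_features)
        for src, dst in edges:
            messages[dst] += W_msg @ node_features[src]        # one round of message passing
        return np.tanh(node_features + messages).mean(axis=0)  # graph-level encoding

    def action_scores(graph_encoding, W_act):
        return W_act @ graph_encoding                           # action selection output
    ```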
  • Patent number: 11636283
    Abstract: A variational autoencoder (VAE) neural network system, comprising an encoder neural network to encode an input data item to define a posterior distribution for a set of latent variables, and a decoder neural network to generate an output data item representing values of a set of latent variables sampled from the posterior distribution. The system is configured for training with an objective function including a term dependent on a difference between the posterior distribution and a prior distribution. The prior and posterior distributions are arranged so that they cannot be matched to one another. The VAE system may be used for compressing and decompressing data.
    Type: Grant
    Filed: June 1, 2020
    Date of Patent: April 25, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Benjamin Poole, Aaron Gerard Antonius van den Oord, Ali Razavi-Nematollahi, Oriol Vinyals
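    A minimal sketch of the objective's KL term with deliberately mismatched distributions: the posterior's standard deviation is bounded away from 1 while the prior is a unit Gaussian, so the KL term can never reach zero. The sigmoid-bounded variance is one illustrative way to enforce the mismatch, not the patented construction.

    ```python
    import numpy as np

    def kl_diag_gaussian(mu, sigma):
        """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
        return float(0.5 * np.sum(sigma ** 2 + mu ** 2 - 1.0 - 2.0 * np.log(sigma)))

    def posterior_params(encoder_output):
        mu, raw_sigma = np.split(encoder_output, 2)
        sigma = 0.5 / (1.0 + np.exp(-raw_sigma))   # bounded in (0, 0.5): cannot match the prior
        return mu, sigma
    ```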
  • Patent number: 11627165
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network having a plurality of policy parameters and used to select actions to be performed by an agent to control the agent to perform a particular task while interacting with one or more other agents in an environment. In one aspect, the method includes: maintaining data specifying a pool of candidate action selection policies; maintaining data specifying respective matchmaking policies; and training the policy neural network using a reinforcement learning technique to update the policy parameters. The policy parameters define policies to be used in controlling the agent to perform the particular task.
    Type: Grant
    Filed: January 24, 2020
    Date of Patent: April 11, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: David Silver, Oriol Vinyals, Maxwell Elliot Jaderberg
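    A minimal sketch of the training loop above, assuming win-rate-proportional matchmaking and a caller-supplied play_match function that returns an outcome and a policy gradient: the pool of candidate opponent policies is maintained, a matchmaking distribution picks the opponent for each episode, and the learner's parameters are updated from the match.

    ```python
    import numpy as np

    def pick_opponent(pool, win_rate_vs, rng):
        """Prefer opponents the learner still loses to (lower win rate = higher weight)."""
        weights = np.array([1.0 - win_rate_vs[p] for p in pool]) + 1e-3
        return rng.choice(pool, p=weights / weights.sum())

    def training_step(policy_params, opponent, play_match, lr=0.01):
        """play_match is a placeholder: it runs one match and returns (outcome, grad)."""
        outcome, grad = play_match(policy_params, opponent)
        return policy_params + lr * grad, outcome
    ```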