Patents Assigned to DeepMind Technologies Limited

Reinforcement learning using distributed prioritized replay

Patent number: 11625604

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.

Type: Grant

Filed: October 29, 2018

Date of Patent: April 11, 2023

Assignee: DeepMind Technologies Limited

Inventors: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan
Training machine learning models by determining update rules using recurrent neural networks

Patent number: 11615310

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training machine learning models. One method includes obtaining a machine learning model, wherein the machine learning model comprises one or more model parameters, and the machine learning model is trained using gradient descent techniques to optimize an objective function; determining an update rule for the model parameters using a recurrent neural network (RNN); and applying a determined update rule for a final time step in a sequence of multiple time steps to the model parameters.

Type: Grant

Filed: May 19, 2017

Date of Patent: March 28, 2023

Assignee: DeepMind Technologies Limited

Inventors: Misha Man Ray Denil, Tom Schaul, Marcin Andrychowicz, Joao Ferdinando Gomes de Freitas, Sergio Gomez Colmenarejo, Matthew William Hoffman, David Benjamin Pfau
Distributional reinforcement learning using quantile function neural networks

Patent number: 11610118

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.

Type: Grant

Filed: February 11, 2019

Date of Patent: March 21, 2023

Assignee: DeepMind Technologies Limited

Inventors: Georg Ostrovski, William Clinton Dabney
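The action-selection steps in the abstract of patent 11610118 can be illustrated with a minimal sketch. The trained quantile function network is replaced here by a hypothetical closed-form stand-in (`quantile_network`, backed by made-up Gaussian return models in `RETURN_MODELS`), and the mean is assumed as the measure of central tendency; none of these names come from the patent.

```python
import random
from statistics import NormalDist, fmean

# Hypothetical stand-in for a trained quantile function network: each action's
# return distribution is modeled as a Gaussian with a made-up (mean, std).
RETURN_MODELS = {"left": (1.0, 0.5), "right": (1.5, 2.0), "stay": (0.2, 0.1)}


def quantile_network(action, observation, tau):
    """Return the estimated tau-quantile of the return for `action`.

    The toy stand-in ignores `observation`; a real network would condition on it.
    """
    mean, std = RETURN_MODELS[action]
    return NormalDist(mean, std).inv_cdf(tau)


def select_action(observation, actions, num_samples=32, rng=None):
    """Pick the action whose sampled quantile values have the highest mean."""
    rng = rng or random.Random()
    scores = {}
    for action in actions:
        # Randomly sample probability values strictly inside (0, 1).
        taus = [rng.uniform(1e-6, 1.0 - 1e-6) for _ in range(num_samples)]
        quantiles = [quantile_network(action, observation, t) for t in taus]
        # Measure of central tendency (assumed here to be the mean).
        scores[action] = fmean(quantiles)
    return max(scores, key=scores.get), scores
```

Averaging quantile values at uniformly sampled probabilities approximates the expected return, so with enough samples this rule reduces to greedy selection on estimated mean return.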
Population based training of neural networks

Patent number: 11604985

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes training a neural network having multiple network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having multiple hyperparameters. The method includes: maintaining multiple candidate neural networks and, for each of the multiple candidate neural networks, data specifying: (i) respective values of network parameters for the candidate neural network, (ii) respective values of hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and, for each of the multiple candidate neural networks, repeatedly performing additional training operations.

Type: Grant

Filed: November 22, 2018

Date of Patent: March 14, 2023

Assignee: DeepMind Technologies Limited

Inventors: Maxwell Elliot Jaderberg, Wojciech Czarnecki, Timothy Frederick Goldie Green, Valentin Clement Dalibard
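A toy sketch of the bookkeeping described in the abstract of patent 11604985: each candidate carries parameter values, hyperparameter values, and a quality measure. The abstract does not spell out the "additional training operations", so the exploit step (copy from the best candidate) and explore step (perturb the copied hyperparameter) below are assumptions, as are the names `Candidate`, `train_step`, and `pbt` and the scalar loss `theta**2`.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    theta: float                    # network parameters (one scalar in this toy)
    lr: float                       # hyperparameter: learning rate
    quality: float = float("-inf")  # quality measure, higher is better


def train_step(c):
    # One gradient-descent step on the toy loss theta**2 (gradient 2 * theta).
    c.theta -= c.lr * 2.0 * c.theta
    c.quality = -c.theta ** 2


def pbt(population, rounds, steps_per_round, rng):
    """Train all candidates, then let the worst copy and perturb the best."""
    for _ in range(rounds):
        for c in population:
            for _ in range(steps_per_round):
                train_step(c)
        ranked = sorted(population, key=lambda c: c.quality, reverse=True)
        best, worst = ranked[0], ranked[-1]
        if worst is not best:
            # Exploit (assumed): copy parameters and hyperparameters from the best.
            worst.theta, worst.lr, worst.quality = best.theta, best.lr, best.quality
            # Explore (assumed): randomly perturb the copied hyperparameter.
            worst.lr *= rng.choice([0.8, 1.2])
    return max(population, key=lambda c: c.quality)
```

For example, starting three candidates at `theta=1.0` with learning rates 0.001, 0.01, and 0.1 and calling `pbt(population, rounds=5, steps_per_round=5, rng=random.Random(0))` replaces the two slow learners with perturbed copies of the fast one within the first two rounds.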
Training action-selection neural networks from demonstrations using multiple losses

Patent number: 11604941

Abstract: A method of training an action selection neural network to perform a demonstrated task using a supervised learning technique. The action selection neural network is configured to receive demonstration data comprising actions to perform the task and rewards received for performing the actions. The action selection neural network has auxiliary prediction task neural networks on one or more of its intermediate outputs. The action selection neural network is trained using multiple combined losses, concurrently with the auxiliary prediction task neural networks.

Type: Grant

Filed: October 29, 2018

Date of Patent: March 14, 2023

Assignee: DeepMind Technologies Limited

Inventor: Todd Andrew Hester
Training action selection neural networks using leave-one-out-updates

Patent number: 11604997

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network. The policy neural network is used to select actions to be performed by an agent that interacts with an environment by receiving an observation characterizing a state of the environment and performing an action from a set of actions in response to the received observation. A trajectory is obtained from a replay memory, and a final update to current values of the policy network parameters is determined for each training observation in the trajectory. The final updates to the current values of the policy network parameters are determined from selected action updates and leave-one-out updates.

Type: Grant

Filed: June 11, 2018

Date of Patent: March 14, 2023

Assignee: DeepMind Technologies Limited

Inventors: Marc Gendron-Bellemare, Mohammad Gheshlaghi Azar, Audrunas Gruslys, Remi Munos
Augmented recurrent neural network with external memory

Patent number: 11593640

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from the neural network output for the time step as a system output for the time step; maintaining a current state of the external memory; determining, from the neural network output for the time step, memory state parameters for the time step; updating the current state of the external memory using the memory state parameters for the time step; reading data from the external memory in accordance with the updated state of the external memory; and combining the data read from the external memory with a system input for the next time step to generate the neural network input for the next time step.

Type: Grant

Filed: September 9, 2019

Date of Patent: February 28, 2023

Assignee: DeepMind Technologies Limited

Inventors: Edward Thomas Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Philip Blunsom
Distributed training using actor-critic reinforcement learning with off-policy correction factors

Patent number: 11593646

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.

Type: Grant

Filed: February 5, 2019

Date of Patent: February 28, 2023

Assignee: DeepMind Technologies Limited

Inventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
Scene understanding and generation using neural networks

Patent number: 11587344

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for image rendering. In one aspect, a method comprises receiving a plurality of observations characterizing a particular scene, each observation comprising an image of the particular scene and data identifying a location of a camera that captured the image. In another aspect, the method comprises receiving a plurality of observations characterizing a particular video, each observation comprising a video frame from the particular video and data identifying a time stamp of the video frame in the particular video. In yet another aspect, the method comprises receiving a plurality of observations characterizing a particular image, each observation comprising a crop of the particular image and data characterizing the crop of the particular image. The method processes each of the plurality of observations using an observation neural network to determine a numeric representation as output.

Type: Grant

Filed: May 3, 2019

Date of Patent: February 21, 2023

Assignee: DeepMind Technologies Limited

Inventors: Danilo Jimenez Rezende, Seyed Mohammadali Eslami, Karol Gregor, Frederic Olivier Besse
Reinforcement learning using a relational network for generating data encoding relationships between entities in an environment

Patent number: 11580429

Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.

Type: Grant

Filed: May 20, 2019

Date of Patent: February 14, 2023

Assignee: DeepMind Technologies Limited

Inventors: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro
Parallel video processing neural networks

Patent number: 11580736

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for parallel processing of video frames using neural networks. One of the methods includes receiving a video sequence comprising a respective video frame at each of a plurality of time steps; and processing the video sequence using a video processing neural network to generate a video processing output for the video sequence, wherein the video processing neural network includes a sequence of network components, wherein the network components comprise a plurality of layer blocks each comprising one or more neural network layers, wherein each component is active for a respective subset of the plurality of time steps, and wherein each layer block is configured to, at each time step at which the layer block is active, receive an input generated at a previous time step and to process the input to generate a block output.

Type: Grant

Filed: January 7, 2019

Date of Patent: February 14, 2023

Assignee: DeepMind Technologies Limited

Inventors: Simon Osindero, Joao Carreira, Viorica Patraucean, Andrew Zisserman
Learning observation representations by predicting the future in latent space

Patent number: 11568207

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network that is configured to process an input observation to generate a latent representation of the input observation. In one aspect, a method includes: obtaining a sequence of observations; for each observation in the sequence of observations, processing the observation using the encoder neural network to generate a latent representation of the observation; for each of one or more given observations in the sequence of observations: generating a context latent representation of the given observation; and generating, from the context latent representation of the given observation, a respective estimate of the latent representations of one or more particular observations that are after the given observation in the sequence of observations.

Type: Grant

Filed: September 27, 2019

Date of Patent: January 31, 2023

Assignee: DeepMind Technologies Limited

Inventors: Aaron Gerard Antonius van den Oord, Yazhe Li, Oriol Vinyals
Training neural networks using a prioritized experience memory

Patent number: 11568250

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network used to select actions performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes maintaining a replay memory, where the replay memory stores pieces of experience data generated as a result of the reinforcement learning agent interacting with the environment. Each piece of experience data is associated with a respective expected learning progress measure that is a measure of an expected amount of progress made in the training of the neural network if the neural network is trained on the piece of experience data. The method further includes selecting a piece of experience data from the replay memory by prioritizing for selection pieces of experience data having relatively higher expected learning progress measures and training the neural network on the selected piece of experience data.

Type: Grant

Filed: May 4, 2020

Date of Patent: January 31, 2023

Assignee: DeepMind Technologies Limited

Inventors: Tom Schaul, John Quan, David Silver
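The selection rule in the abstract of patent 11568250 can be sketched as a replay memory that samples experience with probability proportional to its expected learning progress measure. This is a minimal sketch, not the claimed method: the class name `PrioritizedReplay`, the exponent `alpha`, and the requirement that progress measures be positive are all assumptions made for illustration.

```python
import random


class PrioritizedReplay:
    """Toy replay memory that favors experience with higher expected progress."""

    def __init__(self, alpha=1.0, rng=None):
        self.items = []        # stored pieces of experience data
        self.priorities = []   # expected learning progress measure per item
        self.alpha = alpha     # assumed knob: 0 gives uniform sampling
        self.rng = rng or random.Random()

    def add(self, experience, progress):
        # `progress` is assumed to be a positive number.
        self.items.append(experience)
        self.priorities.append(progress)

    def sample(self):
        # Sampling weight grows with the expected learning progress measure.
        weights = [p ** self.alpha for p in self.priorities]
        index = self.rng.choices(range(len(self.items)), weights=weights, k=1)[0]
        return index, self.items[index]

    def update(self, index, progress):
        # After training on an item, its progress measure can be refreshed.
        self.priorities[index] = progress
```

A training loop would call `sample()`, train the network on the returned experience, and then call `update()` with a new progress estimate for that index.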
Recommending content using neural networks

Patent number: 11562209

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for content recommendation using neural networks. In one aspect, a method includes: receiving context information for an action recommendation from multiple possible actions; processing the context information using a neural network that includes Bayesian neural network layers to generate, for each of the actions, one or more parameters of a distribution over possible action scores for the action, where each parameter for each Bayesian layer is associated with data representing a probability distribution over multiple possible current values for the parameter; for each parameter of each Bayesian neural network layer, selecting the current value for the parameter using the data representing the probability distribution over possible current values for the parameter; and selecting an action from multiple possible actions using the parameters of the distributions over the possible action scores for the action.

Type: Grant

Filed: October 7, 2019

Date of Patent: January 24, 2023

Assignee: DeepMind Technologies Limited

Inventors: Charles Blundell, Julien Robert Michel Cornebise
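A rough sketch of the sampling step in the abstract of patent 11562209, under strong simplifying assumptions: the network is reduced to a single Bayesian linear layer per action, each weight has a hypothetical Gaussian distribution given as `(mean, std)`, and the distribution over action scores is collapsed to its mean. The names `sample_weights` and `recommend` are illustrative only.

```python
import random


def sample_weights(posterior, rng):
    """Select a current value for every parameter from its own distribution."""
    return {
        action: [rng.gauss(mean, std) for mean, std in params]
        for action, params in posterior.items()
    }


def recommend(context, posterior, rng):
    """Score each action with freshly sampled weights and pick the highest."""
    weights = sample_weights(posterior, rng)
    # Assumed simplification: one linear layer, score distribution reduced to its mean.
    mean_scores = {
        action: sum(w * x for w, x in zip(ws, context))
        for action, ws in weights.items()
    }
    return max(mean_scores, key=mean_scores.get)
```

Because the weights are resampled on every call, actions whose parameters are still uncertain are occasionally recommended, which gives exploration in the style of Thompson sampling.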
Dynamic placement of computation sub-graphs

Patent number: 11551144

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for assigning operations of a computational graph to a plurality of computing devices are disclosed. Data characterizing a computational graph is obtained. Context information for a computational environment in which to perform the operations of the computational graph is received. A model input is generated, which includes at least the context information and the data characterizing the computational graph. The model input is processed using a machine learning model to generate an output defining placement assignments of the operations of the computational graph to the plurality of computing devices. The operations of the computational graph are assigned to the plurality of computing devices according to the defined placement assignments.

Type: Grant

Filed: January 30, 2018

Date of Patent: January 10, 2023

Assignee: DeepMind Technologies Limited

Inventors: Jakob Nicolaus Foerster, Matthew Sharifi
Neural networks for selecting actions to be performed by a robotic agent

Patent number: 11534911

Abstract: A system includes a neural network system implemented by one or more computers. The neural network system is configured to receive an observation characterizing a current state of a real-world environment being interacted with by a robotic agent to perform a robotic task and to process the observation to generate a policy output that defines an action to be performed by the robotic agent in response to the observation. The neural network system includes: (i) a sequence of deep neural networks (DNNs), in which the sequence of DNNs includes a simulation-trained DNN that has been trained on interactions of a simulated version of the robotic agent with a simulated version of the real-world environment to perform a simulated version of the robotic task, and (ii) a first robot-trained DNN that is configured to receive the observation and to process the observation to generate the policy output.

Type: Grant

Filed: March 25, 2020

Date of Patent: December 27, 2022

Assignee: DeepMind Technologies Limited

Inventors: Razvan Pascanu, Raia Thais Hadsell, Mel Vecerik, Thomas Rothoerl, Andrei-Alexandru Rusu, Nicolas Manfred Otto Heess
Action selection for reinforcement learning using a manager neural network that generates goal vectors defining agent objectives

Patent number: 11537887

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of the multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.

Type: Grant

Filed: May 5, 2020

Date of Patent: December 27, 2022

Assignee: DeepMind Technologies Limited

Inventors: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets
Deep neural network system for similarity-based graph representations

Patent number: 11537719

Abstract: There is described a neural network system implemented by one or more computers for determining graph similarity. The neural network system comprises one or more neural networks configured to process an input graph to generate a node state representation vector for each node of the input graph and an edge representation vector for each edge of the input graph; and process the node state representation vectors and the edge representation vectors to generate a vector representation of the input graph. The neural network system further comprises one or more processors configured to: receive a first graph; receive a second graph; generate a vector representation of the first graph; generate a vector representation of the second graph; determine a similarity score for the first graph and the second graph based upon the vector representations of the first graph and the second graph.

Type: Grant

Filed: May 17, 2019

Date of Patent: December 27, 2022

Assignee: DeepMind Technologies Limited

Inventors: Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, Pushmeet Kohli
Training more secure neural networks by using local linearity regularization

Patent number: 11526755

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.

Type: Grant

Filed: May 22, 2020

Date of Patent: December 13, 2022

Assignee: DeepMind Technologies Limited

Inventors: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi
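The abstract of patent 11526755 does not give a formula for non-linearity. A minimal sketch, assuming it is measured as the gap between the loss at a perturbed input and its first-order Taylor approximation, |loss(x + d) - loss(x) - d . grad loss(x)|, could look as follows. The quadratic `loss`, its gradient, and all function names are toy assumptions standing in for a network loss viewed as a function of the training input.

```python
def loss(x):
    # Toy stand-in for the loss as a function of the input (parameters held fixed).
    return sum(v * v for v in x)


def grad_loss(x):
    # Analytic gradient of the toy loss with respect to the input.
    return [2.0 * v for v in x]


def nonlinearity(x, delta):
    """Gap between the true loss at x + delta and its linear approximation."""
    shifted = [a + d for a, d in zip(x, delta)]
    linear_term = sum(g * d for g, d in zip(grad_loss(x), delta))
    return abs(loss(shifted) - loss(x) - linear_term)


def most_nonlinear(x, candidates):
    """Identify, from the candidate perturbations, the maximally non-linear one."""
    return max(candidates, key=lambda d: nonlinearity(x, d))


def regularized_loss(x, candidates, weight=1.0):
    """Loss plus a penalty on the non-linearity at the worst perturbation."""
    worst = most_nonlinear(x, candidates)
    return loss(x) + weight * nonlinearity(x, worst)
```

A training iteration would then update the network parameters to decrease `regularized_loss`, which lowers both the ordinary loss and the non-linearity at the identified perturbation. The parameter update itself is not shown here.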
Distributed training of reinforcement learning systems

Patent number: 11507827

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed training of reinforcement learning systems. One of the methods includes receiving, by a learner, current values of the parameters of the Q network from a parameter server, wherein each learner maintains a respective learner Q network replica and a respective target Q network replica; updating, by the learner, the parameters of the learner Q network replica maintained by the learner using the current values; selecting, by the learner, an experience tuple from a respective replay memory; computing, by the learner, a gradient from the experience tuple using the learner Q network replica maintained by the learner and the target Q network replica maintained by the learner; and providing, by the learner, the computed gradient to the parameter server.

Type: Grant

Filed: October 14, 2019

Date of Patent: November 22, 2022

Assignee: DeepMind Technologies Limited

Inventors: Praveen Deepak Srinivasan, Rory Fearon, Cagdas Alcicek, Arun Sarath Nair, Samuel Blackwell, Vedavyas Panneershelvam, Alessandro De Maria, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Mustafa Suleyman
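The learner steps listed in the abstract of patent 11507827 can be sketched with a toy, single-process setup. Everything concrete here is an assumption made for illustration: the Q network is a lookup table, the gradient is sent as a sparse dictionary, the loss is the squared temporal-difference error, and the classes `ParameterServer` and `Learner` are hypothetical.

```python
import random


class ParameterServer:
    """Holds the current Q parameters and applies gradients sent by learners."""

    def __init__(self, num_states, num_actions, lr=0.5):
        self.params = [[0.0] * num_actions for _ in range(num_states)]
        self.lr = lr

    def get_params(self):
        return [row[:] for row in self.params]

    def apply_gradient(self, grad):
        for (state, action), g in grad.items():
            self.params[state][action] -= self.lr * g


class Learner:
    def __init__(self, server, replay, gamma=0.9, rng=None):
        self.server, self.replay, self.gamma = server, replay, gamma
        self.rng = rng or random.Random()
        self.q = server.get_params()         # learner Q network replica
        self.target_q = server.get_params()  # target Q network replica

    def step(self):
        # 1. Receive current parameter values and update the learner replica.
        self.q = self.server.get_params()
        # 2. Select an experience tuple from the replay memory.
        s, a, r, s_next, done = self.rng.choice(self.replay)
        # 3. Compute the gradient of 0.5 * td_error**2 using both replicas.
        bootstrap = 0.0 if done else self.gamma * max(self.target_q[s_next])
        td_error = r + bootstrap - self.q[s][a]
        grad = {(s, a): -td_error}
        # 4. Provide the gradient to the parameter server.
        self.server.apply_gradient(grad)

    def sync_target(self):
        # Periodically refresh the target replica from the server.
        self.target_q = self.server.get_params()
```

Several `Learner` instances can share one `ParameterServer`. In a real deployment they would run on separate machines and communicate over a network rather than through direct method calls.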
