Patents Assigned to DeepMind Technologies

Distributed training using actor-critic reinforcement learning with off-policy correction factors

Patent number: 11593646

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor-critic reinforcement learning technique.

Type: Grant

Filed: February 5, 2019

Date of Patent: February 28, 2023

Assignee: DeepMind Technologies Limited

Inventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning

Scene understanding and generation using neural networks

Patent number: 11587344

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for image rendering. In one aspect, a method comprises receiving a plurality of observations characterizing a particular scene, each observation comprising an image of the particular scene and data identifying a location of a camera that captured the image. In another aspect, the method comprises receiving a plurality of observations characterizing a particular video, each observation comprising a video frame from the particular video and data identifying a time stamp of the video frame in the particular video. In yet another aspect, the method comprises receiving a plurality of observations characterizing a particular image, each observation comprising a crop of the particular image and data characterizing the crop of the particular image. The method processes each of the plurality of observations using an observation neural network to determine a numeric representation as output.

Type: Grant

Filed: May 3, 2019

Date of Patent: February 21, 2023

Assignee: DeepMind Technologies Limited

Inventors: Danilo Jimenez Rezende, Seyed Mohammadali Eslami, Karol Gregor, Frederic Olivier Besse

Parallel video processing neural networks

Patent number: 11580736

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for parallel processing of video frames using neural networks. One of the methods includes receiving a video sequence comprising a respective video frame at each of a plurality of time steps; and processing the video sequence using a video processing neural network to generate a video processing output for the video sequence, wherein the video processing neural network includes a sequence of network components, wherein the network components comprise a plurality of layer blocks each comprising one or more neural network layers, wherein each component is active for a respective subset of the plurality of time steps, and wherein each layer block is configured to, at each time step at which the layer block is active, receive an input generated at a previous time step and to process the input to generate a block output.

Type: Grant

Filed: January 7, 2019

Date of Patent: February 14, 2023

Assignee: DeepMind Technologies Limited

Inventors: Simon Osindero, Joao Carreira, Viorica Patraucean, Andrew Zisserman
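As a rough illustration of the pipelined scheme described in the abstract for patent 11580736, the sketch below runs a sequence of layer blocks so that, at each time step, every block consumes the output its predecessor produced at the previous time step. The function `run_pipelined` and the toy blocks are hypothetical names for this example. The sketch also assumes that every block is active at every time step and that the first block reads the current frame directly; neither assumption is taken from the patent.

```python
from typing import Callable, List, Optional, Sequence

Block = Callable[[float], float]


def run_pipelined(frames: Sequence[float], blocks: Sequence[Block]) -> List[Optional[float]]:
    """Simulate depth-parallel processing of a frame sequence.

    At time step t, block 0 reads frame t (an assumption of this sketch), and
    block i > 0 reads the output that block i-1 produced at time step t-1.
    No block waits on a result from the same time step, so in a real system
    all blocks could run concurrently within a step.

    Returns the last block's output at each time step, or None while the
    pipeline is still filling.
    """
    prev_outputs: List[Optional[float]] = [None] * len(blocks)
    final_outputs: List[Optional[float]] = []
    for frame in frames:
        new_outputs: List[Optional[float]] = []
        for i, block in enumerate(blocks):
            block_input = frame if i == 0 else prev_outputs[i - 1]
            new_outputs.append(None if block_input is None else block(block_input))
        prev_outputs = new_outputs
        final_outputs.append(prev_outputs[-1])
    return final_outputs


# Three toy blocks standing in for groups of neural network layers.
toy_blocks = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x - 3.0]
outputs = run_pipelined([0.0, 1.0, 2.0, 3.0], toy_blocks)
```

With three blocks, the result for a given frame appears two time steps after the frame arrives, which is the latency paid for letting the blocks work in parallel.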

Reinforcement learning using a relational network for generating data encoding relationships between entities in an environment

Patent number: 11580429

Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.

Type: Grant

Filed: May 20, 2019

Date of Patent: February 14, 2023

Assignee: DeepMind Technologies Limited

Inventors: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro

Training neural networks using a prioritized experience memory

Patent number: 11568250

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network used to select actions performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes maintaining a replay memory, where the replay memory stores pieces of experience data generated as a result of the reinforcement learning agent interacting with the environment. Each piece of experience data is associated with a respective expected learning progress measure that is a measure of an expected amount of progress made in the training of the neural network if the neural network is trained on the piece of experience data. The method further includes selecting a piece of experience data from the replay memory by prioritizing for selection pieces of experience data having relatively higher expected learning progress measures and training the neural network on the selected piece of experience data.

Type: Grant

Filed: May 4, 2020

Date of Patent: January 31, 2023

Assignee: DeepMind Technologies Limited

Inventors: Tom Schaul, John Quan, David Silver
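The selection step in the abstract for patent 11568250 can be sketched as sampling in proportion to a stored priority. This is a minimal illustration, not the patented method: the class `PrioritizedReplayMemory` is a hypothetical name, the `priority` value stands in for the expected learning progress measure, and proportional sampling is only one assumed way to prioritize pieces with relatively higher measures.

```python
import random
from typing import Any, List, Tuple


class PrioritizedReplayMemory:
    """Replay memory whose items are sampled in proportion to a priority."""

    def __init__(self, seed: int = 0) -> None:
        self._items: List[Any] = []
        self._priorities: List[float] = []
        self._rng = random.Random(seed)

    def add(self, experience: Any, priority: float) -> None:
        # `priority` stands in for the expected learning progress measure.
        if priority <= 0:
            raise ValueError("priority must be positive")
        self._items.append(experience)
        self._priorities.append(priority)

    def sample(self) -> Tuple[int, Any]:
        # choices() draws each index with probability proportional to its weight.
        index = self._rng.choices(
            range(len(self._items)), weights=self._priorities, k=1
        )[0]
        return index, self._items[index]

    def update_priority(self, index: int, priority: float) -> None:
        # Called after training on an item, once its measure has been re-estimated.
        if priority <= 0:
            raise ValueError("priority must be positive")
        self._priorities[index] = priority


memory = PrioritizedReplayMemory(seed=0)
memory.add("low-progress transition A", priority=0.1)
memory.add("low-progress transition B", priority=0.1)
memory.add("high-progress transition", priority=5.0)

counts = [0, 0, 0]
for _ in range(1000):
    index, _experience = memory.sample()
    counts[index] += 1
```

In this toy run the third item carries most of the total priority, so it dominates the draws while the low-priority items are still sampled occasionally.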

Learning observation representations by predicting the future in latent space

Patent number: 11568207

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network that is configured to process an input observation to generate a latent representation of the input observation. In one aspect, a method includes: obtaining a sequence of observations; for each observation in the sequence of observations, processing the observation using the encoder neural network to generate a latent representation of the observation; for each of one or more given observations in the sequence of observations: generating a context latent representation of the given observation; and generating, from the context latent representation of the given observation, a respective estimate of the latent representations of one or more particular observations that are after the given observation in the sequence of observations.

Type: Grant

Filed: September 27, 2019

Date of Patent: January 31, 2023

Assignee: DeepMind Technologies Limited

Inventors: Aaron Gerard Antonius van den Oord, Yazhe Li, Oriol Vinyals

Recommending content using neural networks

Patent number: 11562209

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for content recommendation using neural networks. In one aspect, a method includes: receiving context information for an action recommendation from multiple possible actions; processing the context information using a neural network that includes Bayesian neural network layers to generate, for each of the actions, one or more parameters of a distribution over possible action scores for the action, where each parameter for each Bayesian layer is associated with data representing a probability distribution over multiple possible current values for the parameter; for each parameter of each Bayesian neural network layer, selecting the current value for the parameter using the data representing the probability distribution over possible current values for the parameter; and selecting an action from multiple possible actions using the parameters of the distributions over the possible action scores for the action.

Type: Grant

Filed: October 7, 2019

Date of Patent: January 24, 2023

Assignee: DeepMind Technologies Limited

Inventors: Charles Blundell, Julien Robert Michel Cornebise
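The flow in the abstract for patent 11562209 (select a current value for each layer parameter from its distribution, score every action, then select an action) can be sketched with a single toy layer. Everything concrete here is an assumption made for illustration: the Gaussian form of each parameter distribution, the single linear layer, the use of one output per action as the mean of its score distribution, and the names `sample_weights` and `select_action`.

```python
import random
from typing import List, Sequence, Tuple

# posterior[a][j] = (mean, std) of a Gaussian over the weight linking context
# feature j to action a. The Gaussian form is an assumption of this sketch.
WeightPosterior = List[List[Tuple[float, float]]]


def sample_weights(posterior: WeightPosterior, rng: random.Random) -> List[List[float]]:
    """Select a current value for every parameter from its distribution."""
    return [[rng.gauss(mean, std) for mean, std in row] for row in posterior]


def select_action(
    context: Sequence[float], posterior: WeightPosterior, rng: random.Random
) -> int:
    """Sample the layer once, score every action, and pick the highest score."""
    weights = sample_weights(posterior, rng)
    # Each output is treated as the mean of that action's score distribution.
    score_means = [sum(w * x for w, x in zip(row, context)) for row in weights]
    return max(range(len(score_means)), key=score_means.__getitem__)


rng = random.Random(0)
context = [1.0, 0.5]
posterior = [
    [(0.2, 0.0), (0.1, 0.0)],  # action 0
    [(0.9, 0.0), (0.4, 0.0)],  # action 1
    [(0.3, 0.0), (0.3, 0.0)],  # action 2
]
chosen = select_action(context, posterior, rng)
```

The standard deviations are set to zero here so the example is deterministic. With nonzero values, repeated calls can select different actions, so actions whose scores are uncertain still get tried.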

Dynamic placement of computation sub-graphs

Patent number: 11551144

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for assigning operations of a computational graph to a plurality of computing devices are disclosed. Data characterizing a computational graph is obtained. Context information for a computational environment in which to perform the operations of the computational graph is received. A model input is generated, which includes at least the context information and the data characterizing the computational graph. The model input is processed using a machine learning model to generate an output defining placement assignments of the operations of the computational graph to the plurality of computing devices. The operations of the computational graph are assigned to the plurality of computing devices according to the defined placement assignments.

Type: Grant

Filed: January 30, 2018

Date of Patent: January 10, 2023

Assignee: DeepMind Technologies Limited

Inventors: Jakob Nicolaus Foerster, Matthew Sharifi

Action selection for reinforcement learning using a manager neural network that generates goal vectors defining agent objectives

Patent number: 11537887

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of the multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.

Type: Grant

Filed: May 5, 2020

Date of Patent: December 27, 2022

Assignee: DeepMind Technologies Limited

Inventors: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets

Neural networks for selecting actions to be performed by a robotic agent

Patent number: 11534911

Abstract: A system includes a neural network system implemented by one or more computers. The neural network system is configured to receive an observation characterizing a current state of a real-world environment being interacted with by a robotic agent to perform a robotic task and to process the observation to generate a policy output that defines an action to be performed by the robotic agent in response to the observation. The neural network system includes: (i) a sequence of deep neural networks (DNNs), in which the sequence of DNNs includes a simulation-trained DNN that has been trained on interactions of a simulated version of the robotic agent with a simulated version of the real-world environment to perform a simulated version of the robotic task, and (ii) a first robot-trained DNN that is configured to receive the observation and to process the observation to generate the policy output.

Type: Grant

Filed: March 25, 2020

Date of Patent: December 27, 2022

Assignee: DeepMind Technologies Limited

Inventors: Razvan Pascanu, Raia Thais Hadsell, Mel Vecerik, Thomas Rothoerl, Andrei-Alexandru Rusu, Nicolas Manfred Otto Heess

Deep neural network system for similarity-based graph representations

Patent number: 11537719

Abstract: There is described a neural network system implemented by one or more computers for determining graph similarity. The neural network system comprises one or more neural networks configured to process an input graph to generate a node state representation vector for each node of the input graph and an edge representation vector for each edge of the input graph; and process the node state representation vectors and the edge representation vectors to generate a vector representation of the input graph. The neural network system further comprises one or more processors configured to: receive a first graph; receive a second graph; generate a vector representation of the first graph; generate a vector representation of the second graph; and determine a similarity score for the first graph and the second graph based upon the vector representations of the first graph and the second graph.

Type: Grant

Filed: May 17, 2019

Date of Patent: December 27, 2022

Assignee: DeepMind Technologies Limited

Inventors: Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, Pushmeet Kohli

Training more secure neural networks by using local linearity regularization

Patent number: 11526755

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.

Type: Grant

Filed: May 22, 2020

Date of Patent: December 13, 2022

Assignee: DeepMind Technologies Limited

Inventors: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi

Distributed training of reinforcement learning systems

Patent number: 11507827

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed training of reinforcement learning systems. One of the methods includes receiving, by a learner, current values of the parameters of the Q network from a parameter server, wherein each learner maintains a respective learner Q network replica and a respective target Q network replica; updating, by the learner, the parameters of the learner Q network replica maintained by the learner using the current values; selecting, by the learner, an experience tuple from a respective replay memory; computing, by the learner, a gradient from the experience tuple using the learner Q network replica maintained by the learner and the target Q network replica maintained by the learner; and providing, by the learner, the computed gradient to the parameter server.

Type: Grant

Filed: October 14, 2019

Date of Patent: November 22, 2022

Assignee: DeepMind Technologies Limited

Inventors: Praveen Deepak Srinivasan, Rory Fearon, Cagdas Alcicek, Arun Sarath Nair, Samuel Blackwell, Vedavyas Panneershelvam, Alessandro De Maria, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Mustafa Suleyman
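The learner loop in the abstract for patent 11507827 can be sketched as four steps: receive parameters, pick an experience tuple, compute a gradient with the learner and target replicas, and send the gradient to the parameter server. The sketch below is a toy under stated assumptions and is not the patented system: it uses a linear Q function instead of a neural network, a squared temporal-difference loss, a single in-process learner, and the hypothetical names `ParameterServer` and `learner_step`.

```python
import random
from typing import List, Sequence, Tuple

Weights = List[List[float]]  # weights[action][feature] of a linear Q function
Experience = Tuple[Sequence[float], int, float, Sequence[float]]


def q_values(weights: Weights, observation: Sequence[float]) -> List[float]:
    """Return one Q value per action for a linear Q function."""
    return [sum(w * x for w, x in zip(row, observation)) for row in weights]


class ParameterServer:
    """Holds the shared Q parameters and applies gradients sent by learners."""

    def __init__(self, weights: Weights, learning_rate: float = 0.1) -> None:
        self.weights = [list(row) for row in weights]
        self.learning_rate = learning_rate

    def current_values(self) -> Weights:
        return [list(row) for row in self.weights]

    def apply_gradient(self, gradient: Weights) -> None:
        for row, grad_row in zip(self.weights, gradient):
            for j, g in enumerate(grad_row):
                row[j] -= self.learning_rate * g


def learner_step(
    server: ParameterServer,
    target_weights: Weights,
    replay_memory: Sequence[Experience],
    rng: random.Random,
    discount: float = 0.99,
) -> Weights:
    # 1. Receive current parameter values and refresh the learner replica.
    learner_weights = server.current_values()
    # 2. Select an experience tuple from this learner's replay memory.
    observation, action, reward, next_observation = rng.choice(replay_memory)
    # 3. Gradient of 0.5 * (Q(s, a) - y)^2 (assumed loss), where the target y
    #    is computed with the target Q network replica.
    target = reward + discount * max(q_values(target_weights, next_observation))
    td_error = q_values(learner_weights, observation)[action] - target
    gradient = [[0.0] * len(row) for row in learner_weights]
    gradient[action] = [td_error * x for x in observation]
    # 4. Provide the computed gradient to the parameter server.
    server.apply_gradient(gradient)
    return gradient


server = ParameterServer([[0.0, 0.0], [0.0, 0.0]])
target_replica = server.current_values()
replay = [([1.0, 0.0], 0, 1.0, [0.0, 1.0])]
gradient = learner_step(server, target_replica, replay, random.Random(0))
```

In a distributed setting each learner would run this step against its own replay memory and target replica, with the parameter server combining gradients from all of them.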

Distributional reinforcement learning for continuous control tasks

Patent number: 11481629

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.

Type: Grant

Filed: October 29, 2018

Date of Patent: October 25, 2022

Assignee: DeepMind Technologies Limited

Inventors: David Budden, Matthew William Hoffman, Gabriel Barth-Maron

Training action selection neural networks using apprenticeship

Patent number: 11468321

Abstract: An off-policy reinforcement learning actor-critic neural network system configured to select actions from a continuous action space to be performed by an agent interacting with an environment to perform a task. An observation defines environment state data and reward data. The system has an actor neural network which learns a policy function mapping the state data to action data. A critic neural network learns an action-value (Q) function. A replay buffer stores tuples of the state data, the action data, the reward data and new state data. The replay buffer also includes demonstration transition data comprising a set of the tuples from a demonstration of the task within the environment. The neural network system is configured to train the actor neural network and the critic neural network off-policy using stored tuples from the replay buffer comprising tuples both from operation of the system and from the demonstration transition data.

Type: Grant

Filed: June 28, 2018

Date of Patent: October 11, 2022

Assignee: DeepMind Technologies Limited

Inventors: Olivier Claude Pietquin, Martin Riedmiller, Wang Fumin, Bilal Piot, Mel Vecerik, Todd Andrew Hester, Thomas Rothoerl, Thomas Lampe, Nicolas Manfred Otto Heess, Jonathan Karl Scholz
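The replay buffer described in the abstract for patent 11468321 holds both demonstration tuples and tuples from the system's own operation. A minimal sketch of such a buffer follows. The class name `DemoReplayBuffer`, the choice to keep demonstrations permanently while evicting the oldest agent tuples, and uniform sampling over the combined pool are all assumptions of this example, not details stated in the abstract.

```python
import random
from collections import deque
from typing import Deque, List, Sequence, Tuple

# (state, action, reward, new_state); actions are continuous vectors.
Transition = Tuple[Sequence[float], Sequence[float], float, Sequence[float]]


class DemoReplayBuffer:
    """Replay buffer mixing fixed demonstration data with agent experience."""

    def __init__(
        self, demonstrations: Sequence[Transition], agent_capacity: int, seed: int = 0
    ) -> None:
        # Demonstration tuples are never evicted (assumption of this sketch).
        self._demos: List[Transition] = list(demonstrations)
        # Agent tuples are dropped oldest-first once capacity is reached.
        self._agent: Deque[Transition] = deque(maxlen=agent_capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._agent.append(transition)

    def contents(self) -> List[Transition]:
        return self._demos + list(self._agent)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling with replacement over demonstrations and agent data.
        pool = self.contents()
        return [self._rng.choice(pool) for _ in range(batch_size)]


demos = [([0.0], [0.5], 1.0, [0.1]), ([0.1], [0.4], 1.0, [0.2])]
buffer = DemoReplayBuffer(demos, agent_capacity=3)
for step in range(5):
    buffer.add(([float(step)], [0.0], 0.0, [float(step + 1)]))
batch = buffer.sample(batch_size=4)
```

An off-policy actor and critic could then be trained on each `batch`, which may contain tuples from either source.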

Generating output examples using bit blocks

Patent number: 11468295

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output examples using neural networks. One of the methods includes receiving a request to generate an output example of a particular type, accessing dependency data, and generating the output example by, at each of a plurality of generation time steps: identifying one or more current blocks for the generation time step, wherein each current block is a block for which the values of the bits in all of the other blocks identified in the dependency for the block have already been generated; and generating the values of the bits in the current blocks for the generation time step conditioned on, for each current block, the already generated values of the bits in the other blocks identified in the dependency for the current block.

Type: Grant

Filed: May 21, 2018

Date of Patent: October 11, 2022

Assignee: DeepMind Technologies Limited

Inventors: Nal Emmerich Kalchbrenner, Karen Simonyan, Erich Konrad Elsen
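The scheduling rule in the abstract for patent 11468295 (a block becomes current once every block it depends on has been generated) can be sketched as a layered ordering over the dependency data. The sketch covers only the ordering. In the patent the bit values themselves are produced by a neural network, which is omitted here, and the function name `generation_schedule` and the dictionary layout of the dependency data are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def generation_schedule(dependencies: Dict[str, Sequence[str]]) -> List[List[str]]:
    """Group blocks into generation time steps.

    `dependencies[block]` lists the other blocks that `block` is conditioned
    on. At each step, the current blocks are those whose dependencies have
    all been generated in earlier steps, so they can be generated together.
    """
    generated: Set[str] = set()
    remaining: Set[str] = set(dependencies)
    schedule: List[List[str]] = []
    while remaining:
        current = sorted(
            block
            for block in remaining
            if all(dep in generated for dep in dependencies[block])
        )
        if not current:
            raise ValueError("dependency data is cyclic or refers to unknown blocks")
        schedule.append(current)
        generated.update(current)
        remaining.difference_update(current)
    return schedule


# Hypothetical dependency data for four bit blocks.
example_dependencies = {
    "block0": [],
    "block1": ["block0"],
    "block2": ["block0"],
    "block3": ["block1", "block2"],
}
steps = generation_schedule(example_dependencies)
```

Here `block1` and `block2` share a generation time step because both depend only on `block0`, so four blocks are generated in three steps rather than four.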

Generating images using neural networks

Patent number: 11462034

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating images using neural networks. One of the methods includes generating the output image pixel by pixel from a sequence of pixels taken from the output image, comprising, for each pixel in the output image, generating a respective score distribution over a discrete set of possible color values for each of the plurality of color channels.

Type: Grant

Filed: March 10, 2021

Date of Patent: October 4, 2022

Assignee: DeepMind Technologies Limited

Inventors: Aaron Gerard Antonius van den Oord, Nal Emmerich Kalchbrenner, Karen Simonyan

Training action selection neural networks using look-ahead search

Patent number: 11449750

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network. One of the methods includes receiving an observation characterizing a current state of the environment; determining a target network output for the observation by performing a look-ahead search of possible future states of the environment starting from the current state until the environment reaches a possible future state that satisfies one or more termination criteria, wherein the look-ahead search is guided by the neural network in accordance with current values of the network parameters; selecting an action to be performed by the agent in response to the observation using the target network output generated by performing the look-ahead search; and storing, in an exploration history data store, the target network output in association with the observation for use in updating the current values of the network parameters.

Type: Grant

Filed: May 28, 2018

Date of Patent: September 20, 2022

Assignee: DeepMind Technologies Limited

Inventors: Karen Simonyan, David Silver, Julian Schrittwieser

Sampling latent variables to generate multiple segmentations of an image

Patent number: 11430123

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a plurality of possible segmentations of an image. In one aspect, a method comprises: receiving a request to generate a plurality of possible segmentations of an image; sampling a plurality of latent variables from a latent space, wherein each latent variable is sampled from the latent space in accordance with a respective probability distribution over the latent space that is determined based on the image; generating a plurality of possible segmentations of the image, comprising, for each latent variable, processing the image and the latent variable using a segmentation neural network having a plurality of segmentation neural network parameters to generate the possible segmentation of the image; and providing the plurality of possible segmentations of the image in response to the request.

Type: Grant

Filed: May 22, 2020

Date of Patent: August 30, 2022

Assignee: DeepMind Technologies Limited

Inventors: Simon Kohl, Bernardino Romera-Paredes, Danilo Jimenez Rezende, Seyed Mohammadali Eslami, Pushmeet Kohli, Andrew Zisserman, Olaf Ronneberger

Evaluating reinforcement learning policies

Patent number: 11429898

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for evaluating reinforcement learning policies. One of the methods includes receiving a plurality of training histories for a reinforcement learning agent; determining a total reward for each training observation in the training histories; partitioning the training observations into a plurality of partitions; determining, for each partition and from the partitioned training observations, a probability that the reinforcement learning agent will receive the total reward for the partition if the reinforcement learning agent performs the action for the partition in response to receiving the current observation; determining, from the probabilities and for each total reward, a respective estimated value of performing each action in response to receiving the current observation; and selecting an action from the pre-determined set of actions from the estimated values in accordance with an action selection policy.

Type: Grant

Filed: October 14, 2019

Date of Patent: August 30, 2022

Assignee: DeepMind Technologies Limited

Inventors: Joel William Veness, Marc Gendron-Bellemare
