Patents Assigned to DeepMind Technologies

Environment prediction using reinforcement learning

Patent number: 10733501

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for prediction of an outcome related to an environment. In one aspect, a system comprises a state representation neural network that is configured to: receive an observation characterizing a state of an environment being interacted with by an agent and process the observation to generate an internal state representation of the environment state; a prediction neural network that is configured to receive a current internal state representation of a current environment state and process the current internal state representation to generate a predicted subsequent state representation of a subsequent state of the environment and a predicted reward for the subsequent state; and a value prediction neural network that is configured to receive a current internal state representation of a current environment state and process the current internal state representation to generate a value prediction.

Type: Grant

Filed: May 3, 2019

Date of Patent: August 4, 2020

Assignee: DeepMind Technologies Limited

Inventors: David Silver, Tom Schaul, Matteo Hessel, Hado Philip van Hasselt

Training reinforcement learning neural networks

Patent number: 10733504

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a Q network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a plurality of experience tuples and training the Q network on each of the experience tuples using the Q network and a target Q network that is identical to the Q network but with the current values of the parameters of the target Q network being different from the current values of the parameters of the Q network.

Type: Grant

Filed: September 9, 2016

Date of Patent: August 4, 2020

Assignee: DeepMind Technologies Limited

Inventors: Hado Philip van Hasselt, Arthur Clément Guez
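The abstract above describes training with two copies of the same network that differ only in their current parameter values. A minimal sketch of how such a pair could be used to build a training target follows. It assumes a Double Q-learning style rule (the online network picks the next action, the target network scores it), which the abstract itself does not spell out. The function name `double_q_target`, the argument layout, and the discount value are hypothetical.

```python
from typing import Callable, Sequence

# A Q function maps a state to one Q value per action (toy stand-in for a network).
QFunction = Callable[[Sequence[float]], Sequence[float]]


def double_q_target(q_online: QFunction, q_target: QFunction,
                    reward: float, next_state: Sequence[float],
                    done: bool, discount: float = 0.99) -> float:
    """Training target for one experience tuple (assumed Double Q-learning rule)."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    online_values = q_online(next_state)
    # The online network selects the action ...
    best_action = max(range(len(online_values)), key=online_values.__getitem__)
    # ... and the target network evaluates it.
    return reward + discount * q_target(next_state)[best_action]


# Toy stand-ins: same "architecture", different parameter values.
def q_online(state):
    return [0.2, 0.9, 0.1]


def q_target(state):
    return [1.0, 0.5, 2.0]


target = double_q_target(q_online, q_target, reward=1.0,
                         next_state=[0.0], done=False, discount=0.5)
```

Here the online network prefers action 1, so the target uses the target network's value for action 1 rather than its own maximum.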
Processing text sequences using neural networks

Patent number: 10733390

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling. In one aspect, a system comprises: a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers and is configured to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps; and a modeling engine that is configured to use the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language.

Type: Grant

Filed: June 7, 2019

Date of Patent: August 4, 2020

Assignee: DeepMind Technologies Limited

Inventors: Nal Emmerich Kalchbrenner, Karen Simonyan, Lasse Espeholt

AUGMENTING NEURAL NETWORKS WITH EXTERNAL MEMORY

Publication number: 20200226446

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from a first portion of a neural network output as a system output; determining one or more sets of writing weights for each of a plurality of locations in an external memory; writing data defined by a third portion of the neural network output to the external memory in accordance with the sets of writing weights; determining one or more sets of reading weights for each of the plurality of locations in the external memory from a fourth portion of the neural network output; reading data from the external memory in accordance with the sets of reading weights; and combining the data read from the external memory with a next system input to generate the next neural network input.

Type: Application

Filed: March 26, 2020

Publication date: July 16, 2020

Applicant: DeepMind Technologies Limited

Inventors: Alexander Benjamin Graves, Ivo Danihelka, Gregory Duncan Wayne
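The write, read, and combine steps in the abstract above can be illustrated with a small sketch. It assumes a purely additive write and a weighted-sum read over memory locations, and it combines the read data with the next system input by concatenation. None of these choices is stated in the abstract, and the names `write_memory` and `read_memory` are hypothetical.

```python
from typing import List

Memory = List[List[float]]  # one row of values per memory location


def write_memory(memory: Memory, write_weights: List[float],
                 write_vector: List[float]) -> Memory:
    """Add the write vector to every location, scaled by that location's weight."""
    return [[cell + w * v for cell, v in zip(row, write_vector)]
            for row, w in zip(memory, write_weights)]


def read_memory(memory: Memory, read_weights: List[float]) -> List[float]:
    """Return the weighted sum of all locations."""
    width = len(memory[0])
    return [sum(w * row[j] for row, w in zip(memory, read_weights))
            for j in range(width)]


memory = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
memory = write_memory(memory, write_weights=[0.0, 1.0, 0.0], write_vector=[2.0, 4.0])
read_vector = read_memory(memory, read_weights=[0.5, 0.5, 0.0])

# Combine the data read from memory with the next system input (assumed: concatenation).
next_system_input = [7.0]
next_network_input = next_system_input + read_vector
```

In a full system the weights and the write vector would come from designated portions of the neural network output; here they are fixed by hand.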
Image generation using subscaling and depth up-scaling

Patent number: 10713755

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output images. One of the methods includes obtaining data specifying (i) a partitioning of the H by W pixel grid of the output image into K disjoint, interleaved sub-images and (ii) an ordering of the sub-images; and generating intensity values sub-image by sub-image, comprising: for each particular color channel for each particular pixel in each particular sub-image, generating, using a generative neural network, the intensity value for the particular color channel conditioned on intensity values for (i) any pixels that are in sub-images that are before the particular sub-image in the ordering, (ii) any pixels within the particular sub-image that are before the particular pixel in a raster-scan order over the output image, and (iii) the particular pixel for any color channels that are before the particular color channel in a color channel order.

Type: Grant

Filed: September 27, 2019

Date of Patent: July 14, 2020

Assignee: DeepMind Technologies Limited

Inventors: Nal Emmerich Kalchbrenner, Jacob Lee Menick
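The generation order described in the abstract above (sub-image by sub-image, raster scan within each sub-image) can be sketched as a plain index computation. The sketch assumes the K sub-images are formed by taking every `stride`-th pixel in each direction, so K = stride × stride, and that sub-images are ordered row-major by their offsets. The abstract only requires some interleaved partitioning and some ordering. The function name `subscale_order` is hypothetical.

```python
from typing import List, Tuple


def subscale_order(height: int, width: int, stride: int) -> List[Tuple[int, int]]:
    """Pixel coordinates in generation order for an assumed stride-based partition."""
    order = []
    for row_offset in range(stride):          # sub-image ordering (assumed row-major)
        for col_offset in range(stride):
            # Raster-scan order within the sub-image with this offset.
            for row in range(row_offset, height, stride):
                for col in range(col_offset, width, stride):
                    order.append((row, col))
    return order


order = subscale_order(height=4, width=4, stride=2)
first_sub_image = order[:4]
```

For a 4 by 4 grid with stride 2, the first sub-image holds pixels (0, 0), (0, 2), (2, 0), and (2, 2). Every later pixel can be conditioned on these, and the color channels of each pixel would then be generated one at a time in channel order.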
Recurrent environment predictors

Patent number: 10713559

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

Type: Grant

Filed: May 3, 2019

Date of Patent: July 14, 2020

Assignee: DeepMind Technologies Limited

Inventors: Daniel Pieter Wierstra, Shakir Mohamed, Silvia Chiappa, Sebastien Henri Andre Racaniere

Training action selection neural networks using off-policy actor critic reinforcement learning

Patent number: 10706352

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.

Type: Grant

Filed: May 3, 2019

Date of Patent: July 7, 2020

Assignee: DeepMind Technologies Limited

Inventors: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst

Augmenting neural networks to generate additional outputs

Patent number: 10691997

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks to generate additional outputs. One of the systems includes a neural network and a sequence processing subsystem, wherein the sequence processing subsystem is configured to perform operations comprising, for each of the system inputs in a sequence of system inputs: receiving the system input; generating an initial neural network input from the system input; causing the neural network to process the initial neural network input to generate an initial neural network output for the system input; and determining, from a first portion of the initial neural network output for the system input, whether or not to cause the neural network to generate one or more additional neural network outputs for the system input.

Type: Grant

Filed: December 21, 2015

Date of Patent: June 23, 2020

Assignee: DeepMind Technologies Limited

Inventors: Alexander Benjamin Graves, Ivo Danihelka, Gregory Duncan Wayne

Action selection for reinforcement learning using neural networks

Patent number: 10679126

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.

Type: Grant

Filed: July 15, 2019

Date of Patent: June 9, 2020

Assignee: DeepMind Technologies Limited

Inventors: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets

Committed information rate variational autoencoders

Patent number: 10671889

Abstract: A variational autoencoder (VAE) neural network system, comprising an encoder neural network to encode an input data item to define a posterior distribution for a set of latent variables, and a decoder neural network to generate an output data item representing values of a set of latent variables sampled from the posterior distribution. The system is configured for training with an objective function including a term dependent on a difference between the posterior distribution and a prior distribution. The prior and posterior distributions are arranged so that they cannot be matched to one another. The VAE system may be used for compressing and decompressing data.

Type: Grant

Filed: September 27, 2019

Date of Patent: June 2, 2020

Assignee: DeepMind Technologies Limited

Inventors: Benjamin Poole, Aaron Gerard Antonius van den Oord, Ali Razavi-Nematollahi, Oriol Vinyals

Data-efficient reinforcement learning for continuous control tasks

Patent number: 10664725

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.

Type: Grant

Filed: July 31, 2019

Date of Patent: May 26, 2020

Assignee: DeepMind Technologies Limited

Inventors: Martin Riedmiller, Roland Hafner, Mel Vecerik, Timothy Paul Lillicrap, Thomas Lampe, Ivaylo Popov, Gabriel Barth-Maron, Nicolas Manfred Otto Heess

Neural episodic control

Patent number: 10664753

Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.

Type: Grant

Filed: June 19, 2019

Date of Patent: May 26, 2020

Assignee: DeepMind Technologies Limited

Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adria Puigdomenech Badia
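The per-action lookup in the abstract above can be sketched with plain lists. The sketch assumes squared Euclidean distance as the distance measure and an inverse-distance weighting of the p nearest return estimates. The abstract leaves both open. The embedding network is omitted, so the query is taken to be an already computed key embedding. The names `q_value_from_memory` and `select_action_from_memory` are hypothetical.

```python
from typing import Dict, List, Tuple

ActionMemory = Tuple[List[List[float]], List[float]]  # (key embeddings, return estimates)


def q_value_from_memory(keys: List[List[float]], returns: List[float],
                        query: List[float], p: int, delta: float = 1e-3) -> float:
    """Q value from the p nearest keys (assumed inverse-distance weighting)."""
    distances = [sum((k - q) ** 2 for k, q in zip(key, query)) for key in keys]
    nearest = sorted(range(len(keys)), key=distances.__getitem__)[:p]
    weights = [1.0 / (distances[i] + delta) for i in nearest]
    return sum(w * returns[i] for w, i in zip(weights, nearest)) / sum(weights)


def select_action_from_memory(memory: Dict[str, ActionMemory],
                              query: List[float], p: int):
    """Compute one Q value per action and pick the action with the highest one."""
    q_values = {action: q_value_from_memory(keys, returns, query, p)
                for action, (keys, returns) in memory.items()}
    return max(q_values, key=q_values.get), q_values


memory = {
    "left": ([[0.0, 0.0], [1.0, 1.0]], [1.0, 1.0]),
    "right": ([[0.0, 0.0], [5.0, 5.0]], [3.0, 0.0]),
}
action, q_values = select_action_from_memory(memory, query=[0.0, 0.0], p=1)
```

With p = 1 each action's Q value is simply the return estimate stored under its closest key, so "right" is selected here.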
Generative neural networks

Patent number: 10657436

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a neural network system. In one aspect, a neural network system includes a recurrent neural network that is configured to, for each time step of a predetermined number of time steps, receive a set of latent variables for the time step and process the latent variables to update a hidden state of the recurrent neural network; and a generative subsystem that is configured to, for each time step, generate the set of latent variables for the time step and provide the set of latent variables as input to the recurrent neural network; update a hidden canvas using the updated hidden state of the recurrent neural network; and, for a last time step, generate an output image using the updated hidden canvas for the last time step.

Type: Grant

Filed: January 7, 2019

Date of Patent: May 19, 2020

Assignee: DeepMind Technologies Limited

Inventors: Ivo Danihelka, Danilo Jimenez Rezende, Shakir Mohamed

Training neural networks using a prioritized experience memory

Patent number: 10650310

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network used to select actions performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes maintaining a replay memory, where the replay memory stores pieces of experience data generated as a result of the reinforcement learning agent interacting with the environment. Each piece of experience data is associated with a respective expected learning progress measure that is a measure of an expected amount of progress made in the training of the neural network if the neural network is trained on the piece of experience data. The method further includes selecting a piece of experience data from the replay memory by prioritizing for selection pieces of experience data having relatively higher expected learning progress measures and training the neural network on the selected piece of experience data.

Type: Grant

Filed: November 11, 2016

Date of Patent: May 12, 2020

Assignee: DeepMind Technologies Limited

Inventors: Tom Schaul, John Quan, David Silver
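Prioritized selection from a replay memory, as described in the abstract above, can be sketched as weighted random sampling. The sketch assumes that each stored piece of experience carries a non-negative priority standing in for its expected learning progress measure, and that selection probability is proportional to that priority raised to an exponent. Both are assumptions, and `sample_prioritized` is a hypothetical name.

```python
import random
from typing import List


def sample_prioritized(priorities: List[float], rng: random.Random,
                       exponent: float = 1.0) -> int:
    """Return the index of one experience, chosen with probability ∝ priority**exponent."""
    weights = [p ** exponent for p in priorities]
    return rng.choices(range(len(priorities)), weights=weights, k=1)[0]


rng = random.Random(0)

# Only the middle experience has non-zero priority, so it is always selected.
always_middle = [sample_prioritized([0.0, 5.0, 0.0], rng) for _ in range(20)]

# With mixed priorities, higher-priority experiences are drawn more often.
counts = [0, 0, 0]
for _ in range(1000):
    counts[sample_prioritized([1.0, 1.0, 8.0], rng)] += 1
```

An exponent of 0 would recover uniform sampling, while larger exponents concentrate training on the highest-priority experience.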
Augmenting neural networks with external memory

Patent number: 10650302

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from a first portion of a neural network output as a system output; determining one or more sets of writing weights for each of a plurality of locations in an external memory; writing data defined by a third portion of the neural network output to the external memory in accordance with the sets of writing weights; determining one or more sets of reading weights for each of the plurality of locations in the external memory from a fourth portion of the neural network output; reading data from the external memory in accordance with the sets of reading weights; and combining the data read from the external memory with a next system input to generate the next neural network input.

Type: Grant

Filed: October 16, 2015

Date of Patent: May 12, 2020

Assignee: DeepMind Technologies Limited

Inventors: Alexander Benjamin Graves, Ivo Danihelka, Gregory Duncan Wayne

Optimizing data center controls using neural networks

Patent number: 10643121

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving operational efficiency within a data center by modeling data center performance and predicting power usage efficiency. An example method receives a state input characterizing a current state of a data center. For each data center setting slate, the state input and the data center setting slate are processed through an ensemble of machine learning models. Each machine learning model is configured to receive and process the state input and the data center setting slate to generate an efficiency score that characterizes a predicted resource efficiency of the data center if the data center settings defined by the data center setting slate are adopted. The method selects, based on the efficiency scores for the data center setting slates, new values for the data center settings.

Type: Grant

Filed: January 19, 2017

Date of Patent: May 5, 2020

Assignee: DeepMind Technologies Limited

Inventors: Richard Andrew Evans, Jim Gao, Michael C. Ryan, Gabriel Dulac-Arnold, Jonathan Karl Scholz, Todd Andrew Hester
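The selection loop in the abstract above (score every candidate setting slate with every model in the ensemble, then choose new settings) can be sketched as follows. The sketch assumes the ensemble's scores are combined by a simple mean and that the highest-scoring slate is adopted. The abstract says only that selection is based on the efficiency scores. The toy models, the field names, and `select_slate` are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List

State = Dict[str, float]
Slate = Dict[str, float]
Model = Callable[[State, Slate], float]  # returns a predicted efficiency score


def select_slate(state: State, slates: List[Slate], models: List[Model]) -> Slate:
    """Pick the slate with the highest mean efficiency score across the ensemble."""
    def ensemble_score(slate: Slate) -> float:
        return mean(model(state, slate) for model in models)
    return max(slates, key=ensemble_score)


# Toy ensemble: each "model" prefers a slightly different cooling setpoint.
models = [
    lambda state, slate: -abs(slate["setpoint_c"] - 22.0),
    lambda state, slate: -abs(slate["setpoint_c"] - 24.0),
]
state = {"outside_temp_c": 15.0}
slates = [{"setpoint_c": 18.0}, {"setpoint_c": 22.0}, {"setpoint_c": 26.0}]

best = select_slate(state, slates, models)
```

An ensemble also makes it possible to look at the spread of scores per slate, for example to avoid slates on which the models disagree strongly, although that is not shown here.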
Training variational autoencoders to generate disentangled latent factors

Patent number: 10643131

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a variational auto-encoder (VAE) to generate disentangled latent factors on unlabeled training images. In one aspect, a method includes receiving the plurality of unlabeled training images, and, for each unlabeled training image, processing the unlabeled training image using the VAE to determine the latent representation of the unlabeled training image and to generate a reconstruction of the unlabeled training image in accordance with current values of the parameters of the VAE, and adjusting current values of the parameters of the VAE by optimizing a loss function that depends on a quality of the reconstruction and also on a degree of independence between the latent factors in the latent representation of the unlabeled training image.

Type: Grant

Filed: August 5, 2019

Date of Patent: May 5, 2020

Assignee: DeepMind Technologies Limited

Inventors: Loic Matthey-de-l'Endroit, Arka Tilak Pal, Shakir Mohamed, Xavier Glorot, Irina Higgins, Alexander Lerchner
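A loss of the kind described in the abstract above, with one term for reconstruction quality and one term encouraging independent latent factors, can be sketched for a single image. The sketch assumes a squared-error reconstruction term and a KL divergence from a diagonal Gaussian posterior to a unit Gaussian prior, weighted by a coefficient `beta`. The abstract does not state this form, so treat it as one plausible instance. The name `vae_loss` and all parameter names are hypothetical.

```python
import math
from typing import List


def vae_loss(image: List[float], reconstruction: List[float],
             latent_mean: List[float], latent_log_var: List[float],
             beta: float = 4.0) -> float:
    """Reconstruction error plus a beta-weighted KL term (assumed loss form)."""
    reconstruction_error = sum((x - y) ** 2 for x, y in zip(image, reconstruction))
    # KL( N(mean, exp(log_var)) || N(0, 1) ), summed over latent dimensions.
    kl = 0.5 * sum(math.exp(lv) + m ** 2 - 1.0 - lv
                   for m, lv in zip(latent_mean, latent_log_var))
    return reconstruction_error + beta * kl


loss = vae_loss(image=[0.0, 1.0], reconstruction=[0.0, 0.5],
                latent_mean=[1.0], latent_log_var=[0.0], beta=4.0)
```

Because the unit Gaussian prior has independent dimensions, weighting the KL term more heavily pushes the posterior toward independent latent factors, at some cost in reconstruction quality.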
Neural programming

Patent number: 10635974

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural programming. One of the methods includes processing a current neural network input using a core recurrent neural network to generate a neural network output; determining, from the neural network output, whether or not to end a currently invoked program and to return to a calling program from the set of programs; determining, from the neural network output, a next program to be called; determining, from the neural network output, contents of arguments to the next program to be called; receiving a representation of a current state of the environment; and generating a next neural network input from an embedding for the next program to be called and the representation of the current state of the environment.

Type: Grant

Filed: November 11, 2016

Date of Patent: April 28, 2020

Assignee: DeepMind Technologies Limited

Inventors: Scott Ellison Reed, Joao Ferdinando Gomes de Freitas

Neural networks for selecting actions to be performed by a robotic agent

Patent number: 10632618

Abstract: A system includes a neural network system implemented by one or more computers. The neural network system is configured to receive an observation characterizing a current state of a real-world environment being interacted with by a robotic agent to perform a robotic task and to process the observation to generate a policy output that defines an action to be performed by the robotic agent in response to the observation. The neural network system includes: (i) a sequence of deep neural networks (DNNs), in which the sequence of DNNs includes a simulation-trained DNN that has been trained on interactions of a simulated version of the robotic agent with a simulated version of the real-world environment to perform a simulated version of the robotic task, and (ii) a first robot-trained DNN that is configured to receive the observation and to process the observation to generate the policy output.

Type: Grant

Filed: April 10, 2019

Date of Patent: April 28, 2020

Assignee: DeepMind Technologies Limited

Inventors: Razvan Pascanu, Raia Thais Hadsell, Mel Vecerik, Thomas Rothoerl, Andrei-Alexandru Rusu, Nicolas Manfred Otto Heess

Selecting reinforcement learning actions using goals and observations

Patent number: 10628733

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning using goals and observations. One of the methods includes receiving an observation characterizing a current state of the environment; receiving a goal characterizing a target state from a set of target states of the environment; processing the observation using an observation neural network to generate a numeric representation of the observation; processing the goal using a goal neural network to generate a numeric representation of the goal; combining the numeric representation of the observation and the numeric representation of the goal to generate a combined representation; processing the combined representation using an action score neural network to generate a respective score for each action in the predetermined set of actions; and selecting the action to be performed using the respective scores for the actions in the predetermined set of actions.

Type: Grant

Filed: April 6, 2016

Date of Patent: April 21, 2020

Assignee: DeepMind Technologies Limited

Inventors: Tom Schaul, Daniel George Horgan, Karol Gregor, David Silver
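The pipeline in the abstract above (embed the observation, embed the goal, combine, score each action, select) can be sketched with single linear layers standing in for the three neural networks. The sketch assumes the two representations are combined by an elementwise product and that the highest-scoring action is chosen. The abstract does not fix either choice. All weights, and the names `linear` and `select_action_for_goal`, are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def linear(weights: Matrix, vector: List[float]) -> List[float]:
    """Matrix-vector product, standing in for a neural network."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def select_action_for_goal(observation: List[float], goal: List[float],
                           obs_weights: Matrix, goal_weights: Matrix,
                           score_weights: Matrix) -> Tuple[int, List[float]]:
    obs_repr = linear(obs_weights, observation)        # observation network
    goal_repr = linear(goal_weights, goal)             # goal network
    combined = [o * g for o, g in zip(obs_repr, goal_repr)]  # assumed combination
    scores = linear(score_weights, combined)           # action score network
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


identity = [[1.0, 0.0], [0.0, 1.0]]
action, scores = select_action_for_goal(
    observation=[1.0, 2.0], goal=[0.0, 1.0],
    obs_weights=identity, goal_weights=identity,
    score_weights=[[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]],
)
```

Keeping the observation and goal networks separate means the same observation representation can be reused when scoring actions for many different goals.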
