Patents by Inventor Timothy Paul Lillicrap

Timothy Paul Lillicrap has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240062035
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.
    Type: Application
    Filed: July 12, 2023
    Publication date: February 22, 2024
    Inventors: Martin Riedmiller, Roland Hafner, Mel Vecerik, Timothy Paul Lillicrap, Thomas Lampe, Ivaylo Popov, Gabriel Barth-Maron, Nicolas Manfred Otto Heess
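The distributed setup in this abstract can be illustrated with a minimal toy sketch: several independent workers, each with its own environment replica, contribute experience tuples to a shared buffer. All names here (`EnvReplica`, `run_worker`, the linear actor) are illustrative assumptions, not the patented system.

```python
import random

class EnvReplica:
    """Toy 1-D environment replica; each worker gets its own copy."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.state = 0.0
    def step(self, action):
        self.state += action + self.rng.uniform(-0.01, 0.01)
        reward = -abs(self.state)          # drive the state toward zero
        return self.state, reward

def actor(state, params):
    """Linear actor mapping an observation to a continuous action."""
    return params["w"] * state + params["b"]

def run_worker(worker_id, params, replay, steps=5):
    """Each worker operates independently, interacting with its own
    environment replica and appending experience to a shared buffer."""
    env = EnvReplica(seed=worker_id)
    state = env.state
    for _ in range(steps):
        action = actor(state, params)
        next_state, reward = env.step(action)
        replay.append((state, action, reward, next_state))
        state = next_state

replay = []
params = {"w": -0.5, "b": 0.0}
for wid in range(4):                       # four independent workers
    run_worker(wid, params, replay)
print(len(replay))                         # 4 workers x 5 steps = 20 tuples
```

In the real system the workers would also apply gradient updates; here they only gather data, which is the part the abstract's claim language describes.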
  • Publication number: 20240046103
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input that is a sequence to generate a network output. In one aspect, one of the methods includes, for each particular sequence of layer inputs: for each attention layer in the neural network: maintaining episodic memory data; maintaining compressed memory data; receiving a layer input to be processed by the attention layer; and applying an attention mechanism over (i) the compressed representation in the compressed memory data for the layer, (ii) the hidden states in the episodic memory data for the layer, and (iii) the respective hidden state at each of the plurality of input positions in the particular network input to generate a respective activation for each input position in the layer input.
    Type: Application
    Filed: October 12, 2023
    Publication date: February 8, 2024
    Inventors: Jack William Rae, Anna Potapenko, Timothy Paul Lillicrap
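The three-way attention described in this abstract (compressed memory, episodic memory, current hidden states) can be sketched with NumPy. The mean-pooling compression and all array shapes are toy assumptions for illustration.

```python
import numpy as np

def attend(queries, compressed_mem, episodic_mem, current_hidden):
    """Attention over (i) compressed memories, (ii) episodic (uncompressed)
    memories, and (iii) the current segment's hidden states."""
    keys = np.concatenate([compressed_mem, episodic_mem, current_hidden], axis=0)
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ keys                  # one activation per input position

def compress(hidden, rate=2):
    """Toy compression: mean-pool pairs of old hidden states (rate 2)."""
    return hidden.reshape(-1, rate, hidden.shape[-1]).mean(axis=1)

rng = np.random.default_rng(0)
d = 8
current = rng.normal(size=(4, d))          # hidden states, this segment
episodic = rng.normal(size=(4, d))         # older states kept verbatim
compressed = compress(rng.normal(size=(8, d)))  # 8 oldest states -> 4 slots
out = attend(current, compressed, episodic, current)
print(out.shape)                           # one activation per input position
```

The point of the design is that the oldest context is kept at reduced resolution instead of being discarded, extending the effective attention span at fixed memory cost.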
  • Patent number: 11875258
    Abstract: Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, a low-level controller neural network, and a subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation.
    Type: Grant
    Filed: December 2, 2021
    Date of Patent: January 16, 2024
    Assignee: DeepMind Technologies Limited
    Inventors: Nicolas Manfred Otto Heess, Timothy Paul Lillicrap, Gregory Duncan Wayne, Yuval Tassa
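The split described here, where the high-level controller sees the whole observation but the low-level controller sees only a designated component plus a control signal, can be sketched in a few lines. Both controller functions are hypothetical stand-ins for the neural networks.

```python
def high_level_controller(observation):
    """Sees the full observation; emits a control signal for the
    low-level controller (here: a target for the first component)."""
    return -observation[0]

def low_level_controller(designated_component, control_signal):
    """Sees only its designated observation component plus the control
    signal, and emits the action."""
    return control_signal - designated_component

def select_action(observation):
    signal = high_level_controller(observation)
    return low_level_controller(observation[0], signal)

print(select_action([0.5, 1.0, -2.0]))     # -1.0
```

The information bottleneck is the design choice: the low-level controller cannot act on the full observation, so the high-level output is the only channel through which global context reaches it.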
  • Patent number: 11829884
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input that is a sequence to generate a network output. In one aspect, one of the methods includes, for each particular sequence of layer inputs: for each attention layer in the neural network: maintaining episodic memory data; maintaining compressed memory data; receiving a layer input to be processed by the attention layer; and applying an attention mechanism over (i) the compressed representation in the compressed memory data for the layer, (ii) the hidden states in the episodic memory data for the layer, and (iii) the respective hidden state at each of the plurality of input positions in the particular network input to generate a respective activation for each input position in the layer input.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: November 28, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Jack William Rae, Anna Potapenko, Timothy Paul Lillicrap
  • Patent number: 11803750
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an actor neural network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a minibatch of experience tuples; and updating current values of the parameters of the actor neural network, comprising: for each experience tuple in the minibatch: processing the training observation and the training action in the experience tuple using a critic neural network to determine a neural network output for the experience tuple, and determining a target neural network output for the experience tuple; updating current values of the parameters of the critic neural network using errors between the target neural network outputs and the neural network outputs; and updating the current values of the parameters of the actor neural network using the critic neural network.
    Type: Grant
    Filed: September 14, 2020
    Date of Patent: October 31, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Timothy Paul Lillicrap, Jonathan James Hunt, Alexander Pritzel, Nicolas Manfred Otto Heess, Tom Erez, Yuval Tassa, David Silver, Daniel Pieter Wierstra
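The update loop in this abstract (critic fit to bootstrapped targets from a target network, actor updated through the critic) can be sketched with linear toy networks. The gradient expressions below are specific to these linear forms; they are illustrative, not the claimed method.

```python
import numpy as np

def critic(obs, act, w):                   # toy linear critic Q(s, a)
    return w[0] * obs + w[1] * act

def actor(obs, theta):                     # toy deterministic actor mu(s)
    return theta * obs

def ddpg_step(minibatch, w, w_target, theta, gamma=0.99, lr=0.01):
    """One update over a minibatch of experience tuples: fit the critic
    toward target-network outputs, then move the actor along the critic's
    action-gradient (deterministic policy gradient)."""
    for obs, act, reward, next_obs in minibatch:
        # target output uses the target critic and the actor's next action
        target = reward + gamma * critic(next_obs, actor(next_obs, theta), w_target)
        error = critic(obs, act, w) - target
        w[0] -= lr * error * obs           # critic gradient step
        w[1] -= lr * error * act
        # actor ascends dQ/da * da/dtheta; for this linear critic dQ/da = w[1]
        theta += lr * w[1] * obs
    return w, theta

w = np.array([0.1, 0.1]); w_target = w.copy(); theta = 0.5
batch = [(1.0, 0.2, -1.0, 0.9), (0.5, -0.1, -0.5, 0.4)]
w, theta = ddpg_step(batch, w, w_target, theta)
print(theta)
```

In practice the target network's parameters track the critic's slowly (e.g. by Polyak averaging), which is what keeps the bootstrapped targets stable.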
  • Patent number: 11769049
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.
    Type: Grant
    Filed: September 28, 2020
    Date of Patent: September 26, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Gregory Duncan Wayne, Timothy Paul Lillicrap, Chia-Chun Hung, Joshua Simon Abramson
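The reward-modification rule in this abstract, combining the actual reward with value predictions from steps more than a threshold number of steps later, can be sketched as a toy credit-transport function. The averaging and the `alpha` mixing weight are assumptions made for illustration.

```python
def modified_rewards(rewards, values, threshold, alpha=0.9):
    """For each step, add a bonus derived from value predictions at
    steps strictly more than `threshold` steps later in the episode."""
    out = []
    for t, r in enumerate(rewards):
        future = values[t + threshold + 1:]    # strictly beyond threshold
        bonus = alpha * sum(future) / len(future) if future else 0.0
        out.append(r + bonus)
    return out

rewards = [0.0, 0.0, 0.0, 1.0]
values  = [0.1, 0.2, 0.8, 1.0]
print(modified_rewards(rewards, values, threshold=1))
```

The effect is to carry credit across long delays: an early action whose payoff appears only much later still receives a signal through the far-future value predictions.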
  • Patent number: 11741334
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.
    Type: Grant
    Filed: May 22, 2020
    Date of Patent: August 29, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Martin Riedmiller, Roland Hafner, Mel Vecerik, Timothy Paul Lillicrap, Thomas Lampe, Ivaylo Popov, Gabriel Barth-Maron, Nicolas Manfred Otto Heess
  • Publication number: 20230244325
    Abstract: A computer-implemented method for controlling a particular computer to execute a task is described. The method includes receiving a control input comprising a visual input, the visual input including one or more screen frames of a computer display that represent at least a current state of the particular computer; processing the control input using a neural network to generate one or more control outputs that are used to control the particular computer to execute the task, in which the one or more control outputs include an action type output that specifies at least one of a pointing device action or a keyboard action to be performed to control the particular computer; determining one or more actions from the one or more control outputs; and executing the one or more actions to control the particular computer.
    Type: Application
    Filed: January 30, 2023
    Publication date: August 3, 2023
    Inventors: Peter Conway Humphreys, Timothy Paul Lillicrap, Tobias Markus Pohlen, Adam Anthony Santoro
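The control-output decoding this abstract describes, where an action-type head selects between a pointing-device action and a keyboard action, can be sketched directly. The two-logit head and the concrete action tuples are hypothetical simplifications.

```python
def decode_control_output(action_type_logits, pointer_xy, key_id):
    """Turn the network's control outputs into a concrete action:
    the action-type output chooses pointing device vs. keyboard."""
    use_pointer = action_type_logits[0] > action_type_logits[1]
    if use_pointer:
        return ("click", pointer_xy)       # pointing-device action
    return ("press", key_id)               # keyboard action

print(decode_control_output([2.1, -0.3], (120, 45), "ENTER"))
print(decode_control_output([-1.0, 1.0], (120, 45), "ENTER"))
```

Conditioning on screen frames means the agent controls the computer through the same interface a person does, rather than through application-specific APIs.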
  • Publication number: 20230178076
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an interactive agent can be controlled based on multi-modal inputs that include both an observation image and a natural language text sequence.
    Type: Application
    Filed: December 7, 2022
    Publication date: June 8, 2023
    Inventors: Joshua Simon Abramson, Arun Ahuja, Federico Javier Carnevale, Petko Ivanov Georgiev, Chia-Chun Hung, Timothy Paul Lillicrap, Alistair Michael Muldal, Adam Anthony Santoro, Tamara Louise von Glehn, Jessica Paige Landon, Gregory Duncan Wayne, Chen Yan, Rui Zhu
  • Publication number: 20220284266
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.
    Type: Application
    Filed: March 25, 2022
    Publication date: September 8, 2022
    Inventors: Shixiang Gu, Timothy Paul Lillicrap, Ilya Sutskever, Sergey Vladimir Levine
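The decomposition in this abstract, a value estimate plus an advantage estimate centred on the policy subnetwork's "ideal point", can be sketched with a quadratic advantage. The quadratic form and the toy subnetworks are assumptions; the abstract itself does not fix the advantage's functional form.

```python
import numpy as np

def q_value(obs, action, value_net, policy_net, precision):
    """Q(s, a) = V(s) + A(s, a), with the advantage a quadratic bowl
    whose maximum (zero) sits at the ideal point mu(s)."""
    v = value_net(obs)
    mu = policy_net(obs)
    diff = action - mu
    advantage = -0.5 * diff @ precision @ diff   # <= 0, max at a = mu
    return v + advantage

value_net = lambda s: float(s.sum())       # toy value subnetwork
policy_net = lambda s: 0.5 * s             # toy ideal-point subnetwork
P = np.eye(2)                              # toy precision matrix
s = np.array([1.0, 2.0])
print(q_value(s, policy_net(s), value_net, policy_net, P))  # at mu, Q = V(s)
```

Because the advantage is maximal at the ideal point, the greedy continuous action is available in closed form, with no inner optimization over the action space.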
  • Publication number: 20220180147
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing associative memory. In one aspect, a system comprises an associative memory neural network to process an input to generate an output that defines an energy corresponding to the input. A reading subsystem retrieves stored information from the associative memory neural network. The reading subsystem performs operations including receiving a given (query) input and retrieving a data element from the associative memory neural network that is associated with the given input. The retrieving is performed by iteratively adjusting the given input using the associative memory neural network.
    Type: Application
    Filed: May 19, 2020
    Publication date: June 9, 2022
    Inventors: Sergey Bartunov, Jack William Rae, Timothy Paul Lillicrap, Simon Osindero
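The iterative-adjustment retrieval this abstract describes can be sketched with modern-Hopfield-style dynamics: a noisy query is repeatedly pulled toward the similarity-weighted stored patterns until it settles on one of them. The energy form, the inverse temperature `beta`, and the unit-sphere normalization are illustrative assumptions.

```python
import numpy as np

def retrieve(query, patterns, beta=8.0, steps=50, lr=0.5):
    """Iteratively adjust the given input: each step moves it toward the
    softmax-similarity-weighted average of the stored patterns, which
    descends an associated energy surface."""
    x = query.copy()
    for _ in range(steps):
        sims = np.exp(beta * patterns @ x)
        weights = sims / sims.sum()
        x += lr * (weights @ patterns - x)     # step toward weighted pattern
        x /= np.linalg.norm(x)                 # keep the iterate normalized
    return x

patterns = np.array([[1.0, 0.0], [0.0, 1.0]])  # stored data elements
noisy = np.array([0.9, 0.1])
noisy /= np.linalg.norm(noisy)                 # corrupted query input
out = retrieve(noisy, patterns)
print(np.round(out, 2))                        # settles near the first pattern
```

The retrieved vector is the fixed point of the adjustment dynamics, so recall works from partial or corrupted queries rather than exact keys.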
  • Patent number: 11288568
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.
    Type: Grant
    Filed: February 9, 2017
    Date of Patent: March 29, 2022
    Assignee: Google LLC
    Inventors: Shixiang Gu, Timothy Paul Lillicrap, Ilya Sutskever, Sergey Vladimir Levine
  • Patent number: 11210585
    Abstract: Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, a low-level controller neural network, and a subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation.
    Type: Grant
    Filed: May 12, 2017
    Date of Patent: December 28, 2021
    Assignee: DeepMind Technologies Limited
    Inventors: Nicolas Manfred Otto Heess, Timothy Paul Lillicrap, Gregory Duncan Wayne, Yuval Tassa
  • Patent number: 11151443
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the systems includes a sparse memory access subsystem that is configured to perform operations comprising generating a sparse set of reading weights that includes a respective reading weight for each of the plurality of locations in the external memory using the read key, reading data from the plurality of locations in the external memory in accordance with the sparse set of reading weights, generating a set of writing weights that includes a respective writing weight for each of the plurality of locations in the external memory, and writing the write vector to the plurality of locations in the external memory in accordance with the writing weights.
    Type: Grant
    Filed: February 3, 2017
    Date of Patent: October 19, 2021
    Assignee: DeepMind Technologies Limited
    Inventors: Ivo Danihelka, Gregory Duncan Wayne, Fu-min Wang, Edward Thomas Grefenstette, Jack William Rae, Alexander Benjamin Graves, Timothy Paul Lillicrap, Timothy James Alexander Harley, Jonathan James Hunt
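The sparse reading weights this abstract claims can be sketched by keeping only the top-k similarity scores and renormalizing; writing uses an analogous sparse weighting. The top-k rule and k=2 are illustrative choices.

```python
import numpy as np

def sparse_read_weights(memory, read_key, k=2):
    """Sparse set of reading weights: only the k locations most similar
    to the read key get nonzero weight, then the weights are renormalized."""
    scores = memory @ read_key
    top = np.argsort(scores)[-k:]          # indices of the k best locations
    weights = np.zeros(len(memory))
    weights[top] = np.exp(scores[top])
    return weights / weights.sum()

memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]])
w = sparse_read_weights(memory, np.array([1.0, 0.0]), k=2)
read = w @ memory                          # read vector from the memory
print(np.count_nonzero(w))                 # only k locations contribute
```

Because each read and write touches a fixed handful of locations, the per-step cost stays roughly constant as the external memory grows, which is the practical point of the sparsity.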
  • Publication number: 20210158162
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network used to select an action to be performed by an agent interacting with an environment. In one aspect, a method includes: receiving a latent representation characterizing a current state of the environment; generating a trajectory of latent representations that starts with the received latent representation; for each latent representation in the trajectory: determining a predicted reward; and processing the state latent representation using a value neural network to generate a predicted state value; determining a corresponding target state value for each latent representation in the trajectory; determining, based on the target state values, an update to the current values of the policy neural network parameters; and determining an update to the current values of the value neural network parameters.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 27, 2021
    Inventors: Danijar Hafner, Mohammad Norouzi, Timothy Paul Lillicrap
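The procedure in this abstract, rolling a trajectory of latent representations forward and forming a target state value for each, can be sketched with scalar toy models. The simple bootstrapped return used below stands in for whatever target the claimed method computes; all three model functions are hypothetical.

```python
def imagine_and_target(latent, dynamics, reward_net, value_net,
                       horizon=3, gamma=0.99):
    """Generate a trajectory of latent states with a learned dynamics
    model, then compute a bootstrapped target value for each state."""
    traj = [latent]
    for _ in range(horizon):
        traj.append(dynamics(traj[-1]))    # imagined next latent state
    rewards = [reward_net(z) for z in traj]
    values = [value_net(z) for z in traj]
    targets = [None] * len(traj)
    targets[-1] = values[-1]               # bootstrap at the horizon
    for t in reversed(range(len(traj) - 1)):
        targets[t] = rewards[t] + gamma * targets[t + 1]
    return traj, targets

dynamics = lambda z: z * 0.5               # toy latent dynamics model
reward_net = lambda z: z                   # toy predicted reward
value_net = lambda z: 2.0 * z              # toy value neural network
traj, targets = imagine_and_target(1.0, dynamics, reward_net, value_net)
print(len(traj), round(targets[0], 3))
```

The policy and value networks are then updated against these targets, so learning happens entirely in imagined latent rollouts rather than additional environment interaction.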
  • Publication number: 20210150314
    Abstract: A system for compressed data storage using a neural network. The system comprises a memory comprising a plurality of memory locations configured to store data; a query neural network configured to process a representation of an input data item to generate a query; an immutable key data store comprising key data for indexing the plurality of memory locations; an addressing system configured to process the key data and the query to generate a weighting associated with the plurality of memory locations; a memory read system configured to generate output memory data from the memory based upon the generated weighting associated with the plurality of memory locations and the data stored at the plurality of memory locations; and a memory write system configured to write received write data to the memory based upon the generated weighting associated with the plurality of memory locations.
    Type: Application
    Filed: November 23, 2020
    Publication date: May 20, 2021
    Inventors: Jack William Rae, Timothy Paul Lillicrap, Sergey Bartunov
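The pieces named in this abstract (query, immutable key store, addressing system, read and write systems) fit together as a key-value memory; a minimal sketch with softmax addressing follows. The softmax rule and additive write are illustrative assumptions.

```python
import numpy as np

def address(keys, query):
    """Addressing system: softmax similarity between the immutable keys
    and the query gives a weighting over memory locations."""
    scores = keys @ query
    w = np.exp(scores - scores.max())
    return w / w.sum()

keys = np.array([[1.0, 0.0], [0.0, 1.0]])      # immutable key data store
memory = np.array([[5.0, 5.0], [-5.0, -5.0]])  # mutable stored data
query = np.array([3.0, 0.0])                   # from the query neural network
w = address(keys, query)
read = w @ memory                              # memory read system
write_data = np.array([1.0, 1.0])
memory += np.outer(w, write_data)              # memory write system
print(round(w[0], 3))                          # weight on the first location
```

Keeping the keys immutable separates addressing from storage: the weighting depends only on the fixed keys and the learned query, while the data at each location can still change.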
  • Publication number: 20210089829
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input that is a sequence to generate a network output. In one aspect, one of the methods includes, for each particular sequence of layer inputs: for each attention layer in the neural network: maintaining episodic memory data; maintaining compressed memory data; receiving a layer input to be processed by the attention layer; and applying an attention mechanism over (i) the compressed representation in the compressed memory data for the layer, (ii) the hidden states in the episodic memory data for the layer, and (iii) the respective hidden state at each of the plurality of input positions in the particular network input to generate a respective activation for each input position in the layer input.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 25, 2021
    Inventors: Jack William Rae, Anna Potapenko, Timothy Paul Lillicrap
  • Publication number: 20210081723
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.
    Type: Application
    Filed: September 28, 2020
    Publication date: March 18, 2021
    Inventors: Gregory Duncan Wayne, Timothy Paul Lillicrap, Chia-Chun Hung, Joshua Simon Abramson
  • Publication number: 20210034969
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a memory-based prediction system configured to receive an input observation characterizing a state of an environment interacted with by an agent and to process the input observation and data read from a memory to update data stored in the memory and to generate a latent representation of the state of the environment. The method comprises: for each of a plurality of time steps: processing an observation for the time step and data read from the memory to: (i) update the data stored in the memory, and (ii) generate a latent representation of the current state of the environment as of the time step; and generating a predicted return that will be received by the agent as a result of interactions with the environment after the observation for the time step is received.
    Type: Application
    Filed: March 11, 2019
    Publication date: February 4, 2021
    Inventors: Gregory Duncan Wayne, Chia-Chun Hung, David Antony Amos, Mehdi Mirza Mohammadi, Arun Ahuja, Timothy Paul Lillicrap
  • Patent number: 10885426
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the systems includes a controller neural network that includes a Least Recently Used Access (LRUA) subsystem configured to: maintain a respective usage weight for each of a plurality of locations in the external memory, and for each of the plurality of time steps: generate a respective reading weight for each location using a read key, read data from the locations in accordance with the reading weights, generate a respective writing weight for each of the locations from a respective reading weight from a preceding time step and the respective usage weight for the location, write a write vector to the locations in accordance with the writing weights, and update the respective usage weight from the respective reading weight and the respective writing weight.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: January 5, 2021
    Assignee: DeepMind Technologies Limited
    Inventors: Adam Anthony Santoro, Daniel Pieter Wierstra, Timothy Paul Lillicrap, Sergey Bartunov, Ivo Danihelka
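The LRUA bookkeeping this abstract describes, writing either to the most recently read location or to the least recently used one and then updating usage weights, can be sketched as follows. The one-hot least-used slot, the interpolation weight `alpha`, and the decay `gamma` are illustrative simplifications.

```python
import numpy as np

def lrua_write_weights(read_weights_prev, usage, gamma=0.9, alpha=0.5):
    """Writing weights interpolate between the previous step's reading
    weights and the least recently used location; usage weights are then
    decayed and updated from the reading and writing weights."""
    lru = np.zeros_like(usage)
    lru[np.argmin(usage)] = 1.0            # least-used slot, one-hot
    write = alpha * read_weights_prev + (1 - alpha) * lru
    usage = gamma * usage + read_weights_prev + write
    return write, usage

usage = np.array([0.9, 0.1, 0.4])          # per-location usage weights
prev_read = np.array([1.0, 0.0, 0.0])      # reading weights, previous step
write, usage = lrua_write_weights(prev_read, usage)
print(np.argmin(usage))                    # location 2 is now least used
```

Writing to least-used slots preserves recently useful memories while recycling stale ones, which suits the few-shot setting this controller was designed for.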