Patents by Inventor Adria Puigdomenech Badia

Adria Puigdomenech Badia has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ATTENTION NEURAL NETWORKS WITH SHORT-TERM MEMORY UNITS

Publication number: 20240095495

Abstract: A system for controlling an agent interacting with an environment to perform a task. The system includes an action selection neural network configured to generate action selection outputs that are used to select actions to be performed by the agent. The action selection neural network includes an encoder sub network configured to generate encoded representations of the current observations; an attention sub network configured to generate attention sub network outputs with the use of an attention mechanism; a recurrent sub network configured to generate recurrent sub network outputs; and an action selection sub network configured to generate the action selection outputs that are used to select the actions to be performed by the agent in response to the current observations.

Type: Application

Filed: February 7, 2022

Publication date: March 21, 2024

Inventors: Andrea Banino, Adrià Puigdomènech Badia, Jacob Charles Walker, Timothy Anthony Julian Scholtes, Jovana Mitrovic, Charles Blundell
JOINTLY LEARNING EXPLORATORY AND NON-EXPLORATORY ACTION SELECTION POLICIES

Publication number: 20240028866

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.

Type: Application

Filed: June 13, 2023

Publication date: January 25, 2024

Inventors: Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Guo, Bilal Piot, Steven James Kapturowski, Olivier Tieleman, Charles Blundell
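
As a rough illustration of the reward combination described in the abstract above, the following Python sketch mixes a task reward with an exploration reward scaled by the exploration importance factor. The additive form, the separate task reward term, and the names `overall_reward`, `task_reward`, and `beta` are assumptions made for illustration; the abstract states only that the overall reward is based on the exploration importance factor and the exploration reward.

```python
def overall_reward(task_reward: float, exploration_reward: float, beta: float) -> float:
    """Combine a task reward with an exploration reward.

    `beta` plays the role of the exploration importance factor. The additive
    mixing rule below is an assumption, not taken from the abstract.
    """
    if beta < 0.0:
        raise ValueError("beta is assumed to be non-negative in this sketch")
    return task_reward + beta * exploration_reward


# beta = 0 ignores the exploration reward; larger beta weights it more heavily.
greedy = overall_reward(task_reward=1.0, exploration_reward=0.5, beta=0.0)
curious = overall_reward(task_reward=1.0, exploration_reward=0.5, beta=2.0)
```

Under this assumed rule, passing the same factor to the network as an input and to the reward computation is what would let a single network represent both exploratory and non-exploratory behavior.
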
Asynchronous deep reinforcement learning

Patent number: 11783182

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.

Type: Grant

Filed: February 8, 2021

Date of Patent: October 10, 2023

Assignee: DeepMind Technologies Limited

Inventors: Volodymyr Mnih, Adrià Puigdomènech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
Neural episodic control

Patent number: 11720796

Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.

Type: Grant

Filed: April 23, 2020

Date of Patent: August 8, 2023

Assignee: DeepMind Technologies Limited

Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adrià Puigdomènech Badia
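
The lookup described in the abstract above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the squared Euclidean distance, the inverse-distance weighting of return estimates, the greedy choice over Q values, and the names `q_value`, `select_action`, and `delta` are all illustrative and are not specified by the abstract, which says only that the Q value is determined from the return estimates of the p nearest key embeddings.

```python
from typing import Dict, List, Sequence, Tuple

# Per-action episodic memory: a list of (key embedding, return estimate) pairs.
Memory = List[Tuple[Sequence[float], float]]


def q_value(memory: Memory, query: Sequence[float], p: int, delta: float = 1e-3) -> float:
    """Estimate a Q value from the p nearest stored keys (memory must be non-empty)."""
    # Assumed distance measure: squared Euclidean distance to the query embedding.
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(key, query)), ret) for key, ret in memory
    )[:p]
    # Assumed combination rule: inverse-distance weighted average of the returns.
    weights = [1.0 / (dist + delta) for dist, _ in scored]
    total = sum(weights)
    return sum(w * ret for w, (_, ret) in zip(weights, scored)) / total


def select_action(memories: Dict[str, Memory], query: Sequence[float], p: int):
    """Compute a Q value per action and pick the action with the largest one."""
    q = {action: q_value(memory, query, p) for action, memory in memories.items()}
    return max(q, key=q.get), q


memories = {
    "left": [([0.0, 0.0], 1.0), ([1.0, 1.0], 3.0)],
    "right": [([0.0, 0.0], 5.0), ([10.0, 10.0], -100.0)],
}
action, q = select_action(memories, query=[0.0, 0.0], p=1)
```

In the abstract the query would come from an embedding neural network applied to the current observation; here it is passed in directly to keep the sketch self-contained.
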
Jointly learning exploratory and non-exploratory action selection policies

Patent number: 11714990

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.

Type: Grant

Filed: May 22, 2020

Date of Patent: August 1, 2023

Assignee: DeepMind Technologies Limited

Inventors: Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Guo, Bilal Piot, Steven James Kapturowski, Olivier Tieleman, Charles Blundell
REINFORCEMENT LEARNING WITH ADAPTIVE RETURN COMPUTATION SCHEMES

Publication number: 20230059004

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for reinforcement learning with adaptive return computation schemes. In one aspect, a method includes: maintaining data specifying a policy for selecting between multiple different return computation schemes, each return computation scheme assigning a different importance to exploring the environment while performing an episode of a task; selecting, using the policy, a return computation scheme from the multiple different return computation schemes; controlling an agent to perform the episode of the task to maximize a return computed according to the selected return computation scheme; identifying rewards that were generated as a result of the agent performing the episode of the task; and updating, using the identified rewards, the policy for selecting between multiple different return computation schemes.

Type: Application

Filed: February 8, 2021

Publication date: February 23, 2023

Inventors: Adrià Puigdomènech Badia, Bilal Piot, Pablo Sprechmann, Steven James Kapturowski, Alex Vitvitskyi, Zhaohan Guo, Charles Blundell
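
One way to picture the loop in the abstract above is as a bandit over return computation schemes. The Python sketch below is an assumption-laden illustration, not the patented method: it represents each scheme as a hypothetical (`beta`, `gamma`) pair, uses an epsilon-greedy rule as the selection policy, and updates a running mean of the total episode reward per scheme. The names `ReturnScheme`, `SchemeSelector`, `train`, and `run_episode` are invented for this example.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class ReturnScheme:
    beta: float   # assumed: weight placed on exploration
    gamma: float  # assumed: discount factor used when computing the return


@dataclass
class SchemeSelector:
    schemes: List[ReturnScheme]
    epsilon: float = 0.1
    counts: List[int] = field(default_factory=list, init=False)
    means: List[float] = field(default_factory=list, init=False)

    def __post_init__(self) -> None:
        self.counts = [0] * len(self.schemes)
        self.means = [0.0] * len(self.schemes)

    def select(self, rng: random.Random) -> int:
        """Pick a scheme index: try each scheme once, then act epsilon-greedily."""
        untried = [i for i, count in enumerate(self.counts) if count == 0]
        if untried:
            return untried[0]
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.schemes))
        return max(range(len(self.schemes)), key=lambda i: self.means[i])

    def update(self, index: int, episode_reward: float) -> None:
        """Update the running mean reward observed under the chosen scheme."""
        self.counts[index] += 1
        self.means[index] += (episode_reward - self.means[index]) / self.counts[index]


def train(selector: SchemeSelector,
          run_episode: Callable[[ReturnScheme], float],
          episodes: int,
          rng: random.Random) -> None:
    """Select a scheme, run one episode under it, then update the selection policy."""
    for _ in range(episodes):
        index = selector.select(rng)
        reward = run_episode(selector.schemes[index])  # hypothetical environment rollout
        selector.update(index, reward)
```

Here `run_episode` stands in for controlling the agent through one episode while maximizing the return defined by the selected scheme, and returns the rewards identified for that episode as a single total.
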
REINFORCEMENT LEARNING USING BASELINE AND POLICY NEURAL NETWORKS

Publication number: 20220261647

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.

Type: Application

Filed: April 29, 2022

Publication date: August 18, 2022

Inventors: Volodymyr Mnih, Adrià Puigdomènech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
NEURAL NETWORK-BASED MEMORY SYSTEM WITH VARIABLE RECIRCULATION OF QUERIES USING MEMORY CONTENT

Publication number: 20220253698

Abstract: A neural network based memory system with external memory for storing representations of knowledge items. The memory can be used to retrieve indirectly related knowledge items by recirculating queries, and is useful for relational reasoning. Implementations of the system control how many times queries are recirculated, and hence the degree of relational reasoning, to minimize computation.

Type: Application

Filed: May 22, 2020

Publication date: August 11, 2022

Inventors: Andrea Banino, Charles Blundell, Adrià Puigdomènech Badia, Raphael Koster, Sudarshan Kumaran
Asynchronous deep reinforcement learning

Patent number: 11334792

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.

Type: Grant

Filed: May 3, 2019

Date of Patent: May 17, 2022

Assignee: DeepMind Technologies Limited

Inventors: Volodymyr Mnih, Adria Puigdomenech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
Imagination-based agent neural networks

Patent number: 11328183

Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprises a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.

Type: Grant

Filed: September 14, 2020

Date of Patent: May 10, 2022

Assignee: DeepMind Technologies Limited

Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
ASYNCHRONOUS DEEP REINFORCEMENT LEARNING

Publication number: 20210166127

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.

Type: Application

Filed: February 8, 2021

Publication date: June 3, 2021

Inventors: Volodymyr Mnih, Adrià Puigdomènech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
IMAGINATION-BASED AGENT NEURAL NETWORKS

Publication number: 20210073594

Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprises a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.

Type: Application

Filed: September 14, 2020

Publication date: March 11, 2021

Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
Asynchronous deep reinforcement learning

Patent number: 10936946

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.

Type: Grant

Filed: November 11, 2016

Date of Patent: March 2, 2021

Assignee: DeepMind Technologies Limited

Inventors: Volodymyr Mnih, Adrià Puigdomènech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
JOINTLY LEARNING EXPLORATORY AND NON-EXPLORATORY ACTION SELECTION POLICIES

Publication number: 20200372366

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.

Type: Application

Filed: May 22, 2020

Publication date: November 26, 2020

Inventors: Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Guo, Bilal Piot, Steven James Kapturowski, Olivier Tieleman, Charles Blundell
Imagination-based agent neural networks

Patent number: 10776670

Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprises a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.

Type: Grant

Filed: November 19, 2019

Date of Patent: September 15, 2020

Assignee: DeepMind Technologies Limited

Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
MACHINE LEARNING SYSTEMS WITH MEMORY BASED PARAMETER ADAPTATION FOR LEARNING FAST AND SLOWER

Publication number: 20200285940

Abstract: There is described herein a computer-implemented method of processing an input data item. The method comprises processing the input data item using a parametric model to generate output data, wherein the parametric model comprises a first sub-model and a second sub-model. The processing comprises processing, by the first sub-model, the input data to generate a query data item, retrieving, from a memory storing data point-value pairs, at least one data point-value pair based upon the query data item and modifying weights of the second sub-model based upon the retrieved at least one data point-value pair. The output data is then generated based upon the modified second sub-model.

Type: Application

Filed: October 29, 2018

Publication date: September 10, 2020

Inventors: Pablo Sprechmann, Siddhant Jayakumar, Jack William Rae, Alexander Pritzel, Adrià Puigdomènech Badia, Oriol Vinyals, Razvan Pascanu, Charles Blundell
NEURAL EPISODIC CONTROL

Publication number: 20200265317

Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.

Type: Application

Filed: April 23, 2020

Publication date: August 20, 2020

Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adrià Puigdomènech Badia
Neural episodic control

Patent number: 10664753

Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.

Type: Grant

Filed: June 19, 2019

Date of Patent: May 26, 2020

Assignee: DeepMind Technologies Limited

Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adria Puigdomenech Badia
IMAGINATION-BASED AGENT NEURAL NETWORKS

Publication number: 20200090006

Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprises a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.

Type: Application

Filed: November 19, 2019

Publication date: March 19, 2020

Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
NEURAL EPISODIC CONTROL

Publication number: 20190303764

Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.

Type: Application

Filed: June 19, 2019

Publication date: October 3, 2019

Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adria Puigdomenech Badia
