Patents by Inventor Theophane Guillaume Weber
Theophane Guillaume Weber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11836596
Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to: at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprising applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
Type: Grant
Filed: November 30, 2020
Date of Patent: December 5, 2023
Assignee: DeepMind Technologies Limited
Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro
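The per-time-step loop in this abstract (receive an input, attend over the memory vectors and the input, update the memory, produce an output) can be illustrated with a minimal sketch. It is not the patented implementation: the dot-product attention, the residual update, the mean-pooled output, and the names `attention_update` and `step` are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_update(memory, x):
    """Return a new memory in which every slot has attended over all
    memory vectors plus the received input (assumed dot-product attention)."""
    candidates = memory + [x]      # attended set: memory vectors and the input
    scale = math.sqrt(len(x))      # assumed scaling by sqrt(vector size)
    new_memory = []
    for query in memory:
        weights = softmax([dot(query, c) / scale for c in candidates])
        attended = [
            sum(w * c[i] for w, c in zip(weights, candidates))
            for i in range(len(x))
        ]
        # Assumed residual update: old slot plus attended summary.
        new_memory.append([q + a for q, a in zip(query, attended)])
    return new_memory


def step(memory, x):
    """One time step: update the memory, then read an output from it."""
    memory = attention_update(memory, x)
    # Assumed readout: mean of the updated memory slots.
    output = [sum(slot[i] for slot in memory) / len(memory) for i in range(len(x))]
    return memory, output


memory = [[1.0, 0.0], [0.0, 1.0]]
memory, output = step(memory, [0.5, 0.5])
```

Calling `step` repeatedly with successive inputs carries the memory forward across time steps; a trained system would replace the fixed dot products with learned projections.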
-
Publication number: 20220366247
Abstract: A reinforcement learning system and method that selects actions to be performed by an agent interacting with an environment. The system uses a combination of reinforcement learning and a look ahead search: Reinforcement learning Q-values are used to guide the look ahead search and the search is used in turn to improve the Q-values. The system learns from a combination of real experience and simulated, model-based experience.
Type: Application
Filed: September 23, 2020
Publication date: November 17, 2022
Inventors: Jessica Blake Chandler Hamrick, Victor Constant Bapst, Alvaro Sanchez, Tobias Pfaff, Theophane Guillaume Weber, Lars Buesing, Peter William Battaglia
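The interplay between Q-values and look-ahead search described in this abstract can be sketched on a toy problem. The sketch is illustrative only and is not the method of the application: the five-state chain, the deterministic `model`, the exhaustive depth-limited search, and every name below are hypothetical, and "guiding" the search is reduced here to bootstrapping leaf values from the current Q-table.

```python
ACTIONS = (-1, 1)   # hypothetical action set: move left or right
GAMMA = 0.9         # assumed discount factor


def model(state, action):
    """Hypothetical deterministic model of a 5-state chain (states 0..4).
    Entering state 4 from another state yields reward 1."""
    nxt = max(0, min(4, state + action))
    reward = 1.0 if nxt == 4 and state != 4 else 0.0
    return nxt, reward


def search_value(q, state, action, depth):
    """Depth-limited look-ahead: simulate with the model and use the
    current Q-values to evaluate the leaves of the search."""
    nxt, reward = model(state, action)
    if depth == 1:
        return reward + GAMMA * max(q[(nxt, a)] for a in ACTIONS)
    return reward + GAMMA * max(
        search_value(q, nxt, a, depth - 1) for a in ACTIONS
    )


def improve_q(q, state, depth=2, lr=0.5):
    """Move each Q-value toward the value returned by the search."""
    for action in ACTIONS:
        target = search_value(q, state, action, depth)
        q[(state, action)] += lr * (target - q[(state, action)])


def greedy_action(q, state):
    return max(ACTIONS, key=lambda a: q[(state, a)])


q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}
for _ in range(200):
    for s in range(5):
        improve_q(q, s)
```

In this sketch all experience is simulated through `model`; a system that also learns from real experience would mix in transitions observed from the actual environment.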
-
Publication number: 20220366245
Abstract: A reinforcement learning method and system that selects actions to be performed by a reinforcement learning agent interacting with an environment. A causal model is implemented by a hindsight model neural network and trained using hindsight i.e. using future environment state trajectories. As the method and system does not have access to this future information when selecting an action, the hindsight model neural network is used to train a model neural network which is conditioned on data from current observations, which learns to predict an output of the hindsight model neural network.
Type: Application
Filed: September 23, 2020
Publication date: November 17, 2022
Inventors: Arthur Clement Guez, Fabio Viola, Theophane Guillaume Weber, Lars Buesing, Nicolas Manfred Otto Heess
-
Publication number: 20220366246
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using an environment model to simulate state transitions of an environment being interacted with by an agent that is controlled using a policy neural network. One of the methods includes initializing an internal representation of a state of the environment at a current time point; repeatedly performing the following operations: receiving an action to be performed by the agent; generating, based on the internal representation, a predicted latent representation that is a prediction of a latent representation that would have been generated by the policy neural network by processing an observation characterizing the state of the environment corresponding to the internal representation; and updating the internal representation to simulate a state transition caused by the agent performing the received action by processing the predicted latent representation and the received action using the environment model.
Type: Application
Filed: September 24, 2020
Publication date: November 17, 2022
Inventors: Ivo Danihelka, Danilo Jimenez Rezende, Karol Gregor, Georgios Papamakarios, Theophane Guillaume Weber
-
Patent number: 11388424
Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code, to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
Type: Grant
Filed: December 29, 2020
Date of Patent: July 12, 2022
Assignee: DeepMind Technologies Limited
Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
-
Patent number: 11328183
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Grant
Filed: September 14, 2020
Date of Patent: May 10, 2022
Assignee: DeepMind Technologies Limited
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Publication number: 20210152835
Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code, to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
Type: Application
Filed: December 29, 2020
Publication date: May 20, 2021
Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
-
Publication number: 20210089834
Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
Type: Application
Filed: December 7, 2020
Publication date: March 25, 2021
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Publication number: 20210081795
Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to: at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprising applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
Type: Application
Filed: November 30, 2020
Publication date: March 18, 2021
Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro
-
Publication number: 20210073594
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Application
Filed: September 14, 2020
Publication date: March 11, 2021
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Patent number: 10887607
Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code, to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
Type: Grant
Filed: November 18, 2019
Date of Patent: January 5, 2021
Assignee: DeepMind Technologies Limited
Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
-
Patent number: 10860895
Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
Type: Grant
Filed: November 19, 2019
Date of Patent: December 8, 2020
Assignee: DeepMind Technologies Limited
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Patent number: 10853725
Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to: at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprising applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
Type: Grant
Filed: May 17, 2019
Date of Patent: December 1, 2020
Assignee: DeepMind Technologies Limited
Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro
-
Patent number: 10776670
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Grant
Filed: November 19, 2019
Date of Patent: September 15, 2020
Assignee: DeepMind Technologies Limited
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Publication number: 20200090006
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Application
Filed: November 19, 2019
Publication date: March 19, 2020
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Publication number: 20200092565
Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code, to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
Type: Application
Filed: November 18, 2019
Publication date: March 19, 2020
Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
-
Publication number: 20200082227
Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
Type: Application
Filed: November 19, 2019
Publication date: March 12, 2020
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Publication number: 20190354858
Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to: at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprising applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
Type: Application
Filed: May 17, 2019
Publication date: November 21, 2019
Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro