Patents by Inventor Theophane Guillaume Weber

Theophane Guillaume Weber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11836596
    Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and a memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to, at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprises applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: December 5, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro
  • Publication number: 20220366247
    Abstract: A reinforcement learning system and method that selects actions to be performed by an agent interacting with an environment. The system uses a combination of reinforcement learning and a lookahead search: reinforcement learning Q-values are used to guide the lookahead search, and the search is used in turn to improve the Q-values. The system learns from a combination of real experience and simulated, model-based experience.
    Type: Application
    Filed: September 23, 2020
    Publication date: November 17, 2022
    Inventors: Jessica Blake Chandler Hamrick, Victor Constant Bapst, Alvaro Sanchez, Tobias Pfaff, Theophane Guillaume Weber, Lars Buesing, Peter William Battaglia
  • Publication number: 20220366245
    Abstract: A reinforcement learning method and system that selects actions to be performed by a reinforcement learning agent interacting with an environment. A causal model is implemented by a hindsight model neural network and trained using hindsight, i.e., using future environment state trajectories. As the method and system do not have access to this future information when selecting an action, the hindsight model neural network is used to train a model neural network that is conditioned on data from current observations and that learns to predict an output of the hindsight model neural network.
    Type: Application
    Filed: September 23, 2020
    Publication date: November 17, 2022
    Inventors: Arthur Clement Guez, Fabio Viola, Theophane Guillaume Weber, Lars Buesing, Nicolas Manfred Otto Heess
  • Publication number: 20220366246
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using an environment model to simulate state transitions of an environment being interacted with by an agent that is controlled using a policy neural network. One of the methods includes initializing an internal representation of a state of the environment at a current time point; repeatedly performing the following operations: receiving an action to be performed by the agent; generating, based on the internal representation, a predicted latent representation that is a prediction of a latent representation that would have been generated by the policy neural network by processing an observation characterizing the state of the environment corresponding to the internal representation; and updating the internal representation to simulate a state transition caused by the agent performing the received action by processing the predicted latent representation and the received action using the environment model.
    Type: Application
    Filed: September 24, 2020
    Publication date: November 17, 2022
    Inventors: Ivo Danihelka, Danilo Jimenez Rezende, Karol Gregor, Georgios Papamakarios, Theophane Guillaume Weber
  • Patent number: 11388424
    Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: July 12, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
  • Patent number: 11328183
    Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
    Type: Grant
    Filed: September 14, 2020
    Date of Patent: May 10, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
  • Publication number: 20210152835
    Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
    Type: Application
    Filed: December 29, 2020
    Publication date: May 20, 2021
    Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
  • Publication number: 20210089834
    Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
    Type: Application
    Filed: December 7, 2020
    Publication date: March 25, 2021
    Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
  • Publication number: 20210081795
    Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and a memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to, at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprises applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
    Type: Application
    Filed: November 30, 2020
    Publication date: March 18, 2021
    Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro
  • Publication number: 20210073594
    Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
    Type: Application
    Filed: September 14, 2020
    Publication date: March 11, 2021
    Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
  • Patent number: 10887607
    Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: January 5, 2021
    Assignee: DeepMind Technologies Limited
    Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
  • Patent number: 10860895
    Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
    Type: Grant
    Filed: November 19, 2019
    Date of Patent: December 8, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
  • Patent number: 10853725
    Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and a memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to, at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprises applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
    Type: Grant
    Filed: May 17, 2019
    Date of Patent: December 1, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro
  • Patent number: 10776670
    Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
    Type: Grant
    Filed: November 19, 2019
    Date of Patent: September 15, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
  • Publication number: 20200090006
    Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
    Type: Application
    Filed: November 19, 2019
    Publication date: March 19, 2020
    Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
  • Publication number: 20200092565
    Abstract: A system implemented by one or more computers comprises a visual encoder component configured to receive as input data representing a sequence of image frames, in particular representing objects in a scene of the sequence, and to output a sequence of corresponding state codes, each state code comprising vectors, one for each of the objects. Each vector represents a respective position and velocity of its corresponding object. The system also comprises a dynamic predictor component configured to take as input a sequence of state codes, for example from the visual encoder, and predict a state code for a next unobserved frame. The system further comprises a state decoder component configured to convert the predicted state code to a state, the state comprising a respective position and velocity vector for each object in the scene. This state may represent a predicted position and velocity vector for each of the objects.
    Type: Application
    Filed: November 18, 2019
    Publication date: March 19, 2020
    Inventors: Nicholas Watters, Razvan Pascanu, Peter William Battaglia, Daniel Zorn, Theophane Guillaume Weber
  • Publication number: 20200082227
    Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
    Type: Application
    Filed: November 19, 2019
    Publication date: March 12, 2020
    Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
  • Publication number: 20190354858
    Abstract: A system including one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement a memory and a memory-based neural network is described. The memory is configured to store a respective memory vector at each of a plurality of memory locations in the memory. The memory-based neural network is configured to, at each of a plurality of time steps: receive an input; determine an update to the memory, wherein determining the update comprises applying an attention mechanism over the memory vectors in the memory and the received input; update the memory using the determined update to the memory; and generate an output for the current time step using the updated memory.
    Type: Application
    Filed: May 17, 2019
    Publication date: November 21, 2019
    Inventors: Mike Chrzanowski, Jack William Rae, Ryan Faulkner, Theophane Guillaume Weber, David Nunes Raposo, Adam Anthony Santoro
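
Illustrative note on the memory-based neural network abstract (publication 20190354858 and its related filings in this listing): the per-time-step loop it describes (receive an input, apply attention over the memory vectors and the input, update the memory, generate an output) can be sketched roughly as follows. This is a toy sketch under stated assumptions, not the patented implementation. The abstract does not specify the attention form, the update rule, or the output function, so the scaled dot-product attention, the additive update, the mean-pooled output, the zero-initialized memory, and every function name below are hypothetical choices made only for illustration. No learned weights are included.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(memory, x):
    """Compute one update vector per memory slot.

    Each slot acts as a query over the memory vectors plus the input
    (assumption: scaled dot-product attention, no learned projections).
    """
    candidates = memory + [x]  # keys and values: all slots, then the input
    scale = math.sqrt(len(x))
    updates = []
    for query in memory:
        weights = softmax([dot(query, key) / scale for key in candidates])
        update = [
            sum(w * c[j] for w, c in zip(weights, candidates))
            for j in range(len(x))
        ]
        updates.append(update)
    return updates


def step(memory, x):
    """One time step: determine the update, apply it, produce an output."""
    updates = attend(memory, x)
    # Assumption: the update is added to the old memory (residual style).
    new_memory = [
        [m + u for m, u in zip(row, upd)] for row, upd in zip(memory, updates)
    ]
    # Assumption: the output is the mean of the updated memory vectors.
    output = [
        sum(row[j] for row in new_memory) / len(new_memory)
        for j in range(len(x))
    ]
    return new_memory, output


def run(inputs, num_slots=2):
    """Process a sequence of input vectors, starting from an all-zero memory."""
    dim = len(inputs[0])
    memory = [[0.0] * dim for _ in range(num_slots)]
    outputs = []
    for x in inputs:
        memory, out = step(memory, x)
        outputs.append(out)
    return memory, outputs
```

Calling `run([[1.0, 0.0], [0.0, 1.0]])` feeds two inputs through a two-slot memory and returns the final memory along with one output vector per time step. A trained system would presumably replace the fixed dot products with learned query, key, and value projections, which the abstract leaves unspecified.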