Patents by Inventor Lars Buesing
Lars Buesing has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220366245
Abstract: A reinforcement learning method and system that selects actions to be performed by a reinforcement learning agent interacting with an environment. A causal model is implemented by a hindsight model neural network and trained using hindsight, i.e. using future environment state trajectories. Because the method and system do not have access to this future information when selecting an action, the hindsight model neural network is used to train a model neural network that is conditioned on data from current observations and that learns to predict an output of the hindsight model neural network.
Type: Application
Filed: September 23, 2020
Publication date: November 17, 2022
Inventors: Arthur Clement Guez, Fabio Viola, Theophane Guillaume Weber, Lars Buesing, Nicolas Manfred Otto Heess
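The abstract above describes training a hindsight model on future trajectories and distilling it into a model that only sees current observations. The following is a minimal NumPy sketch of that distillation idea under toy assumptions (linear models, invented dimensions and weight names such as W_hind and W_model); it is an illustration, not the patented implementation.

```python
# Hedged sketch: a hindsight network that sees future-trajectory features is
# distilled into a model network conditioned only on the current observation.
import numpy as np

rng = np.random.default_rng(0)
obs_dim, future_dim, feat_dim = 4, 6, 3

# Hindsight network: maps a (toy) future trajectory to hindsight features.
W_hind = rng.normal(size=(future_dim, feat_dim)) * 0.1
# Model network: conditioned on the current observation only; trained to
# predict the hindsight network's output.
W_model = np.zeros((obs_dim, feat_dim))

lr = 0.05
for step in range(2000):
    obs = rng.normal(size=obs_dim)                                    # current observation
    # Toy "future trajectory": partly determined by the observation, partly noise.
    future = np.concatenate([obs[:2], rng.normal(size=future_dim - 2)])
    phi = np.tanh(future @ W_hind)                                    # hindsight features (use future info)
    phi_hat = obs @ W_model                                           # prediction from current obs only
    # Distillation loss gradient for 0.5 * ||phi_hat - phi||^2 w.r.t. W_model.
    grad = np.outer(obs, phi_hat - phi)
    W_model -= lr * grad

print("final distillation error:", float(np.mean((obs @ W_model - phi) ** 2)))
```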
-
Publication number: 20220366247
Abstract: A reinforcement learning system and method that selects actions to be performed by an agent interacting with an environment. The system uses a combination of reinforcement learning and a look-ahead search: reinforcement learning Q-values are used to guide the look-ahead search, and the search is used in turn to improve the Q-values. The system learns from a combination of real experience and simulated, model-based experience.
Type: Application
Filed: September 23, 2020
Publication date: November 17, 2022
Inventors: Jessica Blake Chandler Hamrick, Victor Constant Bapst, Alvaro Sanchez, Tobias Pfaff, Theophane Guillaume Weber, Lars Buesing, Peter William Battaglia
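As a rough illustration of the combination described above, the toy sketch below uses learned Q-values to evaluate the leaves of a shallow look-ahead search over a known toy model, and feeds the search result back as the Q-learning target. The environment, dimensions, and function names are invented for this example; it is not the patented system.

```python
# Hedged sketch: Q-values guide a shallow look-ahead search, and the search
# result is fed back as a learning target that improves the Q-values.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma, lr = 5, 2, 0.9, 0.1
# Toy known model used to generate simulated, model-based experience.
P = rng.integers(0, n_states, size=(n_states, n_actions))   # deterministic transitions
R = rng.normal(size=(n_states, n_actions))                  # rewards
Q = np.zeros((n_states, n_actions))

def search(s, depth=2):
    """Depth-limited look-ahead that uses the learned Q-values at the leaves."""
    if depth == 0:
        return Q[s].max()
    return max(R[s, a] + gamma * search(P[s, a], depth - 1) for a in range(n_actions))

for episode in range(200):
    s = rng.integers(0, n_states)
    for t in range(10):
        # Search-improved values both guide action selection and serve as Q targets.
        values = [R[s, a] + gamma * search(P[s, a]) for a in range(n_actions)]
        a = int(np.argmax(values))
        Q[s, a] += lr * (values[a] - Q[s, a])
        s = P[s, a]

print(np.round(Q, 2))
```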
-
Patent number: 11328183
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Grant
Filed: September 14, 2020
Date of Patent: May 10, 2022
Assignee: DeepMind Technologies Limited
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
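A minimal sketch of the architecture outlined in this abstract, under toy assumptions: an imagination_core function rolls out imagined future features using a (here random, untrained) environment model, a rollout_encoder summarises the rollout, and an output stage maps the current observation plus the rollout embedding to action probabilities. All names, dimensions, and weights below are hypothetical illustrations, not the patented design.

```python
# Hedged sketch: imagination core -> rollout encoder -> output stage (policy).
import numpy as np

rng = np.random.default_rng(0)
obs_dim, act_dim, rollout_len = 4, 3, 5

W_model = rng.normal(size=(obs_dim + act_dim, obs_dim)) * 0.1   # environment model (untrained here)
W_enc = rng.normal(size=(obs_dim, obs_dim)) * 0.1               # rollout encoder weights
W_pi = rng.normal(size=(2 * obs_dim, act_dim)) * 0.1            # output stage -> action logits

def imagination_core(obs):
    """Imagine a trajectory of future environment features from the current observation."""
    traj, state = [], obs
    for _ in range(rollout_len):
        a = np.eye(act_dim)[rng.integers(act_dim)]               # rollout policy (random here)
        state = np.tanh(np.concatenate([state, a]) @ W_model)    # model-predicted next features
        traj.append(state)
    return traj

def rollout_encoder(traj):
    """Encode the imagined trajectory into a single rollout embedding."""
    h = np.zeros(obs_dim)
    for feat in reversed(traj):                                  # process the rollout backwards
        h = np.tanh(feat + h @ W_enc)
    return h

def policy(obs):
    emb = rollout_encoder(imagination_core(obs))
    logits = np.concatenate([obs, emb]) @ W_pi                   # output stage
    return np.exp(logits) / np.exp(logits).sum()

print(policy(rng.normal(size=obs_dim)))
```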
-
Publication number: 20210383228
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating prediction outputs characterizing a set of entities. In one aspect, a method comprises: obtaining data defining a graph, comprising: (i) a set of nodes, wherein each node represents a respective entity from the set of entities, (ii) a current set of edges, wherein each edge connects a pair of nodes, and (iii) a respective current embedding of each node; at each of a plurality of time steps: updating the respective current embedding of each node, comprising processing data defining the graph using a graph neural network; and updating the current set of edges based at least in part on the updated embeddings of the nodes; and at one or more of the plurality of time steps: generating a prediction output characterizing the set of entities based on the current embeddings of the nodes.
Type: Application
Filed: June 4, 2021
Publication date: December 9, 2021
Inventors: Petar Velickovic, Charles Blundell, Oriol Vinyals, Razvan Pascanu, Lars Buesing, Matthew Overlan
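The sketch below illustrates, in NumPy, the loop the abstract describes: a message-passing (graph neural network) update of node embeddings over the current edge set, followed by rewiring the edges from the updated embeddings and emitting a per-step prediction. The rewiring rule (nearest neighbour in embedding space) and all names are assumptions made for this example only.

```python
# Hedged sketch: alternate between updating node embeddings over the current
# edges and updating the edges from the new embeddings, with a per-step prediction.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim, steps = 6, 4, 3

emb = rng.normal(size=(n_nodes, dim))                       # current node embeddings
adj = np.eye(n_nodes, k=1) + np.eye(n_nodes, k=-1)          # initial edge set (a chain)
W_msg = rng.normal(size=(dim, dim)) * 0.1
W_out = rng.normal(size=dim) * 0.1

for t in range(steps):
    # Update embeddings: each node aggregates messages from its current neighbours.
    messages = adj @ (emb @ W_msg)
    emb = np.tanh(emb + messages)
    # Update edges from the new embeddings: connect each node to its nearest neighbour.
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nearest = d.argmin(axis=1)
    adj = np.zeros((n_nodes, n_nodes))
    adj[np.arange(n_nodes), nearest] = 1.0
    # A (toy) prediction output characterizing the set of entities at this step.
    print("step", t, "prediction:", float((emb @ W_out).mean()))
```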
-
Publication number: 20210089834
Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
Type: Application
Filed: December 7, 2020
Publication date: March 25, 2021
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
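As an illustration of the components named in this abstract, the toy sketch below wires together a controller, an imagination module, a manager that routes between imagining and acting, and a simple memory for context. The routing rule (a fixed imagination budget) and all parameters are hypothetical, not the patented mechanism.

```python
# Hedged sketch: manager routes between "imagine" and "act"; the imagination
# module predicts consequent states; the controller proposes actions; a memory
# accumulates context from imagined rollouts.
import numpy as np

rng = np.random.default_rng(0)
state_dim, act_dim = 4, 2
W_ctrl = rng.normal(size=(2 * state_dim, act_dim)) * 0.1          # controller: (state, context) -> action
W_imag = rng.normal(size=(state_dim + act_dim, state_dim)) * 0.1  # imagination: (state, action) -> next state

def controller(state, context):
    return np.tanh(np.concatenate([state, context]) @ W_ctrl)

def imagination(state, action):
    return np.tanh(np.concatenate([state, action]) @ W_imag)

def manager(imagination_steps_used, budget=3):
    """Route data: keep imagining while budget remains, then execute."""
    return "imagine" if imagination_steps_used < budget else "act"

state = rng.normal(size=state_dim)
memory = np.zeros(state_dim)                                      # context memory
imagined = 0
while True:
    route = manager(imagined)
    action = controller(state, memory)
    if route == "imagine":
        consequent = imagination(state, action)                   # imagined consequent state
        memory = 0.5 * memory + 0.5 * consequent                  # store context from imagination
        imagined += 1
    else:
        print("executing action:", np.round(action, 3))           # act in the real environment
        break
```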
-
Publication number: 20210073594
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Application
Filed: September 14, 2020
Publication date: March 11, 2021
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Patent number: 10860895
Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
Type: Grant
Filed: November 19, 2019
Date of Patent: December 8, 2020
Assignee: DeepMind Technologies Limited
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Patent number: 10776670
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Grant
Filed: November 19, 2019
Date of Patent: September 15, 2020
Assignee: DeepMind Technologies Limited
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Publication number: 20200090006
Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation and/or historical observations, the trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
Type: Application
Filed: November 19, 2019
Publication date: March 19, 2020
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere
-
Publication number: 20200082227
Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
Type: Application
Filed: November 19, 2019
Publication date: March 12, 2020
Inventors: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere