Patents by Inventor Victor Constant Bapst
Victor Constant Bapst has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12190223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
Type: Grant
Filed: May 28, 2020
Date of Patent: January 7, 2025
Assignee: DeepMind Technologies Limited
Inventors: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
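The abstract above describes replaying stored trajectories through an off-policy actor-critic update. The following sketch is only a toy illustration of that loop, not the patented method: the random environment, linear softmax policy, learning rate, and truncated importance weight are all assumptions made so the example stays small and runnable.

```python
# Hypothetical minimal sketch: trajectories are stored in a replay memory and an
# action-selection policy is updated with an off-policy actor-critic step that
# uses a truncated importance weight. All constants and the toy environment are
# assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, GAMMA = 5, 3, 0.99

theta = np.zeros((N_STATES, N_ACTIONS))   # policy logits (actor)
values = np.zeros(N_STATES)               # state values (critic)
replay_memory = []                        # list of stored trajectories

def policy(s):
    z = theta[s]
    e = np.exp(z - z.max())
    return e / e.sum()

def generate_trajectory(length=10):
    """Interact with a toy random environment, recording behaviour probabilities."""
    traj, s = [], rng.integers(N_STATES)
    for _ in range(length):
        probs = policy(s)
        a = rng.choice(N_ACTIONS, p=probs)
        r = rng.normal()                       # stand-in reward
        s_next = rng.integers(N_STATES)        # stand-in transition
        traj.append((s, a, r, s_next, probs[a]))
        s = s_next
    return traj

def actor_critic_update(traj, lr=0.01, clip=10.0):
    """One off-policy actor-critic step on a sampled trajectory."""
    for s, a, r, s_next, behaviour_p in traj:
        probs = policy(s)
        rho = min(probs[a] / (behaviour_p + 1e-8), clip)   # truncated importance weight
        td_error = r + GAMMA * values[s_next] - values[s]
        values[s] += lr * rho * td_error                   # critic update
        grad_log_pi = -probs
        grad_log_pi[a] += 1.0
        theta[s] += lr * rho * td_error * grad_log_pi      # actor update

for step in range(100):
    replay_memory.append(generate_trajectory())            # store new experience
    sampled = replay_memory[rng.integers(len(replay_memory))]  # sample a trajectory
    actor_critic_update(sampled)                            # adjust policy parameters
print("trained policy for state 0:", policy(0))
```

The importance weight `rho` corrects for the gap between the behaviour policy recorded when the trajectory was stored and the current policy, which is what makes the replayed update off-policy.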
-
Patent number: 12190236
Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for predicting one or more properties of a material. One of the methods includes maintaining data specifying a set of known materials each having a respective known physical structure; receiving data specifying a new material; identifying a plurality of known materials in the set of known materials that are similar to the new material; determining a predicted embedding of the new material from at least respective embeddings corresponding to each of the similar known materials; and processing the predicted embedding of the new material using an experimental prediction neural network to predict one or more properties of the new material.
Type: Grant
Filed: April 26, 2021
Date of Patent: January 7, 2025
Assignee: DeepMind Technologies Limited
Inventors: Annette Ada Nkechinyere Obika, Tian Xie, Victor Constant Bapst, Alexander Lloyd Gaunt, James Kirkpatrick
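Reading the abstract as a pipeline, a minimal sketch might look like the following. Everything concrete here, the feature vectors, the Euclidean similarity, the nearest-neighbour averaging, and the tiny prediction network, is a hypothetical stand-in for the components the claim only names.

```python
# Hypothetical sketch of the claimed pipeline: known materials carry embeddings,
# a new material's embedding is predicted from its most similar known materials,
# and a small "experimental prediction" network maps that embedding to a property
# value. The similarity measure, embedding size, and weights are invented here
# purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, N_KNOWN = 8, 50

# Maintained data: stand-in structure features and embeddings of known materials.
known_features = rng.normal(size=(N_KNOWN, 4))
known_embeddings = rng.normal(size=(N_KNOWN, EMBED_DIM))

# Stand-in "experimental prediction neural network": one hidden layer.
W1, b1 = rng.normal(size=(EMBED_DIM, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def predict_properties(embedding):
    h = np.maximum(embedding @ W1 + b1, 0.0)       # ReLU hidden layer
    return h @ W2 + b2

def predicted_embedding(new_features, k=5):
    """Average the embeddings of the k known materials most similar to the new one."""
    dists = np.linalg.norm(known_features - new_features, axis=1)
    neighbours = np.argsort(dists)[:k]
    return known_embeddings[neighbours].mean(axis=0)

new_material = rng.normal(size=4)                  # feature vector for a new material
emb = predicted_embedding(new_material)
print("predicted property:", predict_properties(emb))
```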
-
Patent number: 11983634
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
Type: Grant
Filed: September 27, 2021
Date of Patent: May 14, 2024
Assignee: DeepMind Technologies Limited
Inventors: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess
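As a rough illustration of the claimed objective, the sketch below gives each task a worker policy plus a shared multitask policy, and optimizes an expected-reward term minus a regularizer that pulls each task policy toward the shared one, written here as a KL divergence (one common way to express such an entropy-style regularizer). The toy bandit setup, coefficients, and finite-difference ascent are illustrative assumptions, not the patented training procedure.

```python
# Hypothetical sketch of the described objective: each worker has its own task
# policy, a shared module holds the multitask policy, and the per-task loss
# combines an expected-reward term with a KL term that regularizes the task
# policy toward the shared policy. The toy setup and coefficients are assumptions.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, N_TASKS = 4, 3

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

shared_logits = np.zeros(N_ACTIONS)                                  # shared multitask policy
task_logits = [np.zeros(N_ACTIONS) for _ in range(N_TASKS)]          # one worker per task
task_rewards = [rng.normal(size=N_ACTIONS) for _ in range(N_TASKS)]  # toy per-action rewards

def task_objective(task, beta=0.5):
    """Expected reward minus a KL penalty toward the shared multitask policy."""
    pi = softmax(task_logits[task])
    pi0 = softmax(shared_logits)
    expected_reward = np.dot(pi, task_rewards[task])
    kl_to_shared = np.sum(pi * (np.log(pi + 1e-8) - np.log(pi0 + 1e-8)))
    return expected_reward - beta * kl_to_shared

# Coordinated training: nudge each worker's logits up a finite-difference gradient
# of its objective, then distil the shared policy toward the task policies.
for _ in range(200):
    for t in range(N_TASKS):
        grad, eps = np.zeros(N_ACTIONS), 1e-4
        for a in range(N_ACTIONS):
            task_logits[t][a] += eps
            up = task_objective(t)
            task_logits[t][a] -= 2 * eps
            down = task_objective(t)
            task_logits[t][a] += eps
            grad[a] = (up - down) / (2 * eps)
        task_logits[t] += 0.1 * grad
    shared_logits = np.mean([task_logits[t] for t in range(N_TASKS)], axis=0)

print("shared multitask policy:", softmax(shared_logits))
```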
-
Publication number: 20230196146
Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
Type: Application
Filed: February 13, 2023
Publication date: June 22, 2023
Inventors: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro
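The attention block the abstract describes can be pictured as self-attention over per-entity vectors, as in the hedged sketch below; the dimensions, weight initialization, residual connection, and mean-pooled output head are assumptions for illustration rather than the claimed architecture.

```python
# Hypothetical sketch of one attention block over entity data: each entity's
# vector is transformed with (query, key, value) projections so that its new
# value depends on the other entities, in the style of standard self-attention.
import numpy as np

rng = np.random.default_rng(0)
N_ENTITIES, D = 6, 8

entities = rng.normal(size=(N_ENTITIES, D))        # entity data extracted from state data
Wq, Wk, Wv = (rng.normal(size=(D, D)) * 0.1 for _ in range(3))

def attention_block(x):
    """Return modified entity data: each row attends to every entity."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(D)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return x + weights @ v                          # residual connection

# Stack blocks, then an output head selects an action from the pooled entities.
h = attention_block(attention_block(entities))
action_logits = h.mean(axis=0) @ rng.normal(size=(D, 4))
print("selected action:", int(np.argmax(action_logits)))
```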
-
Patent number: 11580429
Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
Type: Grant
Filed: May 20, 2019
Date of Patent: February 14, 2023
Assignee: DeepMind Technologies Limited
Inventors: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro
-
Publication number: 20220366247
Abstract: A reinforcement learning system and method that selects actions to be performed by an agent interacting with an environment. The system uses a combination of reinforcement learning and a look-ahead search: reinforcement learning Q-values are used to guide the look-ahead search, and the search is used in turn to improve the Q-values. The system learns from a combination of real experience and simulated, model-based experience.
Type: Application
Filed: September 23, 2020
Publication date: November 17, 2022
Inventors: Jessica Blake Chandler Hamrick, Victor Constant Bapst, Alvaro Sanchez, Tobias Pfaff, Theophane Guillaume Weber, Lars Buesing, Peter William Battaglia
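One way to picture the described interplay is the sketch below: learned Q-values evaluate the leaves of a short depth-limited search over a model, and the search result then serves as the target that improves Q. The deterministic toy model, search depth, and learning rate are assumptions, and the depth-limited max is a simple stand-in for whatever look-ahead search the application actually claims.

```python
# Hypothetical sketch: Q-values guide a depth-limited look-ahead search over a
# toy known model (they evaluate the leaves), and the search value is used as a
# target to improve the Q-values. All constants and the model are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, GAMMA, DEPTH = 6, 3, 0.95, 3

# Toy deterministic model (learned or simulated experience in practice).
transitions = rng.integers(N_STATES, size=(N_STATES, N_ACTIONS))
rewards = rng.normal(size=(N_STATES, N_ACTIONS))
Q = np.zeros((N_STATES, N_ACTIONS))

def search_value(state, depth):
    """Depth-limited look-ahead; leaves are evaluated with the learned Q-values."""
    if depth == 0:
        return Q[state].max()
    return max(
        rewards[state, a] + GAMMA * search_value(transitions[state, a], depth - 1)
        for a in range(N_ACTIONS)
    )

for step in range(500):
    s = rng.integers(N_STATES)
    for a in range(N_ACTIONS):
        # The search improves the Q-values that guided it.
        target = rewards[s, a] + GAMMA * search_value(transitions[s, a], DEPTH)
        Q[s, a] += 0.1 * (target - Q[s, a])

print("greedy action in state 0:", int(Q[0].argmax()))
```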
-
Publication number: 20220083869
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
Type: Application
Filed: September 27, 2021
Publication date: March 17, 2022
Inventors: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess
-
Publication number: 20210334655
Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for predicting one or more properties of a material. One of the methods includes maintaining data specifying a set of known materials each having a respective known physical structure; receiving data specifying a new material; identifying a plurality of known materials in the set of known materials that are similar to the new material; determining a predicted embedding of the new material from at least respective embeddings corresponding to each of the similar known materials; and processing the predicted embedding of the new material using an experimental prediction neural network to predict one or more properties of the new material.
Type: Application
Filed: April 26, 2021
Publication date: October 28, 2021
Inventors: Annette Ada Nkechinyere Obika, Tian Xie, Victor Constant Bapst, Alexander Lloyd Gaunt, James Kirkpatrick
-
Patent number: 11132609
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
Type: Grant
Filed: November 19, 2019
Date of Patent: September 28, 2021
Assignee: DeepMind Technologies Limited
Inventors: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess
-
Publication number: 20200293862
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
Type: Application
Filed: May 28, 2020
Publication date: September 17, 2020
Inventors: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
-
Patent number: 10706352
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
Type: Grant
Filed: May 3, 2019
Date of Patent: July 7, 2020
Assignee: DeepMind Technologies Limited
Inventors: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
-
Publication number: 20200090048
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
Type: Application
Filed: November 19, 2019
Publication date: March 19, 2020
Inventors: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess
-
Publication number: 20190354885
Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
Type: Application
Filed: May 20, 2019
Publication date: November 21, 2019
Inventors: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro
-
Publication number: 20190258918
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
Type: Application
Filed: May 3, 2019
Publication date: August 22, 2019
Inventors: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst