Patents by Inventor Hubert Josef Soyer
Hubert Josef Soyer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250200379Abstract: The invention describes a system and a method for controlling an agent interacting with an environment to perform a task, the method comprising, at each of a plurality of first time steps from a plurality of time steps: receiving an observation characterizing a state of the environment at the first time step; determining a goal representation for the first time step that characterizes a goal state of the environment to be reached by the agent; processing the observation and the goal representation using a low-level controller neural network to generate a low-level policy output that defines an action to be performed by the agent in response to the observation, wherein the low-level controller neural network comprises: a representation neural network configured to process the observation to generate an internal state representation of the observation, and a low-level policy head configured to process the state observation representation and the goal representation to generate the low-level policy output; and cType: ApplicationFiled: June 7, 2023Publication date: June 19, 2025Inventors: Hubert Josef Soyer, Feryal Behbahani, Thomas Albert Keck, Kyriacos Nikiforou, Bernardo Avila Pires, Satinder Singh Baveja
-
Patent number: 12299574Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.Type: GrantFiled: October 16, 2023Date of Patent: May 13, 2025Assignee: DeepMind Technologies LimitedInventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
-
Publication number: 20240127060Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.Type: ApplicationFiled: October 16, 2023Publication date: April 18, 2024Inventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
-
Publication number: 20240119262Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.Type: ApplicationFiled: October 2, 2023Publication date: April 11, 2024Inventors: Neil Charles Rabinowitz, Guillaume Desjardins, Andrei-Alexandru Rusu, Koray Kavukcuoglu, Raia Thais Hadsell, Razvan Pascanu, James Kirkpatrick, Hubert Josef Soyer
-
Patent number: 11868894Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.Type: GrantFiled: January 4, 2023Date of Patent: January 9, 2024Assignee: DeepMind Technologies LimitedInventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
-
Patent number: 11775804Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.Type: GrantFiled: March 15, 2021Date of Patent: October 3, 2023Assignee: DeepMind Technologies LimitedInventors: Neil Charles Rabinowitz, Guillaume Desjardins, Andrei-Alexandru Rusu, Koray Kavukcuoglu, Raia Thais Hadsell, Razvan Pascanu, James Kirkpatrick, Hubert Josef Soyer
-
Publication number: 20230153617Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.Type: ApplicationFiled: January 4, 2023Publication date: May 18, 2023Inventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
-
Patent number: 11593646Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.Type: GrantFiled: February 5, 2019Date of Patent: February 28, 2023Assignee: DeepMind Technologies LimitedInventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
-
Patent number: 11074481Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.Type: GrantFiled: January 17, 2020Date of Patent: July 27, 2021Assignee: DeepMind Technologies LimitedInventors: Fabio Viola, Piotr Wojciech Mirowski, Andrea Banino, Razvan Pascanu, Hubert Josef Soyer, Andrew James Ballard, Sudarshan Kumaran, Raia Thais Hadsell, Laurent Sifre, Rostislav Goroshin, Koray Kavukcuoglu, Misha Man Ray Denil
-
Publication number: 20210201116Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.Type: ApplicationFiled: March 15, 2021Publication date: July 1, 2021Inventors: Neil Charles Rabinowitz, Guillaume Desjardins, Andrei-Alexandru Rusu, Koray Kavukcuoglu, Raia Thais Hadsell, Razvan Pascanu, James Kirkpatrick, Hubert Josef Soyer
-
Patent number: 10949734Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.Type: GrantFiled: December 30, 2016Date of Patent: March 16, 2021Assignee: DeepMind Technologies LimitedInventors: Neil Charles Rabinowitz, Guillaume Desjardins, Andrei-Alexandru Rusu, Koray Kavukcuoglu, Raia Thais Hadsell, Razvan Pascanu, James Kirkpatrick, Hubert Josef Soyer
-
Publication number: 20210034970Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.Type: ApplicationFiled: February 5, 2019Publication date: February 4, 2021Inventors: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
-
Publication number: 20200151515Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.Type: ApplicationFiled: January 17, 2020Publication date: May 14, 2020Inventors: Fabio Viola, Piotr Wojciech Mirowski, Andrea Banino, Razvan Pascanu, Hubert Josef Soyer, Andrew James Ballard, Sudarshan Kumaran, Raia Thais Hadsell, Laurent Sifre, Rostislav Goroshin, Koray Kavukcuoglu, Misha Man Ray Denil
-
Patent number: 10572776Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.Type: GrantFiled: May 3, 2019Date of Patent: February 25, 2020Assignee: DeepMind Technologies LimitedInventors: Fabio Viola, Piotr Wojciech Mirowski, Andrea Banino, Razvan Pascanu, Hubert Josef Soyer, Andrew James Ballard, Sudarshan Kumaran, Raia Thais Hadsell, Laurent Sifre, Rostislav Goroshin, Koray Kavukcuoglu, Misha Man Ray Denil
-
Publication number: 20190266449Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.Type: ApplicationFiled: May 3, 2019Publication date: August 29, 2019Inventors: Fabio Viola, Piotr Wojciech Mirowski, Andrea Banino, Razvan Pascanu, Hubert Josef Soyer, Andrew James Ballard, Sudarshan Kumaran, Raia Thais Hadsell, Laurent Sifre, Rostislav Goroshin, Koray Kavukcuoglu, Misha Man Ray Denil
-
Publication number: 20170337464Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.Type: ApplicationFiled: December 30, 2016Publication date: November 23, 2017Inventors: Neil Charles Rabinowitz, Guillaume Desjardins, Andrei-Alexandru Rusu, Koray Kavukcuoglu, Raia Thais Hadsell, Razvan Pascanu, James Kirkpatrick, Hubert Josef Soyer