Patents by Inventor Jost Tobias Springenberg

Jost Tobias Springenberg has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PLANNING USING A JUMPY TRAJECTORY DECODER NEURAL NETWORK

Publication number: 20240220795

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents using jumpy trajectory decoder neural networks.

Type: Application

Filed: December 29, 2023

Publication date: July 4, 2024

Inventors: Jingwei Zhang, Arunkumar Byravan, Jost Tobias Springenberg, Martin Riedmiller, Nicolas Manfred Otto Heess, Leonard Hasenclever, Abbas Abdolmaleki, Dushyant Rao

TRAINING AN ACTION SELECTION SYSTEM USING RELATIVE ENTROPY Q-LEARNING

Publication number: 20230214649

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection system using reinforcement learning techniques. In one aspect, a method comprises at each of multiple iterations: obtaining a batch of experience, each experience tuple comprising: a first observation, an action, a second observation, and a reward; for each experience tuple, determining a state value for the second observation, comprising: processing the first observation using a policy neural network to generate an action score for each action in a set of possible actions; sampling multiple actions from the set of possible actions in accordance with the action scores; processing the second observation using a Q neural network to generate a Q value for each sampled action; and determining the state value for the second observation; and determining an update to current values of the Q neural network parameters using the state values.
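The state-value step described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed method: the abstract does not say how the Q values of the sampled actions are combined into a state value, so the temperature-scaled log-mean-exp in `state_value` is an assumption, as are treating action scores as softmax logits, the one-step target in `td_target`, and every function and variable name. The list `q_values_for_second_obs` is a hypothetical stand-in for the output of a Q neural network.

```python
import math
import random


def sample_actions(action_scores, num_samples, rng):
    """Sample action indices with probability proportional to exp(score).

    Assumption: the policy's action scores are treated as softmax logits.
    """
    max_score = max(action_scores)
    weights = [math.exp(s - max_score) for s in action_scores]
    return rng.choices(range(len(action_scores)), weights=weights, k=num_samples)


def state_value(q_values, temperature):
    """Combine Q values of the sampled actions into one state value.

    Assumption: a temperature-scaled log-mean-exp; the abstract leaves
    the aggregation rule unspecified.
    """
    scaled = [q / temperature for q in q_values]
    m = max(scaled)
    mean_exp = sum(math.exp(s - m) for s in scaled) / len(scaled)
    return temperature * (m + math.log(mean_exp))


def td_target(reward, next_state_value, discount):
    """One-step target for the Q update (assumed form)."""
    return reward + discount * next_state_value


rng = random.Random(0)
action_scores = [0.1, 2.0, -1.0]            # hypothetical policy output
q_values_for_second_obs = [1.0, 3.0, 0.5]   # hypothetical Q network output

sampled = sample_actions(action_scores, num_samples=8, rng=rng)
sampled_q = [q_values_for_second_obs[a] for a in sampled]
value = state_value(sampled_q, temperature=1.0)
target = td_target(reward=1.0, next_state_value=value, discount=0.99)
```

In a full training loop, the target would be regressed against the Q value of the action actually taken in the experience tuple; that step is omitted here.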

Type: Application

Filed: July 27, 2021

Publication date: July 6, 2023

Inventors: Rae Chan Jeong, Jost Tobias Springenberg, Jacqueline Ok-chan Kay, Daniel Hai Huan Zheng, Alexandre Galashov, Nicolas Manfred Otto Heess, Francesco Nori

ROBUST REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL WITH MODEL MISSPECIFICATION

Publication number: 20220343157

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a policy neural network having policy parameters. One of the methods includes sampling a mini-batch comprising one or more observation-action-reward tuples generated as a result of interactions of a first agent with a first environment; determining an update to current values of the Q network parameters by minimizing a robust entropy-regularized temporal difference (TD) error that accounts for possible perturbations of the states of the first environment represented by the observations in the observation-action-reward tuples; and determining, using the Q-value neural network, an update to the policy network parameters using the sampled mini-batch of observation-action-reward tuples.

Type: Application

Filed: June 17, 2020

Publication date: October 27, 2022

Inventors: Daniel J. Mankowitz, Nir Levine, Rae Chan Jeong, Abbas Abdolmaleki, Jost Tobias Springenberg, Todd Andrew Hester, Timothy Arthur Mann, Martin Riedmiller

HIERARCHICAL POLICIES FOR MULTITASK TRANSFER

Publication number: 20220237488

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes obtaining an observation characterizing a current state of the environment and data identifying a task currently being performed by the agent; processing the observation and the data identifying the task using a high-level controller to generate a high-level probability distribution that assigns a respective probability to each of a plurality of low-level controllers; processing the observation using each of the plurality of low-level controllers to generate, for each of the plurality of low-level controllers, a respective low-level probability distribution; generating a combined probability distribution; and selecting, using the combined probability distribution, an action from the space of possible actions to be performed by the agent in response to the observation.
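One plausible reading of the combination step in this abstract is a mixture of the low-level distributions weighted by the high-level probabilities. The sketch below illustrates that reading under stated assumptions: a small discrete action space, a plain weighted sum as the combination rule (the abstract does not specify one), and hypothetical function names and input values throughout.

```python
import random


def combined_distribution(high_level_probs, low_level_dists):
    """Mix the low-level action distributions using the high-level weights.

    Assumption: the combined distribution is a weighted sum over
    low-level controllers, with a discrete action space.
    """
    num_actions = len(low_level_dists[0])
    combined = [0.0] * num_actions
    for weight, dist in zip(high_level_probs, low_level_dists):
        for action, prob in enumerate(dist):
            combined[action] += weight * prob
    return combined


def select_action(combined, rng):
    """Sample one action index from the combined distribution."""
    return rng.choices(range(len(combined)), weights=combined, k=1)[0]


# Hypothetical outputs: two low-level controllers, three possible actions.
high_level_probs = [0.25, 0.75]
low_level_dists = [
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 0.5],
]

combined = combined_distribution(high_level_probs, low_level_dists)
action = select_action(combined, random.Random(0))
```

Because each low-level distribution sums to one and the high-level weights sum to one, the mixture is itself a valid probability distribution.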

Type: Application

Filed: May 22, 2020

Publication date: July 28, 2022

Inventors: Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Nicolas Manfred Otto Heess, Martin Riedmiller

GRAPH NEURAL NETWORKS REPRESENTING PHYSICAL SYSTEMS

Publication number: 20210049467

Abstract: A graph neural network system implementing a learnable physics engine for understanding and controlling a physical system. The physical system is considered to be composed of bodies coupled by joints and is represented by static and dynamic graphs. A graph processing neural network processes an input graph, e.g. the static and dynamic graphs, to provide an output graph, e.g. a predicted dynamic graph. The graph processing neural network is differentiable and may be used for control and/or reinforcement learning. The trained graph neural network system can be applied to physical systems with similar but new graph structures (zero-shot learning).

Type: Application

Filed: April 12, 2019

Publication date: February 18, 2021

Inventors: Martin Riedmiller, Raia Thais Hadsell, Peter William Battaglia, Joshua Merel, Jost Tobias Springenberg, Alvaro Sanchez, Nicolas Manfred Otto Heess
