Patents by Inventor Alexandre Galashov

Alexandre Galashov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Learning motor primitives and training a machine learning system using a linear-feedback-stabilized policy

Patent number: 11714996

Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also a neural network system for representing a space of probabilistic motor primitives, implemented by one or more computers. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.

Type: Grant

Filed: July 25, 2022

Date of Patent: August 1, 2023

Assignee: DeepMind Technologies Limited

Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov
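
As a rough illustration of the linear-feedback idea in the abstract above, the sketch below adjusts an expert's nominal action by a term proportional to how far the current state deviates from the expert's nominal state, using the state-action Jacobian as the gain. The abstract does not give an exact formula, so the form `a_ref + J (s - s_ref)` and every name in the code are assumptions made for illustration only.

```python
from typing import List, Sequence


def linear_feedback_action(
    a_ref: Sequence[float],
    jacobian: Sequence[Sequence[float]],
    s_ref: Sequence[float],
    s: Sequence[float],
) -> List[float]:
    """Return a_ref + J (s - s_ref).

    Hypothetical helper: a_ref is the expert's nominal action, s_ref the
    expert's nominal state, and jacobian[i][j] approximates d a_i / d s_j
    along the expert trajectory (an assumed reading of "state-action
    Jacobian").
    """
    # Deviation of the current state from the expert's nominal state.
    delta = [si - ri for si, ri in zip(s, s_ref)]
    # Matrix-vector product J @ delta, added to the nominal action.
    return [
        a + sum(j * d for j, d in zip(row, delta))
        for a, row in zip(a_ref, jacobian)
    ]


if __name__ == "__main__":
    action = linear_feedback_action(
        a_ref=[1.0, 0.0],
        jacobian=[[1.0, 0.0, 2.0], [0.0, -1.0, 0.0]],
        s_ref=[0.0, 0.0, 0.0],
        s=[0.5, 1.0, 0.25],
    )
    print(action)
```

Under this reading, a student could be trained to imitate the actions such a policy produces in perturbed states near the expert trajectory; the abstract itself only says the policy "may be based on" the Jacobian.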

TRAINING AN ACTION SELECTION SYSTEM USING RELATIVE ENTROPY Q-LEARNING

Publication number: 20230214649

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection system using reinforcement learning techniques. In one aspect, a method comprises at each of multiple iterations: obtaining a batch of experience, each experience tuple comprising: a first observation, an action, a second observation, and a reward; for each experience tuple, determining a state value for the second observation, comprising: processing the first observation using a policy neural network to generate an action score for each action in a set of possible actions; sampling multiple actions from the set of possible actions in accordance with the action scores; processing the second observation using a Q neural network to generate a Q value for each sampled action; and determining the state value for the second observation; and determining an update to current values of the Q neural network parameters using the state values.

Type: Application

Filed: July 27, 2021

Publication date: July 6, 2023

Inventors: Rae Chan Jeong, Jost Tobias Springenberg, Jacqueline Ok-chan Kay, Daniel Hai Huan Zheng, Alexandre Galashov, Nicolas Manfred Otto Heess, Francesco Nori
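
The state-value step in the abstract above (score the actions, sample several of them, evaluate each sample with a Q function, then combine the results) can be sketched as follows. The abstract does not say how the sampled Q values are combined, so the temperature-scaled log-mean-exp used here is an assumption, as are the function names, the softmax over action scores, and the `temperature` parameter.

```python
import math
import random
from typing import List, Sequence


def sample_actions(
    action_scores: Sequence[float], num_samples: int, seed: int = 0
) -> List[int]:
    """Sample action indices with probability proportional to softmax(scores)."""
    top = max(action_scores)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(s - top) for s in action_scores]
    rng = random.Random(seed)
    return rng.choices(range(len(action_scores)), weights=weights, k=num_samples)


def soft_state_value(q_values: Sequence[float], temperature: float) -> float:
    """Assumed combination rule: temperature * log(mean(exp(q / temperature)))."""
    top = max(q_values)
    mean_exp = sum(math.exp((q - top) / temperature) for q in q_values) / len(q_values)
    return top + temperature * math.log(mean_exp)


if __name__ == "__main__":
    scores = [0.1, 2.0, -1.0, 0.5]      # hypothetical policy-network outputs
    q_table = [0.0, 1.5, -0.5, 1.0]     # hypothetical Q value per action
    sampled = sample_actions(scores, num_samples=8)
    value = soft_state_value([q_table[a] for a in sampled], temperature=1.0)
    print(sampled, value)
```

With this rule the value always lies between the mean and the maximum of the sampled Q values, and it approaches the maximum as the temperature shrinks.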

NON-OCCLUDING VIDEO OVERLAYS

Publication number: 20220417586

Abstract: Methods, systems, and computer media provide for identifying exclusion zones in frames of a video, aggregating those exclusion zones for a specified duration or number of frames, defining an inclusion zone within which overlaid content is eligible for inclusion, and providing overlaid content for inclusion in the inclusion zone. The exclusion zones can include regions in which significant features are detected such as text, human features, objects from a selected set of object categories, or moving objects.

Type: Application

Filed: July 29, 2020

Publication date: December 29, 2022

Inventors: Elena Erbiceanu-Tener, Alexandre Galashov, Andy Chiu, Nathan Wiegand
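
The aggregation described in the abstract above can be pictured with a minimal grid-based sketch: exclusion boxes from several frames are merged into one set of blocked cells, and an overlay is placed at the first position that touches none of them. The coarse cell grid, the half-open box format, the scan order, and all names are assumptions for illustration; the abstract does not specify how zones are represented or how a position is chosen.

```python
from typing import Iterable, List, Optional, Set, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in grid cells, half-open
Cell = Tuple[int, int]


def aggregate_exclusions(frames: Iterable[List[Box]]) -> Set[Cell]:
    """Union the exclusion boxes of every frame into one set of blocked cells."""
    blocked: Set[Cell] = set()
    for boxes in frames:
        for x0, y0, x1, y1 in boxes:
            for y in range(y0, y1):
                for x in range(x0, x1):
                    blocked.add((x, y))
    return blocked


def find_overlay_position(
    blocked: Set[Cell], grid_w: int, grid_h: int, ov_w: int, ov_h: int
) -> Optional[Cell]:
    """Return the first top-left cell (row-major scan) where the overlay fits."""
    for y in range(grid_h - ov_h + 1):
        for x in range(grid_w - ov_w + 1):
            cells = (
                (cx, cy)
                for cy in range(y, y + ov_h)
                for cx in range(x, x + ov_w)
            )
            if not any(cell in blocked for cell in cells):
                return (x, y)
    return None  # no inclusion zone large enough for this overlay


if __name__ == "__main__":
    # Two frames on a 4x3 grid, each with one detected exclusion box.
    frames = [[(0, 0, 2, 2)], [(1, 0, 3, 1)]]
    blocked = aggregate_exclusions(frames)
    print(find_overlay_position(blocked, grid_w=4, grid_h=3, ov_w=2, ov_h=1))
```

Aggregating over a window of frames, rather than checking one frame at a time, keeps the overlay clear of features that move during the window.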

LEARNING MOTOR PRIMITIVES AND TRAINING A MACHINE LEARNING SYSTEM USING A LINEAR-FEEDBACK-STABILIZED POLICY

Publication number: 20220374686

Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also a neural network system for representing a space of probabilistic motor primitives, implemented by one or more computers. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.

Type: Application

Filed: July 25, 2022

Publication date: November 24, 2022

Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov

Learning motor primitives and training a machine learning system using a linear-feedback-stabilized policy

Patent number: 11403513

Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also a neural network system for representing a space of probabilistic motor primitives, implemented by one or more computers. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.

Type: Grant

Filed: September 27, 2019

Date of Patent: August 2, 2022

Assignee: DeepMind Technologies Limited

Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov

LEARNING MOTOR PRIMITIVES AND TRAINING A MACHINE LEARNING SYSTEM USING A LINEAR-FEEDBACK-STABILIZED POLICY

Publication number: 20200104685

Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also a neural network system for representing a space of probabilistic motor primitives, implemented by one or more computers. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.

Type: Application

Filed: September 27, 2019

Publication date: April 2, 2020

Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov
