Patents by Inventor Sergey Vladimir Levine
Sergey Vladimir Levine has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240078429
Abstract: A method includes: receiving data identifying, for each of one or more objects, a respective target location to which a robotic agent interacting with a real-world environment should move the object; causing the robotic agent to move the one or more objects to the one or more target locations by repeatedly performing the following: receiving a current image of a current state of the real-world environment; determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current image and an action to be performed by the robotic agent; and directing the robotic agent to perform the next sequence of actions.
Type: Application
Filed: November 13, 2023
Publication date: March 7, 2024
Inventors: Chelsea Breanna Finn, Sergey Vladimir Levine
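The control loop described in this abstract can be sketched as model-predictive control with a learned image predictor: sample candidate action sequences, roll each through the prediction network, and execute the sequence whose predicted final image is closest to the goal. This is a minimal illustration, not the patented implementation; `predict_next_image` is a toy stand-in for the trained network, and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_next_image(image, action):
    # Stand-in for the learned next-image prediction network: here a toy
    # linear dynamics on a flattened "image".
    return image + 0.1 * action

def plan_actions(current_image, goal_image, horizon=5, n_candidates=64):
    """Random-shooting planner: sample candidate action sequences, roll each
    through the prediction model, and keep the sequence whose final predicted
    image is closest to the goal."""
    best_seq, best_cost = None, np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1.0, 1.0, size=(horizon,) + current_image.shape)
        img = current_image
        for action in seq:
            img = predict_next_image(img, action)
        cost = float(np.linalg.norm(img - goal_image))
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost
```

In use, the robot would execute the first actions of the returned sequence, observe a new image, and re-plan, matching the "repeatedly performing" loop in the claim.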
-
Patent number: 11853876
Abstract: A method includes: receiving data identifying, for each of one or more objects, a respective target location to which a robotic agent interacting with a real-world environment should move the object; causing the robotic agent to move the one or more objects to the one or more target locations by repeatedly performing the following: receiving a current image of a current state of the real-world environment; determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current image and an action to be performed by the robotic agent; and directing the robotic agent to perform the next sequence of actions.
Type: Grant
Filed: September 15, 2017
Date of Patent: December 26, 2023
Assignee: Google LLC
Inventors: Chelsea Breanna Finn, Sergey Vladimir Levine
-
Publication number: 20230239499
Abstract: One aspect provides a machine-learned video prediction model configured to receive and process one or more previous video frames to generate one or more predicted subsequent video frames, wherein the machine-learned video prediction model comprises a convolutional variational autoencoder, and wherein the convolutional variational autoencoder comprises an encoder portion comprising one or more encoding cells and a decoder portion comprising one or more decoding cells.
Type: Application
Filed: May 27, 2022
Publication date: July 27, 2023
Inventors: Mohammad Babaeizadeh, Chelsea Breanna Finn, Dumitru Erhan, Mohammad Taghi Saffar, Sergey Vladimir Levine, Suraj Nair
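Two pieces that any variational autoencoder of this kind needs, whatever its convolutional cells look like, are reparameterized latent sampling and the closed-form KL regularizer. The sketch below shows both in isolation; the function names and shapes are illustrative, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    # Reparameterization trick: sample z = mu + sigma * eps so the
    # stochastic latent stays differentiable with respect to mu and sigma.
    eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ): the regularizer a
    # variational autoencoder adds to its reconstruction loss.
    return 0.5 * float(np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var))
```

When the encoder outputs `mu = 0` and `log_var = 0`, the latent distribution matches the standard-normal prior exactly and the KL term vanishes.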
-
Publication number: 20230095351
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a robotic control policy to perform a particular task. One of the methods includes performing a meta reinforcement learning phase including using training data collected for a plurality of different robotic control tasks and updating a robotic control policy according to the training data, wherein the robotic control policy is conditioned on an encoder network that is trained to predict which task is being performed from a context of a robotic operating environment; and performing an adaptation phase using a plurality of demonstrations for the particular task, including iteratively updating the encoder network after processing each demonstration of the plurality of demonstrations, thereby training the encoder network to learn environmental features of successful task runs.
Type: Application
Filed: September 15, 2022
Publication date: March 30, 2023
Inventors: Jianlan Luo, Stefan Schaal, Sergey Vladimir Levine, Zihao Zhao
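The adaptation phase above can be caricatured with a nearest-embedding task encoder that is nudged toward each demonstration's context in turn. This is a deliberately simplified sketch, assuming context features are fixed-length vectors; the update rule and names are hypothetical, not the patented method.

```python
import numpy as np

def adapt_encoder(task_embeddings, demonstrations, lr=0.5):
    """Adaptation-phase sketch: after processing each demonstration, nudge
    that task's embedding toward the demonstration's context features, so
    the encoder absorbs what successful runs of the task look like."""
    emb = dict(task_embeddings)
    for task_id, context in demonstrations:
        emb[task_id] = (1 - lr) * emb[task_id] + lr * context
    return emb

def predict_task(embeddings, context):
    # Encoder inference: which known task best matches this context?
    return min(embeddings, key=lambda t: float(np.linalg.norm(embeddings[t] - context)))
```

After adaptation, contexts drawn from the demonstrated task land nearer its embedding, which is the point of conditioning the control policy on the encoder.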
-
Publication number: 20230074958
Abstract: Optical tactile sensors are provided that include a scaffolding structure, a transparent elastomer material covering at least an end portion of the scaffolding structure, and one or multiple cameras situated on the end portion of the scaffolding structure and embedded within the transparent elastomer material, wherein the one or multiple cameras are situated so as to provide an extended, e.g., up to 360°, field of view about the end portion of the scaffolding structure.
Type: Application
Filed: April 19, 2022
Publication date: March 9, 2023
Inventors: Frederik David Ebert, Akhil Amar Padmanabha, Stephen Tian, Roberto Calandra, Sergey Vladimir Levine
-
Publication number: 20220391687
Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for generating and searching reinforcement learning algorithms. In some implementations, a computer-implemented system generates a sequence of candidate reinforcement learning algorithms. Each candidate reinforcement learning algorithm in the sequence is configured to receive an input environment state characterizing a state of an environment and to generate an output that specifies an action to be performed by an agent interacting with the environment. For each candidate reinforcement learning algorithm in the sequence, the system performs a performance evaluation over a plurality of training environments. For each training environment, the system adjusts a set of environment-specific parameters of the candidate reinforcement learning algorithm by performing training of the candidate reinforcement learning algorithm to control a corresponding agent in the training environment.
Type: Application
Filed: June 3, 2021
Publication date: December 8, 2022
Inventors: John Dalton Co-Reyes, Yingjie Miao, Daiyi Peng, Sergey Vladimir Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust
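The evaluation loop described here, adapting each candidate's environment-specific parameters and scoring it across all training environments, can be sketched with a toy "algorithm" that is just a learning rate and a scalar environment parameter. Everything below is an illustrative reduction under those assumptions, not the patented search procedure.

```python
def train_in_env(lr, env_target, steps=10):
    # Adapt the environment-specific parameter theta by following a toy
    # reward gradient toward this environment's target; score is the
    # negative final error (higher is better).
    theta = 0.0
    for _ in range(steps):
        theta += lr * (env_target - theta)
    return -abs(env_target - theta)

def search_algorithms(candidates, environments):
    """Evaluate each candidate 'algorithm' (here, a learning rate) across
    all training environments and return the one with the best mean
    adapted score, mirroring the performance-evaluation loop."""
    def mean_score(c):
        return sum(train_in_env(c, e) for e in environments) / len(environments)
    return max(candidates, key=mean_score)
```

A candidate that adapts quickly in every environment wins the search, which is the selection pressure the abstract describes.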
-
Patent number: 11477243
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for off-policy evaluation of a control policy. One of the methods includes obtaining policy data specifying a control policy for controlling a source agent interacting with a source environment to perform a particular task; obtaining a validation data set generated from interactions of a target agent in a target environment; determining a performance estimate that represents an estimate of a performance of the control policy in controlling the target agent to perform the particular task in the target environment; and determining, based on the performance estimate, whether to deploy the control policy for controlling the target agent to perform the particular task in the target environment.
Type: Grant
Filed: March 23, 2020
Date of Patent: October 18, 2022
Assignee: Google LLC
Inventors: Kanury Kanishka Rao, Konstantinos Bousmalis, Christopher K. Harris, Alexander Irpan, Sergey Vladimir Levine, Julian Ibarz
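A common way to produce the kind of performance estimate this patent describes, without claiming it is the patented estimator, is importance sampling over logged trajectories: reweight returns collected under a behavior policy by the target policy's likelihood ratio, then gate deployment on the estimate. Names and data shapes below are assumptions.

```python
import numpy as np

def off_policy_estimate(trajectories, target_policy, behavior_policy):
    """Importance-sampled performance estimate: reweight each logged
    trajectory's return by the product of per-step likelihood ratios
    target(a|s) / behavior(a|s)."""
    estimates = []
    for states, actions, rewards in trajectories:
        ratio = 1.0
        for s, a in zip(states, actions):
            ratio *= target_policy(a, s) / behavior_policy(a, s)
        estimates.append(ratio * sum(rewards))
    return float(np.mean(estimates))

def should_deploy(estimate, threshold):
    # Deployment decision from the abstract: deploy the control policy
    # only if its estimated performance clears the required threshold.
    return estimate >= threshold
```

With a uniform behavior policy over two actions and a target policy that always picks action 0, only trajectories that took action 0 contribute, with their returns doubled to compensate.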
-
Publication number: 20220284266
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.
Type: Application
Filed: March 25, 2022
Publication date: September 8, 2022
Inventors: Shixiang Gu, Timothy Paul Lillicrap, Ilya Sutskever, Sergey Vladimir Levine
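The decomposition in this abstract, a value subnetwork, a policy subnetwork producing an ideal point, and a Q value assembled as value plus advantage, can be sketched with scalar stand-ins. The quadratic advantage below is one hypothetical choice that makes the ideal point the Q-maximizing action; the subnetworks themselves are toy functions, not trained models.

```python
import numpy as np

def value(obs):
    # Value subnetwork stand-in: V(s).
    return float(np.sum(obs))

def ideal_point(obs):
    # Policy subnetwork stand-in: the point in the continuous action
    # space intended to maximize Q in this state.
    return 0.5 * obs

def advantage(obs, action):
    # Quadratic advantage (an illustrative choice): zero at the ideal
    # point and negative elsewhere, so Q peaks exactly at ideal_point(obs).
    diff = action - ideal_point(obs)
    return -0.5 * float(diff @ diff)

def q_value(obs, action):
    # Q(s, a) = V(s) + A(s, a), as in the abstract's subsystem.
    return value(obs) + advantage(obs, action)
```

Because the advantage vanishes at the ideal point, `q_value(obs, ideal_point(obs))` equals the value estimate, and any other action scores strictly lower.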
-
Patent number: 11341364
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network that is used to control a robotic agent interacting with a real-world environment.
Type: Grant
Filed: September 20, 2018
Date of Patent: May 24, 2022
Assignee: Google LLC
Inventors: Konstantinos Bousmalis, Alexander Irpan, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Julian Ibarz, Sergey Vladimir Levine, Kurt Konolige, Vincent O. Vanhoucke, Matthew Laurance Kelcey
-
Patent number: 11288568
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.
Type: Grant
Filed: February 9, 2017
Date of Patent: March 29, 2022
Assignee: Google LLC
Inventors: Shixiang Gu, Timothy Paul Lillicrap, Ilya Sutskever, Sergey Vladimir Levine
-
Patent number: 11188821
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a global policy neural network. One of the methods includes initializing an instance of the robotic task for multiple local workers, generating a trajectory of state-action pairs by selecting actions to be performed by the robotic agent while performing the instance of the robotic task, optimizing a local policy controller on the trajectory, generating an optimized trajectory using the optimized local controller, and storing the optimized trajectory in a replay memory associated with the local worker. The method includes sampling, for multiple global workers, an optimized trajectory from one of one or more replay memories associated with the global worker, and training the replica of the global policy neural network maintained by the global worker on the sampled optimized trajectory to determine delta values for the parameters of the global policy neural network.
Type: Grant
Filed: September 15, 2017
Date of Patent: November 30, 2021
Assignee: X Development LLC
Inventors: Mrinal Kalakrishnan, Ali Hamid Yahya Valdovinos, Adrian Ling Hin Li, Yevgen Chebotar, Sergey Vladimir Levine
-
Patent number: 11010948
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for navigation using visual inputs. One of the systems includes a mapping subsystem configured to, at each time step of a plurality of time steps, generate a characterization of an environment from an image of the environment at the time step, wherein the characterization comprises an environment map identifying locations in the environment having a particular characteristic, and wherein generating the characterization comprises, for each time step: obtaining the image of the environment at the time step, processing the image to generate a first initial characterization for the time step, obtaining a final characterization for a previous time step, processing the characterization for the previous time step to generate a second initial characterization for the time step, and combining the first initial characterization and the second initial characterization to generate a final characterization for the time step.
Type: Grant
Filed: February 9, 2018
Date of Patent: May 18, 2021
Assignee: Google LLC
Inventors: Rahul Sukthankar, Saurabh Gupta, James Christopher Davidson, Sergey Vladimir Levine, Jitendra Malik
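One mapping-subsystem step from this abstract, fresh observation in, warped previous map in, fused map out, can be sketched with toy stand-ins for the learned components. The thresholding, decay, and element-wise-max fusion below are illustrative choices, not the patented networks.

```python
import numpy as np

def process_image(image):
    # Stand-in for the learned image-to-map network: threshold the image
    # into an occupancy-style map (the "first initial characterization").
    return (image > 0.5).astype(float)

def warp_previous(prev_final):
    # Stand-in for transforming the previous final characterization into
    # the current frame (the "second initial characterization"), with a
    # slight confidence decay.
    return 0.9 * prev_final

def update_map(prev_final, image):
    """One time step of the mapping subsystem: combine the fresh
    observation with the carried-over map, here by element-wise max."""
    return np.maximum(process_image(image), warp_previous(prev_final))
```

A cell seen once stays in the map with decaying confidence even when later images miss it, which is what carrying the previous characterization forward buys.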
-
Patent number: 10960539
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a global policy neural network. One of the methods includes initializing a plurality of instances of the robotic task. For each instance of the robotic task, the method includes generating a trajectory of state-action pairs by selecting actions to be performed by the robotic agent while performing the instance of the robotic task in accordance with current values of the parameters of the global policy neural network, and optimizing a local policy controller that is specific to the instance on the trajectory of state-action pairs for the instance. The method further includes generating training data for the global policy neural network using the local policy controllers, and training the global policy neural network on the training data to adjust the current values of the parameters of the global policy neural network.
Type: Grant
Filed: September 15, 2017
Date of Patent: March 30, 2021
Assignee: X Development LLC
Inventors: Mrinal Kalakrishnan, Ali Hamid Yahya Valdovinos, Adrian Ling Hin Li, Yevgen Chebotar, Sergey Vladimir Levine
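The alternation this abstract describes, per-instance local controllers optimized from the global policy's behavior, then the global policy retrained on their outputs, can be reduced to scalars. In this deliberately tiny sketch the "global policy" is a single action parameter and "supervised training" is a mean; every name is illustrative.

```python
import numpy as np

def optimize_local_controller(global_action, instance_target, steps=25, lr=0.2):
    """Per-instance local optimization: start from the global policy's
    action and hill-climb toward the instance's own objective."""
    action = global_action
    for _ in range(steps):
        action += lr * (instance_target - action)
    return action

def guided_policy_search(instance_targets, iterations=5):
    # Global policy stand-in: one parameter, retrained each iteration by
    # regression onto the local controllers' optimized actions.
    global_action = 0.0
    for _ in range(iterations):
        # 1) Optimize a local controller for each task instance.
        training_data = [optimize_local_controller(global_action, t)
                         for t in instance_targets]
        # 2) Train the global policy on the local controllers' outputs.
        global_action = float(np.mean(training_data))
    return global_action
```

With two instances whose targets are 1.0 and 3.0, the global policy settles near their mean, the single action that best imitates both local controllers.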
-
Publication number: 20200304545
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for off-policy evaluation of a control policy. One of the methods includes obtaining policy data specifying a control policy for controlling a source agent interacting with a source environment to perform a particular task; obtaining a validation data set generated from interactions of a target agent in a target environment; determining a performance estimate that represents an estimate of a performance of the control policy in controlling the target agent to perform the particular task in the target environment; and determining, based on the performance estimate, whether to deploy the control policy for controlling the target agent to perform the particular task in the target environment.
Type: Application
Filed: March 23, 2020
Publication date: September 24, 2020
Inventors: Kanury Kanishka Rao, Konstantinos Bousmalis, Christopher K. Harris, Alexander Irpan, Sergey Vladimir Levine, Julian Ibarz
-
Publication number: 20200279134
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network that is used to control a robotic agent interacting with a real-world environment.
Type: Application
Filed: September 20, 2018
Publication date: September 3, 2020
Inventors: Konstantinos Bousmalis, Alexander Irpan, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Julian Ibarz, Sergey Vladimir Levine, Kurt Konolige, Vincent O. Vanhoucke, Matthew Laurance Kelcey
-
Patent number: 10635944
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an object representation neural network. One of the methods includes obtaining training sets of images, each training set comprising: (i) a before image of a before scene of the environment, (ii) an after image of an after scene of the environment after the robot has removed a particular object, and (iii) an object image of the particular object, and training the object representation neural network on the batch of training data, comprising determining an update to the object representation parameters that encourages the vector embedding of the particular object in each training set to be closer to a difference between (i) the vector embedding of the after scene in the training set and (ii) the vector embedding of the before scene in the training set.
Type: Grant
Filed: June 17, 2019
Date of Patent: April 28, 2020
Assignee: Google LLC
Inventors: Eric Victor Jang, Sergey Vladimir Levine, Coline Manon Devin
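The training objective above pushes an object's embedding toward the difference of the two scene embeddings. The sketch below makes that concrete with an additive toy encoder, reading the scene difference as before-minus-after, since the before scene still contains the object the after scene lacks; that orientation, like all the names here, is an illustrative assumption rather than a quote from the claims.

```python
import numpy as np

def embed_scene(objects):
    # Stand-in scene encoder: additive composition of per-object features
    # (a real model would embed the raw scene image).
    return np.sum(objects, axis=0)

def embedding_loss(before, after, obj_feature):
    """Training-objective sketch: distance between the object's embedding
    and the difference of the scene embeddings; driving this to zero makes
    the object embedding explain what the robot removed."""
    target = embed_scene(before) - embed_scene(after)
    return float(np.linalg.norm(obj_feature - target))
```

With the additive encoder, removing an object subtracts exactly its feature vector from the scene embedding, so the loss is zero when the object embedding matches.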
-
Publication number: 20190385022
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an object representation neural network. One of the methods includes obtaining training sets of images, each training set comprising: (i) a before image of a before scene of the environment, (ii) an after image of an after scene of the environment after the robot has removed a particular object, and (iii) an object image of the particular object, and training the object representation neural network on the batch of training data, comprising determining an update to the object representation parameters that encourages the vector embedding of the particular object in each training set to be closer to a difference between (i) the vector embedding of the after scene in the training set and (ii) the vector embedding of the before scene in the training set.
Type: Application
Filed: June 17, 2019
Publication date: December 19, 2019
Inventors: Eric Victor Jang, Sergey Vladimir Levine, Coline Manon Devin
-
Publication number: 20190371025
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for navigation using visual inputs. One of the systems includes a mapping subsystem configured to, at each time step of a plurality of time steps, generate a characterization of an environment from an image of the environment at the time step, wherein the characterization comprises an environment map identifying locations in the environment having a particular characteristic, and wherein generating the characterization comprises, for each time step: obtaining the image of the environment at the time step, processing the image to generate a first initial characterization for the time step, obtaining a final characterization for a previous time step, processing the characterization for the previous time step to generate a second initial characterization for the time step, and combining the first initial characterization and the second initial characterization to generate a final characterization for the time step.
Type: Application
Filed: February 9, 2018
Publication date: December 5, 2019
Inventors: Rahul Sukthankar, Saurabh Gupta, James Christopher Davidson, Sergey Vladimir Levine, Jitendra Malik
-
Publication number: 20190251437
Abstract: A method includes: receiving data identifying, for each of one or more objects, a respective target location to which a robotic agent interacting with a real-world environment should move the object; causing the robotic agent to move the one or more objects to the one or more target locations by repeatedly performing the following: receiving a current image of a current state of the real-world environment; determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current image and an action to be performed by the robotic agent; and directing the robotic agent to perform the next sequence of actions.
Type: Application
Filed: September 15, 2017
Publication date: August 15, 2019
Inventors: Chelsea Breanna Finn, Sergey Vladimir Levine
-
Publication number: 20170228662
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.
Type: Application
Filed: February 9, 2017
Publication date: August 10, 2017
Inventors: Shixiang Gu, Timothy Paul Lillicrap, Ilya Sutskever, Sergey Vladimir Levine