Patents by Inventor Sergey Levine
Sergey Levine has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12611768
Abstract: Training and/or using a recurrent neural network model for visual servoing of an end effector of a robot. In visual servoing, the model can be utilized to generate, at each of a plurality of time steps, an action prediction that represents a prediction of how the end effector should be moved to cause the end effector to move toward a target object. The model can be viewpoint invariant in that it can be utilized across a variety of robots having vision components at a variety of viewpoints and/or can be utilized for a single robot even when a viewpoint, of a vision component of the robot, is drastically altered. Moreover, the model can be trained based on a large quantity of simulated data that is based on simulator(s) performing simulated episode(s) in view of the model. One or more portions of the model can be further trained based on a relatively smaller quantity of real training data.
Type: Grant
Filed: July 17, 2023
Date of Patent: April 28, 2026
Assignee: GOOGLE LLC
Inventors: Alexander Toshev, Fereshteh Sadeghi, Sergey Levine
-
Publication number: 20260077490
Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
Type: Application
Filed: November 21, 2025
Publication date: March 19, 2026
Inventors: Honglak Lee, Shixiang Gu, Sergey Levine
-
Publication number: 20260014708
Abstract: A method may include receiving an image of a robot, receiving a language instruction of a task to be performed by the robot, generating a plurality of image sequences of the robot performing the task based on the received image of the robot and the language instruction, selecting a first image sequence among the plurality of image sequences having a highest probability of performing the task, and determining a plurality of actions to be performed by a second robot to perform the task based on the first image sequence.
Type: Application
Filed: May 13, 2025
Publication date: January 15, 2026
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, The Trustees of Princeton University, The Regents of the University of California
Inventors: Kyle Hatch, Ashwin Balakrishna, Suraj Nair, Blake Wulfe, Mikhal Itkina, Thomas Kollar, Benjamin Burchfiel, Benjamin Eysenbach, Oier Mees, Seohong Park, Sergey Levine
-
Patent number: 12498677
Abstract: Implementations disclosed herein relate to mitigating the reality gap through training a simulation-to-real machine learning model (“Sim2Real” model) using a vision-based robot task machine learning model. The vision-based robot task machine learning model can be, for example, a reinforcement learning (“RL”) neural network model (RL-network), such as an RL-network that represents a Q-function.
Type: Grant
Filed: May 15, 2020
Date of Patent: December 16, 2025
Assignee: GOOGLE LLC
Inventors: Kanishka Rao, Chris Harris, Julian Ibarz, Alexander Irpan, Seyed Mohammad Khansari Zadeh, Sergey Levine
-
Patent number: 12479093
Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
Type: Grant
Filed: May 24, 2024
Date of Patent: November 25, 2025
Assignee: GOOGLE LLC
Inventors: Honglak Lee, Shixiang Gu, Sergey Levine
-
Publication number: 20250153363
Abstract: Implementations described herein relate to training and refining robotic control policies using imitation learning techniques. A robotic control policy can be initially trained based on human demonstrations of various robotic tasks. Further, the robotic control policy can be refined based on human interventions while a robot is performing a robotic task. In some implementations, the robotic control policy may determine whether the robot will fail in performance of the robotic task, and prompt a human to intervene in performance of the robotic task. In additional or alternative implementations, a representation of the sequence of actions can be visually rendered for presentation to the human so that the human can proactively intervene in performance of the robotic task.
Type: Application
Filed: January 16, 2025
Publication date: May 15, 2025
Inventors: Seyed Mohammad Khansari Zadeh, Eric Jang, Daniel Lam, Daniel Kappler, Matthew Bennice, Brent Austin, Yunfei Bai, Sergey Levine, Alexander Irpan, Nicolas Sievers, Chelsea Finn
-
Publication number: 20250153352
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
Type: Application
Filed: January 16, 2025
Publication date: May 15, 2025
Inventors: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
-
Patent number: 12240113
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
Type: Grant
Filed: December 1, 2023
Date of Patent: March 4, 2025
Assignee: GOOGLE LLC
Inventors: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
-
Patent number: 12226920
Abstract: Implementations described herein relate to training and refining robotic control policies using imitation learning techniques. A robotic control policy can be initially trained based on human demonstrations of various robotic tasks. Further, the robotic control policy can be refined based on human interventions while a robot is performing a robotic task. In some implementations, the robotic control policy may determine whether the robot will fail in performance of the robotic task, and prompt a human to intervene in performance of the robotic task. In additional or alternative implementations, a representation of the sequence of actions can be visually rendered for presentation to the human so that the human can proactively intervene in performance of the robotic task.
Type: Grant
Filed: August 11, 2023
Date of Patent: February 18, 2025
Assignee: GOOGLE LLC
Inventors: Seyed Mohammad Khansari Zadeh, Eric Jang, Daniel Lam, Daniel Kappler, Matthew Bennice, Brent Austin, Yunfei Bai, Sergey Levine, Alexander Irpan, Nicolas Sievers, Chelsea Finn
-
Patent number: 12103564
Abstract: A method of generating an output trajectory of an ego vehicle includes recording trajectory data of the ego vehicle and pedestrian agents from a scene of a training environment of the ego vehicle. The method includes identifying at least one pedestrian agent from the pedestrian agents within the scene of the training environment of the ego vehicle causing a prediction-discrepancy by the ego vehicle greater than the pedestrian agents within the scene. The method includes updating parameters of a motion prediction model of the ego vehicle based on a magnitude of the prediction-discrepancy caused by the at least one pedestrian agent on the ego vehicle to form a trained, control-aware prediction objective model. The method includes selecting a vehicle control action of the ego vehicle in response to a predicted motion from the trained, control-aware prediction objective model regarding detected pedestrian agents within a traffic environment of the ego vehicle.
Type: Grant
Filed: January 6, 2022
Date of Patent: October 1, 2024
Assignees: TOYOTA RESEARCH INSTITUTE, INC., THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Inventors: Rowan Thomas McAllister, Blake Warren Wulfe, Jean Mercat, Logan Michael Ellis, Sergey Levine, Adrien David Gaidon
-
Publication number: 20240308068
Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
Type: Application
Filed: May 24, 2024
Publication date: September 19, 2024
Inventors: Honglak Lee, Shixiang Gu, Sergey Levine
-
Patent number: 12083678
Abstract: Techniques are disclosed that enable training a meta-learning model, for use in causing a robot to perform a task, using imitation learning as well as reinforcement learning. Some implementations relate to training the meta-learning model using imitation learning based on one or more human guided demonstrations of the task. Additional or alternative implementations relate to training the meta-learning model using reinforcement learning based on trials of the robot attempting to perform the task. Further implementations relate to using the trained meta-learning model to few shot (or one shot) learn a new task based on a human guided demonstration of the new task.
Type: Grant
Filed: January 23, 2020
Date of Patent: September 10, 2024
Assignee: GOOGLE LLC
Inventors: Mrinal Kalakrishnan, Yunfei Bai, Paul Wohlhart, Eric Jang, Chelsea Finn, Seyed Mohammad Khansari Zadeh, Sergey Levine, Allan Zhou, Alexander Herzog, Daniel Kappler
-
Patent number: 11992944
Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
Type: Grant
Filed: May 17, 2019
Date of Patent: May 28, 2024
Assignee: GOOGLE LLC
Inventors: Honglak Lee, Shixiang Gu, Sergey Levine
-
Patent number: 11992945
Abstract: Techniques are disclosed that enable training a plurality of policy networks, each policy network corresponding to a disparate robotic training task, using a mobile robot in a real world workspace. Various implementations include selecting a training task based on comparing a pose of the mobile robot to at least one parameter of a real world training workspace. For example, the training task can be selected based on the position of a landmark, within the workspace, relative to the pose. For instance, the training task can be selected such that the selected training task moves the mobile robot towards the landmark.
Type: Grant
Filed: November 10, 2020
Date of Patent: May 28, 2024
Assignee: GOOGLE LLC
Inventors: Jie Tan, Sehoon Ha, Peng Xu, Sergey Levine, Zhenyu Tan
-
Publication number: 20240131695
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
Type: Application
Filed: December 1, 2023
Publication date: April 25, 2024
Inventors: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
-
Publication number: 20240118667
Abstract: Implementations disclosed herein relate to mitigating the reality gap through training a simulation-to-real machine learning model (“Sim2Real” model) using a vision-based robot task machine learning model. The vision-based robot task machine learning model can be, for example, a reinforcement learning (“RL”) neural network model (RL-network), such as an RL-network that represents a Q-function.
Type: Application
Filed: May 15, 2020
Publication date: April 11, 2024
Inventors: Kanishka Rao, Chris Harris, Julian Ibarz, Alexander Irpan, Seyed Mohammad Khansari Zadeh, Sergey Levine
-
Patent number: 11897133
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
Type: Grant
Filed: August 1, 2022
Date of Patent: February 13, 2024
Assignee: GOOGLE LLC
Inventors: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
-
Publication number: 20240017405
Abstract: Training and/or using a recurrent neural network model for visual servoing of an end effector of a robot. In visual servoing, the model can be utilized to generate, at each of a plurality of time steps, an action prediction that represents a prediction of how the end effector should be moved to cause the end effector to move toward a target object. The model can be viewpoint invariant in that it can be utilized across a variety of robots having vision components at a variety of viewpoints and/or can be utilized for a single robot even when a viewpoint, of a vision component of the robot, is drastically altered. Moreover, the model can be trained based on a large quantity of simulated data that is based on simulator(s) performing simulated episode(s) in view of the model. One or more portions of the model can be further trained based on a relatively smaller quantity of real training data.
Type: Application
Filed: July 17, 2023
Publication date: January 18, 2024
Inventors: Alexander Toshev, Fereshteh Sadeghi, Sergey Levine
-
Patent number: 11845183
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
Type: Grant
Filed: August 1, 2022
Date of Patent: December 19, 2023
Assignee: GOOGLE LLC
Inventors: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
-
Publication number: 20230381970
Abstract: Implementations described herein relate to training and refining robotic control policies using imitation learning techniques. A robotic control policy can be initially trained based on human demonstrations of various robotic tasks. Further, the robotic control policy can be refined based on human interventions while a robot is performing a robotic task. In some implementations, the robotic control policy may determine whether the robot will fail in performance of the robotic task, and prompt a human to intervene in performance of the robotic task. In additional or alternative implementations, a representation of the sequence of actions can be visually rendered for presentation to the human so that the human can proactively intervene in performance of the robotic task.
Type: Application
Filed: August 11, 2023
Publication date: November 30, 2023
Inventors: Seyed Mohammad Khansari Zadeh, Eric Jang, Daniel Lam, Daniel Kappler, Matthew Bennice, Brent Austin, Yunfei Bai, Sergey Levine, Alexander Irpan, Nicolas Sievers, Chelsea Finn