Patents by Inventor Alexander Herzog

Alexander Herzog has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11833661
    Abstract: Utilization of past dynamics sample(s) that reflect past contact physics information, in training and/or utilizing a neural network model. The neural network model represents a learned value function (e.g., a Q-value function) that, when trained, can be used in selecting a sequence of robotic actions to implement in robotic manipulation (e.g., pushing) of an object by a robot. In various implementations, a past dynamics sample for an episode of robotic manipulation can include at least two past images from the episode, as well as one or more past force sensor readings that temporally correspond to the past images from the episode.
    Type: Grant
    Filed: October 31, 2021
    Date of Patent: December 5, 2023
    Assignee: GOOGLE LLC
    Inventors: Zhuo Xu, Wenhao Yu, Alexander Herzog, Wenlong Lu, Chuyuan Fu, Yunfei Bai, C. Karen Liu, Daniel Ho
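The past-dynamics idea above can be sketched in code. This is a toy illustration with invented names and shapes, not the patented implementation: a sample bundles at least two past images with temporally matching force readings, and the constructor enforces those two constraints from the abstract.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PastDynamicsSample:
    """Bundles past contact-physics context for one manipulation episode."""
    past_images: List[list]   # at least two past camera frames
    past_forces: List[float]  # force readings aligned with the frames

    def __post_init__(self):
        if len(self.past_images) < 2:
            raise ValueError("a sample needs at least two past images")
        if len(self.past_forces) != len(self.past_images):
            raise ValueError("force readings must temporally match the images")

# toy usage: two 2x2 grayscale frames plus matching force readings
sample = PastDynamicsSample(
    past_images=[[[0, 1], [1, 0]], [[1, 1], [0, 0]]],
    past_forces=[0.2, 0.7],
)
```

A sample like this could then be fed, alongside the current observation, to a Q-network during training or action selection.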
  • Patent number: 11685045
    Abstract: Asynchronous robotic control utilizing a trained critic network. During performance of a robotic task based on a sequence of robotic actions determined utilizing the critic network, a corresponding next robotic action of the sequence is determined while a corresponding previous robotic action of the sequence is still being implemented. Optionally, the next robotic action can be fully determined and/or can begin to be implemented before implementation of the previous robotic action is completed. In determining the next robotic action, most recently selected robotic action data is processed using the critic network, where such data conveys information about the previous robotic action that is still being implemented. Some implementations additionally or alternatively relate to determining when to implement a robotic action that is determined in an asynchronous manner.
    Type: Grant
    Filed: September 8, 2020
    Date of Patent: June 27, 2023
    Assignee: X DEVELOPMENT LLC
    Inventors: Alexander Herzog, Dmitry Kalashnikov, Julian Ibarz
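A minimal sketch of the asynchronous selection idea, with an invented scalar "critic" standing in for the trained network: the next action is chosen while the previous one is still in flight, and the critic's input includes that in-flight action, mirroring the "most recently selected robotic action data" in the abstract.

```python
def make_critic(observation, in_flight_action):
    """Toy stand-in for a trained critic network: scores candidate actions
    given the current observation plus the action still being implemented."""
    def q_value(candidate):
        # invented scoring rule, just to make the sketch runnable
        return -abs(candidate - (observation + 0.5 * in_flight_action))
    return q_value

def select_next_action(observation, in_flight_action, candidates):
    """Pick the next action before the previous one finishes executing."""
    q = make_critic(observation, in_flight_action)
    return max(candidates, key=q)

# action 0.4 is still being implemented; choose the next action now
candidates = [0.0, 0.2, 0.4, 0.6, 0.8]
next_action = select_next_action(0.35, in_flight_action=0.4, candidates=candidates)
```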
  • Patent number: 11610153
    Abstract: Utilizing at least one existing policy (e.g., a manually engineered policy) for a robotic task in generating reinforcement learning (RL) data that can be used in training an RL policy for an instance of RL of the robotic task. The existing policy can be one that, standing alone, will not generate data that is compatible with the instance of RL for the robotic task. In contrast, the generated RL data is compatible with RL for the robotic task at least by virtue of its including state data that is in the state space of the RL for the robotic task, and actions that are in the action space of the RL for the robotic task. The generated RL data can be used in at least some of the initial training for the RL policy using reinforcement learning.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: March 21, 2023
    Assignee: X DEVELOPMENT LLC
    Inventors: Alexander Herzog, Adrian Li, Mrinal Kalakrishnan, Benjamin Holson
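One way to picture generating RL-compatible data from an existing policy is to roll that policy out and log transitions in the learner's (state, action, reward, next_state) format. Everything below is invented for this sketch (a one-dimensional toy task with goal position 5), not the patented method:

```python
def engineered_policy(state):
    """Stand-in for a manually engineered policy: step toward the goal at 5."""
    return 1 if state < 5 else 0

def generate_rl_data(policy, start_state=0, max_steps=10):
    """Roll out an existing policy and record transitions in the
    (state, action, reward, next_state) format expected by the RL learner."""
    transitions = []
    state = start_state
    for _ in range(max_steps):
        action = policy(state)
        next_state = state + action
        reward = 1.0 if next_state == 5 else 0.0
        transitions.append((state, action, reward, next_state))
        state = next_state
    return transitions

rl_data = generate_rl_data(engineered_policy)
```

The logged transitions live in the RL state and action spaces, so they can seed the initial training of the RL policy.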
  • Publication number: 20220410380
    Abstract: Utilizing an initial set of offline positive-only robotic demonstration data for pre-training an actor network and a critic network for robotic control, followed by further training of the networks based on online robotic episodes that utilize the network(s). Implementations enable the actor network to be effectively pre-trained, while mitigating occurrences of and/or the extent of forgetting when further trained based on episode data. Implementations additionally or alternatively enable the actor network to be trained to a given degree of effectiveness in fewer training steps. In various implementations, one or more adaptation techniques are utilized in performing the robotic episodes and/or in performing the robotic training. The adaptation techniques can each, individually, result in one or more corresponding advantages and, when used in any combination, the corresponding advantages can accumulate.
    Type: Application
    Filed: June 17, 2022
    Publication date: December 29, 2022
    Inventors: Yao Lu, Mengyuan Yan, Seyed Mohammad Khansari Zadeh, Alexander Herzog, Eric Jang, Karol Hausman, Yevgen Chebotar, Sergey Levine, Alexander Irpan
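The two-phase schedule above (offline pre-training, then online fine-tuning) can be sketched with a scalar "parameter" and one common forgetting-mitigation technique, continuing to mix demonstration data into the fine-tuning batches. The mixing choice and all numbers here are assumptions for illustration, not necessarily the patent's adaptation techniques:

```python
import random

def train_step(params, batch):
    """Toy gradient step: nudge a scalar 'policy parameter' toward the batch mean."""
    target = sum(batch) / len(batch)
    return params + 0.5 * (target - params)

def pretrain_then_finetune(demo_data, online_data, mix_ratio=0.5, steps=20):
    random.seed(0)
    params = 0.0
    # phase 1: pre-train on offline positive-only demonstrations
    for _ in range(steps):
        params = train_step(params, random.sample(demo_data, 2))
    # phase 2: fine-tune, mixing demo and online episode data to limit forgetting
    for _ in range(steps):
        pool = demo_data if random.random() < mix_ratio else online_data
        params = train_step(params, random.sample(pool, 2))
    return params

params = pretrain_then_finetune(demo_data=[1.0, 2.0, 3.0], online_data=[4.0, 5.0, 6.0])
```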
  • Publication number: 20220245503
    Abstract: Implementations disclosed herein relate to utilizing at least one existing manually engineered policy, for a robotic task, in training an RL policy model that can be used to at least selectively replace a portion of the engineered policy. The RL policy model can be trained for replacing a portion of a robotic task and can be trained based on data from episodes of attempting performance of the robotic task, including episodes in which the portion is performed based on the engineered policy and/or other portion(s) are performed based on the engineered policy. Once trained, the RL policy model can be used, at least selectively and in lieu of utilization of the engineered policy, to perform the portion of the robotic task, while other portion(s) of the robotic task are performed utilizing the engineered policy and/or other similarly trained (but distinct) RL policy model(s).
    Type: Application
    Filed: January 29, 2021
    Publication date: August 4, 2022
    Inventors: Adrian Li, Benjamin Holson, Alexander Herzog, Mrinal Kalakrishnan
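Selectively replacing one portion of a task while keeping the rest engineered amounts to a per-phase policy dispatch. The two-phase "grasp then place" task and all function names below are invented for this sketch:

```python
def engineered_grasp(_state):
    return "engineered-grasp"

def engineered_place(_state):
    return "engineered-place"

def rl_grasp(_state):
    return "rl-grasp"

def run_task(state_sequence, use_rl_for_grasp):
    """Perform a two-portion task, optionally swapping the grasp portion
    to the trained RL policy while the place portion stays engineered."""
    actions = []
    for phase, state in state_sequence:
        if phase == "grasp":
            policy = rl_grasp if use_rl_for_grasp else engineered_grasp
        else:
            policy = engineered_place
        actions.append(policy(state))
    return actions

seq = [("grasp", 0), ("place", 1)]
```

Flipping `use_rl_for_grasp` swaps only that portion, which is the "at least selectively and in lieu of" behavior the abstract describes.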
  • Publication number: 20220134546
    Abstract: Utilization of past dynamics sample(s) that reflect past contact physics information, in training and/or utilizing a neural network model. The neural network model represents a learned value function (e.g., a Q-value function) that, when trained, can be used in selecting a sequence of robotic actions to implement in robotic manipulation (e.g., pushing) of an object by a robot. In various implementations, a past dynamics sample for an episode of robotic manipulation can include at least two past images from the episode, as well as one or more past force sensor readings that temporally correspond to the past images from the episode.
    Type: Application
    Filed: October 31, 2021
    Publication date: May 5, 2022
    Inventors: Zhuo Xu, Wenhao Yu, Alexander Herzog, Wenlong Lu, Chuyuan Fu, Yunfei Bai, C. Karen Liu, Daniel Ho
  • Publication number: 20220105624
    Abstract: Techniques are disclosed that enable training a meta-learning model, for use in causing a robot to perform a task, using imitation learning as well as reinforcement learning. Some implementations relate to training the meta-learning model using imitation learning based on one or more human guided demonstrations of the task. Additional or alternative implementations relate to training the meta-learning model using reinforcement learning based on trials of the robot attempting to perform the task. Further implementations relate to using the trained meta-learning model to few-shot (or one-shot) learn a new task based on a human guided demonstration of the new task.
    Type: Application
    Filed: January 23, 2020
    Publication date: April 7, 2022
    Inventors: Mrinal Kalakrishnan, Yunfei Bai, Paul Wohlhart, Eric Jang, Chelsea Finn, Seyed Mohammad Khansari Zadeh, Sergey Levine, Allan Zhou, Alexander Herzog, Daniel Kappler
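The meta-learning loop above can be caricatured with a scalar parameter: meta-training mixes imitation data (demos) with RL trial outcomes, and one-shot learning is a single adaptation step on one new demonstration. The quadratic loss, the Reptile-style outer update, and all numbers are invented for this sketch and are not the patented method:

```python
def adapt(meta_params, demo, lr=0.5):
    """One-shot adaptation: a single gradient step on one human demonstration.
    Invented loss: (meta_params - demo)^2, so gradient = 2 * (meta_params - demo)."""
    return meta_params - lr * 2 * (meta_params - demo)

def meta_train(demos, trials, steps=50, meta_lr=0.1):
    """Toy meta-training mixing imitation (demos) and RL trial outcomes."""
    theta = 0.0
    data = demos + trials
    for i in range(steps):
        task = data[i % len(data)]
        adapted = adapt(theta, task)           # inner-loop adaptation
        theta += meta_lr * (adapted - theta)   # Reptile-style outer update
    return theta

theta = meta_train(demos=[1.0, 1.2], trials=[0.8])
new_task_policy = adapt(theta, demo=2.0)  # one-shot learn a new task
```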
  • Publication number: 20210237266
    Abstract: Using large-scale reinforcement learning to train a policy model that can be utilized by a robot in performing a robotic task in which the robot interacts with one or more environmental objects. In various implementations, off-policy deep reinforcement learning is used to train the policy model, and the off-policy deep reinforcement learning is based on self-supervised data collection. The policy model can be a neural network model. Implementations of the reinforcement learning utilized in training the neural network model utilize a continuous-action variant of Q-learning. Through techniques disclosed herein, implementations can learn policies that generalize effectively to previously unseen objects, previously unseen environments, etc.
    Type: Application
    Filed: June 14, 2019
    Publication date: August 5, 2021
    Inventors: Dmitry Kalashnikov, Alexander Irpan, Peter Pastor Sampedro, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Sergey Levine
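The abstract's "continuous-action variant of Q-learning" needs a way to maximize Q over a continuous action space. One common way to do that, assumed here for illustration rather than taken from the patent, is the cross-entropy method (CEM); the toy Q-function below stands in for the trained neural network:

```python
import random

def q_function(state, action):
    """Toy Q-function standing in for the trained network:
    peaks when the action equals -state (invented for this sketch)."""
    return -(action + state) ** 2

def cem_argmax(state, q, iters=5, pop=50, elite=10):
    """Approximate argmax of Q over a continuous action via the
    cross-entropy method: sample, keep elites, refit the Gaussian."""
    random.seed(0)
    mu, sigma = 0.0, 1.0
    for _ in range(iters):
        samples = [random.gauss(mu, sigma) for _ in range(pop)]
        elites = sorted(samples, key=lambda a: q(state, a), reverse=True)[:elite]
        mu = sum(elites) / elite
        sigma = (sum((a - mu) ** 2 for a in elites) / elite) ** 0.5 + 1e-6
    return mu

best_action = cem_argmax(state=0.3, q=q_function)
```

For `state=0.3` the maximizing action is `-0.3`, and the CEM estimate converges close to it within a few iterations.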
  • Patent number: 4651278
    Abstract: This invention is a process for interconnecting an all points addressable page printer with a host application program that presents output to be printed. The host application can reside on a variety of computer equipment, such as a large host computer, a standalone workstation, or a workstation on a local area network, and the printer can utilize any type of printing technology, such as electrophotographic or magnetic. The printer and the application host are interconnected by a communicating means such as a channel, local area network, or telecommunication line, and any type of transmission protocol can be used. The process enables the transmission of commands and data from the host application to the printer in a manner that is independent of the communication means and transmission protocol.
    Type: Grant
    Filed: February 11, 1985
    Date of Patent: March 17, 1987
    Assignee: International Business Machines Corporation
    Inventors: Alexander Herzog, James W. Marlin, Brian G. Platte, Filip J. Yeskel
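The transport-independence idea can be sketched as encoding a command stream once and handing it to any swappable transport. The command format and names below are invented for this sketch, not the patented protocol:

```python
from dataclasses import dataclass

@dataclass
class PrinterCommand:
    """A command in a transport-independent stream (names invented here)."""
    opcode: str
    payload: bytes

def encode(commands):
    """Encode commands identically for any communication means, so the
    host application never depends on channel, LAN, or telecom details."""
    stream = b""
    for cmd in commands:
        body = cmd.opcode.encode() + b":" + cmd.payload
        stream += len(body).to_bytes(2, "big") + body
    return stream

def send(stream, transport):
    """The transport (a callable) is swappable: channel, LAN, or telecom line."""
    return transport(stream)

# the same encoded stream goes over any transport unchanged
stream = encode([PrinterCommand("DRAW", b"\x01\x02")])
```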
  • Patent number: 4571699
    Abstract: The operation of a document distribution network having one or more input work stations, a linking network with one or more nodes and one or more output work stations, is controlled by a job control sheet. The job control sheet is partitioned into a plurality of control zones. Each zone contains dedicated marked sense information for controlling the input work stations, the network nodes and the output work stations. The input work station includes a marked sense recognition device which coacts with the job control sheet to identify the presence or absence of the control zones. Marked sense information which is associated with the input station control zones is extracted and utilized to control the input work station. The marked sense information which is associated with network nodes control zone is encoded and transmitted with identifying marks to the network nodes for further processing.
    Type: Grant
    Filed: June 3, 1982
    Date of Patent: February 18, 1986
    Assignee: International Business Machines Corporation
    Inventors: Alexander Herzog, Larry L. Honomichl, Jagdish M. Nagda, Teddy A. Rehage
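The partitioned job control sheet is essentially a fixed layout of dedicated zones. A minimal sketch, with zone names and widths invented for illustration: marked-sense bits are split into per-station zones, and the node zone is forwarded with an identifying mark.

```python
def parse_control_zones(sheet_marks, zone_map):
    """Split marked-sense bits from a job control sheet into dedicated zones
    (zone names and widths are invented for this sketch)."""
    zones, pos = {}, 0
    for name, width in zone_map:
        zones[name] = sheet_marks[pos:pos + width]
        pos += width
    return zones

# a toy sheet: input-station zone, network-node zone, output-station zone
zone_map = [("input_station", 3), ("network_node", 2), ("output_station", 3)]
marks = [1, 0, 1, 1, 1, 0, 0, 1]
zones = parse_control_zones(marks, zone_map)

# the node zone is encoded and forwarded with identifying marks
node_packet = ("network_node", zones["network_node"])
```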