Patents by Inventor Soeren Pirk

Soeren Pirk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ROBOT NAVIGATION IN DEPENDENCE ON GESTURE(S) OF HUMAN(S) IN ENVIRONMENT WITH ROBOT

Publication number: 20240094736

Abstract: Training and/or utilizing a high-level neural network (NN) model, such as a sequential NN model. The high-level NN model, when trained, can be used to process a sequence of consecutive state data instances (e.g., N most recent, including a current state data instance) to generate a sequence of outputs that indicate a sequence of position deltas. The sequence of position deltas can be used to generate an intermediate target position for navigation and, optionally, an intermediate target orientation that corresponds to the intermediate target position. The intermediate target position and, optionally, the intermediate target orientation, can be provided to a low-level navigation policy, such as an MPC policy, and used by the low-level navigation policy as its goal position (and optionally goal orientation) for a plurality of iterations (e.g., until a new intermediate target position (and optionally a new target orientation) is generated using the high-level NN model).
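As a rough illustration only (not the claimed implementation), the step from a sequence of position deltas to an intermediate target could be sketched as below. The sketch assumes the deltas are additive 2D offsets accumulated from the current position, and that the target orientation is the heading of the last non-zero delta; the abstract specifies neither, and the name `intermediate_target` is hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def intermediate_target(
    current: Point, deltas: List[Point]
) -> Tuple[Point, Optional[float]]:
    """Accumulate predicted (dx, dy) deltas onto the current position.

    Assumption: deltas are additive 2D offsets. The returned orientation is
    the heading (radians) of the last non-zero delta, or None if all deltas
    are zero.
    """
    x, y = current
    heading = None
    for dx, dy in deltas:
        x += dx
        y += dy
        if dx != 0.0 or dy != 0.0:
            heading = math.atan2(dy, dx)
    return (x, y), heading


# Hypothetical deltas, standing in for the high-level model's outputs.
target, heading = intermediate_target(
    (1.0, 2.0), [(0.5, 0.0), (0.5, 0.5), (0.0, 1.0)]
)
print(target, heading)
```

In this reading, a low-level policy would keep `target` as its goal until the high-level model produces a new sequence of deltas.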

Type: Application

Filed: August 30, 2023

Publication date: March 21, 2024

Inventors: Catie Cuan, Tsang-Wei Lee, Anthony G. Francis, JR., Alexander Toshev, Soeren Pirk

Training a deep neural network model to generate rich object-centric embeddings of robotic vision data

Patent number: 11887363

Abstract: Training a machine learning model (e.g., a neural network model such as a convolutional neural network (CNN) model) so that, when trained, the model can be utilized in processing vision data (e.g., from a vision component of a robot), that captures an object, to generate a rich object-centric embedding for the vision data. The generated embedding can enable differentiation of even subtle variations of attributes of the object captured by the vision data.

Type: Grant

Filed: September 27, 2019

Date of Patent: January 30, 2024

Assignee: Google LLC

Inventors: Soeren Pirk, Yunfei Bai, Pierre Sermanet, Seyed Mohammad Khansari Zadeh, Harrison Lynch

UNSUPERVISED DEPTH PREDICTION NEURAL NETWORKS

Publication number: 20230419521

Abstract: A system for generating a depth output for an image is described. The system receives input images that depict the same scene, each input image including one or more potential objects. The system generates, for each input image, a respective background image and processes the background images to generate a camera motion output that characterizes the motion of the camera between the input images. For each potential object, the system generates a respective object motion output for the potential object based on the input images and the camera motion output. The system processes a particular input image of the input images using a depth prediction neural network (NN) to generate a depth output for the particular input image, and updates the current values of parameters of the depth prediction NN based on the particular depth output, the camera motion output, and the object motion outputs for the potential objects.
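The background-image step mentioned in this abstract could be pictured with the following minimal sketch. It assumes that each potential object comes with a per-pixel boolean mask and that masked pixels are simply set to zero; the abstract does not state how objects are identified or removed, and `background_image` is a hypothetical name.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[bool]]


def background_image(image: Image, object_masks: List[Mask]) -> Image:
    """Return a copy of `image` with every masked (object) pixel zeroed.

    Assumption: one boolean mask per potential object, same size as the image.
    """
    height, width = len(image), len(image[0])
    background = [row[:] for row in image]  # copy so the input is untouched
    for mask in object_masks:
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    background[r][c] = 0.0
    return background


# Toy 2x3 grayscale image and one hypothetical object mask.
image = [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6]]
mask = [[False, True, False],
        [False, True, True]]
bg = background_image(image, [mask])
print(bg)
```

Estimating camera motion from such background-only images keeps independently moving objects from influencing the camera motion estimate.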

Type: Application

Filed: September 13, 2023

Publication date: December 28, 2023

Inventors: Vincent Michael Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova

TRAINING INSTANCE SEGMENTATION NEURAL NETWORKS THROUGH CONTRASTIVE LEARNING

Publication number: 20230334842

Abstract: Methods, systems, and apparatus for processing inputs that include video frames using neural networks. In one aspect, a system comprises one or more computers configured to obtain a set of one or more training images and, for each training image, ground truth instance data that identifies, for each of one or more object instances, a corresponding region of the training image that depicts the object instance. For each training image in the set, the one or more computers process the training image using an instance segmentation neural network to generate an embedding output comprising a respective embedding for each of a plurality of output pixels. The one or more computers then train the instance segmentation neural network to minimize a loss function.

Type: Application

Filed: April 18, 2023

Publication date: October 19, 2023

Inventors: Alex Zihao Zhu, Vincent Michael Casser, Henrik Kretzschmar, Reza Mahjourian, Soeren Pirk

Unsupervised depth prediction neural networks

Patent number: 11783500

Abstract: A system for generating a depth output for an image is described. The system receives input images that depict the same scene, each input image including one or more potential objects. The system generates, for each input image, a respective background image and processes the background images to generate a camera motion output that characterizes the motion of the camera between the input images. For each potential object, the system generates a respective object motion output for the potential object based on the input images and the camera motion output. The system processes a particular input image of the input images using a depth prediction neural network (NN) to generate a depth output for the particular input image, and updates the current values of parameters of the depth prediction NN based on the particular depth output, the camera motion output, and the object motion outputs for the potential objects.

Type: Grant

Filed: September 5, 2019

Date of Patent: October 10, 2023

Assignee: Google LLC

Inventors: Vincent Michael Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova

Training neural networks using consistency measures

Patent number: 11544498

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network using consistency measures. One of the methods includes processing a particular training example from a mediator training data set using a first neural network to generate a first output for a first machine learning task; processing the particular training example in the mediator training data set using each of one or more second neural networks, wherein each second neural network is configured to generate a second output for a respective second machine learning task; determining, for each second machine learning task, a consistency target output for the first machine learning task; determining, for each second machine learning task, an error between the first output and the consistency target output corresponding to the second machine learning task; and generating a parameter update for the first neural network from the determined errors.
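The error-and-update steps listed in this abstract could be sketched, under strong simplifying assumptions, as follows. The first network is replaced by a toy scalar model `w * x`, the consistency targets are passed in as plain numbers (how they are derived from the second networks is outside this sketch), and the error is assumed to be a squared difference. The name `consistency_update` is hypothetical.

```python
from typing import List


def consistency_update(
    w: float, x: float, consistency_targets: List[float], lr: float = 0.05
) -> float:
    """One gradient step on the summed squared consistency error.

    Assumptions: the "first neural network" is the toy model w * x, and each
    second task contributes one scalar consistency target t_k.
    """
    first_output = w * x
    # d/dw of sum_k (w * x - t_k)^2  =  sum_k 2 * (w * x - t_k) * x
    grad = sum(2.0 * (first_output - t) * x for t in consistency_targets)
    return w - lr * grad


# Two hypothetical second tasks yield targets 3.0 and 4.0 for the first task.
new_w = consistency_update(w=1.0, x=2.0, consistency_targets=[3.0, 4.0])
print(new_w)
```

Here the first output starts at 2.0, below both targets, so the update increases `w` and moves the output toward them.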

Type: Grant

Filed: March 5, 2021

Date of Patent: January 3, 2023

Assignee: Google LLC

Inventors: Ariel Gordon, Soeren Pirk, Anelia Angelova, Vincent Michael Casser, Yao Lu, Anthony Brohan, Zhao Chen, Jan Dlabal

DETERMINING ENVIRONMENT-CONDITIONED ACTION SEQUENCES FOR ROBOTIC TASKS

Publication number: 20220331962

Abstract: Training and/or using a machine learning model for performing robotic tasks is disclosed herein. In many implementations, an environment-conditioned action sequence prediction model is used to determine a set of actions as well as a corresponding particular order for the actions for the robot to perform to complete the task. In many implementations, each action in the set of actions has a corresponding action network used to control the robot in performing the action.

Type: Application

Filed: September 9, 2020

Publication date: October 20, 2022

Inventors: Soeren Pirk, Seyed Mohammad Khansari Zadeh, Karol Hausman, Alexander Toshev

TRAINING PERSPECTIVE COMPUTER VISION MODELS USING VIEW SYNTHESIS

Publication number: 20210390407

Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for training a perspective computer vision model. The model is configured to receive input data characterizing an input scene in an environment from an input viewpoint and to process the input data in accordance with a set of model parameters to generate an output perspective representation of the scene from the input viewpoint. The system trains the model based on first data characterizing a scene in the environment from a first viewpoint and second data characterizing the scene in the environment from a second, different viewpoint.

Type: Application

Filed: June 10, 2021

Publication date: December 16, 2021

Inventors: Vincent Michael Casser, Yuning Chai, Dragomir Anguelov, Hang Zhao, Henrik Kretzschmar, Reza Mahjourian, Anelia Angelova, Ariel Gordon, Soeren Pirk

TRAINING A DEEP NEURAL NETWORK MODEL TO GENERATE RICH OBJECT-CENTRIC EMBEDDINGS OF ROBOTIC VISION DATA

Publication number: 20210334599

Abstract: Training a machine learning model (e.g., a neural network model such as a convolutional neural network (CNN) model) so that, when trained, the model can be utilized in processing vision data (e.g., from a vision component of a robot), that captures an object, to generate a rich object-centric embedding for the vision data. The generated embedding can enable differentiation of even subtle variations of attributes of the object captured by the vision data.

Type: Application

Filed: September 27, 2019

Publication date: October 28, 2021

Inventors: Soeren Pirk, Yunfei Bai, Pierre Sermanet, Seyed Mohammad Khansari Zadeh, Harrison Lynch

UNSUPERVISED DEPTH PREDICTION NEURAL NETWORKS

Publication number: 20210319578

Abstract: A system for generating a depth output for an image is described. The system receives input images that depict the same scene, each input image including one or more potential objects. The system generates, for each input image, a respective background image and processes the background images to generate a camera motion output that characterizes the motion of the camera between the input images. For each potential object, the system generates a respective object motion output for the potential object based on the input images and the camera motion output. The system processes a particular input image of the input images using a depth prediction neural network (NN) to generate a depth output for the particular input image, and updates the current values of parameters of the depth prediction NN based on the particular depth output, the camera motion output, and the object motion outputs for the potential objects.

Type: Application

Filed: September 5, 2019

Publication date: October 14, 2021

Inventors: Vincent Michael Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova

TRAINING NEURAL NETWORKS USING CONSISTENCY MEASURES

Publication number: 20210279511

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network using consistency measures. One of the methods includes processing a particular training example from a mediator training data set using a first neural network to generate a first output for a first machine learning task; processing the particular training example in the mediator training data set using each of one or more second neural networks, wherein each second neural network is configured to generate a second output for a respective second machine learning task; determining, for each second machine learning task, a consistency target output for the first machine learning task; determining, for each second machine learning task, an error between the first output and the consistency target output corresponding to the second machine learning task; and generating a parameter update for the first neural network from the determined errors.

Type: Application

Filed: March 5, 2021

Publication date: September 9, 2021

Inventors: Ariel Gordon, Soeren Pirk, Anelia Angelova, Vincent Michael Casser, Yao Lu, Anthony Brohan, Zhao Chen, Jan Dlabal

Future semantic segmentation prediction using 3D structure

Patent number: 11100646

Abstract: A method for generating a predicted segmentation map for potential objects in a future scene depicted in a future image is described. The method includes receiving input images that depict a same scene; processing a current input image to generate a segmentation map for potential objects in the current input image and a respective depth map; generating a point cloud for the current input image; processing the input images to generate, for each pair of two input images in the sequence, a respective ego-motion output that characterizes motion of the camera between the two input images; processing the ego-motion outputs to generate a future ego-motion output; processing the point cloud of the current input image and the future ego-motion output to generate a future point cloud; and processing the future point cloud to generate the predicted segmentation map for potential objects in the future scene depicted in the future image.
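The point-cloud steps in this abstract could be illustrated with the minimal sketch below. It assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`) for turning a depth map into points, and it assumes the future ego-motion is given as a rigid transform (rotation matrix plus translation) applied to points in the camera frame. None of these details are stated in the abstract, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def unproject(
    depth: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[Point3]:
    """Back-project each pixel (u, v) with depth z through a pinhole model."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def apply_rigid(
    points: List[Point3],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> List[Point3]:
    """Apply p' = R p + t to every point."""
    moved = []
    for p in points:
        moved.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return moved


# Toy 2x2 depth map with unit focal length and principal point at (0, 0).
depth = [[2.0, 2.0],
         [4.0, 4.0]]
cloud = unproject(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)

# Assumed future ego-motion: no rotation, points end up 1 unit closer in z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
future_cloud = apply_rigid(cloud, identity, (0.0, 0.0, -1.0))
print(future_cloud)
```

Projecting `future_cloud` back into the image plane, with each point carrying its segmentation label, would be one way to arrive at a predicted segmentation map; that projection is not shown here.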

Type: Grant

Filed: September 6, 2019

Date of Patent: August 24, 2021

Assignee: Google LLC

Inventors: Suhani Vora, Reza Mahjourian, Soeren Pirk, Anelia Angelova

ROBOTIC MANIPULATION USING DOMAIN-INVARIANT 3D REPRESENTATIONS PREDICTED FROM 2.5D VISION DATA

Publication number: 20210101286

Abstract: Implementations relate to training a point cloud prediction model that can be utilized to process a single-view two-and-a-half-dimensional (2.5D) observation of an object, to generate a domain-invariant three-dimensional (3D) representation of the object. Implementations additionally or alternatively relate to utilizing the domain-invariant 3D representation to train a robotic manipulation policy model using, as at least part of the input to the robotic manipulation policy model during training, the domain-invariant 3D representations of simulated objects to be manipulated. Implementations additionally or alternatively relate to utilizing the trained robotic manipulation policy model in control of a robot based on output generated by processing generated domain-invariant 3D representations utilizing the robotic manipulation policy model.

Type: Application

Filed: February 28, 2020

Publication date: April 8, 2021

Inventors: Honglak Lee, Xinchen Yan, Soeren Pirk, Yunfei Bai, Seyed Mohammad Khansari Zadeh, Yuanzheng Gong, Jasmine Hsu

FUTURE SEMANTIC SEGMENTATION PREDICTION USING 3D STRUCTURE

Publication number: 20210073997

Abstract: This disclosure describes a system including one or more computers and one or more non-transitory storage devices storing instructions that, when executed by one or more computers, cause the one or more computers to perform operations for generating a predicted segmentation map for potential objects in a future scene depicted in a future image.

Type: Application

Filed: September 6, 2019

Publication date: March 11, 2021

Inventors: Suhani Vora, Reza Mahjourian, Soeren Pirk, Anelia Angelova