Patents by Inventor Rahul Sukthankar
Rahul Sukthankar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11908071
Abstract: The present disclosure is generally directed to reconstructing representations of bodies from images. An example method of the present disclosure includes inputting, into a machine-learned reconstruction model, input data descriptive of an image depicting a body; predicting, using a machine-learned marker prediction component of the reconstruction model, a set of surface marker locations on the body; and outputting, using a machine-learned marker poser component of the reconstruction model, an output representation of the body that corresponds to the set of surface marker locations. In the example method, one or more parameters of the reconstruction model were learned at least in part based on a consistency loss corresponding to a distance between relaxed-constraint representations generated from a prior set of surface marker locations predicted according to the one or more parameters and parametric representations generated from the prior set using kinematic constraints associated with the body.
Type: Grant
Filed: October 7, 2021
Date of Patent: February 20, 2024
Assignee: GOOGLE LLC
Inventors: Cristian Sminchisescu, Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William Tafel Freeman, Rahul Sukthankar
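The consistency loss described in this abstract compares a relaxed-constraint reconstruction against a kinematically constrained one built from the same predicted markers. A minimal sketch of such a distance term, with hypothetical helper names and toy 3D point lists standing in for the two representations:

```python
import math

def consistency_loss(relaxed_points, parametric_points):
    """Mean Euclidean distance between corresponding 3D points of the
    relaxed-constraint reconstruction and the kinematically constrained
    parametric reconstruction."""
    assert len(relaxed_points) == len(parametric_points)
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(relaxed_points, parametric_points):
        total += math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    return total / len(relaxed_points)

# Two toy reconstructions that disagree on one point by one unit.
loss = consistency_loss([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 1)])
```

In training, a term like this would be added to the other losses and backpropagated through the marker predictor; here it simply measures agreement between the two reconstructions.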
-
Patent number: 11836221
Abstract: Systems and methods are directed to a method for estimation of an object state from image data. The method can include obtaining two-dimensional image data depicting an object. The method can include processing, with an estimation portion of a machine-learned object state estimation model, the two-dimensional image data to obtain an initial estimated state of the object. The method can include, for each of one or more refinement iterations, obtaining a previous loss value associated with a previous estimated state for the object, processing the previous loss value to obtain a current estimated state of the object, and evaluating a loss function to determine a loss value associated with the current estimated state of the object. The method can include providing a final estimated state for the object.
Type: Grant
Filed: March 12, 2021
Date of Patent: December 5, 2023
Assignee: GOOGLE LLC
Inventors: Cristian Sminchisescu, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William Tafel Freeman, Rahul Sukthankar
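The refinement loop in this abstract feeds the previous iteration's loss value back into the next state update. A toy sketch, assuming a scalar state and substituting a finite-difference gradient step for the learned update (all names hypothetical):

```python
def refine_state(initial_state, loss_fn, steps=5, lr=0.25, eps=1e-3):
    """Iteratively refine a scalar state estimate. Each iteration reuses the
    previous iteration's loss value; a finite-difference gradient stands in
    for the learned update described in the abstract."""
    state = initial_state
    prev_loss = loss_fn(state)             # loss of the initial estimate
    for _ in range(steps):
        # Use the previous loss value in the update (finite difference).
        grad = (loss_fn(state + eps) - prev_loss) / eps
        state -= lr * grad                 # current estimated state
        prev_loss = loss_fn(state)         # loss fed to the next iteration
    return state, prev_loss
```

With a simple quadratic loss the estimate moves toward the minimizer on every iteration, mirroring how each refinement pass reduces the loss of the current estimated state.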
-
Patent number: 11763466
Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.
Type: Grant
Filed: December 23, 2020
Date of Patent: September 19, 2023
Assignee: Google LLC
Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki
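The architecture described here is one shared encoder feeding two decoder heads. A toy sketch of that layout, with lists of per-pixel values standing in for images and simple arithmetic standing in for the three neural networks (all names hypothetical):

```python
class TwoFrameModel:
    """Shared encoder feeding a structure head and a motion head."""

    def encode(self, first_image, second_image):
        # One encoded representation of both frames: per-pixel value pairs.
        return list(zip(first_image, second_image))

    def decode_structure(self, encoded):
        # Structure output characterizing the scene in the first image.
        return [a for a, _ in encoded]

    def decode_motion(self, encoded):
        # Motion output: per-pixel change between the two frames.
        return [b - a for a, b in encoded]
```

The key design point is that both decoders consume the same encoded representation, so structure and motion estimates are derived from one joint view of the image pair.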
-
Publication number: 20230169727
Abstract: The present disclosure provides a statistical, articulated 3D human shape modeling pipeline within a fully trainable, modular, deep learning framework. In particular, aspects of the present disclosure are directed to a machine-learned 3D human shape model with at least facial and body shape components that are jointly trained end-to-end on a set of training data. Joint training of the model components (e.g., the facial, hand, and rest-of-body components) enables improved consistency of synthesis between the generated face and body shapes.
Type: Application
Filed: April 30, 2020
Publication date: June 1, 2023
Inventors: Cristian Sminchisescu, Hongyi Xu, Eduard Gabriel Bazavan, Andrei Zanfir, William T. Freeman, Rahul Sukthankar
-
Publication number: 20230116884
Abstract: The present disclosure is generally directed to reconstructing representations of bodies from images. An example method of the present disclosure includes inputting, into a machine-learned reconstruction model, input data descriptive of an image depicting a body; predicting, using a machine-learned marker prediction component of the reconstruction model, a set of surface marker locations on the body; and outputting, using a machine-learned marker poser component of the reconstruction model, an output representation of the body that corresponds to the set of surface marker locations. In the example method, one or more parameters of the reconstruction model were learned at least in part based on a consistency loss corresponding to a distance between relaxed-constraint representations generated from a prior set of surface marker locations predicted according to the one or more parameters and parametric representations generated from the prior set using kinematic constraints associated with the body.
Type: Application
Filed: October 7, 2021
Publication date: April 13, 2023
Inventors: Cristian Sminchisescu, Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William Tafel Freeman, Rahul Sukthankar
-
Publication number: 20220292314
Abstract: Systems and methods are directed to a method for estimation of an object state from image data. The method can include obtaining two-dimensional image data depicting an object. The method can include processing, with an estimation portion of a machine-learned object state estimation model, the two-dimensional image data to obtain an initial estimated state of the object. The method can include, for each of one or more refinement iterations, obtaining a previous loss value associated with a previous estimated state for the object, processing the previous loss value to obtain a current estimated state of the object, and evaluating a loss function to determine a loss value associated with the current estimated state of the object. The method can include providing a final estimated state for the object.
Type: Application
Filed: March 12, 2021
Publication date: September 15, 2022
Inventors: Cristian Sminchisescu, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William Tafel Freeman, Rahul Sukthankar
-
Publication number: 20220237882
Abstract: The present disclosure is directed to encoding images. In particular, one or more computing devices can receive data representing one or more machine learning (ML) models configured, at least in part, to encode images comprising objects of a particular type. The computing device(s) can receive data representing an image comprising one or more objects of the particular type. The computing device(s) can generate, based at least in part on the data representing the image and the data representing the ML model(s), data representing an encoded version of the image that alters at least a portion of the image comprising the object(s) such that when the encoded version of the image is decoded, the object(s) are unrecognizable as being of the particular type by one or more object-recognition ML models based at least in part upon which the ML model(s) configured to encode the images were trained.
Type: Application
Filed: May 28, 2019
Publication date: July 28, 2022
Inventors: Shumeet Baluja, Rahul Sukthankar
-
Patent number: 11163989
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing action localization in images and videos. In one aspect, a system comprises a data processing apparatus; a memory in data communication with the data processing apparatus and storing instructions that cause the data processing apparatus to perform image processing and video processing operations comprising: receiving an input comprising an image depicting a person; identifying a plurality of context positions from the image; determining respective feature representations of each of the context positions; providing a feature representation of the person and the feature representations of each of the context positions to a context neural network to obtain relational features, wherein the relational features represent relationships between the person and the context positions; and determining an action performed by the person using the feature representation of the person and the relational features.
Type: Grant
Filed: August 6, 2019
Date of Patent: November 2, 2021
Assignee: Google LLC
Inventors: Chen Sun, Abhinav Shrivastava, Cordelia Luise Schmid, Rahul Sukthankar, Kevin Patrick Murphy, Carl Martin Vondrick
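The relational features in this abstract pair the person's feature representation with each context position's features. A toy sketch, assuming an elementwise product followed by average pooling as a stand-in for the context neural network (names hypothetical):

```python
def relational_features(person_feature, context_features):
    """Pair the person's feature vector with each context position's
    features (elementwise product) and average-pool over positions."""
    pooled = [0.0] * len(person_feature)
    for context in context_features:
        for i, (p, c) in enumerate(zip(person_feature, context)):
            pooled[i] += p * c
    n = max(len(context_features), 1)
    return [value / n for value in pooled]
```

An action classifier would then take the person's own features together with this pooled relational vector, so the prediction depends on how the person relates to the surrounding context.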
-
Publication number: 20210166009
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing action localization. In one aspect, a system comprises a data processing apparatus; a memory in data communication with the data processing apparatus and storing instructions that cause the data processing apparatus to perform operations comprising: receiving an input comprising an image depicting a person; identifying a plurality of context positions from the image; determining respective feature representations of each of the context positions; providing a feature representation of the person and the feature representations of each of the context positions to a context neural network to obtain relational features, wherein the relational features represent relationships between the person and the context positions; and determining an action performed by the person using the feature representation of the person and the relational features.
Type: Application
Filed: August 6, 2019
Publication date: June 3, 2021
Inventors: Chen Sun, Abhinav Shrivastava, Cordelia Luise Schmid, Rahul Sukthankar, Kevin Patrick Murphy, Carl Martin Vondrick
-
Patent number: 11010948
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for navigation using visual inputs. One of the systems includes a mapping subsystem configured to, at each time step of a plurality of time steps, generate a characterization of an environment from an image of the environment at the time step, wherein the characterization comprises an environment map identifying locations in the environment having a particular characteristic, and wherein generating the characterization comprises, for each time step: obtaining the image of the environment at the time step, processing the image to generate a first initial characterization for the time step, obtaining a final characterization for a previous time step, processing the characterization for the previous time step to generate a second initial characterization for the time step, and combining the first initial characterization and the second initial characterization to generate a final characterization for the time step.
Type: Grant
Filed: February 9, 2018
Date of Patent: May 18, 2021
Assignee: Google LLC
Inventors: Rahul Sukthankar, Saurabh Gupta, James Christopher Davidson, Sergey Vladimir Levine, Jitendra Malik
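Each time step in this abstract combines a characterization computed from the current image with one carried over from the previous step. A toy sketch, assuming per-cell confidence values fused by taking the maximum (a simple stand-in for the learned combination; all names hypothetical):

```python
def combine_characterizations(first_initial, second_initial):
    """Fuse the per-cell characterization from the current image with the
    one carried over from the previous time step (keep the larger value)."""
    return [max(a, b) for a, b in zip(first_initial, second_initial)]

def run_mapping(images, map_width):
    """Roll the final characterization forward across time steps."""
    final = [0.0] * map_width
    for image in images:
        first_initial = image        # stand-in for processing the image
        second_initial = final       # stand-in for processing the prior map
        final = combine_characterizations(first_initial, second_initial)
    return final
```

The effect is an environment map that accumulates evidence over time: cells observed confidently at any earlier step stay filled in even when later images do not observe them.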
-
Publication number: 20210118153
Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.
Type: Application
Filed: December 23, 2020
Publication date: April 22, 2021
Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki
-
Patent number: 10878583
Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.
Type: Grant
Filed: December 1, 2017
Date of Patent: December 29, 2020
Assignee: Google LLC
Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki
-
Publication number: 20200349722
Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.
Type: Application
Filed: December 1, 2017
Publication date: November 5, 2020
Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki
-
Patent number: 10713818
Abstract: Methods and systems, including computer programs encoded on computer storage media, for compressing data items with a variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
Type: Grant
Filed: January 28, 2019
Date of Patent: July 14, 2020
Assignee: Google LLC
Inventors: George Dan Toderici, Sean O'Malley, Rahul Sukthankar, Sung Jin Hwang, Damien Vincent, Nicholas Johnston, David Charles Minnen, Joel Shor, Michele Covell
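The variable compression rate in this abstract comes from running the encoder for more or fewer time steps, each emitting binarized output. A toy scalar sketch of progressive residual binarization, where running more steps emits more bits and lowers the reconstruction error (names hypothetical; the LSTM stack is replaced by simple arithmetic):

```python
def progressive_encode(value, num_steps, step_size=0.5):
    """At each time step, emit one sign bit of the current residual and add
    a shrinking correction to the reconstruction. More steps produce more
    bits and a smaller error, giving a variable compression rate."""
    bits, reconstruction = [], 0.0
    for t in range(num_steps):
        bit = 1.0 if value - reconstruction >= 0 else -1.0   # binarize
        bits.append(bit)
        reconstruction += bit * step_size * (0.5 ** t)
    return bits, reconstruction
```

A decoder holding the same schedule can rebuild the reconstruction from the bits alone, and transmission can stop after any prefix of the bit sequence.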
-
Patent number: 10681388
Abstract: Encoding and decoding occupancy information is disclosed. A method includes determining row sums for a region, determining column sums for the region, encoding, in a compressed bitstream, at least one of the row sums and the column sums, and encoding, in the compressed bitstream and based on a coding order, at least one of the rows and the columns of the region. The coding order is based on the encoded at least one of the row sums and the column sums. The row sums include, for each row of the region, a respective count of the number of locations in the row having a specified value. The column sums include, for each column of the region, a respective count of the number of locations in the column having the specified value. A location having the specified value is indicative of the occupancy information at the location.
Type: Grant
Filed: January 30, 2018
Date of Patent: June 9, 2020
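The row and column sums in this abstract are simple occupancy counts, and the coding order can skip rows that the sums already determine. A minimal sketch for a binary grid (helper names hypothetical):

```python
def row_col_sums(grid):
    """Row and column sums of a binary occupancy grid: counts of occupied
    (value 1) locations per row and per column."""
    row_sums = [sum(row) for row in grid]
    col_sums = [sum(row[j] for row in grid) for j in range(len(grid[0]))]
    return row_sums, col_sums

def coding_order(row_sums, width):
    """Rows whose sum is 0 or equal to the width are fully determined by
    the sum alone and need no explicit coding; code the remaining rows."""
    return [i for i, s in enumerate(row_sums) if 0 < s < width]
```

Encoding the sums first lets both encoder and decoder derive the same coding order, so only the ambiguous rows (or columns) need to be written to the bitstream.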
-
Patent number: 10635979
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a clustering of images into a plurality of semantic categories. In one aspect, a method comprises: training a categorization neural network, comprising, at each of a plurality of iterations: processing an image depicting an object using the categorization neural network to generate (i) a current prediction for whether the image depicts an object or a background region, and (ii) a current embedding of the image; determining a plurality of current cluster centers based on the current values of the categorization neural network parameters, wherein each cluster center represents a respective semantic category; and determining a gradient of an objective function that includes a classification loss and a clustering loss, wherein the clustering loss depends on a similarity between the current embedding of the image and the current cluster centers.
Type: Grant
Filed: July 15, 2019
Date of Patent: April 28, 2020
Assignee: Google LLC
Inventors: Steven Hickson, Anelia Angelova, Irfan Aziz Essa, Rahul Sukthankar
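The objective in this abstract sums a classification loss and a clustering loss based on similarity between an image embedding and the cluster centers. A minimal sketch using Euclidean distance to the nearest center (names and weighting hypothetical):

```python
import math

def clustering_loss(embedding, cluster_centers):
    """Distance from an image embedding to its nearest cluster center."""
    return min(
        math.sqrt(sum((e - c) ** 2 for e, c in zip(embedding, center)))
        for center in cluster_centers
    )

def joint_objective(classification_loss, embedding, cluster_centers, weight=1.0):
    """Classification loss plus a weighted clustering term."""
    return classification_loss + weight * clustering_loss(embedding, cluster_centers)
```

Minimizing the clustering term pulls each embedding toward some center, which is what lets the jointly trained network discover the semantic categories.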
-
Publication number: 20200027002
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a clustering of images into a plurality of semantic categories. In one aspect, a method comprises: training a categorization neural network, comprising, at each of a plurality of iterations: processing an image depicting an object using the categorization neural network to generate (i) a current prediction for whether the image depicts an object or a background region, and (ii) a current embedding of the image; determining a plurality of current cluster centers based on the current values of the categorization neural network parameters, wherein each cluster center represents a respective semantic category; and determining a gradient of an objective function that includes a classification loss and a clustering loss, wherein the clustering loss depends on a similarity between the current embedding of the image and the current cluster centers.
Type: Application
Filed: July 15, 2019
Publication date: January 23, 2020
Inventors: Steven Hickson, Anelia Angelova, Irfan Aziz Essa, Rahul Sukthankar
-
Publication number: 20190371025
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for navigation using visual inputs. One of the systems includes a mapping subsystem configured to, at each time step of a plurality of time steps, generate a characterization of an environment from an image of the environment at the time step, wherein the characterization comprises an environment map identifying locations in the environment having a particular characteristic, and wherein generating the characterization comprises, for each time step: obtaining the image of the environment at the time step, processing the image to generate a first initial characterization for the time step, obtaining a final characterization for a previous time step, processing the characterization for the previous time step to generate a second initial characterization for the time step, and combining the first initial characterization and the second initial characterization to generate a final characterization for the time step.
Type: Application
Filed: February 9, 2018
Publication date: December 5, 2019
Inventors: Rahul Sukthankar, Saurabh Gupta, James Christopher Davidson, Sergey Vladimir Levine, Jitendra Malik
-
Publication number: 20190238893
Abstract: Encoding and decoding occupancy information is disclosed. A method includes determining row sums for a region, determining column sums for the region, encoding, in a compressed bitstream, at least one of the row sums and the column sums, and encoding, in the compressed bitstream and based on a coding order, at least one of the rows and the columns of the region. The coding order is based on the encoded at least one of the row sums and the column sums. The row sums include, for each row of the region, a respective count of the number of locations in the row having a specified value. The column sums include, for each column of the region, a respective count of the number of locations in the column having the specified value. A location having the specified value is indicative of the occupancy information at the location.
Type: Application
Filed: January 30, 2018
Publication date: August 1, 2019
-
Patent number: 10192327
Abstract: Methods and systems, including computer programs encoded on computer storage media, for compressing data items with a variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
Type: Grant
Filed: February 3, 2017
Date of Patent: January 29, 2019
Assignee: Google LLC
Inventors: George Dan Toderici, Sean O'Malley, Rahul Sukthankar, Sung Jin Hwang, Damien Vincent, Nicholas Johnston, David Charles Minnen, Joel Shor, Michele Covell