Patents by Inventor Sudheendra Vijayanarasimhan

Sudheendra Vijayanarasimhan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Determining structure and motion in images using neural networks

Patent number: 11763466

Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.

Type: Grant

Filed: December 23, 2020

Date of Patent: September 19, 2023

Assignee: Google LLC

Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki

Machine learning methods and apparatus for semantic robotic grasping

Patent number: 11717959

Abstract: Deep machine learning methods and apparatus related to semantic robotic grasping are provided. Some implementations relate to training a grasp neural network, a semantic neural network, and a joint neural network of a semantic grasping model. In some of those implementations, the joint network is a deep neural network and can be trained based on both: grasp losses generated based on grasp predictions generated over a grasp neural network, and semantic losses generated based on semantic predictions generated over the semantic neural network. Some implementations are directed to utilization of the trained semantic grasping model to servo, or control, a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).

Type: Grant

Filed: June 28, 2018

Date of Patent: August 8, 2023

Assignee: Google LLC

Inventors: Eric Jang, Sudheendra Vijayanarasimhan, Peter Pastor Sampedro, Julian Ibarz, Sergey Levine

Generating a video segment of an action from a video

Patent number: 11663827

Abstract: A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.

Type: Grant

Filed: July 13, 2022

Date of Patent: May 30, 2023

Assignee: Google LLC

Inventors: Sudheendra Vijayanarasimhan, Alexis Bienvenu, David Ross, Timothy Novikoff, Arvind Balasubramanian
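
The final two steps of the abstract above (scoring each video segment and selecting a subset by confidence) can be illustrated with a minimal sketch. This is not the patented implementation: the `Segment` record, the `select_segments` function, the 0.5 threshold, and the sample values are all hypothetical, and the confidence scores are assumed to have been produced already by the classifiers the abstract mentions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical record for one detected action in a video."""
    start: float       # start time of the action, in seconds
    end: float         # end time of the action, in seconds
    action: str        # predicted action label
    confidence: float  # probability that the action is one of the predetermined actions


def select_segments(segments: List[Segment],
                    min_confidence: float = 0.5,
                    max_segments: Optional[int] = None) -> List[Segment]:
    """Keep segments at or above the threshold, optionally capped at the top N."""
    kept = [s for s in segments if s.confidence >= min_confidence]
    # Rank by confidence so that a cap keeps the most confident segments.
    kept.sort(key=lambda s: s.confidence, reverse=True)
    if max_segments is not None:
        kept = kept[:max_segments]
    # Return the survivors in playback order.
    return sorted(kept, key=lambda s: s.start)


# Made-up candidate segments for illustration only.
candidates = [
    Segment(0.0, 2.5, "jump", 0.91),
    Segment(4.0, 6.0, "wave", 0.35),
    Segment(7.5, 9.0, "dance", 0.78),
]
selected = select_segments(candidates, min_confidence=0.5)
print([s.action for s in selected])
```

A fixed threshold with an optional cap is only one possible selection rule; the abstract requires nothing beyond selection based on the confidence score.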
GENERATING A VIDEO SEGMENT OF AN ACTION FROM A VIDEO

Publication number: 20220351516

Abstract: A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.

Type: Application

Filed: July 13, 2022

Publication date: November 3, 2022

Applicant: Google LLC

Inventors: Sudheendra Vijayanarasimhan, Alexis Bienvenu, David Ross, Timothy Novikoff, Arvind Balasubramanian

Generating a video segment of an action from a video

Patent number: 11393209

Abstract: A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.

Type: Grant

Filed: August 5, 2020

Date of Patent: July 19, 2022

Assignee: Google LLC

Inventors: Sudheendra Vijayanarasimhan, Alexis Bienvenu, David Ross, Timothy Novikoff, Arvind Balasubramanian

FEATURE-BASED VIDEO ANNOTATION

Publication number: 20220207873

Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.

Type: Application

Filed: December 13, 2021

Publication date: June 30, 2022

Inventors: Balakrishnan Varadarajan, George Dan Toderici, Apostol Natsev, Nitin Khandelwal, Sudheendra Vijayanarasimhan, Weilong Yang, Sanketh Shetty

Feature-based video annotation

Patent number: 11200423

Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.

Type: Grant

Filed: November 18, 2019

Date of Patent: December 14, 2021

Assignee: Google LLC

Inventors: Balakrishnan Varadarajan, George Dan Toderici, Apostol Natsev, Nitin Khandelwal, Sudheendra Vijayanarasimhan, Weilong Yang, Sanketh Shetty

Classifying videos using neural networks

Patent number: 11074454

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying videos using neural networks. One of the methods includes obtaining a temporal sequence of video frames, wherein the temporal sequence comprises a respective video frame from a particular video at each of a plurality of time steps; for each time step of the plurality of time steps: processing the video frame at the time step using a convolutional neural network to generate features of the video frame; and processing the features of the video frame using an LSTM neural network to generate a set of label scores for the time step; and classifying the video as relating to one or more of the topics represented by labels in the set of labels from the label scores for each of the plurality of time steps.

Type: Grant

Filed: May 13, 2019

Date of Patent: July 27, 2021

Assignee: Google LLC

Inventors: Sudheendra Vijayanarasimhan, George Dan Toderici, Yue Hei Ng, Matthew John Hausknecht, Oriol Vinyals, Rajat Monga
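
The last step of the abstract above, classifying the video from the label scores produced at each time step, can be illustrated with a minimal sketch. The abstract does not say how the per-step scores are combined, so the averaging rule, the 0.5 threshold, the function name `classify_video`, and the sample scores below are all assumptions; the convolutional and LSTM networks themselves are not modeled here.

```python
from typing import Dict, List


def classify_video(step_scores: List[Dict[str, float]],
                   threshold: float = 0.5) -> List[str]:
    """Average each label's score over all time steps and keep labels above a threshold.

    step_scores holds one {label: score} mapping per time step, standing in for
    the per-step output of the LSTM. A label missing from a step counts as 0.
    """
    if not step_scores:
        return []
    totals: Dict[str, float] = {}
    for scores in step_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    num_steps = len(step_scores)
    averaged = {label: total / num_steps for label, total in totals.items()}
    return sorted(label for label, avg in averaged.items() if avg >= threshold)


# Made-up label scores for three time steps.
step_scores = [
    {"sports": 0.9, "music": 0.2},
    {"sports": 0.7, "music": 0.4},
    {"sports": 0.8, "music": 0.3},
]
labels = classify_video(step_scores)
print(labels)
```

Averaging is only one way to pool the scores; weighting later time steps more heavily or taking the maximum per label would fit the abstract's wording equally well.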
Deep machine learning methods and apparatus for robotic grasping

Patent number: 11045949

Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).

Type: Grant

Filed: March 19, 2020

Date of Patent: June 29, 2021

Assignee: Google LLC

Inventors: Sudheendra Vijayanarasimhan, Eric Jang, Peter Pastor Sampedro, Sergey Levine

SELECTING AND PRESENTING REPRESENTATIVE FRAMES FOR VIDEO PREVIEWS

Publication number: 20210166035

Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.

Type: Application

Filed: December 14, 2020

Publication date: June 3, 2021

Inventors: Sanketh Shetty, Tomas Izo, Min-Hsuan Tsai, Sudheendra Vijayanarasimhan, Apostol Natsev, Sami Abu-El-Haija, George Dan Toderici, Susana Ricco, Balakrishnan Varadarajan, Nicola Muscettola, WeiHsin Gu, Weilong Yang, Nitin Khandelwal, Phuong Le

DETERMINING STRUCTURE AND MOTION IN IMAGES USING NEURAL NETWORKS

Publication number: 20210118153

Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.

Type: Application

Filed: December 23, 2020

Publication date: April 22, 2021

Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki

Determining structure and motion in images using neural networks

Patent number: 10878583

Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.

Type: Grant

Filed: December 1, 2017

Date of Patent: December 29, 2020

Assignee: Google LLC

Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki

Selecting and presenting representative frames for video previews

Patent number: 10867183

Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.

Type: Grant

Filed: April 23, 2018

Date of Patent: December 15, 2020

Assignee: Google LLC

Inventors: Sanketh Shetty, Tomas Izo, Min-Hsuan Tsai, Sudheendra Vijayanarasimhan, Apostol Natsev, Sami Abu-El-Haija, George Dan Toderici, Susanna Ricco, Balakrishnan Varadarajan, Nicola Muscettola, WeiHsin Gu, Weilong Yang, Nitin Khandelwal, Phuong Le
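
The per-segment selection described in the abstract above (score every frame in a segment, then pick the best one) can be illustrated with a minimal sketch. The abstract says only that the score is based at least on the semantic features, so the mean-likelihood scoring rule, the function names, and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

# A frame is modeled as (frame_index, {semantic_concept: likelihood}).
Frame = Tuple[int, Dict[str, float]]


def frame_score(semantic_likelihoods: Dict[str, float]) -> float:
    """Hypothetical scoring rule: the mean likelihood over the frame's concepts.

    Assumes every frame has at least one semantic feature, as the abstract states.
    """
    return sum(semantic_likelihoods.values()) / len(semantic_likelihoods)


def representative_frames(segments: List[List[Frame]]) -> List[int]:
    """Return the index of the highest-scoring frame in each non-empty segment."""
    chosen = []
    for segment in segments:
        if not segment:
            continue  # nothing to choose from in an empty segment
        best_index, _ = max(segment, key=lambda frame: frame_score(frame[1]))
        chosen.append(best_index)
    return chosen


# Two made-up segments, each a chronological run of frames.
segments = [
    [(0, {"dog": 0.2, "park": 0.4}), (1, {"dog": 0.9, "park": 0.7})],
    [(2, {"dog": 0.6, "park": 0.1}), (3, {"dog": 0.3, "park": 0.2})],
]
picked = representative_frames(segments)
print(picked)
```

A fuller scoring function would presumably also weigh the frame-based features the abstract mentions; that part is left out here to keep the sketch short.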
GENERATING A VIDEO SEGMENT OF AN ACTION FROM A VIDEO

Publication number: 20200364464

Abstract: A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.

Type: Application

Filed: August 5, 2020

Publication date: November 19, 2020

Applicant: Google LLC

Inventors: Sudheendra Vijayanarasimhan, Alexis Bienvenu, David Ross, Timothy Novikoff, Arvind Balasubramanian

DETERMINING STRUCTURE AND MOTION IN IMAGES USING NEURAL NETWORKS

Publication number: 20200349722

Abstract: A system comprising an encoder neural network, a scene structure decoder neural network, and a motion decoder neural network. The encoder neural network is configured to: receive a first image and a second image; and process the first image and the second image to generate an encoded representation of the first image and the second image. The scene structure decoder neural network is configured to process the encoded representation to generate a structure output characterizing a structure of a scene depicted in the first image. The motion decoder neural network is configured to process the encoded representation to generate a motion output characterizing motion between the first image and the second image.

Type: Application

Filed: December 1, 2017

Publication date: November 5, 2020

Inventors: Cordelia Luise Schmid, Sudheendra Vijayanarasimhan, Susanna Maria Ricco, Bryan Andrew Seybold, Rahul Sukthankar, Aikaterini Fragkiadaki

MACHINE LEARNING METHODS AND APPARATUS FOR SEMANTIC ROBOTIC GRASPING

Publication number: 20200338722

Abstract: Deep machine learning methods and apparatus related to semantic robotic grasping are provided. Some implementations relate to training a grasp neural network, a semantic neural network, and a joint neural network of a semantic grasping model. In some of those implementations, the joint network is a deep neural network and can be trained based on both: grasp losses generated based on grasp predictions generated over a grasp neural network, and semantic losses generated based on semantic predictions generated over the semantic neural network. Some implementations are directed to utilization of the trained semantic grasping model to servo, or control, a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).

Type: Application

Filed: June 28, 2018

Publication date: October 29, 2020

Inventors: Eric Jang, Sudheendra Vijayanarasimhan, Peter Pastor Sampedro, Julian Ibarz, Sergey Levine

Generating a video segment of an action from a video

Patent number: 10740620

Abstract: A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.

Type: Grant

Filed: October 12, 2017

Date of Patent: August 11, 2020

Assignee: Google LLC

Inventors: Sudheendra Vijayanarasimhan, Alexis Bienvenu, David Ross, Timothy Novikoff, Arvind Balasubramanian

DEEP MACHINE LEARNING METHODS AND APPARATUS FOR ROBOTIC GRASPING

Publication number: 20200215686

Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).

Type: Application

Filed: March 19, 2020

Publication date: July 9, 2020

Inventors: Sudheendra Vijayanarasimhan, Eric Jang, Peter Pastor Sampedro, Sergey Levine

Deep machine learning methods and apparatus for robotic grasping

Patent number: 10639792

Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).

Type: Grant

Filed: January 26, 2018

Date of Patent: May 5, 2020

Assignee: Google LLC

Inventors: Sudheendra Vijayanarasimhan, Eric Jang, Peter Pastor Sampedro, Sergey Levine

FEATURE-BASED VIDEO ANNOTATION

Publication number: 20200082173

Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.

Type: Application

Filed: November 18, 2019

Publication date: March 12, 2020

Inventors: Balakrishnan Varadarajan, George Dan Toderici, Apostol Natsev, Nitin Khandelwal, Sudheendra Vijayanarasimhan, Weilong Yang, Sanketh Shetty
