Patents by Inventor Xitong Yang

Xitong Yang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11631239
    Abstract: Iterative prediction systems and methods for the task of action detection process an input sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground truth.
    Type: Grant
    Filed: April 22, 2021
    Date of Patent: April 18, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Ming-Yu Liu, Jan Kautz, Fanyi Xiao, Xitong Yang
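    The iterative-prediction idea above — repeatedly regressing bounding-box offsets so that even boxes far from the ground truth converge over a few rounds — can be illustrated with a minimal sketch. This is not the patented implementation; the predictor here is a toy callable (a hypothetical stand-in for the learned model) and the function names are illustrative.

    ```python
    import numpy as np

    def refine_tube(initial_boxes, predict_offsets, num_iterations=3):
        """Iteratively refine an action tube (one [x1, y1, x2, y2] box
        per frame) by repeatedly adding predicted offsets.

        initial_boxes: (T, 4) array, one box per video frame.
        predict_offsets: callable mapping current boxes -> per-frame
            offsets; stands in for the learned iterative predictor.
        """
        boxes = np.asarray(initial_boxes, dtype=float)
        for _ in range(num_iterations):
            boxes = boxes + predict_offsets(boxes)
        return boxes

    # Toy "predictor": step halfway toward a fixed target tube, so each
    # iteration shrinks the remaining offset to the ground truth.
    target = np.array([[10.0, 10.0, 50.0, 50.0]] * 4)
    step_halfway = lambda boxes: 0.5 * (target - boxes)

    tube = refine_tube(np.zeros((4, 4)), step_halfway, num_iterations=3)
    ```

    Even starting from all-zero boxes with a large offset to the target, three iterations bring the tube most of the way there, which is the intuition behind refining predictions iteratively rather than in a single regression step.
    
    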
  • Patent number: 11594006
    Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: February 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Xitong Yang, Sifei Liu, Jan Kautz
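    The abstract above describes learning a hierarchical motion representation from unlabeled frames. As a crude, hedged stand-in for that idea, the sketch below builds a multi-level motion map from raw frame differences at several spatial scales; the patented method learns these features with a neural network, whereas this only illustrates the notion of hierarchical motion cues derived without labels. Function names and the 2x downsampling scheme are assumptions for illustration.

    ```python
    import numpy as np

    def motion_pyramid(frame_t, frame_t1, levels=3):
        """Build a simple hierarchy of motion cues from two consecutive
        unlabeled frames: absolute frame differences at several spatial
        scales (assumes grayscale frames with even dimensions)."""
        a = np.asarray(frame_t, dtype=float)
        b = np.asarray(frame_t1, dtype=float)
        pyramid = []
        for _ in range(levels):
            pyramid.append(np.abs(b - a))
            # Downsample 2x by averaging 2x2 blocks for the next level.
            h, w = a.shape
            a = a.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
            b = b.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        return pyramid

    # Two toy 8x8 frames: uniform brightness change between them.
    pyr = motion_pyramid(np.zeros((8, 8)), np.ones((8, 8)), levels=3)
    ```

    Each level captures motion at a coarser scale, which is the structural idea behind a hierarchical motion representation usable for downstream action recognition.
    
    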
  • Publication number: 20210241489
    Abstract: Iterative prediction systems and methods for the task of action detection process an input sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground truth.
    Type: Application
    Filed: April 22, 2021
    Publication date: August 5, 2021
    Inventors: Xiaodong Yang, Ming-Yu Liu, Jan Kautz, Fanyi Xiao, Xitong Yang
  • Patent number: 11017556
    Abstract: Iterative prediction systems and methods for the task of action detection process an input sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground truth.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: May 25, 2021
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Xitong Yang, Fanyi Xiao, Ming-Yu Liu, Jan Kautz
  • Patent number: 10943154
    Abstract: Multi-modal data representing driving events and corresponding actions related to the driving events can be obtained and used to train a neural network at least in part by using a triplet loss computed for the driving events as a regression loss to determine an embedding of driving event data. In some cases, using the trained neural network, a retrieval request for an input driving event and corresponding action can be processed by determining, from the neural network, one or more similar driving events or corresponding actions in the multi-modal data.
    Type: Grant
    Filed: January 22, 2019
    Date of Patent: March 9, 2021
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Larry Davis, Xitong Yang
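    The abstract above combines two standard pieces: a triplet margin loss over driving-event embeddings, and nearest-neighbor retrieval in the learned embedding space. The sketch below shows both in their textbook form on plain vectors; it is not the patented training procedure, and the embeddings here are hand-made rather than produced by a neural network.

    ```python
    import numpy as np

    def triplet_loss(anchor, positive, negative, margin=1.0):
        """Standard triplet margin loss: pull the positive embedding
        toward the anchor, push the negative at least `margin` away
        (in squared Euclidean distance)."""
        d_pos = np.sum((anchor - positive) ** 2)
        d_neg = np.sum((anchor - negative) ** 2)
        return max(0.0, d_pos - d_neg + margin)

    def retrieve(query, event_embeddings):
        """Return the index of the stored driving-event embedding most
        similar to the query (nearest neighbor in embedding space)."""
        dists = [np.sum((query - e) ** 2) for e in event_embeddings]
        return int(np.argmin(dists))

    # Satisfied triplet: positive coincides with anchor, negative is far.
    zero_loss = triplet_loss(np.array([0.0, 0.0]),
                             np.array([0.0, 0.0]),
                             np.array([2.0, 0.0]))

    # Retrieval: the query sits closest to the first stored event.
    best = retrieve(np.array([0.1, 0.0]),
                    [np.array([0.0, 0.0]), np.array([5.0, 5.0])])
    ```

    Once the loss has shaped the embedding space, retrieval of similar driving events (or their associated actions) reduces to this kind of nearest-neighbor lookup.
    
    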
  • Publication number: 20210064931
    Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.
    Type: Application
    Filed: August 20, 2020
    Publication date: March 4, 2021
    Inventors: Xiaodong Yang, Xitong Yang, Sifei Liu, Jan Kautz
  • Publication number: 20200234086
    Abstract: Multi-modal data representing driving events and corresponding actions related to the driving events can be obtained and used to train a neural network at least in part by using a triplet loss computed for the driving events as a regression loss to determine an embedding of driving event data. In some cases, using the trained neural network, a retrieval request for an input driving event and corresponding action can be processed by determining, from the neural network, one or more similar driving events or corresponding actions in the multi-modal data.
    Type: Application
    Filed: January 22, 2019
    Publication date: July 23, 2020
    Inventors: Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Larry Davis, Xitong Yang
  • Publication number: 20190102908
    Abstract: Iterative prediction systems and methods for the task of action detection process an input sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground truth.
    Type: Application
    Filed: October 4, 2018
    Publication date: April 4, 2019
    Inventors: Xiaodong Yang, Xitong Yang, Fanyi Xiao, Ming-Yu Liu, Jan Kautz
  • Patent number: 9805255
    Abstract: A multimodal sensing system includes various devices that work together to automatically classify an action. A video camera captures a sequence of digital images. At least one other sensor device captures other sensed data (e.g., motion data). The system will extract video features from the digital images so that each extracted image feature is associated with a time period. It will extract other features from the other sensed data so that each extracted other feature is associated with a time period. The system will fuse a group of the extracted video features and a group of the extracted other features to create a fused feature representation for a time period. It will then analyze the fused feature representation to identify a class, access a data store of classes and actions to identify an action that is associated with the class, and save the identified action to a memory device.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: October 31, 2017
    Assignee: Conduent Business Services, LLC
    Inventors: Xitong Yang, Edgar A. Bernal, Sriganesh Madhvanath, Raja Bala, Palghat S. Ramesh, Qun Li, Jayant Kumar
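    The pipeline in the abstract above — fuse per-time-window video and other-sensor features, classify the fused representation, then look up the action associated with the class — can be sketched minimally. This is an illustration only: the fusion here is simple concatenation, the classifier is a nearest-prototype rule rather than the patented system's learned model, and all names (`CLASS_ACTIONS`, the feature vectors) are hypothetical.

    ```python
    import numpy as np

    # Hypothetical class -> action data store; entries are illustrative.
    CLASS_ACTIONS = {"walking": "log_activity", "falling": "raise_alert"}

    def fuse_features(video_feat, sensor_feat):
        """Fuse video and motion-sensor features for one time window by
        concatenation (one simple choice of fused representation)."""
        return np.concatenate([video_feat, sensor_feat])

    def classify_and_act(fused, prototypes, class_actions):
        """Nearest-prototype classification of the fused representation,
        followed by a lookup of the action associated with the class."""
        label = min(prototypes,
                    key=lambda c: np.sum((fused - prototypes[c]) ** 2))
        return label, class_actions[label]

    # One time window: toy video and accelerometer features, fused and
    # matched against per-class prototype vectors.
    prototypes = {"walking": np.array([1.0, 0.0, 1.0, 0.0]),
                  "falling": np.array([0.0, 1.0, 0.0, 1.0])}
    fused = fuse_features(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
    label, action = classify_and_act(fused, prototypes, CLASS_ACTIONS)
    ```

    Concatenation is the simplest fusion strategy; the key structural point is that classification operates on a single joint representation per time window rather than on each modality separately.
    
    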
  • Publication number: 20170220854
    Abstract: A multimodal sensing system includes various devices that work together to automatically classify an action. A video camera captures a sequence of digital images. At least one other sensor device captures other sensed data (e.g., motion data). The system will extract video features from the digital images so that each extracted image feature is associated with a time period. It will extract other features from the other sensed data so that each extracted other feature is associated with a time period. The system will fuse a group of the extracted video features and a group of the extracted other features to create a fused feature representation for a time period. It will then analyze the fused feature representation to identify a class, access a data store of classes and actions to identify an action that is associated with the class, and save the identified action to a memory device.
    Type: Application
    Filed: January 29, 2016
    Publication date: August 3, 2017
    Inventors: Xitong Yang, Edgar A. Bernal, Sriganesh Madhvanath, Raja Bala, Palghat S. Ramesh, Qun Li, Jayant Kumar