Patents by Inventor Honglu Zhou

Honglu Zhou has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240161464
    Abstract: Embodiments described herein provide systems and methods for training video models to perform a task from an input instructional video. A procedure knowledge graph (PKG) may be generated with nodes representing procedure steps and edges representing relationships between the steps. The PKG may be generated from text and/or video training data that includes procedures (e.g., instructional videos). The PKG may then be used to provide supervisory training signals for training a video model on a number of tasks. Once trained, the model may be fine-tuned for a specific task, which benefits from the model having learned to embed procedural information when encoding videos. (An illustrative sketch follows this entry.)
    Type: Application
    Filed: January 25, 2023
    Publication date: May 16, 2024
    Inventors: Roberto Martin-Martin, Silvio Savarese, Honglu Zhou, Juan Carlos Niebles Duque
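    The following is a minimal, illustrative Python sketch of the PKG idea described above, not the patented implementation: a toy graph whose nodes are procedure steps and whose directed edges record which step may follow which. The class name, step names, and the step-ordering signal are assumptions made for illustration only.

      # Toy procedure knowledge graph (PKG): nodes are procedure steps, directed
      # edges mean "this step may be followed by that step". Step names are
      # hypothetical examples, not taken from the patent.
      from collections import defaultdict

      class ProcedureKnowledgeGraph:
          def __init__(self):
              self.next_steps = defaultdict(set)  # step -> steps observed to follow it

          def add_transition(self, step, next_step):
              self.next_steps[step].add(next_step)

          def is_valid_next_step(self, step, candidate):
              # One possible supervisory signal: does this ordering agree with
              # the procedures seen in the text/video training data?
              return candidate in self.next_steps[step]

      # Build the PKG from (step, next step) pairs mined from instructional content.
      pkg = ProcedureKnowledgeGraph()
      pkg.add_transition("crack eggs", "whisk eggs")
      pkg.add_transition("whisk eggs", "pour into pan")

      print(pkg.is_valid_next_step("crack eggs", "whisk eggs"))     # True
      print(pkg.is_valid_next_step("crack eggs", "pour into pan"))  # False

    Binary ordering labels like these are one example of the kind of supervisory signal a PKG could supply while pre-training a video encoder to embed procedural information.
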
  • Publication number: 20230421855
    Abstract: Methods and systems for time marking of media items at a platform using machine learning are provided herein. An indication of an identified media item is provided as input to a machine-learning model, and one or more outputs of the machine-learning model are obtained. The one or more obtained outputs comprise time marks identifying each of a plurality of content segments of the media item. Each of the plurality of content segments is associated with a segment start indicator for a timeline of the media item. A resulting duration of a combination of the plurality of content segments, for which the time marks were obtained from the one or more outputs of the machine-learning model, is determined. Responsive to determining that the resulting duration is less than the duration of the media item, one or more further inputs are provided to the machine-learning model. (An illustrative sketch of this coverage check follows this entry.)
    Type: Application
    Filed: September 11, 2023
    Publication date: December 28, 2023
    Inventors: Chenjie Gu, Wei-Hong Chuang, Min-Hsuan Tsai, Jianfeng Yang, Ji Zhang, Honglu Zhou, Hassan Akbari
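    A minimal sketch of the coverage check described in this abstract, assuming hypothetical names (run_segmentation_model, fake_model) and (start, end) segment outputs in seconds; it is not the platform's actual pipeline.

      # The model is a stand-in: run_segmentation_model is any callable returning
      # (start_seconds, end_seconds) content segments for a media item.
      def covered_duration(segments):
          return sum(end - start for start, end in segments)

      def time_mark_media_item(media_item, media_duration, run_segmentation_model):
          # First pass: the model proposes content segments (time marks).
          segments = run_segmentation_model(media_item)
          # If the combined segments are shorter than the media item itself,
          # provide further inputs to the model (here, the segments found so far).
          if covered_duration(segments) < media_duration:
              segments = run_segmentation_model(media_item, prior_segments=segments)
          # Each segment start becomes a segment start indicator on the timeline.
          return [start for start, _ in segments]

      def fake_model(media_item, prior_segments=None):
          # Stand-in for the machine-learning model.
          if prior_segments is None:
              return [(0, 40), (40, 90)]           # covers 90s of a 120s item
          return [(0, 40), (40, 90), (90, 120)]    # second pass fills the gap

      print(time_mark_media_item("video-123", 120, fake_model))  # [0, 40, 90]
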
  • Patent number: 11758233
    Abstract: Methods and systems for time marking of media items at a platform using machine learning are provided herein. A media item is provided to users of a platform. An indication of the media item is provided as input to a machine-learning model that is trained using different feature types of historical media items to predict a plurality of content segments of a given media item, each depicting, to the one or more users, a distinct section of the media item. One or more outputs of the machine-learning model are obtained comprising time marks identifying each of the plurality of content segments of the media item. Each of the plurality of content segments is associated with a segment start indicator for a timeline of the media item. The media item and an indication of each segment start indicator are provided for presentation to at least one user. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: June 8, 2022
    Date of Patent: September 12, 2023
    Assignee: Google LLC
    Inventors: Chenjie Gu, Wei-Hong Chuang, Min-Hsuan Tsai, Jianfeng Yang, Ji Zhang, Honglu Zhou, Hassan Akbari
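    A minimal sketch, under assumed names (SegmentStartIndicator, to_timeline_indicators), of how predicted time marks could become segment start indicators on a media item's timeline (e.g., chapter markers); the abstract does not describe the actual presentation logic.

      from dataclasses import dataclass

      @dataclass
      class SegmentStartIndicator:
          start_seconds: float
          label: str

      def to_timeline_indicators(time_marks, labels=None):
          # Each predicted content segment gets a start indicator on the timeline.
          labels = labels or [f"Segment {i + 1}" for i in range(len(time_marks))]
          return [SegmentStartIndicator(t, name) for t, name in zip(sorted(time_marks), labels)]

      # The media item would be presented together with these indicators.
      for ind in to_timeline_indicators([0.0, 42.5, 97.0]):
          print(f"{ind.start_seconds:6.1f}s  {ind.label}")
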
  • Patent number: 11741712
    Abstract: A method for using a multi-hop reasoning framework to perform multi-step compositional long-term reasoning is presented. The method includes extracting feature maps and frame-level representations from a video stream by using a convolutional neural network (CNN), performing object representation learning and detection, linking objects through time via tracking to generate object tracks and image feature tracks, feeding the object tracks and the image feature tracks to a multi-hop transformer that hops over frames in the video stream while concurrently attending to one or more of the objects in the video stream until the multi-hop transformer arrives at a correct answer, and employing video representation learning and recognition from the objects and image context to locate a target object within the video stream. (An illustrative sketch of the multi-hop attention idea follows this entry.)
    Type: Grant
    Filed: September 1, 2021
    Date of Patent: August 29, 2023
    Inventors: Asim Kadav, Farley Lai, Hans Peter Graf, Alexandru Niculescu-Mizil, Renqiang Min, Honglu Zhou
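    A minimal sketch of the multi-hop attention idea, assuming PyTorch; the dimensions, hop count, and module structure are illustrative and not the patented architecture, which also involves object detection, tracking, and a CNN backbone.

      import torch
      import torch.nn as nn

      class MultiHopAttention(nn.Module):
          """A query repeatedly attends over object/track features, refining itself each hop."""

          def __init__(self, dim=256, heads=4, hops=3):
              super().__init__()
              self.hops = hops
              self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
              self.norm = nn.LayerNorm(dim)

          def forward(self, query, track_features):
              # query: (batch, 1, dim); track_features: (batch, num_track_tokens, dim)
              for _ in range(self.hops):
                  attended, _ = self.attn(query, track_features, track_features)
                  query = self.norm(query + attended)  # one "hop": refine the query, attend again
              return query

      # Toy usage: 2 videos, 50 object/track tokens each, 256-dim features.
      model = MultiHopAttention()
      tracks = torch.randn(2, 50, 256)
      question = torch.randn(2, 1, 256)
      print(model(question, tracks).shape)  # torch.Size([2, 1, 256])
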
  • Publication number: 20230148017
    Abstract: A method for compositional reasoning of group activity in videos with keypoint-only modality is presented. The method includes obtaining video frames from a video stream received from a plurality of video image capturing devices, extracting keypoints of all persons detected in the video frames to define keypoint data, tokenizing the keypoint data with time and segment information, clustering groups of keypoint persons in the video frames and passing the clustered groups through multi-scale prediction, and performing a prediction to provide a group activity prediction of a scene in the video frames. (An illustrative sketch follows this entry.)
    Type: Application
    Filed: October 5, 2022
    Publication date: May 11, 2023
    Inventors: Asim Kadav, Farley Lai, Hans Peter Graf, Honglu Zhou
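    A minimal sketch of the keypoint-only pipeline's first stages (tokenizing keypoints with time and segment information, then clustering persons by proximity), using NumPy and assumed names and thresholds; the multi-scale prediction and group activity classifier are omitted.

      import numpy as np

      def tokenize_keypoints(keypoints, frame_idx, segment_idx):
          # keypoints: (num_joints, 2) array of (x, y) coordinates for one person.
          flat = np.asarray(keypoints, dtype=np.float32).ravel()
          return np.concatenate([flat, [frame_idx, segment_idx]])  # keypoints + time + segment

      def cluster_persons(person_keypoints, distance_threshold=1.0):
          # Greedy grouping of persons by the distance between their keypoint centroids.
          centroids = [np.asarray(k, dtype=np.float32).mean(axis=0) for k in person_keypoints]
          groups = []
          for idx, centroid in enumerate(centroids):
              for group in groups:
                  if np.linalg.norm(centroid - centroids[group[0]]) < distance_threshold:
                      group.append(idx)
                      break
              else:
                  groups.append([idx])
          return groups

      # Toy frame with three people: two close together, one far away.
      people = [[(0.0, 0.0), (0.1, 0.2)], [(0.2, 0.1), (0.3, 0.3)], [(5.0, 5.0), (5.1, 5.2)]]
      tokens = [tokenize_keypoints(kp, frame_idx=0, segment_idx=0) for kp in people]
      print(cluster_persons(people))  # [[0, 1], [2]]
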
  • Publication number: 20220101007
    Abstract: A method for using a multi-hop reasoning framework to perform multi-step compositional long-term reasoning is presented. The method includes extracting feature maps and frame-level representations from a video stream by using a convolutional neural network (CNN), performing object representation learning and detection, linking objects through time via tracking to generate object tracks and image feature tracks, feeding the object tracks and the image feature tracks to a multi-hop transformer that hops over frames in the video stream while concurrently attending to one or more of the objects in the video stream until the multi-hop transformer arrives at a correct answer, and employing video representation learning and recognition from the objects and image context to locate a target object within the video stream.
    Type: Application
    Filed: September 1, 2021
    Publication date: March 31, 2022
    Inventors: Asim Kadav, Farley Lai, Hans Peter Graf, Alexandru Niculescu-Mizil, Renqiang Min, Honglu Zhou