Patents by Inventor Du Le Hong Tran

Du Le Hong Tran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

VIDEO FRAME INTERPOLATION USING THREE-DIMENSIONAL SPACE-TIME CONVOLUTION

Publication number: 20230344962

Abstract: A method includes receiving an input video stream and providing, to a convolutional neural network (CNN), multiple image frames of the video stream including a target pair of consecutive frames, a frame immediately preceding the target pair, and a frame immediately following the target pair. The method includes generating, by the CNN, multiple interpolated image frames by performing 3D space-time convolution on the multiple image frames and outputting a video stream in which the interpolated image frames are inserted between the frames of the target pair. The convolution may include passing a 3D filter over the multiple image frames in common width and height dimensions, and in a depth dimension representing the number of frames. Generating the interpolated image frames may include generating image data for multiple color channels in respective convolutional layers. The CNN may be trained to predict non-linear movements that occur over multiple image frames.

Type: Application

Filed: March 31, 2021

Publication date: October 26, 2023

Inventors: Du Le Hong Tran, Subrahmanya Sai Tarun Kalluri, Deepak Pathak
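
The abstract above describes passing a 3D filter over several frames in width, height, and a depth dimension equal to the number of frames. The following is a minimal sketch of that sliding operation only, not the patented network: the function name `conv3d_valid`, the frame sizes, and the averaging kernel are all assumptions chosen for illustration, and a trained CNN would learn its kernel weights rather than use fixed ones.

```python
def conv3d_valid(frames, kernel):
    """Slide a 3D kernel over a (time, height, width) stack with no padding.

    As in most CNN libraries, the kernel is not flipped, so strictly
    speaking this is a cross-correlation.
    """
    T, H, W = len(frames), len(frames[0]), len(frames[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += frames[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Four 3x3 grayscale frames (assumed sizes): the frame before the target
# pair, the two frames of the pair, and the frame after it.
# Frame t is filled with the constant value t.
frames = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]

# Hypothetical averaging kernel: depth 4 (all frames) and a 3x3 spatial window.
kernel = [[[1.0 / 36] * 3 for _ in range(3)] for _ in range(4)]

out = conv3d_valid(frames, kernel)
# The kernel covers the whole stack, so the output is a single value:
# the mean of the frame values 0, 1, 2, 3, which is approximately 1.5.
print(out)
```

Because the kernel's depth spans all four frames, each output value mixes information from before, during, and after the target pair, which is the property the abstract relies on for interpolating between the two middle frames.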

Anticipating future video based on present video

Patent number: 11636681

Abstract: In one embodiment, a method includes accessing a first set of images of multiple images of a scene, wherein the first set of images show the scene during a time period. The method includes generating, by processing the first set of images using a first machine-learning model, one or more attributes representing observed actions performed in the scene during the time period. The method includes predicting, by processing the generated one or more attributes using a second machine-learning model, one or more actions that would happen in the scene after the time period.

Type: Grant

Filed: November 19, 2019

Date of Patent: April 25, 2023

Assignee: Meta Platforms, Inc.

Inventors: Heng Wang, Du Le Hong Tran, Antoine Miech, Lorenzo Torresani

Convolutional neural network based on groupwise convolution for efficient video analysis

Patent number: 10984245

Abstract: In one embodiment, a method includes receiving a request for information associated with a video, determining the information associated with the video by processing the video using a machine-learning model which is based on a convolutional neural network comprising a plurality of layers, wherein at least one of the plurality of layers comprises one or more building blocks, wherein at least one of the one or more building blocks comprises a first filter configured to perform a three-dimensional (3D) pointwise convolutional operation and a second filter configured to perform a three-dimensional (3D) groupwise convolutional operation, and outputting the information associated with the video in response to the request.

Type: Grant

Filed: February 26, 2019

Date of Patent: April 20, 2021

Assignee: Facebook, Inc.

Inventors: Du Le Hong Tran, Kaiming He, Heng Wang, Matthew Dan Feiszli, Lorenzo Torresani
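
To give a sense of why a building block that pairs a 3D pointwise filter with a 3D groupwise filter can be called efficient, the sketch below counts weights only. It is an illustration under assumptions: the helper `conv3d_params`, the 64-channel width, the 3x3x3 kernel, and the choice of one group per channel (the extreme groupwise case) are not taken from the patent.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), groups=1):
    """Weight count (bias excluded) of a 3D convolution.

    With `groups` > 1, each output channel only sees c_in / groups
    input channels, which divides the weight count by `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    kt, kh, kw = kernel
    return (c_in // groups) * c_out * kt * kh * kw


channels = 64  # hypothetical layer width

# Ordinary 3x3x3 convolution: every output channel sees every input channel.
full = conv3d_params(channels, channels)

# Pointwise (1x1x1) convolution: mixes channels, no spatial or temporal extent.
pointwise = conv3d_params(channels, channels, kernel=(1, 1, 1))

# Groupwise 3x3x3 convolution with one group per channel (an assumption):
# filters space-time locally, no channel mixing.
groupwise = conv3d_params(channels, channels, groups=channels)

block = pointwise + groupwise
print(full, pointwise, groupwise, block)  # 110592 4096 1728 5824
```

Under these assumed sizes the pointwise-plus-groupwise pair uses roughly 19 times fewer weights than the single full 3D convolution, because channel mixing and space-time filtering are handled by separate, cheaper filters.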

Video analysis using convolutional networks

Patent number: 10706350

Abstract: In one embodiment, a method includes, by a computing device, receiving a plurality of inputs for a convolution layer of a convolutional neural network, the convolution layer having one or more input channels and one or more output channels, wherein the inputs are received via the input channels, generating, by convolving the inputs with one or more two-dimensional filters, a plurality of intermediate values, and generating, by convolving the intermediate values with one or more one-dimensional filters, a plurality of outputs, wherein the one-dimensional filters receive the intermediate values from the two-dimensional filters via intermediate channels. The method may provide the outputs to a subsequent layer of the convolutional neural network via the output channels. Each of the two dimensions of the two-dimensional filter may correspond to a spatial dimension, and the one dimension of the one-dimensional filter may correspond to a temporal dimension.

Type: Grant

Filed: August 10, 2018

Date of Patent: July 7, 2020

Assignee: Facebook, Inc.

Inventors: Du Le Hong Tran, Benjamin Ray, Balmanohar Paluri
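
The abstract above describes convolving inputs with two-dimensional (spatial) filters to produce intermediate values, then convolving those with one-dimensional (temporal) filters. The sketch below shows that two-step order for a single channel. It is a simplified illustration under assumptions: the function names, the tiny frame sizes, and the fixed kernels are hypothetical, and the multiple input, intermediate, and output channels of the described layer are omitted.

```python
def conv2d_valid(frame, k2d):
    """Apply a 2D spatial kernel to one frame (no padding, no kernel flip)."""
    kh, kw = len(k2d), len(k2d[0])
    H, W = len(frame), len(frame[0])
    return [
        [
            sum(frame[y + dy][x + dx] * k2d[dy][dx]
                for dy in range(kh) for dx in range(kw))
            for x in range(W - kw + 1)
        ]
        for y in range(H - kh + 1)
    ]


def conv1d_time_valid(planes, k1d):
    """Apply a 1D temporal kernel at every pixel position (no padding)."""
    kt = len(k1d)
    T, H, W = len(planes), len(planes[0]), len(planes[0][0])
    return [
        [
            [sum(planes[t + dt][y][x] * k1d[dt] for dt in range(kt))
             for x in range(W)]
            for y in range(H)
        ]
        for t in range(T - kt + 1)
    ]


def conv2plus1d(frames, k2d, k1d):
    """Spatial 2D step first, then temporal 1D step on the intermediate values."""
    intermediate = [conv2d_valid(frame, k2d) for frame in frames]
    return conv1d_time_valid(intermediate, k1d)


# Three 3x3 frames whose pixel value is t + y + x (assumed toy data).
frames = [[[float(t + y + x) for x in range(3)] for y in range(3)]
          for t in range(3)]
k2d = [[1.0, 1.0], [1.0, 1.0]]  # hypothetical 2x2 spatial sum
k1d = [-1.0, 0.0, 1.0]          # hypothetical temporal difference

out = conv2plus1d(frames, k2d, k1d)
print(out)  # [[[8.0, 8.0], [8.0, 8.0]]]
```

In this toy input every pixel increases by 1 per frame, so the 2x2 spatial sum increases by 4 per frame and the temporal difference across two frame steps is 8 at every position.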

Anticipating Future Video Based on Present Video

Publication number: 20200160064

Abstract: In one embodiment, a method includes accessing a first set of images of multiple images of a scene, wherein the first set of images show the scene during a time period. The method includes generating, by processing the first set of images using a first machine-learning model, one or more attributes representing observed actions performed in the scene during the time period. The method includes predicting, by processing the generated one or more attributes using a second machine-learning model, one or more actions that would happen in the scene after the time period.

Type: Application

Filed: November 19, 2019

Publication date: May 21, 2020

Inventors: Heng Wang, Du Le Hong Tran, Antoine Miech, Lorenzo Torresani

Systems and methods for determining video feature descriptors based on convolutional neural networks

Patent number: 10198637

Abstract: Systems, methods, and non-transitory computer-readable media can acquire video content for which video feature descriptors are to be determined. The video content can be processed based at least in part on a convolutional neural network including a set of two-dimensional convolutional layers and a set of three-dimensional convolutional layers. One or more outputs can be generated from the convolutional neural network. A plurality of video feature descriptors for the video content can be determined based at least in part on the one or more outputs from the convolutional neural network.

Type: Grant

Filed: December 20, 2017

Date of Patent: February 5, 2019

Assignee: Facebook, Inc.

Inventors: Du Le Hong Tran, Balamanohar Paluri, Lubomir Bourdev, Robert D. Fergus, Sumit Chopra

SYSTEMS AND METHODS FOR DETERMINING VIDEO FEATURE DESCRIPTORS BASED ON CONVOLUTIONAL NEURAL NETWORKS

Publication number: 20180114069

Abstract: Systems, methods, and non-transitory computer-readable media can acquire video content for which video feature descriptors are to be determined. The video content can be processed based at least in part on a convolutional neural network including a set of two-dimensional convolutional layers and a set of three-dimensional convolutional layers. One or more outputs can be generated from the convolutional neural network. A plurality of video feature descriptors for the video content can be determined based at least in part on the one or more outputs from the convolutional neural network.

Type: Application

Filed: December 20, 2017

Publication date: April 26, 2018

Inventors: Du Le Hong Tran, Balamanohar Paluri, Lubomir Bourdev, Robert D. Fergus, Sumit Chopra

Systems and methods for determining video feature descriptors based on convolutional neural networks

Patent number: 9858484

Abstract: Systems, methods, and non-transitory computer-readable media can acquire video content for which video feature descriptors are to be determined. The video content can be processed based at least in part on a convolutional neural network including a set of two-dimensional convolutional layers and a set of three-dimensional convolutional layers. One or more outputs can be generated from the convolutional neural network. A plurality of video feature descriptors for the video content can be determined based at least in part on the one or more outputs from the convolutional neural network.

Type: Grant

Filed: December 30, 2014

Date of Patent: January 2, 2018

Assignee: Facebook, Inc.

Inventors: Du Le Hong Tran, Balamanohar Paluri, Lubomir Bourdev, Robert D. Fergus, Sumit Chopra

Systems and methods for processing content using convolutional neural networks

Patent number: 9754351

Abstract: Systems, methods, and non-transitory computer-readable media can obtain a set of video frames at a first resolution. The set of video frames can be processed using a convolutional neural network to output one or more signals, the convolutional neural network including (i) a set of two-dimensional convolutional layers and (ii) a set of three-dimensional convolutional layers, wherein the processing causes the set of video frames to be reduced to a second resolution. The one or more signals can be processed using a set of three-dimensional de-convolutional layers of the convolutional neural network. One or more outputs corresponding to the set of video frames can be obtained from the convolutional neural network.

Type: Grant

Filed: December 29, 2015

Date of Patent: September 5, 2017

Assignee: Facebook, Inc.

Inventors: Balamanohar Paluri, Du Le Hong Tran, Lubomir Bourdev, Robert D. Fergus
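
The abstract above describes convolutional layers that reduce the frames from a first resolution to a second one, followed by three-dimensional de-convolutional layers. The size arithmetic behind that reduce-then-restore pattern can be sketched with the standard output-size formulas for a convolution and a transposed convolution (no dilation, no output padding). The clip size, kernel size, stride, and padding below are assumptions for illustration and do not come from the patent.

```python
def conv_out_size(n, kernel, stride=1, padding=0):
    """Output length of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv_out_size(n, kernel, stride=1, padding=0):
    """Output length of a transposed convolution along one dimension."""
    return (n - 1) * stride - 2 * padding + kernel


# Hypothetical clip: 16 frames of 112x112 pixels (time, height, width).
first_resolution = (16, 112, 112)

# One strided convolution per dimension with kernel 4, stride 2, padding 1
# (assumed values) halves each dimension.
second_resolution = tuple(conv_out_size(n, 4, 2, 1) for n in first_resolution)

# A transposed convolution with the same settings undoes the size reduction.
restored = tuple(deconv_out_size(n, 4, 2, 1) for n in second_resolution)

print(second_resolution)  # (8, 56, 56)
print(restored)           # (16, 112, 112)
```

With these assumed settings the de-convolutional step returns the signal to the first resolution, which is one way a network could produce an output for every pixel of every input frame.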

SYSTEMS AND METHODS FOR PROCESSING CONTENT USING CONVOLUTIONAL NEURAL NETWORKS

Publication number: 20170132758

Abstract: Systems, methods, and non-transitory computer-readable media can obtain a set of video frames at a first resolution. The set of video frames can be processed using a convolutional neural network to output one or more signals, the convolutional neural network including (i) a set of two-dimensional convolutional layers and (ii) a set of three-dimensional convolutional layers, wherein the processing causes the set of video frames to be reduced to a second resolution. The one or more signals can be processed using a set of three-dimensional de-convolutional layers of the convolutional neural network. One or more outputs corresponding to the set of video frames can be obtained from the convolutional neural network.

Type: Application

Filed: December 29, 2015

Publication date: May 11, 2017

Inventors: Balamanohar Paluri, Du Le Hong Tran, Lubomir Bourdev, Robert D. Fergus

SYSTEMS AND METHODS FOR DETERMINING VIDEO FEATURE DESCRIPTORS BASED ON CONVOLUTIONAL NEURAL NETWORKS

Publication number: 20160189009

Abstract: Systems, methods, and non-transitory computer-readable media can acquire video content for which video feature descriptors are to be determined. The video content can be processed based at least in part on a convolutional neural network including a set of two-dimensional convolutional layers and a set of three-dimensional convolutional layers. One or more outputs can be generated from the convolutional neural network. A plurality of video feature descriptors for the video content can be determined based at least in part on the one or more outputs from the convolutional neural network.

Type: Application

Filed: December 30, 2014

Publication date: June 30, 2016

Inventors: Du Le Hong Tran, Balamanohar Paluri, Lubomir Bourdev, Robert D. Fergus, Sumit Chopra