Patents by Inventor Deqing Sun

Deqing Sun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11954899
    Abstract: Systems and methods for training models to predict dense correspondences across images such as human images. A model may be trained using synthetic training data created from one or more 3D computer models of a subject. In addition, one or more geodesic distances derived from the surfaces of one or more of the 3D models may be used to generate one or more loss values, which may in turn be used in modifying the model's parameters during training.
    Type: Grant
    Filed: March 11, 2021
    Date of Patent: April 9, 2024
    Assignee: Google LLC
    Inventors: Yinda Zhang, Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Sean Ryan Francesco Fanello, Sofien Bouaziz, Cem Keskin, Ruofei Du, Rohit Kumar Pandey, Deqing Sun
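
A minimal sketch of the geodesic-loss idea in patent 11954899 above, assuming correspondences are predicted as vertex indices on a 3D model and that a vertex-to-vertex geodesic distance table has been precomputed from the model surface. The function name, shapes, and values are hypothetical, not from the patent.

```python
# Hypothetical illustration: scoring predicted dense correspondences by
# the geodesic distance between predicted and ground-truth surface points.
import numpy as np

def geodesic_loss(pred_ids, true_ids, geo_dist):
    """Mean geodesic distance between predicted and ground-truth vertices;
    geo_dist is a precomputed (V, V) geodesic distance matrix."""
    return geo_dist[pred_ids, true_ids].mean()

# Toy 4-vertex surface with made-up geodesic distances.
geo_dist = np.array([[0.0, 1.0, 2.0, 3.0],
                     [1.0, 0.0, 1.0, 2.0],
                     [2.0, 1.0, 0.0, 1.0],
                     [3.0, 2.0, 1.0, 0.0]])
pred = np.array([0, 2, 3])
true = np.array([1, 2, 1])
print(geodesic_loss(pred, true, geo_dist))  # (1.0 + 0.0 + 2.0) / 3 = 1.0
```

A loss value of this form would then drive parameter updates during training, as the abstract describes.
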
  • Publication number: 20240046618
    Abstract: Systems and methods for training models to predict dense correspondences across images such as human images. A model may be trained using synthetic training data created from one or more 3D computer models of a subject. In addition, one or more geodesic distances derived from the surfaces of one or more of the 3D models may be used to generate one or more loss values, which may in turn be used in modifying the model's parameters during training.
    Type: Application
    Filed: March 11, 2021
    Publication date: February 8, 2024
    Inventors: Yinda Zhang, Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Sean Ryan Francesco Fanello, Sofien Bouaziz, Cem Keskin, Ruofei Du, Rohit Kumar Pandey, Deqing Sun
  • Publication number: 20240013497
    Abstract: A computing system and method can be used to render a 3D shape from one or more images. In particular, the present disclosure provides a general pipeline for learning articulated shape reconstruction from images (LASR). The pipeline can reconstruct rigid or nonrigid 3D shapes. Notably, the pipeline can automatically decompose non-rigidly deforming shapes into rigid motions near rigid bones. This pipeline incorporates an analysis-by-synthesis strategy and forward-renders silhouette, optical flow, and color images, which can be compared against the video observations to adjust the internal parameters of the model. By inverting a rendering pipeline and incorporating optical flow, the pipeline can recover a mesh of a 3D model from the one or more images input by a user.
    Type: Application
    Filed: December 21, 2020
    Publication date: January 11, 2024
    Inventors: Deqing Sun, Varun Jampani, Gengshan Yang, Daniel Vlasic, Huiwen Chang, Forrester H. Cole, Ce Liu, William Tafel Freeman
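
The analysis-by-synthesis strategy in publication 20240013497 above can be sketched as a loss that compares forward-rendered silhouette, optical flow, and color against video observations. Everything below is a placeholder under that reading: `render`, the parameter object, and the frame dictionary are invented for illustration and are not the pipeline's actual interfaces.

```python
# Hypothetical analysis-by-synthesis step: forward-render silhouette,
# optical flow, and color from current parameters, then compare against
# the video observations.
import numpy as np

def photometric_l2(a, b):
    return float(np.mean((a - b) ** 2))

def lasr_style_loss(render, params, frame):
    sil, flow, rgb = render(params, frame["time"])    # synthesized outputs
    return (photometric_l2(sil, frame["silhouette"])  # shape vs. mask
            + photometric_l2(flow, frame["flow"])     # motion vs. optical flow
            + photometric_l2(rgb, frame["image"]))    # appearance vs. pixels

# Toy check with a fake renderer that returns blank predictions.
frame = {"time": 0,
         "silhouette": np.zeros((4, 4)),
         "flow": np.zeros((4, 4, 2)),
         "image": np.zeros((4, 4, 3))}
fake_render = lambda p, t: (np.zeros((4, 4)),
                            np.zeros((4, 4, 2)),
                            np.zeros((4, 4, 3)))
print(lasr_style_loss(fake_render, None, frame))  # 0.0 when they match
```

Minimizing such a loss by gradient descent is what "adjust the internal parameters of the model" amounts to in the abstract.
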
  • Patent number: 11790550
    Abstract: A method includes obtaining a first plurality of feature vectors associated with a first image and a second plurality of feature vectors associated with a second image. The method also includes generating a plurality of transformed feature vectors by transforming each respective feature vector of the first plurality of feature vectors by a kernel matrix trained to define an elliptical inner product space. The method additionally includes generating a cost volume by determining, for each respective transformed feature vector of the plurality of transformed feature vectors, a plurality of inner products, wherein each respective inner product of the plurality of inner products is between the respective transformed feature vector and a corresponding candidate feature vector of a corresponding subset of the second plurality of feature vectors. The method further includes determining, based on the cost volume, a pixel correspondence between the first image and the second image.
    Type: Grant
    Filed: July 8, 2020
    Date of Patent: October 17, 2023
    Assignee: Google LLC
    Inventors: Taihong Xiao, Deqing Sun, Ming-Hsuan Yang, Qifei Wang, Jinwei Yuan
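
A minimal sketch of the cost-volume construction in patent 11790550 above, assuming D-dimensional features and C candidate vectors per location; the learned kernel matrix K defines the elliptical inner product <u, v>_K = (Ku) . v. Shapes and names are illustrative, not the patented implementation.

```python
# Kernel-matrix cost volume: transform image-1 features by K, then take
# inner products with each candidate feature from image 2.
import numpy as np

def cost_volume(feats1, feats2, K):
    """feats1: (N, D) vectors from image 1; feats2: (N, C, D) candidate
    vectors from image 2 (C candidates per location); K: (D, D) kernel."""
    transformed = feats1 @ K.T                   # apply K to each vector
    # Inner product of each transformed vector with its candidates.
    return np.einsum("nd,ncd->nc", transformed, feats2)

rng = np.random.default_rng(0)
f1 = rng.normal(size=(5, 8))
f2 = rng.normal(size=(5, 3, 8))
K = np.eye(8)                                    # identity = ordinary dot product
costs = cost_volume(f1, f2, K)                   # (5, 3) matching costs
print(costs.shape)
```

The pixel correspondence would then be read off the cost volume, e.g., by taking the best-scoring candidate per location.
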
  • Patent number: 11636668
    Abstract: A method includes filtering a point cloud transformation of a 3D object to generate a 3D lattice and processing the 3D lattice through a series of bilateral convolution layers (BCLs), each BCL in the series having a lower lattice feature scale than the preceding BCL in the series. The output of each BCL in the series is concatenated to generate an intermediate 3D lattice. Further filtering of the intermediate 3D lattice generates a first prediction of features of the 3D object.
    Type: Grant
    Filed: May 22, 2018
    Date of Patent: April 25, 2023
    Inventors: Varun Jampani, Hang Su, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
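
A structural sketch of the multi-scale arrangement in patent 11636668 above: each stage runs at a coarser lattice feature scale than the one before it, and all stage outputs are concatenated. The `bcl` stand-in below merely rescales features; a real bilateral convolution layer splats features onto a permutohedral lattice, convolves on the lattice, and slices back to the points.

```python
# Structural stand-in for a series of BCLs with decreasing lattice scale.
import numpy as np

def bcl(features, scale):
    return features * (1.0 / scale)  # placeholder for splat-convolve-slice

def bcl_stack(features, scales=(1, 2, 4, 8)):
    outputs, x = [], features
    for s in scales:                 # progressively coarser lattice scales
        x = bcl(x, s)
        outputs.append(x)
    return np.concatenate(outputs, axis=-1)

pts = np.random.default_rng(1).normal(size=(100, 16))  # 100 points, 16 features
print(bcl_stack(pts).shape)          # (100, 64): concatenated stage outputs
```
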
  • Patent number: 11508076
    Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of camera motion and the motion or change in shape of objects in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: November 22, 2022
    Assignee: NVIDIA Corporation
    Inventors: Zhaoyang Lv, Kihwan Kim, Deqing Sun, Alejandro Jose Troccoli, Jan Kautz
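
A hedged sketch of the decomposition in patent 11508076 above: total 3D motion as camera-induced rigid motion plus network-predicted non-rigid scene flow, gated by a dynamic/static mask. All shapes, names, and the additive combination are assumptions for illustration, not the patented formulation.

```python
# Illustrative decomposition: motion explained by the camera plus masked
# non-rigid scene flow.
import numpy as np

def scene_motion_field(points, R, t, residual_flow, dynamic_mask):
    """points: (N, 3) 3D points; (R, t): estimated camera motion;
    residual_flow: (N, 3) predicted non-rigid motion; dynamic_mask: (N,)
    in [0, 1], near 1 where the scene itself moves."""
    rigid = points @ R.T + t - points              # camera-induced motion
    return rigid + dynamic_mask[:, None] * residual_flow

pts = np.zeros((4, 3))
R, t = np.eye(3), np.array([0.0, 0.0, 0.1])        # small camera translation
flow = np.ones((4, 3))
mask = np.array([0.0, 0.0, 1.0, 1.0])              # last two points dynamic
print(scene_motion_field(pts, R, t, flow, mask))
```
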
  • Patent number: 11496773
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. This residual data describes the information that is lost during compression of the original video data. For example, the original video data may be compressed and then decompressed, and the result may be compared to the original video data to determine the residual video data. The residual video data is transformed into a smaller format by encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during decompression of the compressed original video data to improve the quality of the decompressed original video data.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: November 8, 2022
    Assignee: NVIDIA Corporation
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
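
A toy end-to-end sketch of the residual-data idea in patent 11496773 above, with a crude quantizer standing in for the lossy video codec and zlib standing in for the patent's encoding, binarizing, and compressing stages (which the abstract does not specify).

```python
# Toy residual round trip: quantize (lossy "codec"), take the difference,
# ship the separately compressed residual, and add it back on arrival.
import numpy as np
import zlib

def lossy_compress(frame, step=16):
    return (frame // step * step).astype(np.uint8)   # crude quantizer

original = np.arange(64, dtype=np.uint8).reshape(8, 8)
decoded = lossy_compress(original)                   # what the codec delivers
residual = original.astype(np.int16) - decoded       # what the codec lost

payload = zlib.compress(residual.tobytes())          # sent to the destination
restored = decoded + np.frombuffer(
    zlib.decompress(payload), dtype=np.int16).reshape(8, 8)
assert np.array_equal(restored, original)            # quality fully recovered
```
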
  • Publication number: 20220189051
    Abstract: A method includes obtaining a first plurality of feature vectors associated with a first image and a second plurality of feature vectors associated with a second image. The method also includes generating a plurality of transformed feature vectors by transforming each respective feature vector of the first plurality of feature vectors by a kernel matrix trained to define an elliptical inner product space. The method additionally includes generating a cost volume by determining, for each respective transformed feature vector of the plurality of transformed feature vectors, a plurality of inner products, wherein each respective inner product of the plurality of inner products is between the respective transformed feature vector and a corresponding candidate feature vector of a corresponding subset of the second plurality of feature vectors. The method further includes determining, based on the cost volume, a pixel correspondence between the first image and the second image.
    Type: Application
    Filed: July 8, 2020
    Publication date: June 16, 2022
    Inventors: Taihong Xiao, Deqing Sun, Ming-Hsuan Yang, Qifei Wang, Jinwei Yuan
  • Patent number: 11256961
    Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
    Type: Grant
    Filed: July 6, 2020
    Date of Patent: February 22, 2022
    Assignee: NVIDIA Corporation
    Inventors: Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
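
One illustrative way to turn the horizontal and vertical affinity maps from patent 11256961 above into regions: link neighboring pixels whose affinity clears a threshold and take connected components via union-find. The real system derives the maps from a trained PAN model; random values stand in here, and the thresholding scheme is an assumption.

```python
# Illustrative post-processing of per-pixel horizontal/vertical affinities.
import numpy as np

def regions_from_affinity(h_aff, v_aff, thresh=0.5):
    """h_aff[y, x]: affinity of (y, x) with (y, x+1), shape (H, W-1);
    v_aff[y, x]: affinity of (y, x) with (y+1, x), shape (H-1, W)."""
    H, W = v_aff.shape[0] + 1, h_aff.shape[1] + 1
    parent = list(range(H * W))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]            # path halving
            i = parent[i]
        return i
    for y in range(H):
        for x in range(W - 1):
            if h_aff[y, x] > thresh:
                parent[find(y * W + x)] = find(y * W + x + 1)
    for y in range(H - 1):
        for x in range(W):
            if v_aff[y, x] > thresh:
                parent[find(y * W + x)] = find((y + 1) * W + x)
    return np.array([find(i) for i in range(H * W)]).reshape(H, W)

rng = np.random.default_rng(2)
print(regions_from_affinity(rng.random((6, 5)), rng.random((5, 6))))
```
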
  • Publication number: 20210314629
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. This residual data describes the information that is lost during compression of the original video data. For example, the original video data may be compressed and then decompressed, and the result may be compared to the original video data to determine the residual video data. The residual video data is transformed into a smaller format by encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during decompression of the compressed original video data to improve the quality of the decompressed original video data.
    Type: Application
    Filed: June 18, 2021
    Publication date: October 7, 2021
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Patent number: 11082720
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. This residual data describes the information that is lost during compression of the original video data. For example, the original video data may be compressed and then decompressed, and the result may be compared to the original video data to determine the residual video data. The residual video data is transformed into a smaller format by encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during decompression of the compressed original video data to improve the quality of the decompressed original video data.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: August 3, 2021
    Assignee: NVIDIA Corporation
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20210150736
    Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of camera motion and the motion or change in shape of objects in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
    Type: Application
    Filed: January 22, 2021
    Publication date: May 20, 2021
    Inventors: Zhaoyang Lv, Kihwan Kim, Deqing Sun, Alejandro Jose Troccoli, Jan Kautz
  • Patent number: 10986325
    Abstract: Scene flow represents the three-dimensional (3D) structure and movement of objects in a video sequence from frame to frame and is used to track objects and estimate speeds for autonomous driving applications. Scene flow is recovered by a neural network system from a video sequence captured from at least two viewpoints (e.g., cameras), such as the left and right eyes of a viewer. An encoder portion of the system extracts features from frames of the video sequence. The features are input to a first decoder to predict optical flow and a second decoder to predict disparity. The optical flow represents pixel movement in (x,y) and the disparity represents pixel movement in z (depth). When combined, the optical flow and disparity represent the scene flow.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: April 20, 2021
    Assignee: NVIDIA Corporation
    Inventors: Deqing Sun, Varun Jampani, Erik Gundersen Learned-Miller, Huaizu Jiang
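
A sketch of combining the two decoder outputs named in patent 10986325 above: optical flow supplies per-pixel (x, y) motion while the disparity pair yields depth change via the standard pinhole relation depth = f * B / disparity. The focal length, baseline, and the simple concatenation below are assumptions for illustration, not patent text.

```python
# Combine per-pixel optical flow (dx, dy) with disparity-derived depth
# change (dz) into a 3-channel scene-flow field.
import numpy as np

def scene_flow(flow_xy, disp_t, disp_t1, f=720.0, B=0.54):
    depth_t = f * B / disp_t                   # depth at time t
    depth_t1 = f * B / disp_t1                 # depth at time t+1
    dz = depth_t1 - depth_t                    # motion in depth
    return np.concatenate([flow_xy, dz[..., None]], axis=-1)

flow = np.zeros((4, 4, 2))
d0 = np.full((4, 4), 40.0)
d1 = np.full((4, 4), 38.0)                     # objects moved away slightly
print(scene_flow(flow, d0, d1)[0, 0])          # [0. 0. dz]
```
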
  • Publication number: 20210067735
    Abstract: Apparatuses, systems, and techniques to enhance video. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.
    Type: Application
    Filed: September 3, 2019
    Publication date: March 4, 2021
    Inventors: Fitsum Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro
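
Publication 20210067735 above is high-level, so the following is only a naive stand-in for frame-rate upsampling: doubling the frame rate by averaging adjacent frames, where the described system would instead synthesize intermediate frames with one or more neural networks.

```python
# Naive 2x frame-rate upsampling; the average is a placeholder for a
# learned intermediate-frame prediction.
import numpy as np

def double_frame_rate(frames):
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append((a + b) / 2.0)               # network-predicted in practice
    out.append(frames[-1])
    return np.stack(out)

video = np.stack([np.full((2, 2), t, dtype=float) for t in range(4)])
print(double_frame_rate(video).shape)           # (7, 2, 2)
```
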
  • Patent number: 10929987
    Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
    Type: Grant
    Filed: August 1, 2018
    Date of Patent: February 23, 2021
    Assignee: NVIDIA Corporation
    Inventors: Zhaoyang Lv, Kihwan Kim, Deqing Sun, Alejandro Jose Troccoli, Jan Kautz
  • Publication number: 20200334502
    Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
    Type: Application
    Filed: July 6, 2020
    Publication date: October 22, 2020
    Inventors: Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Patent number: 10789678
    Abstract: A superpixel sampling network utilizes a neural network coupled to a differentiable simple linear iterative clustering component to determine pixel-superpixel associations from a set of pixel features output by the neural network. The superpixel sampling network computes updated superpixel centers and final pixel-superpixel associations over a number of iterations.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: September 29, 2020
    Assignee: NVIDIA Corporation
    Inventors: Varun Jampani, Deqing Sun, Ming-Yu Liu, Jan Kautz
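
A minimal soft-SLIC iteration in the spirit of patent 10789678 above, alternating soft pixel-to-superpixel associations with superpixel-center updates. In the patented network the pixel features come from a trained neural network; here they are raw inputs, and the temperature and iteration count are invented.

```python
# Differentiable-SLIC-style loop: soft assignments, then center updates.
import numpy as np

def soft_slic(pixel_feats, centers, iters=5, temp=1.0):
    """pixel_feats: (N, D); centers: (K, D) initial superpixel centers."""
    for _ in range(iters):
        d2 = ((pixel_feats[:, None, :] - centers[None]) ** 2).sum(-1)  # (N, K)
        logits = -d2 / temp
        logits -= logits.max(axis=1, keepdims=True)     # stabilize softmax
        assoc = np.exp(logits)
        assoc /= assoc.sum(axis=1, keepdims=True)       # soft associations
        centers = (assoc.T @ pixel_feats) / assoc.sum(axis=0)[:, None]
    return assoc, centers                               # final assoc, centers

feats = np.random.default_rng(3).normal(size=(50, 5))
assoc, centers = soft_slic(feats, feats[:4].copy())
print(assoc.shape, centers.shape)                       # (50, 4) (4, 5)
```

Because every step is differentiable, gradients can flow back through the clustering into the feature-producing network, which is the point of the design.
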
  • Publication number: 20200294194
    Abstract: A video stitching system combines video from different cameras to form a panoramic video that, in various embodiments, is temporally stable and tolerant to strong parallax. In an embodiment, the system provides a smooth spatial interpolation that can be used to connect the input video images. In an embodiment, the system applies an interpolation layer to slices of the overlapping video sources, and the network learns a dense flow field to smoothly align the input videos with spatial interpolation. Various embodiments are applicable to areas such as virtual reality, immersive telepresence, autonomous driving, and video surveillance.
    Type: Application
    Filed: March 11, 2019
    Publication date: September 17, 2020
    Inventors: Deqing Sun, Orazio Gallo, Jan Kautz, Jinwei Gu, Wei-Sheng Lai
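
A hedged sketch of the flow-based alignment step in publication 20200294194 above: backward-warp one overlapping strip by a dense flow field, then blend it with the other. The nearest-neighbor warp and zero flow below are placeholders for the network-predicted dense flow.

```python
# Warp-and-blend of two overlapping strips from adjacent cameras.
import numpy as np

def warp(image, flow):
    """Backward-warp `image` (H, W) by per-pixel `flow` (H, W, 2)."""
    H, W = image.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_x = np.clip(xs + flow[..., 0], 0, W - 1).round().astype(int)
    src_y = np.clip(ys + flow[..., 1], 0, H - 1).round().astype(int)
    return image[src_y, src_x]

left = np.arange(16.0).reshape(4, 4)
right = left + 1.0
flow = np.zeros((4, 4, 2))                        # learned in the real system
stitched = 0.5 * warp(left, flow) + 0.5 * right   # blend the aligned overlap
print(stitched)
```
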
  • Patent number: 10776688
    Abstract: Video interpolation is used to predict one or more intermediate frames at timesteps defined between two consecutive frames. A first neural network model approximates optical flow data defining motion between the two consecutive frames. A second neural network model refines the optical flow data and predicts visibility maps for each timestep. The two consecutive frames are warped according to the refined optical flow data for each timestep to produce pairs of warped frames for each timestep. The second neural network model then fuses the pair of warped frames based on the visibility maps to produce the intermediate frame for each timestep. Artifacts caused by motion boundaries and occlusions are reduced in the predicted intermediate frames.
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: September 15, 2020
    Assignee: NVIDIA Corporation
    Inventors: Huaizu Jiang, Deqing Sun, Varun Jampani
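
A sketch of the fusion step in patent 10776688 above: the two consecutive frames, warped to an intermediate timestep t, are blended using visibility maps so occluded pixels draw from the frame that actually observes them. The (1-t)/t weighting is a common convention assumed here, not quoted from the patent.

```python
# Visibility-weighted fusion of two frames warped to time t.
import numpy as np

def fuse(warp0, warp1, vis0, vis1, t):
    w0 = (1.0 - t) * vis0                        # trust frame 0 near t = 0
    w1 = t * vis1                                # trust frame 1 near t = 1
    return (w0 * warp0 + w1 * warp1) / (w0 + w1 + 1e-8)

w0 = np.full((2, 2), 10.0)                       # frame 0 warped to time t
w1 = np.full((2, 2), 20.0)                       # frame 1 warped to time t
vis0 = np.array([[1.0, 1.0], [0.0, 1.0]])        # pixel (1,0) occluded in 0
vis1 = np.ones((2, 2))
print(fuse(w0, w1, vis0, vis1, t=0.5))           # occluded pixel takes w1
```
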
  • Patent number: 10748036
    Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: August 18, 2020
    Assignee: NVIDIA Corporation
    Inventors: Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Ming-Hsuan Yang, Jan Kautz