Patents by Inventor Jan Kautz

Jan Kautz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230070514
    Abstract: In order to determine accurate three-dimensional (3D) models for objects within a video, the objects are first identified and tracked within the video, and a pose and shape are estimated for these tracked objects. A translation and global orientation are removed from the tracked objects to determine local motion for the objects, and motion infilling is performed to fill in any missing portions for each object within the video. A global trajectory is then determined for the objects within the video, and the infilled motion and global trajectory are then used to determine infilled global motion for each object within the video. This enables each object to be accurately depicted as a 3D pose sequence that accounts for occlusions and global factors within the video.
    Type: Application
    Filed: January 25, 2022
    Publication date: March 9, 2023
    Inventors: Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz
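
A minimal sketch of the data flow described in publication 20230070514 above: separate each tracked object's local motion from its global trajectory, infill missing (occluded) frames, and recombine the two into global motion. This simplification handles only root translation (not global orientation), substitutes linear interpolation for a learned motion-infilling network, and uses hypothetical function names.

```python
import numpy as np

def split_global_local(poses, root_idx=0):
    """Separate per-frame root translation from root-relative (local) motion.

    poses: (T, J, 3) array of 3D joint positions; NaN marks occluded frames.
    """
    root = poses[:, root_idx:root_idx + 1, :]     # (T, 1, 3) global translation
    local = poses - root                          # root-relative (local) motion
    return root[:, 0, :], local

def infill_linear(seq):
    """Fill NaN frames by linear interpolation along time; a stand-in for a
    learned motion-infilling network."""
    seq = seq.copy()
    T = seq.shape[0]
    flat = seq.reshape(T, -1)
    t = np.arange(T)
    for d in range(flat.shape[1]):
        col = flat[:, d]
        known = ~np.isnan(col)
        flat[:, d] = np.interp(t, t[known], col[known])
    return flat.reshape(seq.shape)

def compose_global_motion(local, trajectory):
    """Re-attach an estimated global trajectory to the infilled local motion."""
    return local + trajectory[:, None, :]

# Toy usage: 5 frames, 2 joints, frame 2 fully occluded.
poses = np.random.rand(5, 2, 3)
poses[2] = np.nan
trajectory, local = split_global_local(poses)
global_motion = compose_global_motion(infill_linear(local), infill_linear(trajectory))
print(global_motion.shape)  # (5, 2, 3)
```
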
  • Patent number: 11594006
    Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: February 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Xitong Yang, Sifei Liu, Jan Kautz
  • Patent number: 11593661
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space produced by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature in proportion to the amount of rotation.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: February 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
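
The augmentation strategy in patent 11593661 above can be illustrated with a toy autoencoder: encode an image, rotate its embedding in the latent space, and decode the rotated code into an additional training image. The tiny fully connected autoencoder, the choice of a single rotation plane, and all names below are assumptions of this sketch, not the patented architecture.

```python
import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    def __init__(self, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128),
                                     nn.ReLU(), nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, 32 * 32))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def rotate_latent(z, angle, dims=(0, 1)):
    """Rotate each latent code by `angle` radians in the plane spanned by two
    latent dimensions; the decoded images then vary a feature in proportion
    to the rotation amount."""
    i, j = dims
    c, s = torch.cos(torch.tensor(angle)), torch.sin(torch.tensor(angle))
    z = z.clone()
    zi, zj = z[:, i].clone(), z[:, j].clone()
    z[:, i] = c * zi - s * zj
    z[:, j] = s * zi + c * zj
    return z

# Derive several augmented images from a single original image.
model = TinyAutoencoder()
image = torch.rand(1, 32 * 32)                 # stand-in for a 32x32 grayscale image
z = model.encoder(image)
augmented = [model.decoder(rotate_latent(z, a)) for a in (0.1, 0.2, 0.3)]
print(len(augmented), augmented[0].shape)      # 3 torch.Size([1, 1024])
```
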
  • Publication number: 20230035306
    Abstract: Apparatuses, systems, and techniques are presented to generate media content.
    Type: Application
    Filed: July 21, 2021
    Publication date: February 2, 2023
    Inventors: Ming-Yu Liu, Koki Nagano, Yeongho Seol, Jose Rafael Valle Gomes da Costa, Jaewoo Seo, Ting-Chun Wang, Arun Mallya, Sameh Khamis, Wei Ping, Rohan Badlani, Kevin Jonathan Shih, Bryan Catanzaro, Simon Yuen, Jan Kautz
  • Publication number: 20230015989
    Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
    Type: Application
    Filed: July 1, 2021
    Publication date: January 19, 2023
    Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
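
As a rough illustration of the refinement step in publication 20230015989 above, the sketch below runs one pass of spatial propagation that smooths a semantic feature map using neighbor affinities, with semantic edge values acting as gates that suppress message passing across boundaries. The 4-neighbor averaging and single iteration are simplifications, not the learnable recurrent message-passing layer of the disclosure.

```python
import torch
import torch.nn.functional as F

def edge_gated_propagation(features, affinity, edges):
    """features: (B, C, H, W); affinity, edges: (B, 1, H, W) with values in [0, 1]."""
    gate = affinity * (1.0 - edges)            # strong edges block propagation
    # Average of the four spatial neighbors, computed with a fixed depthwise kernel.
    kernel = features.new_tensor([[0., 1., 0.],
                                  [1., 0., 1.],
                                  [0., 1., 0.]]).view(1, 1, 3, 3) / 4.0
    kernel = kernel.expand(features.shape[1], 1, 3, 3).contiguous()
    neighbor_avg = F.conv2d(F.pad(features, (1, 1, 1, 1), mode='replicate'),
                            kernel, groups=features.shape[1])
    # Gated blend of each pixel's own feature and its neighbors' message.
    return (1.0 - gate) * features + gate * neighbor_avg

B, C, H, W = 1, 8, 64, 64
feats = torch.rand(B, C, H, W)
aff = torch.rand(B, 1, H, W)
edge = torch.rand(B, 1, H, W)
refined = edge_gated_propagation(feats, aff, edge)
print(refined.shape)  # torch.Size([1, 8, 64, 64])
```
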
  • Publication number: 20230004760
    Abstract: Apparatuses, systems, and techniques to identify objects within an image using self-supervised machine learning. In at least one embodiment, a machine learning system is trained to recognize objects by training a first network to recognize objects within images that are generated by a second network. In at least one embodiment, the second network is a controllable network.
    Type: Application
    Filed: June 28, 2021
    Publication date: January 5, 2023
    Inventors: Siva Karthik Mustikovela, Shalini De Mello, Aayush Prakash, Umar Iqbal, Sifei Liu, Jan Kautz
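
A hedged sketch of the training setup in publication 20230004760 above: a controllable generator produces images from a random code plus a known control signal, and a recognition network is trained to recover that control signal from the generated images, so no manual labels are needed. The architectures, the single scalar control, and the MSE objective are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ControllableGenerator(nn.Module):
    """Maps a random code plus a controllable attribute (e.g. a pose angle)
    to an image-like tensor."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(17, 128), nn.ReLU(), nn.Linear(128, 64 * 64))

    def forward(self, noise, control):
        return self.net(torch.cat([noise, control], dim=1)).view(-1, 1, 64, 64)

class Recognizer(nn.Module):
    """Predicts the controlled attribute from the generated image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128),
                                 nn.ReLU(), nn.Linear(128, 1))

    def forward(self, image):
        return self.net(image)

generator = ControllableGenerator()          # assumed pre-trained and frozen
recognizer = Recognizer()
optimizer = torch.optim.Adam(recognizer.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(10):                       # toy training loop
    noise = torch.randn(8, 16)
    control = torch.rand(8, 1)               # known control signal = free label
    with torch.no_grad():
        images = generator(noise, control)
    pred = recognizer(images)
    loss = loss_fn(pred, control)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(float(loss))
```
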
  • Patent number: 11546568
    Abstract: Apparatuses, systems, and techniques are presented to perform monocular view synthesis of a dynamic scene. Single- and multi-view depth information can be determined for a collection of images of a dynamic scene, and a blender network can be used to combine image features for foreground, background, and missing image regions using fused depth maps inferred from the single- and multi-view depth information.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: January 3, 2023
    Assignee: NVIDIA Corporation
    Inventors: Jae Shin Yoon, Jan Kautz, Kihwan Kim
  • Publication number: 20220405583
    Abstract: One embodiment of the present invention sets forth a technique for training a generative model. The technique includes converting a first data point included in a training dataset into a first set of values associated with a base distribution for a score-based generative model. The technique also includes performing one or more denoising operations via the score-based generative model to convert the first set of values into a first set of latent variable values associated with a latent space. The technique further includes performing one or more additional operations to convert the first set of latent variable values into a second data point. Finally, the technique includes computing one or more losses based on the first data point and the second data point and generating a trained generative model based on the one or more losses, wherein the trained generative model includes the score-based generative model.
    Type: Application
    Filed: February 25, 2022
    Publication date: December 22, 2022
    Inventors: Arash Vahdat, Karsten Kreis, Jan Kautz
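
For context, the sketch below shows a generic denoising-score-matching training step for a small score-based model over 2-D latent points. It illustrates the kind of model being trained, not the specific round-trip conversion and loss computation claimed in publication 20220405583 above; the noise schedule and network are placeholder choices.

```python
import torch
import torch.nn as nn

score_net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(score_net.parameters(), lr=1e-3)

def training_step(x0):
    """x0: (B, 2) latent-space data points."""
    t = torch.rand(x0.shape[0], 1)                    # diffusion time in (0, 1)
    sigma = 0.01 + t * (1.0 - 0.01)                   # simple linear noise schedule
    noise = torch.randn_like(x0)
    xt = x0 + sigma * noise                           # perturbed sample
    pred_noise = score_net(torch.cat([xt, t], dim=1)) # predict the added noise
    loss = ((pred_noise - noise) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)

data = torch.randn(256, 2)                            # toy 2-D "latent" dataset
for epoch in range(5):
    loss = training_step(data)
print(loss)
```
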
  • Publication number: 20220398697
    Abstract: One embodiment of the present invention sets forth a technique for generating data. The technique includes sampling from a first distribution associated with a score-based generative model to generate a first set of values. The technique also includes performing one or more denoising operations via the score-based generative model to convert the first set of values into a first set of latent variable values associated with a latent space. The technique further includes converting the first set of latent variable values into a generative output.
    Type: Application
    Filed: February 25, 2022
    Publication date: December 15, 2022
    Inventors: Arash Vahdat, Karsten Kreis, Jan Kautz
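
A companion sketch for publication 20220398697 above: values are sampled from the base (Gaussian) distribution, iteratively denoised with a noise-prediction network into latent variable values, and then converted into an output by a decoder. The Euler-style update, the untrained placeholder networks, and all dimensions are assumptions; this is not the claimed sampler.

```python
import torch
import torch.nn as nn

score_net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
decoder = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 784))

@torch.no_grad()
def sample(n, steps=50):
    x = torch.randn(n, 2)                              # base-distribution values
    for i in reversed(range(steps)):
        t = torch.full((n, 1), (i + 1) / steps)
        sigma = 0.01 + t * (1.0 - 0.01)
        pred_noise = score_net(torch.cat([x, t], dim=1))
        x = x - sigma * pred_noise * (1.0 / steps)     # crude denoising update
        x = x + (0.1 / steps) ** 0.5 * torch.randn_like(x)  # small stochastic kick
    return decoder(x)                                  # latent values -> output

samples = sample(4)
print(samples.shape)  # torch.Size([4, 784])
```
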
  • Publication number: 20220396289
    Abstract: Apparatuses, systems, and techniques to calculate a plurality of paths that an autonomous device is to traverse. In at least one embodiment, a plurality of paths are calculated using one or more neural networks based, at least in part, on one or more distance values output by the one or more neural networks.
    Type: Application
    Filed: June 15, 2021
    Publication date: December 15, 2022
    Inventors: Xueting Li, Sifei Liu, Shalini De Mello, Jan Kautz
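
One way to read publication 20220396289 above is as planning over network-predicted distance values. In the hypothetical sketch below, predict_distance stands in for the neural network, and the planner greedily steps toward the candidate position with the lowest predicted distance.

```python
import numpy as np

GOAL = np.array([9.0, 9.0])

def predict_distance(positions):
    """Placeholder for a trained network that outputs one distance value per
    candidate position (here: the true Euclidean distance to the goal)."""
    return np.linalg.norm(positions - GOAL, axis=1)

def plan_path(start, step=0.5, max_steps=100):
    path = [np.asarray(start, dtype=float)]
    moves = np.array([[step, 0], [-step, 0], [0, step], [0, -step]])
    for _ in range(max_steps):
        candidates = path[-1] + moves
        dists = predict_distance(candidates)
        best = candidates[int(np.argmin(dists))]   # greedy step toward lowest distance
        path.append(best)
        if np.linalg.norm(best - GOAL) < step:
            break
    return np.stack(path)

path = plan_path([0.0, 0.0])
print(path.shape, path[-1])
```
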
  • Publication number: 20220391781
    Abstract: A method performed by a server is provided. The method comprises sending copies of a set of parameters of a hyper network (HN) to at least one client device, receiving from each client device in the at least one client device, a corresponding set of updated parameters of the HN, and determining a next set of parameters of the HN based on the corresponding sets of updated parameters received from the at least one client device. Each client device generates the corresponding set of updated parameters based on a local model architecture of the client device.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 8, 2022
    Inventors: Or Litany, Haggai Maron, David Jesus Acuna Marrero, Jan Kautz, Sanja Fidler, Gal Chechik
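
The server-side loop described in publication 20220391781 above can be sketched as follows: send the hypernetwork (HN) parameters to clients, let each client return an updated set, and aggregate the returned sets into the next parameters. Client-side training is abstracted into a perturbation here, and simple averaging is only one possible aggregation rule.

```python
import torch
import torch.nn as nn

hypernetwork = nn.Linear(8, 32)                       # toy HN parameterization

def client_update(server_params):
    """Stand-in for a client: copy the received HN parameters and perturb them
    as if a few local training steps had been taken against the client's own
    local model architecture."""
    updated = {name: t.detach().clone() for name, t in server_params.items()}
    for tensor in updated.values():
        tensor += 0.01 * torch.randn_like(tensor)
    return updated

def federated_round(hn, num_clients=4):
    sent = hn.state_dict()                            # server -> clients
    client_params = [client_update(sent) for _ in range(num_clients)]
    averaged = {name: torch.stack([cp[name] for cp in client_params]).mean(dim=0)
                for name in sent}                     # clients -> server, aggregate
    hn.load_state_dict(averaged)                      # next set of HN parameters

for round_idx in range(3):
    federated_round(hypernetwork)
print(hypernetwork.weight.shape)
```
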
  • Patent number: 11514293
    Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: November 29, 2022
    Assignee: NVIDIA Corporation
    Inventors: Ruben Villegas, Alejandro Troccoli, Iuri Frosio, Stephen Tyree, Wonmin Byeon, Jan Kautz
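
A highly simplified sketch of the prediction pipeline in patent 11514293 above: an LSTM encodes each object's history into a state feature, a bi-directional LSTM over the set of observed objects produces a spatial feature, and both feed heads that predict a maneuver class and future (x, y) positions. Dimensions, the object ordering, and the output heads are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryPredictor(nn.Module):
    def __init__(self, hidden=32, maneuvers=3, horizon=5):
        super().__init__()
        self.state_encoder = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.spatial_encoder = nn.LSTM(input_size=hidden, hidden_size=hidden,
                                       batch_first=True, bidirectional=True)
        self.maneuver_head = nn.Linear(3 * hidden, maneuvers)
        self.future_head = nn.Linear(3 * hidden, horizon * 2)
        self.horizon = horizon

    def forward(self, histories):
        """histories: (num_objects, T, 2) past (x, y) positions per object."""
        _, (h, _) = self.state_encoder(histories)
        state_feat = h[-1]                                   # (N, hidden) per object
        spatial_seq, _ = self.spatial_encoder(state_feat.unsqueeze(0))
        spatial_feat = spatial_seq.squeeze(0)                # (N, 2 * hidden)
        joint = torch.cat([state_feat, spatial_feat], dim=1)
        maneuver_logits = self.maneuver_head(joint)
        future = self.future_head(joint).view(-1, self.horizon, 2)
        return maneuver_logits, future

model = TrajectoryPredictor()
past = torch.rand(6, 10, 2)                                  # 6 objects, 10 past steps
logits, future_xy = model(past)
print(logits.shape, future_xy.shape)  # (6, 3) (6, 5, 2)
```
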
  • Patent number: 11506888
    Abstract: A gaze tracking system for use by the driver of a vehicle includes an opaque frame circumferentially enclosing a transparent field of view of the driver, light-emitting diodes coupled to the opaque frame for emitting infrared light onto various regions of the driver's eye gazing through the transparent field of view, and diodes for sensing the intensity of infrared light reflected off various regions of the driver's eye.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: November 22, 2022
    Assignee: NVIDIA CORP.
    Inventors: Eric Whitmire, Kaan Aksit, Michael Stengel, Jan Kautz, David Luebke, Ben Boudaoud
  • Patent number: 11508076
    Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: November 22, 2022
    Assignee: NVIDIA Corporation
    Inventors: Zhaoyang Lv, Kihwan Kim, Deqing Sun, Alejandro Jose Troccoli, Jan Kautz
  • Patent number: 11496773
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. Residual video data describes the data that is lost during compression of the original video data. For example, the original video data may be compressed and then decompressed, and this result may be compared to the original video data to determine the residual video data. This residual video data is transformed into a smaller format by means of encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during the decompression of the compressed original video data to improve a quality of the decompressed original video data.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: November 8, 2022
    Assignee: NVIDIA Corporation
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
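
The residual pipeline of patent 11496773 above can be mimicked end to end with stand-in codecs: coarse quantization plays the role of the lossy video codec and zlib plays the role of the residual's lossless coding. The point of the sketch is the data flow (compress, diff, transmit the residual, add it back at the destination), not the specific codecs.

```python
import zlib
import numpy as np

def lossy_compress(frame, levels=16):
    """Simulated lossy codec: coarse quantization of pixel values."""
    return np.round(frame * (levels - 1)) / (levels - 1)

def encode_residual(residual):
    """Quantize the residual to 8 bits and compress it losslessly."""
    quantized = np.clip(np.round(residual * 127), -127, 127).astype(np.int8)
    return zlib.compress(quantized.tobytes()), quantized.shape

def decode_residual(payload, shape):
    quantized = np.frombuffer(zlib.decompress(payload), dtype=np.int8).reshape(shape)
    return quantized.astype(np.float32) / 127.0

original = np.random.rand(64, 64).astype(np.float32)   # one toy frame in [0, 1]
decoded = lossy_compress(original)                      # what the receiver would see
residual = original - decoded                           # information lost by the codec

payload, shape = encode_residual(residual)              # sent alongside the video
restored = decoded + decode_residual(payload, shape)    # receiver-side correction

print(np.abs(original - decoded).mean(), np.abs(original - restored).mean())
```
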
  • Patent number: 11488418
    Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
    Type: Grant
    Filed: December 28, 2020
    Date of Patent: November 1, 2022
    Assignee: NVIDIA Corporation
    Inventors: Umar Iqbal, Pavlo Molchanov, Thomas Michael Breuel, Jan Kautz
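
An illustrative sketch related to patent 11488418 above: a small network assigns a depth value to each detected 2D keypoint, and the (x, y) coordinates plus predicted depths form a 3D pose estimate from a monocular image. The MLP architecture and keypoint count are assumptions, not the patented design.

```python
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21   # e.g. a hand skeleton

class DepthHead(nn.Module):
    def __init__(self, num_keypoints=NUM_KEYPOINTS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_keypoints * 2, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, num_keypoints),       # one depth value per keypoint
        )

    def forward(self, keypoints_2d):
        """keypoints_2d: (B, K, 2) normalized image coordinates."""
        depth = self.net(keypoints_2d.flatten(1))                      # (B, K)
        return torch.cat([keypoints_2d, depth.unsqueeze(-1)], dim=-1)  # (B, K, 3)

model = DepthHead()
kp2d = torch.rand(4, NUM_KEYPOINTS, 2)
pose_3d = model(kp2d)
print(pose_3d.shape)  # torch.Size([4, 21, 3])
```
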
  • Publication number: 20220335672
    Abstract: One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.
    Type: Application
    Filed: January 26, 2022
    Publication date: October 20, 2022
    Inventors: Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Jan Kautz
  • Publication number: 20220284232
    Abstract: Apparatuses, systems, and techniques to identify one or more images used to train one or more neural networks. In at least one embodiment, one or more images used to train one or more neural networks are identified, based on, for example, one or more labels of one or more objects within the one or more images.
    Type: Application
    Filed: March 1, 2021
    Publication date: September 8, 2022
    Inventors: Hongxu Yin, Arun Mallya, Arash Vahdat, Jose Manuel Alvarez Lopez, Jan Kautz, Pavlo Molchanov
  • Publication number: 20220270318
    Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object construction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.
    Type: Application
    Filed: May 2, 2022
    Publication date: August 25, 2022
    Inventors: Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Jan Kautz
  • Patent number: 11417011
    Abstract: Learning to estimate a 3D body pose, and likewise the pose of any type of object, from a single 2D image is of great interest for many practical graphics applications and generally relies on neural networks that have been trained with sample data which annotates (labels) each sample 2D image with a known 3D pose. Requiring this labeled training data, however, has various drawbacks, including, for example, that traditionally used training data sets lack diversity and therefore limit the extent to which neural networks are able to estimate 3D pose. Expanding these training data sets is also difficult, since it requires manually provided annotations for 2D images, which is time consuming and prone to errors. The present disclosure overcomes these and other limitations of existing techniques by providing a model that is trained from unlabeled multi-view data for use in 3D pose estimation.
    Type: Grant
    Filed: June 9, 2020
    Date of Patent: August 16, 2022
    Assignee: NVIDIA Corporation
    Inventors: Umar Iqbal, Pavlo Molchanov, Jan Kautz
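
Finally, a hedged sketch of the kind of multi-view consistency objective that makes training from unlabeled multi-view data possible (cf. patent 11417011 above): a shared network predicts a 3D pose from each of two calibrated views, and the loss penalizes disagreement after mapping one prediction into the other camera's frame. The known relative rotation and the plain L2 penalty are assumptions of this sketch, not the patented formulation.

```python
import torch
import torch.nn as nn

NUM_JOINTS = 17
pose_net = nn.Sequential(nn.Flatten(), nn.Linear(NUM_JOINTS * 2, 256), nn.ReLU(),
                         nn.Linear(256, NUM_JOINTS * 3))

def predict_pose(keypoints_2d):
    """keypoints_2d: (B, J, 2) detections in one view -> (B, J, 3) pose."""
    return pose_net(keypoints_2d).view(-1, NUM_JOINTS, 3)

def multiview_consistency_loss(kp_view_a, kp_view_b, rotation_ab):
    pose_a = predict_pose(kp_view_a)
    pose_b = predict_pose(kp_view_b)
    pose_a_in_b = pose_a @ rotation_ab.T          # map view-A prediction into view B
    return ((pose_a_in_b - pose_b) ** 2).mean()

# Toy batch: random detections and a 90-degree relative rotation about z.
kp_a = torch.rand(8, NUM_JOINTS, 2)
kp_b = torch.rand(8, NUM_JOINTS, 2)
R = torch.tensor([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
loss = multiview_consistency_loss(kp_a, kp_b, R)
loss.backward()                                    # gradients flow to pose_net
print(float(loss))
```
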