Patents by Inventor Jan Kautz

Jan Kautz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11960570
    Abstract: A multi-level contrastive training strategy for training a neural network relies on image pairs (no other labels) to learn semantic correspondences at the image level and region or pixel level. The neural network is trained using contrasting image pairs including different objects and corresponding image pairs including different views of the same object. Conceptually, contrastive training pulls corresponding image pairs closer and pushes contrasting image pairs apart. An image-level contrastive loss is computed from the outputs (predictions) of the neural network and used to update parameters (weights) of the neural network via backpropagation. The neural network is also trained via pixel-level contrastive learning using only image pairs. Pixel-level contrastive learning receives an image pair, where each image includes an object in a particular category. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: August 25, 2021
    Date of Patent: April 16, 2024
    Assignee: NVIDIA Corporation
    Inventors: Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz
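    Code sketch: A minimal, hedged illustration of the image-level contrastive idea described above, using an InfoNCE-style loss; the function name, embedding sizes, and temperature are assumptions for illustration, not taken from the patent.

      import torch
      import torch.nn.functional as F

      def image_level_contrastive_loss(z_a, z_b, temperature=0.07):
          """z_a, z_b: (N, D) embeddings of two views; row i of z_a and z_b
          form a corresponding pair, all other rows are contrasting pairs."""
          z_a = F.normalize(z_a, dim=1)
          z_b = F.normalize(z_b, dim=1)
          logits = z_a @ z_b.t() / temperature      # (N, N) similarity matrix
          targets = torch.arange(z_a.size(0), device=z_a.device)
          # Cross entropy pulls diagonal (corresponding) pairs together and
          # pushes off-diagonal (contrasting) pairs apart.
          return F.cross_entropy(logits, targets)

      loss = image_level_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))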
  • Publication number: 20240119361
    Abstract: One embodiment of a method for training a first machine learning model having a different architecture than a second machine learning model includes receiving a first data set, performing one or more operations to generate a second data set based on the first data set and the second machine learning model, wherein the second data set includes at least one feature associated with one or more tasks that the second machine learning model was previously trained to perform, and performing one or more operations to train the first machine learning model based on the second data set and the second machine learning model. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: July 6, 2023
    Publication date: April 11, 2024
    Inventors: Hongxu YIN, Wonmin BYEON, Jan KAUTZ, Divyam MADAAN, Pavlo MOLCHANOV
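    Code sketch: A minimal sketch of the cross-architecture training idea above, assuming a knowledge-distillation setup in which the pretrained second model (teacher) generates a second data set of soft targets for the first model (student). All networks, sizes, and the temperature are illustrative assumptions.

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      teacher = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))  # pretrained stand-in
      student = nn.Sequential(nn.Flatten(), nn.Linear(784, 256),
                              nn.ReLU(), nn.Linear(256, 10))     # different architecture
      opt = torch.optim.Adam(student.parameters(), lr=1e-3)

      x = torch.randn(32, 1, 28, 28)                # a batch from the first data set
      with torch.no_grad():                         # teacher generates the second data set
          soft_targets = F.softmax(teacher(x) / 2.0, dim=1)

      log_probs = F.log_softmax(student(x) / 2.0, dim=1)
      loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")
      loss.backward()                               # student learns the teacher's task
      opt.step()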
  • Patent number: 11948078
    Abstract: The disclosure provides a framework or system for learning visual representation using a large set of image/text pairs. The disclosure provides, for example, a method of visual representation learning, a joint representation learning system, and an artificial intelligence (AI) system that employs one or more of the trained models from the method or system. The AI system can be used, for example, in autonomous or semi-autonomous vehicles. In one example, the method of visual representation learning includes: (1) receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model, and (2) training, employing mutual information, a critic function by learning relationships between the set of image embeddings and the set of text embeddings. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: August 21, 2020
    Date of Patent: April 2, 2024
    Assignee: NVIDIA Corporation
    Inventors: Arash Vahdat, Tanmay Gupta, Xiaodong Yang, Jan Kautz
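    Code sketch: A minimal sketch of training a critic on paired image and text embeddings with an InfoNCE-style mutual-information bound; the bilinear critic form and all dimensions are assumptions, not the patent's exact design.

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class BilinearCritic(nn.Module):
          def __init__(self, img_dim, txt_dim):
              super().__init__()
              self.W = nn.Parameter(torch.randn(img_dim, txt_dim) * 0.01)
          def forward(self, img_emb, txt_emb):
              return img_emb @ self.W @ txt_emb.t()  # (N, N) critic scores

      critic = BilinearCritic(512, 256)
      opt = torch.optim.Adam(critic.parameters(), lr=1e-4)

      img_emb, txt_emb = torch.randn(16, 512), torch.randn(16, 256)  # paired rows
      scores = critic(img_emb, txt_emb)
      # InfoNCE: matched image/text pairs (the diagonal) should outscore
      # mismatches, which maximizes a lower bound on mutual information.
      loss = F.cross_entropy(scores, torch.arange(16))
      loss.backward()
      opt.step()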
  • Patent number: 11941719
    Abstract: Various embodiments enable a robot, or other autonomous or semi-autonomous device or system, to receive data involving the performance of a task in the physical world. The data can be provided as input to a perception network to infer a set of percepts about the task, which can correspond to relationships between objects observed during the performance. The percepts can be provided as input to a plan generation network, which can infer a set of actions as part of a plan. Each action can correspond to one of the observed relationships. The plan can be reviewed and any corrections made, either manually or through another demonstration of the task. Once the plan is verified as correct, the plan (and any related data) can be provided as input to an execution network that can infer instructions to cause the robot, and/or another robot, to perform the task.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: March 26, 2024
    Assignee: NVIDIA Corporation
    Inventors: Jonathan Tremblay, Stan Birchfield, Stephen Tyree, Thang To, Jan Kautz, Artem Molchanov
  • Publication number: 20240096115
    Abstract: Landmark detection refers to the detection of landmarks within an image or a video, and is used in many computer vision tasks such as emotion recognition, face identity verification, hand tracking, gesture recognition, and eye gaze tracking. Current landmark detection methods rely on a cascaded computation through cascaded networks or an ensemble of multiple models, which starts with an initial guess of the landmarks and iteratively produces corrected landmarks which match the input more finely. However, the iterations required by current methods typically increase the training memory cost linearly, and do not have an obvious stopping criterion. Moreover, these methods tend to exhibit jitter in landmark detection results for video. The present disclosure improves current landmark detection methods by providing landmark detection using an iterative neural network. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: September 7, 2023
    Publication date: March 21, 2024
    Inventors: Pavlo Molchanov, Jan Kautz, Arash Vahdat, Hongxu Yin, Paul Micaelli
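    Code sketch: A hedged sketch of the iterative refinement idea, assuming a single weight-shared network that repeatedly predicts corrections to the current landmark estimate; the architecture and iteration count are illustrative, not the publication's design.

      import torch
      import torch.nn as nn

      class IterativeLandmarkRefiner(nn.Module):
          def __init__(self, feat_dim=128, n_landmarks=68):
              super().__init__()
              self.step = nn.Sequential(
                  nn.Linear(feat_dim + n_landmarks * 2, 256), nn.ReLU(),
                  nn.Linear(256, n_landmarks * 2))
          def forward(self, image_feat, init_landmarks, n_iters=4):
              lm = init_landmarks                    # (B, n_landmarks, 2)
              for _ in range(n_iters):               # the same weights every pass
                  inp = torch.cat([image_feat, lm.flatten(1)], dim=1)
                  lm = lm + self.step(inp).view_as(lm)  # predict a correction
              return lm

      refiner = IterativeLandmarkRefiner()
      landmarks = refiner(torch.randn(2, 128), torch.zeros(2, 68, 2))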
  • Publication number: 20240070874
    Abstract: Estimating motion of a human or other object in video is a common computer vision task with applications in robotics, sports, mixed reality, etc. However, motion estimation becomes difficult when the camera capturing the video is moving, because the observed object and camera motions are entangled. The present disclosure provides for joint estimation of the motion of a camera and the motion of articulated objects captured in video by the camera.
    Type: Application
    Filed: April 17, 2023
    Publication date: February 29, 2024
    Inventors: Muhammed Kocabas, Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz
  • Publication number: 20240070987
    Abstract: Transferring pose to three-dimensional characters is a common computer graphics task that typically involves transferring the pose of a reference avatar to a (stylized) three-dimensional character. Because three-dimensional characters are created by professional artists through imagination and exaggeration, and therefore, unlike human or animal avatars, have distinct shapes and features, matching the pose of a three-dimensional character to that of a reference avatar generally requires manually creating the shape information for the three-dimensional character that is needed for pose transfer. The present disclosure provides for the automated transfer of a reference pose to a three-dimensional character, based specifically on a learned shape code for the three-dimensional character.
    Type: Application
    Filed: February 15, 2023
    Publication date: February 29, 2024
    Inventors: Xueting Li, Sifei Liu, Shalini De Mello, Orazio Gallo, Jiashun Wang, Jan Kautz
  • Patent number: 11907846
    Abstract: One embodiment of the present invention sets forth a technique for performing spatial propagation. The technique includes generating a first directed acyclic graph (DAG) by connecting spatially adjacent points included in a set of unstructured points via directed edges along a first direction. The technique also includes applying a first set of neural network layers to one or more images associated with the set of unstructured points to generate (i) a set of features for the set of unstructured points and (ii) a set of pairwise affinities between the spatially adjacent points connected by the directed edges. The technique further includes generating a set of labels for the set of unstructured points by propagating the set of features across the first DAG based on the set of pairwise affinities. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: September 10, 2020
    Date of Patent: February 20, 2024
    Assignee: NVIDIA Corporation
    Inventors: Sifei Liu, Shalini De Mello, Varun Jampani, Jan Kautz, Xueting Li
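    Code sketch: A hedged sketch of propagating features through a DAG of unstructured points: points are visited in topological order and each point blends its feature with affinity-weighted features of its parents. The data layout and blending rule are illustrative assumptions.

      import numpy as np

      def propagate(features, parents, affinity, order):
          """features: (N, C); parents[j]: points with an edge into j;
          affinity[(i, j)] in [0, 1]; order: a topological order of the DAG."""
          out = features.copy()
          for j in order:
              for i in parents.get(j, []):
                  a = affinity[(i, j)]
                  out[j] = (1 - a) * out[j] + a * out[i]  # affinity-weighted blend
          return out

      feats = np.random.randn(4, 8)                       # 4 unstructured points
      parents = {1: [0], 2: [1], 3: [1, 2]}               # directed, acyclic edges
      aff = {(0, 1): 0.8, (1, 2): 0.5, (1, 3): 0.3, (2, 3): 0.6}
      labels = propagate(feats, parents, aff, order=[0, 1, 2, 3])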
  • Publication number: 20240054720
    Abstract: Systems and methods generate a hybrid lighting model for rendering objects within an image. The hybrid lighting model includes lighting effects attributed to a first source, such as the sun, and to a second source, such as spatially-varying effects of objects within the image. The hybrid lighting model may be generated for an input image and then one or more virtual objects may be rendered to appear as if part of the input image, where the hybrid lighting model is used to apply one or more lighting effects to the one or more virtual objects. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: August 11, 2022
    Publication date: February 15, 2024
    Inventors: Sanja Fidler, Zian Wang, Jan Kautz, Wenzheng Chen
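    Code sketch: A minimal sketch of the hybrid lighting idea, assuming simple Lambertian shading that sums a directional sun term and a spatially-varying term sampled from a per-pixel irradiance map; the names and the shading model are assumptions for illustration.

      import numpy as np

      def shade(albedo, normal, sun_dir, sun_color, irradiance_map, px, py):
          """albedo, sun_color: (3,) RGB; normal, sun_dir: (3,) unit vectors;
          irradiance_map: (H, W, 3) spatially-varying light from scene objects."""
          sun_term = sun_color * max(np.dot(normal, sun_dir), 0.0)  # first source: sun
          local_term = irradiance_map[py, px]                       # second source: scene
          return albedo * (sun_term + local_term)

      irradiance = np.full((480, 640, 3), 0.2)       # dim, uniform scene lighting
      rgb = shade(np.array([0.9, 0.5, 0.3]),         # virtual object's albedo
                  np.array([0.0, 1.0, 0.0]),         # surface normal (up)
                  np.array([0.0, 0.7071, 0.7071]),   # sun direction
                  np.array([1.0, 0.95, 0.9]),        # sun color
                  irradiance, px=320, py=240)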
  • Patent number: 11880927
    Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object reconstruction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.
    Type: Grant
    Filed: May 19, 2023
    Date of Patent: January 23, 2024
    Assignee: NVIDIA Corporation
    Inventors: Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Jan Kautz
  • Publication number: 20240020897
    Abstract: Apparatuses, systems, and techniques are presented to generate image data. In at least one embodiment, one or more neural networks are used to cause a lighting effect to be applied to one or more objects within one or more images based, at least in part, on synthetically generated images of the one or more objects.
    Type: Application
    Filed: July 12, 2022
    Publication date: January 18, 2024
    Inventors: Ting-Chun Wang, Ming-Yu Liu, Koki Nagano, Sameh Khamis, Jan Kautz
  • Publication number: 20230394781
    Abstract: Vision transformers are deep learning models that employ a self-attention mechanism to obtain feature representations for an input image. To date, the configuration of vision transformers has limited the self-attention computation to a local window of the input image, such that only short-range dependencies are modeled in the output. The present disclosure provides a vision transformer that captures global context, and that is therefore able to model long-range dependencies in its output. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: December 16, 2022
    Publication date: December 7, 2023
    Applicant: NVIDIA Corporation
    Inventors: Ali Hatamizadeh, Hongxu Yin, Jan Kautz, Pavlo Molchanov
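    Code sketch: A hedged, generic illustration of capturing global context: a set of learned global tokens attends over every patch token, so long-range dependencies can be modeled regardless of window boundaries. This is an illustrative stand-in, not the publication's exact attention design.

      import torch
      import torch.nn as nn

      class GlobalContextAttention(nn.Module):
          def __init__(self, dim=96, n_heads=4, n_global=8):
              super().__init__()
              self.global_tokens = nn.Parameter(torch.randn(1, n_global, dim))
              self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
          def forward(self, patch_tokens):          # (B, n_patches, dim)
              g = self.global_tokens.expand(patch_tokens.size(0), -1, -1)
              # Global tokens query every patch, regardless of window location,
              # so the output can encode long-range dependencies.
              ctx, _ = self.attn(query=g, key=patch_tokens, value=patch_tokens)
              return ctx                            # (B, n_global, dim)

      ctx = GlobalContextAttention()(torch.randn(2, 196, 96))  # 14x14 patches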
  • Publication number: 20230368501
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: February 24, 2023
    Publication date: November 16, 2023
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
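    Code sketch: A minimal sketch of deriving additional training images by rotating an image's embedding in a latent space and decoding the result; the toy two-dimensional latent space and stand-in autoencoder are assumptions for illustration.

      import torch
      import torch.nn as nn

      encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 2))  # toy autoencoder halves
      decoder = nn.Sequential(nn.Linear(2, 784))

      def rotate_latent(z, angle):
          c, s = torch.cos(angle), torch.sin(angle)
          R = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])
          return z @ R.t()                          # rotate in the latent plane

      x = torch.randn(1, 1, 28, 28)                 # one original image
      z = encoder(x)
      # Decoding rotated embeddings yields additional images whose feature
      # change grows with the rotation angle.
      augmented = [decoder(rotate_latent(z, torch.tensor(a)))
                   for a in (0.1, 0.2, 0.3)]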
  • Patent number: 11790633
    Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: July 1, 2021
    Date of Patent: October 17, 2023
    Assignee: NVIDIA Corporation
    Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
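    Code sketch: A hedged, one-dimensional illustration of edge-gated smoothing: affinity values control how strongly neighboring features mix, and semantic edge values gate the message path shut across boundaries. The scanline formulation is an illustrative simplification of the disclosed propagation.

      import numpy as np

      def edge_gated_smooth(feat, affinity, edge):
          """feat: (W, C) features along one scanline; affinity, edge: (W,)
          in [0, 1], from the affinity map and semantic edge map."""
          out = feat.copy()
          for x in range(1, feat.shape[0]):
              gate = affinity[x] * (1.0 - edge[x])  # an edge shuts the message path
              out[x] = (1 - gate) * out[x] + gate * out[x - 1]
          return out

      feat = np.random.randn(6, 4)
      affinity = np.full(6, 0.7)
      edge = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0])  # semantic edge at x = 3
      refined = edge_gated_smooth(feat, affinity, edge)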
  • Publication number: 20230316458
    Abstract: In various examples, dynamic seam placement is used to position seams in regions of overlapping image data to avoid crossing salient objects or regions. Objects may be detected from image frames representing overlapping views of an environment surrounding an ego-object such as a vehicle. The images may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, bowl shaped surface) with regions of overlapping image data, and a representation of the detected objects and/or salient regions (e.g., a saliency mask) may be generated and projected onto the aligned composite image or surface. Seams may be positioned in the overlapping regions to avoid or minimize crossing salient pixels represented in the projected masks, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, stitched 360° image, stitched textured surface). (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: February 23, 2023
    Publication date: October 5, 2023
    Inventors: Yuzhuo REN, Kenneth TURKOWSKI, Nuri Murat ARAR, Orazio GALLO, Jan KAUTZ, Niranjan AVADHANAM, Hang SU
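    Code sketch: A hedged illustration of seam placement that avoids salient pixels, using a seam-carving-style dynamic program to find the top-to-bottom path of lowest total saliency through an overlap region; this is a generic stand-in, not the disclosed placement method itself.

      import numpy as np

      def lowest_saliency_seam(saliency):
          """saliency: (H, W) cost map over an overlap region; returns one
          column index per row describing a connected top-to-bottom seam."""
          H, W = saliency.shape
          cost = saliency.astype(float).copy()
          for y in range(1, H):                    # accumulate minimal path cost
              for x in range(W):
                  lo, hi = max(x - 1, 0), min(x + 2, W)
                  cost[y, x] += cost[y - 1, lo:hi].min()
          seam = np.zeros(H, dtype=int)
          seam[-1] = int(cost[-1].argmin())        # cheapest endpoint
          for y in range(H - 2, -1, -1):           # backtrack the path
              lo = max(seam[y + 1] - 1, 0)
              hi = min(seam[y + 1] + 2, W)
              seam[y] = lo + int(cost[y, lo:hi].argmin())
          return seam

      # Example: seam through a random saliency map for a 120x64 overlap region.
      seam = lowest_saliency_seam(np.random.rand(120, 64))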
  • Publication number: 20230319218
    Abstract: In various examples, a state machine is used to select between a default seam placement or dynamic seam placement that avoids salient regions, and to enable and disable dynamic seam placement based on speed of ego-motion, direction of ego-motion, proximity to salient objects, active viewport, driver gaze, and/or other factors. Images representing overlapping views of an environment may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, bowl shaped surface) with overlapping regions of image data, and a default or dynamic seam placement may be selected based on driving scenario (e.g., driving direction, speed, proximity to nearby objects). As such, seams may be positioned in the overlapping regions of image data, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, stitched 360° image, stitched textured surface). (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: February 23, 2023
    Publication date: October 5, 2023
    Inventors: Yuzhuo REN, Nuri Murat ARAR, Orazio GALLO, Jan KAUTZ, Niranjan AVADHANAM, Hang SU
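    Code sketch: A minimal sketch of the mode-selection idea as a small state machine that enables dynamic seam placement based on the driving scenario; the inputs, thresholds, and hysteresis rule are illustrative assumptions.

      from dataclasses import dataclass

      @dataclass
      class DrivingState:
          speed_mps: float          # speed of ego-motion
          nearest_object_m: float   # proximity to the closest salient object
          reverse: bool             # direction of ego-motion

      def select_seam_mode(state, current_mode="default"):
          # Slow maneuvering near objects (e.g., parking): avoid crossing them.
          if state.reverse or (state.speed_mps < 5.0 and state.nearest_object_m < 3.0):
              return "dynamic"
          # Fast ego-motion: fall back to the cheaper default seam placement.
          if state.speed_mps > 15.0:
              return "default"
          return current_mode       # hysteresis: otherwise keep the current mode

      mode = select_seam_mode(DrivingState(speed_mps=2.0, nearest_object_m=1.5,
                                           reverse=False))   # -> "dynamic"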
  • Publication number: 20230316635
    Abstract: In various examples, an environment surrounding an ego-object is visualized using an adaptive 3D bowl that models the environment with a shape that changes based on distance (and direction) to one or more representative point(s) on detected objects. Distance (and direction) to detected objects may be determined using 3D object detection or a top-down 2D or 3D occupancy grid, and used to adapt the shape of the adaptive 3D bowl in various ways (e.g., by sizing its ground plane to fit within the distance to the closest detected object, or by fitting a shape using an optimization algorithm). The adaptive 3D bowl may be enabled or disabled during each time slice (e.g., based on ego-speed), and the 3D bowl for each time slice may be used to render a visualization of the environment (e.g., a top-down projection image, a textured 3D bowl, and/or a rendered view thereof). (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: February 23, 2023
    Publication date: October 5, 2023
    Inventors: Hairong JIANG, Nuri Murat ARAR, Orazio GALLO, Jan KAUTZ, Ronan LETOQUIN
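    Code sketch: A speculative illustration of the adaptive bowl: the flat ground plane is sized to fit within the distance to the closest detected object, with a wall rising beyond it; the height function and its parameters are assumptions, not the disclosed shape model.

      import numpy as np

      def adaptive_bowl_height(r, nearest_object_m, max_radius=20.0, steepness=0.15):
          """Height of the bowl surface at ground distance r from the ego-object."""
          ground_radius = min(nearest_object_m, max_radius)  # flat region shrinks
          return np.where(r <= ground_radius,
                          0.0,                               # ground plane
                          steepness * (r - ground_radius) ** 2)  # rising bowl wall

      r = np.linspace(0.0, 30.0, 7)
      near = adaptive_bowl_height(r, nearest_object_m=4.0)    # close object: small plane
      far = adaptive_bowl_height(r, nearest_object_m=18.0)    # open space: wide plane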
  • Publication number: 20230290038
    Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object reconstruction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.
    Type: Application
    Filed: May 19, 2023
    Publication date: September 14, 2023
    Inventors: Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Jan Kautz
  • Patent number: 11748887
    Abstract: Systems and methods to detect one or more segments of one or more objects within one or more images based, at least in part, on a neural network trained in an unsupervised manner to infer the one or more segments. Systems and methods to help train one or more neural networks to detect one or more segments of one or more objects within one or more images in an unsupervised manner.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: September 5, 2023
    Assignee: NVIDIA Corporation
    Inventors: Varun Jampani, Wei-Chih Hung, Sifei Liu, Pavlo Molchanov, Jan Kautz
  • Publication number: 20230267306
    Abstract: In various embodiments, a training application generates a trained machine learning model that represents items in a spectral domain. The training application executes a first neural network on a first set of data points associated with both a first item and the spectral domain to generate a second neural network. Subsequently, the training application generates a set of predicted data points that are associated with both the first item and the spectral domain via the second neural network. The training application generates the trained machine learning model based on the first neural network, the second neural network, and the set of predicted data points. The trained machine learning model maps one or more positions within the spectral domain to one or more values associated with an item based on a set of data points associated with both the item and the spectral domain. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: September 20, 2022
    Publication date: August 24, 2023
    Inventors: Benjamin ECKART, Jan KAUTZ, Chao LIU, Benjamin WU
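    Code sketch: A hedged sketch of the two-network idea: a first network consumes an item's observed (position, value) data points and emits the weights of a second, tiny network that maps spectral positions to predicted values. All sizes and the weight layout are illustrative assumptions.

      import torch
      import torch.nn as nn

      HIDDEN = 16  # width of the generated second network

      class HyperNet(nn.Module):                    # the "first neural network"
          def __init__(self, n_points=32):
              super().__init__()
              n_weights = (HIDDEN + HIDDEN) + (HIDDEN + 1)  # w1, b1, w2, b2
              self.net = nn.Sequential(nn.Linear(n_points * 2, 128), nn.ReLU(),
                                       nn.Linear(128, n_weights))
          def forward(self, points):                # points: (n_points, 2)
              return self.net(points.flatten())     # -> flat weight vector

      def second_net(weights, pos):                 # the generated "second network"
          w1 = weights[:HIDDEN].view(HIDDEN, 1)
          b1 = weights[HIDDEN:2 * HIDDEN]
          w2 = weights[2 * HIDDEN:3 * HIDDEN].view(1, HIDDEN)
          b2 = weights[-1]
          return torch.relu(pos @ w1.t() + b1) @ w2.t() + b2

      points = torch.rand(32, 2)                    # observed (position, value) pairs
      weights = HyperNet()(points)
      pred = second_net(weights, torch.tensor([[0.5]]))  # predicted value at a position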