Patents by Inventor Jan Kautz

Jan Kautz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210067735
    Abstract: Apparatuses, systems, and techniques to enhance video. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.
    Type: Application
    Filed: September 3, 2019
    Publication date: March 4, 2021
    Inventors: Fitsum Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro
  • Publication number: 20210064907
    Abstract: Object re-identification refers to a process by which images that contain an object of interest are retrieved from a set of images captured using disparate cameras or in disparate environments. Object re-identification has many useful applications, particularly as it is applied to people (e.g., person tracking). Current re-identification processes rely on convolutional neural networks (CNNs) that learn re-identification for a particular object class from labeled training data specific to a certain domain (e.g., environment) but do not transfer well to other domains. The present disclosure provides cross-domain disentanglement of id-related and id-unrelated factors. In particular, the disentanglement is performed using a labeled image set and an unlabeled image set, respectively captured from different domains but for the same object class. A minimal sketch of the swap-and-decode idea appears after this entry.
    Type: Application
    Filed: August 20, 2020
    Publication date: March 4, 2021
    Inventors: Xiaodong Yang, Yang Zou, Zhiding Yu, Jan Kautz
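
The cross-domain disentanglement idea can be illustrated with a toy swap-and-decode sketch. Everything below is an assumption for illustration: the linear encoder/decoder, the 50/50 split of the embedding into id-related and id-unrelated halves, and the random weights stand in for the patented architecture and its training losses.

```python
# Illustrative NumPy sketch of swap-and-decode disentanglement.
import numpy as np

rng = np.random.default_rng(0)
D, H = 256, 64                           # image feature size, embedding size
enc = rng.normal(size=(D, H)) * 0.05     # placeholder encoder weights
dec = rng.normal(size=(H, D)) * 0.05     # placeholder decoder weights

def encode(x):
    z = x @ enc
    return z[: H // 2], z[H // 2 :]      # (id-related, id-unrelated) halves

def decode(z_id, z_unrel):
    return np.concatenate([z_id, z_unrel]) @ dec

x_labeled = rng.normal(size=D)           # image from the labeled source domain
x_unlabeled = rng.normal(size=D)         # image from the unlabeled target domain

id_a, unrel_a = encode(x_labeled)
id_b, unrel_b = encode(x_unlabeled)

# Swapping the id-unrelated half carries identity a into domain b's conditions;
# training losses (identity, reconstruction, adversarial) would constrain this.
x_cross = decode(id_a, unrel_b)
print(x_cross.shape)                     # (256,)
```
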
  • Publication number: 20210064931
    Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.
    Type: Application
    Filed: August 20, 2020
    Publication date: March 4, 2021
    Inventors: Xiaodong Yang, Xitong Yang, Sifei Liu, Jan Kautz
  • Publication number: 20210056353
    Abstract: The disclosure provides a framework or system for learning visual representations using a large set of image/text pairs. The disclosure provides, for example, a method of visual representation learning, a joint representation learning system, and an artificial intelligence (AI) system that employs one or more of the trained models from the method or system. The AI system can be used, for example, in autonomous or semi-autonomous vehicles. In one example, the method of visual representation learning includes: (1) receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model, and (2) training a critic function, using mutual information, to learn relationships between the set of image embeddings and the set of text embeddings. A minimal sketch of such a critic appears after this entry.
    Type: Application
    Filed: August 21, 2020
    Publication date: February 25, 2021
    Inventors: Arash Vahdat, Tanmay Gupta, Xiaodong Yang, Jan Kautz
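
A common way to train a critic with mutual information, consistent with (but not necessarily identical to) the method described above, is an InfoNCE-style objective over matched image/text pairs. The bilinear critic and temperature below are illustrative assumptions.

```python
# Minimal NumPy sketch of an InfoNCE-style critic over paired embeddings.
import numpy as np

def infonce_loss(img_emb, txt_emb, W, temperature=0.1):
    """img_emb: (N, Di); txt_emb: (N, Dt); W: (Di, Dt) bilinear critic."""
    scores = img_emb @ W @ txt_emb.T / temperature   # (N, N) pair scores
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # matched pairs on diagonal

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 128))                      # 8 image embeddings
txt = rng.normal(size=(8, 64))                       # 8 matching text embeddings
W = rng.normal(size=(128, 64)) * 0.01
print(infonce_loss(img, txt, W))
```
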
  • Patent number: 10929654
    Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x, y) represent spatial displacement and a third coordinate represents the depth of each point with respect to the camera. A monocular camera is used to capture an image of the 3D pose but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object. A sketch of the final lifting step appears after this entry.
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: February 23, 2021
    Assignee: NVIDIA Corporation
    Inventors: Umar Iqbal, Pavlo Molchanov, Thomas Michael Breuel, Jan Kautz
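
Once per-keypoint depths are available, the lifting to 3D is standard pinhole back-projection. The sketch below assumes known camera intrinsics (fx, fy, cx, cy) and omits the network that predicts the depths.

```python
# Pinhole back-projection of 2D keypoints plus predicted depths to 3D.
import numpy as np

def backproject(keypoints_xy, depths, fx, fy, cx, cy):
    """keypoints_xy: (K, 2) pixel coordinates; depths: (K,) metric depths."""
    x, y = keypoints_xy[:, 0], keypoints_xy[:, 1]
    X = (x - cx) * depths / fx
    Y = (y - cy) * depths / fy
    return np.stack([X, Y, depths], axis=1)      # (K, 3) camera-space points

kp = np.array([[320.0, 240.0], [350.0, 260.0]])  # e.g. two hand keypoints
z = np.array([0.60, 0.63])                       # predicted depths in meters
print(backproject(kp, z, fx=600.0, fy=600.0, cx=320.0, cy=240.0))
```
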
  • Patent number: 10929987
    Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene: information identifying the dynamic and static portions of each image, and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera. A sketch of the rigid-motion subtraction appears after this entry.
    Type: Grant
    Filed: August 1, 2018
    Date of Patent: February 23, 2021
    Assignee: NVIDIA Corporation
    Inventors: Zhaoyang Lv, Kihwan Kim, Deqing Sun, Alejandro Jose Troccoli, Jan Kautz
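
The separation described above can be illustrated by removing the rigid flow that camera motion alone would induce. The sketch assumes the per-frame 3D points, camera motion (R, t), and dynamic mask are already estimated; it shows only the residual computation.

```python
# Residual (non-rigid) scene flow after removing camera-induced motion.
import numpy as np

def nonrigid_scene_flow(p_t, p_t1, R, t, dynamic_mask):
    """p_t, p_t1: (N, 3) points; R: (3, 3); t: (3,); dynamic_mask: (N,) bool."""
    rigid_pred = p_t @ R.T + t            # where static points should move
    flow = p_t1 - rigid_pred              # residual, camera motion removed
    flow[~dynamic_mask] = 0.0             # static regions carry no scene flow
    return flow

theta = np.deg2rad(2.0)                   # small illustrative camera rotation
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1.0]])
t = np.array([0.01, 0.0, 0.05])
p_t = np.random.default_rng(1).normal(size=(4, 3))
p_t1 = p_t @ R.T + t                      # a perfectly static scene...
p_t1[0] += [0.1, 0.0, 0.0]                # ...except one moving point
mask = np.array([True, False, False, False])
print(nonrigid_scene_flow(p_t, p_t1, R, t, mask))  # only point 0 carries flow
```
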
  • Patent number: 10922793
    Abstract: Missing image content is generated using a neural network. In an embodiment, a high resolution image and associated high resolution semantic label map are generated from a low resolution image and associated low resolution semantic label map. The input image/map pair (low resolution image and associated low resolution semantic label map) lacks detail and is therefore missing content. Rather than simply enhancing the input image/map pair, a neural network improvises, or hallucinates, the data missing from the pair, creating plausible content while maintaining spatio-temporal consistency. Missing content is hallucinated to generate a detailed zoomed-in portion of an image, or to generate different variations of an image, such as different seasons or weather conditions for a driving video.
    Type: Grant
    Filed: March 14, 2019
    Date of Patent: February 16, 2021
    Assignee: NVIDIA Corporation
    Inventors: Seung-Hwan Baek, Kihwan Kim, Jinwei Gu, Orazio Gallo, Alejandro Jose Troccoli, Ming-Yu Liu, Jan Kautz
  • Patent number: 10872399
    Abstract: Photorealistic image stylization concerns transferring the style of a reference photo to a content photo under the constraint that the stylized photo remain photorealistic. Examples of styles include seasons (summer, winter, etc.), weather (sunny, rainy, foggy, etc.), and lighting (daytime, nighttime, etc.). A photorealistic image stylization process includes a stylization step and a smoothing step. The stylization step transfers the style of the reference photo to the content photo: a photo style transfer neural network model receives a photorealistic content image and a photorealistic style image and generates an intermediate stylized photorealistic image that includes the content of the content image modified according to the style image. A smoothing function then receives the intermediate stylized photorealistic image and pixel similarity data and generates the final stylized photorealistic image, ensuring spatially consistent stylization. A sketch of the smoothing step appears after this entry.
    Type: Grant
    Filed: January 11, 2019
    Date of Patent: December 22, 2020
    Assignee: NVIDIA Corporation
    Inventors: Yijun Li, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz
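
The abstract does not specify the smoothing function; one standard graph-based choice consistent with it is closed-form propagation of the stylized colors along a pixel-similarity affinity matrix, sketched here on a tiny flattened image. The Gaussian affinity and alpha value are assumptions.

```python
# Graph-based smoothing of stylized colors Y along pixel-similarity affinities A.
import numpy as np

def smooth(Y, A, alpha=0.8):
    """Y: (P, 3) stylized colors; A: (P, P) nonnegative pixel affinities."""
    d = A.sum(axis=1)
    S = A / np.sqrt(np.outer(d, d))                # symmetrically normalized affinity
    P = A.shape[0]
    # Closed form (1 - alpha) * (I - alpha * S)^-1 @ Y, solved as a linear system.
    return (1 - alpha) * np.linalg.solve(np.eye(P) - alpha * S, Y)

rng = np.random.default_rng(0)
Y = rng.random((16, 3))                            # 4x4 stylized image, flattened
content = rng.random((16, 3))                      # content-image pixels
d2 = ((content[:, None] - content[None]) ** 2).sum(-1)
A = np.exp(-d2 / 0.1)                              # Gaussian content similarity
print(smooth(Y, A).shape)                          # (16, 3) smoothed colors
```
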
  • Publication number: 20200394458
    Abstract: Apparatuses, systems, and techniques to detect objects in images, including digital representations of those objects. In at least one embodiment, one or more objects are detected in an image based, at least in part, on one or more pseudo-labels corresponding to said one or more objects. A sketch of pseudo-label filtering appears after this entry.
    Type: Application
    Filed: June 17, 2019
    Publication date: December 17, 2020
    Inventors: Zhiding Yu, Jason Ren, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
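
A minimal sketch of the pseudo-labeling idea: keep only confident detections from the current model as training labels for unlabeled images. The detector output format and the confidence threshold are placeholders, not the patented method.

```python
# Keep confident detections as pseudo-labels for retraining on unlabeled data.
def make_pseudo_labels(detections, threshold=0.9):
    """detections: list of (box, class_id, score); keep confident ones."""
    return [(box, cls) for box, cls, score in detections if score >= threshold]

# Example: fake detector output for one unlabeled image.
dets = [((10, 10, 50, 80), 0, 0.97),   # confident -> becomes a pseudo-label
        ((60, 20, 90, 70), 1, 0.42)]   # uncertain -> discarded
print(make_pseudo_labels(dets))
```
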
  • Patent number: 10860859
    Abstract: Detection of activity in video content, and in particular detection of the start and end frames that bound an activity along with a classification of that activity, is fundamental to video analytics, including categorizing, searching, indexing, segmenting, and retrieving videos. Existing activity detection processes rely on a large set of features and classifiers that exhaustively run over every time step of a video at multiple temporal scales, or, as a modest computational improvement, first propose segments of the video on which to perform classification. These existing processes, however, are computationally expensive, particularly when high detection accuracy is required, and they are not configurable for a particular time or computation budget. The present disclosure provides a time- and/or computation-budget-aware method for detecting activity in video that relies on a recurrent neural network implementing a learned policy. A sketch of the budgeted control flow appears after this entry.
    Type: Grant
    Filed: November 28, 2018
    Date of Patent: December 8, 2020
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Pavlo Molchanov, Jan Kautz, Behrooz Mahasseni
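
Only the control flow of a budget-aware policy is sketched below: a recurrent state is updated from each observed frame and a policy picks the next frame to inspect until the budget is spent. The random policy and tanh update are placeholders for the learned recurrent network.

```python
# Budget-aware frame inspection: observe at most `budget` frames, not all of them.
import numpy as np

def detect_with_budget(video, budget, rng):
    n = len(video)
    state, t, observed = np.zeros(8), n // 2, []
    for _ in range(budget):                        # hard cap on observations
        observed.append(t)
        state = np.tanh(state + video[t])          # placeholder recurrent update
        step = rng.integers(-n // 4, n // 4 + 1)   # placeholder policy output
        t = int(np.clip(t + step, 0, n - 1))
    return observed                                # frames inspected within budget

rng = np.random.default_rng(0)
video = rng.normal(size=(100, 8))                  # 100 frames of 8-dim features
print(detect_with_budget(video, budget=6, rng=rng))
```
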
  • Patent number: 10838492
    Abstract: A gaze tracking system for use in head mounted displays includes an eyepiece having an opaque frame circumferentially enclosing a transparent field of view, light emitting diodes coupled to the opaque frame for emitting infrared light onto various regions of an eye gazing through the transparent field of view, and diodes for sensing intensity of infrared light reflected off of various regions of the eye.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: November 17, 2020
    Assignee: NVIDIA Corp.
    Inventors: Eric Whitmire, Kaan Aksit, Michael Stengel, Jan Kautz, David Luebke, Ben Boudaoud
  • Patent number: 10826786
    Abstract: Point cloud registration sits at the core of many important and challenging 3D perception problems, including autonomous navigation, object/scene recognition, and augmented reality (AR). A new registration algorithm is presented that achieves speed and accuracy by registering a point cloud to a representation of a reference point cloud. A target point cloud is registered to the reference point cloud by iterating through a number of cycles of an EM algorithm where, during an Expectation step, each point in the target point cloud is associated with a node of a hierarchical tree data structure and, during a Maximization step, an estimated transformation is determined based on the association of the points with corresponding nodes of the hierarchical tree data structure. The estimated transformation is determined by solving a minimization problem associated with a sum, over a number of mixture components, of terms related to a Mahalanobis distance. A simplified EM sketch appears after this entry.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: November 3, 2020
    Assignee: NVIDIA Corporation
    Inventors: Benjamin David Eckart, Kihwan Kim, Jan Kautz
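
A heavily simplified sketch of the EM loop: here the reference cloud is summarized by a flat, isotropic Gaussian mixture rather than the hierarchical tree of the patent, so the Mahalanobis objective reduces to a weighted Procrustes problem solvable with an SVD.

```python
# Simplified EM point-cloud registration against a flat isotropic mixture.
import numpy as np

def em_register(target, mu, sigma=0.2, iters=10):
    """target: (N, 3) points to register; mu: (K, 3) reference mixture means."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        x = target @ R.T + t                       # current transform applied
        # E-step: soft association of each point with each mixture component.
        d2 = ((x[:, None] - mu[None]) ** 2).sum(-1)
        g = np.exp(-d2 / (2 * sigma**2))
        g /= g.sum(axis=1, keepdims=True) + 1e-12
        # M-step: weighted Procrustes against the soft correspondences.
        corr = g @ mu
        pc, cc = target.mean(0), corr.mean(0)
        H = (target - pc).T @ (corr - cc)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T                         # rotation, reflection-corrected
        t = cc - R @ pc
    return R, t

rng = np.random.default_rng(0)
ref = rng.normal(size=(50, 3))                     # reference cloud = mixture means
theta = np.deg2rad(6)
true_R = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1.0]])
target = (ref - 0.05) @ true_R.T                   # displaced copy to register
R, t = em_register(target, mu=ref)
print(np.round(R @ true_R, 2))                     # should be close to identity
```
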
  • Publication number: 20200342263
    Abstract: When a computer image is generated from a real-world scene having a semi-reflective surface (e.g. window), the computer image will create, at the semi-reflective surface from the viewpoint of the camera, both a reflection of a scene in front of the semi-reflective surface and a transmission of a scene located behind the semi-reflective surface. Similar to a person viewing the real-world scene from different locations, angles, etc., the reflection and transmission may change, and also move relative to each other, as the viewpoint of the camera changes. Unfortunately, the dynamic nature of the reflection and transmission negatively impacts the performance of many computer applications, but performance can generally be improved if the reflection and transmission are separated. The present disclosure uses deep learning to separate reflection and transmission at a semi-reflective surface of a computer image generated from a real-world scene.
    Type: Application
    Filed: July 8, 2020
    Publication date: October 29, 2020
    Inventors: Orazio Gallo, Jinwei Gu, Jan Kautz, Patrick Wieschollek
  • Publication number: 20200334502
    Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels. A sketch of what an affinity map encodes appears after this entry.
    Type: Application
    Filed: July 6, 2020
    Publication date: October 22, 2020
    Inventors: Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
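
What a horizontal/vertical affinity map encodes can be seen by deriving ideal affinities from a toy label map (affinity 1 within a segment, 0 across a boundary); in the patented method a network predicts such maps from the image itself.

```python
# Ideal horizontal/vertical affinities derived from a toy segment label map.
import numpy as np

def affinity_maps(labels):
    """labels: (H, W) integer segment ids -> (H, W-1) and (H-1, W) affinities."""
    horiz = (labels[:, :-1] == labels[:, 1:]).astype(float)
    vert = (labels[:-1, :] == labels[1:, :]).astype(float)
    return horiz, vert

labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 2, 1]])
h, v = affinity_maps(labels)
print(h)   # 0 where a pixel and its right neighbor lie in different segments
print(v)   # 0 where a pixel and its bottom neighbor lie in different segments
```
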
  • Publication number: 20200334543
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the original images in a latent space produced by an autoencoder. The images generated by this rotation and decoding exhibit changes to a feature in proportion to the amount of rotation. A sketch of the latent rotation appears after this entry.
    Type: Application
    Filed: April 19, 2019
    Publication date: October 22, 2020
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
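
A toy version of the augmentation: encode, rotate the embedding within a latent plane, and decode. The linear autoencoder and the choice of rotation plane are placeholder assumptions; in the patent the autoencoder is learned so that decoded images change a feature in proportion to the rotation.

```python
# Data augmentation by rotating an embedding in a latent plane, then decoding.
import numpy as np

def rotate_latent(z, angle, dims=(0, 1)):
    """Rotate latent vector z by `angle` radians within the plane `dims`."""
    z = z.copy()
    i, j = dims
    c, s = np.cos(angle), np.sin(angle)
    z[i], z[j] = c * z[i] - s * z[j], s * z[i] + c * z[j]
    return z

rng = np.random.default_rng(0)
enc = rng.normal(size=(64, 8)) * 0.1     # placeholder encoder
dec = rng.normal(size=(8, 64)) * 0.1     # placeholder decoder
x = rng.normal(size=64)                  # one "image"
z = x @ enc
# A few rotated copies decode to a small family of augmented samples.
augmented = [rotate_latent(z, a) @ dec for a in np.linspace(0, 0.5, 5)]
print(len(augmented), augmented[0].shape)
```
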
  • Publication number: 20200320401
    Abstract: Systems and methods to detect one or more segments of one or more objects within one or more images based, at least in part, on a neural network trained in an unsupervised manner to infer the one or more segments. Systems and methods to help train one or more neural networks to detect one or more segments of one or more objects within one or more images in an unsupervised manner.
    Type: Application
    Filed: April 8, 2019
    Publication date: October 8, 2020
    Inventors: Varun Jampani, Wei-Chih Hung, Sifei Liu, Pavlo Molchanov, Jan Kautz
  • Patent number: 10789678
    Abstract: A superpixel sampling network utilizes a neural network coupled to a differentiable simple linear iterative clustering (SLIC) component to determine pixel-superpixel associations from a set of pixel features output by the neural network. The superpixel sampling network computes updated superpixel centers and final pixel-superpixel associations over a number of iterations. A sketch of the association and center updates appears after this entry.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: September 29, 2020
    Assignee: NVIDIA Corp.
    Inventors: Varun Jampani, Deqing Sun, Ming-Yu Liu, Jan Kautz
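
The iterative core can be sketched directly: soft pixel-to-superpixel associations from feature distances, then centers recomputed as association-weighted means. For brevity each pixel is compared with every superpixel, whereas the actual method restricts comparisons to nearby superpixels.

```python
# Differentiable-SLIC-style loop: soft associations, then weighted center updates.
import numpy as np

def soft_slic(features, centers, iters=5, temp=1.0):
    """features: (P, D) pixel features; centers: (S, D) initial centers."""
    for _ in range(iters):
        d2 = ((features[:, None] - centers[None]) ** 2).sum(-1)   # (P, S)
        q = np.exp(-d2 / temp)
        q /= q.sum(axis=1, keepdims=True)          # soft pixel-superpixel assoc.
        centers = (q.T @ features) / q.sum(axis=0)[:, None]
    return q, centers                              # final associations, centers

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 5))                  # e.g. (x, y, L, a, b) per pixel
init = feats[rng.choice(100, size=9, replace=False)]
q, c = soft_slic(feats, init)
hard = q.argmax(axis=1)                            # hard superpixel assignment
print(hard[:10])
```
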
  • Patent number: 10783394
    Abstract: A method, computer-readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates. A sketch of this consistency loss appears after this entry.
    Type: Grant
    Filed: June 12, 2018
    Date of Patent: September 22, 2020
    Assignee: NVIDIA Corporation
    Inventors: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
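
The update rule can be read as a transform-consistency (equivariance) loss: predicting on a transformed input should agree with transforming the predictions on the original input. To stay self-contained, the sketch replaces the image with a point set and the network with a fixed linear map, both placeholders.

```python
# Transform-consistency loss: predict-then-transform vs. transform-then-predict.
import numpy as np

def rotate2d(pts, angle):
    c, s = np.cos(angle), np.sin(angle)
    return pts @ np.array([[c, -s], [s, c]]).T

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 2)) * 0.5          # placeholder "landmark predictor"
predict = lambda pts: pts @ W              # maps input points to landmarks

image_pts = rng.normal(size=(5, 2))        # stand-in for an input image
angle = np.deg2rad(30)

pred_then_transform = rotate2d(predict(image_pts), angle)
transform_then_pred = predict(rotate2d(image_pts, angle))

# Training would update the predictor to drive these two to agree.
loss = ((pred_then_transform - transform_then_pred) ** 2).mean()
print(loss)
```
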
  • Patent number: 10783393
    Abstract: A method, computer-readable medium, and system are disclosed for sequential multi-tasking to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A neural network model processes input image data to generate pixel-level likelihood estimates for landmarks in the input image data, and a soft-argmax function computes predicted coordinates of each landmark based on the pixel-level likelihood estimates. A sketch of the soft-argmax appears after this entry.
    Type: Grant
    Filed: June 12, 2018
    Date of Patent: September 22, 2020
    Assignee: NVIDIA Corporation
    Inventors: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
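
The soft-argmax itself is concrete enough to sketch: a softmax turns pixel-level likelihoods into a distribution, and the coordinate is the probability-weighted average of the pixel grid, which keeps coordinate prediction differentiable, unlike a hard argmax.

```python
# Soft-argmax over a single landmark heatmap.
import numpy as np

def soft_argmax(heatmap, temperature=1.0):
    """heatmap: (H, W) pixel-level likelihoods -> continuous (x, y) coordinate."""
    h, w = heatmap.shape
    p = np.exp((heatmap - heatmap.max()) / temperature)
    p /= p.sum()                          # softmax over all pixels
    ys, xs = np.mgrid[0:h, 0:w]
    return float((p * xs).sum()), float((p * ys).sum())

heat = np.zeros((5, 5))
heat[1, 3] = 10.0                         # strong response at x=3, y=1
print(soft_argmax(heat))                  # approximately (3.0, 1.0)
```
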
  • Publication number: 20200294194
    Abstract: A video stitching system combines video from different cameras to form a panoramic video that, in various embodiments, is temporally stable and tolerant to strong parallax. In an embodiment, the system provides a smooth spatial interpolation that can be used to connect the input video images. In an embodiment, the system applies an interpolation layer to slices of the overlapping video sources, and the network learns a dense flow field to smoothly align the input videos with spatial interpolation. Various embodiments are applicable to areas such as virtual reality, immersive telepresence, autonomous driving, and video surveillance. A sketch of warp-and-blend stitching appears after this entry.
    Type: Application
    Filed: March 11, 2019
    Publication date: September 17, 2020
    Inventors: Deqing Sun, Orazio Gallo, Jan Kautz, Jinwei Gu, Wei-Sheng Lai
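
A toy stand-in for the learned interpolation layer: warp one overlap strip along a given horizontal flow field and cross-fade it with the other. The given flow, nearest-neighbor 1D warping, and linear blend weights are all assumptions made for brevity.

```python
# Flow-guided alignment and cross-fade in the overlap region of two strips.
import numpy as np

def stitch_overlap(left, right, flow):
    """left, right: (H, W) overlap strips; flow: (H, W) horizontal offsets."""
    h, w = left.shape
    cols = np.arange(w)
    weights = cols / (w - 1)                 # cross-fade: 0 -> left, 1 -> right
    out = np.empty((h, w))
    for y in range(h):
        src = np.clip(cols + np.round(flow[y]).astype(int), 0, w - 1)
        warped_right = right[y, src]         # align the right strip to the left
        out[y] = (1 - weights) * left[y] + weights * warped_right
    return out

rng = np.random.default_rng(0)
left = rng.random((4, 8))
right = np.roll(left, 1, axis=1)             # right view shifted by one pixel
flow = np.ones((4, 8))                       # a flow that undoes the disparity
print(np.round(stitch_overlap(left, right, flow), 2))
```
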