Patents by Inventor Jan Kautz

Jan Kautz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230144458
    Abstract: In examples, locations of facial landmarks may be applied to one or more machine learning models (MLMs) to generate output data indicating profiles corresponding to facial expressions, such as facial action coding system (FACS) values. The output data may be used to determine geometry of a model. For example, video frames depicting one or more faces may be analyzed to determine the locations. The facial landmarks may be normalized and then applied to the MLM(s) to infer the profile(s), which may then be used to animate the model for expression retargeting from the video. The MLM(s) may include sub-networks that each analyze a set of input data corresponding to a region of the face to determine profiles that correspond to the region. The profiles from the sub-networks, along with global locations of facial landmarks, may be used by a subsequent network to infer the profiles for the overall face.
    Type: Application
    Filed: October 31, 2022
    Publication date: May 11, 2023
    Inventors: Alexander Malafeev, Shalini De Mello, Jaewoo Seo, Umar Iqbal, Koki Nagano, Jan Kautz, Simon Yuen
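The region-wise pipeline this abstract describes can be sketched roughly as follows. This is a minimal numpy illustration, not the patented implementation: the landmark count, region index ranges, sub-network shapes (plain linear maps with tanh), and profile dimensions are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_landmarks(landmarks):
    """Center landmarks at their mean and scale to unit spread."""
    centered = landmarks - landmarks.mean(axis=0)
    return centered / (np.abs(centered).max() + 1e-8)

def region_subnet(features, weights):
    """Stand-in sub-network: one linear map per facial region."""
    return np.tanh(features.flatten() @ weights)

# 68 2-D landmarks; region index ranges are illustrative only.
landmarks = rng.normal(size=(68, 2))
regions = {"eyes": slice(36, 48), "mouth": slice(48, 68)}

norm = normalize_landmarks(landmarks)
region_profiles = []
for name, idx in regions.items():
    w = rng.normal(size=(norm[idx].size, 4))   # 4 per-region FACS-like values
    region_profiles.append(region_subnet(norm[idx], w))

# Subsequent network: fuse per-region profiles with the global landmark locations.
fused_input = np.concatenate(region_profiles + [norm.flatten()])
w_global = rng.normal(size=(fused_input.size, 10))  # 10 face-wide profile values
face_profiles = np.tanh(fused_input @ w_global)
print(face_profiles.shape)  # (10,)
```

The inferred `face_profiles` vector would then drive the geometry of the animated model, one set per video frame.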
  • Patent number: 11645530
    Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
    Type: Grant
    Filed: May 19, 2021
    Date of Patent: May 9, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Pavlo Molchanov, Jan Kautz
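The layer-transformation step can be sketched as below; this is a toy vanilla-RNN version under assumed shapes, not the patented network, and the identity initialization of the hidden-to-hidden weights is just one plausible choice of "initial values".

```python
import numpy as np

def feedforward_to_recurrent(W_ff, b_ff, init="identity"):
    """Transform a trained feedforward layer into a recurrent layer: the
    feedforward weights become the input-to-hidden weights, and the
    hidden-to-hidden weights are set to initial values."""
    hidden = W_ff.shape[1]
    W_ih = W_ff.copy()                      # input-to-hidden from trained weights
    W_hh = np.eye(hidden) if init == "identity" else np.zeros((hidden, hidden))
    return W_ih, W_hh, b_ff.copy()

def recurrent_step(x_t, h_prev, W_ih, W_hh, b):
    """One step of the resulting recurrent layer."""
    return np.tanh(x_t @ W_ih + h_prev @ W_hh + b)

rng = np.random.default_rng(1)
W_ff, b_ff = rng.normal(size=(16, 8)), np.zeros(8)
W_ih, W_hh, b = feedforward_to_recurrent(W_ff, b_ff)

h = np.zeros(8)
for t in range(5):                          # 5 per-frame feature vectors
    h = recurrent_step(rng.normal(size=16), h, W_ih, W_hh, b)
print(h.shape)  # (8,)
```

The final hidden state `h` would feed the classification or regression head over the video sequence.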
  • Patent number: 11636668
    Abstract: A method includes filtering a point cloud transformation of a 3D object to generate a 3D lattice and processing the 3D lattice through a series of bilateral convolution layers (BCLs), each BCL in the series having a lower lattice feature scale than a preceding BCL in the series. The output of each BCL in the series is concatenated to generate an intermediate 3D lattice. Further filtering of the intermediate 3D lattice generates a first prediction of features of the 3D object.
    Type: Grant
    Filed: May 22, 2018
    Date of Patent: April 25, 2023
    Inventors: Varun Jampani, Hang Su, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
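The multi-scale concatenation in this abstract can be sketched as below. This is a 1-D toy with average pooling standing in for a bilateral convolution at a coarser lattice scale; the point count, feature width, and scale factors are all hypothetical.

```python
import numpy as np

def pool_lattice(lattice, factor):
    """Coarsen a 1-D feature lattice by average pooling, a stand-in for a
    bilateral convolution layer at a lower lattice feature scale."""
    n = (len(lattice) // factor) * factor
    return lattice[:n].reshape(-1, factor, lattice.shape[-1]).mean(axis=1)

def upsample(lattice, length):
    """Nearest-neighbor upsampling back to the original resolution."""
    idx = np.minimum(np.arange(length) * len(lattice) // length, len(lattice) - 1)
    return lattice[idx]

rng = np.random.default_rng(4)
points = rng.normal(size=(16, 3))                  # filtered point-cloud features
scales = [1, 2, 4]                                  # each "BCL" coarser than the last
outputs = [upsample(pool_lattice(points, s), len(points)) for s in scales]
intermediate = np.concatenate(outputs, axis=-1)     # concatenated multi-scale lattice
print(intermediate.shape)  # (16, 9)
```

Further filtering of `intermediate` would then produce the per-point feature predictions.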
  • Patent number: 11631239
    Abstract: Iterative prediction systems and methods for the task of action detection process an inputted sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground-truth.
    Type: Grant
    Filed: April 22, 2021
    Date of Patent: April 18, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Ming-Yu Liu, Jan Kautz, Fanyi Xiao, Xitong Yang
  • Publication number: 20230088912
    Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.
    Type: Application
    Filed: September 26, 2022
    Publication date: March 23, 2023
    Inventors: Ruben Villegas, Alejandro Troccoli, Iuri Frosio, Stephen Tyree, Wonmin Byeon, Jan Kautz
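The maneuver-conditioned prediction step can be sketched as below; this is a constant-velocity toy, not the LSTM-based encoder, and the state encoding, maneuver set, and lateral offsets are hypothetical stand-ins.

```python
import numpy as np

def encode_state(trajectory):
    """Toy state feature: last position plus recent velocity of a tracked object."""
    velocity = trajectory[-1] - trajectory[-2]
    return np.concatenate([trajectory[-1], velocity])

def predict_future(state, maneuver_probs, horizon=5):
    """Roll the object forward under each lateral maneuver (left, keep, right)
    and average the rollouts by the predicted maneuver probabilities."""
    pos, vel = state[:2], state[2:]
    offsets = {"left": np.array([0.0, 0.5]), "keep": np.zeros(2),
               "right": np.array([0.0, -0.5])}
    future = np.zeros((horizon, 2))
    for (name, off), p in zip(offsets.items(), maneuver_probs):
        for t in range(1, horizon + 1):
            future[t - 1] += p * (pos + t * (vel + off))
    return future

track = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])   # straight motion history
state = encode_state(track)
future = predict_future(state, maneuver_probs=[0.1, 0.8, 0.1])
```

The predicted `future` locations would then feed either the ego-vehicle's path planner or a simulation system's object controllers.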
  • Publication number: 20230080247
    Abstract: A vision transformer is a deep learning model used to perform vision processing tasks such as image recognition. Vision transformers are currently designed with a plurality of same-size blocks that perform the vision processing tasks. However, some portions of these blocks are unnecessary and not only slow down the vision transformer but also use more memory than required. In response, parameters of these blocks are analyzed to determine a score for each parameter, and if the score falls below a threshold, the parameter is removed from the associated block. This reduces the size of the resulting vision transformer, which reduces unnecessary memory usage and increases performance.
    Type: Application
    Filed: December 14, 2021
    Publication date: March 16, 2023
    Inventors: Hongxu Yin, Huanrui Yang, Pavlo Molchanov, Jan Kautz
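The score-and-threshold pruning step can be sketched as below; weight magnitude is used here as a simple importance proxy, and the block shape and threshold value are illustrative, not the patented scoring method.

```python
import numpy as np

def prune_by_score(params, threshold):
    """Score each parameter (magnitude here as a simple importance proxy)
    and remove (zero out) those scoring below the threshold."""
    scores = np.abs(params)
    mask = scores >= threshold
    return params * mask, mask

rng = np.random.default_rng(2)
block_weights = rng.normal(size=(4, 4))     # one transformer block's parameters
pruned, mask = prune_by_score(block_weights, threshold=0.5)
print(int(mask.sum()), "of", mask.size, "parameters kept")
```

Zeroed parameters can then be dropped from the block entirely, shrinking the model's memory footprint.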
  • Publication number: 20230070514
    Abstract: In order to determine accurate three-dimensional (3D) models for objects within a video, the objects are first identified and tracked within the video, and a pose and shape are estimated for these tracked objects. A translation and global orientation are removed from the tracked objects to determine local motion for the objects, and motion infilling is performed to fill in any missing portions for the object within the video. A global trajectory is then determined for the objects within the video, and the infilled motion and global trajectory are then used to determine infilled global motion for the object within the video. This enables the accurate depiction of each object as a 3D pose sequence that accounts for occlusions and global factors within the video.
    Type: Application
    Filed: January 25, 2022
    Publication date: March 9, 2023
    Inventors: Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz
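The "remove global translation, then infill" steps can be sketched as below; linear interpolation stands in for the learned motion-infilling network, and the frame, joint, and coordinate counts are illustrative.

```python
import numpy as np

def to_local_motion(poses, root):
    """Subtract the root position (global translation) so only local motion remains."""
    return poses - root[:, None, :]

def infill_missing(motion, missing):
    """Fill missing (occluded) frames by linear interpolation per joint and
    coordinate, a simple stand-in for a learned motion-infilling network."""
    filled = motion.copy()
    valid = np.where(~missing)[0]
    gaps = np.where(missing)[0]
    for j in range(motion.shape[1]):
        for d in range(motion.shape[2]):
            filled[gaps, j, d] = np.interp(gaps, valid, motion[valid, j, d])
    return filled

# 5 frames, 2 joints, 2-D coordinates; frame 2 is occluded.
poses = np.arange(5, dtype=float)[:, None, None] * np.ones((5, 2, 2))
root = np.zeros((5, 2))
missing = np.array([False, False, True, False, False])

local = to_local_motion(poses, root)
local[missing] = 0.0                        # occluded frame has no data
filled = infill_missing(local, missing)
print(filled[2, 0])                         # interpolated from neighbors
```

Recombining `filled` with an estimated global trajectory would yield the infilled global motion described above.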
  • Publication number: 20230074706
    Abstract: A multi-level contrastive training strategy for training a neural network relies on image pairs (no other labels) to learn semantic correspondences at the image level and region or pixel level. The neural network is trained using contrasting image pairs including different objects and corresponding image pairs including different views of the same object. Conceptually, contrastive training pulls corresponding image pairs closer and pushes contrasting image pairs apart. An image-level contrastive loss is computed from the outputs (predictions) of the neural network and used to update parameters (weights) of the neural network via backpropagation. The neural network is also trained via pixel-level contrastive learning using only image pairs. Pixel-level contrastive learning receives an image pair, where each image includes an object in a particular category.
    Type: Application
    Filed: August 25, 2021
    Publication date: March 9, 2023
    Inventors: Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz
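The image-level pull/push objective can be sketched as below; this is an InfoNCE-style toy over random embeddings, with the embedding size, temperature, and negative count all hypothetical rather than the patented loss.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Contrastive loss over embeddings: pull the corresponding pair together
    and push contrasting pairs apart (InfoNCE-style formulation)."""
    def sim(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    pos = np.exp(sim(anchor, positive) / temperature)
    neg = sum(np.exp(sim(anchor, n) / temperature) for n in negatives)
    return -np.log(pos / (pos + neg))

rng = np.random.default_rng(5)
anchor = rng.normal(size=32)
close = anchor + 0.01 * rng.normal(size=32)        # corresponding view, same object
others = [rng.normal(size=32) for _ in range(8)]   # contrasting images

aligned = info_nce(anchor, close, others)
misaligned = info_nce(anchor, others[0], others[1:] + [close])
print(aligned < misaligned)  # the corresponding pair yields the lower loss
```

The pixel-level variant applies the same pull/push idea to per-pixel features of the object region rather than whole-image embeddings.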
  • Patent number: 11594006
    Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: February 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Xitong Yang, Sifei Liu, Jan Kautz
  • Patent number: 11593661
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature in proportion to the amount of rotation.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: February 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
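The latent-rotation augmentation can be sketched as below; the latent dimension, the choice of rotation plane, and the angle range are illustrative, and a real pipeline would decode each rotated code through the autoencoder's decoder.

```python
import numpy as np

def rotate_latent(z, angle, dims=(0, 1)):
    """Rotate a latent code in one 2-D plane of the latent space; decoding the
    rotated codes yields images whose target feature changes in proportion to
    the rotation angle."""
    i, j = dims
    z_rot = z.copy()
    c, s = np.cos(angle), np.sin(angle)
    z_rot[i] = c * z[i] - s * z[j]
    z_rot[j] = s * z[i] + c * z[j]
    return z_rot

z = np.array([1.0, 0.0, 0.5])                       # encoded original image
angles = np.linspace(0.0, np.pi / 4, 5)             # graded rotation amounts
augmented = [rotate_latent(z, a) for a in angles]   # decode each for new images
```

Because the rotation is norm-preserving, the augmented codes stay on the same shell of the latent space while the targeted feature varies smoothly.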
  • Publication number: 20230035306
    Abstract: Apparatuses, systems, and techniques are presented to generate media content.
    Type: Application
    Filed: July 21, 2021
    Publication date: February 2, 2023
    Inventors: Ming-Yu Liu, Koki Nagano, Yeongho Seol, Jose Rafael Valle Gomes da Costa, Jaewoo Seo, Ting-Chun Wang, Arun Mallya, Sameh Khamis, Wei Ping, Rohan Badlani, Kevin Jonathan Shih, Bryan Catanzaro, Simon Yuen, Jan Kautz
  • Publication number: 20230015989
    Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
    Type: Application
    Filed: July 1, 2021
    Publication date: January 19, 2023
    Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
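The edge-gated spatial propagation can be sketched in one dimension as below; the affinity values, step count, and left-to-right-only message direction are toy simplifications of the learnable recurrent message passing layer.

```python
import numpy as np

def propagate(features, affinity, edges, steps=3):
    """1-D spatial propagation: each pixel blends in its left neighbor,
    weighted by affinity and gated (suppressed) at semantic edges."""
    f = features.copy()
    for _ in range(steps):
        gate = affinity * (1.0 - edges)          # edges block message passing
        f[1:] = (1 - gate[1:]) * f[1:] + gate[1:] * f[:-1]
    return f

feat = np.array([0.0, 0.0, 0.0, 10.0, 10.0, 10.0])   # two semantic regions
aff = np.full(6, 0.8)                                 # uniform affinity
edge = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0])      # edge at the class boundary
smoothed = propagate(feat, aff, edge)
```

With the edge gate in place the boundary stays sharp after smoothing; removing it lets the two regions bleed into each other, which is exactly the dense-prediction blur the gating avoids.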
  • Publication number: 20230004760
    Abstract: Apparatuses, systems, and techniques to identify objects within an image using self-supervised machine learning. In at least one embodiment, a machine learning system is trained to recognize objects by training a first network to recognize objects within images that are generated by a second network. In at least one embodiment, the second network is a controllable network.
    Type: Application
    Filed: June 28, 2021
    Publication date: January 5, 2023
    Inventors: Siva Karthik Mustikovela, Shalini De Mello, Aayush Prakash, Umar Iqbal, Sifei Liu, Jan Kautz
  • Patent number: 11546568
    Abstract: Apparatuses, systems, and techniques are presented to perform monocular view synthesis of a dynamic scene. Single and multi-view depth information can be determined for a collection of images of a dynamic scene, and a blender network can be used to combine image features for foreground, background, and missing image regions using fused depth maps inferred from the single and multi-view depth information.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: January 3, 2023
    Assignee: NVIDIA Corporation
    Inventors: Jae Shin Yoon, Jan Kautz, Kihwan Kim
  • Publication number: 20220405583
    Abstract: One embodiment of the present invention sets forth a technique for training a generative model. The technique includes converting a first data point included in a training dataset into a first set of values associated with a base distribution for a score-based generative model. The technique also includes performing one or more denoising operations via the score-based generative model to convert the first set of values into a first set of latent variable values associated with a latent space. The technique further includes performing one or more additional operations to convert the first set of latent variable values into a second data point. Finally, the technique includes computing one or more losses based on the first data point and the second data point and generating a trained generative model based on the one or more losses, wherein the trained generative model includes the score-based generative model.
    Type: Application
    Filed: February 25, 2022
    Publication date: December 22, 2022
    Inventors: Arash Vahdat, Karsten Kreis, Jan Kautz
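The denoising objective behind such training can be sketched as below; this is a generic denoising score matching toy, not the patented latent-space procedure, and the noise level, scalar data, and oracle score closure are all illustrative.

```python
import numpy as np

def dsm_loss(x, score_fn, sigma, rng):
    """Denoising score matching: perturb data with Gaussian noise and penalize
    the model for mispredicting the score of the perturbation kernel."""
    noise = rng.normal(size=x.shape)
    x_noisy = x + sigma * noise
    target = -noise / sigma              # true score of the Gaussian kernel
    pred = score_fn(x_noisy, sigma)
    return np.mean((pred - target) ** 2), x_noisy, target

rng = np.random.default_rng(6)
data = rng.normal(size=100)

# An "oracle" score model that recovers the exact target gets (near-)zero loss.
loss, x_noisy, target = dsm_loss(
    data, lambda xn, s: -(xn - data) / s**2, sigma=0.5, rng=rng)
print(loss)
```

Minimizing this loss over many noise draws is what trains the score-based model used in the denoising operations described above.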
  • Publication number: 20220398697
    Abstract: One embodiment of the present invention sets forth a technique for generating data. The technique includes sampling from a first distribution associated with a score-based generative model to generate a first set of values. The technique also includes performing one or more denoising operations via the score-based generative model to convert the first set of values into a first set of latent variable values associated with a latent space. The technique further includes converting the first set of latent variable values into a generative output.
    Type: Application
    Filed: February 25, 2022
    Publication date: December 15, 2022
    Inventors: Arash Vahdat, Karsten Kreis, Jan Kautz
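The sample-then-denoise chain can be sketched as below with Langevin dynamics; the toy score function for a standard Gaussian target, the step size, and the step count are illustrative stand-ins for the learned score model and sampler.

```python
import numpy as np

def langevin_sample(score_fn, init, step=0.01, n_steps=500, rng=None):
    """Draw samples by starting from base-distribution values and repeatedly
    following the learned score plus fresh noise toward the target."""
    rng = rng or np.random.default_rng(0)
    x = init.copy()
    for _ in range(n_steps):
        x = x + step * score_fn(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
    return x

# Toy "learned" score for a standard Gaussian target: score(x) = -x.
rng = np.random.default_rng(7)
start = 5.0 * rng.normal(size=2000)      # far-from-target initialization
samples = langevin_sample(lambda x: -x, start, rng=rng)
```

In the described technique the denoised values land in a latent space and are then converted into the final generative output, rather than being the output directly.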
  • Publication number: 20220396289
    Abstract: Apparatuses, systems, and techniques to calculate a plurality of paths for an autonomous device to traverse. In at least one embodiment, a plurality of paths are calculated using one or more neural networks based, at least in part, on one or more distance values output by the one or more neural networks.
    Type: Application
    Filed: June 15, 2021
    Publication date: December 15, 2022
    Inventors: Xueting Li, Sifei Liu, Shalini De Mello, Jan Kautz
  • Publication number: 20220391781
    Abstract: A method performed by a server is provided. The method comprises sending copies of a set of parameters of a hyper network (HN) to at least one client device, receiving from each client device in the at least one client device, a corresponding set of updated parameters of the HN, and determining a next set of parameters of the HN based on the corresponding sets of updated parameters received from the at least one client device. Each client device generates the corresponding set of updated parameters based on a local model architecture of the client device.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 8, 2022
    Inventors: Or Litany, Haggai Maron, David Jesus Acuna Marrero, Jan Kautz, Sanja Fidler, Gal Chechik
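The server's aggregation round can be sketched as below; simple FedAvg-style averaging is used here as a stand-in for the described "determining a next set of parameters", and the parameter values and client count are illustrative.

```python
def server_round(client_updates):
    """One server round: after sending parameter copies out and collecting each
    client's updated set, average them into the next hypernetwork parameters
    (FedAvg-style aggregation as a simple stand-in)."""
    n = len(client_updates)
    size = len(client_updates[0])
    return [sum(u[i] for u in client_updates) / n for i in range(size)]

hn_params = [0.5, -1.0, 2.0]                      # current hypernetwork parameters
updates = [[0.6, -0.9, 2.1],                      # client A's updated copy
           [0.4, -1.1, 1.9]]                      # client B's updated copy
hn_params = server_round(updates)
print(hn_params)
```

Because each client adapts the copy with its own local model architecture, the averaged hypernetwork parameters aggregate knowledge across heterogeneous clients without sharing raw data.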
  • Patent number: 11514293
    Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: November 29, 2022
    Assignee: NVIDIA Corporation
    Inventors: Ruben Villegas, Alejandro Troccoli, Iuri Frosio, Stephen Tyree, Wonmin Byeon, Jan Kautz
  • Patent number: 11506888
    Abstract: A gaze tracking system for use by the driver of a vehicle includes an opaque frame circumferentially enclosing a transparent field of view of the driver, light emitting diodes coupled to the opaque frame for emitting infrared light onto various regions of the driver's eye gazing through the transparent field of view, and diodes for sensing intensity of infrared light reflected off of various regions of the driver's eye.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: November 22, 2022
    Assignee: NVIDIA Corporation
    Inventors: Eric Whitmire, Kaan Aksit, Michael Stengel, Jan Kautz, David Luebke, Ben Boudaoud