Patents by Inventor Jan Kautz

Jan Kautz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210390653
    Abstract: Various embodiments enable a robot, or other autonomous or semi-autonomous device or system, to receive data involving the performance of a task in the physical world. The data can be provided as input to a perception network to infer a set of percepts about the task, which can correspond to relationships between objects observed during the performance. The percepts can be provided as input to a plan generation network, which can infer a set of actions as part of a plan. Each action can correspond to one of the observed relationships. The plan can be reviewed and any corrections made, either manually or through another demonstration of the task. Once the plan is verified as correct, the plan (and any related data) can be provided as input to an execution network that can infer instructions to cause the robot, and/or another robot, to perform the task.
    Type: Application
    Filed: August 26, 2021
    Publication date: December 16, 2021
    Inventors: Jonathan Tremblay, Stan Birchfield, Stephen Tyree, Thang To, Jan Kautz, Artem Molchanov
  • Publication number: 20210326694
    Abstract: Apparatuses, systems, and techniques are presented to determine distance for one or more objects. In at least one embodiment, a disparity network is trained to determine distance data from input stereoscopic images using a loss function that includes at least one of a gradient loss term and an occlusion loss term.
    Type: Application
    Filed: April 20, 2020
    Publication date: October 21, 2021
    Inventors: Jialiang Wang, Varun Jampani, Stan Birchfield, Charles Loop, Jan Kautz
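The disparity-network abstract above mentions a loss with a gradient term and an occlusion term. A minimal sketch of one plausible combination is below; the function names, weights, and the exact form of each term are assumptions for illustration, not the patented formulation.

```python
import numpy as np

def gradient_loss(pred, target):
    # Compare horizontal and vertical gradients of the predicted and
    # reference disparity maps, penalizing edge mismatches explicitly.
    gx = np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1))
    gy = np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0))
    return gx.mean() + gy.mean()

def disparity_loss(pred, target, occ_mask, w_grad=0.5, w_occ=0.1):
    # Data term is down-weighted where occ_mask flags occluded pixels;
    # the occlusion term discourages marking everything as occluded.
    data = (np.abs(pred - target) * (1.0 - occ_mask)).mean()
    return data + w_grad * gradient_loss(pred, target) + w_occ * occ_mask.mean()
```

With identical prediction and target and an all-zero occlusion mask, every term vanishes, which is a quick sanity check on the formulation.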
  • Publication number: 20210314629
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data, which describes the information lost when original video data is compressed. For example, the original video data may be compressed and then decompressed, and the result compared to the original video data to determine the residual video data. The residual video data is transformed into a smaller format by encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and used during decompression of the compressed original video data to improve the quality of the decompressed video.
    Type: Application
    Filed: June 18, 2021
    Publication date: October 7, 2021
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
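The residual-video-data abstract above can be sketched end to end. The "codec" here is a deliberately crude quantizer standing in for a real lossy video codec, and the helper names are hypothetical; the point is only the round trip: residual = original minus compress/decompress, shrunk for transmission, then added back at the destination.

```python
import numpy as np
import zlib

def lossy_compress(frame, step=16):
    # Stand-in for a lossy codec: coarse quantization discards detail.
    return (frame // step).astype(np.uint8)

def lossy_decompress(q, step=16):
    return q.astype(np.int32) * step + step // 2

def make_residual(frame, step=16):
    # Residual = original frame minus its compress/decompress round trip.
    return frame.astype(np.int32) - lossy_decompress(lossy_compress(frame, step), step)

def encode_residual(res):
    # "Encode, binarize, and compress": pack to bytes, then entropy-code.
    return zlib.compress(res.astype(np.int8).tobytes())

def restore(q, packed_res, shape):
    # At the destination: decode the residual and add it back to the
    # lossily decompressed frame to recover the original quality.
    res = np.frombuffer(zlib.decompress(packed_res), dtype=np.int8).reshape(shape)
    return lossy_decompress(q) + res
```

In this toy setup the restored frame matches the original exactly; a real codec's residual would only narrow, not close, the quality gap.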
  • Patent number: 11132543
    Abstract: A method, computer readable medium, and system are disclosed for performing unconstrained appearance-based gaze estimation. The method includes the steps of identifying an image of an eye and a head orientation associated with the image of the eye, determining an orientation for the eye by analyzing, within a convolutional neural network (CNN), the image of the eye and the head orientation associated with the image of the eye, and returning the orientation of the eye.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: September 28, 2021
    Assignee: NVIDIA Corporation
    Inventors: Rajeev Ranjan, Shalini De Mello, Jan Kautz
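The gaze-estimation abstract above feeds both an eye image and a head orientation into a CNN that returns the eye's orientation. A convolution-free sketch of that fusion is below; the input shapes, layer sizes, and tanh nonlinearity are assumptions, and the linear projections stand in for the CNN trunk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a 36x60 grayscale eye crop and a 2-D head
# orientation (yaw, pitch); the patent does not fix these dimensions.
params = {
    "W_img": rng.standard_normal((36 * 60, 32)) * 0.01,
    "W_out": rng.standard_normal((32 + 2, 3)) * 0.01,
}

def gaze_forward(eye_image, head_pose, params):
    # Stand-in for the CNN trunk: project the eye image to a feature vector.
    feat = np.tanh(eye_image.ravel() @ params["W_img"])
    # Fuse with the head orientation, regress a 3D gaze vector,
    # and normalize it to a unit direction.
    gaze = np.concatenate([feat, head_pose]) @ params["W_out"]
    return gaze / np.linalg.norm(gaze)

gaze = gaze_forward(rng.random((36, 60)), np.array([0.1, -0.2]), params)
```

The output is a unit 3-vector, which is the usual representation for a gaze direction.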
  • Publication number: 20210287430
    Abstract: Apparatuses, systems, and techniques to identify a shape or camera pose of a three-dimensional object from a two-dimensional image of the object. In at least one embodiment, objects are identified in an image using one or more neural networks that have been trained on objects of a similar category and a three-dimensional mesh template.
    Type: Application
    Filed: April 15, 2020
    Publication date: September 16, 2021
    Inventors: Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Jan Kautz
  • Publication number: 20210271977
    Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
    Type: Application
    Filed: May 19, 2021
    Publication date: September 2, 2021
    Inventors: Xiaodong Yang, Pavlo Molchanov, Jan Kautz
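The weight transformation described in the abstract above can be sketched directly: copy the trained feedforward weights into the input-to-hidden weights of a recurrent layer and set the hidden-to-hidden weights to initial values. The Elman-style cell and tanh nonlinearity are assumptions for illustration.

```python
import numpy as np

def convert_to_recurrent(W_ff, b_ff):
    # The trained feedforward weights become the input-to-hidden
    # weights of an Elman-style recurrent layer.
    W_ih = W_ff.copy()
    n_hidden = W_ff.shape[1]
    # Hidden-to-hidden weights start at initial values; zeros make the
    # first time step reproduce the original layer exactly.
    W_hh = np.zeros((n_hidden, n_hidden))
    return W_ih, W_hh, b_ff.copy()

def rnn_step(x, h_prev, W_ih, W_hh, b):
    return np.tanh(x @ W_ih + h_prev @ W_hh + b)
```

Starting from a zero hidden state, the first step of the converted layer matches the original feedforward layer, so pretrained features are preserved while temporal modeling is trained on top.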
  • Publication number: 20210248772
    Abstract: Learning to estimate a 3D body pose, and likewise the pose of any type of object, from a single 2D image is of great interest for many practical graphics applications and generally relies on neural networks that have been trained with sample data that annotates (labels) each sample 2D image with a known 3D pose. Requiring this labeled training data, however, has various drawbacks, including, for example, that traditionally used training data sets lack diversity and therefore limit the extent to which neural networks are able to estimate 3D pose. Expanding these training data sets is also difficult, since it requires manually provided annotations for 2D images, which is time consuming and prone to errors. The present disclosure overcomes these and other limitations of existing techniques by providing a model that is trained from unlabeled multi-view data for use in 3D pose estimation.
    Type: Application
    Filed: June 9, 2020
    Publication date: August 12, 2021
    Inventors: Umar Iqbal, Pavlo Molchanov, Jan Kautz
  • Publication number: 20210241489
    Abstract: Iterative prediction systems and methods for the task of action detection process an input sequence of video frames to generate both action tubes and respective action labels, where the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor handles large offsets between the bounding boxes and the ground truth.
    Type: Application
    Filed: April 22, 2021
    Publication date: August 5, 2021
    Inventors: Xiaodong Yang, Ming-Yu Liu, Jan Kautz, Fanyi Xiao, Xitong Yang
  • Patent number: 11082720
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data, which describes the information lost when original video data is compressed. For example, the original video data may be compressed and then decompressed, and the result compared to the original video data to determine the residual video data. The residual video data is transformed into a smaller format by encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and used during decompression of the compressed original video data to improve the quality of the decompressed video.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: August 3, 2021
    Assignee: NVIDIA Corporation
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20210233273
    Abstract: Apparatuses, systems, and techniques that determine the pose of a human hand from a 2-D image are described herein. In at least one embodiment, training of a neural network is augmented using weakly labeled or unlabeled pose data, with losses based on a human hand model.
    Type: Application
    Filed: January 24, 2020
    Publication date: July 29, 2021
    Inventors: Adrian Spurr, Pavlo Molchanov, Umar Iqbal, Jan Kautz
  • Patent number: 11049018
    Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
    Type: Grant
    Filed: January 25, 2018
    Date of Patent: June 29, 2021
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Pavlo Molchanov, Jan Kautz
  • Patent number: 11037051
    Abstract: Planar regions in three-dimensional scenes offer important geometric cues in a variety of three-dimensional perception tasks such as scene understanding, scene reconstruction, and robot navigation. Image analysis to detect planar regions can be performed by a deep learning architecture that includes a number of neural networks configured to estimate parameters for the planar regions. The neural networks process an image to detect an arbitrary number of plane objects in the image. Each plane object is associated with a number of estimated parameters including bounding box parameters, plane normal parameters, and a segmentation mask. Global parameters for the image, including a depth map, can also be estimated by one of the neural networks. Then, a segmentation refinement network jointly optimizes (i.e., refines) the segmentation masks for each instance of the plane objects and combines the refined segmentation masks to generate an aggregate segmentation mask for the image.
    Type: Grant
    Filed: September 10, 2019
    Date of Patent: June 15, 2021
    Assignee: NVIDIA Corporation
    Inventors: Kihwan Kim, Jinwei Gu, Chen Liu, Jan Kautz
  • Patent number: 11017556
    Abstract: Iterative prediction systems and methods for the task of action detection process an input sequence of video frames to generate both action tubes and respective action labels, where the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor handles large offsets between the bounding boxes and the ground truth.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: May 25, 2021
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Xitong Yang, Fanyi Xiao, Ming-Yu Liu, Jan Kautz
  • Publication number: 20210150757
    Abstract: Apparatuses, systems, and techniques to identify orientations of objects within images. In at least one embodiment, one or more neural networks are trained to identify the orientation of one or more objects based, at least in part, on one or more characteristics of the object other than its orientation.
    Type: Application
    Filed: November 20, 2019
    Publication date: May 20, 2021
    Inventors: Siva Karthik Mustikovela, Varun Jampani, Shalini De Mello, Sifei Liu, Umar Iqbal, Jan Kautz
  • Publication number: 20210150736
    Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
    Type: Application
    Filed: January 22, 2021
    Publication date: May 20, 2021
    Inventors: Zhaoyang Lv, Kihwan Kim, Deqing Sun, Alejandro Jose Troccoli, Jan Kautz
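The decomposition in the abstract above, separating motion in 3D space from motion induced by the camera, can be sketched with a rigid ego-motion transform and a dynamic/static mask. Function and parameter names are hypothetical; the point is subtracting the camera-induced component from the total observed motion.

```python
import numpy as np

def dynamic_scene_flow(points_t0, points_t1, R, t, dynamic_mask):
    # Total observed 3D motion of each point between the two frames.
    total = points_t1 - points_t0
    # Motion that camera ego-motion (R, t) alone would induce on a
    # static point.
    ego = (points_t0 @ R.T + t) - points_t0
    # The non-rigid 3D motion field: total motion with the
    # camera-induced component removed, kept only where the predicted
    # mask marks the scene as dynamic.
    return (total - ego) * dynamic_mask[:, None]
```

Static points then yield zero flow regardless of how the camera moved, which is exactly the separation the abstract describes.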
  • Publication number: 20210142177
    Abstract: Apparatuses, systems, and techniques are presented to generate data useful for further training of a neural network. In at least one embodiment, one or more neural networks can be re-trained based, at least in part, on data generated by the one or more neural networks including data used to previously train the one or more neural networks.
    Type: Application
    Filed: November 13, 2019
    Publication date: May 13, 2021
    Inventors: Arun Mallya, Jan Kautz, Zhizhong Li, Pavlo Molchanov, Hongxu Danny Yin
  • Publication number: 20210133990
    Abstract: Apparatuses, systems, and techniques to generate a 3D model of an object. In at least one embodiment, a 3D model of an object is generated by one or more neural networks, based on a plurality of images of the object.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 6, 2021
    Inventors: Benjamin David Eckart, Wentao Yuan, Varun Jampani, Kihwan Kim, Jan Kautz
  • Publication number: 20210117661
    Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
    Type: Application
    Filed: December 28, 2020
    Publication date: April 22, 2021
    Inventors: Umar Iqbal, Pavlo Molchanov, Thomas Michael Breuel, Jan Kautz
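Once the network in the abstract above has produced a depth value per keypoint, recovering the 3D pose is a standard pinhole back-projection. This sketch assumes known camera intrinsics (fx, fy, cx, cy); the function name is hypothetical.

```python
import numpy as np

def lift_keypoints(kp_2d, depth, fx, fy, cx, cy):
    # Back-project each (x, y) pixel keypoint, paired with its
    # predicted depth z, through a pinhole camera model to 3D.
    x = (kp_2d[:, 0] - cx) * depth / fx
    y = (kp_2d[:, 1] - cy) * depth / fy
    return np.stack([x, y, depth], axis=1)
```

Projecting known 3D points to pixels and lifting them back with their true depths recovers the originals, which is the invariant this step relies on.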
  • Patent number: 10984286
    Abstract: A style transfer neural network may be used to generate stylized synthetic images, where real images provide the style (e.g., seasons, weather, lighting) for transfer to synthetic images. The stylized synthetic images may then be used to train a recognition neural network. In turn, the trained neural network may be used to predict semantic labels for the real images, providing recognition data for the real images. Finally, the real training dataset (real images and predicted recognition data) and the synthetic training dataset are used by the style transfer neural network to generate stylized synthetic images. The training of the neural network, prediction of recognition data for the real images, and stylizing of the synthetic images may be repeated for a number of iterations. The stylization operation more closely aligns a covariate of the synthetic images to the covariate of the real images, improving accuracy of the recognition neural network.
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: April 20, 2021
    Assignee: NVIDIA Corporation
    Inventors: Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz
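The iterative pipeline in the abstract above (stylize synthetic images, train the recognizer, predict labels for real images, repeat) can be sketched as a loop. The stubs below are deliberately trivial stand-ins, and all names are hypothetical; only the loop structure reflects the abstract.

```python
class StubRecognizer:
    # Minimal stand-in for the recognition network.
    def fit(self, images, labels):
        self.labels = list(labels)

    def predict(self, images):
        return [self.labels[i % len(self.labels)] for i in range(len(images))]

def stub_style_transfer(synthetic_images, real_images, real_labels):
    # Stand-in: a real system would align covariates (season, weather,
    # lighting) of the synthetic images to the real ones, optionally
    # guided by the predicted labels for the real images.
    return synthetic_images

def iterative_training(real_images, synthetic_images, synthetic_labels,
                       style_transfer, recognizer, n_iters=3):
    real_labels = None
    for _ in range(n_iters):
        # 1) Transfer the style of real images onto the synthetic set.
        stylized = style_transfer(synthetic_images, real_images, real_labels)
        # 2) Train the recognition network on stylized synthetic data.
        recognizer.fit(stylized, synthetic_labels)
        # 3) Predict labels for the real images; these feed the next
        #    round of style transfer.
        real_labels = recognizer.predict(real_images)
    return real_labels
```

Each round tightens the covariate alignment between synthetic and real data, which is what improves the recognizer's accuracy on real images.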
  • Patent number: 10964061
    Abstract: A deep neural network (DNN) system learns a map representation for estimating a camera position and orientation (pose). The DNN is trained to learn a map representation corresponding to the environment, defining positions and attributes of structures, trees, walls, vehicles, etc. The DNN system learns a map representation that is versatile and performs well for many different environments (indoor, outdoor, natural, synthetic, etc.). The DNN system receives images of an environment captured by a camera (observations) and outputs an estimated camera pose within the environment. The estimated camera pose is used to perform camera localization, i.e., recover the three-dimensional (3D) position and orientation of a moving camera, which is a fundamental task in computer vision with a wide variety of applications in robot navigation, car localization for autonomous driving, device localization for mobile navigation, and augmented/virtual reality.
    Type: Grant
    Filed: May 12, 2020
    Date of Patent: March 30, 2021
    Assignee: NVIDIA Corporation
    Inventors: Jinwei Gu, Samarth Manoj Brahmbhatt, Kihwan Kim, Jan Kautz
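Camera-pose regression systems like the one in the abstract above are commonly trained with a position term plus a weighted rotation term on unit quaternions. The loss below is a generic sketch of that objective, not the patented formulation; the weight beta and the quaternion representation are assumptions.

```python
import numpy as np

def pose_loss(pred_t, pred_q, gt_t, gt_q, beta=3.0):
    # Position error plus weighted rotation error on unit quaternions;
    # the sign flip handles the q / -q double-cover ambiguity.
    pred_q = pred_q / np.linalg.norm(pred_q)
    gt_q = gt_q / np.linalg.norm(gt_q)
    if pred_q @ gt_q < 0:
        gt_q = -gt_q
    return np.linalg.norm(pred_t - gt_t) + beta * np.linalg.norm(pred_q - gt_q)
```

Because q and -q encode the same rotation, the sign flip ensures an exact prediction scores zero loss in either representation.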