Patents by Inventor Pavlo Molchanov

Pavlo Molchanov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240127067
    Abstract: Systems and methods are disclosed for improving natural robustness of sparse neural networks. Pruning a dense neural network may improve inference speed and reduce the memory footprint and energy consumption of the resulting sparse neural network while maintaining a desired level of accuracy. In real-world scenarios in which sparse neural networks deployed in autonomous vehicles perform tasks such as object detection and classification for acquired inputs (images), the neural networks need to be robust to new environments, weather conditions, camera effects, etc. Applying sharpness-aware minimization (SAM) optimization during training of the sparse neural network improves performance for out-of-distribution (OOD) images compared with using conventional stochastic gradient descent (SGD) optimization. SAM optimizes a neural network to find a flat minimum: a point that not only has a small loss value but also lies within a region of low loss.
    Type: Application
    Filed: August 31, 2023
    Publication date: April 18, 2024
    Inventors: Annamarie Bair, Hongxu Yin, Pavlo Molchanov, Maying Shen, Jose Manuel Alvarez Lopez
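    To make the optimization idea above concrete, here is a minimal PyTorch sketch of a SAM-style training step: gradients are computed at the current weights, the weights are perturbed toward the worst-case point in a small neighborhood, and the update is applied using gradients taken at that perturbed point. The function name, the rho radius, and the surrounding training setup are illustrative assumptions, not details from the patent.

    ```python
    import torch

    def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
        """One sharpness-aware minimization (SAM) step (illustrative sketch)."""
        # First pass: gradients at the current weights.
        loss = loss_fn(model(x), y)
        loss.backward()

        params = [p for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([p.grad.norm(p=2) for p in params]), p=2)
        scale = rho / (grad_norm + 1e-12)

        # Climb to the worst-case nearby point w + e(w), with e(w) = rho * g / ||g||.
        perturbations = []
        with torch.no_grad():
            for p in params:
                e_w = p.grad * scale
                p.add_(e_w)
                perturbations.append(e_w)
        model.zero_grad()

        # Second pass: gradients at the perturbed weights.
        loss_fn(model(x), y).backward()

        # Restore the original weights, then step with the SAM gradient.
        with torch.no_grad():
            for p, e_w in zip(params, perturbations):
                p.sub_(e_w)
        base_optimizer.step()
        base_optimizer.zero_grad()
        return loss.item()
    ```

    Using this step in place of a plain SGD update during sparse-network training is the kind of substitution the abstract compares against.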
  • Publication number: 20240119291
    Abstract: Machine learning is a process that learns a neural network model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the size, computation, and latency of a neural network model, a compression technique can be employed which includes model sparsification. To avoid the negative consequences both of pruning a fully pretrained neural network model and of training a sparse model from the start without any recovery option, the present disclosure provides a dynamic neural network model sparsification process which allows previously pruned parts to be recovered, improving the quality of the sparse neural network model.
    Type: Application
    Filed: May 30, 2023
    Publication date: April 11, 2024
    Inventors: Jose M. Alvarez Lopez, Pavlo Molchanov, Hongxu Yin, Maying Shen, Lei Mao, Xinglong Sun
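    The recoverable-sparsification idea can be illustrated with a simple prune-and-regrow mask update in PyTorch: weights with the smallest magnitudes are masked out, but a fraction of previously pruned weights with large gradients is re-enabled, so earlier pruning decisions are not final. The scoring rules, fractions, and function names below are assumptions for illustration, not the claimed procedure.

    ```python
    import torch

    def update_sparsity_masks(model, sparsity=0.9, regrow_fraction=0.1):
        """Illustrative prune-and-regrow mask update (not the patented method)."""
        masks = {}
        for name, p in model.named_parameters():
            if p.dim() < 2:          # skip biases / normalization parameters
                continue
            k = int(p.numel() * (1.0 - sparsity))            # weights to keep
            keep_by_magnitude = torch.topk(p.abs().flatten(), k).indices

            mask = torch.zeros(p.numel(), dtype=torch.bool, device=p.device)
            mask[keep_by_magnitude] = True

            if p.grad is not None:
                # Regrow: re-enable some pruned weights with large gradients,
                # giving previously pruned parts a path back into the model.
                pruned_grads = p.grad.abs().flatten().masked_fill(mask, 0.0)
                n_regrow = int(k * regrow_fraction)
                if n_regrow > 0:
                    regrow = torch.topk(pruned_grads, n_regrow).indices
                    mask[regrow] = True

            masks[name] = mask.view_as(p)
            with torch.no_grad():
                p.mul_(masks[name])  # zero out currently pruned weights
        return masks
    ```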
  • Publication number: 20240119361
    Abstract: One embodiment of a method for training a first machine learning model having a different architecture than a second machine learning model includes receiving a first data set, performing one or more operations to generate a second data set based on the first data set and the second machine learning model, wherein the second data set includes at least one feature associated with one or more tasks that the second machine learning model was previously trained to perform, and performing one or more operations to train the first machine learning model based on the second data set and the second machine learning model.
    Type: Application
    Filed: July 6, 2023
    Publication date: April 11, 2024
    Inventors: Hongxu YIN, Wonmin BYEON, Jan KAUTZ, Divyam MADAAN, Pavlo MOLCHANOV
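    A conventional knowledge-distillation step gives a rough picture of training one architecture from another model's outputs, which is the general setting of this abstract. The temperature, the KL loss, and the use of plain teacher logits (rather than the generated second data set described above) are simplifying assumptions.

    ```python
    import torch
    import torch.nn.functional as F

    def distillation_step(student, teacher, x, optimizer, temperature=4.0):
        """Simplified cross-architecture distillation step (illustration only)."""
        teacher.eval()
        with torch.no_grad():
            teacher_logits = teacher(x)      # soft targets from the second model

        student_logits = student(x)
        loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()
    ```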
  • Publication number: 20240096115
    Abstract: Landmark detection refers to the detection of landmarks within an image or a video, and is used in many computer vision tasks such as emotion recognition, face identity verification, hand tracking, gesture recognition, and eye gaze tracking. Current landmark detection methods rely on a cascaded computation through cascaded networks or an ensemble of multiple models, which starts with an initial guess of the landmarks and iteratively produces corrected landmarks which match the input more finely. However, the iterations required by current methods typically increase the training memory cost linearly, and do not have an obvious stopping criterion. Moreover, these methods tend to exhibit jitter in landmark detection results for video. The present disclosure improves current landmark detection methods by providing landmark detection using an iterative neural network.
    Type: Application
    Filed: September 7, 2023
    Publication date: March 21, 2024
    Inventors: Pavlo Molchanov, Jan Kautz, Arash Vahdat, Hongxu Yin, Paul Micaelli
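    The following toy module shows the general shape of iterative landmark refinement: a single shared network repeatedly predicts a correction to the current landmark estimate. It is a simplified illustration only and does not implement the memory-efficient training or stopping behavior the abstract targets; all dimensions are placeholders.

    ```python
    import torch
    import torch.nn as nn

    class IterativeLandmarkRefiner(nn.Module):
        """Toy iterative refinement loop for 2D landmarks (illustrative only)."""

        def __init__(self, num_landmarks=68, feat_dim=128):
            super().__init__()
            self.refiner = nn.Sequential(
                nn.Linear(feat_dim + 2 * num_landmarks, 256),
                nn.ReLU(),
                nn.Linear(256, 2 * num_landmarks),
            )
            self.num_landmarks = num_landmarks

        def forward(self, image_features, init_landmarks, num_iters=4):
            # image_features: (B, feat_dim), init_landmarks: (B, L, 2)
            landmarks = init_landmarks
            for _ in range(num_iters):
                flat = landmarks.flatten(start_dim=1)
                # Predict a correction from the image features and current guess.
                delta = self.refiner(torch.cat([image_features, flat], dim=1))
                landmarks = landmarks + delta.view(-1, self.num_landmarks, 2)
            return landmarks
    ```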
  • Patent number: 11934955
    Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
    Type: Grant
    Filed: October 31, 2022
    Date of Patent: March 19, 2024
    Assignee: NVIDIA Corporation
    Inventors: Nuri Murat Arar, Niranjan Avadhanam, Nishant Puri, Shagan Sah, Rajath Shetty, Sujay Yadawadkar, Pavlo Molchanov
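    A rough sketch of the two-stage pipeline described above: one head predicts landmarks plus per-landmark confidences, and a second network consumes both to produce a facial analysis quantity (here a 2D gaze direction). Layer sizes, the gaze output, and the feature input are illustrative assumptions, not details from the patent.

    ```python
    import torch
    import torch.nn as nn

    class LandmarkConfidenceHead(nn.Module):
        """Illustrative landmarks-plus-confidence pipeline (placeholder sizes)."""

        def __init__(self, feat_dim=256, num_landmarks=68):
            super().__init__()
            self.landmarks = nn.Linear(feat_dim, 2 * num_landmarks)
            self.confidence = nn.Sequential(
                nn.Linear(feat_dim, num_landmarks), nn.Sigmoid()
            )
            # Downstream model consumes (x, y, confidence) per landmark.
            self.gaze = nn.Sequential(
                nn.Linear(3 * num_landmarks, 128), nn.ReLU(), nn.Linear(128, 2)
            )
            self.num_landmarks = num_landmarks

        def forward(self, face_features):
            # face_features: (B, feat_dim) from an upstream face encoder.
            pts = self.landmarks(face_features).view(-1, self.num_landmarks, 2)
            conf = self.confidence(face_features).unsqueeze(-1)
            combined = torch.cat([pts, conf], dim=-1).flatten(start_dim=1)
            return pts, conf, self.gaze(combined)  # landmarks, confidences, gaze
    ```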
  • Publication number: 20240070874
    Abstract: Estimating motion of a human or other object in video is a common computer vision task with applications in robotics, sports, mixed reality, etc. However, motion estimation becomes difficult when the camera capturing the video is moving, because the observed object and camera motions are entangled. The present disclosure provides for joint estimation of the motion of a camera and the motion of articulated objects captured in video by the camera.
    Type: Application
    Filed: April 17, 2023
    Publication date: February 29, 2024
    Inventors: Muhammed Kocabas, Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz
  • Publication number: 20230394781
    Abstract: Vision transformers are deep learning models that employ a self-attention mechanism to obtain feature representations for an input image. To date, the configuration of vision transformers has limited the self-attention computation to a local window of the input image, such that only short-range dependencies are modeled in the output. The present disclosure provides a vision transformer that captures global context, and that is therefore able to model long-range dependencies in its output.
    Type: Application
    Filed: December 16, 2022
    Publication date: December 7, 2023
    Applicant: NVIDIA Corporation
    Inventors: Ali Hatamizadeh, Hongxu Yin, Jan Kautz, Pavlo Molchanov
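    Generic attention code can illustrate the contrast the abstract draws between window-limited and global self-attention over patch tokens. This is not the specific architecture of the publication; the module name and dimensions are placeholders.

    ```python
    import torch
    import torch.nn as nn

    class GlobalContextAttention(nn.Module):
        """Sketch contrasting windowed and global self-attention over patch tokens."""

        def __init__(self, dim=192, num_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, tokens, window=None):
            # tokens: (B, N, dim) patch embeddings.
            if window is None:
                # Global attention: every token attends to every other token,
                # so long-range dependencies can be modeled directly.
                out, _ = self.attn(tokens, tokens, tokens)
                return out
            # Windowed attention: tokens only attend within fixed-size chunks,
            # which restricts the model to short-range dependencies.
            chunks = tokens.split(window, dim=1)
            return torch.cat([self.attn(c, c, c)[0] for c in chunks], dim=1)
    ```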
  • Publication number: 20230368501
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation.
    Type: Application
    Filed: February 24, 2023
    Publication date: November 16, 2023
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
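    A minimal sketch of latent-space augmentation along the lines described above: encode an image, rotate its embedding by several angles, and decode each rotated embedding into an additional training image. The choice of rotation plane (the first two latent coordinates) and the encoder/decoder interfaces are assumptions for illustration.

    ```python
    import torch

    def augment_by_latent_rotation(encoder, decoder, image, angles):
        """Illustrative latent-space rotation augmentation (placeholder interfaces)."""
        with torch.no_grad():
            z = encoder(image)                     # (B, latent_dim), latent_dim >= 2
            augmented = []
            for angle in angles:
                theta = torch.tensor(float(angle))
                cos, sin = torch.cos(theta), torch.sin(theta)
                z_rot = z.clone()
                # Rotate the first two latent coordinates by `angle` radians.
                z_rot[:, 0] = cos * z[:, 0] - sin * z[:, 1]
                z_rot[:, 1] = sin * z[:, 0] + cos * z[:, 1]
                augmented.append(decoder(z_rot))   # decode to a new image
            return torch.stack(augmented, dim=1)   # (B, num_angles, ...)
    ```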
  • Patent number: 11748887
    Abstract: Systems and methods to detect one or more segments of one or more objects within one or more images based, at least in part, on a neural network trained in an unsupervised manner to infer the one or more segments. Systems and methods to help train one or more neural networks to detect one or more segments of one or more objects within one or more images in an unsupervised manner.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: September 5, 2023
    Assignee: NVIDIA Corporation
    Inventors: Varun Jampani, Wei-Chih Hung, Sifei Liu, Pavlo Molchanov, Jan Kautz
  • Publication number: 20230186077
    Abstract: One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes computing a first set of halting scores for a first set of tokens that has been input into a first layer of the transformer neural network. The technique also includes determining that a first halting score included in the first set of halting scores exceeds a threshold value. The technique further includes in response to the first halting score exceeding the threshold value, causing a first token that is included in the first set of tokens and is associated with the first halting score not to be processed by one or more layers within the transformer neural network that are subsequent to the first layer.
    Type: Application
    Filed: June 15, 2022
    Publication date: June 15, 2023
    Inventors: Hongxu YIN, Jan KAUTZ, Jose Manuel ALVAREZ LOPEZ, Arun MALLYA, Pavlo MOLCHANOV, Arash VAHDAT
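    The halting mechanism can be sketched directly from the abstract: each layer adds to a per-token halting score, and a token whose cumulative score exceeds a threshold is frozen and skipped by subsequent layers. The layer type, halting head, and threshold below are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn

    class TokenHaltingTransformer(nn.Module):
        """Sketch of per-token halting in a transformer (illustrative only)."""

        def __init__(self, dim=192, depth=6, num_heads=4, threshold=0.99):
            super().__init__()
            self.layers = nn.ModuleList(
                [nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
                 for _ in range(depth)]
            )
            self.halt_head = nn.Linear(dim, 1)
            self.threshold = threshold

        def forward(self, tokens):
            # tokens: (B, N, dim)
            cumulative = torch.zeros(tokens.shape[:2], device=tokens.device)
            active = torch.ones_like(cumulative, dtype=torch.bool)
            for layer in self.layers:
                updated = layer(tokens)
                # Only still-active tokens are refreshed by this layer.
                tokens = torch.where(active.unsqueeze(-1), updated, tokens)
                # Accumulate halting scores for active tokens.
                cumulative = cumulative + torch.sigmoid(
                    self.halt_head(tokens)
                ).squeeze(-1) * active
                # Tokens whose score exceeds the threshold halt here.
                active = active & (cumulative < self.threshold)
            return tokens
    ```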
  • Patent number: 11645530
    Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
    Type: Grant
    Filed: May 19, 2021
    Date of Patent: May 9, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Pavlo Molchanov, Jan Kautz
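    A minimal sketch of the weight transformation described above, assuming the non-recurrent layer is a fully connected nn.Linear layer: its feedforward weights become the input-to-hidden weights of a recurrent layer, and the hidden-to-hidden weights start from an initial (here zero) value so the converted layer initially behaves like the original.

    ```python
    import torch
    import torch.nn as nn

    def linear_to_recurrent(linear: nn.Linear) -> nn.RNN:
        """Convert a trained fully connected layer into a recurrent layer (sketch)."""
        rnn = nn.RNN(
            input_size=linear.in_features,
            hidden_size=linear.out_features,
            num_layers=1,
            nonlinearity="relu",     # assumes the original layer used ReLU
            batch_first=True,
        )
        with torch.no_grad():
            rnn.weight_ih_l0.copy_(linear.weight)    # feedforward -> input-to-hidden
            if linear.bias is not None:
                rnn.bias_ih_l0.copy_(linear.bias)
            rnn.weight_hh_l0.zero_()                 # hidden-to-hidden initial values
            rnn.bias_hh_l0.zero_()
        return rnn
    ```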
  • Publication number: 20230080247
    Abstract: A vision transformer is a deep learning model used to perform vision processing tasks such as image recognition. Vision transformers are currently designed with a plurality of same-size blocks that perform the vision processing tasks. However, some portions of these blocks are unnecessary and not only slow down the vision transformer but also use more memory than required. In response, parameters of these blocks are analyzed to determine a score for each parameter, and if the score falls below a threshold, the parameter is removed from the associated block. This reduces the size of the resulting vision transformer, which reduces unnecessary memory usage and increases performance.
    Type: Application
    Filed: December 14, 2021
    Publication date: March 16, 2023
    Inventors: Hongxu Yin, Huanrui Yang, Pavlo Molchanov, Jan Kautz
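    The score-and-threshold pruning loop described above can be sketched as follows. The first-order Taylor score |weight x gradient| and the global threshold are illustrative assumptions rather than the patented criterion.

    ```python
    import torch

    def prune_by_score(model, threshold):
        """Illustrative score-based pruning: zero parameters scoring below a threshold."""
        removed, total = 0, 0
        with torch.no_grad():
            for p in model.parameters():
                if p.grad is None or p.dim() < 2:
                    continue
                # Placeholder importance score: first-order Taylor estimate.
                score = (p * p.grad).abs()
                mask = score >= threshold
                p.mul_(mask)                       # remove low-scoring parameters
                removed += (~mask).sum().item()
                total += p.numel()
        return removed / max(total, 1)             # fraction of parameters removed
    ```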
  • Publication number: 20230078171
    Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
    Type: Application
    Filed: October 31, 2022
    Publication date: March 16, 2023
    Inventors: Nuri Murat Arar, Niranjan Avadhanam, Nishant Puri, Shagan Sah, Rajath Shetty, Sujay Yadawadkar, Pavlo Molchanov
  • Publication number: 20230077258
    Abstract: Apparatuses, systems, and techniques are presented to simplify neural networks. In at least one embodiment, one or more portions of one or more neural networks are caused to be removed based, at least in part, on one or more performance metrics of the one or more neural networks.
    Type: Application
    Filed: August 10, 2021
    Publication date: March 9, 2023
    Inventors: Maying Shen, Pavlo Molchanov, Hongxu Yin, Lei Mao, Jianna Liu, Jose Manuel Alvarez Lopez
  • Publication number: 20230070514
    Abstract: In order to determine accurate three-dimensional (3D) models for objects within a video, the objects are first identified and tracked within the video, and a pose and shape are estimated for these tracked objects. A translation and global orientation are removed from the tracked objects to determine local motion for the objects, and motion infilling is performed to fill in any missing portions of the object's motion within the video. A global trajectory is then determined for the objects within the video, and the infilled motion and global trajectory are then used to determine infilled global motion for the object within the video. This enables each object to be accurately depicted as a 3D pose sequence that accounts for occlusions and global factors within the video.
    Type: Application
    Filed: January 25, 2022
    Publication date: March 9, 2023
    Inventors: Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz
  • Patent number: 11593661
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: February 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
  • Patent number: 11488418
    Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
    Type: Grant
    Filed: December 28, 2020
    Date of Patent: November 1, 2022
    Assignee: NVIDIA Corporation
    Inventors: Umar Iqbal, Pavlo Molchanov, Thomas Michael Breuel, Jan Kautz
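    A rough sketch of a 2.5D keypoint head in the spirit of this abstract: from image features it predicts 2D keypoint locations plus a depth value per keypoint relative to a root joint, which together give a 3D pose estimate. The backbone, dimensions, and root-relative convention are assumptions for illustration.

    ```python
    import torch
    import torch.nn as nn

    class KeypointDepthHead(nn.Module):
        """Sketch of a 2.5D keypoint head producing (x, y, relative depth) per keypoint."""

        def __init__(self, feat_dim=512, num_keypoints=21):
            super().__init__()
            self.xy = nn.Linear(feat_dim, 2 * num_keypoints)    # pixel coordinates
            self.depth = nn.Linear(feat_dim, num_keypoints)     # per-keypoint depth
            self.num_keypoints = num_keypoints

        def forward(self, features):
            # features: (B, feat_dim) from a monocular image encoder.
            xy = self.xy(features).view(-1, self.num_keypoints, 2)
            z = self.depth(features).unsqueeze(-1)
            # Root-relative depth: subtract the first keypoint's depth so the
            # prediction does not depend on absolute distance to the camera.
            z = z - z[:, :1, :]
            return torch.cat([xy, z], dim=-1)                   # (B, K, 3) keypoints
    ```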
  • Patent number: 11487968
    Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: November 1, 2022
    Assignee: NVIDIA Corporation
    Inventors: Nuri Murat Arar, Niranjan Avadhanam, Nishant Puri, Shagan Sah, Rajath Shetty, Sujay Yadawadkar, Pavlo Molchanov
  • Publication number: 20220292360
    Abstract: Apparatuses, systems, and techniques to remove one or more nodes of a neural network. In at least one embodiment, one or more nodes of a neural network are removed, based on, for example, whether the one or more nodes are likely to affect performance of the neural network.
    Type: Application
    Filed: March 15, 2021
    Publication date: September 15, 2022
    Inventors: Maying Shen, Pavlo Molchanov, Hongxu Yin, Jose Manuel Alvarez Lopez
  • Publication number: 20220284283
    Abstract: Apparatuses, systems, and techniques to invert a neural network. In at least one embodiment, one or more neural network layers are inverted and, in at least one embodiment, loaded in reverse order.
    Type: Application
    Filed: March 8, 2021
    Publication date: September 8, 2022
    Inventors: Hongxu Yin, Pavlo Molchanov, Jose Manuel Alvarez Lopez, Xin Dong