Patents by Inventor Pavlo Molchanov

Pavlo Molchanov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240127067
    Abstract: Systems and methods are disclosed for improving natural robustness of sparse neural networks. Pruning a dense neural network may improve inference speed and reduce the memory footprint and energy consumption of the resulting sparse neural network while maintaining a desired level of accuracy. In real-world scenarios in which sparse neural networks deployed in autonomous vehicles perform tasks such as object detection and classification for acquired inputs (images), the neural networks need to be robust to new environments, weather conditions, camera effects, etc. Applying sharpness-aware minimization (SAM) optimization during training of the sparse neural network improves performance for out-of-distribution (OOD) images compared with using conventional stochastic gradient descent (SGD) optimization. SAM optimizes a neural network to find a flat minimum: a point that not only has a small loss value but also lies within a region of low loss.
    Type: Application
    Filed: August 31, 2023
    Publication date: April 18, 2024
    Inventors: Annamarie Bair, Hongxu Yin, Pavlo Molchanov, Maying Shen, Jose Manuel Alvarez Lopez
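    To make the optimization idea above concrete, here is a minimal PyTorch sketch of a SAM-style training step: gradients are computed at the current weights, the weights are perturbed toward the worst-case point in a small neighborhood, and the update is applied using gradients taken at that perturbed point. The function name, the rho radius, and the surrounding training setup are illustrative assumptions, not details from the patent.

    ```python
    import torch

    def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
        """One sharpness-aware minimization (SAM) step (illustrative sketch)."""
        # First pass: gradients at the current weights.
        loss = loss_fn(model(x), y)
        loss.backward()

        params = [p for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([p.grad.norm(p=2) for p in params]), p=2)
        scale = rho / (grad_norm + 1e-12)

        # Climb to the worst-case nearby point w + e(w), with e(w) = rho * g / ||g||.
        perturbations = []
        with torch.no_grad():
            for p in params:
                e_w = p.grad * scale
                p.add_(e_w)
                perturbations.append(e_w)
        model.zero_grad()

        # Second pass: gradients at the perturbed weights.
        loss_fn(model(x), y).backward()

        # Restore the original weights, then step with the SAM gradient.
        with torch.no_grad():
            for p, e_w in zip(params, perturbations):
                p.sub_(e_w)
        base_optimizer.step()
        base_optimizer.zero_grad()
        return loss.item()
    ```

    Using this step in place of a plain SGD update during sparse-network training is the kind of substitution the abstract compares against.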
  • Publication number: 20240119291
    Abstract: Machine learning is a process that learns a neural network model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the size, computation, and latency of a neural network model, a compression technique can be employed which includes model sparsification. To avoid the negative consequences both of pruning a fully pretrained neural network model and of training a sparse model from the start without any recovery option, the present disclosure provides a dynamic neural network model sparsification process which allows previously pruned parts to be recovered, improving the quality of the sparse neural network model.
    Type: Application
    Filed: May 30, 2023
    Publication date: April 11, 2024
    Inventors: Jose M. Alvarez Lopez, Pavlo Molchanov, Hongxu Yin, Maying Shen, Lei Mao, Xinglong Sun
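    The recoverable-sparsification idea can be illustrated with a simple prune-and-regrow mask update in PyTorch: weights with the smallest magnitudes are masked out, but a fraction of previously pruned weights with large gradients is re-enabled, so earlier pruning decisions are not final. The scoring rules, fractions, and function names below are assumptions for illustration, not the claimed procedure.

    ```python
    import torch

    def update_sparsity_masks(model, sparsity=0.9, regrow_fraction=0.1):
        """Illustrative prune-and-regrow mask update (not the patented method)."""
        masks = {}
        for name, p in model.named_parameters():
            if p.dim() < 2:          # skip biases / normalization parameters
                continue
            k = int(p.numel() * (1.0 - sparsity))            # weights to keep
            keep_by_magnitude = torch.topk(p.abs().flatten(), k).indices

            mask = torch.zeros(p.numel(), dtype=torch.bool, device=p.device)
            mask[keep_by_magnitude] = True

            if p.grad is not None:
                # Regrow: re-enable some pruned weights with large gradients,
                # giving previously pruned parts a path back into the model.
                pruned_grads = p.grad.abs().flatten().masked_fill(mask, 0.0)
                n_regrow = int(k * regrow_fraction)
                if n_regrow > 0:
                    regrow = torch.topk(pruned_grads, n_regrow).indices
                    mask[regrow] = True

            masks[name] = mask.view_as(p)
            with torch.no_grad():
                p.mul_(masks[name])  # zero out currently pruned weights
        return masks
    ```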
  • Publication number: 20240119361
    Abstract: One embodiment of a method for training a first machine learning model having a different architecture than a second machine learning model includes receiving a first data set, performing one or more operations to generate a second data set based on the first data set and the second machine learning model, wherein the second data set includes at least one feature associated with one or more tasks that the second machine learning model was previously trained to perform, and performing one or more operations to train the first machine learning model based on the second data set and the second machine learning model.
    Type: Application
    Filed: July 6, 2023
    Publication date: April 11, 2024
    Inventors: Hongxu YIN, Wonmin BYEON, Jan KAUTZ, Divyam MADAAN, Pavlo MOLCHANOV
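    A conventional knowledge-distillation step gives a rough picture of training one architecture from another model's outputs, which is the general setting of this abstract. The temperature, the KL loss, and the use of plain teacher logits (rather than the generated second data set described above) are simplifying assumptions.

    ```python
    import torch
    import torch.nn.functional as F

    def distillation_step(student, teacher, x, optimizer, temperature=4.0):
        """Simplified cross-architecture distillation step (illustration only)."""
        teacher.eval()
        with torch.no_grad():
            teacher_logits = teacher(x)      # soft targets from the second model

        student_logits = student(x)
        loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()
    ```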
  • Publication number: 20240096115
    Abstract: Landmark detection refers to the detection of landmarks within an image or a video, and is used in many computer vision tasks such as emotion recognition, face identity verification, hand tracking, gesture recognition, and eye gaze tracking. Current landmark detection methods rely on a cascaded computation through cascaded networks or an ensemble of multiple models, which starts with an initial guess of the landmarks and iteratively produces corrected landmarks which match the input more finely. However, the iterations required by current methods typically increase the training memory cost linearly, and do not have an obvious stopping criterion. Moreover, these methods tend to exhibit jitter in landmark detection results for video. The present disclosure improves current landmark detection methods by providing landmark detection using an iterative neural network.
    Type: Application
    Filed: September 7, 2023
    Publication date: March 21, 2024
    Inventors: Pavlo Molchanov, Jan Kautz, Arash Vahdat, Hongxu Yin, Paul Micaelli
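    The following toy module shows the general shape of iterative landmark refinement: a single shared network repeatedly predicts a correction to the current landmark estimate. It is a simplified illustration only and does not implement the memory-efficient training or stopping behavior the abstract targets; all dimensions are placeholders.

    ```python
    import torch
    import torch.nn as nn

    class IterativeLandmarkRefiner(nn.Module):
        """Toy iterative refinement loop for 2D landmarks (illustrative only)."""

        def __init__(self, num_landmarks=68, feat_dim=128):
            super().__init__()
            self.refiner = nn.Sequential(
                nn.Linear(feat_dim + 2 * num_landmarks, 256),
                nn.ReLU(),
                nn.Linear(256, 2 * num_landmarks),
            )
            self.num_landmarks = num_landmarks

        def forward(self, image_features, init_landmarks, num_iters=4):
            # image_features: (B, feat_dim), init_landmarks: (B, L, 2)
            landmarks = init_landmarks
            for _ in range(num_iters):
                flat = landmarks.flatten(start_dim=1)
                # Predict a correction from the image features and current guess.
                delta = self.refiner(torch.cat([image_features, flat], dim=1))
                landmarks = landmarks + delta.view(-1, self.num_landmarks, 2)
            return landmarks
    ```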
  • Patent number: 11934955
    Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
    Type: Grant
    Filed: October 31, 2022
    Date of Patent: March 19, 2024
    Assignee: NVIDIA Corporation
    Inventors: Nuri Murat Arar, Niranjan Avadhanam, Nishant Puri, Shagan Sah, Rajath Shetty, Sujay Yadawadkar, Pavlo Molchanov
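    A rough sketch of the two-stage pipeline described above: one head predicts landmarks plus per-landmark confidences, and a second network consumes both to produce a facial analysis quantity (here a 2D gaze direction). Layer sizes, the gaze output, and the feature input are illustrative assumptions, not details from the patent.

    ```python
    import torch
    import torch.nn as nn

    class LandmarkConfidenceHead(nn.Module):
        """Illustrative landmarks-plus-confidence pipeline (placeholder sizes)."""

        def __init__(self, feat_dim=256, num_landmarks=68):
            super().__init__()
            self.landmarks = nn.Linear(feat_dim, 2 * num_landmarks)
            self.confidence = nn.Sequential(
                nn.Linear(feat_dim, num_landmarks), nn.Sigmoid()
            )
            # Downstream model consumes (x, y, confidence) per landmark.
            self.gaze = nn.Sequential(
                nn.Linear(3 * num_landmarks, 128), nn.ReLU(), nn.Linear(128, 2)
            )
            self.num_landmarks = num_landmarks

        def forward(self, face_features):
            # face_features: (B, feat_dim) from an upstream face encoder.
            pts = self.landmarks(face_features).view(-1, self.num_landmarks, 2)
            conf = self.confidence(face_features).unsqueeze(-1)
            combined = torch.cat([pts, conf], dim=-1).flatten(start_dim=1)
            return pts, conf, self.gaze(combined)  # landmarks, confidences, gaze
    ```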
  • Publication number: 20240070874
    Abstract: Estimating motion of a human or other object in video is a common computer vision task with applications in robotics, sports, mixed reality, etc. However, motion estimation becomes difficult when the camera capturing the video is moving, because the observed object and camera motions are entangled. The present disclosure provides for joint estimation of the motion of a camera and the motion of articulated objects captured in video by the camera.
    Type: Application
    Filed: April 17, 2023
    Publication date: February 29, 2024
    Inventors: Muhammed Kocabas, Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz
  • Publication number: 20230394781
    Abstract: Vision transformers are deep learning models that employ a self-attention mechanism to obtain feature representations for an input image. To date, the configuration of vision transformers has limited the self-attention computation to a local window of the input image, such that only short-range dependencies are modeled in the output. The present disclosure provides a vision transformer that captures global context, and that is therefore able to model long-range dependencies in its output.
    Type: Application
    Filed: December 16, 2022
    Publication date: December 7, 2023
    Applicant: NVIDIA Corporation
    Inventors: Ali Hatamizadeh, Hongxu Yin, Jan Kautz, Pavlo Molchanov
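    Generic attention code can illustrate the contrast the abstract draws between window-limited and global self-attention over patch tokens. This is not the specific architecture of the publication; the module name and dimensions are placeholders.

    ```python
    import torch
    import torch.nn as nn

    class GlobalContextAttention(nn.Module):
        """Sketch contrasting windowed and global self-attention over patch tokens."""

        def __init__(self, dim=192, num_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, tokens, window=None):
            # tokens: (B, N, dim) patch embeddings.
            if window is None:
                # Global attention: every token attends to every other token,
                # so long-range dependencies can be modeled directly.
                out, _ = self.attn(tokens, tokens, tokens)
                return out
            # Windowed attention: tokens only attend within fixed-size chunks,
            # which restricts the model to short-range dependencies.
            chunks = tokens.split(window, dim=1)
            return torch.cat([self.attn(c, c, c)[0] for c in chunks], dim=1)
    ```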
  • Publication number: 20230368501
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation.
    Type: Application
    Filed: February 24, 2023
    Publication date: November 16, 2023
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
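    A minimal sketch of latent-space augmentation along the lines described above: encode an image, rotate its embedding by several angles, and decode each rotated embedding into an additional training image. The choice of rotation plane (the first two latent coordinates) and the encoder/decoder interfaces are assumptions for illustration.

    ```python
    import torch

    def augment_by_latent_rotation(encoder, decoder, image, angles):
        """Illustrative latent-space rotation augmentation (placeholder interfaces)."""
        with torch.no_grad():
            z = encoder(image)                     # (B, latent_dim), latent_dim >= 2
            augmented = []
            for angle in angles:
                theta = torch.tensor(float(angle))
                cos, sin = torch.cos(theta), torch.sin(theta)
                z_rot = z.clone()
                # Rotate the first two latent coordinates by `angle` radians.
                z_rot[:, 0] = cos * z[:, 0] - sin * z[:, 1]
                z_rot[:, 1] = sin * z[:, 0] + cos * z[:, 1]
                augmented.append(decoder(z_rot))   # decode to a new image
            return torch.stack(augmented, dim=1)   # (B, num_angles, ...)
    ```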
  • Patent number: 11748887
    Abstract: Systems and methods to detect one or more segments of one or more objects within one or more images based, at least in part, on a neural network trained in an unsupervised manner to infer the one or more segments. Systems and methods to help train one or more neural networks to detect one or more segments of one or more objects within one or more images in an unsupervised manner.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: September 5, 2023
    Assignee: NVIDIA Corporation
    Inventors: Varun Jampani, Wei-Chih Hung, Sifei Liu, Pavlo Molchanov, Jan Kautz
  • Publication number: 20230186077
    Abstract: One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes computing a first set of halting scores for a first set of tokens that has been input into a first layer of the transformer neural network. The technique also includes determining that a first halting score included in the first set of halting scores exceeds a threshold value. The technique further includes in response to the first halting score exceeding the threshold value, causing a first token that is included in the first set of tokens and is associated with the first halting score not to be processed by one or more layers within the transformer neural network that are subsequent to the first layer.
    Type: Application
    Filed: June 15, 2022
    Publication date: June 15, 2023
    Inventors: Hongxu YIN, Jan KAUTZ, Jose Manuel ALVAREZ LOPEZ, Arun MALLYA, Pavlo MOLCHANOV, Arash VAHDAT
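    The halting mechanism can be sketched directly from the abstract: each layer adds to a per-token halting score, and a token whose cumulative score exceeds a threshold is frozen and skipped by subsequent layers. The layer type, halting head, and threshold below are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn

    class TokenHaltingTransformer(nn.Module):
        """Sketch of per-token halting in a transformer (illustrative only)."""

        def __init__(self, dim=192, depth=6, num_heads=4, threshold=0.99):
            super().__init__()
            self.layers = nn.ModuleList(
                [nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
                 for _ in range(depth)]
            )
            self.halt_head = nn.Linear(dim, 1)
            self.threshold = threshold

        def forward(self, tokens):
            # tokens: (B, N, dim)
            cumulative = torch.zeros(tokens.shape[:2], device=tokens.device)
            active = torch.ones_like(cumulative, dtype=torch.bool)
            for layer in self.layers:
                updated = layer(tokens)
                # Only still-active tokens are refreshed by this layer.
                tokens = torch.where(active.unsqueeze(-1), updated, tokens)
                # Accumulate halting scores for active tokens.
                cumulative = cumulative + torch.sigmoid(
                    self.halt_head(tokens)
                ).squeeze(-1) * active
                # Tokens whose score exceeds the threshold halt here.
                active = active & (cumulative < self.threshold)
            return tokens
    ```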
  • Patent number: 11645530
    Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
    Type: Grant
    Filed: May 19, 2021
    Date of Patent: May 9, 2023
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Pavlo Molchanov, Jan Kautz
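    A minimal sketch of the weight transformation described above, assuming the non-recurrent layer is a fully connected nn.Linear layer: its feedforward weights become the input-to-hidden weights of a recurrent layer, and the hidden-to-hidden weights start from an initial (here zero) value so the converted layer initially behaves like the original.

    ```python
    import torch
    import torch.nn as nn

    def linear_to_recurrent(linear: nn.Linear) -> nn.RNN:
        """Convert a trained fully connected layer into a recurrent layer (sketch)."""
        rnn = nn.RNN(
            input_size=linear.in_features,
            hidden_size=linear.out_features,
            num_layers=1,
            nonlinearity="relu",     # assumes the original layer used ReLU
            batch_first=True,
        )
        with torch.no_grad():
            rnn.weight_ih_l0.copy_(linear.weight)    # feedforward -> input-to-hidden
            if linear.bias is not None:
                rnn.bias_ih_l0.copy_(linear.bias)
            rnn.weight_hh_l0.zero_()                 # hidden-to-hidden initial values
            rnn.bias_hh_l0.zero_()
        return rnn
    ```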
  • Publication number: 20230080247
    Abstract: A vision transformer is a deep learning model used to perform vision processing tasks such as image recognition. Vision transformers are currently designed with a plurality of same-size blocks that perform the vision processing tasks. However, some portions of these blocks are unnecessary and not only slow down the vision transformer but also use more memory than required. In response, parameters of these blocks are analyzed to determine a score for each parameter, and if the score falls below a threshold, the parameter is removed from the associated block. This reduces the size of the resulting vision transformer, which reduces unnecessary memory usage and increases performance.
    Type: Application
    Filed: December 14, 2021
    Publication date: March 16, 2023
    Inventors: Hongxu Yin, Huanrui Yang, Pavlo Molchanov, Jan Kautz
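    The score-and-threshold pruning loop described above can be sketched as follows. The first-order Taylor score |weight x gradient| and the global threshold are illustrative assumptions rather than the patented criterion.

    ```python
    import torch

    def prune_by_score(model, threshold):
        """Illustrative score-based pruning: zero parameters scoring below a threshold."""
        removed, total = 0, 0
        with torch.no_grad():
            for p in model.parameters():
                if p.grad is None or p.dim() < 2:
                    continue
                # Placeholder importance score: first-order Taylor estimate.
                score = (p * p.grad).abs()
                mask = score >= threshold
                p.mul_(mask)                       # remove low-scoring parameters
                removed += (~mask).sum().item()
                total += p.numel()
        return removed / max(total, 1)             # fraction of parameters removed
    ```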
  • Publication number: 20230078171
    Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
    Type: Application
    Filed: October 31, 2022
    Publication date: March 16, 2023
    Inventors: Nuri Murat Arar, Niranjan Avadhanam, Nishant Puri, Shagan Sah, Rajath Shetty, Sujay Yadawadkar, Pavlo Molchanov
  • Publication number: 20230077258
    Abstract: Apparatuses, systems, and techniques are presented to simplify neural networks. In at least one embodiment, one or more portions of one or more neural networks are caused to be removed based, at least in part, on one or more performance metrics of the one or more neural networks.
    Type: Application
    Filed: August 10, 2021
    Publication date: March 9, 2023
    Inventors: Maying Shen, Pavlo Molchanov, Hongxu Yin, Lei Mao, Jianna Liu, Jose Manuel Alvarez Lopez
  • Publication number: 20230070514
    Abstract: In order to determine accurate three-dimensional (3D) models for objects within a video, the objects are first identified and tracked within the video, and a pose and shape are estimated for these tracked objects. A translation and global orientation are removed from the tracked objects to determine local motion for the objects, and motion infilling is performed to fill in any missing portions of the object's motion within the video. A global trajectory is then determined for the objects within the video, and the infilled motion and global trajectory are then used to determine infilled global motion for the object within the video. This enables each object to be accurately depicted as a 3D pose sequence that accounts for occlusions and global factors within the video.
    Type: Application
    Filed: January 25, 2022
    Publication date: March 9, 2023
    Inventors: Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz
  • Patent number: 11593661
    Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: February 28, 2023
    Assignee: NVIDIA Corporation
    Inventors: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
  • Patent number: 11488418
    Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
    Type: Grant
    Filed: December 28, 2020
    Date of Patent: November 1, 2022
    Assignee: NVIDIA Corporation
    Inventors: Umar Iqbal, Pavlo Molchanov, Thomas Michael Breuel, Jan Kautz
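    A rough sketch of a 2.5D keypoint head in the spirit of this abstract: from image features it predicts 2D keypoint locations plus a depth value per keypoint relative to a root joint, which together give a 3D pose estimate. The backbone, dimensions, and root-relative convention are assumptions for illustration.

    ```python
    import torch
    import torch.nn as nn

    class KeypointDepthHead(nn.Module):
        """Sketch of a 2.5D keypoint head producing (x, y, relative depth) per keypoint."""

        def __init__(self, feat_dim=512, num_keypoints=21):
            super().__init__()
            self.xy = nn.Linear(feat_dim, 2 * num_keypoints)    # pixel coordinates
            self.depth = nn.Linear(feat_dim, num_keypoints)     # per-keypoint depth
            self.num_keypoints = num_keypoints

        def forward(self, features):
            # features: (B, feat_dim) from a monocular image encoder.
            xy = self.xy(features).view(-1, self.num_keypoints, 2)
            z = self.depth(features).unsqueeze(-1)
            # Root-relative depth: subtract the first keypoint's depth so the
            # prediction does not depend on absolute distance to the camera.
            z = z - z[:, :1, :]
            return torch.cat([xy, z], dim=-1)                   # (B, K, 3) keypoints
    ```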
  • Patent number: 11487968
    Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: November 1, 2022
    Assignee: NVIDIA Corporation
    Inventors: Nuri Murat Arar, Niranjan Avadhanam, Nishant Puri, Shagan Sah, Rajath Shetty, Sujay Yadawadkar, Pavlo Molchanov
  • Publication number: 20220292360
    Abstract: Apparatuses, systems, and techniques to remove one or more nodes of a neural network. In at least one embodiment, one or more nodes of a neural network are removed, based on, for example, whether the one or more nodes are likely to affect performance of the neural network.
    Type: Application
    Filed: March 15, 2021
    Publication date: September 15, 2022
    Inventors: Maying Shen, Pavlo Molchanov, Hongxu Yin, Jose Manuel Alvarez Lopez
  • Publication number: 20220284283
    Abstract: Apparatuses, systems, and techniques to invert a neural network. In at least one embodiment, one or more neural network layers are inverted and, in at least one embodiment, loaded in reverse order.
    Type: Application
    Filed: March 8, 2021
    Publication date: September 8, 2022
    Inventors: Hongxu Yin, Pavlo Molchanov, Jose Manuel Alvarez Lopez, Xin Dong