Patents by Inventor Wonmin Byeon

Wonmin Byeon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

META-LEARNING OF REPRESENTATIONS USING SELF-SUPERVISED TASKS

Publication number: 20250103906

Abstract: One embodiment of the present invention sets forth a technique for performing meta-learning. The technique includes performing a first set of training iterations to convert a prediction learning network into a first trained prediction learning network based on a first support set of training data and executing a representation learning network and the first trained prediction learning network to generate a first set of supervised training output and a first set of self-supervised training output based on a first query set of training data corresponding to the first support set of training data. The technique also includes performing a first training iteration to convert the representation learning network into a first trained representation learning network based on a first loss associated with the first set of supervised training output and a second loss associated with the first set of self-supervised training output.

Type: Application

Filed: September 20, 2023

Publication date: March 27, 2025

Inventors: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ

META-TESTING OF REPRESENTATIONS LEARNED USING SELF-SUPERVISED TASKS

Publication number: 20250095350

Abstract: One embodiment of the present invention sets forth a technique for executing a machine learning model. The technique includes performing a first set of training iterations to convert a prediction learning network into a first trained prediction learning network based on a first support set associated with a first set of classes. The technique also includes executing a first trained representation learning network to convert a first data sample into a first latent representation, where the first trained representation learning network is generated by training a representation learning network using a first query set, a first set of self-supervised losses, and a first set of supervised losses. The technique further includes executing the first trained prediction learning network to convert the first latent representation into a first prediction of a first class that is not included in the second set of classes.

Type: Application

Filed: September 20, 2023

Publication date: March 20, 2025

Inventors: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ

TRAINING A TRANSFORMER NEURAL NETWORK TO PERFORM TASK-SPECIFIC PARAMETER SELECTION

Publication number: 20250094813

Abstract: One embodiment of the present invention sets forth a technique for training a transformer neural network. The technique includes inputting a first task token and a first set of samples into the transformer neural network and training the transformer neural network using a first set of losses between predictions generated by the transformer neural network from the first task token and first set of samples as well as a first set of labels. The technique also includes converting the first task token into a second task token that is larger than the first task token, inputting the second task token and a second set of samples into the transformer neural network, and training the transformer neural network using a second set of losses between predictions generated by the transformer neural network from the second task token and the second set of samples as well as a second set of labels.

Type: Application

Filed: September 20, 2023

Publication date: March 20, 2025

Inventors: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ

FEW-SHOT CONTINUAL LEARNING WITH TASK-SPECIFIC PARAMETER SELECTION

Publication number: 20250094819

Abstract: One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes executing a first attention unit included in the transformer neural network to convert a first input token into a first query, a first key, and a first plurality of values, where each value included in the first plurality of values represents a sub-task associated with the transformer neural network. The technique also includes computing a first plurality of outputs associated with the first input token based on the first query, the first key, and the first plurality of values. The technique further includes performing a task associated with an input corresponding to the first input token based on the first input token and the first plurality of outputs.

Type: Application

Filed: September 20, 2023

Publication date: March 20, 2025

Inventors: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ

Future object trajectory predictions for autonomous machine applications

Patent number: 11989642

Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.

Type: Grant

Filed: September 26, 2022

Date of Patent: May 21, 2024

Assignee: NVIDIA Corporation

Inventors: Ruben Villegas, Alejandro Troccoli, Iuri Frosio, Stephen Tyree, Wonmin Byeon, Jan Kautz

DIFFUSION-BASED OPEN-VOCABULARY SEGMENTATION

Publication number: 20240153093

Abstract: An open-vocabulary diffusion-based panoptic segmentation system is not limited to performing segmentation using only object categories seen during training, and instead can also successfully perform segmentation of object categories not seen during training and only seen during testing and inferencing. In contrast with conventional techniques, a text-conditioned diffusion (generative) model is used to perform the segmentation. The text-conditioned diffusion model is pre-trained to generate images from text captions, including computing internal representations that provide spatially well-differentiated object features. The internal representations computed within the diffusion model comprise object masks and a semantic visual representation of the object. The semantic visual representation may be extracted from the diffusion model and used in conjunction with a text representation of a category label to classify the object.

Type: Application

Filed: May 1, 2023

Publication date: May 9, 2024

Inventors: Jiarui Xu, Shalini De Mello, Sifei Liu, Arash Vahdat, Wonmin Byeon

CONVOLUTIONAL STRUCTURED STATE SPACE MODEL

Publication number: 20240127041

Abstract: Systems and methods are disclosed related to a convolutional structured state space model (ConvSSM), which has a tensor-structured state but a continuous-time parameterization and linear state updates. The linearity may be exploited to use parallel scans for subquadratic parallelization across the spatiotemporal sequence. The ConvSSM effectively models long-range dependencies and, when followed by a nonlinear operation, forms a spatiotemporal layer (ConvS5) that does not require compressing frames into tokens, can be efficiently parallelized across the sequence, provides an unbounded context, and enables fast autoregressive generation.
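
Illustrative note (not part of the patent record): the link between linear state updates and parallel scans can be sketched with a scalar recurrence x_t = a_t * x_{t-1} + b_t. This is a minimal sketch under simplifying assumptions: the states here are plain numbers rather than the tensor-structured, convolutional states the abstract describes, and the function names (combine, sequential_states, scan_states) are hypothetical. Because composing two linear updates yields another linear update, the combine step is associative, so all prefixes can be computed in a logarithmic number of rounds.

```python
def combine(first, second):
    """Compose two linear updates x -> a*x + b, applying `first` then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a1 * a2, a2 * b1 + b2)


def sequential_states(a, b, x0=0.0):
    """Reference: run the recurrence x_t = a_t * x_{t-1} + b_t one step at a time."""
    states, x = [], x0
    for a_t, b_t in zip(a, b):
        x = a_t * x + b_t
        states.append(x)
    return states


def scan_states(a, b, x0=0.0):
    """Same states via an inclusive scan with the associative `combine`.

    Each round doubles the span every element summarises, so only about
    log2(len(a)) rounds are needed. Within a round every position is
    independent of the others, which is what parallel hardware would exploit
    (this pure-Python version still evaluates them one by one).
    """
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(len(elems))
        ]
        step *= 2
    # elems[t] now maps the initial state x0 directly to x_{t+1}.
    return [a_t * x0 + b_t for a_t, b_t in elems]


a = [0.5, 2.0, 1.0, 0.25, 3.0]
b = [1.0, 2.0, 3.0, 4.0, 5.0]
seq = sequential_states(a, b, x0=1.0)
par = scan_states(a, b, x0=1.0)
```

Both lists hold the same states; only the order in which the work is grouped differs.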

Type: Application

Filed: August 21, 2023

Publication date: April 18, 2024

Inventors: Jimmy Smith, Wonmin Byeon, Shalini De Mello

SYNTHETIC DATASET GENERATOR

Publication number: 20240127075

Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the costs associated with collecting and labeling real world datasets for use in training the model, computer processes can synthetically generate datasets which simulate real world data. The present disclosure improves the effectiveness of such synthetic datasets for training machine learning models used in real world applications, in particular by generating a synthetic dataset that is specifically targeted to a specified downstream task (e.g. a particular computer vision task, a particular natural language processing task, etc.).

Type: Application

Filed: June 21, 2023

Publication date: April 18, 2024

Applicant: NVIDIA Corporation

Inventors: Shalini De Mello, Christian Jacobsen, Xunlei Wu, Stephen Tyree, Alice Li, Wonmin Byeon, Shangru Li

TECHNIQUES FOR HETEROGENEOUS CONTINUAL LEARNING WITH MACHINE LEARNING MODEL ARCHITECTURE PROGRESSION

Publication number: 20240119361

Abstract: One embodiment of a method for training a first machine learning model having a different architecture than a second machine learning model includes receiving a first data set, performing one or more operations to generate a second data set based on the first data set and the second machine learning model, wherein the second data set includes at least one feature associated with one or more tasks that the second machine learning model was previously trained to perform, and performing one or more operations to train the first machine learning model based on the second data set and the second machine learning model.

Type: Application

Filed: July 6, 2023

Publication date: April 11, 2024

Inventors: Hongxu YIN, Wonmin BYEON, Jan KAUTZ, Divyam MADAAN, Pavlo MOLCHANOV

AUDIO-DRIVEN FACIAL ANIMATION WITH EMOTION SUPPORT USING MACHINE LEARNING

Publication number: 20240013462

Abstract: A deep neural network can be trained to output motion or deformation information for a character that is representative of the character uttering speech contained in audio input, which is accurate for an emotional state of the character. The character can have different facial components or regions (e.g., head, skin, eyes, tongue) modeled separately, such that the network can output motion or deformation information for each of these different facial components. During training, the network can be provided with emotion and/or style vectors that indicate information to be used in generating realistic animation for input speech, as may relate to one or more emotions to be exhibited by the character, a relative weighting of those emotions, and any style or adjustments to be made to how the character expresses that emotional state. The network output can be provided to a renderer to generate audio-driven facial animation that is emotion-accurate.

Type: Application

Filed: July 7, 2022

Publication date: January 11, 2024

Inventors: Yeongho Seol, Simon Yuen, Dmitry Aleksandrovich Korobchenko, Mingquan Zhou, Ronan Browne, Wonmin Byeon

Image processing using coupled segmentation and edge learning

Patent number: 11790633

Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.

Type: Grant

Filed: July 1, 2021

Date of Patent: October 17, 2023

Assignee: NVIDIA Corporation

Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz

PERFORMING SEMANTIC SEGMENTATION TRAINING WITH IMAGE/TEXT PAIRS

Publication number: 20230177810

Abstract: Semantic segmentation includes the task of providing pixel-wise annotations for a provided image. To train a machine learning environment to perform semantic segmentation, image/caption pairs are retrieved from one or more databases. These image/caption pairs each include an image and associated textual caption. The image portion of each image/caption pair is passed to an image encoder of the machine learning environment that outputs potential pixel groupings (e.g., potential segments of pixels) within each image, while nouns are extracted from the caption portion and are converted to text prompts which are then passed to a text encoder that outputs a corresponding text representation. Contrastive loss operations are then performed on features extracted from these pixel groupings and text representations to determine an extracted feature for each noun of each caption that most closely matches the extracted features for the associated image.

Type: Application

Filed: June 29, 2022

Publication date: June 8, 2023

Inventors: Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz

PERFORMING SIMULATIONS USING MACHINE LEARNING

Publication number: 20230153604

Abstract: To assist a machine learning environment in modelling a complex physical simulation (such as a numerical simulation or physics simulation), a correlation between input coordinates is determined. For example, a discrete solution (e.g., the correlation between the plurality of input coordinates) may be obtained from a non-discrete (e.g., continuous) physics space by performing a conversion from the physics space to a grid space. This correlation is input along with the coordinates into a machine learning environment to obtain results from the simulation. As a result, instead of implementing resource and power-intensive simulations to solve these computation problems, a machine learning environment implemented using less power and computing resources may solve these computation problems in a faster and more efficient manner.

Type: Application

Filed: July 26, 2022

Publication date: May 18, 2023

Inventors: Wonmin Byeon, Benjamin Wu, Oliver Hennigh

NOVEL METHOD OF TRAINING A NEURAL NETWORK

Publication number: 20230146647

Abstract: Apparatuses, systems, and techniques to perform and facilitate preservation of neural coding network weights over time. In at least one embodiment, a convolutional neural coding network is trained using a set of tasks such that said convolutional neural coding network retains an ability to perform inferencing based on tasks from previous training.

Type: Application

Filed: November 5, 2021

Publication date: May 11, 2023

Inventors: Wonmin Byeon, Shalini De Mello, Ankur Arjun Mali

FUTURE OBJECT TRAJECTORY PREDICTIONS FOR AUTONOMOUS MACHINE APPLICATIONS

Publication number: 20230088912

Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.

Type: Application

Filed: September 26, 2022

Publication date: March 23, 2023

Inventors: Ruben Villegas, Alejandro Troccoli, Iuri Frosio, Stephen Tyree, Wonmin Byeon, Jan Kautz

IMAGE PROCESSING USING COUPLED SEGMENTATION AND EDGE LEARNING

Publication number: 20230015989

Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.

Type: Application

Filed: July 1, 2021

Publication date: January 19, 2023

Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz

Future object trajectory predictions for autonomous machine applications

Patent number: 11514293

Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.

Type: Grant

Filed: September 9, 2019

Date of Patent: November 29, 2022

Assignee: NVIDIA Corporation

Inventors: Ruben Villegas, Alejandro Troccoli, Iuri Frosio, Stephen Tyree, Wonmin Byeon, Jan Kautz

IMAGE SEGMENTATION USING A NEURAL NETWORK TRANSLATION MODEL

Publication number: 20220254029

Abstract: The neural network includes an encoder, a common decoder, and a residual decoder. The encoder encodes input images into a latent space. The latent space disentangles unique features from other common features. The common decoder decodes common features resident in the latent space to generate translated images which lack the unique features. The residual decoder decodes unique features resident in the latent space to generate image deltas corresponding to the unique features. The neural network combines the translated images with the image deltas to generate combined images that may include both common features and unique features. The combined images can be used to drive autoencoding. Once training is complete, the residual decoder can be modified to generate segmentation masks that indicate any regions of a given input image where a unique feature resides.

Type: Application

Filed: October 13, 2021

Publication date: August 11, 2022

Inventors: Eugene Vorontsov, Wonmin Byeon, Shalini De Mello, Varun Jampani, Ming-Yu Liu, Pavlo Molchanov

Neural network system for stereo image matching

Patent number: 11062471

Abstract: Stereo matching generates a disparity map indicating pixel offsets between matched points in a stereo image pair. A neural network may be used to generate disparity maps in real time by matching image features in stereo images using only 2D convolutions. The proposed method is faster than 3D convolution-based methods, with only a slight accuracy loss and higher generalization capability. A 3D efficient cost aggregation volume is generated by combining cost maps for each disparity level. Different disparity levels correspond to different amounts of shift between pixels in the left and right image pair. In general, each disparity level is inversely proportional to a different distance from the viewpoint.
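
Illustrative note (not part of the patent record): the relationship between disparity levels, pixel shifts, and distance can be sketched for a single image row. This is a minimal sketch, not the patented network: it assumes a rectified pinhole stereo pair, uses a plain absolute-difference cost in place of learned features, and the function names and the focal length and baseline values are hypothetical.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Rectified pinhole stereo: depth = focal * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def best_disparity(left_row, right_row, x, max_disparity):
    """Pick the shift d that minimises |left[x] - right[x - d]| for one pixel.

    The list `costs` plays the role of one column of a cost volume:
    one matching cost per candidate disparity level.
    """
    costs = []
    for d in range(max_disparity + 1):
        if x - d < 0:
            break  # the shifted position would fall outside the right image
        costs.append(abs(left_row[x] - right_row[x - d]))
    return min(range(len(costs)), key=costs.__getitem__)


# Toy rows: the bright pattern appears two pixels further left in the right image.
left = [0, 0, 0, 9, 5, 0, 0, 0]
right = [0, 9, 5, 0, 0, 0, 0, 0]

d = best_disparity(left, right, x=3, max_disparity=3)
depth_m = disparity_to_depth(d, focal_px=700.0, baseline_m=0.1)
```

Halving the disparity doubles the computed depth, which is the inverse proportionality the abstract refers to.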

Type: Grant

Filed: May 6, 2020

Date of Patent: July 13, 2021

Assignee: NVIDIA Corporation

Inventors: Yiran Zhong, Wonmin Byeon, Charles Loop, Stanley Thomas Birchfield

DUAL RECURRENT NEURAL NETWORK ARCHITECTURE FOR MODELING LONG-TERM DEPENDENCIES IN SEQUENTIAL DATA

Publication number: 20210089867

Abstract: Learning the dynamics of an environment and predicting consequences in the future is a recent technical advancement that can be applied to video prediction and speech recognition, among other applications. Generally, machine learning, such as deep learning models, neural networks, or other artificial intelligence algorithms, is used to make the predictions. However, current artificial intelligence algorithms used for making predictions are typically limited to making short-term future predictions, mainly as a result of 1) the presence of complex dynamics in high-dimensional video data, 2) prediction error propagation over time, and 3) inherent uncertainty of the future. The present disclosure enables the modeling of long-term dependencies in sequential data for use in making long-term predictions by providing a dual (i.e. two-part) recurrent neural network architecture.

Type: Application

Filed: September 24, 2019

Publication date: March 25, 2021

Inventors: Wonmin Byeon, Jan Kautz
