Patents by Inventor Chaowei Xiao

Chaowei Xiao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240104698
    Abstract: Apparatuses, systems, and techniques are presented to remove unintended variations introduced into data. In at least one embodiment, a first image of an object can be generated based, at least in part, upon adding noise to, and removing the noise from, a second image of the object.
    Type: Application
    Filed: April 12, 2022
    Publication date: March 28, 2024
    Inventors: Weili Nie, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar
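
A minimal sketch of the add-noise-then-denoise idea described in the abstract above (20240104698). The `denoise_fn` callable, the Gaussian noise, and the fixed step count are illustrative assumptions, not the patented method.

```python
import torch

def purify(image: torch.Tensor, denoise_fn, noise_level: float = 0.25, steps: int = 10) -> torch.Tensor:
    """Add noise to an image, then remove it to wash out unintended variations.

    `denoise_fn(x, sigma)` is a hypothetical pretrained denoiser that returns a
    cleaner image given a noisy input and the current noise level.
    """
    # Forward step: perturb the input so unintended variations are drowned out.
    x = image + noise_level * torch.randn_like(image)

    # Reverse steps: progressively walk the noise level back toward zero.
    for sigma in torch.linspace(noise_level, 0.0, steps):
        x = denoise_fn(x, float(sigma))
    return x
```
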
  • Publication number: 20240095447
    Abstract: Apparatuses, systems, and techniques are presented to identify and prevent generation of restricted content. In at least one embodiment, one or more neural networks are used to identify restricted content based only on the restricted content.
    Type: Application
    Filed: June 22, 2022
    Publication date: March 21, 2024
    Inventors: Wei Ping, Boxin Wang, Chaowei Xiao, Mohammad Shoeybi, Mostofa Patwary, Anima Anandkumar, Bryan Catanzaro
  • Publication number: 20240095534
    Abstract: Apparatuses, systems, and techniques to perform neural networks. In at least one embodiment, a most consistent output of one or more pre-trained neural networks is to be selected. In at least one embodiment, a most consistent output of one or more pre-trained neural networks is to be selected based, at least in part, on a plurality of variances of one or more inputs to the one or more neural networks.
    Type: Application
    Filed: September 7, 2023
    Publication date: March 21, 2024
    Inventors: Anima Anandkumar, Chaowei Xiao, Weili Nie, De-An Huang, Zhiding Yu, Manli Shu
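
A rough illustration of selecting the most consistent output across perturbed inputs, as described in 20240095534 above. The Gaussian input perturbations and the variance-based score are assumptions made for the sketch, not the claimed technique.

```python
import torch

@torch.no_grad()
def most_consistent_prediction(model, x: torch.Tensor, n_views: int = 8, sigma: float = 0.05) -> int:
    """Run a pre-trained classifier on several perturbed copies of one input and
    pick the class whose probability varies the least across the copies."""
    # Build a batch of perturbed views of the same input.
    views = x.unsqueeze(0) + sigma * torch.randn(n_views, *x.shape)
    probs = torch.softmax(model(views), dim=-1)   # (n_views, num_classes)

    # Variance of each class probability across views; lower means more consistent.
    variance = probs.var(dim=0)
    mean_prob = probs.mean(dim=0)

    # Prefer classes that are both likely and stable under the perturbations.
    score = mean_prob / (variance + 1e-8)
    return int(score.argmax())
```
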
  • Publication number: 20240087222
    Abstract: An artificial intelligence framework is described that incorporates a number of neural networks and a number of transformers for converting a two-dimensional image into three-dimensional semantic information. Neural networks convert one or more images into a set of image feature maps, depth information associated with the one or more images, and query proposals based on the depth information. A first transformer implements a cross-attention mechanism to process the set of image feature maps in accordance with the query proposals. The output of the first transformer is combined with a mask token to generate initial voxel features of the scene. A second transformer implements a self-attention mechanism to convert the initial voxel features into refined voxel features, which are up-sampled and processed by a lightweight neural network to generate the three-dimensional semantic information, which may be used by, e.g., an autonomous vehicle for various advanced driver assistance system (ADAS) functions.
    Type: Application
    Filed: November 20, 2023
    Publication date: March 14, 2024
    Inventors: Yiming Li, Zhiding Yu, Christopher B. Choy, Chaowei Xiao, Jose Manuel Alvarez Lopez, Sanja Fidler, Animashree Anandkumar
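
A schematic of the two-transformer pipeline summarized above (20240087222): cross-attention over depth-based query proposals, a learned mask token for voxels without a proposal, then self-attention refinement and a lightweight head. Module sizes and the specific PyTorch layers are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TwoStageVoxelDecoder(nn.Module):
    """Illustrative skeleton: image features in, per-voxel semantic logits out."""

    def __init__(self, dim: int = 128, num_classes: int = 20):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.self_attn = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
        self.head = nn.Linear(dim, num_classes)  # stands in for the lightweight network

    def forward(self, image_feats, query_proposals, proposal_mask):
        # Stage 1: cross-attention lets the query proposals gather image evidence.
        attended, _ = self.cross_attn(query_proposals, image_feats, image_feats)

        # Voxels without a proposal are filled with the learned mask token.
        voxels = torch.where(proposal_mask.unsqueeze(-1), attended, self.mask_token)

        # Stage 2: self-attention refines the initial voxel features.
        refined = self.self_attn(voxels)
        return self.head(refined)  # per-voxel semantic logits
```
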
  • Publication number: 20240078423
    Abstract: A vision transformer (ViT) is a deep learning model that performs one or more vision processing tasks. ViTs may be modified to include a global task that clusters images with the same concept together to produce semantically consistent relational representations, as well as a local task that guides the ViT to discover object-centric semantic correspondence across images. A database of concepts and associated features may be created and used to train the global and local tasks, which may then enable the ViT to perform visual relational reasoning faster, without supervision, and outside of a synthetic domain.
    Type: Application
    Filed: August 22, 2022
    Publication date: March 7, 2024
    Inventors: Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Anima Anandkumar
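
A loose sketch of how the global and local tasks described above (20240078423) might be written as training losses. The prototype table standing in for the concept database and the soft patch-correspondence formulation are assumptions, not the patented objectives.

```python
import torch
import torch.nn.functional as F

def global_concept_loss(cls_embeddings, concept_ids, prototypes, temperature=0.1):
    """Global task: pull each image embedding toward the prototype of its concept so
    images sharing a concept cluster together. `prototypes` is a (num_concepts, dim)
    table standing in for the concept/feature database."""
    logits = F.normalize(cls_embeddings, dim=-1) @ F.normalize(prototypes, dim=-1).T
    return F.cross_entropy(logits / temperature, concept_ids)

def local_correspondence_loss(patch_feats_a, patch_feats_b):
    """Local task: encourage patch tokens from two images of the same concept to find
    strong mutual matches, guiding object-centric correspondence."""
    sim = F.normalize(patch_feats_a, dim=-1) @ F.normalize(patch_feats_b, dim=-1).transpose(-1, -2)
    # Reward each patch's best match in the other image, in both directions.
    return -(sim.max(dim=-1).values.mean() + sim.max(dim=-2).values.mean()) / 2
```
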
  • Publication number: 20240062534
    Abstract: A vision transformer (ViT) is a deep learning model that performs one or more vision processing tasks. ViTs may be modified to include a global task that clusters images with the same concept together to produce semantically consistent relational representations, as well as a local task that guides the ViT to discover object-centric semantic correspondence across images. A database of concepts and associated features may be created and used to train the global and local tasks, which may then enable the ViT to perform visual relational reasoning faster, without supervision, and outside of a synthetic domain.
    Type: Application
    Filed: August 22, 2022
    Publication date: February 22, 2024
    Inventors: Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Anima Anandkumar
  • Publication number: 20240029836
    Abstract: A machine learning framework is described for performing generation of candidate molecules for, e.g., drug discovery or other applications. The framework utilizes a pre-trained encoder-decoder model to interface between representations of molecules and embeddings for those molecules in a latent space. A fusion module is located between the encoder and decoder and is used to fuse an embedding for an input molecule with embeddings for one or more exemplary molecules selected from a database that is constructed according to design criteria. The fused embedding is decoded using the decoder to generate a candidate molecule. The fusion module is trained to reconstruct a nearest neighbor to the input molecule from the database based on the sample of exemplary molecules. An iterative approach may be used during inference to dynamically update the database to include newly generated candidate molecules.
    Type: Application
    Filed: July 17, 2023
    Publication date: January 25, 2024
    Inventors: Weili Nie, Zichao Wang, Chaowei Xiao, Animashree Anandkumar
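
A sketch of the retrieve-fuse-decode loop from 20240029836 above. The encoder, decoder, single cross-attention fusion layer, and the `database` tensor of exemplar embeddings are all stand-ins for the patented modules.

```python
import torch
import torch.nn as nn

class RetrievalFusionGenerator(nn.Module):
    """Encode a molecule, retrieve similar exemplars, fuse, and decode a candidate."""

    def __init__(self, encoder: nn.Module, decoder: nn.Module, dim: int = 256, k: int = 4):
        super().__init__()
        self.encoder, self.decoder, self.k = encoder, decoder, k
        self.fuse = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, molecule_tokens, database: torch.Tensor):
        query = self.encoder(molecule_tokens)                      # assumed shape (1, 1, dim)
        # Retrieve the k exemplary molecules closest to the input in latent space.
        dists = torch.cdist(query.flatten(0, 1), database)         # database: (N, dim)
        exemplars = database[dists.argsort(dim=-1)[:, : self.k]]   # (1, k, dim)
        # Fuse the input embedding with the retrieved exemplar embeddings.
        fused, _ = self.fuse(query, exemplars, exemplars)
        # Decode the fused embedding into a candidate molecule.
        return self.decoder(fused)
```

During inference, newly decoded candidates could be embedded and appended to `database`, mirroring the iterative update the abstract mentions.
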
  • Publication number: 20240028673
    Abstract: In various examples, robust trajectory predictions against adversarial attacks in autonomous machines and applications are described herein. Systems and methods are disclosed that perform adversarial training for trajectory predictions determined using a neural network(s). In order to improve the training, the systems and methods may devise a deterministic attack that creates a deterministic gradient path within a probabilistic model to generate adversarial samples for training. Additionally, the systems and methods may introduce a hybrid objective that interleaves the adversarial training and learning from clean data to anchor the output from the neural network(s) on a stable, clean data distribution. Furthermore, the systems and methods may use a domain-specific data augmentation technique that generates diverse, realistic, and dynamically-feasible samples for additional training of the neural network(s).
    Type: Application
    Filed: March 8, 2023
    Publication date: January 25, 2024
    Inventors: Chaowei Xiao, Yulong Cao, Danfei Xu, Animashree Anandkumar, Marco Pavone, Xinshuo Weng
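
A minimal sketch of the hybrid objective described above (20240028673), interleaving an adversarial trajectory loss with a clean-data loss. The single-step gradient attack and the fixed mixing weight are simplifying assumptions, not the patented procedure.

```python
import torch

def hybrid_training_loss(model, loss_fn, history, future, epsilon=0.1, alpha=0.5):
    """One training objective that mixes clean and adversarial trajectory losses."""
    # Craft an adversarial perturbation of the observed trajectory history.
    adv_history = history.clone().requires_grad_(True)
    attack_loss = loss_fn(model(adv_history), future)
    grad, = torch.autograd.grad(attack_loss, adv_history)
    adv_history = (history + epsilon * grad.sign()).detach()

    # Hybrid objective: anchor on clean data while training against the attack.
    clean_loss = loss_fn(model(history), future)
    adv_loss = loss_fn(model(adv_history), future)
    return alpha * clean_loss + (1 - alpha) * adv_loss
```
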
  • Publication number: 20240017745
    Abstract: Apparatuses, systems, and techniques to generate trajectory data for moving objects. In at least one embodiment, adversarial trajectories are generated to evaluate a trajectory prediction model and are based, at least in part, on a differentiable dynamic model.
    Type: Application
    Filed: July 14, 2022
    Publication date: January 18, 2024
    Inventors: Yulong Cao, Chaowei Xiao, Danfei Xu, Anima Anandkumar, Marco Pavone
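
A sketch of generating dynamically feasible adversarial trajectories (20240017745 above) by optimizing control inputs through a differentiable rollout. The simplified kinematic model and the hypothetical `predictor` error measure are assumptions, not the patented differentiable dynamic model.

```python
import torch

def kinematic_rollout(start_xy, heading, speed, accel, steer, dt=0.1):
    """Differentiable rollout: control inputs in, dynamically feasible positions out.
    `start_xy` is a (2,) tensor; `heading` and `speed` are 0-dim tensors."""
    positions, x, h, v = [], start_xy, heading, speed
    for a, s in zip(accel, steer):
        v, h = v + a * dt, h + s * dt
        x = x + torch.stack([v * torch.cos(h), v * torch.sin(h)]) * dt
        positions.append(x)
    return torch.stack(positions)

def adversarial_trajectory(predictor, start_xy, heading, speed, horizon=20, iters=50):
    """Optimize accelerations and steering so the rolled-out trajectory maximizes the
    error of a trajectory prediction model (`predictor` returns that scalar error)."""
    accel = torch.zeros(horizon, requires_grad=True)
    steer = torch.zeros(horizon, requires_grad=True)
    opt = torch.optim.Adam([accel, steer], lr=0.05)
    for _ in range(iters):
        traj = kinematic_rollout(start_xy, heading, speed, accel, steer)
        loss = -predictor(traj)  # minimizing the negative error maximizes the error
        opt.zero_grad()
        loss.backward()
        opt.step()
    return kinematic_rollout(start_xy, heading, speed, accel, steer).detach()
```
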
  • Publication number: 20240013504
    Abstract: One embodiment of a method for training a machine learning model includes receiving a training data set that includes at least one image, text referring to at least one object included in the at least one image, and at least one bounding box annotation associated with the at least one object, and performing, based on the training data set, one or more operations to generate a trained machine learning model to segment images based on text, where the one or more operations to generate the trained machine learning model include minimizing a loss function that comprises at least one of a multiple instance learning loss term or an energy loss term.
    Type: Application
    Filed: October 31, 2022
    Publication date: January 11, 2024
    Inventors: Zhiding Yu, Boyi Li, Chaowei Xiao, De-An Huang, Weili Nie, Linxi Fan, Anima Anandkumar
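
The two loss terms named in the abstract above (20240013504) admit simple stand-in forms; the box-as-bag construction and the smoothness-style energy term below are assumptions, not the patented losses.

```python
import torch

def mil_loss(seg_logits: torch.Tensor, box) -> torch.Tensor:
    """Multiple-instance-learning term: the annotated box is a positive bag, so its
    strongest pixel should respond to the text while pixels outside should not.
    `box` is (y0, y1, x0, x1) in pixel coordinates."""
    y0, y1, x0, x1 = box
    outside = torch.ones_like(seg_logits, dtype=torch.bool)
    outside[..., y0:y1, x0:x1] = False
    pos = torch.sigmoid(seg_logits[..., y0:y1, x0:x1]).amax()
    neg = torch.sigmoid(seg_logits[outside]).mean()
    return -(torch.log(pos + 1e-8) + torch.log(1 - neg + 1e-8))

def energy_loss(seg_probs: torch.Tensor) -> torch.Tensor:
    """Energy term: penalize label changes between neighboring pixels so the
    predicted mask stays spatially coherent."""
    dy = (seg_probs[..., 1:, :] - seg_probs[..., :-1, :]).abs().mean()
    dx = (seg_probs[..., :, 1:] - seg_probs[..., :, :-1]).abs().mean()
    return dy + dx
```
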
  • Publication number: 20230290135
    Abstract: Apparatuses, systems, and techniques to generate a robust representation of an image. In at least one embodiment, input tokens of an input image are received, and an inference about the input image is generated based on a vision transformer (ViT) system comprising at least one self-attention module to perform token mixing and a channel self-attention module to perform channel processing.
    Type: Application
    Filed: March 9, 2023
    Publication date: September 14, 2023
    Inventors: Daquan Zhou, Zhiding Yu, Enze Xie, Anima Anandkumar, Chaowei Xiao, Jose Manuel Alvarez Lopez
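
A sketch of pairing token-mixing self-attention with channel self-attention, as summarized above (20230290135). The layer sizes and the single-head channel attention are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TokenChannelBlock(nn.Module):
    """One block: attention over spatial tokens, then attention over feature channels."""

    def __init__(self, dim: int = 192, num_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)

    def channel_attention(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, dim)
        # Treat each channel as a "token" and attend across channels instead of positions.
        xc = x.transpose(1, 2)                                      # (B, dim, N)
        attn = torch.softmax(xc @ xc.transpose(1, 2) / xc.shape[-1] ** 0.5, dim=-1)
        return (attn @ xc).transpose(1, 2)                          # back to (B, N, dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:       # tokens: (B, N, dim)
        # Token mixing: standard ViT self-attention over spatial positions.
        x = self.norm1(tokens)
        tokens = tokens + self.token_attn(x, x, x)[0]
        # Channel processing: self-attention across feature channels.
        tokens = tokens + self.channel_attention(self.norm2(tokens))
        return tokens
```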