Patents by Inventor Anima Anandkumar
Anima Anandkumar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12657259
Abstract: Apparatuses, systems, and techniques to train neural networks to perform image processing tasks. In at least one embodiment, one or more second neural networks are used to train one or more first neural networks based, at least in part, on a first object type in one or more images and a second object type in the one or more images, in parallel.
Type: Grant
Filed: December 18, 2020
Date of Patent: June 16, 2026
Assignee: NVIDIA Corporation
Inventors: Zhiding Yu, Wuyang Chen, Shalini De Mello, Sifei Liu, Jose Manuel Alvarez Lopez, Anima Anandkumar
-
Publication number: 20260154975
Abstract: 3D object detection is a computer vision task that generally detects (e.g. classifies and localizes) objects in 3D space from the 2D images or videos that capture the objects. Current techniques used for 3D object detection rely on machine learning processes that learn to detect 3D objects from existing images annotated with high-quality 3D information including depth information generally obtained using lidar technology. However, due to lidar's limited measurable range, current machine learning solutions to 3D object detection do not support detection of 3D objects beyond the lidar range, which is needed for numerous applications, including autonomous driving applications where existing close or midrange 3D object detection does not always meet the safety-critical requirement of autonomous driving. The present disclosure provides for 3D object detection using a technique that supports long-range detection (i.e. detection beyond the lidar range).
Type: Application
Filed: January 21, 2026
Publication date: June 4, 2026
Inventors: Zetong Yang, Zhiding Yu, Ren Hao Wang, Chris Choy, Anima Anandkumar, Jose M. Alvarez Lopez
-
Publication number: 20260154378
Abstract: Apparatuses, systems, and techniques to modify a set of training data used for machine learning. In at least one embodiment, a set of images used for training a machine learning system is resampled by augmenting the set of images with additional images of underrepresented object types extracted from portions of existing training images in the set.
Type: Application
Filed: January 27, 2026
Publication date: June 4, 2026
Inventors: Nai Chen Chang, Jose Manuel Alvarez Lopez, Zhiding Yu, Anima Anandkumar, Sanja Fidler
-
Patent number: 12602936
Abstract: 3D object detection is a computer vision task that generally detects (e.g. classifies and localizes) objects in 3D space from the 2D images or videos that capture the objects. Current techniques used for 3D object detection rely on machine learning processes that learn to detect 3D objects from existing images annotated with high-quality 3D information including depth information generally obtained using lidar technology. However, due to lidar's limited measurable range, current machine learning solutions to 3D object detection do not support detection of 3D objects beyond the lidar range, which is needed for numerous applications, including autonomous driving applications where existing close or midrange 3D object detection does not always meet the safety-critical requirement of autonomous driving. The present disclosure provides for 3D object detection using a technique that supports long-range detection (i.e. detection beyond the lidar range).
Type: Grant
Filed: July 18, 2023
Date of Patent: April 14, 2026
Assignee: NVIDIA CORPORATION
Inventors: Zetong Yang, Zhiding Yu, Ren Hao Wang, Chris Choy, Anima Anandkumar, Jose M. Alvarez Lopez
-
Publication number: 20260087643
Abstract: Apparatuses, systems, and techniques to track one or more objects in one or more frames of a video. In at least one embodiment, one or more objects in one or more frames of a video are tracked based on, for example, one or more sets of embeddings.
Type: Application
Filed: November 25, 2025
Publication date: March 26, 2026
Inventors: De-An Huang, Zhiding Yu, Anima Anandkumar
-
Patent number: 12554799
Abstract: Apparatuses, systems, and techniques to modify a set of training data used for machine learning. In at least one embodiment, a set of images used for training a machine learning system is resampled by augmenting the set of images with additional images of underrepresented object types extracted from portions of existing training images in the set.
Type: Grant
Filed: March 1, 2021
Date of Patent: February 17, 2026
Assignee: NVIDIA Corporation
Inventors: Nai Chen Chang, Jose Manuel Alvarez Lopez, Zhiding Yu, Anima Anandkumar, Sanja Fidler
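Illustrative sketch (not taken from the patent): the resampling described in the abstract above could look roughly like the following. The data layout (a dict with "pixels" and "objects"), the function name resample_with_crops, and the count-based threshold for deciding which object types are underrepresented are all assumptions made for this example; the patent may define these quite differently.

```python
from collections import Counter


def resample_with_crops(images, threshold):
    """Augment a training set with crops of rare object types.

    Hypothetical data layout: each image is a dict with
      "pixels":  a 2D list of pixel values, and
      "objects": a list of (label, (top, left, bottom, right)) boxes.
    An object type counts as underrepresented when it appears fewer
    than `threshold` times across the whole set (an assumed criterion).
    """
    counts = Counter(label for img in images for label, _ in img["objects"])
    rare = {label for label, n in counts.items() if n < threshold}

    augmented = list(images)  # keep every original image
    for img in images:
        for label, (top, left, bottom, right) in img["objects"]:
            if label not in rare:
                continue
            # Extract the portion of the existing image that shows the rare object.
            crop = [row[left:right] for row in img["pixels"][top:bottom]]
            box = (0, 0, bottom - top, right - left)
            augmented.append({"pixels": crop, "objects": [(label, box)]})
    return augmented


if __name__ == "__main__":
    data = [
        {"pixels": [[0, 1, 2], [3, 4, 5], [6, 7, 8]],
         "objects": [("car", (0, 0, 2, 2)), ("bike", (1, 1, 3, 3))]},
        {"pixels": [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
         "objects": [("car", (0, 0, 1, 1))]},
    ]
    out = resample_with_crops(data, threshold=2)
    print(len(out), out[-1]["pixels"])
```

In this toy run, "bike" appears once and "car" twice, so only the bike region is cropped and appended as an additional training image.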
-
Patent number: 12548310
Abstract: Apparatuses, systems, and techniques are presented to detect one or more objects in one or more images. In at least one embodiment, one or more neural networks can be trained to detect one or more objects, in one or more unlabeled images, based at least in part upon one or more predicted segmentations of the one or more objects.
Type: Grant
Filed: February 15, 2022
Date of Patent: February 10, 2026
Assignee: NVIDIA Corporation
Inventors: Xinlong Wang, Zhiding Yu, Shalini De Mello, Anima Anandkumar, Jose Manuel Alvarez Lopez
-
Patent number: 12547893
Abstract: A vision transformer (ViT) is a deep learning model that performs one or more vision processing tasks. ViTs may be modified to include a global task that clusters images with the same concept together to produce semantically consistent relational representations, as well as a local task that guides the ViT to discover object-centric semantic correspondence across images. A database of concepts and associated features may be created and used to train the global and local tasks, which may then enable the ViT to perform visual relational reasoning faster, without supervision, and outside of a synthetic domain.
Type: Grant
Filed: August 22, 2022
Date of Patent: February 10, 2026
Assignee: NVIDIA CORPORATION
Inventors: Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Anima Anandkumar
-
Patent number: 12518398
Abstract: Apparatuses, systems, and techniques to track one or more objects in one or more frames of a video. In at least one embodiment, one or more objects in one or more frames of a video are tracked based on, for example, one or more sets of embeddings.
Type: Grant
Filed: May 5, 2023
Date of Patent: January 6, 2026
Assignee: NVIDIA Corporation
Inventors: De-An Huang, Zhiding Yu, Anima Anandkumar
-
Publication number: 20250322675
Abstract: 3D object detection is a computer vision task that generally refers to detecting (e.g. classifying and localizing) an object in 3D space from an image or video that captures the object. This computer vision task has many useful applications, such as autonomous driving applications which rely on the detection of 3D objects in a local environment to make autonomous driving decisions. State-of-the-art 3D object detectors generally rely on machine learning, but current training processes for these detectors do not specifically address false negative detections, or missed objects, which are often caused by occlusions and/or cluttered backgrounds in the given image/video. Reducing false negatives is crucial for many downstream applications, particularly autonomous driving applications which rely on accurate detection of obstacles for making safe driving decisions. The present disclosure provides for a multi-stage training process that reduces false negative detections by 3D object detectors.
Type: Application
Filed: April 16, 2024
Publication date: October 16, 2025
Inventors: Yilun Chen, Zhiding Yu, Shiyi Lan, Anima Anandkumar, Jose M. Alvarez Lopez
-
Publication number: 20250322902
Abstract: A method for designing proteins using multi-objective reinforcement learning can include generating, by one or more processors using a machine learning model, based on an initial protein sequence data structure, a plurality of protein sequences, the machine learning model configured based on reinforcement learning from a plurality of reward metrics including at least one reward metric associated with experimental data regarding example sequence data; scoring, by the one or more processors, using a plurality of scoring functions, the plurality of protein sequences, to select a subset of protein sequences of the plurality of protein sequences; and outputting one or more selected protein sequences of the subset of selected protein sequences.
Type: Application
Filed: April 9, 2025
Publication date: October 16, 2025
Applicant: UCHICAGO ARGONNE, LLC
Inventors: Arvind RAMANATHAN, Gautham DHARUMAN, Heng MA, Priyanka Varadaraja SETTY, Logan Timothy WARD, Ozan GOKDEMIR, Alexander BRACE, Kyle HIPPE, Anima ANANDKUMAR
-
Patent number: 12430564
Abstract: A manipulation task may include operations performed by one or more manipulation entities on one or more objects. This manipulation task may be broken down into a plurality of sequential sub-tasks (policies). These policies may be fine-tuned so that a terminal state distribution of a given policy matches an initial state distribution of another policy that immediately follows the given policy within the plurality of policies. The fine-tuned plurality of policies may then be chained together and implemented within a manipulation environment.
Type: Grant
Filed: March 1, 2022
Date of Patent: September 30, 2025
Assignee: NVIDIA CORPORATION
Inventors: Yuke Zhu, Anima Anandkumar, Youngwoon Lee
-
Publication number: 20250218160
Abstract: Apparatuses, systems, and techniques of using one or more machine learning processes (e.g., neural network(s)) to detect objects from a plurality of image frames. In at least one embodiment, a plurality of image frames are fused into a feature map using one or more neural networks. In at least one embodiment, a plurality of image frames are processed using one or more neural networks to detect objects in a 3D space.
Type: Application
Filed: December 27, 2023
Publication date: July 3, 2025
Inventors: Renhao Wang, Zhiding Yu, Shiyi Lan, Ke Chen, Anima Anandkumar, Jose Manuel Alvarez Lopez
-
Publication number: 20250103968
Abstract: Diffusion models are machine learning algorithms that are uniquely trained to generate high-quality data from lower-quality input data. Diffusion probabilistic models use discrete-time random processes or continuous-time stochastic differential equations (SDEs) that learn to gradually remove the noise added to the data points. With diffusion probabilistic models, high quality output currently requires sampling from a large diffusion probabilistic model, which comes at a high computational cost. The present disclosure stitches together the trajectory of two or more inferior diffusion probabilistic models during a denoising process, which can in turn accelerate the denoising process by avoiding use of only a single large diffusion probabilistic model.
Type: Application
Filed: August 30, 2024
Publication date: March 27, 2025
Inventors: Zizheng Pan, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Anima Anandkumar
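Illustrative sketch (not taken from the application): one way to picture stitching a denoising trajectory across two models is a loop that hands the intermediate sample from one denoiser to another at a chosen step. The function name stitched_denoise, the switch_step parameter, and the two toy denoisers below are hypothetical stand-ins; real diffusion models, noise schedules, and the actual switching rule in the disclosure are not represented here.

```python
def stitched_denoise(x, num_steps, switch_step, first_model, second_model):
    """Run one denoising trajectory that is shared by two denoisers.

    Steps [0, switch_step) use `first_model`; the remaining steps use
    `second_model`. Each model is any callable (x, step) -> x. Which
    model handles which part of the trajectory is an assumption of this
    sketch, not something the abstract specifies.
    """
    trajectory = [x]
    for step in range(num_steps):
        model = first_model if step < switch_step else second_model
        x = model(x, step)  # one denoising update on the current sample
        trajectory.append(x)
    return x, trajectory


if __name__ == "__main__":
    # Toy stand-ins for denoisers: plain arithmetic updates on a scalar.
    def halve(x, step):
        return x * 0.5

    def subtract_one(x, step):
        return x - 1.0

    final, path = stitched_denoise(16.0, num_steps=4, switch_step=2,
                                   first_model=halve, second_model=subtract_one)
    print(final, path)
```

The point of the sketch is only the hand-off: the second model continues from the state the first model produced, so neither model has to run the full trajectory alone.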
-
Publication number: 20250078489
Abstract: One embodiment of the present invention sets forth a technique for training an image classifier. The technique includes training a first vision transformer model to generate patch labels for corresponding image patches of images, converting the patch labels to token labels, and training a second vision transformer model to classify images based on the token labels.
Type: Application
Filed: December 15, 2023
Publication date: March 6, 2025
Inventors: Bingyin ZHAO, Jose Manuel ALVAREZ LOPEZ, Anima ANANDKUMAR, Shi Yi LAN, Zhiding YU
-
Publication number: 20250020481
Abstract: Apparatuses, systems, and techniques are presented to make determinations about objects in an environment. In at least one embodiment, a neural network can be used to determine one or more positions of one or more objects within a three-dimensional (3D) environment and to generate a segmented map of the 3D environment based, at least in part, on one or more two-dimensional (2D) images of the one or more objects.
Type: Application
Filed: April 7, 2022
Publication date: January 16, 2025
Inventors: Enze Xie, Zhiding Yu, Jonah Philion, Anima Anandkumar, Sanja Fidler, Jose Manuel Alvarez Lopez
-
Publication number: 20240273682
Abstract: Image restoration generally involves recovering a target clean image from a given image having noise, blurring, or other degraded features. Current image restoration solutions typically include a diffusion model that is trained for image restoration by a forward process that progressively diffuses data to noise, and then by learning in a reverse process to generate the data from the noise. However, the forward process relies on Gaussian noise to diffuse the original data, which has little or no structural information corresponding to the original data versus learning from the degraded image itself which is much more structurally informative compared to the random Gaussian noise. Similar problems also exist for other data-to-data translation tasks.
Type: Application
Filed: February 2, 2024
Publication date: August 15, 2024
Inventors: Weili Nie, Guan-Horng Liu, Arash Vahdat, De-An Huang, Anima Anandkumar
-
Publication number: 20240249538
Abstract: 3D object detection is a computer vision task that generally detects (e.g. classifies and localizes) objects in 3D space from the 2D images or videos that capture the objects. Current techniques used for 3D object detection rely on machine learning processes that learn to detect 3D objects from existing images annotated with high-quality 3D information including depth information generally obtained using lidar technology. However, due to lidar's limited measurable range, current machine learning solutions to 3D object detection do not support detection of 3D objects beyond the lidar range, which is needed for numerous applications, including autonomous driving applications where existing close or midrange 3D object detection does not always meet the safety-critical requirement of autonomous driving. The present disclosure provides for 3D object detection using a technique that supports long-range detection (i.e. detection beyond the lidar range).
Type: Application
Filed: July 18, 2023
Publication date: July 25, 2024
Inventors: Zetong Yang, Zhiding Yu, Ren Hao Wang, Chris Choy, Anima Anandkumar, Jose M. Alvarez Lopez
-
Publication number: 20240221166
Abstract: Video instance segmentation is a computer vision task that aims to detect, segment, and track objects continuously in videos. It can be used in numerous real-world applications, such as video editing, three-dimensional (3D) reconstruction, 3D navigation (e.g. for autonomous driving and/or robotics), and view point estimation. However, current machine learning-based processes employed for video instance segmentation are lacking, particularly because the densely annotated videos needed for supervised training of high-quality models are not readily available and are not easily generated. To address the issues in the prior art, the present disclosure provides point-level supervision for video instance segmentation in a manner that allows the resulting machine learning model to handle any object category.
Type: Application
Filed: December 22, 2023
Publication date: July 4, 2024
Inventors: Zhiding Yu, Shuaiyi Huang, De-An Huang, Shiyi Lan, Subhashree Radhakrishnan, Jose M. Alvarez Lopez, Anima Anandkumar
-
Patent number: 11977386
Abstract: Techniques to generate driving scenarios for autonomous vehicles characterize a path in a driving scenario according to metrics such as narrowness and effort. Nodes of the path are assigned a time for action to avoid collision from the node. The generated scenarios may be simulated in a computer.
Type: Grant
Filed: November 18, 2022
Date of Patent: May 7, 2024
Assignee: NVIDIA CORP.
Inventors: Siva Kumar Sastry Hari, Iuri Frosio, Zahra Ghodsi, Anima Anandkumar, Timothy Tsai, Stephen W. Keckler, Alejandro Troccoli