Patents by Inventor Zhiding Yu

Zhiding Yu has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents that have been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240037756
    Abstract: Apparatuses, systems, and techniques to track one or more objects in one or more frames of a video. In at least one embodiment, one or more objects in one or more frames of a video are tracked based on, for example, one or more sets of embeddings.
    Type: Application
    Filed: May 5, 2023
    Publication date: February 1, 2024
    Inventors: De-An Huang, Zhiding Yu, Anima Anandkumar
  • Publication number: 20240013504
    Abstract: One embodiment of a method for training a machine learning model includes receiving a training data set that includes at least one image, text referring to at least one object included in the at least one image, and at least one bounding box annotation associated with the at least one object, and performing, based on the training data set, one or more operations to generate a trained machine learning model to segment images based on text, where the one or more operations to generate the trained machine learning model include minimizing a loss function that comprises at least one of a multiple instance learning loss term or an energy loss term.
    Type: Application
    Filed: October 31, 2022
    Publication date: January 11, 2024
    Inventors: Zhiding YU, Boyi LI, Chaowei XIAO, De-An HUANG, Weili NIE, Linxi FAN, Anima ANANDKUMAR
  • Publication number: 20230385687
    Abstract: Approaches for training data set size estimation for machine learning model systems and applications are described. Examples include a machine learning model training system that estimates target data requirements for training a machine learning model, given an approximate relationship between training data set size and model performance using one or more validation score estimation functions. To derive a validation score estimation function, a regression data set is generated from training data, and subsets of the regression data set are used to train the machine learning model. A validation score is computed for the subsets and used to compute regression function parameters to curve fit the selected regression function to the training data set. The validation score estimation function is then solved to output an estimate of the number of additional training samples needed to meet or exceed a target validation score.
    Type: Application
    Filed: May 31, 2022
    Publication date: November 30, 2023
    Inventors: Rafid Reza Mahmood, James Robert Lucas, David Jesus Acuna Marrero, Daiqing Li, Jonah Philion, Jose Manuel Alvarez Lopez, Zhiding Yu, Sanja Fidler, Marc Law
  • Publication number: 20230376849
    Abstract: In various examples, systems and methods estimate optimal training data set sizes for machine learning model systems and applications. Systems and methods are disclosed that estimate an amount of data to include in a training data set, where the training data set is then used to train one or more machine learning models to reach a target validation performance. To estimate the amount of training data, subsets of an initial training data set may be used to train the machine learning model(s) in order to determine estimates for the minimum amount of training data needed to train the machine learning model(s) to reach the target validation performance. The estimates may then be used to generate one or more functions, such as a cumulative density function and/or a probability density function, where the function(s) are then used to estimate the amount of training data needed to train the machine learning model(s).
    Type: Application
    Filed: May 16, 2023
    Publication date: November 23, 2023
    Inventors: Rafid Reza Mahmood, Marc Law, James Robert Lucas, Zhiding Yu, Jose Manuel Alvarez Lopez, Sanja Fidler
  • Patent number: 11790633
    Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
    Type: Grant
    Filed: July 1, 2021
    Date of Patent: October 17, 2023
    Assignee: NVIDIA Corporation
    Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
  • Publication number: 20230290135
    Abstract: Apparatuses, systems, and techniques to generate a robust representation of an image. In at least one embodiment, input tokens of an input image are received, and an inference about the input image is generated based on a vision transformer (ViT) system comprising at least one self-attention module to perform token mixing and a channel self-attention module to perform channel processing.
    Type: Application
    Filed: March 9, 2023
    Publication date: September 14, 2023
    Inventors: Daquan Zhou, Zhiding Yu, Enze Xie, Anima Anandkumar, Chaowei Xiao, Jose Manuel Alvarez Lopez
  • Publication number: 20230252692
    Abstract: Embodiments of the present disclosure relate to learning dense correspondences for images. Systems and methods are disclosed that disentangle structure and texture (or style) representations of GAN synthesized images by learning a dense pixel-level correspondence map for each image during image synthesis. A canonical coordinate frame is defined and a structure latent code for each generated image is warped to align with the canonical coordinate frame. In sum, the structure associated with the latent code is mapped into a shared coordinate space (canonical coordinate space), thereby establishing correspondences in the shared coordinate space. A correspondence generation system receives the warped coordinate correspondences as an encoded image structure. The encoded image structure and a texture latent code are used to synthesize an image. The shared coordinate space enables propagation of semantic labels from reference images to synthesized images.
    Type: Application
    Filed: September 1, 2022
    Publication date: August 10, 2023
    Inventors: Sifei Liu, Jiteng Mu, Shalini De Mello, Zhiding Yu, Jan Kautz
  • Publication number: 20230186428
    Abstract: Apparatuses, systems, and techniques for texture synthesis from small input textures in images using convolutional neural networks. In at least one embodiment, one or more convolutional layers are used in conjunction with one or more transposed convolution operations to generate a large textured output image from a small input textured image while preserving global features and texture, according to various novel techniques described herein.
    Type: Application
    Filed: February 6, 2023
    Publication date: June 15, 2023
    Inventors: Guilin Liu, Andrew Tao, Bryan Christopher Catanzaro, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum Reda, Karan Sapra, Brandon Rowlett
  • Publication number: 20230074706
    Abstract: A multi-level contrastive training strategy for training a neural network relies on image pairs (no other labels) to learn semantic correspondences at the image level and region or pixel level. The neural network is trained using contrasting image pairs including different objects and corresponding image pairs including different views of the same object. Conceptually, contrastive training pulls corresponding image pairs closer and pushes contrasting image pairs apart. An image-level contrastive loss is computed from the outputs (predictions) of the neural network and used to update parameters (weights) of the neural network via backpropagation. The neural network is also trained via pixel-level contrastive learning using only image pairs. Pixel-level contrastive learning receives an image pair, where each image includes an object in a particular category.
    Type: Application
    Filed: August 25, 2021
    Publication date: March 9, 2023
    Inventors: Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz
  • Publication number: 20230015989
    Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
    Type: Application
    Filed: July 1, 2021
    Publication date: January 19, 2023
    Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
  • Publication number: 20220292306
    Abstract: In various examples, training methods are described to generate a trained neural network that is robust to various environmental features. In an embodiment, training includes modifying images of a dataset and generating bounding boxes and/or other segmentation information for the modified images, which are used to train a neural network.
    Type: Application
    Filed: March 15, 2021
    Publication date: September 15, 2022
    Inventors: Subhashree Radhakrishnan, Partha Sriram, Farzin Aghdasi, Seunghwan Cha, Zhiding Yu
  • Publication number: 20220261593
    Abstract: Apparatuses, systems, and techniques to train one or more neural networks. In at least one embodiment, one or more neural networks are trained to perform segmentation tasks based at least in part on training data comprising bounding box annotations.
    Type: Application
    Filed: February 16, 2021
    Publication date: August 18, 2022
    Inventors: Zhiding Yu, Shiyi Lan, Chris Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Anima Anandkumar
  • Patent number: 11367268
    Abstract: Object re-identification refers to a process by which images that contain an object of interest are retrieved from a set of images captured using disparate cameras or in disparate environments. Object re-identification has many useful applications, particularly as it is applied to people (e.g. person tracking). Current re-identification processes rely on convolutional neural networks (CNNs) that learn re-identification for a particular object class from labeled training data specific to a certain domain (e.g. environment), but that do not apply well in other domains. The present disclosure provides cross-domain disentanglement of id-related and id-unrelated factors. In particular, the disentanglement is performed using a labeled image set and an unlabeled image set, respectively captured from different domains but for a same object class.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: June 21, 2022
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Yang Zou, Zhiding Yu, Jan Kautz
  • Publication number: 20210334644
    Abstract: Apparatuses, systems, and techniques to train one or more neural networks. In at least one embodiment, one or more neural networks are trained based, at least in part, on inferencing output from one or more second neural networks.
    Type: Application
    Filed: April 27, 2020
    Publication date: October 28, 2021
    Inventors: Zhiding Yu, Wuyang Chen, Anima Anandkumar
  • Publication number: 20210279841
    Abstract: Apparatuses, systems, and techniques for texture synthesis from small input textures in images using convolutional neural networks. In at least one embodiment, one or more convolutional layers are used in conjunction with one or more transposed convolution operations to generate a large textured output image from a small input textured image while preserving global features and texture, according to various novel techniques described herein.
    Type: Application
    Filed: March 9, 2020
    Publication date: September 9, 2021
    Inventors: Guilin Liu, Andrew Tao, Bryan Christopher Catanzaro, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum Reda, Karan Sapra, Brandon Rowlett
  • Publication number: 20210064907
    Abstract: Object re-identification refers to a process by which images that contain an object of interest are retrieved from a set of images captured using disparate cameras or in disparate environments. Object re-identification has many useful applications, particularly as it is applied to people (e.g. person tracking). Current re-identification processes rely on convolutional neural networks (CNNs) that learn re-identification for a particular object class from labeled training data specific to a certain domain (e.g. environment), but that do not apply well in other domains. The present disclosure provides cross-domain disentanglement of id-related and id-unrelated factors. In particular, the disentanglement is performed using a labeled image set and an unlabeled image set, respectively captured from different domains but for a same object class.
    Type: Application
    Filed: August 20, 2020
    Publication date: March 4, 2021
    Inventors: Xiaodong Yang, Yang Zou, Zhiding Yu, Jan Kautz
  • Publication number: 20200394458
    Abstract: Apparatuses, systems, and techniques to detect objects in images including digital representations of those objects. In at least one embodiment, one or more objects are detected in an image based, at least in part, on one or more pseudo-labels corresponding to said one or more objects.
    Type: Application
    Filed: June 17, 2019
    Publication date: December 17, 2020
    Inventors: Zhiding Yu, Jason Ren, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
  • Publication number: 20200302176
    Abstract: A neural network is trained to perform a re-identification task in which it is determined whether one or more features present in a first image appear also in a second image. During training, a generative portion of one or more neural networks generates variations of an input image, and a discriminative portion of the one or more neural networks learns to perform the re-identification task based at least in part on the variations of the image. During training, the generative and discriminative portions of the one or more neural networks share an encoder which encodes information used by the generative and discriminative portions.
    Type: Application
    Filed: March 18, 2019
    Publication date: September 24, 2020
    Inventors: Xiaodong Yang, Zhedong Zheng, Zhiding Yu
  • Patent number: 10410353
    Abstract: An image processing system for multi-label semantic edge detection in an image includes an image interface to receive an image of a scene including at least one object, a memory to store a neural network trained for performing a multi-label edge classification of input images assigning each pixel of edges of objects in the input images into one or multiple semantic classes, a processor to transform the image into a multi-label edge-map using the neural network detecting an edge of the object in the image and assigning multiple semantic labels to at least some pixels forming the edge, and an output interface to render the multi-label edge-map.
    Type: Grant
    Filed: September 28, 2017
    Date of Patent: September 10, 2019
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Chen Feng, Zhiding Yu, Srikumar Ramalingam
  • Publication number: 20190130220
    Abstract: A vehicle, system and method of navigating a vehicle. The vehicle and system include a digital camera for capturing a target image of a target domain of the vehicle, and a processor. The processor is configured to: determine a target segmentation loss for training the neural network to perform semantic segmentation of a target image in a target domain, determine a value of a pseudo-label of the target image by reducing the target segmentation loss while providing supervision of the training over the target domain, perform semantic segmentation on the target image using the trained neural network to segment the target image and classify an object in the target image, and navigate the vehicle based on the classified object in the target image.
    Type: Application
    Filed: April 10, 2018
    Publication date: May 2, 2019
    Inventors: Yang Zou, Zhiding Yu, Vijayakumar Bhagavatula, Jinsong Wang
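Several of the filings above (for example, publication 20230074706 on multi-level contrastive training) describe pulling corresponding image pairs closer in embedding space while pushing contrasting pairs apart. As an illustration only, not the claimed method of any patent listed here, an image-level contrastive loss of the InfoNCE family can be sketched as follows; the function name and shapes are hypothetical.

```python
import numpy as np

def contrastive_loss(anchors, positives, temperature=0.1):
    """Image-level contrastive loss sketch: each anchor embedding is
    pulled toward its corresponding positive (same row index) and
    pushed away from the other positives in the batch, which serve as
    negatives. Inputs are (N, D) arrays of embeddings."""
    # L2-normalize so the dot product is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature        # (N, N) similarity matrix
    # Row-wise log-softmax (shifted by the row max for stability)
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(a))
    # Negative log-likelihood of the diagonal (matching) entries
    return -log_probs[idx, idx].mean()
```

Minimizing this loss drives matching pairs together and non-matching pairs apart; a pixel- or region-level variant, as described in the abstract, would apply the same objective to dense feature maps rather than whole-image embeddings.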