Patents by Inventor Zhiding Yu

Zhiding Yu has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents that have been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240037756
    Abstract: Apparatuses, systems, and techniques to track one or more objects in one or more frames of a video. In at least one embodiment, one or more objects in one or more frames of a video are tracked based on, for example, one or more sets of embeddings.
    Type: Application
    Filed: May 5, 2023
    Publication date: February 1, 2024
    Inventors: De-An Huang, Zhiding Yu, Anima Anandkumar
  • Publication number: 20240013504
    Abstract: One embodiment of a method for training a machine learning model includes receiving a training data set that includes at least one image, text referring to at least one object included in the at least one image, and at least one bounding box annotation associated with the at least one object, and performing, based on the training data set, one or more operations to generate a trained machine learning model to segment images based on text, where the one or more operations to generate the trained machine learning model include minimizing a loss function that comprises at least one of a multiple instance learning loss term or an energy loss term.
    Type: Application
    Filed: October 31, 2022
    Publication date: January 11, 2024
    Inventors: Zhiding YU, Boyi LI, Chaowei XIAO, De-An HUANG, Weili NIE, Linxi FAN, Anima ANANDKUMAR
  • Publication number: 20230385687
    Abstract: Approaches for training data set size estimation for machine learning model systems and applications are described. Examples include a machine learning model training system that estimates target data requirements for training a machine learning model, given an approximate relationship between training data set size and model performance using one or more validation score estimation functions. To derive a validation score estimation function, a regression data set is generated from training data, and subsets of the regression data set are used to train the machine learning model. A validation score is computed for the subsets and used to compute regression function parameters to curve fit the selected regression function to the training data set. The validation score estimation function is then solved to output an estimate of the number of additional training samples needed to meet or exceed a target validation score.
    Type: Application
    Filed: May 31, 2022
    Publication date: November 30, 2023
    Inventors: Rafid Reza Mahmood, James Robert Lucas, David Jesus Acuna Marrero, Daiqing Li, Jonah Philion, Jose Manuel Alvarez Lopez, Zhiding Yu, Sanja Fidler, Marc Law
  • Publication number: 20230376849
    Abstract: In various examples, systems and methods estimate optimal training data set sizes for machine learning model systems and applications. Systems and methods are disclosed that estimate an amount of data to include in a training data set, where the training data set is then used to train one or more machine learning models to reach a target validation performance. To estimate the amount of training data, subsets of an initial training data set may be used to train the machine learning model(s) in order to determine estimates for the minimum amount of training data needed to train the machine learning model(s) to reach the target validation performance. The estimates may then be used to generate one or more functions, such as a cumulative density function and/or a probability density function, where the function(s) are then used to estimate the amount of training data needed to train the machine learning model(s).
    Type: Application
    Filed: May 16, 2023
    Publication date: November 23, 2023
    Inventors: Rafid Reza Mahmood, Marc Law, James Robert Lucas, Zhiding Yu, Jose Manuel Alvarez Lopez, Sanja Fidler
  • Patent number: 11790633
    Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
    Type: Grant
    Filed: July 1, 2021
    Date of Patent: October 17, 2023
    Assignee: NVIDIA Corporation
    Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
  • Publication number: 20230290135
    Abstract: Apparatuses, systems, and techniques to generate a robust representation of an image. In at least one embodiment, input tokens of an input image are received, and an inference about the input image is generated based on a vision transformer (ViT) system comprising at least one self-attention module to perform token mixing and a channel self-attention module to perform channel processing.
    Type: Application
    Filed: March 9, 2023
    Publication date: September 14, 2023
    Inventors: Daquan Zhou, Zhiding Yu, Enze Xie, Anima Anandkumar, Chaowei Xiao, Jose Manuel Alvarez Lopez
  • Publication number: 20230252692
    Abstract: Embodiments of the present disclosure relate to learning dense correspondences for images. Systems and methods are disclosed that disentangle structure and texture (or style) representations of GAN synthesized images by learning a dense pixel-level correspondence map for each image during image synthesis. A canonical coordinate frame is defined and a structure latent code for each generated image is warped to align with the canonical coordinate frame. In sum, the structure associated with the latent code is mapped into a shared coordinate space (canonical coordinate space), thereby establishing correspondences in the shared coordinate space. A correspondence generation system receives the warped coordinate correspondences as an encoded image structure. The encoded image structure and a texture latent code are used to synthesize an image. The shared coordinate space enables propagation of semantic labels from reference images to synthesized images.
    Type: Application
    Filed: September 1, 2022
    Publication date: August 10, 2023
    Inventors: Sifei Liu, Jiteng Mu, Shalini De Mello, Zhiding Yu, Jan Kautz
  • Publication number: 20230186428
    Abstract: Apparatuses, systems, and techniques for texture synthesis from small input textures in images using convolutional neural networks. In at least one embodiment, one or more convolutional layers are used in conjunction with one or more transposed convolution operations to generate a large textured output image from a small input textured image while preserving global features and texture, according to various novel techniques described herein.
    Type: Application
    Filed: February 6, 2023
    Publication date: June 15, 2023
    Inventors: Guilin Liu, Andrew Tao, Bryan Christopher Catanzaro, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum Reda, Karan Sapra, Brandon Rowlett
  • Publication number: 20230074706
    Abstract: A multi-level contrastive training strategy for training a neural network relies on image pairs (no other labels) to learn semantic correspondences at the image level and region or pixel level. The neural network is trained using contrasting image pairs including different objects and corresponding image pairs including different views of the same object. Conceptually, contrastive training pulls corresponding image pairs closer and pushes contrasting image pairs apart. An image-level contrastive loss is computed from the outputs (predictions) of the neural network and used to update parameters (weights) of the neural network via backpropagation. The neural network is also trained via pixel-level contrastive learning using only image pairs. Pixel-level contrastive learning receives an image pair, where each image includes an object in a particular category.
    Type: Application
    Filed: August 25, 2021
    Publication date: March 9, 2023
    Inventors: Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz
  • Publication number: 20230015989
    Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
    Type: Application
    Filed: July 1, 2021
    Publication date: January 19, 2023
    Inventors: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
  • Publication number: 20220292306
    Abstract: In various examples, training methods are described to generate a trained neural network that is robust to various environmental features. In an embodiment, training includes modifying images of a dataset and generating bounding boxes and/or other segmentation information for the modified images, which are used to train a neural network.
    Type: Application
    Filed: March 15, 2021
    Publication date: September 15, 2022
    Inventors: Subhashree Radhakrishnan, Partha Sriram, Farzin Aghdasi, Seunghwan Cha, Zhiding Yu
  • Publication number: 20220261593
    Abstract: Apparatuses, systems, and techniques to train one or more neural networks. In at least one embodiment, one or more neural networks are trained to perform segmentation tasks based at least in part on training data comprising bounding box annotations.
    Type: Application
    Filed: February 16, 2021
    Publication date: August 18, 2022
    Inventors: Zhiding Yu, Shiyi Lan, Chris Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Anima Anandkumar
  • Patent number: 11367268
    Abstract: Object re-identification refers to a process by which images that contain an object of interest are retrieved from a set of images captured using disparate cameras or in disparate environments. Object re-identification has many useful applications, particularly as it is applied to people (e.g. person tracking). Current re-identification processes rely on convolutional neural networks (CNNs) that learn re-identification for a particular object class from labeled training data specific to a certain domain (e.g. environment), but that do not apply well in other domains. The present disclosure provides cross-domain disentanglement of id-related and id-unrelated factors. In particular, the disentanglement is performed using a labeled image set and an unlabeled image set, respectively captured from different domains but for a same object class.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: June 21, 2022
    Assignee: NVIDIA Corporation
    Inventors: Xiaodong Yang, Yang Zou, Zhiding Yu, Jan Kautz
  • Publication number: 20210334644
    Abstract: Apparatuses, systems, and techniques to train one or more neural networks. In at least one embodiment, one or more neural networks are trained based, at least in part, on inferencing output from one or more second neural networks.
    Type: Application
    Filed: April 27, 2020
    Publication date: October 28, 2021
    Inventors: Zhiding Yu, Wuyang Chen, Anima Anandkumar
  • Publication number: 20210279841
    Abstract: Apparatuses, systems, and techniques for texture synthesis from small input textures in images using convolutional neural networks. In at least one embodiment, one or more convolutional layers are used in conjunction with one or more transposed convolution operations to generate a large textured output image from a small input textured image while preserving global features and texture, according to various novel techniques described herein.
    Type: Application
    Filed: March 9, 2020
    Publication date: September 9, 2021
    Inventors: Guilin Liu, Andrew Tao, Bryan Christopher Catanzaro, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum Reda, Karan Sapra, Brandon Rowlett
  • Publication number: 20210064907
    Abstract: Object re-identification refers to a process by which images that contain an object of interest are retrieved from a set of images captured using disparate cameras or in disparate environments. Object re-identification has many useful applications, particularly as it is applied to people (e.g. person tracking). Current re-identification processes rely on convolutional neural networks (CNNs) that learn re-identification for a particular object class from labeled training data specific to a certain domain (e.g. environment), but that do not apply well in other domains. The present disclosure provides cross-domain disentanglement of id-related and id-unrelated factors. In particular, the disentanglement is performed using a labeled image set and an unlabeled image set, respectively captured from different domains but for a same object class.
    Type: Application
    Filed: August 20, 2020
    Publication date: March 4, 2021
    Inventors: Xiaodong Yang, Yang Zou, Zhiding Yu, Jan Kautz
  • Publication number: 20200394458
    Abstract: Apparatuses, systems, and techniques to detect objects in images including digital representations of those objects. In at least one embodiment, one or more objects are detected in an image based, at least in part, on one or more pseudo-labels corresponding to said one or more objects.
    Type: Application
    Filed: June 17, 2019
    Publication date: December 17, 2020
    Inventors: Zhiding Yu, Jason Ren, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
  • Publication number: 20200302176
    Abstract: A neural network is trained to perform a re-identification task in which it is determined whether one or more features present in a first image appear also in a second image. During training, a generative portion of one or more neural networks generates variations of an input image, and a discriminative portion of the one or more neural networks learns to perform the re-identification task based at least in part on the variations of the image. During training, the generative and discriminative portions of the one or more neural networks share an encoder which encodes information used by the generative and discriminative portions.
    Type: Application
    Filed: March 18, 2019
    Publication date: September 24, 2020
    Inventors: Xiaodong Yang, Zhedong Zheng, Zhiding Yu
  • Patent number: 10410353
    Abstract: An image processing system for multi-label semantic edge detection in an image includes an image interface to receive an image of a scene including at least one object, a memory to store a neural network trained for performing a multi-label edge classification of input images assigning each pixel of edges of objects in the input images into one or multiple semantic classes, a processor to transform the image into a multi-label edge-map using the neural network detecting an edge of the object in the image and assigning multiple semantic labels to at least some pixels forming the edge, and an output interface to render the multi-label edge-map.
    Type: Grant
    Filed: September 28, 2017
    Date of Patent: September 10, 2019
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Chen Feng, Zhiding Yu, Srikumar Ramalingam
  • Publication number: 20190130220
    Abstract: A vehicle, system and method of navigating a vehicle. The vehicle and system include a digital camera for capturing a target image of a target domain of the vehicle, and a processor. The processor is configured to: determine a target segmentation loss for training the neural network to perform semantic segmentation of a target image in a target domain, determine a value of a pseudo-label of the target image by reducing the target segmentation loss while providing supervision of the training over the target domain, perform semantic segmentation on the target image using the trained neural network to segment the target image and classify an object in the target image, and navigate the vehicle based on the classified object in the target image.
    Type: Application
    Filed: April 10, 2018
    Publication date: May 2, 2019
    Inventors: Yang Zou, Zhiding Yu, Vijayakumar Bhagavatula, Jinsong Wang
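Several of the filings above (for example, publication 20230074706 on multi-level contrastive training) describe pulling corresponding image pairs closer in embedding space while pushing contrasting pairs apart. As an illustration only, not the claimed method of any patent listed here, an image-level contrastive loss of the InfoNCE family can be sketched as follows; the function name and shapes are hypothetical.

```python
import numpy as np

def contrastive_loss(anchors, positives, temperature=0.1):
    """Image-level contrastive loss sketch: each anchor embedding is
    pulled toward its corresponding positive (same row index) and
    pushed away from the other positives in the batch, which serve as
    negatives. Inputs are (N, D) arrays of embeddings."""
    # L2-normalize so the dot product is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature        # (N, N) similarity matrix
    # Row-wise log-softmax (shifted by the row max for stability)
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(a))
    # Negative log-likelihood of the diagonal (matching) entries
    return -log_probs[idx, idx].mean()
```

Minimizing this loss drives matching pairs together and non-matching pairs apart; a pixel- or region-level variant, as described in the abstract, would apply the same objective to dense feature maps rather than whole-image embeddings.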