Patents by Inventor Ming-Yu Liu

Ming-Yu Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250111588
    Abstract: Systems and methods of the present disclosure include interactive editing for generated three-dimensional (3D) models, such as those represented by neural radiance fields (NeRFs). A 3D model may be presented to a user in which the user may identify one or more localized regions for editing and/or modification. The localized regions may be selected and a corresponding 3D volume for that region may be provided to one or more generative networks, along with a prompt, to generate new content for the localized regions. Each of the original NeRF and the newly generated NeRF for the new content may then be combined into a single NeRF for a combined 3D representation with the original content and the localized modifications.
    Type: Application
    Filed: October 2, 2023
    Publication date: April 3, 2025
    Inventors: Karsten Julian Kreis, Maria Shugrina, Ming-Yu Liu, Or Perel, Sanja Fidler, Towaki Alan Takikawa, Tsung-Yi Lin, Xiaohui Zeng
  • Publication number: 20250111222
    Abstract: Performance of a neural network is usually a function of the capacity, or complexity, of the neural network, including the depth of the neural network (i.e. the number of layers in the neural network) and/or the width of the neural network (i.e. the number of hidden channels). However, improving performance of a neural network by simply increasing its capacity has drawbacks, the most notable being the increased computational cost of a higher-capacity neural network. Since modern neural networks are configured such that the same neural network is evaluated regardless of the input, a higher capacity neural network means a higher computational cost incurred per input processed. The present disclosure provides for a multi-layer neural network that allows for dynamic path selection through the neural network when processing an input, which in turn can allow for increased neural network capacity without incurring the typical increased computation cost associated therewith.
    Type: Application
    Filed: September 29, 2023
    Publication date: April 3, 2025
    Inventors: Zekun Hao, Ming-Yu Liu, Arun Mallya
  • Publication number: 20250061153
    Abstract: A generative model can be used for generation of spatial layouts and graphs. Such a model can progressively grow these layouts and graphs based on local statistics, where nodes can represent spatial control points of the layout, and edges can represent segments or paths between nodes, such as may correspond to road segments. A generative model can utilize an encoder-decoder architecture where the encoder is a recurrent neural network (RNN) that encodes local incoming paths into a node and the decoder is another RNN that generates outgoing nodes and edges connecting an existing node to the newly generated nodes. Generation is done iteratively, and can finish once all nodes are visited or another end condition is satisfied. Such a model can generate layouts by additionally conditioning on a set of attributes, giving control to a user in generating the layout.
    Type: Application
    Filed: November 1, 2024
    Publication date: February 20, 2025
    Inventors: Hang Chu, Daiqing Li, David Jesus Acuna Marrero, Amlan Kar, Maria Shugrina, Ming-Yu Liu, Antonio Torralba Barriuso, Sanja Fidler
  • Patent number: 12175350
    Abstract: In at least one embodiment, differentiable neural architecture search and reinforcement learning are combined under one framework to discover network architectures with desired properties such as high accuracy, low latency, or both. In at least one embodiment, an objective function for search based on generalization error prevents the selection of architectures prone to overfitting.
    Type: Grant
    Filed: September 10, 2019
    Date of Patent: December 24, 2024
    Assignee: NVIDIA Corporation
    Inventors: Arash Vahdat, Arun Mohanray Mallya, Ming-Yu Liu, Jan Kautz
  • Publication number: 20240406405
    Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to identify a frame of a sequence of frames as a blurred frame based at least in part on a first variance of motion (VoM) of the frame being less than or equal to an adaptive threshold that is based in part on a moving average of variance of motion (MAoV) determined using one or more reference frames.
    Type: Application
    Filed: August 8, 2024
    Publication date: December 5, 2024
    Inventors: Aurobinda Maharana, Vignesh Ungrapalli, Ming-Yu Liu
  • Publication number: 20240397077
    Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.
    Type: Application
    Filed: July 22, 2024
    Publication date: November 28, 2024
    Inventors: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
  • Publication number: 20240338871
    Abstract: One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.
    Type: Application
    Filed: June 18, 2024
    Publication date: October 10, 2024
    Inventors: Donghoon LEE, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Jan Kautz
  • Publication number: 20240303494
    Abstract: A few-shot, unsupervised image-to-image translation (“FUNIT”) algorithm is disclosed that accepts as input images of previously-unseen target classes. These target classes are specified at inference time by only a few images, such as a single image or a pair of images, of an object of the target type. A FUNIT network can be trained using a data set containing images of many different object classes, in order to translate images from one class to another class by leveraging few input images of the target class. By learning to extract appearance patterns from the few input images for the translation task, the network learns a generalizable appearance pattern extractor that can be applied to images of unseen classes at translation time for a few-shot image-to-image translation task.
    Type: Application
    Filed: May 16, 2024
    Publication date: September 12, 2024
    Inventors: Ming-Yu LIU, Xun HUANG, Tero Tapani KARRAS, Timo AILA, Jaakko LEHTINEN
  • Publication number: 20240296627
    Abstract: In various examples, a deep three-dimensional (3D) conditional generative model is implemented that can synthesize high resolution 3D shapes using simple guides—such as coarse voxels, point clouds, etc.—by marrying implicit and explicit 3D representations into a hybrid 3D representation. The present approach may directly optimize for the reconstructed surface, allowing for the synthesis of finer geometric details with fewer artifacts. The systems and methods described herein may use a deformable tetrahedral grid that encodes a discretized signed distance function (SDF) and a differentiable marching tetrahedral layer that converts the implicit SDF representation to an explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology as well as generation of the hierarchy of subdivisions using reconstruction and adversarial losses defined explicitly on the surface mesh.
    Type: Application
    Filed: May 13, 2024
    Publication date: September 5, 2024
    Inventors: Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler
  • Patent number: 12075061
    Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to identify a frame of a sequence of frames as a blurred frame based at least in part on a first variance of motion (VoM) of the frame being less than or equal to an adaptive threshold that is based in part on a moving average of variance of motion (MAoV) determined using one or more reference frames.
    Type: Grant
    Filed: September 29, 2022
    Date of Patent: August 27, 2024
    Assignee: NVIDIA Corporation
    Inventors: Aurobinda Maharana, Vignesh Ungrapalli, Ming-Yu Liu
  • Publication number: 20240253217
    Abstract: Apparatuses, systems, and techniques to calculate a combined loss value based on applying one or more loss functions to a plurality of samples generated by a diffusion model to update the samples to determine synthesized motions of one or more objects.
    Type: Application
    Filed: December 13, 2023
    Publication date: August 1, 2024
    Inventors: Arash Vahdat, Hongxu Yin, Jan Kautz, Jiaming Song, Ming-Yu Liu, Morteza Mardani, Qinsheng Zhang
  • Publication number: 20240257460
    Abstract: Apparatuses, systems, and techniques to generate pixels based on other pixels. In at least one embodiment, one or more neural networks are used to generate one or more pixels based, at least in part, on sets of pixels surrounding the one or more pixels.
    Type: Application
    Filed: November 18, 2022
    Publication date: August 1, 2024
    Inventors: Chen-Hsuan Lin, Zhaoshuo Li, Thomas Müller-Höhne, Alex John Bauld Evans, Ming-Yu Liu, Alexander Georg Keller
  • Patent number: 12047595
    Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.
    Type: Grant
    Filed: September 29, 2022
    Date of Patent: July 23, 2024
    Assignee: NVIDIA Corporation
    Inventors: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
  • Publication number: 20240193887
    Abstract: Synthesis of high-quality 3D shapes with smooth surfaces has various creative and practical use cases, such as 3D content creation and CAD modeling. A vector field decoder neural network is trained to predict a generative vector field (GVF) representation of a 3D shape from a latent representation (latent code or feature volume) of the 3D shape. The GVF representation is agnostic to surface orientation, all dimensions of the vector field vary smoothly, the GVF can represent both watertight and non-watertight 3D shapes, and there is a one-to-one mapping between a predicted 3D shape and the ground truth 3D shape (i.e., the mapping is bijective). The vector field decoder can synthesize 3D shapes in multiple categories and can also synthesize 3D shapes for objects that were not included in the training dataset. In other words, the vector field decoder is also capable of zero-shot generation.
    Type: Application
    Filed: July 28, 2023
    Publication date: June 13, 2024
    Inventors: Zekun Hao, Ming-Yu Liu, Arun Mohanray Mallya
  • Publication number: 20240161250
    Abstract: Techniques are disclosed herein for generating a content item. The techniques include performing one or more first denoising operations based on an input and a first machine learning model to generate a first content item, and performing one or more second denoising operations based on the input, the first content item, and a second machine learning model to generate a second content item, where the first machine learning model is trained to denoise content items having an amount of corruption within a first corruption range, the second machine learning model is trained to denoise content items having an amount of corruption within a second corruption range, and the second corruption range is lower than the first corruption range.
    Type: Application
    Filed: October 11, 2023
    Publication date: May 16, 2024
    Inventors: Yogesh BALAJI, Timo Oskari AILA, Miika AITTALA, Bryan CATANZARO, Xun HUANG, Tero Tapani KARRAS, Karsten KREIS, Samuli LAINE, Ming-Yu LIU, Seungjun NAH, Jiaming SONG, Arash VAHDAT, Qinsheng ZHANG
  • Publication number: 20240161403
    Abstract: Text-to-image generation generally refers to the process of generating an image from one or more text prompts input by a user. While artificial intelligence has been a valuable tool for text-to-image generation, current artificial intelligence-based solutions are more limited as it relates to text-to-3D content creation. For example, these solutions are oftentimes category-dependent, or synthesize 3D content at a low resolution. The present disclosure provides a process and architecture for high-resolution text-to-3D content creation.
    Type: Application
    Filed: August 9, 2023
    Publication date: May 16, 2024
    Inventors: Chen-Hsuan Lin, Tsung-Yi Lin, Ming-Yu Liu, Sanja Fidler, Karsten Kreis, Luming Tang, Xiaohui Zeng, Jun Gao, Xun Huang, Towaki Takikawa
  • Patent number: 11983815
    Abstract: In various examples, a deep three-dimensional (3D) conditional generative model is implemented that can synthesize high resolution 3D shapes using simple guides—such as coarse voxels, point clouds, etc.—by marrying implicit and explicit 3D representations into a hybrid 3D representation. The present approach may directly optimize for the reconstructed surface, allowing for the synthesis of finer geometric details with fewer artifacts. The systems and methods described herein may use a deformable tetrahedral grid that encodes a discretized signed distance function (SDF) and a differentiable marching tetrahedral layer that converts the implicit SDF representation to an explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology as well as generation of the hierarchy of subdivisions using reconstruction and adversarial losses defined explicitly on the surface mesh.
    Type: Grant
    Filed: April 11, 2022
    Date of Patent: May 14, 2024
    Assignee: NVIDIA Corporation
    Inventors: Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler
  • Publication number: 20240144568
    Abstract: Apparatuses, systems, and techniques are presented to generate digital content. In at least one embodiment, one or more neural networks are used to generate video information based at least in part upon voice information and a combination of image features and facial landmarks corresponding to one or more images of a person.
    Type: Application
    Filed: September 6, 2022
    Publication date: May 2, 2024
    Inventors: Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Jose Rafael Valle da Costa, Ming-Yu Liu
  • Publication number: 20240114162
    Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.
    Type: Application
    Filed: September 29, 2022
    Publication date: April 4, 2024
    Inventors: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
  • Publication number: 20240114144
    Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to identify a frame of a sequence of frames as a blurred frame based at least in part on a first variance of motion (VoM) of the frame being less than or equal to an adaptive threshold that is based in part on a moving average of variance of motion (MAoV) determined using one or more reference frames.
    Type: Application
    Filed: September 29, 2022
    Publication date: April 4, 2024
    Inventors: Aurobinda Maharana, Vignesh Ungrapalli, Ming-Yu Liu
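
The blurred-frame test described in the abstract above (a frame's variance of motion, VoM, compared against an adaptive threshold derived from a moving average of VoM, MAoV, over reference frames) could look roughly like the following. This is only an illustrative sketch and not the patented method: the function name `flag_blurred_frames`, the window size, the `ratio` scaling factor, and the choice of a simple mean over recent non-blurred frames are all assumptions, and the per-frame VoM values are taken as given because the abstract does not say how they are computed.

```python
from collections import deque


def flag_blurred_frames(vom_values, window=4, ratio=0.5):
    """Return one boolean per frame: True where the frame is flagged as blurred.

    vom_values: per-frame variance-of-motion values (assumed precomputed).
    window:     number of recent reference frames kept (hypothetical parameter).
    ratio:      scale applied to the moving average to get the adaptive
                threshold (hypothetical parameter).
    """
    reference_voms = deque(maxlen=window)  # VoM of recent reference frames
    flags = []
    for vom in vom_values:
        if reference_voms:
            # Moving average of VoM (MAoV) over the current reference frames.
            maov = sum(reference_voms) / len(reference_voms)
            # Blurred when VoM is less than or equal to the adaptive threshold.
            blurred = vom <= ratio * maov
        else:
            # No reference frames yet, so the first frame is never flagged.
            blurred = False
        flags.append(blurred)
        if not blurred:
            # Assumption: only non-blurred frames serve as reference frames.
            reference_voms.append(vom)
    return flags


if __name__ == "__main__":
    print(flag_blurred_frames([10.0, 12.0, 11.0, 2.0, 10.5]))
```

In this sketch, excluding flagged frames from the reference window keeps a run of blurred frames from dragging the threshold down; whether the filing does the same is not stated in the abstract.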