Patents by Inventor Ming-Yu Liu

Ming-Yu Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speech-driven animation using one or more neural networks

Patent number: 12367628

Abstract: Apparatuses, systems, and techniques are presented to generate digital content. In at least one embodiment, one or more neural networks are used to generate video information based at least in part upon voice information and a combination of image features and facial landmarks corresponding to one or more images of a person.

Type: Grant

Filed: September 6, 2022

Date of Patent: July 22, 2025

Assignee: NVIDIA Corporation

Inventors: Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Jose Rafael Valle da Costa, Ming-Yu Liu
Generating images of object motion using one or more neural networks

Patent number: 12333638

Abstract: Apparatuses, systems, and techniques are presented to reconstruct one or more images. In at least one embodiment, one or more neural networks are used to generate one or more images of one or more objects based, at least in part, on input indicating motion of the one or more objects.

Type: Grant

Filed: November 9, 2021

Date of Patent: June 17, 2025

Assignee: NVIDIA Corporation

Inventors: Ting-Chun Wang, Tim Brooks, Ming-Yu Liu, Tero Karras, Jaakko Lehtinen
NEURAL NETWORKS TO GENERATE OBJECTS WITHIN DIFFERENT IMAGES

Publication number: 20250166237

Abstract: Apparatuses, processors, computing systems, devices, non-transitory computer medium, and/or methods for using neural networks for generating multiple related images. In at least one embodiment, a processor includes circuitry to use one or more neural networks to generate several images, where each image includes a same object (e.g., same subject) and different backgrounds. For example, a processor including one or more circuits to use one or more neural networks to generate one or more objects (e.g., an animal, a vehicle, a person) within two or more different images (e.g., different backgrounds such as weather, season, environment) based, at least in part, on one or more indications (e.g., text prompts) by one or more users indicating content of at least one of the two or more different images (e.g., objects and/or backgrounds for each image in text such as adjectives and nouns) other than the one or more objects.

Type: Application

Filed: November 22, 2023

Publication date: May 22, 2025

Inventors: Yu Zeng, Yogesh Balaji, Ting-Chun Wang, Xun Huang, Ming-Yu Liu
INTERACTIVE NEURAL FIELD EDITING IN CONTENT GENERATION SYSTEMS AND APPLICATIONS

Publication number: 20250111588

Abstract: Systems and methods of the present disclosure include interactive editing for generated three-dimensional (3D) models, such as those represented by neural radiance fields (NeRFs). A 3D model may be presented to a user in which the user may identify one or more localized regions for editing and/or modification. The localized regions may be selected and a corresponding 3D volume for that region may be provided to one or more generative networks, along with a prompt, to generate new content for the localized regions. Each of the original NeRF and the newly generated NeRF for the new content may then be combined into a single NeRF for a combined 3D representation with the original content and the localized modifications.

Type: Application

Filed: October 2, 2023

Publication date: April 3, 2025

Inventors: Karsten Julian Kreis, Maria Shugrina, Ming-Yu Liu, Or Perel, Sanja Fidler, Towaki Alan Takikawa, Tsung-Yi Lin, Xiaohui Zeng
DYNAMIC PATH SELECTION FOR PROCESSING THROUGH A MULTI-LAYER NEURAL NETWORK

Publication number: 20250111222

Abstract: Performance of a neural network is usually a function of the capacity, or complexity, of the neural network, including the depth of the neural network (i.e. the number of layers in the neural network) and/or the width of the neural network (i.e. the number of hidden channels). However, improving performance of a neural network by simply increasing its capacity has drawbacks, the most notable being the increased computational cost of a higher-capacity neural network. Since modern neural networks are configured such that the same neural network is evaluated regardless of the input, a higher capacity neural network means a higher computational cost incurred per input processed. The present disclosure provides for a multi-layer neural network that allows for dynamic path selection through the neural network when processing an input, which in turn can allow for increased neural network capacity without incurring the typical increased computation cost associated therewith.

Type: Application

Filed: September 29, 2023

Publication date: April 3, 2025

Inventors: Zekun Hao, Ming-Yu Liu, Arun Mallya
ITERATIVE SPATIAL GRAPH GENERATION

Publication number: 20250061153

Abstract: A generative model can be used for generation of spatial layouts and graphs. Such a model can progressively grow these layouts and graphs based on local statistics, where nodes can represent spatial control points of the layout, and edges can represent segments or paths between nodes, such as may correspond to road segments. A generative model can utilize an encoder-decoder architecture where the encoder is a recurrent neural network (RNN) that encodes local incoming paths into a node and the decoder is another RNN that generates outgoing nodes and edges connecting an existing node to the newly generated nodes. Generation is done iteratively, and can finish once all nodes are visited or another end condition is satisfied. Such a model can generate layouts by additionally conditioning on a set of attributes, giving control to a user in generating the layout.

Type: Application

Filed: November 1, 2024

Publication date: February 20, 2025

Inventors: Hang Chu, Daiqing Li, David Jesus Acuna Marrero, Amlan Kar, Maria Shugrina, Ming-Yu Liu, Antonio Torralba Barriuso, Sanja Fidler
Machine-learning-based architecture search method for a neural network

Patent number: 12175350

Abstract: In at least one embodiment, differentiable neural architecture search and reinforcement learning are combined under one framework to discover network architectures with desired properties such as high accuracy, low latency, or both. In at least one embodiment, an objective function for search based on generalization error prevents the selection of architectures prone to overfitting.

Type: Grant

Filed: September 10, 2019

Date of Patent: December 24, 2024

Assignee: NVIDIA Corporation

Inventors: Arash Vahdat, Arun Mohanray Mallya, Ming-Yu Liu, Jan Kautz
FRAME SELECTION FOR STREAMING APPLICATIONS

Publication number: 20240406405

Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to identify a frame of a sequence of frames as a blurred frame based at least in part on a first variance of motion (VoM) of the frame being less than or equal to an adaptive threshold that is based in part on a moving average of variance of motion (MAoV) determined using one or more reference frames.

Type: Application

Filed: August 8, 2024

Publication date: December 5, 2024

Inventors: Aurobinda Maharana, Vignesh Ungrapalli, Ming-Yu Liu
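
The blurred-frame test in the abstract above (publication 20240406405) can be sketched in a few lines. This is only an illustration under stated assumptions, not the claimed method: the abstract does not define how variance of motion (VoM) is computed, so the sketch assumes it is the population variance of per-block motion-vector magnitudes. The `BlurDetector` class, the `window` length, and the `scale` factor applied to the moving average (MAoV) are all hypothetical.

```python
from collections import deque
from statistics import pvariance


def variance_of_motion(motion_magnitudes):
    """Assumed VoM: population variance of per-block motion magnitudes."""
    return pvariance(motion_magnitudes)


class BlurDetector:
    """Flags a frame as blurred when its VoM <= scale * moving average of VoM.

    `window` and `scale` are hypothetical tuning parameters, not values
    taken from the patent application.
    """

    def __init__(self, window=8, scale=0.5):
        # VoM values of recent reference (non-blurred) frames.
        self.history = deque(maxlen=window)
        self.scale = scale

    def is_blurred(self, motion_magnitudes):
        vom = variance_of_motion(motion_magnitudes)
        if not self.history:
            # No reference frames yet: accept the frame as a reference.
            self.history.append(vom)
            return False
        maov = sum(self.history) / len(self.history)
        blurred = vom <= self.scale * maov  # adaptive threshold
        if not blurred:
            # Only non-blurred frames update the moving average.
            self.history.append(vom)
        return blurred
```

In this sketch, only frames that pass the test feed the moving average, so a run of blurred frames does not drag the threshold down. That is a design choice of the example, not something the abstract states.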
FRAME SELECTION FOR STREAMING APPLICATIONS

Publication number: 20240397077

Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.

Type: Application

Filed: July 22, 2024

Publication date: November 28, 2024

Inventors: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
CONTEXT-AWARE SYNTHESIS AND PLACEMENT OF OBJECT INSTANCES

Publication number: 20240338871

Abstract: One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.

Type: Application

Filed: June 18, 2024

Publication date: October 10, 2024

Inventors: Donghoon LEE, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Jan Kautz
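
One way to read "the affine transformation represents a bounding box" in the abstract above (publication 20240338871) is that the transform maps a canonical unit square onto the region where the object is placed. The sketch below illustrates that reading only. The unit-square convention, the 2x3 matrix layout, and the function name `affine_to_bbox` are assumptions, not details taken from the application.

```python
def affine_to_bbox(affine):
    """Map the unit square through a 2x3 affine matrix and return the
    axis-aligned box (x_min, y_min, x_max, y_max) that encloses it.

    `affine` is ((a11, a12, tx), (a21, a22, ty)); this layout is an
    assumption made for illustration.
    """
    (a11, a12, tx), (a21, a22, ty) = affine
    corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
    # Apply x' = a11*x + a12*y + tx and y' = a21*x + a22*y + ty.
    xs = [a11 * x + a12 * y + tx for x, y in corners]
    ys = [a21 * x + a22 * y + ty for x, y in corners]
    return min(xs), min(ys), max(xs), max(ys)
```

For a pure scale-and-translate transform such as `((2, 0, 10), (0, 3, 20))`, the box has its corner at the translation and its width and height equal to the scale factors.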
METHOD FOR FEW-SHOT UNSUPERVISED IMAGE-TO-IMAGE TRANSLATION

Publication number: 20240303494

Abstract: A few-shot, unsupervised image-to-image translation (“FUNIT”) algorithm is disclosed that accepts as input images of previously-unseen target classes. These target classes are specified at inference time by only a few images, such as a single image or a pair of images, of an object of the target type. A FUNIT network can be trained using a data set containing images of many different object classes, in order to translate images from one class to another class by leveraging few input images of the target class. By learning to extract appearance patterns from the few input images for the translation task, the network learns a generalizable appearance pattern extractor that can be applied to images of unseen classes at translation time for a few-shot image-to-image translation task.

Type: Application

Filed: May 16, 2024

Publication date: September 12, 2024

Inventors: Ming-Yu LIU, Xun HUANG, Tero Tapani KARRAS, Timo AILA, Jaakko LEHTINEN
SYNTHESIZING HIGH RESOLUTION 3D SHAPES FROM LOWER RESOLUTION REPRESENTATIONS FOR SYNTHETIC DATA GENERATION SYSTEMS AND APPLICATIONS

Publication number: 20240296627

Abstract: In various examples, a deep three-dimensional (3D) conditional generative model is implemented that can synthesize high resolution 3D shapes using simple guides—such as coarse voxels, point clouds, etc.—by marrying implicit and explicit 3D representations into a hybrid 3D representation. The present approach may directly optimize for the reconstructed surface, allowing for the synthesis of finer geometric details with fewer artifacts. The systems and methods described herein may use a deformable tetrahedral grid that encodes a discretized signed distance function (SDF) and a differentiable marching tetrahedral layer that converts the implicit SDF representation to an explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology as well as generation of the hierarchy of subdivisions using reconstruction and adversarial losses defined explicitly on the surface mesh.

Type: Application

Filed: May 13, 2024

Publication date: September 5, 2024

Inventors: Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler
Frame selection for streaming applications

Patent number: 12075061

Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to identify a frame of a sequence of frames as a blurred frame based at least in part on a first variance of motion (VoM) of the frame being less than or equal to an adaptive threshold that is based in part on a moving average of variance of motion (MAoV) determined using one or more reference frames.

Type: Grant

Filed: September 29, 2022

Date of Patent: August 27, 2024

Assignee: Nvidia Corporation

Inventors: Aurobinda Maharana, Vignesh Ungrapalli, Ming-Yu Liu
NEURAL NETWORKS TO GENERATE PIXELS

Publication number: 20240257460

Abstract: Apparatuses, systems, and techniques to generate pixels based on other pixels. In at least one embodiment, one or more neural networks are used to generate one or more pixels based, at least in part, on sets of pixels surrounding the one or more pixels.

Type: Application

Filed: November 18, 2022

Publication date: August 1, 2024

Inventors: Chen-Hsuan Lin, Zhaoshuo Li, Thomas Müller-Höhne, Alex John Bauld Evans, Ming-Yu Liu, Alexander Georg Keller
LOSS-GUIDED DIFFUSION MODELS

Publication number: 20240253217

Abstract: Apparatuses, systems, and techniques to calculate a combined loss value based on applying one or more loss functions to a plurality of samples generated by a diffusion model to update the samples to determine synthesized motions of one or more objects.

Type: Application

Filed: December 13, 2023

Publication date: August 1, 2024

Inventors: Arash Vahdat, Hongxu Yin, Jan Kautz, Jiaming Song, Ming-Yu Liu, Morteza Mardani, Qinsheng Zhang
Frame selection for streaming applications

Patent number: 12047595

Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.

Type: Grant

Filed: September 29, 2022

Date of Patent: July 23, 2024

Assignee: Nvidia Corporation

Inventors: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
NEURAL VECTOR FIELDS FOR 3D SHAPE GENERATION

Publication number: 20240193887

Abstract: Synthesis of high-quality 3D shapes with smooth surfaces has various creative and practical use cases, such as 3D content creation and CAD modeling. A vector field decoder neural network is trained to predict a generative vector field (GVF) representation of a 3D shape from a latent representation (latent code or feature volume) of the 3D shape. The GVF representation is agnostic to surface orientation, all dimensions of the vector field vary smoothly, the GVF can represent both watertight and non-watertight 3D shapes, and there is a one-to-one mapping between a predicted 3D shape and the ground truth 3D shape (i.e., the mapping is bijective). The vector field decoder can synthesize 3D shapes in multiple categories and can also synthesize 3D shapes for objects that were not included in the training dataset. In other words, the vector field decoder is also capable of zero-shot generation.

Type: Application

Filed: July 28, 2023

Publication date: June 13, 2024

Inventors: Zekun Hao, Ming-Yu Liu, Arun Mohanray Mallya
HIGH RESOLUTION TEXT-TO-3D CONTENT CREATION

Publication number: 20240161403

Abstract: Text-to-image generation generally refers to the process of generating an image from one or more text prompts input by a user. While artificial intelligence has been a valuable tool for text-to-image generation, current artificial intelligence-based solutions are more limited as it relates to text-to-3D content creation. For example, these solutions are oftentimes category-dependent, or synthesize 3D content at a low resolution. The present disclosure provides a process and architecture for high-resolution text-to-3D content creation.

Type: Application

Filed: August 9, 2023

Publication date: May 16, 2024

Inventors: Chen-Hsuan Lin, Tsung-Yi Lin, Ming-Yu Liu, Sanja Fidler, Karsten Kreis, Luming Tang, Xiaohui Zeng, Jun Gao, Xun Huang, Towaki Takikawa
TECHNIQUES FOR DENOISING DIFFUSION USING AN ENSEMBLE OF EXPERT DENOISERS

Publication number: 20240161250

Abstract: Techniques are disclosed herein for generating a content item. The techniques include performing one or more first denoising operations based on an input and a first machine learning model to generate a first content item, and performing one or more second denoising operations based on the input, the first content item, and a second machine learning model to generate a second content item, where the first machine learning model is trained to denoise content items having an amount of corruption within a first corruption range, the second machine learning model is trained to denoise content items having an amount of corruption within a second corruption range, and the second corruption range is lower than the first corruption range.

Type: Application

Filed: October 11, 2023

Publication date: May 16, 2024

Inventors: Yogesh BALAJI, Timo Oskari AILA, Miika AITTALA, Bryan CATANZARO, Xun HUANG, Tero Tapani KARRAS, Karsten KREIS, Samuli LAINE, Ming-Yu LIU, Seungjun NAH, Jiaming SONG, Arash VAHDAT, Qinsheng ZHANG
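
The routing idea in the abstract above (publication 20240161250) is that each denoiser handles its own range of corruption, with the high-corruption model applied first. It can be sketched as follows. This is a toy illustration under assumptions: corruption is represented by a scalar noise level `sigma`, the experts are plain callables, and `pick_expert`, `denoise`, and `make_toy_expert` are hypothetical names rather than anything from the application.

```python
def pick_expert(sigma, experts):
    """Return the denoiser whose (low, high) corruption range contains sigma.

    `experts` is a list of ((low, high), denoiser) pairs; the first match wins.
    """
    for (low, high), denoiser in experts:
        if low <= sigma <= high:
            return denoiser
    raise ValueError(f"no expert covers noise level {sigma}")


def denoise(x, sigmas, experts):
    """Run the denoising steps, routing each step to the matching expert.

    `sigmas` is assumed to be ordered from most to least corrupted.
    """
    for sigma in sigmas:
        x = pick_expert(sigma, experts)(x, sigma)
    return x


def make_toy_expert(name, trace):
    """Stand-in for a trained model: halves the input and logs its name."""
    def expert(x, sigma):
        trace.append(name)
        return x * 0.5
    return expert
```

With two toy experts covering `(0.5, 1.0)` and `(0.0, 0.5)` and a schedule of `[1.0, 0.7, 0.4, 0.1]`, the first two steps go to the high-corruption expert and the last two to the low-corruption one. A real system would replace the toy callables with trained networks.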
Synthesizing high resolution 3D shapes from lower resolution representations for synthetic data generation systems and applications

Patent number: 11983815

Abstract: In various examples, a deep three-dimensional (3D) conditional generative model is implemented that can synthesize high resolution 3D shapes using simple guides—such as coarse voxels, point clouds, etc.—by marrying implicit and explicit 3D representations into a hybrid 3D representation. The present approach may directly optimize for the reconstructed surface, allowing for the synthesis of finer geometric details with fewer artifacts. The systems and methods described herein may use a deformable tetrahedral grid that encodes a discretized signed distance function (SDF) and a differentiable marching tetrahedral layer that converts the implicit SDF representation to an explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology as well as generation of the hierarchy of subdivisions using reconstruction and adversarial losses defined explicitly on the surface mesh.

Type: Grant

Filed: April 11, 2022

Date of Patent: May 14, 2024

Assignee: NVIDIA Corporation

Inventors: Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler
