Patents by Inventor Bryan Catanzaro

Bryan Catanzaro has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PIXEL BLENDING FOR NEURAL NETWORK-BASED IMAGE GENERATION

Publication number: 20260154862

Abstract: Apparatuses, systems, and techniques are presented to generate one or more images. In at least one embodiment, two or more pixels from two or more images are blended based, at least in part, on a distance of the two or more pixels from a region of the two or more images, in which pixel colors are substantially similar.

Type: Application

Filed: August 18, 2025

Publication date: June 4, 2026

Inventors: Robert Pottorff, Karan Sapra, Andrew Tao, Bryan Catanzaro, Jarmo Lunden

DATA PATH CIRCUIT DESIGN USING REINFORCEMENT LEARNING

Publication number: 20260134187

Abstract: Apparatuses, systems, and techniques for designing a data path circuit such as a parallel prefix circuit with reinforcement learning are described. A method can include receiving a first design state of a data path circuit, inputting the first design state of the data path circuit into a machine learning model, and performing reinforcement learning using the machine learning model to output a final design state of the data path circuit, wherein the final design state of the data path circuit has decreased area, power consumption and/or delay as compared to conventionally designed data path circuits.

Type: Application

Filed: August 8, 2025

Publication date: May 14, 2026

Inventors: Rajarshi Roy, Saad Godil, Jonathan Raiman, Neel Kant, Ilyas Elkin, Ming Y. Siu, Robert Kirby, Stuart Oberman, Bryan Catanzaro

ITERATIVE CODE GENERATION AND DEBUGGING PIPELINE USING LANGUAGE MODELS

Publication number: 20260093459

Abstract: Disclosed are apparatuses, systems, and techniques for automated iterative generation and debugging of a computer code (CC) using a language model (LM). The techniques include causing the LM to perform, responsive to a task prompt, iterative generation of the CC, an individual iteration causing the LM to (i) produce multiple evaluations of a previous faulty version of the CC, (ii) generate, responsive to the multiple evaluations, multiple modified versions of the CC, and (iii) automatically select, as an output of the individual iteration, a best performing, in view of one or more tests, version of the CC from the multiple modified versions of the CC.

Type: Application

Filed: September 27, 2024

Publication date: April 2, 2026

Inventors: Jialin Song, Jonathan Raiman, Bryan Catanzaro
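The iteration described in the abstract above (evaluate a faulty version, produce several revisions, keep the one that does best on the tests) can be illustrated with a minimal sketch. This is not the claimed implementation: `evaluate` and `revise` are hypothetical stand-ins for language model calls, replaced here by hard-coded toy functions, and `count_passed` and `generate_and_debug` are names invented for this example.

```python
from typing import Callable, Dict, List, Tuple

Test = Callable[[Dict], bool]

def count_passed(code: str, tests: List[Test]) -> int:
    """Execute candidate code and count how many tests it passes."""
    namespace: Dict = {}
    try:
        exec(code, namespace)
    except Exception:
        return 0  # code that does not even load passes nothing
    passed = 0
    for test in tests:
        try:
            if test(namespace):
                passed += 1
        except Exception:
            pass  # a crashing test counts as a failure
    return passed

def generate_and_debug(code: str,
                       evaluate: Callable[[str], List[str]],
                       revise: Callable[[str, str], str],
                       tests: List[Test],
                       max_iters: int = 3) -> Tuple[str, int]:
    """Each iteration: (i) collect several evaluations of the current version,
    (ii) produce one modified version per evaluation, (iii) keep the version
    that passes the most tests."""
    score = count_passed(code, tests)
    for _ in range(max_iters):
        if score == len(tests):
            break
        evaluations = evaluate(code)                      # (i)
        candidates = [revise(code, e) for e in evaluations]  # (ii)
        if not candidates:
            break
        code = max(candidates, key=lambda c: count_passed(c, tests))  # (iii)
        score = count_passed(code, tests)
    return code, score

# Toy stand-ins for the language model (assumptions, not a real LM API).
def toy_evaluate(code: str) -> List[str]:
    return ["operator should be +", "operator should be *"]

def toy_revise(code: str, evaluation: str) -> str:
    new_op = "+" if "+" in evaluation else "*"
    return code.replace("a - b", f"a {new_op} b")

faulty = "def add(a, b):\n    return a - b\n"
tests: List[Test] = [
    lambda ns: ns["add"](2, 3) == 5,
    lambda ns: ns["add"](0, 0) == 0,
]
final_code, final_score = generate_and_debug(faulty, toy_evaluate, toy_revise, tests)
```

In this toy run the `a * b` revision passes one test and the `a + b` revision passes both, so the latter is selected as the output of the iteration.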
NORMALIZING FLOWS WITH NEURAL SPLINES FOR HIGH-QUALITY SPEECH SYNTHESIS

Publication number: 20260080858

Abstract: Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing generative text-to-speech models. The techniques include identifying a mapping of speech characteristics (SC) on a target distribution of a latent variable using a non-linear transformation for at least a subset of the SC. Parameters of the non-linear transformation are determined using a neural network that approximates a statistics of the SC with a statistics predicted for the SC based on the identified mapping and the target distribution of the latent variable.

Type: Application

Filed: November 21, 2025

Publication date: March 19, 2026

Inventors: Kevin Shih, José Rafael Valle Gomes da Costa, Rohan Badlani, João Felipe Santos, Bryan Catanzaro

Video prediction using one or more neural networks

Patent number: 12574471

Abstract: Apparatuses, systems, and techniques to enhance video are disclosed. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having one or more additional video frames.

Type: Grant

Filed: January 23, 2024

Date of Patent: March 10, 2026

Assignee: NVIDIA Corporation

Inventors: Kevin Shih, Aysegul Dundar, Animesh Garg, Robert Pottorff, Andrew Tao, Bryan Catanzaro

Upsampling an image using one or more neural networks

Patent number: 12573000

Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.

Type: Grant

Filed: October 8, 2020

Date of Patent: March 10, 2026

Assignee: NVIDIA Corporation

Inventors: Shiqiu Liu, Robert Pottorff, Guilin Liu, Karan Sapra, Jon Barker, David Tarjan, Pekka Janis, Edvard Fagerholm, Lei Yang, Kevin Shih, Marco Salvi, Timo Roman, Andrew Tao, Bryan Catanzaro

IMAGE IN-PAINTING FOR IRREGULAR HOLES USING PARTIAL CONVOLUTIONS

Publication number: 20260052256

Abstract: A neural network architecture is disclosed for performing image in-painting using partial convolution operations. The neural network processes an image and a corresponding mask that identifies holes in the image utilizing partial convolution operations, where the mask is used by the partial convolution operation to zero out coefficients of the convolution kernel corresponding to invalid pixel data for the holes. The mask is updated after each partial convolution operation is performed in an encoder section of the neural network. In one embodiment, the neural network is implemented using an encoder-decoder framework with skip links to forward representations of the features at different sections of the encoder to corresponding sections of the decoder.

Type: Application

Filed: August 22, 2025

Publication date: February 19, 2026

Inventors: Guilin Liu, Fitsum A. Reda, Kevin Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro
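The masked convolution and mask update described in the abstract above can be sketched for a single-channel image in plain Python. This is an illustrative sketch, not the patented network: `partial_conv2d` is a name invented here, there is no padding, bias, or learned weight, and the rescaling by the fraction of valid pixels in each window is an assumption that the abstract does not state.

```python
from typing import List, Tuple

Grid = List[List[float]]

def partial_conv2d(image: Grid, mask: List[List[int]],
                   kernel: Grid) -> Tuple[Grid, List[List[int]]]:
    """Unpadded partial convolution. mask: 1 = valid pixel, 0 = hole."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    new_mask = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            valid = 0
            for di in range(kh):
                for dj in range(kw):
                    m = mask[i + di][j + dj]
                    # Multiplying by m zeroes the kernel coefficient
                    # wherever the window covers a hole pixel.
                    acc += kernel[di][dj] * image[i + di][j + dj] * m
                    valid += m
            if valid > 0:
                # Assumed rescaling: compensate for the missing pixels.
                out[i][j] = acc * (kh * kw) / valid
                # Mask update: the output location becomes valid if the
                # window saw at least one valid input pixel.
                new_mask[i][j] = 1
    return out, new_mask

# 3x3 image of 2.0 with a hole in the centre holding a garbage value.
image = [[2.0, 2.0, 2.0], [2.0, 99.0, 2.0], [2.0, 2.0, 2.0]]
mask = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
box = [[1.0 / 9.0] * 3 for _ in range(3)]  # averaging kernel
filled, updated_mask = partial_conv2d(image, mask, box)
```

With the averaging kernel, the garbage value in the hole does not leak into the output: the single output pixel is approximately 2.0 and the updated mask marks it as valid.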
Upsampling an image using one or more neural networks

Patent number: 12555186

Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.

Type: Grant

Filed: February 10, 2021

Date of Patent: February 17, 2026

Assignee: NVIDIA Corporation

Inventors: Shiqiu Liu, Robert Pottorff, Guilin Liu, Karan Sapra, Jon Barker, David Tarjan, Pekka Janis, Edvard Fagerholm, Lei Yang, Kevin Shih, Marco Salvi, Timo Roman, Andrew Tao, Bryan Catanzaro

Image enhancement using one or more neural networks

Patent number: 12548113

Abstract: Apparatuses, systems, and techniques are presented to generate images with one or more visual effects applied. In at least one embodiment, one or more visual effects are applied to one or more images having a resolution that is less than a first resolution and those visual effects approximated for one or more images having a resolution that is greater than or equal to the first resolution.

Type: Grant

Filed: September 3, 2020

Date of Patent: February 10, 2026

Assignee: NVIDIA Corporation

Inventors: Robert Pottorff, David Tarjan, Andrew Tao, Bryan Catanzaro

TECHNIQUES FOR IMPLEMENTING MULTIMODAL LARGE LANGUAGE MODELS WITH MIXTURES OF VISION ENCODERS

Publication number: 20250384268

Abstract: The disclosed method for training multimodal models includes performing one or more operations to train a plurality of vision language models to generate a plurality of trained vision language models, where each trained vision language model included in the plurality of trained vision language models comprises a different vision encoder and a first language model, and performing one or more operations to train a multimodal model to generate a trained multimodal model, where the trained multimodal model comprises the different vision encoders and a second language model.

Type: Application

Filed: April 7, 2025

Publication date: December 18, 2025

Inventors: Guilin LIU, Zhiding YU, Min SHI, Fuxiao LIU, Shihao WANG, Shijia LIAO, Subhashree RADHAKRISHNAN, De-An HUANG, Hongxu YIN, Karan SAPRA, Bryan CATANZARO, Andrew J. TAO, Jan KAUTZ

TECHNIQUES FOR IMPLEMENTING MULTIMODAL LARGE LANGUAGE MODELS WITH MIXTURES OF VISION ENCODERS

Publication number: 20250384295

Abstract: The disclosed method for training multimodal models includes performing one or more operations to train a plurality of vision language models to generate a plurality of trained vision language models, where each trained vision language model included in the plurality of trained vision language models comprises a different vision encoder and a first language model, and performing one or more operations to train a multimodal model to generate a trained multimodal model, where the trained multimodal model comprises the different vision encoders and a second language model.

Type: Application

Filed: April 7, 2025

Publication date: December 18, 2025

Inventors: Guilin LIU, Zhiding YU, Min SHI, Fuxiao LIU, Shihao WANG, Shijia LIAO, Subhashree RADHAKRISHNAN, De-An HUANG, Hongxu YIN, Karan SAPRA, Bryan CATANZARO, Andrew J. TAO, Jan KAUTZ

Normalizing flows with neural splines for high-quality speech synthesis

Patent number: 12488778

Abstract: Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing generative text-to-speech models. The techniques include identifying a mapping of speech characteristics (SC) on a target distribution of a latent variable using a non-linear transformation for at least a subset of the SC. Parameters of the non-linear transformation are determined using a neural network that approximates a statistics of the SC with a statistics predicted for the SC based on the identified mapping and the target distribution of the latent variable.

Type: Grant

Filed: January 20, 2023

Date of Patent: December 2, 2025

Assignee: NVIDIA Corporation

Inventors: Kevin Shih, José Rafael Valle Gomes da Costa, Rohan Badlani, João Felipe Santos, Bryan Catanzaro

Image in-painting for irregular holes using partial convolutions

Patent number: 12425605

Abstract: A neural network architecture is disclosed for performing image in-painting using partial convolution operations. The neural network processes an image and a corresponding mask that identifies holes in the image utilizing partial convolution operations, where the mask is used by the partial convolution operation to zero out coefficients of the convolution kernel corresponding to invalid pixel data for the holes. The mask is updated after each partial convolution operation is performed in an encoder section of the neural network. In one embodiment, the neural network is implemented using an encoder-decoder framework with skip links to forward representations of the features at different sections of the encoder to corresponding sections of the decoder.

Type: Grant

Filed: March 21, 2019

Date of Patent: September 23, 2025

Assignee: NVIDIA Corporation

Inventors: Guilin Liu, Fitsum A. Reda, Kevin Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro

Image enhancement using one or more neural networks

Patent number: 12400291

Abstract: Apparatuses, systems, and techniques are presented to generate images with one or more visual effects applied. In at least one embodiment, one or more visual effects are applied to one or more images having a resolution that is less than a first resolution and those visual effects approximated for one or more images having a resolution that is greater than or equal to the first resolution.

Type: Grant

Filed: August 19, 2021

Date of Patent: August 26, 2025

Assignee: NVIDIA Corporation

Inventors: Robert Pottorff, David Tarjan, Andrew Tao, Bryan Catanzaro

Pixel blending for neural network-based image generation

Patent number: 12394113

Abstract: Apparatuses, systems, and techniques are presented to generate one or more images. In at least one embodiment, two or more pixels from two or more images are blended based, at least in part, on a distance of the two or more pixels from a region of the two or more images, in which pixel colors are substantially similar.

Type: Grant

Filed: June 18, 2021

Date of Patent: August 19, 2025

Assignee: NVIDIA Corporation

Inventors: Robert Pottorff, Karan Sapra, Andrew Tao, Bryan Catanzaro, Jarmo Lunden

Data path circuit design using reinforcement learning

Patent number: 12387028

Abstract: Apparatuses, systems, and techniques for designing a data path circuit such as a parallel prefix circuit with reinforcement learning are described. A method can include receiving a first design state of a data path circuit, inputting the first design state of the data path circuit into a machine learning model, and performing reinforcement learning using the machine learning model to output a final design state of the data path circuit, wherein the final design state of the data path circuit has decreased area, power consumption and/or delay as compared to conventionally designed data path circuits.

Type: Grant

Filed: November 2, 2021

Date of Patent: August 12, 2025

Assignee: NVIDIA Corporation

Inventors: Rajarshi Roy, Saad Godil, Jonathan Raiman, Neel Kant, Ilyas Elkin, Ming Y. Siu, Robert Kirby, Stuart Oberman, Bryan Catanzaro

Image enhancement using one or more neural networks

Patent number: 12373916

Abstract: Apparatuses, systems, and techniques are presented to generate images with one or more visual effects applied. In at least one embodiment, one or more visual effects are applied to one or more images having a resolution that is less than a first resolution and those visual effects approximated for one or more images having a resolution that is greater than or equal to the first resolution.

Type: Grant

Filed: August 19, 2021

Date of Patent: July 29, 2025

Assignee: NVIDIA Corporation

Inventors: Robert Pottorff, David Tarjan, Andrew Tao, Bryan Catanzaro

SYNTHESIZING SPEECH IN MULTIPLE LANGUAGES IN CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

Publication number: 20250118286

Abstract: In various examples, synthesizing speech in multiple languages in conversational AI systems and applications is described herein. Systems and methods are disclosed that use one or more models to synthesize speech from a first language spoken by a speaker to a second, target language selected by the speaker. In some examples, to perform the translation, the model(s) may disentangle one or more attributes associated with speech from speakers, such as speakers' identities, speakers' accents, and text associated with the speech. Additionally, the model(s) may allow for fine-grained control of additional attributes associated with output speech, such as one or more frequencies, one or more energies, and one or more phoneme durations. Furthermore, the model(s) may be configured to use the accent associated with the target language when generating text, such as when aligning text encodings with one or more phonemes.

Type: Application

Filed: October 9, 2023

Publication date: April 10, 2025

Inventors: Rohan Badlani, José Rafael Valle Gomes da Costa, Kevin Jonathan Shih, Bryan Catanzaro

TECHNIQUES FOR DENOISING DIFFUSION USING AN ENSEMBLE OF EXPERT DENOISERS

Publication number: 20240161250

Abstract: Techniques are disclosed herein for generating a content item. The techniques include performing one or more first denoising operations based on an input and a first machine learning model to generate a first content item, and performing one or more second denoising operations based on the input, the first content item, and a second machine learning model to generate a second content item, where the first machine learning model is trained to denoise content items having an amount of corruption within a first corruption range, the second machine learning model is trained to denoise content items having an amount of corruption within a second corruption range, and the second corruption range is lower than the first corruption range.

Type: Application

Filed: October 11, 2023

Publication date: May 16, 2024

Inventors: Yogesh BALAJI, Timo Oskari AILA, Miika AITTALA, Bryan CATANZARO, Xun HUANG, Tero Tapani KARRAS, Karsten KREIS, Samuli LAINE, Ming-Yu LIU, Seungjun NAH, Jiaming SONG, Arash VAHDAT, Qinsheng ZHANG
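The two-stage idea in the abstract above, where one model handles heavily corrupted inputs and a second model handles lightly corrupted ones, can be sketched as a routing loop over a decreasing noise schedule. Everything below is a toy assumption for illustration: the two "experts" return fixed scalar estimates instead of being trained networks, the noise threshold and schedule are arbitrary, and the update rule is a generic deterministic step toward the denoised estimate rather than anything specified in the application.

```python
from typing import Callable, List, Tuple

Denoiser = Callable[[float, float], float]  # (noisy value, noise level) -> estimate

def high_noise_expert(x: float, sigma: float) -> float:
    """Toy stand-in for a model trained on the higher corruption range."""
    return 1.0   # coarse estimate of the clean value

def low_noise_expert(x: float, sigma: float) -> float:
    """Toy stand-in for a model trained on the lower corruption range."""
    return 1.25  # refined estimate of the clean value

def denoise_with_experts(x: float, sigmas: List[float],
                         threshold: float) -> Tuple[float, List[str]]:
    """Walk a decreasing noise schedule, routing each step to the expert
    whose corruption range contains the current noise level."""
    used: List[str] = []
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        if sigma >= threshold:
            expert, name = high_noise_expert, "high"
        else:
            expert, name = low_noise_expert, "low"
        used.append(name)
        estimate = expert(x, sigma)
        # Assumed update: shrink the remaining noise by sigma_next / sigma.
        x = estimate + (sigma_next / sigma) * (x - estimate)
    return x, used

schedule = [10.0, 5.0, 1.0, 0.5, 0.0]  # hypothetical noise levels
result, experts_used = denoise_with_experts(7.0, schedule, threshold=2.0)
```

In this run the first two steps go to the high-noise expert and the last two to the low-noise expert, so the intermediate value after the second step plays the role of the first content item and `result` plays the role of the second.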
NEURAL NETWORK-BASED LANGUAGE RESTRICTION

Publication number: 20240095447

Abstract: Apparatuses, systems, and techniques are presented to identify and prevent generation of restricted content. In at least one embodiment, one or more neural networks are used to identify restricted content based only on the restricted content.

Type: Application

Filed: June 22, 2022

Publication date: March 21, 2024

Inventors: Wei Ping, Boxin Wang, Chaowei Xiao, Mohammad Shoeybi, Mostofa Patwary, Anima Anandkumar, Bryan Catanzaro
