Patents by Inventor Kevin Shih

Kevin Shih has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

NEURAL NETWORK ARCHITECTURE FOR IMPLICIT LEARNING OF A PARAMETRIC DISTRIBUTION OF DATA

Publication number: 20250111476

Abstract: Parametric distributions of data are one type of data model that can be used for various purposes such as for computer vision tasks that may include classification, segmentation, 3D reconstruction, etc. These parametric distributions of data may be computed from a given data set, which may be unstructured and/or which may include low-dimensional data. Current solutions for learning parametric distributions of data involve explicitly learning kernel parameters. However, this explicit learning approach is not only inefficient in that it requires a high computational cost (i.e., from a large number of floating point operations per second), but it also leaves room for improvement in terms of accuracy of the resulting learned model. The present disclosure provides a neural network architecture that implicitly learns a parametric distribution of data, which can reduce the computational cost while improving accuracy when compared with prior solutions that rely on the explicit learning design.

Type: Application

Filed: September 19, 2024

Publication date: April 3, 2025

Inventors: Benjamin David Eckart, Anthea Li, Chao Liu, Kevin Shih, Jan Kautz

Video prediction using one or more neural networks

Patent number: 11902705

Abstract: Apparatuses, systems, and techniques to enhance video are disclosed. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having one or more additional video frames.

Type: Grant

Filed: September 3, 2019

Date of Patent: February 13, 2024

Assignee: NVIDIA CORPORATION

Inventors: Kevin Shih, Aysegul Dundar, Animesh Garg, Robert Pottorff, Andrew Tao, Bryan Catanzaro

NORMALIZING FLOWS WITH NEURAL SPLINES FOR HIGH-QUALITY SPEECH SYNTHESIS

Publication number: 20240038212

Abstract: Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing generative text-to-speech models. The techniques include identifying a mapping of speech characteristics (SC) on a target distribution of a latent variable using a non-linear transformation for at least a subset of the SC. Parameters of the non-linear transformation are determined using a neural network that approximates statistics of the SC with statistics predicted for the SC based on the identified mapping and the target distribution of the latent variable.

Type: Application

Filed: January 20, 2023

Publication date: February 1, 2024

Inventors: Kevin Shih, José Rafael Valle Gomes da Costa, Rohan Badlani, João Felipe Santos, Bryan Catanzaro

Unsupervised alignment for text to speech synthesis using neural networks

Patent number: 11869483

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

Type: Grant

Filed: October 7, 2021

Date of Patent: January 9, 2024

Assignee: Nvidia Corporation

Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS

Publication number: 20230419947

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

Type: Application

Filed: August 15, 2023

Publication date: December 28, 2023

Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS

Publication number: 20230402028

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

Type: Application

Filed: August 28, 2023

Publication date: December 14, 2023

Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

Unsupervised alignment for text to speech synthesis using neural networks

Patent number: 11769481

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

Type: Grant

Filed: October 7, 2021

Date of Patent: September 26, 2023

Assignee: Nvidia Corporation

Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS

Publication number: 20230110905

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

Type: Application

Filed: October 7, 2021

Publication date: April 13, 2023

Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS

Publication number: 20230113950

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

Type: Application

Filed: October 7, 2021

Publication date: April 13, 2023

Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

Systems and Methods for Ratiometric and Multiplexed Isothermal Amplification of Nucleic Acids

Publication number: 20220195509

Abstract: Systems and methods for isothermal amplification of nucleic acids in a ratiometric manner are disclosed. Methods of isothermal amplification of nucleic acids are convenient for areas that lack reliable electricity and/or funding for precision lab equipment. However, these methods typically result in non-ratiometric amplification of target sequences, thus preventing quantitative analysis or application of the results, such as for diagnostic testing. Systems and methods herein provide a method of isothermal amplification that ratiometrically amplifies target sequences, thus expanding the reach of diagnostics into remote and/or economically disadvantaged areas.

Type: Application

Filed: May 1, 2020

Publication date: June 23, 2022

Applicant: The Board of Trustees of the Leland Stanford Junior University

Inventors: Rhiju Das, Kevin Shih, Matthew Adrianowycz

UPSAMPLING AN IMAGE USING ONE OR MORE NEURAL NETWORKS

Publication number: 20220114700

Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.

Type: Application

Filed: October 8, 2020

Publication date: April 14, 2022

Inventors: Shiqiu Liu, Robert Pottorff, Guilin Liu, Karan Sapra, Jon Barker, David Tarjan, Pekka Janis, Edvard Fagerholm, Lei Yang, Kevin Shih, Marco Salvi, Timo Roman, Andrew Tao, Bryan Catanzaro

UPSAMPLING AN IMAGE USING ONE OR MORE NEURAL NETWORKS

Publication number: 20220114701

Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.

Type: Application

Filed: February 10, 2021

Publication date: April 14, 2022

Inventors: Shiqiu Liu, Robert Pottorff, Guilin Liu, Karan Sapra, Jon Barker, David Tarjan, Pekka Janis, Edvard Fagerholm, Lei Yang, Kevin Shih, Marco Salvi, Timo Roman, Andrew Tao, Bryan Catanzaro

VIDEO INTERPOLATION USING ONE OR MORE NEURAL NETWORKS

Publication number: 20210067735

Abstract: Apparatuses, systems, and techniques to enhance video are disclosed. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.

Type: Application

Filed: September 3, 2019

Publication date: March 4, 2021

Inventors: Fitsum Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro

VIDEO PREDICTION USING ONE OR MORE NEURAL NETWORKS

Publication number: 20210064925

Abstract: Apparatuses, systems, and techniques to enhance video are disclosed. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having one or more additional video frames.

Type: Application

Filed: September 3, 2019

Publication date: March 4, 2021

Inventors: Kevin Shih, Aysegul Dundar, Animesh Garg, Robert Pottorff, Andrew Tao, Bryan Catanzaro

IMAGE IN-PAINTING FOR IRREGULAR HOLES USING PARTIAL CONVOLUTIONS

Publication number: 20190295228

Abstract: A neural network architecture is disclosed for performing image in-painting using partial convolution operations. The neural network processes an image and a corresponding mask that identifies holes in the image utilizing partial convolution operations, where the mask is used by the partial convolution operation to zero out coefficients of the convolution kernel corresponding to invalid pixel data for the holes. The mask is updated after each partial convolution operation is performed in an encoder section of the neural network. In one embodiment, the neural network is implemented using an encoder-decoder framework with skip links to forward representations of the features at different sections of the encoder to corresponding sections of the decoder.

Type: Application

Filed: March 21, 2019

Publication date: September 26, 2019

Inventors: Guilin Liu, Fitsum A. Reda, Kevin Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro

VIDEO PREDICTION USING SPATIALLY DISPLACED CONVOLUTION

Publication number: 20190297326

Abstract: A neural network architecture is disclosed for performing video frame prediction using a sequence of video frames and corresponding pairwise optical flows. The neural network processes the sequence of video frames and optical flows utilizing three-dimensional convolution operations, where time (or multiple video frames in the sequence of video frames) provides the third dimension in addition to the two-dimensional pixel space of the video frames. The neural network generates a set of parameters used to predict a next video frame in the sequence of video frames by sampling a previous video frame utilizing spatially-displaced convolution operations. In one embodiment, the set of parameters includes a displacement vector and at least one convolution kernel per pixel. Generating a pixel value in the next video frame includes applying the convolution kernel to a corresponding patch of pixels in the previous video frame based on the displacement vector.

Type: Application

Filed: March 21, 2019

Publication date: September 26, 2019

Inventors: Fitsum A. Reda, Guilin Liu, Kevin Shih, Robert Kirby, Jonathan Barker, David Tarjan, Andrew Tao, Bryan Catanzaro

Article handling device

Patent number: 10315860

Abstract: An article handling device is provided. In one embodiment, a system for article handling is provided. The system has first and second article handling devices that each have one or more gripping assemblies arranged around their periphery. The gripping assemblies are selectably movable between an open position and a closed position based on the relative position of the first and second article handling devices.

Type: Grant

Filed: May 18, 2017

Date of Patent: June 11, 2019

Assignee: The Procter and Gamble Company

Inventors: Jeffrey Kyle Werner, Robert Scott Bollinger, Cedric Dsouza, Kevin Shih Shaw

Method and apparatus for simplified device data collection

Patent number: 10142495

Abstract: A system and method for managed device data collection includes a data collector controller for control of monitoring activity of networked multifunction peripherals. A user interface includes a display rendering a plurality of processor rendered interactive user configuration screens. Displayed configuration screens solicit and receive corresponding user input. The configuring screens facilitate setting device user interaction including setting a network address, network connectivity testing, modification of device certificates, changing network settings, and a testing discovery, registration or data transfer mechanism for multifunction peripheral device data collection. A data storage stores user selection data received via rendered configuration screens and the processor outputs stored user selection data as configuration data for data collection from the multifunction peripherals.

Type: Grant

Filed: March 10, 2017

Date of Patent: November 27, 2018

Assignees: Kabushiki Kaisha Toshiba, Toshiba TEC Kabushiki Kaisha

Inventors: Kevin Shih, Dehua Zhao

METHOD AND APPARATUS FOR SIMPLIFIED DEVICE DATA COLLECTION

Publication number: 20180262629

Abstract: A system and method for managed device data collection includes a data collector controller for control of monitoring activity of networked multifunction peripherals. A user interface includes a display rendering a plurality of processor rendered interactive user configuration screens. Displayed configuration screens solicit and receive corresponding user input. The configuring screens facilitate setting device user interaction including setting a network address, network connectivity testing, modification of device certificates, changing network settings, and a testing discovery, registration or data transfer mechanism for multifunction peripheral device data collection. A data storage stores user selection data received via rendered configuration screens and the processor outputs stored user selection data as configuration data for data collection from the multifunction peripherals.

Type: Application

Filed: March 10, 2017

Publication date: September 13, 2018

Inventors: Kevin SHIH, Dehua ZHAO

ARTICLE HANDLING DEVICE

Publication number: 20170341878

Abstract: An article handling device is provided. In one embodiment, a system for article handling is provided. The system has first and second article handling devices that each have one or more gripping assemblies arranged around their periphery. The gripping assemblies are selectably movable between an open position and a closed position based on the relative position of the first and second article handling devices.

Type: Application

Filed: May 18, 2017

Publication date: November 30, 2017

Inventors: Jeffrey Kyle WERNER, Robert Scott BOLLINGER, Cedric DSOUZA, Kevin Shih SHAW
