Patents by Inventor Ming-Hsuan Yang

Ming-Hsuan Yang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11132800
    Abstract: Apparatus and methods related to image processing are provided. A computing device can determine a first image area of an image, such as an image captured by a camera. The computing device can determine a warping mesh for the image with a first portion of the warping mesh associated with the first image area. The computing device can determine a cost function for the warping mesh by: determining first costs associated with the first portion of the warping mesh that include costs associated with face-related transformations of the first image area to correct geometric distortions. The computing device can determine an optimized mesh based on optimizing the cost function. The computing device can modify the first image area based on the optimized mesh.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: September 28, 2021
    Assignee: Google LLC
    Inventors: Yichang Shih, Chia-Kai Liang, Wei-Sheng Lai, Ming-Hsuan Yang, Siargey Pisarchyk, Ryhor Karpiak
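    The cost-function idea in the abstract above can be illustrated with a minimal sketch: a data term that pulls mesh vertices in the face region toward a corrective target, plus a smoothness regularizer over neighboring vertices, minimized by plain gradient descent. All names, the quadratic cost, and the solver choice are illustrative assumptions, not the patented formulation.

    ```python
    import numpy as np

    def optimize_mesh(mesh, target, face_mask, reg_weight=0.1, lr=0.05, steps=300):
        """Toy warping-mesh optimization: minimize a face-correction data term
        plus a neighbor-smoothness regularizer via gradient descent.

        mesh, target: (H, W, 2) vertex grids; face_mask: (H, W) bool.
        """
        m = mesh.astype(float).copy()
        w = face_mask[..., None].astype(float)
        for _ in range(steps):
            # Data term gradient: pull face-region vertices toward the target.
            g_data = 2.0 * w * (m - target)
            # Smoothness gradient: penalize differences between grid neighbors.
            g_reg = np.zeros_like(m)
            g_reg[1:] += 2.0 * (m[1:] - m[:-1])
            g_reg[:-1] += 2.0 * (m[:-1] - m[1:])
            g_reg[:, 1:] += 2.0 * (m[:, 1:] - m[:, :-1])
            g_reg[:, :-1] += 2.0 * (m[:, :-1] - m[:, 1:])
            m -= lr * (g_data + reg_weight * g_reg)
        return m
    ```

    The regularizer is what keeps the warp from tearing the image: vertices outside the face area follow their corrected neighbors smoothly instead of staying fixed.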
  • Patent number: 11082720
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. This data describes data that is lost during a compression of original video data. For example, the original video data may be compressed and then decompressed, and this result may be compared to the original video data to determine the residual video data. This residual video data is transformed into a smaller format by means of encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during the decompression of the compressed original video data to improve a quality of the decompressed original video data.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: August 3, 2021
    Assignee: NVIDIA Corporation
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
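    The residual pipeline described in this abstract reduces to: round-trip the frame through the lossy codec, subtract to get the residual, and add the residual back at the destination. A minimal sketch, using coarse quantization as a stand-in for a real video codec (the function names and the quantization "codec" are illustrative assumptions):

    ```python
    import numpy as np

    def lossy_compress(frame, step=16):
        # Stand-in for a lossy video codec: coarse quantization of pixel values.
        return (frame // step) * step

    def make_residual(frame, step=16):
        """Residual = what the lossy codec loses, found by comparing the
        round-tripped frame against the original."""
        return frame.astype(np.int16) - lossy_compress(frame, step).astype(np.int16)

    def reconstruct(compressed, residual):
        # At the destination, add the residual back to the decompressed frame.
        return (compressed.astype(np.int16) + residual).astype(np.uint8)
    ```

    In the patented scheme the residual is additionally encoded, binarized, and compressed before transmission; the sketch keeps only the subtract-and-add structure.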
  • Publication number: 20210224947
    Abstract: Computer vision systems and methods for image to image translation are provided. The system receives a first input image and a second input image and applies a content adversarial loss function to the first input image and the second input image to determine a disentanglement representation of the first input image and a disentanglement representation of the second input image. The system trains a network to generate at least one output image by applying a cross cycle consistency loss function to the first disentanglement representation and the second disentanglement representation to perform multimodal mapping between the first input image and the second input image.
    Type: Application
    Filed: January 19, 2021
    Publication date: July 22, 2021
    Applicant: Insurance Services Office, Inc.
    Inventors: Hsin-Ying Lee, Hung-Yu Tseng, Jia-Bin Huang, Maneesh Kumar Singh, Ming-Hsuan Yang
  • Publication number: 20210056378
    Abstract: Methods and systems, including computer programs encoded on computer storage media, for neural network architecture search.
    Type: Application
    Filed: August 23, 2019
    Publication date: February 25, 2021
    Inventors: Ming-Hsuan Yang, Xiaojie Jin, Joshua Foster Slocum, Shengyang Dai, Jiang Wang
  • Publication number: 20210035307
    Abstract: Apparatus and methods related to image processing are provided. A computing device can determine a first image area of an image, such as an image captured by a camera. The computing device can determine a warping mesh for the image with a first portion of the warping mesh associated with the first image area. The computing device can determine a cost function for the warping mesh by: determining first costs associated with the first portion of the warping mesh that include costs associated with face-related transformations of the first image area to correct geometric distortions. The computing device can determine an optimized mesh based on optimizing the cost function. The computing device can modify the first image area based on the optimized mesh.
    Type: Application
    Filed: October 2, 2019
    Publication date: February 4, 2021
    Inventors: Yichang Shih, Chia-Kai Liang, Wei-Sheng Lai, Ming-Hsuan Yang, Siargey Pisarchyk, Ryhor Karpiak
  • Publication number: 20210027066
    Abstract: A system and method for providing unsupervised domain adaptation for spatio-temporal action localization that includes receiving video data associated with a surrounding environment of a vehicle. The system and method also include completing an action localization model to model a temporal context of actions occurring within the surrounding environment of the vehicle based on the video data and completing an action adaptation model to localize individuals and their actions and to classify the actions based on the video data. The system and method further include combining losses from the action localization model and the action adaptation model to complete spatio-temporal action localization of individuals and actions that occur within the surrounding environment of the vehicle.
    Type: Application
    Filed: February 28, 2020
    Publication date: January 28, 2021
    Inventors: Yi-Ting Chen, Behzad Dariush, Nakul Agarwal, Ming-Hsuan Yang
  • Patent number: 10872399
    Abstract: Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic. Examples of styles include seasons (summer, winter, etc.), weather (sunny, rainy, foggy, etc.), lighting (daytime, nighttime, etc.). A photorealistic image stylization process includes a stylization step and a smoothing step. The stylization step transfers the style of the reference photo to the content photo. A photo style transfer neural network model receives a photorealistic content image and a photorealistic style image and generates an intermediate stylized photorealistic image that includes the content of the content image modified according to the style image. A smoothing function receives the intermediate stylized photorealistic image and pixel similarity data and generates the stylized photorealistic image, ensuring spatially consistent stylizations.
    Type: Grant
    Filed: January 11, 2019
    Date of Patent: December 22, 2020
    Assignee: NVIDIA Corporation
    Inventors: Yijun Li, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20200334502
    Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
    Type: Application
    Filed: July 6, 2020
    Publication date: October 22, 2020
    Inventors: Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Patent number: 10762425
    Abstract: A spatial linear propagation network (SLPN) system learns the affinity matrix for vision tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The SLPN system is trained for a particular computer vision task and refines an input map (i.e., affinity matrix) that indicates pixels that share a particular property (e.g., color, object, texture, shape, etc.). Inputs to the SLPN system are input data (e.g., pixel values for an image) and the input map corresponding to the input data to be propagated. The input data is processed to produce task-specific affinity values (guidance data). The task-specific affinity values are applied to values in the input map, with at least two weighted values from each column contributing to a value in the refined map data for the adjacent column.
    Type: Grant
    Filed: September 18, 2018
    Date of Patent: September 1, 2020
    Assignee: NVIDIA Corporation
    Inventors: Sifei Liu, Shalini De Mello, Jinwei Gu, Ming-Hsuan Yang, Jan Kautz
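    The column-to-column propagation described in the abstract above can be sketched as a linear recurrence: each pixel in a column mixes its own input value with weighted contributions from adjacent pixels in the previous column. The function name, the three-neighbor connection, and the fixed gate layout are illustrative assumptions; in the SLPN the gates are produced by a trained network and propagation runs in four directions.

    ```python
    import numpy as np

    def propagate_left_to_right(x, gates):
        """One direction of spatial linear propagation.

        x: (H, W) input map; gates: (H, W, 3) non-negative weights, where
        gates[i, k] weights the (i-1, i, i+1) neighbors in column k-1.
        """
        H, W = x.shape
        h = x.astype(float).copy()
        for k in range(1, W):
            for i in range(H):
                acc, wsum = 0.0, 0.0
                for d, wd in zip((-1, 0, 1), gates[i, k]):
                    j = i + d
                    if 0 <= j < H:
                        acc += wd * h[j, k - 1]
                        wsum += wd
                # Mix the pixel's own input with the propagated neighbors.
                h[i, k] = (1.0 - wsum) * x[i, k] + acc
        return h
    ```

    With all gates zero the map passes through unchanged; with a full center gate each column simply copies the previous one, which is the two extremes between which the learned gates interpolate.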
  • Patent number: 10748036
    Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: August 18, 2020
    Assignee: NVIDIA Corporation
    Inventors: Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
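    The horizontal/vertical affinity maps described in this abstract can be illustrated with a hand-crafted stand-in for the learned PAN output: affinity between neighboring pixels falls off with their color difference. The function name and the Gaussian similarity are illustrative assumptions, not the trained network.

    ```python
    import numpy as np

    def pixel_affinity_maps(image, sigma=10.0):
        """Horizontal and vertical affinities from color similarity of
        neighboring pixels.

        image: (H, W, C) float array. Returns maps in [0, 1], where
        horizontal[i, j] relates pixel (i, j) to (i, j+1) and
        vertical[i, j] relates pixel (i, j) to (i+1, j).
        """
        dh = np.sum((image[:, 1:] - image[:, :-1]) ** 2, axis=-1)
        dv = np.sum((image[1:] - image[:-1]) ** 2, axis=-1)
        return np.exp(-dh / sigma**2), np.exp(-dv / sigma**2)
    ```

    Superpixel extraction then groups pixels connected by high affinities, so region boundaries fall where the affinity maps drop.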
  • Publication number: 20190244329
    Abstract: Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic. Examples of styles include seasons (summer, winter, etc.), weather (sunny, rainy, foggy, etc.), lighting (daytime, nighttime, etc.). A photorealistic image stylization process includes a stylization step and a smoothing step. The stylization step transfers the style of the reference photo to the content photo. A photo style transfer neural network model receives a photorealistic content image and a photorealistic style image and generates an intermediate stylized photorealistic image that includes the content of the content image modified according to the style image. A smoothing function receives the intermediate stylized photorealistic image and pixel similarity data and generates the stylized photorealistic image, ensuring spatially consistent stylizations.
    Type: Application
    Filed: January 11, 2019
    Publication date: August 8, 2019
    Inventors: Yijun Li, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20190228313
    Abstract: Systems and methods for unsupervised representation learning by sorting sequences are provided. An unsupervised representation learning approach is provided which uses videos without semantic labels. The temporal coherence as a supervisory signal can be leveraged by formulating representation learning as a sequence sorting task. A plurality of temporally shuffled frames (i.e., in non-chronological order) can be used as inputs and a convolutional neural network can be trained to sort the shuffled sequences and to facilitate machine learning of features by the convolutional neural network. Features are extracted from all frame pairs and aggregated to predict the correct sequence order. As sorting shuffled image sequence requires an understanding of the statistical temporal structure of images, training with such a proxy task can allow a computer to learn rich and generalizable visual representations from digital images.
    Type: Application
    Filed: January 23, 2019
    Publication date: July 25, 2019
    Applicant: Insurance Services Office, Inc.
    Inventors: Hsin-Ying Lee, Jia-Bin Huang, Maneesh Kumar Singh, Ming-Hsuan Yang
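    The sorting proxy task above needs no labels beyond the frames themselves: shuffle a chronological tuple and use the index of the applied permutation as the classification target. A minimal sketch of the example generation (the function name and tuple-based labeling are illustrative assumptions):

    ```python
    import itertools
    import random

    def make_sorting_example(frames, rng=random):
        """Build one training example for the sequence-sorting proxy task:
        shuffle a chronological frame tuple and label it with the index of
        the permutation that was applied."""
        perms = list(itertools.permutations(range(len(frames))))
        label = rng.randrange(len(perms))
        perm = perms[label]
        shuffled = [frames[i] for i in perm]
        return shuffled, label
    ```

    For 4-frame tuples this yields a 24-way classification problem; the network that solves it is forced to learn temporal structure, which is the representation being trained.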
  • Publication number: 20190156154
    Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
    Type: Application
    Filed: November 13, 2018
    Publication date: May 23, 2019
    Inventors: Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20190158884
    Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. This data describes data that is lost during a compression of original video data. For example, the original video data may be compressed and then decompressed, and this result may be compared to the original video data to determine the residual video data. This residual video data is transformed into a smaller format by means of encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during the decompression of the compressed original video data to improve a quality of the decompressed original video data.
    Type: Application
    Filed: November 14, 2018
    Publication date: May 23, 2019
    Inventors: Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20190147302
    Abstract: A method includes filtering a point cloud transformation of a 3D object to generate a 3D lattice and processing the 3D lattice through a series of bilateral convolution networks (BCL), each BCL in the series having a lower lattice feature scale than a preceding BCL in the series. The output of each BCL in the series is concatenated to generate an intermediate 3D lattice. Further filtering of the intermediate 3D lattice generates a first prediction of features of the 3D object.
    Type: Application
    Filed: May 22, 2018
    Publication date: May 16, 2019
    Inventors: Varun Jampani, Hang Su, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20190095791
    Abstract: A spatial linear propagation network (SLPN) system learns the affinity matrix for vision tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The SLPN system is trained for a particular computer vision task and refines an input map (i.e., affinity matrix) that indicates pixels that share a particular property (e.g., color, object, texture, shape, etc.). Inputs to the SLPN system are input data (e.g., pixel values for an image) and the input map corresponding to the input data to be propagated. The input data is processed to produce task-specific affinity values (guidance data). The task-specific affinity values are applied to values in the input map, with at least two weighted values from each column contributing to a value in the refined map data for the adjacent column.
    Type: Application
    Filed: September 18, 2018
    Publication date: March 28, 2019
    Inventors: Sifei Liu, Shalini De Mello, Jinwei Gu, Ming-Hsuan Yang, Jan Kautz
  • Publication number: 20150071532
    Abstract: Disclosed are an image processing device, an image processing method, and a non-transitory computer-readable medium that obtain a contour image in which noise edges are eliminated from local edge information and a contour of an important object is enhanced. The image processing device includes a local contour extraction unit which generates a local edge image and a global contour extraction unit which generates a global edge image. The image processing device generates a contour image by preparing the local edge image and obtaining the weighted sum of the local edge image and the global edge image.
    Type: Application
    Filed: September 11, 2013
    Publication date: March 12, 2015
    Inventors: Xiang Ruan, Lin Chen, Ming-Hsuan Yang
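    The fusion step in the abstract above is a weighted sum of the two edge maps: the global contour map suppresses noise edges that appear only locally, while the local map preserves fine detail. A minimal sketch (the function name and the single blending weight are illustrative assumptions):

    ```python
    import numpy as np

    def fuse_contours(local_edges, global_edges, alpha=0.5):
        """Weighted sum of a local edge map and a global contour map.
        Edges present in both maps keep full strength; local-only noise
        edges are attenuated by the blend."""
        return alpha * local_edges + (1.0 - alpha) * global_edges
    ```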
  • Patent number: 8218817
    Abstract: A visual tracker tracks an object in a sequence of input images. A tracking module detects a location of the object based on a set of weighted blocks representing the object's shape. The tracking module then refines a segmentation of the object from the background image at the detected location. Based on the refined segmentation, the set of weighted blocks are updated. By adaptively encoding appearance and shape into the block configuration, the present invention is able to efficiently and accurately track an object even in the presence of rapid motion that causes large variations in appearance and shape of the object.
    Type: Grant
    Filed: December 18, 2008
    Date of Patent: July 10, 2012
    Assignees: Honda Motor Co. Ltd., University of Florida Research Foundation, Inc.
    Inventors: Ming-Hsuan Yang, Jeffrey Ho
  • Patent number: 8190549
    Abstract: An online sparse matrix Gaussian process (OSMGP) uses online updates to provide an accurate and efficient regression for applications such as pose estimation and object tracking. A regression calculation module calculates a regression on a sequence of input images to generate output predictions based on a learned regression model. The regression model is efficiently updated by representing a covariance matrix of the regression model using a sparse matrix factor (e.g., a Cholesky factor). The sparse matrix factor is maintained and updated in real-time based on the output predictions. Hyperparameter optimization, variable reordering, and matrix downdating techniques can also be applied to further improve the accuracy and/or efficiency of the regression process.
    Type: Grant
    Filed: November 21, 2008
    Date of Patent: May 29, 2012
    Assignee: Honda Motor Co., Ltd.
    Inventors: Ming-Hsuan Yang, Ananth Ranganathan
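    The representation at the heart of this abstract can be illustrated with ordinary (batch) GP regression: once the covariance is held as a Cholesky factor, prediction reduces to triangular solves, which is the factor the online method keeps up to date incrementally instead of refactoring. A minimal sketch under those assumptions (the RBF kernel, the function names, and the batch setting are illustrative, not the patented online update):

    ```python
    import numpy as np

    def rbf(a, b, length=1.0):
        # Squared-exponential kernel between two 1-D input arrays.
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length) ** 2)

    def gp_predict(x_train, y_train, x_test, noise=1e-3):
        """GP regression with the covariance held as a Cholesky factor,
        so the mean prediction needs only two triangular solves."""
        K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
        L = np.linalg.cholesky(K)  # K = L @ L.T
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
        return rbf(x_test, x_train) @ alpha
    ```

    Because only the factor changes when a new observation arrives, updating it in place (as the OSMGP does, with sparsity-preserving reordering and downdating) avoids the cubic cost of rebuilding the full covariance.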
  • Patent number: 7831094
    Abstract: Simultaneous localization and mapping (SLAM) utilizes multiple view feature descriptors to robustly determine location despite appearance changes that would stifle conventional systems. A SLAM algorithm generates a feature descriptor for a scene from different perspectives using kernel principal component analysis (KPCA). When the SLAM module subsequently receives a recognition image after a wide baseline change, it can refer to correspondences from the feature descriptor to continue map building and/or determine location. Appearance variations can result from, for example, a change in illumination, partial occlusion, a change in scale, a change in orientation, change in distance, warping, and the like. After an appearance variation, a structure-from-motion module uses feature descriptors to reorient itself and continue map building using an extended Kalman Filter.
    Type: Grant
    Filed: December 22, 2004
    Date of Patent: November 9, 2010
    Assignee: Honda Motor Co., Ltd.
    Inventors: Rakesh Gupta, Ming-Hsuan Yang, Jason Meltzer