Patents by Inventor Miao LIAO
Miao LIAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11587548
Abstract: Presented herein are novel approaches to synthesizing video of speech from text. In a training phase, embodiments build a phoneme-pose dictionary and train a generative neural network model using a generative adversarial network (GAN) to generate video from interpolated phoneme poses. In deployment, the trained generative neural network, in conjunction with the phoneme-pose dictionary, converts an input text into a video of a person speaking the words of the input text. Compared to audio-driven video generation approaches, the embodiments herein have a number of advantages: 1) they need only a fraction of the training data used by an audio-driven approach; 2) they are more flexible and not vulnerable to speaker variation; and 3) they significantly reduce the preprocessing, training, and inference times.
Type: Grant
Filed: April 2, 2021
Date of Patent: February 21, 2023
Assignee: Baidu USA LLC
Inventors: Sibo Zhang, Jiahong Yuan, Miao Liao, Liangjun Zhang
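As an illustrative aid (not part of the patent record), here is a minimal Python sketch of the dictionary-lookup-and-interpolation step the abstract describes; the dictionary contents, pose dimensions, and frame count are made-up assumptions.

```python
# Illustrative sketch (not the patented implementation): map a phoneme
# sequence to a smooth per-frame pose sequence via a phoneme-pose
# dictionary and linear interpolation between consecutive key poses.
import numpy as np

# Hypothetical dictionary: phoneme -> key pose (here a toy 2D pose vector).
phoneme_pose_dict = {
    "AH": np.array([0.0, 1.0]),
    "B":  np.array([1.0, 0.5]),
    "S":  np.array([0.5, 0.0]),
}

def interpolate_poses(phonemes, frames_per_phoneme=5):
    """Linearly interpolate between consecutive key poses to build the
    frame-by-frame pose sequence fed to the video generator."""
    key_poses = [phoneme_pose_dict[p] for p in phonemes]
    sequence = []
    for a, b in zip(key_poses, key_poses[1:]):
        for t in np.linspace(0.0, 1.0, frames_per_phoneme, endpoint=False):
            sequence.append((1.0 - t) * a + t * b)
    sequence.append(key_poses[-1])
    return np.stack(sequence)

poses = interpolate_poses(["AH", "B", "S"])
print(poses.shape)  # (11, 2): one interpolated pose per output frame
```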
-
Patent number: 11514634
Abstract: Presented herein are novel embodiments for converting a given speech audio or text into a photo-realistic speaking video of a person with synchronized, realistic, and expressive body dynamics. In one or more embodiments, 3D skeleton movements are generated from the audio sequence using a recurrent neural network, and an output video is synthesized via a conditional generative adversarial network. To make movements realistic and expressive, the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures may be embedded into the generation process in both the learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps the model quickly learn meaningful body movement from a few videos. To produce photo-realistic, high-resolution video with motion details, a part-attention mechanism is inserted into the conditional GAN, where each detailed part is automatically zoomed in to have its own discriminator.
Type: Grant
Filed: June 12, 2020
Date of Patent: November 29, 2022
Assignees: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
Inventors: Miao Liao, Sibo Zhang, Peng Wang, Ruigang Yang
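For illustration only, a minimal sketch of the audio-to-skeleton stage described above, assuming PyTorch; the feature dimension, hidden size, and joint count are placeholder assumptions, not values from the patent.

```python
# Minimal sketch: a recurrent network mapping a sequence of audio features
# to per-frame 3D skeleton joint positions, as in the abstract's first stage.
import torch
import torch.nn as nn

class AudioToSkeleton(nn.Module):
    def __init__(self, audio_dim=80, hidden_dim=256, num_joints=18):
        super().__init__()
        self.rnn = nn.GRU(audio_dim, hidden_dim, batch_first=True)
        # Regress (x, y, z) for every joint at each time step.
        self.head = nn.Linear(hidden_dim, num_joints * 3)

    def forward(self, audio_features):
        # audio_features: (batch, time, audio_dim), e.g. mel-spectrogram frames
        hidden, _ = self.rnn(audio_features)
        joints = self.head(hidden)                    # (batch, time, J*3)
        return joints.view(*joints.shape[:2], -1, 3)  # (batch, time, J, 3)

model = AudioToSkeleton()
dummy_audio = torch.randn(1, 100, 80)  # 100 frames of 80-dim features
print(model(dummy_audio).shape)        # torch.Size([1, 100, 18, 3])
```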
-
Patent number: 11282164
Abstract: Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting region. For each target pixel within the target inpainting region of the first image frame, based on the corresponding depth map, the method maps the target pixel to a candidate pixel in a second image frame. The method further determines a candidate color to fill the target pixel. The method further performs Poisson image editing on the first image frame to achieve color consistency at the boundary and between the inside and outside of the target inpainting region.
Type: Grant
Filed: May 26, 2020
Date of Patent: March 22, 2022
Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Miao Liao, Feixiang Lu, Dingfu Zhou, Sibo Zhang, Ruigang Yang
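For illustration, a hedged sketch of two steps named in the abstract: projecting a 3D point map into an image frame to form a depth map (assuming a simple pinhole camera model), plus the Poisson-editing step via OpenCV's seamlessClone; the intrinsics and array names are assumptions.

```python
# Illustrative sketch (assumed pinhole model, not the patent's code): project
# 3D points into an image to obtain a per-pixel depth map with a z-buffer.
import numpy as np
import cv2

def project_to_depth_map(points_3d, K, image_shape):
    """points_3d: (N, 3) points in the camera frame; K: 3x3 intrinsics.
    Keeps the nearest depth per pixel (z-buffering)."""
    h, w = image_shape
    depth = np.full((h, w), np.inf, dtype=np.float32)
    z = points_3d[:, 2]
    valid = z > 0
    uvw = (K @ points_3d[valid].T).T          # perspective projection
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for ui, vi, zi in zip(u[inside], v[inside], z[valid][inside]):
        depth[vi, ui] = min(depth[vi, ui], zi)  # keep the nearest surface
    return depth

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
pts = np.random.rand(1000, 3) * [4, 3, 10]  # random points in front of camera
print(project_to_depth_map(pts, K, (480, 640)).min())

# Poisson image editing step, with hypothetical arrays `filled` (candidate
# colors), `frame` (first image frame), and `mask` (inpainting region):
# cv2.seamlessClone solves the Poisson equation so the boundary is seamless.
# result = cv2.seamlessClone(filled, frame, mask, center, cv2.NORMAL_CLONE)
```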
-
Publication number: 20210390945
Abstract: Presented herein are novel approaches to synthesizing video of speech from text. In a training phase, embodiments build a phoneme-pose dictionary and train a generative neural network model using a generative adversarial network (GAN) to generate video from interpolated phoneme poses. In deployment, the trained generative neural network, in conjunction with the phoneme-pose dictionary, converts an input text into a video of a person speaking the words of the input text. Compared to audio-driven video generation approaches, the embodiments herein have a number of advantages: 1) they need only a fraction of the training data used by an audio-driven approach; 2) they are more flexible and not vulnerable to speaker variation; and 3) they significantly reduce the preprocessing, training, and inference times.
Type: Application
Filed: April 2, 2021
Publication date: December 16, 2021
Applicant: Baidu USA LLC
Inventors: Sibo ZHANG, Jiahong YUAN, Miao LIAO, Liangjun ZHANG
-
Publication number: 20210390748
Abstract: Presented herein are novel embodiments for converting a given speech audio or text into a photo-realistic speaking video of a person with synchronized, realistic, and expressive body dynamics. In one or more embodiments, 3D skeleton movements are generated from the audio sequence using a recurrent neural network, and an output video is synthesized via a conditional generative adversarial network. To make movements realistic and expressive, the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures may be embedded into the generation process in both the learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps the model quickly learn meaningful body movement from a few videos. To produce photo-realistic, high-resolution video with motion details, a part-attention mechanism is inserted into the conditional GAN, where each detailed part is automatically zoomed in to have its own discriminator.
Type: Application
Filed: June 12, 2020
Publication date: December 16, 2021
Applicants: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
Inventors: Miao LIAO, Sibo ZHANG, Peng WANG, Ruigang YANG
-
Publication number: 20210374904
Abstract: Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting region. For each target pixel within the target inpainting region of the first image frame, based on the corresponding depth map, the method maps the target pixel to a candidate pixel in a second image frame. The method further determines a candidate color to fill the target pixel. The method further performs Poisson image editing on the first image frame to achieve color consistency at the boundary and between the inside and outside of the target inpainting region.
Type: Application
Filed: May 26, 2020
Publication date: December 2, 2021
Inventors: Miao LIAO, Feixiang LU, Dingfu ZHOU, Sibo ZHANG, Ruigang YANG
-
Patent number: 10339389
Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
Type: Grant
Filed: September 3, 2014
Date of Patent: July 2, 2019
Assignee: Sharp Laboratories of America, Inc.
Inventors: Xinyu Xu, Miao Liao, Petrus J. L. van Beek
-
Patent number: 9969337
Abstract: Feature detection may be performed on an image. After feature descriptors for each detected feature are computed, feature matching between the feature descriptors for the current image and for a key image frame is performed. If a sufficient number of good matches are identified, key points associated with the feature correspondences may be projected from image coordinates to world coordinates. A distance, in the world coordinate frame, may be computed for each feature correspondence. When the computed distances indicate sufficient movement of the mobile agent to ensure accurate motion estimation, a motion estimate may be computed from the pairs of world coordinates associated with the feature correspondences. A current camera pose in a global coordinate frame may be generated. A motion trajectory may then be determined, and the feature descriptors for the key image may be updated to the feature descriptors for the current image frame.
Type: Grant
Filed: November 16, 2015
Date of Patent: May 15, 2018
Assignee: Sharp Laboratories of America, Inc.
Inventor: Miao Liao
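For illustration, a minimal sketch of the feature-matching front end described above, assuming OpenCV with ORB features; the match threshold and the mapping from image to world coordinates are simplified assumptions.

```python
# Minimal sketch: detect and match features between the current frame and a
# key frame, the front end of the motion-estimation pipeline in the abstract.
import numpy as np
import cv2

orb = cv2.ORB_create(nfeatures=1000)

def match_to_key_frame(key_img, cur_img, min_good_matches=20):
    """Return matched keypoint pairs, or None when too few good matches
    are found (in which case no motion estimate would be computed)."""
    kp1, des1 = orb.detectAndCompute(key_img, None)
    kp2, des2 = orb.detectAndCompute(cur_img, None)
    if des1 is None or des2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if len(matches) < min_good_matches:
        return None
    # Image-coordinate correspondences; a calibrated camera-to-ground
    # transform would project these to world coordinates for the estimate.
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches]

a = (np.random.rand(240, 320) * 255).astype(np.uint8)
print(match_to_key_frame(a, a) is not None)  # identical frames should match
```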
-
Patent number: 9946264
Abstract: A system and method are provided for autonomously navigating a vehicle. The method captures a sequence of image pairs using a stereo camera. A navigation application stores a vehicle pose (a history of vehicle positions). The application detects a plurality of matching feature points in a first matching image pair and determines a plurality of corresponding object points in three-dimensional (3D) space from the first image pair. A plurality of feature points are tracked from the first image pair to a second image pair, and the plurality of corresponding object points in 3D space are determined from the second image pair. From this, a vehicle pose transformation is calculated using the object points from the first and second image pairs. The rotation angle and translation are determined from the vehicle pose transformation. If the rotation angle or translation exceeds a minimum threshold, the stored vehicle pose is updated.
Type: Grant
Filed: March 22, 2016
Date of Patent: April 17, 2018
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Ming Li, Soonhac Hong
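For illustration, a hedged sketch of the pose-update rule: estimate the rigid transform between the two sets of 3D object points (here via the Kabsch algorithm, one standard choice, not necessarily the patented method), and commit the update only when rotation or translation exceeds a minimum threshold. The threshold values are assumptions.

```python
# Sketch: rigid transform between 3D point sets, plus the threshold check
# that suppresses pose drift while the vehicle is nearly stationary.
import numpy as np

def rigid_transform(p, q):
    """Least-squares R, t with q ~ R @ p + t, for (N, 3) point sets."""
    cp, cq = p.mean(axis=0), q.mean(axis=0)
    H = (p - cp).T @ (q - cq)                 # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def should_update_pose(R, t, min_angle_rad=0.01, min_translation_m=0.02):
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return angle > min_angle_rad or np.linalg.norm(t) > min_translation_m

# Synthetic check: rotate points 5 degrees about z and shift them 10 cm.
theta = np.radians(5.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
p = np.random.rand(50, 3)
q = p @ Rz.T + np.array([0.1, 0.0, 0.0])
R, t = rigid_transform(p, q)
print(should_update_pose(R, t))  # True: motion exceeds both thresholds
```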
-
Publication number: 20170277197
Abstract: A system and method are provided for autonomously navigating a vehicle. The method captures a sequence of image pairs using a stereo camera. A navigation application stores a vehicle pose (a history of vehicle positions). The application detects a plurality of matching feature points in a first matching image pair and determines a plurality of corresponding object points in three-dimensional (3D) space from the first image pair. A plurality of feature points are tracked from the first image pair to a second image pair, and the plurality of corresponding object points in 3D space are determined from the second image pair. From this, a vehicle pose transformation is calculated using the object points from the first and second image pairs. The rotation angle and translation are determined from the vehicle pose transformation. If the rotation angle or translation exceeds a minimum threshold, the stored vehicle pose is updated.
Type: Application
Filed: March 22, 2016
Publication date: September 28, 2017
Inventors: Miao Liao, Ming Li, Soonhac Hong
-
Patent number: 9625912
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual-landmark recognition. One method may include scanning a captured image to detect a machine-readable marker and extracting encoded data therefrom. The data may include an identifying key usable to access a database to retrieve physical attributes associated with the marker. The method may include using the physical attributes to compute a position and an orientation of a mobile agent relative to a landmark object associated with the marker. The method may further include determining a path toward a next route location based on the position of the next route location and the computed position and orientation of the mobile agent, and controlling the mobile agent to drive along the path toward the next route location.
Type: Grant
Filed: May 3, 2016
Date of Patent: April 18, 2017
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
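For illustration, a minimal sketch assuming OpenCV and a QR code as the machine-readable marker: decode the marker to obtain a database key, look up the marker's known physical size, and recover the camera pose relative to the marker with solvePnP. The database, marker size, corner ordering, and intrinsics K are stand-in assumptions.

```python
# Sketch: marker decoding plus pose recovery from known marker geometry.
import numpy as np
import cv2

landmark_db = {"dock-01": {"size_m": 0.20}}  # key -> physical attributes

def locate_relative_to_marker(image, K, dist_coeffs=None):
    data, corners, _ = cv2.QRCodeDetector().detectAndDecode(image)
    if not data or data not in landmark_db:
        return None                            # no marker, or unknown key
    s = landmark_db[data]["size_m"] / 2.0
    # Marker corners in its own coordinate frame: a square of known size.
    object_pts = np.array([[-s,  s, 0], [ s,  s, 0],
                           [ s, -s, 0], [-s, -s, 0]], dtype=np.float32)
    ok, rvec, tvec = cv2.solvePnP(object_pts, corners.reshape(4, 1, 2),
                                  K, dist_coeffs)
    # rvec/tvec give the marker's pose in the camera frame; inverting the
    # transform yields the mobile agent's position and orientation.
    return (rvec, tvec) if ok else None
```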
-
Patent number: 9625908
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
Type: Grant
Filed: October 12, 2015
Date of Patent: April 18, 2017
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Patent number: 9622322
Abstract: A system determines the task of the viewer and/or gestures made by the viewer. Based upon the determined task and/or the gestures, the lighting provided to the viewer may be modified.
Type: Grant
Filed: December 23, 2013
Date of Patent: April 11, 2017
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xiao-Fan Feng, Xu Chen
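For illustration only, a toy sketch of the control rule the abstract implies: choose a lighting level from the detected task and adjust it when a gesture is recognized. All task names, gestures, and levels are invented placeholders.

```python
# Toy sketch: task- and gesture-driven lighting adjustment.
TASK_LIGHT_LEVELS = {"reading": 0.9, "video": 0.4, "idle": 0.2}

def lighting_level(task, gesture=None):
    level = TASK_LIGHT_LEVELS.get(task, 0.5)   # default for unknown tasks
    if gesture == "raise_hand":                # e.g., a "brighter" gesture
        level = min(1.0, level + 0.2)
    elif gesture == "lower_hand":              # e.g., a "dimmer" gesture
        level = max(0.0, level - 0.2)
    return level

print(lighting_level("reading", "lower_hand"))  # 0.7
```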
-
Publication number: 20160246302
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual-landmark recognition. One method may include scanning a captured image to detect a machine-readable marker and extracting encoded data therefrom. The data may include an identifying key usable to access a database to retrieve physical attributes associated with the marker. The method may include using the physical attributes to compute a position and an orientation of a mobile agent relative to a landmark object associated with the marker. The method may further include determining a path toward a next route location based on the position of the next route location and the computed position and orientation of the mobile agent, and controlling the mobile agent to drive along the path toward the next route location.
Type: Application
Filed: May 3, 2016
Publication date: August 25, 2016
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Patent number: 9285923
Abstract: A display includes a display area that emits light and a border region surrounding at least a portion of the light-emitting region. A light guide plate overlays the display area. A lighting module, positioned within the border region, is operatively interconnected with the light guide plate to provide light to it, and a camera module, also positioned within the border region, is operatively interconnected with the light guide plate to sense light from it. The display determines the position of a touch on the light guide plate by locating the frustrated total internal reflection caused by the touch.
Type: Grant
Filed: December 19, 2012
Date of Patent: March 15, 2016
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Ahmet Mufit Ferman, Xiaofan Feng, Philip B. Cowan
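For illustration, a minimal sketch assuming the camera observes the light guide plate as a grayscale image in which a touch appears as a bright spot of frustrated total internal reflection: threshold the frame and take the largest blob's centroid as the touch position. The threshold value is an assumption.

```python
# Sketch: locate a touch as the centroid of the brightest blob in the image.
import numpy as np
import cv2

def detect_touch(frame_gray, threshold=200):
    """Return (x, y) of the largest bright blob's centroid, or None."""
    _, binary = cv2.threshold(frame_gray, threshold, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    if n < 2:                                  # label 0 is the background
        return None
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    return tuple(centroids[largest])

frame = np.zeros((480, 640), dtype=np.uint8)
cv2.circle(frame, (320, 240), 8, 255, -1)      # synthetic touch spot
print(detect_touch(frame))                     # approximately (320.0, 240.0)
```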
-
Publication number: 20160068114
Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
Type: Application
Filed: November 16, 2015
Publication date: March 10, 2016
Inventor: Miao Liao
-
Publication number: 20160062359
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
Type: Application
Filed: October 12, 2015
Publication date: March 3, 2016
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Publication number: 20160063330
Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
Type: Application
Filed: September 3, 2014
Publication date: March 3, 2016
Inventors: Xinyu Xu, Miao Liao, Petrus J. L. van Beek
-
Patent number: 9157757
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
Type: Grant
Filed: September 3, 2014
Date of Patent: October 13, 2015
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Publication number: 20150181679
Abstract: A system determines the task of the viewer and/or gestures made by the viewer. Based upon the determined task and/or the gestures, the lighting provided to the viewer may be modified.
Type: Application
Filed: December 23, 2013
Publication date: June 25, 2015
Applicant: Sharp Laboratories of America, Inc.
Inventors: Miao LIAO, Xiao-Fan FENG, Xu CHEN