Patents by Inventor Miao LIAO
Miao LIAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11587548
Abstract: Presented herein are novel approaches to synthesizing video of speech from text. In a training phase, embodiments build a phoneme-pose dictionary and train a generative neural network model using a generative adversarial network (GAN) to generate video from interpolated phoneme poses. In deployment, the trained generative neural network, in conjunction with the phoneme-pose dictionary, converts an input text into a video of a person speaking the words of the input text. Compared to audio-driven video generation approaches, the embodiments herein have a number of advantages: 1) they need only a fraction of the training data used by an audio-driven approach; 2) they are more flexible and not vulnerable to speaker variation; and 3) they significantly reduce the preprocessing, training, and inference times.
Type: Grant
Filed: April 2, 2021
Date of Patent: February 21, 2023
Assignee: Baidu USA LLC
Inventors: Sibo Zhang, Jiahong Yuan, Miao Liao, Liangjun Zhang
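As an illustrative aid (not part of the patent record), here is a minimal Python sketch of the dictionary-lookup-and-interpolation step the abstract describes; the dictionary contents, pose dimensions, and frame count are made-up assumptions.

```python
# Illustrative sketch (not the patented implementation): map a phoneme
# sequence to a smooth per-frame pose sequence via a phoneme-pose
# dictionary and linear interpolation between consecutive key poses.
import numpy as np

# Hypothetical dictionary: phoneme -> key pose (here a toy 2D pose vector).
phoneme_pose_dict = {
    "AH": np.array([0.0, 1.0]),
    "B":  np.array([1.0, 0.5]),
    "S":  np.array([0.5, 0.0]),
}

def interpolate_poses(phonemes, frames_per_phoneme=5):
    """Linearly interpolate between consecutive key poses to build the
    frame-by-frame pose sequence fed to the video generator."""
    key_poses = [phoneme_pose_dict[p] for p in phonemes]
    sequence = []
    for a, b in zip(key_poses, key_poses[1:]):
        for t in np.linspace(0.0, 1.0, frames_per_phoneme, endpoint=False):
            sequence.append((1.0 - t) * a + t * b)
    sequence.append(key_poses[-1])
    return np.stack(sequence)

poses = interpolate_poses(["AH", "B", "S"])
print(poses.shape)  # (11, 2): one interpolated pose per output frame
```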
-
Patent number: 11514634
Abstract: Presented herein are novel embodiments for converting a given speech audio or text into a photo-realistic speaking video of a person with synchronized, realistic, and expressive body dynamics. In one or more embodiments, 3D skeleton movements are generated from the audio sequence using a recurrent neural network, and an output video is synthesized via a conditional generative adversarial network. To make movements realistic and expressive, the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures may be embedded into the generation process in both the learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps the model quickly learn meaningful body movement from a few videos. To produce photo-realistic, high-resolution video with motion details, a part-attention mechanism is inserted into the conditional GAN, where each detailed part is automatically zoomed in to have its own discriminator.
Type: Grant
Filed: June 12, 2020
Date of Patent: November 29, 2022
Assignees: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
Inventors: Miao Liao, Sibo Zhang, Peng Wang, Ruigang Yang
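For illustration only, a minimal sketch of the audio-to-skeleton stage described above, assuming PyTorch; the feature dimension, hidden size, and joint count are placeholder assumptions, not values from the patent.

```python
# Minimal sketch: a recurrent network mapping a sequence of audio features
# to per-frame 3D skeleton joint positions, as in the abstract's first stage.
import torch
import torch.nn as nn

class AudioToSkeleton(nn.Module):
    def __init__(self, audio_dim=80, hidden_dim=256, num_joints=18):
        super().__init__()
        self.rnn = nn.GRU(audio_dim, hidden_dim, batch_first=True)
        # Regress (x, y, z) for every joint at each time step.
        self.head = nn.Linear(hidden_dim, num_joints * 3)

    def forward(self, audio_features):
        # audio_features: (batch, time, audio_dim), e.g. mel-spectrogram frames
        hidden, _ = self.rnn(audio_features)
        joints = self.head(hidden)                    # (batch, time, J*3)
        return joints.view(*joints.shape[:2], -1, 3)  # (batch, time, J, 3)

model = AudioToSkeleton()
dummy_audio = torch.randn(1, 100, 80)  # 100 frames of 80-dim features
print(model(dummy_audio).shape)        # torch.Size([1, 100, 18, 3])
```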
-
Patent number: 11282164
Abstract: Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting region. For each target pixel within the target inpainting region of the first image frame, based on the corresponding depth map, the method maps the target pixel to a candidate pixel in a second image frame. The method further determines a candidate color to fill the target pixel. The method further performs Poisson image editing on the first image frame to achieve color consistency at the boundary and between the inside and outside of the target inpainting region.
Type: Grant
Filed: May 26, 2020
Date of Patent: March 22, 2022
Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Miao Liao, Feixiang Lu, Dingfu Zhou, Sibo Zhang, Ruigang Yang
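For illustration, a hedged sketch of two steps named in the abstract: projecting a 3D point map into an image frame to form a depth map (assuming a simple pinhole camera model), plus the Poisson-editing step via OpenCV's seamlessClone; the intrinsics and array names are assumptions.

```python
# Illustrative sketch (assumed pinhole model, not the patent's code): project
# 3D points into an image to obtain a per-pixel depth map with a z-buffer.
import numpy as np
import cv2

def project_to_depth_map(points_3d, K, image_shape):
    """points_3d: (N, 3) points in the camera frame; K: 3x3 intrinsics.
    Keeps the nearest depth per pixel (z-buffering)."""
    h, w = image_shape
    depth = np.full((h, w), np.inf, dtype=np.float32)
    z = points_3d[:, 2]
    valid = z > 0
    uvw = (K @ points_3d[valid].T).T          # perspective projection
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for ui, vi, zi in zip(u[inside], v[inside], z[valid][inside]):
        depth[vi, ui] = min(depth[vi, ui], zi)  # keep the nearest surface
    return depth

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
pts = np.random.rand(1000, 3) * [4, 3, 10]  # random points in front of camera
print(project_to_depth_map(pts, K, (480, 640)).min())

# Poisson image editing step, with hypothetical arrays `filled` (candidate
# colors), `frame` (first image frame), and `mask` (inpainting region):
# cv2.seamlessClone solves the Poisson equation so the boundary is seamless.
# result = cv2.seamlessClone(filled, frame, mask, center, cv2.NORMAL_CLONE)
```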
-
Publication number: 20210390945
Abstract: Presented herein are novel approaches to synthesizing video of speech from text. In a training phase, embodiments build a phoneme-pose dictionary and train a generative neural network model using a generative adversarial network (GAN) to generate video from interpolated phoneme poses. In deployment, the trained generative neural network, in conjunction with the phoneme-pose dictionary, converts an input text into a video of a person speaking the words of the input text. Compared to audio-driven video generation approaches, the embodiments herein have a number of advantages: 1) they need only a fraction of the training data used by an audio-driven approach; 2) they are more flexible and not vulnerable to speaker variation; and 3) they significantly reduce the preprocessing, training, and inference times.
Type: Application
Filed: April 2, 2021
Publication date: December 16, 2021
Applicant: Baidu USA LLC
Inventors: Sibo ZHANG, Jiahong YUAN, Miao LIAO, Liangjun ZHANG
-
Publication number: 20210390748
Abstract: Presented herein are novel embodiments for converting a given speech audio or text into a photo-realistic speaking video of a person with synchronized, realistic, and expressive body dynamics. In one or more embodiments, 3D skeleton movements are generated from the audio sequence using a recurrent neural network, and an output video is synthesized via a conditional generative adversarial network. To make movements realistic and expressive, the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures may be embedded into the generation process in both the learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps the model quickly learn meaningful body movement from a few videos. To produce photo-realistic, high-resolution video with motion details, a part-attention mechanism is inserted into the conditional GAN, where each detailed part is automatically zoomed in to have its own discriminator.
Type: Application
Filed: June 12, 2020
Publication date: December 16, 2021
Applicants: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
Inventors: Miao LIAO, Sibo ZHANG, Peng WANG, Ruigang YANG
-
Publication number: 20210374904
Abstract: Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting region. For each target pixel within the target inpainting region of the first image frame, based on the corresponding depth map, the method maps the target pixel to a candidate pixel in a second image frame. The method further determines a candidate color to fill the target pixel. The method further performs Poisson image editing on the first image frame to achieve color consistency at the boundary and between the inside and outside of the target inpainting region.
Type: Application
Filed: May 26, 2020
Publication date: December 2, 2021
Inventors: Miao LIAO, Feixiang LU, Dingfu ZHOU, Sibo ZHANG, Ruigang YANG
-
Patent number: 10339389
Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
Type: Grant
Filed: September 3, 2014
Date of Patent: July 2, 2019
Assignee: Sharp Laboratories of America, Inc.
Inventors: Xinyu Xu, Miao Liao, Petrus J. L. van Beek
-
Patent number: 9969337
Abstract: Feature detection may be performed on an image. After feature descriptors for each detected feature are computed, feature matching between the feature descriptors for the current image and for a key image frame is performed. If a sufficient number of good matches are identified, key points associated with the feature correspondences may be projected from image coordinates to world coordinates. A distance, in the world coordinate frame, may be computed for each feature correspondence. When the computed distances indicate sufficient movement of the mobile agent to ensure accurate motion estimation, a motion estimate may be computed from the pairs of world coordinates associated with the feature correspondences. A current camera pose in a global coordinate frame may be generated. A motion trajectory may then be determined, and the feature descriptors for the key image may be updated to the feature descriptors for the current image frame.
Type: Grant
Filed: November 16, 2015
Date of Patent: May 15, 2018
Assignee: Sharp Laboratories of America, Inc.
Inventor: Miao Liao
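For illustration, a minimal sketch of the feature-matching front end described above, assuming OpenCV with ORB features; the match threshold and the mapping from image to world coordinates are simplified assumptions.

```python
# Minimal sketch: detect and match features between the current frame and a
# key frame, the front end of the motion-estimation pipeline in the abstract.
import numpy as np
import cv2

orb = cv2.ORB_create(nfeatures=1000)

def match_to_key_frame(key_img, cur_img, min_good_matches=20):
    """Return matched keypoint pairs, or None when too few good matches
    are found (in which case no motion estimate would be computed)."""
    kp1, des1 = orb.detectAndCompute(key_img, None)
    kp2, des2 = orb.detectAndCompute(cur_img, None)
    if des1 is None or des2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if len(matches) < min_good_matches:
        return None
    # Image-coordinate correspondences; a calibrated camera-to-ground
    # transform would project these to world coordinates for the estimate.
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches]

a = (np.random.rand(240, 320) * 255).astype(np.uint8)
print(match_to_key_frame(a, a) is not None)  # identical frames should match
```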
-
Patent number: 9946264
Abstract: A system and method are provided for autonomously navigating a vehicle. The method captures a sequence of image pairs using a stereo camera. A navigation application stores a vehicle pose (a history of vehicle positions). The application detects a plurality of matching feature points in a first matching image pair and determines a plurality of corresponding object points in three-dimensional (3D) space from the first image pair. A plurality of feature points are tracked from the first image pair to a second image pair, and the plurality of corresponding object points in 3D space are determined from the second image pair. From this, a vehicle pose transformation is calculated using the object points from the first and second image pairs. The rotation angle and translation are determined from the vehicle pose transformation. If the rotation angle or translation exceeds a minimum threshold, the stored vehicle pose is updated.
Type: Grant
Filed: March 22, 2016
Date of Patent: April 17, 2018
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Ming Li, Soonhac Hong
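For illustration, a hedged sketch of the pose-update rule: estimate the rigid transform between the two sets of 3D object points (here via the Kabsch algorithm, one standard choice, not necessarily the patented method), and commit the update only when rotation or translation exceeds a minimum threshold. The threshold values are assumptions.

```python
# Sketch: rigid transform between 3D point sets, plus the threshold check
# that suppresses pose drift while the vehicle is nearly stationary.
import numpy as np

def rigid_transform(p, q):
    """Least-squares R, t with q ~ R @ p + t, for (N, 3) point sets."""
    cp, cq = p.mean(axis=0), q.mean(axis=0)
    H = (p - cp).T @ (q - cq)                 # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def should_update_pose(R, t, min_angle_rad=0.01, min_translation_m=0.02):
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return angle > min_angle_rad or np.linalg.norm(t) > min_translation_m

# Synthetic check: rotate points 5 degrees about z and shift them 10 cm.
theta = np.radians(5.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
p = np.random.rand(50, 3)
q = p @ Rz.T + np.array([0.1, 0.0, 0.0])
R, t = rigid_transform(p, q)
print(should_update_pose(R, t))  # True: motion exceeds both thresholds
```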
-
Publication number: 20170277197
Abstract: A system and method are provided for autonomously navigating a vehicle. The method captures a sequence of image pairs using a stereo camera. A navigation application stores a vehicle pose (a history of vehicle positions). The application detects a plurality of matching feature points in a first matching image pair and determines a plurality of corresponding object points in three-dimensional (3D) space from the first image pair. A plurality of feature points are tracked from the first image pair to a second image pair, and the plurality of corresponding object points in 3D space are determined from the second image pair. From this, a vehicle pose transformation is calculated using the object points from the first and second image pairs. The rotation angle and translation are determined from the vehicle pose transformation. If the rotation angle or translation exceeds a minimum threshold, the stored vehicle pose is updated.
Type: Application
Filed: March 22, 2016
Publication date: September 28, 2017
Inventors: Miao Liao, Ming Li, Soonhac Hong
-
Patent number: 9625912
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual-landmark recognition. One method may include scanning a captured image to detect a machine-readable marker and extracting encoded data therefrom. The data may include an identifying key usable to access a database to retrieve physical attributes associated with the marker. The method may include using the physical attributes to compute a position and an orientation of a mobile agent relative to a landmark object associated with the marker. The method may further include determining a path toward a next route location based on the position of the next route location and the computed position and orientation of the mobile agent, and controlling the mobile agent to drive along the path toward the next route location.
Type: Grant
Filed: May 3, 2016
Date of Patent: April 18, 2017
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
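For illustration, a minimal sketch assuming OpenCV and a QR code as the machine-readable marker: decode the marker to obtain a database key, look up the marker's known physical size, and recover the camera pose relative to the marker with solvePnP. The database, marker size, corner ordering, and intrinsics K are stand-in assumptions.

```python
# Sketch: marker decoding plus pose recovery from known marker geometry.
import numpy as np
import cv2

landmark_db = {"dock-01": {"size_m": 0.20}}  # key -> physical attributes

def locate_relative_to_marker(image, K, dist_coeffs=None):
    data, corners, _ = cv2.QRCodeDetector().detectAndDecode(image)
    if not data or data not in landmark_db:
        return None                            # no marker, or unknown key
    s = landmark_db[data]["size_m"] / 2.0
    # Marker corners in its own coordinate frame: a square of known size.
    object_pts = np.array([[-s,  s, 0], [ s,  s, 0],
                           [ s, -s, 0], [-s, -s, 0]], dtype=np.float32)
    ok, rvec, tvec = cv2.solvePnP(object_pts, corners.reshape(4, 1, 2),
                                  K, dist_coeffs)
    # rvec/tvec give the marker's pose in the camera frame; inverting the
    # transform yields the mobile agent's position and orientation.
    return (rvec, tvec) if ok else None
```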
-
Patent number: 9625908
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
Type: Grant
Filed: October 12, 2015
Date of Patent: April 18, 2017
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Patent number: 9622322
Abstract: A system determines the task of the viewer and/or gestures made by the viewer. Based upon the determined task and/or the gestures, the lighting provided to the viewer may be modified.
Type: Grant
Filed: December 23, 2013
Date of Patent: April 11, 2017
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xiao-Fan Feng, Xu Chen
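For illustration only, a toy sketch of the control rule the abstract implies: choose a lighting level from the detected task and adjust it when a gesture is recognized. All task names, gestures, and levels are invented placeholders.

```python
# Toy sketch: task- and gesture-driven lighting adjustment.
TASK_LIGHT_LEVELS = {"reading": 0.9, "video": 0.4, "idle": 0.2}

def lighting_level(task, gesture=None):
    level = TASK_LIGHT_LEVELS.get(task, 0.5)   # default for unknown tasks
    if gesture == "raise_hand":                # e.g., a "brighter" gesture
        level = min(1.0, level + 0.2)
    elif gesture == "lower_hand":              # e.g., a "dimmer" gesture
        level = max(0.0, level - 0.2)
    return level

print(lighting_level("reading", "lower_hand"))  # 0.7
```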
-
Publication number: 20160246302
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual-landmark recognition. One method may include scanning a captured image to detect a machine-readable marker and extracting encoded data therefrom. The data may include an identifying key usable to access a database to retrieve physical attributes associated with the marker. The method may include using the physical attributes to compute a position and an orientation of a mobile agent relative to a landmark object associated with the marker. The method may further include determining a path toward a next route location based on the position of the next route location and the computed position and orientation of the mobile agent, and controlling the mobile agent to drive along the path toward the next route location.
Type: Application
Filed: May 3, 2016
Publication date: August 25, 2016
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Patent number: 9285923
Abstract: A display includes a display area that emits light and a border region surrounding at least a portion of the light-emitting region. A light guide plate overlays the display area. A lighting module, positioned within the border region, is operatively interconnected with the light guide plate to provide light to it, and a camera module, also positioned within the border region, is operatively interconnected with the light guide plate to sense light from it. The display determines the position of a touch on the light guide plate by locating the frustrated total internal reflection caused by the touch.
Type: Grant
Filed: December 19, 2012
Date of Patent: March 15, 2016
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Ahmet Mufit Ferman, Xiaofan Feng, Philip B. Cowan
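For illustration, a minimal sketch assuming the camera observes the light guide plate as a grayscale image in which a touch appears as a bright spot of frustrated total internal reflection: threshold the frame and take the largest blob's centroid as the touch position. The threshold value is an assumption.

```python
# Sketch: locate a touch as the centroid of the brightest blob in the image.
import numpy as np
import cv2

def detect_touch(frame_gray, threshold=200):
    """Return (x, y) of the largest bright blob's centroid, or None."""
    _, binary = cv2.threshold(frame_gray, threshold, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    if n < 2:                                  # label 0 is the background
        return None
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    return tuple(centroids[largest])

frame = np.zeros((480, 640), dtype=np.uint8)
cv2.circle(frame, (320, 240), 8, 255, -1)      # synthetic touch spot
print(detect_touch(frame))                     # approximately (320.0, 240.0)
```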
-
Publication number: 20160068114
Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
Type: Application
Filed: November 16, 2015
Publication date: March 10, 2016
Inventor: Miao Liao
-
Publication number: 20160062359
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
Type: Application
Filed: October 12, 2015
Publication date: March 3, 2016
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Publication number: 20160063330
Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
Type: Application
Filed: September 3, 2014
Publication date: March 3, 2016
Inventors: Xinyu Xu, Miao Liao, Petrus J. L. van Beek
-
Patent number: 9157757
Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
Type: Grant
Filed: September 3, 2014
Date of Patent: October 13, 2015
Assignee: Sharp Laboratories of America, Inc.
Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
-
Publication number: 20150181679
Abstract: A system determines the task of the viewer and/or gestures made by the viewer. Based upon the determined task and/or the gestures, the lighting provided to the viewer may be modified.
Type: Application
Filed: December 23, 2013
Publication date: June 25, 2015
Applicant: Sharp Laboratories of America, Inc.
Inventors: Miao LIAO, Xiao-Fan FENG, Xu CHEN