Patents by Inventor Miao LIAO

Miao LIAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11587548
    Abstract: Presented herein are novel approaches to synthesizing video of speech from text. In a training phase, embodiments build a phoneme-pose dictionary and train a generative neural network model using a generative adversarial network (GAN) to generate video from interpolated phoneme poses. In deployment, the trained generative neural network, in conjunction with the phoneme-pose dictionary, converts an input text into a video of a person speaking the words of the input text. Compared to audio-driven video generation approaches, the embodiments herein have a number of advantages: 1) they need only a fraction of the training data used by an audio-driven approach; 2) they are more flexible and not vulnerable to speaker variation; and 3) they significantly reduce preprocessing, training, and inference times. (A sketch of the pose-interpolation step follows this entry.)
    Type: Grant
    Filed: April 2, 2021
    Date of Patent: February 21, 2023
    Assignee: Baidu USA LLC
    Inventors: Sibo Zhang, Jiahong Yuan, Miao Liao, Liangjun Zhang
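
    A minimal sketch of the phoneme-pose lookup and interpolation described in the entry above, assuming a dictionary keyed by ARPAbet-style phoneme symbols with poses stored as 2D keypoint arrays; the keypoint count, frame timing, and dictionary contents are illustrative assumptions, and the GAN rendering stage is omitted.

      import numpy as np

      # Hypothetical phoneme-pose dictionary: each phoneme maps to an array of
      # keypoints (num_keypoints x 2). Real entries would be extracted from
      # training video as the abstract describes; random values stand in here.
      phoneme_pose_dict = {
          "HH": np.random.rand(68, 2),
          "AH": np.random.rand(68, 2),
          "L":  np.random.rand(68, 2),
          "OW": np.random.rand(68, 2),
      }

      def interpolate_poses(phonemes, frames_per_transition=5):
          """Linearly interpolate between consecutive phoneme poses to produce
          a smooth pose sequence for a trained generator to render as video."""
          poses = [phoneme_pose_dict[p] for p in phonemes]
          sequence = []
          for start, end in zip(poses, poses[1:]):
              for t in np.linspace(0.0, 1.0, frames_per_transition, endpoint=False):
                  sequence.append((1.0 - t) * start + t * end)
          sequence.append(poses[-1])
          return np.stack(sequence)

      # "hello" -> HH AH L OW; result shape: (frames, 68, 2)
      print(interpolate_poses(["HH", "AH", "L", "OW"]).shape)
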
  • Patent number: 11514634
    Abstract: Presented herein are novel embodiments for converting a given speech audio or text into a photo-realistic speaking video of a person with synchronized, realistic, and expressive body dynamics. In one or more embodiments, 3D skeleton movements are generated from the audio sequence using a recurrent neural network, and an output video is synthesized via a conditional generative adversarial network. To make movements realistic and expressive, the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures may be embedded into the generation process in both learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps the model quickly learn meaningful body movement from only a few videos. To produce photo-realistic and high-resolution video with motion details, a part-attention mechanism is inserted in the conditional GAN, where each detailed part is automatically zoomed in to have its own discriminator. (A sketch of the audio-to-skeleton stage follows this entry.)
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: November 29, 2022
    Assignees: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
    Inventors: Miao Liao, Sibo Zhang, Peng Wang, Ruigang Yang
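
    The first stage above maps per-frame audio features to 3D joint positions with a recurrent network. Below is a minimal PyTorch sketch of one plausible shape for that mapping; the GRU architecture, feature dimension, and joint count are assumptions rather than details from the patent, and the conditional-GAN renderer, gesture dictionary, and part-attention discriminators are omitted.

      import torch
      import torch.nn as nn

      class AudioToSkeleton(nn.Module):
          """Illustrative recurrent model mapping per-frame audio features
          (e.g., MFCCs) to 3D skeleton joint positions."""
          def __init__(self, audio_dim=28, hidden_dim=256, num_joints=21):
              super().__init__()
              self.rnn = nn.GRU(audio_dim, hidden_dim, num_layers=2, batch_first=True)
              self.head = nn.Linear(hidden_dim, num_joints * 3)

          def forward(self, audio_feats):            # (batch, time, audio_dim)
              h, _ = self.rnn(audio_feats)
              joints = self.head(h)                  # (batch, time, num_joints*3)
              return joints.view(*joints.shape[:2], -1, 3)

      model = AudioToSkeleton()
      skeleton_seq = model(torch.randn(1, 100, 28))  # 100 audio frames
      print(skeleton_seq.shape)                      # torch.Size([1, 100, 21, 3])
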
  • Patent number: 11282164
    Abstract: Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting region. For each target pixel within the target inpainting region of the first image frame, the method further maps the target pixel, based on the corresponding depth map, to a candidate pixel in a second image frame. The method further determines a candidate color to fill the target pixel. The method further performs Poisson image editing on the first image frame to achieve color consistency at a boundary and between inside and outside of the target inpainting region of the first image frame. (A sketch of the pixel-reprojection step follows this entry.)
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: March 22, 2022
    Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Miao Liao, Feixiang Lu, Dingfu Zhou, Sibo Zhang, Ruigang Yang
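
    The per-pixel mapping step above is classical back-projection with known depth followed by reprojection under the relative camera pose. A minimal sketch with assumed intrinsics follows; for the final color-consistency step, Poisson image editing is available in OpenCV as cv2.seamlessClone.

      import numpy as np

      def reproject_pixel(u, v, depth, K, T_1_to_2):
          """Map a target pixel (u, v) in frame 1, with depth taken from the
          stitched 3D map, to its candidate pixel in frame 2. K is the 3x3
          camera intrinsic matrix; T_1_to_2 is the 4x4 relative camera pose.
          All numeric values below are illustrative."""
          # Back-project to a 3D point in frame 1's camera coordinates.
          p_cam1 = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
          # Transform the point into frame 2's camera coordinates.
          p_cam2 = (T_1_to_2 @ np.append(p_cam1, 1.0))[:3]
          # Project onto frame 2's image plane.
          uvw = K @ p_cam2
          return uvw[0] / uvw[2], uvw[1] / uvw[2]

      K = np.array([[700.0, 0, 640], [0, 700.0, 360], [0, 0, 1]])
      T = np.eye(4)
      T[0, 3] = 0.5                       # assumed half-metre relative shift
      print(reproject_pixel(800, 400, depth=12.0, K=K, T_1_to_2=T))
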
  • Publication number: 20210390945
    Abstract: Presented herein are novel approaches to synthesizing video of speech from text. In a training phase, embodiments build a phoneme-pose dictionary and train a generative neural network model using a generative adversarial network (GAN) to generate video from interpolated phoneme poses. In deployment, the trained generative neural network, in conjunction with the phoneme-pose dictionary, converts an input text into a video of a person speaking the words of the input text. Compared to audio-driven video generation approaches, the embodiments herein have a number of advantages: 1) they need only a fraction of the training data used by an audio-driven approach; 2) they are more flexible and not vulnerable to speaker variation; and 3) they significantly reduce preprocessing, training, and inference times.
    Type: Application
    Filed: April 2, 2021
    Publication date: December 16, 2021
    Applicant: Baidu USA LLC
    Inventors: Sibo ZHANG, Jiahong YUAN, Miao LIAO, Liangjun ZHANG
  • Publication number: 20210390748
    Abstract: Presented herein are novel embodiments for converting a given speech audio or text into a photo-realistic speaking video of a person with synchronized, realistic, and expressive body dynamics. In one or more embodiments, 3D skeleton movements are generated from the audio sequence using a recurrent neural network, and an output video is synthesized via a conditional generative adversarial network. To make movements realistic and expressive, the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures may be embedded into the generation process in both learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps the model quickly learn meaningful body movement from only a few videos. To produce photo-realistic and high-resolution video with motion details, a part-attention mechanism is inserted in the conditional GAN, where each detailed part is automatically zoomed in to have its own discriminator.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Applicants: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
    Inventors: Miao LIAO, Sibo ZHANG, Peng WANG, Ruigang YANG
  • Publication number: 20210374904
    Abstract: Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting region. For each target pixel within the target inpainting region of the first image frame, the method further maps the target pixel, based on the corresponding depth map, to a candidate pixel in a second image frame. The method further determines a candidate color to fill the target pixel. The method further performs Poisson image editing on the first image frame to achieve color consistency at a boundary and between inside and outside of the target inpainting region of the first image frame.
    Type: Application
    Filed: May 26, 2020
    Publication date: December 2, 2021
    Inventors: Miao LIAO, Feixiang LU, Dingfu ZHOU, Sibo ZHANG, Ruigang YANG
  • Patent number: 10339389
    Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: July 2, 2019
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Xinyu Xu, Miao Liao, Petrus J. L. van Beek
  • Patent number: 9969337
    Abstract: Feature detection may be performed on an image. After feature descriptors for each detected feature are computed, feature matching between feature descriptors for the current image and for a key image frame is performed. If a sufficient number of good matches are identified, key points associated with the feature correspondences may be projected from image coordinates to world coordinates. A distance, in the world coordinate frame, between each feature correspondence may be computed. When the computed distances indicate sufficient movement of the mobile agent to ensure accurate motion estimation, a motion estimate may be computed from the pairs of world coordinates associated with the feature correspondences. A current camera pose in a global coordinate frame may be generated. A motion trajectory may then be determined, and the feature descriptors for the key image may be updated to those of the current image frame. (A sketch of the feature-matching step follows this entry.)
    Type: Grant
    Filed: November 16, 2015
    Date of Patent: May 15, 2018
    Assignee: Sharp Laboratories of America, Inc.
    Inventor: Miao Liao
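
    The detection-and-matching front end of this pipeline can be illustrated with OpenCV. The patent does not name a particular detector or matcher; ORB features with brute-force Hamming matching and the distance threshold below are assumptions.

      import cv2
      import numpy as np

      orb = cv2.ORB_create(nfeatures=1000)
      matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

      def match_to_key_frame(key_img, cur_img, max_dist=40):
          """Detect features, compute descriptors, and match the current image
          against the key frame, keeping only good (low-distance) matches as
          in the abstract's sufficiency test."""
          kp1, des1 = orb.detectAndCompute(key_img, None)
          kp2, des2 = orb.detectAndCompute(cur_img, None)
          if des1 is None or des2 is None:
              return []
          good = [m for m in matcher.match(des1, des2) if m.distance < max_dist]
          return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]

      key = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
      cur = np.roll(key, 3, axis=1)       # simulate a small camera motion
      print(len(match_to_key_frame(key, cur)))
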
  • Patent number: 9946264
    Abstract: A system and method are provided for autonomously navigating a vehicle. The method captures a sequence of image pairs using a stereo camera. A navigation application stores a vehicle pose (the history of vehicle position). The application detects a plurality of matching feature points in a first matching image pair and determines a plurality of corresponding object points in three-dimensional (3D) space from the first image pair. A plurality of feature points are tracked from the first image pair to a second image pair, and the plurality of corresponding object points in 3D space are determined from the second image pair. From this, a vehicle pose transformation is calculated using the object points from the first and second image pairs. The rotation angle and translation are determined from the vehicle pose transformation. If the rotation angle or translation exceeds a minimum threshold, the stored vehicle pose is updated. (A sketch of the pose-transformation step follows this entry.)
    Type: Grant
    Filed: March 22, 2016
    Date of Patent: April 17, 2018
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Miao Liao, Ming Li, Soonhac Hong
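
    Given matched 3D object points from the two stereo frames, the vehicle pose transformation can be recovered as a rigid least-squares fit. The Kabsch/Umeyama-style solver below is one standard choice, not necessarily the patent's method; the closing threshold test mirrors the abstract's update rule, with illustrative threshold values.

      import numpy as np

      def rigid_transform_3d(P, Q):
          """Estimate rotation R and translation t such that Q ~ R @ P + t
          from two (N, 3) arrays of matched 3D object points."""
          cP, cQ = P.mean(axis=0), Q.mean(axis=0)
          H = (P - cP).T @ (Q - cQ)
          U, _, Vt = np.linalg.svd(H)
          R = Vt.T @ U.T
          if np.linalg.det(R) < 0:        # correct an improper reflection
              Vt[-1] *= -1
              R = Vt.T @ U.T
          return R, cQ - R @ cP

      P = np.random.rand(50, 3)
      true_R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
      Q = P @ true_R.T + np.array([0.2, 0.0, 0.1])
      R, t = rigid_transform_3d(P, Q)
      angle = np.degrees(np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0)))
      # Mirror the update rule: commit the pose only when motion is real.
      MIN_ANGLE_DEG, MIN_TRANSLATION = 1.0, 0.01
      if angle > MIN_ANGLE_DEG or np.linalg.norm(t) > MIN_TRANSLATION:
          print("update stored vehicle pose:", round(angle, 1), t.round(2))
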
  • Publication number: 20170277197
    Abstract: A system and method are provided for autonomously navigating a vehicle. The method captures a sequence of image pairs using a stereo camera. A navigation application stores a vehicle pose (the history of vehicle position). The application detects a plurality of matching feature points in a first matching image pair and determines a plurality of corresponding object points in three-dimensional (3D) space from the first image pair. A plurality of feature points are tracked from the first image pair to a second image pair, and the plurality of corresponding object points in 3D space are determined from the second image pair. From this, a vehicle pose transformation is calculated using the object points from the first and second image pairs. The rotation angle and translation are determined from the vehicle pose transformation. If the rotation angle or translation exceeds a minimum threshold, the stored vehicle pose is updated.
    Type: Application
    Filed: March 22, 2016
    Publication date: September 28, 2017
    Inventors: Miao Liao, Ming Li, Soonhac Hong
  • Patent number: 9625912
    Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual-landmark recognition. One method may include scanning a captured image to detect a machine-readable marker and extracting encoded data therefrom. The data may include an identifying key usable to access a database to retrieve physical attributes associated with the marker. The method may include using the physical attributes to compute a position and an orientation of a mobile agent relative to a landmark object associated with the marker. The method may further include determining a path toward a next route location based on the position of the next route location and the computed position and orientation of the mobile agent and controlling the mobile agent to drive along the path toward the next route location. (A sketch of the waypoint-steering step follows this entry.)
    Type: Grant
    Filed: May 3, 2016
    Date of Patent: April 18, 2017
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
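
    Once the marker yields the agent's position and orientation relative to the landmark, steering toward the next route location reduces to a distance and relative-bearing computation, sketched below. The coordinate convention and the controller that would consume the bearing are assumptions, not details from the patent.

      import math

      def steering_to_next_waypoint(agent_xy, agent_heading, waypoint_xy):
          """Return the distance and relative bearing from the agent's pose
          (as computed from the landmark marker) to the next route location."""
          dx = waypoint_xy[0] - agent_xy[0]
          dy = waypoint_xy[1] - agent_xy[1]
          distance = math.hypot(dx, dy)
          bearing = math.atan2(dy, dx) - agent_heading
          # Normalize to [-pi, pi) so the turn takes the shorter direction.
          bearing = (bearing + math.pi) % (2 * math.pi) - math.pi
          return distance, bearing

      d, b = steering_to_next_waypoint((1.0, 2.0), math.radians(90), (4.0, 6.0))
      print(round(d, 2), round(math.degrees(b), 1))   # 5.0 -36.9
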
  • Patent number: 9625908
    Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
    Type: Grant
    Filed: October 12, 2015
    Date of Patent: April 18, 2017
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
  • Patent number: 9622322
    Abstract: A system that determines the task being performed by a viewer and/or gestures made by the viewer. Based upon the determined task and/or gestures, the lighting provided to the viewer may be modified.
    Type: Grant
    Filed: December 23, 2013
    Date of Patent: April 11, 2017
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Miao Liao, Xiao-Fan Feng, Xu Chen
  • Publication number: 20160246302
    Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual-landmark recognition. One method may include scanning a captured image to detect a machine-readable marker and extracting encoded data therefrom. The data may include an identifying key usable to access a database to retrieve physical attributes associated with the marker. The method may include using the physical attributes to compute a position and an orientation of a mobile agent relative to a landmark object associated with the marker. The method may further include determining a path toward a next route location based on the position of the next route location and the computed position and orientation of the mobile agent and controlling the mobile agent to drive along the path toward the next route location.
    Type: Application
    Filed: May 3, 2016
    Publication date: August 25, 2016
    Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
  • Patent number: 9285923
    Abstract: A display includes a display area that emits light and a border region surrounding at least a portion of the light-emitting display area. A light guide plate overlays the display area. A lighting module, positioned within the border region, is operatively interconnected with the light guide plate to provide light to it, and a camera module, also positioned within the border region, is operatively interconnected with the light guide plate to sense light from it. The display determines the position of a touch on the light guide plate by determining the location of frustrated total internal reflection within the light guide plate resulting from the touch. (A sketch of the touch-localization step follows this entry.)
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: March 15, 2016
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Miao Liao, Ahmet Mufit Ferman, Xiaofan Feng, Philip B. Cowan
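
    Locating the touch from the camera's view of the light guide plate can be illustrated with simple thresholding and an intensity-weighted centroid; the patent states only that the location of frustrated total internal reflection is determined, so this particular localization method and threshold are assumptions.

      import numpy as np

      def locate_touch(sensor_image, threshold=200):
          """Find the bright spot where frustrated total internal reflection
          scatters light out of the guide plate toward the camera, returning
          an intensity-weighted centroid (x, y), or None if no touch."""
          mask = sensor_image >= threshold
          if not mask.any():
              return None
          ys, xs = np.nonzero(mask)
          weights = sensor_image[ys, xs].astype(float)
          return (np.average(xs, weights=weights), np.average(ys, weights=weights))

      frame = np.zeros((480, 640), dtype=np.uint8)
      frame[200:210, 300:310] = 255       # synthetic FTIR bright spot
      print(locate_touch(frame))          # ~ (304.5, 204.5)
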
  • Publication number: 20160068114
    Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
    Type: Application
    Filed: November 16, 2015
    Publication date: March 10, 2016
    Inventor: Miao Liao
  • Publication number: 20160062359
    Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
    Type: Application
    Filed: October 12, 2015
    Publication date: March 3, 2016
    Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
  • Publication number: 20160063330
    Abstract: Aspects of the present invention are related to methods and systems for vision-based computation of ego-motion.
    Type: Application
    Filed: September 3, 2014
    Publication date: March 3, 2016
    Inventors: Xinyu Xu, Miao Liao, Petrus J. L. van Beek
  • Patent number: 9157757
    Abstract: Aspects of the present invention are related to methods and systems for autonomous navigation using visual landmark recognition.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: October 13, 2015
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Miao Liao, Xinyu Xu, Petrus J. L. van Beek
  • Publication number: 20150181679
    Abstract: A system that determines the task being performed by a viewer and/or gestures made by the viewer. Based upon the determined task and/or gestures, the lighting provided to the viewer may be modified.
    Type: Application
    Filed: December 23, 2013
    Publication date: June 25, 2015
    Applicant: Sharp Laboratories of America, Inc.
    Inventors: Miao LIAO, Xiao-Fan FENG, Xu CHEN