Patents by Inventor Linhuang Yan

Linhuang Yan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method for generating a talking head video with mouth movement sequence, device and computer-readable storage medium

Patent number: 12315059

Abstract: A method for generating a talking head video includes: obtaining a text and an image containing a face of a user; determining a phoneme sequence that corresponds to the text and includes one or more phonemes; determining acoustic features corresponding to the text according to the phoneme sequence, and obtaining synthesized speech corresponding to the text according to the acoustic features; determining a first mouth movement sequence corresponding to the text according to the phoneme sequence, and determining a second mouth movement sequence corresponding to the text according to the acoustic features; creating a facial action video corresponding to the user according to the first mouth movement sequence, the second mouth movement sequence and the image; and processing the synthesized speech and the facial action video synchronously to obtain a talking head video corresponding to the user.

Type: Grant

Filed: May 26, 2023

Date of Patent: May 27, 2025

Assignee: UBTECH ROBOTICS CORP LTD

Inventors: Wan Ding, Dongyan Huang, Linhuang Yan, Zhiyong Yang

TEXT-TO-SPEECH SYNTHESIS METHOD, ELECTRONIC DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM

Publication number: 20230410791

Abstract: A text-to-speech synthesis method, an electronic device, and a computer-readable storage medium are provided. The method includes: obtaining prosodic pause features of an input text by performing a prosodic pause prediction processing on the input text, and dividing the input text into a plurality of prosodic phrases according to the prosodic pause features; synthesizing short sentence audios according to the prosodic phrases by performing a streamed speech synthesis processing on each of the prosodic phrases in the input text in a manner of asynchronous processing of a thread pool; and performing an audio playback operation of the input text according to the short sentence audios corresponding to the first prosodic phrase of the input text, in response to synthesizing the short sentence audio corresponding to the first prosodic phrase of the input text.
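The flow described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `split_prosodic_phrases`, `synthesize`, and `stream_speech` are hypothetical names, the punctuation-based split only stands in for prosodic pause prediction, and the synthesizer returns placeholder bytes instead of audio.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def split_prosodic_phrases(text):
    """Assumed stand-in for prosodic pause prediction: split after punctuation."""
    parts = re.split(r"(?<=[,.;!?])\s+", text.strip())
    return [p for p in parts if p]


def synthesize(phrase):
    """Assumed stand-in for a speech synthesizer: returns placeholder bytes."""
    return phrase.encode("utf-8")


def stream_speech(text, play, max_workers=4):
    """Synthesize all phrases in a thread pool and play them in text order."""
    phrases = split_prosodic_phrases(text)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Every phrase is submitted up front, so later phrases are
        # synthesized while earlier ones are being played.
        futures = [pool.submit(synthesize, p) for p in phrases]
        for future in futures:
            # Blocks only until this phrase is ready, so playback starts
            # as soon as the first phrase has been synthesized.
            play(future.result())
    return len(phrases)


# Usage: collect the "played" chunks in a list instead of an audio device.
played = []
count = stream_speech("Hello there, how are you? I am fine.", played.append)
```

Iterating over the futures in submission order keeps playback in text order even when a later phrase finishes synthesis first.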

Type: Application

Filed: June 20, 2023

Publication date: December 21, 2023

Inventors: Wan Ding, Dongyuan Huang, Zehong Zheng, Linhuang Yan, Zhiyong Yang

METHOD FOR GENERATING TALKING HEAD VIDEO, DEVICE AND COMPUTER-READABLE STORAGE MEDIUM

Publication number: 20230386116

Abstract: A method for generating a talking head video includes: obtaining a text and an image containing a face of a user; determining a phoneme sequence that corresponds to the text and includes one or more phonemes; determining acoustic features corresponding to the text according to the phoneme sequence, and obtaining synthesized speech corresponding to the text according to the acoustic features; determining a first mouth movement sequence corresponding to the text according to the phoneme sequence, and determining a second mouth movement sequence corresponding to the text according to the acoustic features; creating a facial action video corresponding to the user according to the first mouth movement sequence, the second mouth movement sequence and the image; and processing the synthesized speech and the facial action video synchronously to obtain a talking head video corresponding to the user.

Type: Application

Filed: May 26, 2023

Publication date: November 30, 2023

Inventors: Wan Ding, Dongyan Huang, Linhuang Yan, Zhiyong Yang
