Patents by Inventor Zhengkun GAO

Zhengkun GAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speech synthesis method, and electronic device

Patent number: 12211485

Abstract: The disclosure provides a speech synthesis method, and an electronic device. The technical solution is described as follows. A text to be synthesized and speech features of a target user are obtained. Predicted first acoustic features based on the text to be synthesized and the speech features are obtained. A target template audio is obtained from a template audio library based on the text to be synthesized. Second acoustic features of the target template audio are extracted. Target acoustic features are generated by splicing the first acoustic features and the second acoustic features. Speech synthesis is performed on the text to be synthesized based on the target acoustic features and the speech features, to generate a target speech of the text to be synthesized.
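
The splicing step in the abstract above can be pictured with a minimal sketch. The abstract does not say how the features are represented or joined, so this assumes each set of acoustic features is a sequence of fixed-size frames and that "splicing" means concatenating segments along the time axis in text order. The names `Frame` and `splice_features` and the "predicted"/"template" labels are hypothetical and do not come from the patent.

```python
from typing import List, Optional, Sequence, Tuple

Frame = List[float]  # one acoustic frame (assumed representation)


def splice_features(segments: Sequence[Tuple[str, List[Frame]]]) -> List[Frame]:
    """Concatenate per-segment frames along the time axis, in the order given.

    Each segment is (source, frames): "predicted" stands for the first
    acoustic features, "template" for the second acoustic features.
    """
    spliced: List[Frame] = []
    dim: Optional[int] = None
    for source, frames in segments:
        if source not in ("predicted", "template"):
            raise ValueError(f"unknown source: {source}")
        for frame in frames:
            # All frames must share one feature dimension to be joined.
            if dim is None:
                dim = len(frame)
            elif len(frame) != dim:
                raise ValueError("all frames must share one feature dimension")
            spliced.append(list(frame))
    return spliced


template_part = [[0.1, 0.2], [0.3, 0.4]]  # frames taken from the template audio
predicted_part = [[0.5, 0.6]]             # frames predicted for the remaining text
target = splice_features([
    ("template", template_part),
    ("predicted", predicted_part),
])
print(len(target))  # 3 frames in total
```

In this sketch the spliced sequence would then be passed, together with the speaker's speech features, to whatever vocoder or synthesis stage produces the waveform; that stage is outside the scope of the example.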

Type: Grant

Filed: August 17, 2022

Date of Patent: January 28, 2025

Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors: Zhengkun Gao, Junteng Zhang, Tao Sun, Lei Jia

Method of registering attribute in speech synthesis model, apparatus of registering attribute in speech synthesis model, electronic device, and medium

Patent number: 12062357

Abstract: A method of registering an attribute in a speech synthesis model, an apparatus of registering an attribute in a speech synthesis model, an electronic device, and a medium are provided, which relate to a field of an artificial intelligence technology such as a deep learning and intelligent speech technology. The method includes: acquiring a plurality of data associated with an attribute to be registered; and registering the attribute in the speech synthesis model by using the plurality of data associated with the attribute, wherein the speech synthesis model is trained in advance by using a training data in a training data set.

Type: Grant

Filed: November 16, 2021

Date of Patent: August 13, 2024

Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors: Wenfu Wang, Xilei Wang, Tao Sun, Han Yuan, Zhengkun Gao, Lei Jia

Method and apparatus of synthesizing speech, method and apparatus of training speech synthesis model, electronic device, and storage medium

Patent number: 11769482

Abstract: The present disclosure provides a method and apparatus of synthesizing a speech, a method and apparatus of training a speech synthesis model, an electronic device, and a storage medium. The method of synthesizing a speech includes acquiring a style information of a speech to be synthesized, a tone information of the speech to be synthesized, and a content information of a text to be processed; generating an acoustic feature information of the text to be processed, by using a pre-trained speech synthesis model, based on the style information, the tone information, and the content information of the text to be processed; and synthesizing the speech for the text to be processed, based on the acoustic feature information of the text to be processed.

Type: Grant

Filed: September 29, 2021

Date of Patent: September 26, 2023

Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors: Wenfu Wang, Tao Sun, Xilei Wang, Junteng Zhang, Zhengkun Gao, Lei Jia

Method and apparatus for training model, method and apparatus for synthesizing speech, device and storage medium

Patent number: 11769480

Abstract: The present disclosure discloses a method and apparatus for training a model, a method and apparatus for synthesizing a speech, a device and a storage medium, and relates to the field of natural language processing and deep learning technology. The method for training a model may include: determining a phoneme feature and a prosodic word boundary feature of sample text data; inserting a pause character into the phoneme feature according to the prosodic word boundary feature to obtain a combined feature of the sample text data; and training an initial speech synthesis model according to the combined feature of the sample text data, to obtain a target speech synthesis model.
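
The feature-combination step described in the abstract above can be sketched as follows. The abstract does not specify the encoding, so this assumes the phoneme feature is a list of phoneme symbols and the prosodic word boundary feature is a parallel list of flags marking the phonemes that end a prosodic word. The `PAUSE` symbol and the function name `insert_pauses` are hypothetical.

```python
from typing import List

PAUSE = "<pause>"  # hypothetical pause character


def insert_pauses(phonemes: List[str], boundary_after: List[bool]) -> List[str]:
    """Insert PAUSE after every phoneme flagged as ending a prosodic word."""
    if len(phonemes) != len(boundary_after):
        raise ValueError("phonemes and boundary flags must have equal length")
    combined: List[str] = []
    for phoneme, is_boundary in zip(phonemes, boundary_after):
        combined.append(phoneme)
        if is_boundary:
            combined.append(PAUSE)
    return combined


phonemes = ["n", "i", "h", "ao", "sh", "i", "j", "ie"]
boundaries = [False, True, False, True, False, False, False, False]
combined = insert_pauses(phonemes, boundaries)
print(combined)
```

Under these assumptions, the combined sequence is what a speech synthesis model would be trained on in place of the bare phoneme sequence, so that pauses become explicit tokens the model can learn to render.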

Type: Grant

Filed: December 3, 2020

Date of Patent: September 26, 2023

Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventors: Zhengkun Gao, Junteng Zhang, Wenfu Wang, Tao Sun

Speech synthesis method, and electronic device

Publication number: 20230005466

Abstract: The disclosure provides a speech synthesis method, and an electronic device. The technical solution is described as follows. A text to be synthesized and speech features of a target user are obtained. Predicted first acoustic features based on the text to be synthesized and the speech features are obtained. A target template audio is obtained from a template audio library based on the text to be synthesized. Second acoustic features of the target template audio are extracted. Target acoustic features are generated by splicing the first acoustic features and the second acoustic features. Speech synthesis is performed on the text to be synthesized based on the target acoustic features and the speech features, to generate a target speech of the text to be synthesized.

Type: Application

Filed: August 17, 2022

Publication date: January 5, 2023

Inventors: Zhengkun GAO, Junteng ZHANG, Tao SUN, Lei JIA

Method of registering attribute in speech synthesis model, apparatus of registering attribute in speech synthesis model, electronic device, and medium

Publication number: 20220076657

Abstract: A method of registering an attribute in a speech synthesis model, an apparatus of registering an attribute in a speech synthesis model, an electronic device, and a medium are provided, which relate to a field of an artificial intelligence technology such as a deep learning and intelligent speech technology. The method includes: acquiring a plurality of data associated with an attribute to be registered; and registering the attribute in the speech synthesis model by using the plurality of data associated with the attribute, wherein the speech synthesis model is trained in advance by using a training data in a training data set.

Type: Application

Filed: November 16, 2021

Publication date: March 10, 2022

Inventors: Wenfu WANG, Xilei WANG, Tao SUN, Han YUAN, Zhengkun GAO, Lei JIA

Method and apparatus of synthesizing speech, method and apparatus of training speech synthesis model, electronic device, and storage medium

Publication number: 20220020356

Abstract: The present disclosure provides a method and apparatus of synthesizing a speech, a method and apparatus of training a speech synthesis model, an electronic device, and a storage medium. The method of synthesizing a speech includes acquiring a style information of a speech to be synthesized, a tone information of the speech to be synthesized, and a content information of a text to be processed; generating an acoustic feature information of the text to be processed, by using a pre-trained speech synthesis model, based on the style information, the tone information, and the content information of the text to be processed; and synthesizing the speech for the text to be processed, based on the acoustic feature information of the text to be processed.

Type: Application

Filed: September 29, 2021

Publication date: January 20, 2022

Inventors: Wenfu WANG, Tao SUN, Xilei WANG, Junteng ZHANG, Zhengkun GAO, Lei JIA

Method and apparatus for training model, method and apparatus for synthesizing speech, device and storage medium

Publication number: 20210390943

Abstract: The present disclosure discloses a method and apparatus for training a model, a method and apparatus for synthesizing a speech, a device and a storage medium, and relates to the field of natural language processing and deep learning technology. The method for training a model may include: determining a phoneme feature and a prosodic word boundary feature of sample text data; inserting a pause character into the phoneme feature according to the prosodic word boundary feature to obtain a combined feature of the sample text data; and training an initial speech synthesis model according to the combined feature of the sample text data, to obtain a target speech synthesis model.

Type: Application

Filed: December 3, 2020

Publication date: December 16, 2021

Inventors: Zhengkun GAO, Junteng ZHANG, Wenfu WANG, Tao SUN