Patents by Inventor Binggong Ding

Binggong Ding has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

CONTROLLED TRAINING AND USE OF TEXT-TO-SPEECH MODELS AND PERSONALIZED MODEL GENERATED VOICES

Publication number: 20220310058

Abstract: Systems are configured for generating text-to-speech data in a personalized voice by training a neural text-to-speech machine learning model on natural speech data collected from a particular user, validating the identity of the user from which data is collected, and authorizing requests from users to use the personalized voice in generating new speech data. The systems are further configured to train a machine learning model as a neural text-to-speech model with generated personalized speech data.

Type: Application

Filed: November 3, 2020

Publication date: September 29, 2022

Inventors: Sheng ZHAO, Li JIANG, Xuedong HUANG, Lijuan QIN, Lei HE, Binggong DING, Bo YAN, Chunling MA, Raunak OBEROI

Blending recorded speech with text-to-speech output for specific domains

Patent number: 8996377

Abstract: A text-to-speech (TTS) engine combines recorded speech with synthesized speech from a TTS synthesizer based on text input. The TTS engine receives the text input and identifies the domain for the speech (e.g. navigation, dialing, . . . ). The identified domain is used in selecting domain specific speech recordings (e.g. pre-recorded static phrases such as “turn left”, “turn right” . . . ) from the input text. The speech recordings are obtained based on the static phrases for the domain that are identified from the input text. The TTS engine blends the static phrases with the TTS output to smooth the acoustic trajectory of the input text. The prosody of the static phrases is used to create similar prosody in the TTS output.

Type: Grant

Filed: July 12, 2012

Date of Patent: March 31, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sheng Zhao, Peng Wang, Difei Gao, Yijian Wu, Binggong Ding, Shenghua Ye, Max Leung

BLENDING RECORDED SPEECH WITH TEXT-TO-SPEECH OUTPUT FOR SPECIFIC DOMAINS

Publication number: 20140019134

Abstract: A text-to-speech (TTS) engine combines recorded speech with synthesized speech from a TTS synthesizer based on text input. The TTS engine receives the text input and identifies the domain for the speech (e.g. navigation, dialing, . . . ). The identified domain is used in selecting domain specific speech recordings (e.g. pre-recorded static phrases such as “turn left”, “turn right” . . . ) from the input text. The speech recordings are obtained based on the static phrases for the domain that are identified from the input text. The TTS engine blends the static phrases with the TTS output to smooth the acoustic trajectory of the input text. The prosody of the static phrases is used to create similar prosody in the TTS output.

Type: Application

Filed: July 12, 2012

Publication date: January 16, 2014

Applicant: Microsoft Corporation

Inventors: Sheng Zhao, Peng Wang, Difei Gao, Yijian Wu, Binggong Ding, Shenghua Ye, Max Leung

Techniques to create a custom voice font

Patent number: 8332225

Abstract: Techniques to create and share custom voice fonts are described. An apparatus may include a preprocessing component to receive voice audio data and a corresponding text script from a client and to process the voice audio data to produce prosody labels and a rich script. The apparatus may further include a verification component to automatically verify the voice audio data and the text script. The apparatus may further include a training component to train a custom voice font from the verified voice audio data and rich script and to generate custom voice font data usable by the TTS component. Other embodiments are described and claimed.

Type: Grant

Filed: June 4, 2009

Date of Patent: December 11, 2012

Assignee: Microsoft Corporation

Inventors: Sheng Zhao, Zhi Li, Shenghao Qin, Chiwei Che, Jingyang Xu, Binggong Ding

TECHNIQUES TO CREATE A CUSTOM VOICE FONT

Publication number: 20100312563

Abstract: Techniques to create and share custom voice fonts are described. An apparatus may include a preprocessing component to receive voice audio data and a corresponding text script from a client and to process the voice audio data to produce prosody labels and a rich script. The apparatus may further include a verification component to automatically verify the voice audio data and the text script. The apparatus may further include a training component to train a custom voice font from the verified voice audio data and rich script and to generate custom voice font data usable by the TTS component. Other embodiments are described and claimed.

Type: Application

Filed: June 4, 2009

Publication date: December 9, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Sheng Zhao, Zhi Li, Shenghao Qin, Chiwei Che, Jingyang Xu, Binggong Ding
