Patents by Inventor Zhi Wei Shuang

Zhi Wei Shuang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speech translation method and apparatus utilizing prosodic information

Patent number: 9342509

Abstract: A method and apparatus for speech translation. The method includes: receiving a source speech; extracting non-text information in the source speech; translating the source speech into a target speech; and adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. The apparatus includes: a receiving module for receiving source speech; an extracting module for extracting non-text information in the source speech; a translation module for translating the source speech into a target speech; and an adjusting module for adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech.

Type: Grant

Filed: October 30, 2009

Date of Patent: May 17, 2016

Assignee: Nuance Communications, Inc.

Inventors: Fan Ping Meng, Yong Qin, Zhi Wei Shuang, Shi Lei Zhang

Audio archive generation and presentation

Patent number: 9210263

Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.

Type: Grant

Filed: April 9, 2015

Date of Patent: December 8, 2015

Assignee: International Business Machines Corporation

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
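
The flow in this abstract (store each recorded answer under its question's field, then combine the stored data sets into one audio archive file) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not a detail from the patent: the `InformationForm` class, the `build_archive` function, the choice of WAV as the archive format, and the 8 kHz 16-bit mono audio settings.

```python
import io
import wave
from dataclasses import dataclass, field

SAMPLE_RATE = 8000   # assumed: telephone-quality mono PCM
SAMPLE_WIDTH = 2     # assumed: 16-bit samples


@dataclass
class InformationForm:
    """Hypothetical form: category choices plus questions answered by voice."""
    category_choices: list[str]
    questions: list[str]
    # Recorded answers, stored under the question (field) they belong to.
    answers: dict[str, bytes] = field(default_factory=dict)

    def store_answer(self, question: str, pcm: bytes) -> None:
        if question not in self.questions:
            raise KeyError(f"unknown question: {question}")
        self.answers[question] = pcm


def build_archive(form: InformationForm) -> bytes:
    """Combine the stored answers, in question order, into one WAV file."""
    combined = b"".join(
        form.answers[q] for q in form.questions if q in form.answers
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(combined)
    return buffer.getvalue()
```

Answers are combined in the order the form lists its questions rather than the order in which they were recorded, so the archive follows the structure of the form.
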
AUDIO ARCHIVE GENERATION AND PRESENTATION

Publication number: 20150215458

Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.

Type: Application

Filed: April 9, 2015

Publication date: July 30, 2015

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang

Audio archive generation and presentation

Patent number: 9025736

Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.

Type: Grant

Filed: February 4, 2008

Date of Patent: May 5, 2015

Assignee: International Business Machines Corporation

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang

Method and apparatus for speech analysis and synthesis

Patent number: 8280739

Abstract: The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering.

Type: Grant

Filed: April 3, 2008

Date of Patent: October 2, 2012

Assignee: Nuance Communications, Inc.

Inventors: Dan Ning Jiang, Fan Ping Meng, Yong Qin, Zhi Wei Shuang
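
The estimation step in this abstract (treat the speech signal as the output of a vocal tract filter driven by the DEGG/EGG signal, and track the filter's state vector with Kalman filtering) can be sketched as follows. This is a minimal illustration under assumptions the abstract does not state: the filter is modelled as a short FIR filter, its coefficients follow a random walk, and the names `kalman_track_filter`, `order`, `q`, and `r` are hypothetical.

```python
def kalman_track_filter(source, speech, order=2, q=1e-6, r=1e-3):
    """Estimate FIR filter coefficients with a scalar-observation Kalman filter.

    State x: the `order` filter coefficients, modelled as a slow random walk
    with process-noise variance q (an assumption of this sketch).
    Observation: speech[t] = sum_k x[k] * source[t - k] + noise (variance r).
    Returns the list of state vectors, one per time point.
    """
    x = [0.0] * order
    # Large initial covariance: the coefficients are unknown at the start.
    P = [[100.0 if i == j else 0.0 for j in range(order)] for i in range(order)]
    states = []
    for t in range(len(speech)):
        # Regressor: the most recent `order` source samples (zero before start).
        h = [source[t - k] if t - k >= 0 else 0.0 for k in range(order)]
        # Predict: the state is a random walk, so only the covariance grows.
        for i in range(order):
            P[i][i] += q
        # Update with the scalar observation speech[t].
        Ph = [sum(P[i][j] * h[j] for j in range(order)) for i in range(order)]
        s = sum(h[i] * Ph[i] for i in range(order)) + r
        gain = [v / s for v in Ph]
        innovation = speech[t] - sum(h[i] * x[i] for i in range(order))
        x = [x[i] + gain[i] * innovation for i in range(order)]
        # P <- P - K h^T P, using the symmetry of P (h^T P equals Ph transposed).
        P = [[P[i][j] - gain[i] * Ph[j] for j in range(order)]
             for i in range(order)]
        states.append(list(x))
    return states
```

Passing the DEGG/EGG samples as `source` and the speech samples as `speech` returns one coefficient vector per sample; picking entries at chosen indices corresponds to the "selected time points" in the abstract.
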
Voice conversion method and system

Patent number: 8234110

Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to obtain speech information; performing spectral conversion based on said speech information, to obtain at least a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.

Type: Grant

Filed: September 29, 2008

Date of Patent: July 31, 2012

Assignee: Nuance Communications, Inc.

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
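
The middle three steps of this abstract (spectral conversion, unit selection against the converted spectrum, and partial spectrum replacement) can be sketched as below. The sketch makes several assumptions that are not in the abstract: spectra are plain lists of bin magnitudes, the conversion is a per-bin scaling, unit selection is a nearest-neighbour search by squared Euclidean distance, and the replaced part is every bin from index `replace_from` upward. Speech analysis and reconstruction are left out, and all function and parameter names are hypothetical.

```python
def convert_spectrum(frame, scale):
    """Stand-in spectral conversion: per-bin scaling (an assumption)."""
    return [a * s for a, s in zip(frame, scale)]


def select_unit(target_frame, target_units):
    """Pick the target-speaker unit closest to the converted frame."""
    return min(
        target_units,
        key=lambda unit: sum((u - t) ** 2 for u, t in zip(unit, target_frame)),
    )


def voice_convert(source_frames, scale, target_units, replace_from):
    """Convert each source frame, then replace bins >= replace_from
    with the corresponding bins of the selected target-speaker unit."""
    output = []
    for frame in source_frames:
        first = convert_spectrum(frame, scale)    # the "first spectrum"
        unit = select_unit(first, target_units)   # unit selection
        output.append(first[:replace_from] + unit[replace_from:])
    return output
```

Using the converted spectrum as the selection target means the chosen unit is already close to the conversion result, so the spliced bins come from real target-speaker speech without departing far from the converted frame.
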
Method and apparatus for automatically converting voice

Patent number: 8170878

Abstract: The invention proposes a method and apparatus for significantly improving the quality of voice morphing and guaranteeing the similarity of converted voice. The invention sets several standard speakers in a TTS database, and selects the voices of different standard speakers for speech synthesis according to different roles, wherein the voice of the selected standard speaker is similar to the original role to a certain extent. Then the invention further performs voice morphing on the standard voice similar to the original voice to a certain extent, in order to accurately mimic the voice of the original speaker, so as to make the converted voice closer to the original voice features while guaranteeing the similarity.

Type: Grant

Filed: July 29, 2008

Date of Patent: May 1, 2012

Assignee: International Business Machines Corporation

Inventors: Yi Liu, Yong Qin, Qin Shi, Zhi Wei Shuang

Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis

Patent number: 7716052

Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.

Type: Grant

Filed: April 7, 2005

Date of Patent: May 11, 2010

Assignee: Nuance Communications, Inc.

Inventors: Andrew S. Aaron, Ellen M. Eide, Wael M. Hamza, Michael A. Picheny, Charles T. Rutherfoord, Zhi Wei Shuang, Maria E. Smith
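
Cost-based selection over a multi-speaker segment database, as described in this abstract, can be sketched with a small dynamic-programming search. The specifics are assumptions made for illustration only: the `Segment` fields, the use of pitch as the sole acoustic attribute, the target and join cost definitions, and the fixed speaker-switch penalty are all hypothetical. The abstract states only that segments carry an attribute vector identifying the speaker and that at least one cost function drives the concatenation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    unit: str      # label of the speech unit this segment realizes
    speaker: str   # attribute-vector element identifying the speaker
    pitch: float   # illustrative acoustic attribute (Hz)


def target_cost(seg, wanted_pitch):
    return abs(seg.pitch - wanted_pitch)


def join_cost(prev, cur, speaker_penalty=50.0):
    # Penalize switching speakers between adjacent segments (assumed policy).
    switch = speaker_penalty if prev.speaker != cur.speaker else 0.0
    return switch + abs(prev.pitch - cur.pitch)


def select_segments(units, wanted_pitch, database):
    """Viterbi search for the cheapest segment sequence realizing `units`."""
    candidates = [[s for s in database if s.unit == u] for u in units]
    if any(not c for c in candidates):
        raise ValueError("no candidate segment for some unit")
    # best[i][j] = (total cost, back-pointer) for candidate j of unit i
    best = [[(target_cost(s, wanted_pitch), None) for s in candidates[0]]]
    for i in range(1, len(units)):
        row = []
        for cur in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, cur), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(cur, wanted_pitch), back))
        best.append(row)
    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(units) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]
```

Because the speaker identity is part of each segment's attributes, the join cost can discourage mid-word speaker changes while still allowing them when no same-speaker candidate fits well.
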
SPEECH TRANSLATION METHOD AND APPARATUS

Publication number: 20100114556

Abstract: A method and apparatus for speech translation. The method includes: receiving a source speech; extracting non-text information in the source speech; translating the source speech into a target speech; and adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. The apparatus includes: a receiving module for receiving source speech; an extracting module for extracting non-text information in the source speech; a translation module for translating the source speech into a target speech; and an adjusting module for adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech.

Type: Application

Filed: October 30, 2009

Publication date: May 6, 2010

Applicant: International Business Machines Corporation

Inventors: Fan Ping Meng, Yong Qin, Zhi Wei Shuang, Shi Lei Zhang

VOICE CONVERSION METHOD AND SYSTEM

Publication number: 20090089063

Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to obtain speech information; performing spectral conversion based on said speech information, to obtain at least a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.

Type: Application

Filed: September 29, 2008

Publication date: April 2, 2009

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang

Method and Apparatus for Automatically Converting Voice

Publication number: 20090037179

Abstract: The invention proposes a method and apparatus for significantly improving the quality of voice morphing and guaranteeing the similarity of converted voice. The invention sets several standard speakers in a TTS database, and selects the voices of different standard speakers for speech synthesis according to different roles, wherein the voice of the selected standard speaker is similar to the original role to a certain extent. Then the invention further performs voice morphing on the standard voice similar to the original voice to a certain extent, in order to accurately mimic the voice of the original speaker, so as to make the converted voice closer to the original voice features while guaranteeing the similarity.

Type: Application

Filed: July 29, 2008

Publication date: February 5, 2009

Applicant: International Business Machines Corporation

Inventors: Yi Liu, Yong Qin, Qin Shi, Zhi Wei Shuang

METHOD AND APPARATUS FOR SPEECH ANALYSIS AND SYNTHESIS

Publication number: 20080288258

Abstract: The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering.

Type: Application

Filed: April 3, 2008

Publication date: November 20, 2008

Applicant: International Business Machines Corporation

Inventors: Dan Ning Jiang, Fan Ping Meng, Yong Qin, Zhi Wei Shuang

AUDIO ARCHIVE GENERATION AND PRESENTATION

Publication number: 20080187109

Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.

Type: Application

Filed: February 4, 2008

Publication date: August 7, 2008

Applicant: International Business Machines Corporation

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang