Patents by Inventor Zhi Wei Shuang
Zhi Wei Shuang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9342509
Abstract: A method and apparatus for speech translation. The method includes: receiving a source speech; extracting non-text information in the source speech; translating the source speech into a target speech; and adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. The apparatus includes: a receiving module for receiving source speech; an extracting module for extracting non-text information in the source speech; a translation module for translating the source speech into a target speech; and an adjusting module for adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech.
Type: Grant
Filed: October 30, 2009
Date of Patent: May 17, 2016
Assignee: Nuance Communications, Inc.
Inventors: Fan Ping Meng, Yong Qin, Zhi Wei Shuang, Shi Lei Zhang
-
Patent number: 9210263
Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recoding information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. Each data set stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
Type: Grant
Filed: April 9, 2015
Date of Patent: December 8, 2015
Assignee: International Business Machines Corporation
Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Publication number: 20150215458
Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recoding information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. Each data set stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
Type: Application
Filed: April 9, 2015
Publication date: July 30, 2015
Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Patent number: 9025736
Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recoding information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. Each data set stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
Type: Grant
Filed: February 4, 2008
Date of Patent: May 5, 2015
Assignee: International Business Machines Corporation
Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Patent number: 8280739
Abstract: The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering.
Type: Grant
Filed: April 3, 2008
Date of Patent: October 2, 2012
Assignee: Nuance Communications, Inc.
Inventors: Dan Ning Jiang, Fan Ping Meng, Yong Qin, Zhi Wei Shuang
-
Patent number: 8234110
Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.
Type: Grant
Filed: September 29, 2008
Date of Patent: July 31, 2012
Assignee: Nuance Communications, Inc.
Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Patent number: 8170878
Abstract: The invention proposes a method and apparatus for significantly improving the quality of voice morphing and guaranteeing the similarity of converted voice. The invention sets several standard speakers in a TTS database, and selects the voices of different standard speakers for speech synthesis according to different roles, wherein the voice of the selected standard speaker is similar to the original role to a certain extent. Then the invention further performs voice morphing on the standard voice similar to the original voice to a certain extent, in order to accurately mimic the voice of the original speaker, so as to make the converted voice closer to the original voice features while guaranteeing the similarity.
Type: Grant
Filed: July 29, 2008
Date of Patent: May 1, 2012
Assignee: International Business Machines Corporation
Inventors: Yi Liu, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Patent number: 7716052
Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
Type: Grant
Filed: April 7, 2005
Date of Patent: May 11, 2010
Assignee: Nuance Communications, Inc.
Inventors: Andrew S. Aaron, Ellen M. Eide, Wael M. Hamza, Michael A. Picheny, Charles T. Rutherfoord, Zhi Wei Shuang, Maria E. Smith
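Illustrative note: the abstract for patent 7716052 does not specify its cost function, so the following is only a minimal sketch of cost-based segment selection over a multi-speaker inventory, not the patented method. The `Segment` fields, the pitch-based costs, and the `speaker_switch_penalty` parameter are all hypothetical choices made for the example. Each segment carries a speaker attribute, and a Viterbi-style search picks the sequence with the lowest combined target and join cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    phone: str      # unit label this segment realizes
    speaker: str    # attribute identifying the source speaker
    pitch: float    # hypothetical acoustic feature (Hz)


def target_cost(seg, want_pitch):
    # How far the segment is from the desired pitch (assumed metric).
    return abs(seg.pitch - want_pitch)


def join_cost(prev, nxt, speaker_switch_penalty):
    # Pitch discontinuity at the join, plus a penalty when the
    # neighbouring segments come from different speakers.
    cost = abs(prev.pitch - nxt.pitch)
    if prev.speaker != nxt.speaker:
        cost += speaker_switch_penalty
    return cost


def select_segments(phones, inventory, want_pitch, speaker_switch_penalty=10.0):
    """Return the lowest-cost segment sequence for the given phones."""
    columns = [[s for s in inventory if s.phone == p] for p in phones]
    if not columns or any(not col for col in columns):
        raise ValueError("inventory lacks a segment for some phone")

    # best holds (accumulated cost, path) for each candidate in the current column.
    best = [(target_cost(s, want_pitch), [s]) for s in columns[0]]
    for col in columns[1:]:
        new_best = []
        for seg in col:
            cost, path = min(
                ((c + join_cost(p[-1], seg, speaker_switch_penalty), p)
                 for c, p in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(seg, want_pitch), path + [seg]))
        best = new_best
    return min(best, key=lambda item: item[0])[1]


inventory = [
    Segment("k", "A", 120.0), Segment("k", "B", 200.0),
    Segment("ae", "A", 125.0), Segment("ae", "B", 118.0),
    Segment("t", "A", 119.0), Segment("t", "B", 121.0),
]

same_speaker = select_segments(["k", "ae", "t"], inventory, want_pitch=120.0)
mixed = select_segments(["k", "ae", "t"], inventory, want_pitch=120.0,
                        speaker_switch_penalty=0.0)
print([s.speaker for s in same_speaker])
print([s.speaker for s in mixed])
```

With the default penalty the search stays with one speaker; setting the penalty to zero lets it mix speakers whenever a better pitch match is available, which is the trade-off a speaker attribute in the cost function makes adjustable.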
-
Publication number: 20100114556
Abstract: A method and apparatus for speech translation. The method includes: receiving a source speech; extracting non-text information in the source speech; translating the source speech into a target speech; and adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. The apparatus includes: a receiving module for receiving source speech; an extracting module for extracting non-text information in the source speech; a translation module for translating the source speech into a target speech; and an adjusting module for adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech.
Type: Application
Filed: October 30, 2009
Publication date: May 6, 2010
Applicant: International Business Machines Corporation
Inventors: Fan Ping Meng, Yong Qin, Zhi Wei Shuang, Shi Lei Zhang
-
Publication number: 20090089063
Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.
Type: Application
Filed: September 29, 2008
Publication date: April 2, 2009
Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Publication number: 20090037179
Abstract: The invention proposes a method and apparatus for significantly improving the quality of voice morphing and guaranteeing the similarity of converted voice. The invention sets several standard speakers in a TTS database, and selects the voices of different standard speakers for speech synthesis according to different roles, wherein the voice of the selected standard speaker is similar to the original role to a certain extent. Then the invention further performs voice morphing on the standard voice similar to the original voice to a certain extent, in order to accurately mimic the voice of the original speaker, so as to make the converted voice closer to the original voice features while guaranteeing the similarity.
Type: Application
Filed: July 29, 2008
Publication date: February 5, 2009
Applicant: International Business Machines Corporation
Inventors: Yi Liu, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Publication number: 20080288258
Abstract: The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering.
Type: Application
Filed: April 3, 2008
Publication date: November 20, 2008
Applicant: International Business Machines Corporation
Inventors: Dan Ning Jiang, Fan Ping Meng, Yong Qin, Zhi Wei Shuang
-
Publication number: 20080187109
Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recoding information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. Each data set stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
Type: Application
Filed: February 4, 2008
Publication date: August 7, 2008
Applicant: International Business Machines Corporation
Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang