Patents by Inventor Zhi Wei Shuang

Zhi Wei Shuang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9342509
    Abstract: A method and apparatus for speech translation. The method includes: receiving a source speech; extracting non-text information in the source speech; translating the source speech into a target speech; and adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. The apparatus includes: a receiving module for receiving source speech; an extracting module for extracting non-text information in the source speech; a translation module for translating the source speech into a target speech; and an adjusting module for adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech.
    Type: Grant
    Filed: October 30, 2009
    Date of Patent: May 17, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Fan Ping Meng, Yong Qin, Zhi Wei Shuang, Shi Lei Zhang
  • Patent number: 9210263
    Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
    Type: Grant
    Filed: April 9, 2015
    Date of Patent: December 8, 2015
    Assignee: International Business Machines Corporation
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Publication number: 20150215458
    Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
    Type: Application
    Filed: April 9, 2015
    Publication date: July 30, 2015
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Patent number: 9025736
    Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
    Type: Grant
    Filed: February 4, 2008
    Date of Patent: May 5, 2015
    Assignee: International Business Machines Corporation
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Patent number: 8280739
    Abstract: The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering.
    Type: Grant
    Filed: April 3, 2008
    Date of Patent: October 2, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Dan Ning Jiang, Fan Ping Meng, Yong Qin, Zhi Wei Shuang
  • Patent number: 8234110
    Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: July 31, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Patent number: 8170878
    Abstract: The invention proposes a method and apparatus for significantly improving the quality of voice morphing and guaranteeing the similarity of converted voice. The invention sets several standard speakers in a TTS database, and selects the voices of different standard speakers for speech synthesis according to different roles, wherein the voice of the selected standard speaker is similar to the original role to a certain extent. Then the invention further performs voice morphing on the standard voice similar to the original voice to a certain extent, in order to accurately mimic the voice of the original speaker, so as to make the converted voice closer to the original voice features while guaranteeing the similarity.
    Type: Grant
    Filed: July 29, 2008
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventors: Yi Liu, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Patent number: 7716052
    Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
    Type: Grant
    Filed: April 7, 2005
    Date of Patent: May 11, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Andrew S. Aaron, Ellen M. Eide, Wael M. Hamza, Michael A. Picheny, Charles T. Rutherfoord, Zhi Wei Shuang, Maria E. Smith
  • Publication number: 20100114556
    Abstract: A method and apparatus for speech translation. The method includes: receiving a source speech; extracting non-text information in the source speech; translating the source speech into a target speech; and adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. The apparatus includes: a receiving module for receiving source speech; an extracting module for extracting non-text information in the source speech; a translation module for translating the source speech into a target speech; and an adjusting module for adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech.
    Type: Application
    Filed: October 30, 2009
    Publication date: May 6, 2010
    Applicant: International Business Machines Corporation
    Inventors: Fan Ping Meng, Yong Qin, Zhi Wei Shuang, Shi Lei Zhang
  • Publication number: 20090089063
    Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.
    Type: Application
    Filed: September 29, 2008
    Publication date: April 2, 2009
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Publication number: 20090037179
    Abstract: The invention proposes a method and apparatus for significantly improving the quality of voice morphing and guaranteeing the similarity of converted voice. The invention sets several standard speakers in a TTS database, and selects the voices of different standard speakers for speech synthesis according to different roles, wherein the voice of the selected standard speaker is similar to the original role to a certain extent. Then the invention further performs voice morphing on the standard voice similar to the original voice to a certain extent, in order to accurately mimic the voice of the original speaker, so as to make the converted voice closer to the original voice features while guaranteeing the similarity.
    Type: Application
    Filed: July 29, 2008
    Publication date: February 5, 2009
    Applicant: International Business Machines Corporation
    Inventors: Yi Liu, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Publication number: 20080288258
    Abstract: The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering.
    Type: Application
    Filed: April 3, 2008
    Publication date: November 20, 2008
    Applicant: International Business Machines Corporation
    Inventors: Dan Ning Jiang, Fan Ping Meng, Yong Qin, Zhi Wei Shuang
  • Publication number: 20080187109
    Abstract: A method, information processing system, and computer program storage product for automatically generating auditory archives in a customer service environment are disclosed. A communication link with an end user is established. An information form is retrieved. The information form includes at least a category choice information set and at least one audio recording information set. The end user is prompted to answer a set of questions based on information in the information form. A data set associated with each answer to each question in the set of questions given by the end user is stored. The data is stored under a set of fields corresponding to a question. The data sets stored under the set of fields for each question in the set of questions are combined with each other. An audio archive file is generated including the data sets that have been combined.
    Type: Application
    Filed: February 4, 2008
    Publication date: August 7, 2008
    Applicant: International Business Machines Corporation
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang