Patents by Inventor Hong-Kwang Kuo

Hong-Kwang Kuo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11908454
    Abstract: A processor-implemented method trains an automatic speech recognition system using speech data and text data. A computer device receives speech data, and generates a spectrogram based on the speech data. The computing device receives text data associated with an entire corpus of text data, and generates a textogram based upon the text data. The computing device trains an automatic speech recognition system using the spectrogram and the textogram.
    Type: Grant
    Filed: December 1, 2021
    Date of Patent: February 20, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel Thomas, Hong-Kwang Kuo, Brian E. D. Kingsbury, George Andrei Saon, Gakuto Kurata
  • Patent number: 11900922
    Abstract: Embodiments of the present invention provide computer implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from limited amount of speech to text training data in a single language. Embodiments of the present invention can locate speech to text training data in one or more other languages using the accessed one or more intents and associated entities to locate speech to text training data in the one or more other languages different than the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech to text training data in the single language and the located speech to text training data in the one or more other languages.
    Type: Grant
    Filed: November 10, 2020
    Date of Patent: February 13, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny
  • Publication number: 20230298596
    Abstract: Systems, computer-implemented methods, and computer program products to facilitate end to end integration of dialogue history for spoken language understanding are provided. According to an embodiment, a system can comprise a processor that executes components stored in memory. The computer executable components comprise a conversation component that encodes speech-based content of an utterance and text-based content of the utterance into a uniform representation.
    Type: Application
    Filed: March 18, 2022
    Publication date: September 21, 2023
    Inventors: Samuel Thomas, Vishal Sunder, Hong-Kwang Kuo, Jatin Ganhotra, Brian E. D. Kingsbury, Eric Fosler-Lussier
  • Publication number: 20230169954
    Abstract: A processor-implemented method trains an automatic speech recognition system using speech data and text data. A computer device receives speech data, and generates a spectrogram based on the speech data. The computing device receives text data associated with an entire corpus of text data, and generates a textogram based upon the text data. The computing device trains an automatic speech recognition system using the spectrogram and the textogram.
    Type: Application
    Filed: December 1, 2021
    Publication date: June 1, 2023
    Inventors: SAMUEL THOMAS, HONG-KWANG KUO, BRIAN E.D. KINGSBURY, GEORGE ANDREI SAON, KAGUTO KURATA
  • Publication number: 20230081306
    Abstract: Training data can be received, which can include pairs of speech and meaning representation associated with the speech as ground truth data. The meaning representation includes at least semantic entities associated with the speech, where the spoken order of the semantic entities is unknown. The semantic entities of the meaning representation in the training data can be reordered into spoken order of the associated speech using an alignment technique. A spoken language understanding machine learning model can be trained using the pairs of speech and meaning representation having the reordered semantic entities. The meaning representation, e.g., semantic entities, in the received training data can be perturbed to create random order sequence variations of the semantic entities associated with speech. Perturbed meaning representation with associated speech can augment the training data.
    Type: Application
    Filed: August 27, 2021
    Publication date: March 16, 2023
    Inventors: Hong-Kwang Kuo, Zoltan Tueske, Samuel Thomas, Brian E. D. Kingsbury, George Andrei Saon
  • Publication number: 20230056680
    Abstract: Audio signals representing a current utterance in a conversation and a dialog history including at least information associated with past utterances corresponding to the current utterance in the conversation can be received. The dialog history can be encoded into an embedding. A spoken language understanding neural network model can be trained to perform a spoken language understanding task based on input features including at least speech features associated with the received audio signals and the embedding. An encoder can also be trained to encode a given dialog history into an embedding. The spoken language understanding task can include predicting a dialog action of an utterance. The spoken language understanding task can include predicting a dialog intent or overall topic of the conversation.
    Type: Application
    Filed: August 18, 2021
    Publication date: February 23, 2023
    Inventors: Samuel Thomas, Jatin Ganhotra, Hong-Kwang Kuo, Sachindra Joshi, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury
  • Publication number: 20220319494
    Abstract: An approach to training an end-to-end spoken language understanding model may be provided. A pre-trained general automatic speech recognition model may be adapted to a domain specific spoken language understanding model. The pre-trained general automatic speech recognition model may be a recurrent neural network transducer model. The adaptation may provide transcription data annotated with spoken language understanding labels. Adaptation may include audio data may also be provided for in addition to verbatim transcripts annotated with spoken language understanding labels. The spoken language understanding labels may be entity and/or intent based with values associated with each label.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Inventors: Samuel Thomas, Hong-Kwang Kuo, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury
  • Publication number: 20220148581
    Abstract: Embodiments of the present invention provide computer implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from limited amount of speech to text training data in a single language. Embodiments of the present invention can locate speech to text training data in one or more other languages using the accessed one or more intents and associated entities to locate speech to text training data in the one or more other languages different than the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech to text training data in the single language and the located speech to text training data in the one or more other languages.
    Type: Application
    Filed: November 10, 2020
    Publication date: May 12, 2022
    Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny
  • Patent number: 9734823
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: August 15, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Publication number: 20160005398
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Application
    Filed: August 27, 2015
    Publication date: January 7, 2016
    Inventors: Brian E.D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Patent number: 9196243
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: November 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Publication number: 20150279358
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Application
    Filed: March 31, 2014
    Publication date: October 1, 2015
    Applicant: International Business Machines Corporation
    Inventors: Brian E.D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Patent number: 8977549
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, and action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: March 10, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Publication number: 20140032217
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, and action classification engine, and a control module.
    Type: Application
    Filed: September 26, 2013
    Publication date: January 30, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 8571869
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, and action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: October 29, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 7912727
    Abstract: An apparatus and method that integrates both phrase-based and free-form speech-to-speech translation approaches using probability models. The starting step of the method is to receive vocal communication in a source language. Then store the received vocal communication. Then decipher the content of the vocal communication. Then locate in a multilingual dictionary module the corresponding translation of the deciphered vocal communication provided a preset sentence exists in a speech recognition module for the vocal communication. Then translate the vocal communication into the target language provided there is no corresponding translation located in the multilingual dictionary module. Then synthesize the translated target language when there is no corresponding translation for the vocal communication in the multilingual dictionary module. Then store the sound of the translated target language. Then play the sound of the translated target language.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: March 22, 2011
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo
  • Patent number: 7848915
    Abstract: A concept-based back translation system includes a target language semantic parser module, a source language semantic parser module, a bi-directional machine translation module, a relevancy judging module, and a back translation display module.
    Type: Grant
    Filed: August 9, 2006
    Date of Patent: December 7, 2010
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo, Bowen Zhou
  • Publication number: 20100274552
    Abstract: A concept-based back translation system includes a target language semantic parser module, a source language semantic parser module, a bi-directional machine translation module, a relevancy judging module, and a back translation display module.
    Type: Application
    Filed: August 9, 2006
    Publication date: October 28, 2010
    Applicant: International Business Machines Corporation
    Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo, Bowen Zhou
  • Patent number: 7574358
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, and action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: February 28, 2005
    Date of Patent: August 11, 2009
    Assignee: International Business Machines Corporation
    Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Publication number: 20090055160
    Abstract: An apparatus and method that integrates both phrase-based and free-form speech-to-speech translation approaches using probability models. The starting step of the method is to receive vocal communication in a source language. Then store the received vocal communication. Then decipher the content of the vocal communication. Then locate in a multilingual dictionary module the corresponding translation of the deciphered vocal communication provided a preset sentence exists in a speech recognition module for the vocal communication. Then translate the vocal communication into the target language provided there is no corresponding translation located in the multilingual dictionary module. Then synthesize the translated target language when there is no corresponding translation for the vocal communication in the multilingual dictionary module. Then store the sound of the translated target language. Then play the sound of the translated target language.
    Type: Application
    Filed: May 29, 2008
    Publication date: February 26, 2009
    Applicant: International Business Machines Corporation
    Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo