Patents by Inventor Hong-Kwang Kuo

Hong-Kwang Kuo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Integrating text inputs for training and adapting neural network transducer ASR models

Patent number: 11908454

Abstract: A processor-implemented method trains an automatic speech recognition system using speech data and text data. A computer device receives speech data, and generates a spectrogram based on the speech data. The computing device receives text data associated with an entire corpus of text data, and generates a textogram based upon the text data. The computing device trains an automatic speech recognition system using the spectrogram and the textogram.

Type: Grant

Filed: December 1, 2021

Date of Patent: February 20, 2024

Assignee: International Business Machines Corporation

Inventors: Samuel Thomas, Hong-Kwang Kuo, Brian E. D. Kingsbury, George Andrei Saon, Gakuto Kurata

Multilingual intent recognition

Patent number: 11900922

Abstract: Embodiments of the present invention provide computer-implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from a limited amount of speech to text training data in a single language. Embodiments of the present invention can use the accessed one or more intents and associated entities to locate speech to text training data in one or more other languages different than the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech to text training data in the single language and the located speech to text training data in the one or more other languages.

Type: Grant

Filed: November 10, 2020

Date of Patent: February 13, 2024

Assignee: International Business Machines Corporation

Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny

END-TO-END INTEGRATION OF DIALOG HISTORY FOR SPOKEN LANGUAGE UNDERSTANDING

Publication number: 20230298596

Abstract: Systems, computer-implemented methods, and computer program products to facilitate end to end integration of dialogue history for spoken language understanding are provided. According to an embodiment, a system can comprise a processor that executes components stored in memory. The computer executable components comprise a conversation component that encodes speech-based content of an utterance and text-based content of the utterance into a uniform representation.

Type: Application

Filed: March 18, 2022

Publication date: September 21, 2023

Inventors: Samuel Thomas, Vishal Sunder, Hong-Kwang Kuo, Jatin Ganhotra, Brian E. D. Kingsbury, Eric Fosler-Lussier

INTEGRATING TEXT INPUTS FOR TRAINING AND ADAPTING NEURAL NETWORK TRANSDUCER ASR MODELS

Publication number: 20230169954

Abstract: A processor-implemented method trains an automatic speech recognition system using speech data and text data. A computer device receives speech data, and generates a spectrogram based on the speech data. The computing device receives text data associated with an entire corpus of text data, and generates a textogram based upon the text data. The computing device trains an automatic speech recognition system using the spectrogram and the textogram.

Type: Application

Filed: December 1, 2021

Publication date: June 1, 2023

Inventors: SAMUEL THOMAS, HONG-KWANG KUO, BRIAN E.D. KINGSBURY, GEORGE ANDREI SAON, GAKUTO KURATA

TRAINING END-TO-END SPOKEN LANGUAGE UNDERSTANDING SYSTEMS WITH UNORDERED ENTITIES

Publication number: 20230081306

Abstract: Training data can be received, which can include pairs of speech and meaning representation associated with the speech as ground truth data. The meaning representation includes at least semantic entities associated with the speech, where the spoken order of the semantic entities is unknown. The semantic entities of the meaning representation in the training data can be reordered into spoken order of the associated speech using an alignment technique. A spoken language understanding machine learning model can be trained using the pairs of speech and meaning representation having the reordered semantic entities. The meaning representation, e.g., semantic entities, in the received training data can be perturbed to create random order sequence variations of the semantic entities associated with speech. Perturbed meaning representation with associated speech can augment the training data.

Type: Application

Filed: August 27, 2021

Publication date: March 16, 2023

Inventors: Hong-Kwang Kuo, Zoltan Tueske, Samuel Thomas, Brian E. D. Kingsbury, George Andrei Saon

INTEGRATING DIALOG HISTORY INTO END-TO-END SPOKEN LANGUAGE UNDERSTANDING SYSTEMS

Publication number: 20230056680

Abstract: Audio signals representing a current utterance in a conversation and a dialog history including at least information associated with past utterances corresponding to the current utterance in the conversation can be received. The dialog history can be encoded into an embedding. A spoken language understanding neural network model can be trained to perform a spoken language understanding task based on input features including at least speech features associated with the received audio signals and the embedding. An encoder can also be trained to encode a given dialog history into an embedding. The spoken language understanding task can include predicting a dialog action of an utterance. The spoken language understanding task can include predicting a dialog intent or overall topic of the conversation.

Type: Application

Filed: August 18, 2021

Publication date: February 23, 2023

Inventors: Samuel Thomas, Jatin Ganhotra, Hong-Kwang Kuo, Sachindra Joshi, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury

END TO END SPOKEN LANGUAGE UNDERSTANDING MODEL

Publication number: 20220319494

Abstract: An approach to training an end-to-end spoken language understanding model may be provided. A pre-trained general automatic speech recognition model may be adapted to a domain specific spoken language understanding model. The pre-trained general automatic speech recognition model may be a recurrent neural network transducer model. The adaptation may provide transcription data annotated with spoken language understanding labels. Audio data may also be provided for adaptation, in addition to verbatim transcripts annotated with spoken language understanding labels. The spoken language understanding labels may be entity and/or intent based with values associated with each label.

Type: Application

Filed: March 31, 2021

Publication date: October 6, 2022

Inventors: Samuel Thomas, Hong-Kwang Kuo, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury

MULTILINGUAL INTENT RECOGNITION

Publication number: 20220148581

Abstract: Embodiments of the present invention provide computer-implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from a limited amount of speech to text training data in a single language. Embodiments of the present invention can use the accessed one or more intents and associated entities to locate speech to text training data in one or more other languages different than the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech to text training data in the single language and the located speech to text training data in the one or more other languages.

Type: Application

Filed: November 10, 2020

Publication date: May 12, 2022

Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny

Method and system for efficient spoken term detection using confusion networks

Patent number: 9734823

Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.

Type: Grant

Filed: August 27, 2015

Date of Patent: August 15, 2017

Assignee: International Business Machines Corporation

Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau

METHOD AND SYSTEM FOR EFFICIENT SPOKEN TERM DETECTION USING CONFUSION NETWORKS

Publication number: 20160005398

Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.

Type: Application

Filed: August 27, 2015

Publication date: January 7, 2016

Inventors: Brian E.D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau

Method and system for efficient spoken term detection using confusion networks

Patent number: 9196243

Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.

Type: Grant

Filed: March 31, 2014

Date of Patent: November 24, 2015

Assignee: International Business Machines Corporation

Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau

METHOD AND SYSTEM FOR EFFICIENT SPOKEN TERM DETECTION USING CONFUSION NETWORKS

Publication number: 20150279358

Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.

Type: Application

Filed: March 31, 2014

Publication date: October 1, 2015

Applicant: International Business Machines Corporation

Inventors: Brian E.D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau

Natural language system and method based on unisolated performance metric

Patent number: 8977549

Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.

Type: Grant

Filed: September 26, 2013

Date of Patent: March 10, 2015

Assignee: Nuance Communications, Inc.

Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu

NATURAL LANGUAGE SYSTEM AND METHOD BASED ON UNISOLATED PERFORMANCE METRIC

Publication number: 20140032217

Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module.

Type: Application

Filed: September 26, 2013

Publication date: January 30, 2014

Applicant: Nuance Communications, Inc.

Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu

Natural language system and method based on unisolated performance metric

Patent number: 8571869

Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.

Type: Grant

Filed: May 15, 2008

Date of Patent: October 29, 2013

Assignee: Nuance Communications, Inc.

Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu

Apparatus and method for integrated phrase-based and free-form speech-to-speech translation

Patent number: 7912727

Abstract: An apparatus and method that integrates both phrase-based and free-form speech-to-speech translation approaches using probability models. The starting step of the method is to receive vocal communication in a source language. Then store the received vocal communication. Then decipher the content of the vocal communication. Then locate in a multilingual dictionary module the corresponding translation of the deciphered vocal communication provided a preset sentence exists in a speech recognition module for the vocal communication. Then translate the vocal communication into the target language provided there is no corresponding translation located in the multilingual dictionary module. Then synthesize the translated target language when there is no corresponding translation for the vocal communication in the multilingual dictionary module. Then store the sound of the translated target language. Then play the sound of the translated target language.

Type: Grant

Filed: May 29, 2008

Date of Patent: March 22, 2011

Assignee: International Business Machines Corporation

Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo

Apparatus for providing feedback of translation quality using concept-based back translation

Patent number: 7848915

Abstract: A concept-based back translation system includes a target language semantic parser module, a source language semantic parser module, a bi-directional machine translation module, a relevancy judging module, and a back translation display module.

Type: Grant

Filed: August 9, 2006

Date of Patent: December 7, 2010

Assignee: International Business Machines Corporation

Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo, Bowen Zhou

APPARATUS FOR PROVIDING FEEDBACK OF TRANSLATION QUALITY USING CONCEPT-BASED BACK TRANSLATION

Publication number: 20100274552

Abstract: A concept-based back translation system includes a target language semantic parser module, a source language semantic parser module, a bi-directional machine translation module, a relevancy judging module, and a back translation display module.

Type: Application

Filed: August 9, 2006

Publication date: October 28, 2010

Applicant: International Business Machines Corporation

Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo, Bowen Zhou

Natural language system and method based on unisolated performance metric

Patent number: 7574358

Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.

Type: Grant

Filed: February 28, 2005

Date of Patent: August 11, 2009

Assignee: International Business Machines Corporation

Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu

Apparatus And Method For Integrated Phrase-Based And Free-Form Speech-To-Speech Translation

Publication number: 20090055160

Abstract: An apparatus and method that integrates both phrase-based and free-form speech-to-speech translation approaches using probability models. The starting step of the method is to receive vocal communication in a source language. Then store the received vocal communication. Then decipher the content of the vocal communication. Then locate in a multilingual dictionary module the corresponding translation of the deciphered vocal communication provided a preset sentence exists in a speech recognition module for the vocal communication. Then translate the vocal communication into the target language provided there is no corresponding translation located in the multilingual dictionary module. Then synthesize the translated target language when there is no corresponding translation for the vocal communication in the multilingual dictionary module. Then store the sound of the translated target language. Then play the sound of the translated target language.

Type: Application

Filed: May 29, 2008

Publication date: February 26, 2009

Applicant: International Business Machines Corporation

Inventors: Yuqing Gao, Liang Gu, Hong-Kwang Kuo
