Patents by Inventor Mei-Yuh Hwang

Mei-Yuh Hwang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Intent recognition and emotional text-to-speech learning

Patent number: 11727914

Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.

Type: Grant

Filed: December 24, 2021

Date of Patent: August 15, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
INTENT RECOGNITION AND EMOTIONAL TEXT-TO-SPEECH LEARNING

Publication number: 20220122580

Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.

Type: Application

Filed: December 24, 2021

Publication date: April 21, 2022

Inventors: Pei ZHAO, Kaisheng YAO, Max LEUNG, Bo YAN, Jian LUAN, Yu SHI, Malone MA, Mei-Yuh HWANG
Intent recognition and emotional text-to-speech learning

Patent number: 11238842

Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.

Type: Grant

Filed: June 7, 2017

Date of Patent: February 1, 2022

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
INTENT RECOGNITION AND EMOTIONAL TEXT-TO-SPEECH LEARNING

Publication number: 20210225357

Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.

Type: Application

Filed: June 7, 2017

Publication date: July 22, 2021

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Pei ZHAO, Kaisheng YAO, Max LEUNG, Bo YAN, Jian LUAN, Yu SHI, Malone MA, Mei-Yuh HWANG
Assignment of semantic labels to a sequence of words using neural network architectures

Patent number: 10867597

Abstract: Technologies pertaining to slot filling are described herein. A deep neural network, a recurrent neural network, and/or a spatio-temporally deep neural network are configured to assign labels to words in a word sequence set forth in natural language. At least one label is a semantic label that is assigned to at least one word in the word sequence.

Type: Grant

Filed: September 2, 2013

Date of Patent: December 15, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Anoop Deoras, Kaisheng Yao, Xiaodong He, Li Deng, Geoffrey Gerson Zweig, Ruhi Sarikaya, Dong Yu, Mei-Yuh Hwang, Gregoire Mesnil
Speech recognition using a foreign word grammar

Patent number: 10290299

Abstract: Systems and methods are utilized for recognizing speech that is partially in a foreign language. The systems and methods receive speech input from a user and detect if a rule or sentence entry grammar structure utilizing a foreign word has been uttered. To recognize the foreign word, a foreign word grammar is utilized. The foreign word grammar includes rules for recognizing the uttered foreign word. Two rules may be included in the foreign word grammar for each legitimate or slang term included in the foreign word grammar. A first rule corresponds to the spoken form of the foreign word, and the second rule corresponds to the spelling form of the foreign word. The foreign word grammar may also utilize a prefix tree. Upon recognizing the foreign word, the recognized foreign word may be sent to an application to retrieve the pronunciation, translation, or definition of the foreign word.

Type: Grant

Filed: July 17, 2014

Date of Patent: May 14, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Mei-Yuh Hwang, Hua Zhang
Statistical machine translation based search query spelling correction

Patent number: 10176168

Abstract: Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).

Type: Grant

Filed: November 15, 2011

Date of Patent: January 8, 2019

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Jianfeng Gao, Mei-Yuh Hwang, Xuedong D. Huang, Christopher Brian Quirk, Zhenghao Wang
Filled translation for bootstrapping language understanding of low-resourced languages

Patent number: 9613027

Abstract: Annotated training data (e.g., sentences) in a first language are used to generate annotated training data for a second language. For example, annotated sentences in English are manually collected first and are then used to generate annotated sentences in Chinese. The annotated training data includes slot labels, slot values and carrier phrases. The carrier phrases are the portions of the training data that are outside of a slot. The carrier phrases are translated from the first language to one or more translations in the second language. The translations may include machine translations as well as human translations. Entities for the slot values are determined for the translated sentences using content sources that include locale-dependent entities. The determined entities are used to fill the slots in the translations of the second language. All or a portion of the resulting sentences may be used for training models in the second language.

Type: Grant

Filed: November 7, 2013

Date of Patent: April 4, 2017

Assignee: Microsoft Technology Licensing, LLC

Inventors: Mei-Yuh Hwang, Yong Ni
SPEECH RECOGNITION USING A FOREIGN WORD GRAMMAR

Publication number: 20160267902

Abstract: Systems and methods are utilized for recognizing speech that is partially in a foreign language. The systems and methods receive speech input from a user and detect if a rule or sentence entry grammar structure utilizing a foreign word has been uttered. To recognize the foreign word, a foreign word grammar is utilized. The foreign word grammar includes rules for recognizing the uttered foreign word. Two rules may be included in the foreign word grammar for each legitimate or slang term included in the foreign word grammar. A first rule corresponds to the spoken form of the foreign word, and the second rule corresponds to the spelling form of the foreign word. The foreign word grammar may also utilize a prefix tree. Upon recognizing the foreign word, the recognized foreign word may be sent to an application to retrieve the pronunciation, translation, or definition of the foreign word.

Type: Application

Filed: July 17, 2014

Publication date: September 15, 2016

Applicant: MICROSOFT CORPORATION

Inventors: Mei-Yuh HWANG, Hua ZHANG
ADVANCED RECURRENT NEURAL NETWORK BASED LETTER-TO-SOUND

Publication number: 20150364127

Abstract: The technology relates to performing letter-to-sound conversion utilizing recurrent neural networks (RNNs). The RNNs may be implemented as RNN modules for letter-to-sound conversion. The RNN modules receive text input and convert the text to corresponding phonemes. In determining the corresponding phonemes, the RNN modules may analyze the letters of the text and the letters surrounding the text being analyzed. The RNN modules may also analyze the letters of the text in reverse order. The RNN modules may also receive contextual information about the input text. The letter-to-sound conversion may then also be based on the contextual information that is received. The determined phonemes may be utilized to generate synthesized speech from the input text.

Type: Application

Filed: June 13, 2014

Publication date: December 17, 2015

Applicant: MICROSOFT CORPORATION

Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Mei-Yuh Hwang, Sheng Zhao, Bo Yan, Geoffrey Zweig, Fileno A. Alleva
Filled Translation for Bootstrapping Language Understanding of Low-Resourced Languages

Publication number: 20150127319

Abstract: Annotated training data (e.g., sentences) in a first language are used to generate annotated training data for a second language. For example, annotated sentences in English are manually collected first and are then used to generate annotated sentences in Chinese. The annotated training data includes slot labels, slot values and carrier phrases. The carrier phrases are the portions of the training data that are outside of a slot. The carrier phrases are translated from the first language to one or more translations in the second language. The translations may include machine translations as well as human translations. Entities for the slot values are determined for the translated sentences using content sources that include locale-dependent entities. The determined entities are used to fill the slots in the translations of the second language. All or a portion of the resulting sentences may be used for training models in the second language.

Type: Application

Filed: November 7, 2013

Publication date: May 7, 2015

Applicant: Microsoft Corporation

Inventors: Mei-Yuh Hwang, Yong Ni
ASSIGNMENT OF SEMANTIC LABELS TO A SEQUENCE OF WORDS USING NEURAL NETWORK ARCHITECTURES

Publication number: 20150066496

Abstract: Technologies pertaining to slot filling are described herein. A deep neural network, a recurrent neural network, and/or a spatio-temporally deep neural network are configured to assign labels to words in a word sequence set forth in natural language. At least one label is a semantic label that is assigned to at least one word in the word sequence.

Type: Application

Filed: September 2, 2013

Publication date: March 5, 2015

Applicant: Microsoft Corporation

Inventors: Anoop Deoras, Kaisheng Yao, Xiaodong He, Li Deng, Geoffrey Gerson Zweig, Ruhi Sarikaya, Dong Yu, Mei-Yuh Hwang, Gregoire Mesnil
Statistical Machine Translation Based Search Query Spelling Correction

Publication number: 20130124492

Abstract: Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).

Type: Application

Filed: November 15, 2011

Publication date: May 16, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Jianfeng Gao, Mei-Yuh Hwang, Xuedong D. Huang, Christopher Brian Quirk, Zhenghao Wang
Automatic speech recognition learning using categorization and selective incorporation of user-initiated corrections

Patent number: 8280733

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

Type: Grant

Filed: September 17, 2010

Date of Patent: October 2, 2012

Assignee: Microsoft Corporation

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
Automatic speech recognition learning using user corrections

Patent number: 8019602

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

Type: Grant

Filed: January 20, 2004

Date of Patent: September 13, 2011

Assignee: Microsoft Corporation

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
AUTOMATIC SPEECH RECOGNITION LEARNING USING CATEGORIZATION AND SELECTIVE INCORPORATION OF USER-INITIATED CORRECTIONS

Publication number: 20110015927

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

Type: Application

Filed: September 17, 2010

Publication date: January 20, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
Generating large units of graphonemes with mutual information criterion for letter to sound conversion

Patent number: 7693715

Abstract: A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.

Type: Grant

Filed: March 10, 2004

Date of Patent: April 6, 2010

Assignee: Microsoft Corporation

Inventors: Mei-Yuh Hwang, Li Jiang
Method and apparatus for constructing and using syllable-like unit language models

Patent number: 7676365

Abstract: A method and computer-readable medium use syllable-like units (SLUs) to decode a pronunciation into a phonetic description. The syllable-like units are generally larger than a single phoneme but smaller than a word. The present invention provides a means for defining these syllable-like units and for generating a language model based on these syllable-like units that can be used in the decoding process. As SLUs are longer than phonemes, they contain more acoustic contextual clues and better lexical constraints for speech recognition. Thus, the phoneme accuracy produced from SLU recognition is much better than all-phone sequence recognition.

Type: Grant

Filed: April 20, 2005

Date of Patent: March 9, 2010

Assignee: Microsoft Corporation

Inventors: Mei-Yuh Hwang, Fileno A. Alleva, Rebecca C. Weiss
New-word pronunciation learning using a pronunciation graph

Patent number: 7590533

Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, a plurality of at least two possible phonetic descriptions are generated. One phonetic description is formed by decoding a speech signal representing a user's pronunciation of the word. At least one other phonetic description is generated from the text of the word. The plurality of possible sequences comprising speech-based and text-based phonetic descriptions are aligned and scored in a single graph based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.

Type: Grant

Filed: March 10, 2004

Date of Patent: September 15, 2009

Assignee: Microsoft Corporation

Inventor: Mei-Yuh Hwang
Generating a task-adapted acoustic model from one or more different corpora

Patent number: 7263487

Abstract: The present invention generates a task-dependent acoustic model from a supervised task-independent corpus and further adapts it with an unsupervised task-dependent corpus. The task-independent corpus includes task-independent training data which has an acoustic representation of words and a sequence of transcribed words corresponding to the acoustic representation. A relevance measure is defined for each of the words in the task-independent data. The relevance measure is used to weight the data associated with each of the words in the task-independent training data. The task-dependent acoustic model is then trained based on the weighted data for the words in the task-independent training data.

Type: Grant

Filed: September 29, 2005

Date of Patent: August 28, 2007

Assignee: Microsoft Corporation

Inventor: Mei-Yuh Hwang
