Patents by Inventor Mei-Yuh Hwang

Mei-Yuh Hwang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11727914
    Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text results and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
    Type: Grant
    Filed: December 24, 2021
    Date of Patent: August 15, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
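The abstract above describes feeding both recognized text and acoustic feature annotations into a single intent model. A minimal Python sketch of that idea follows; the feature names, thresholds, and scoring logic are all illustrative assumptions, not the patented model:

```python
# Hypothetical sketch: combine recognized text with acoustic feature
# annotations before intent classification. All names and thresholds
# here are invented for illustration.

def annotate_acoustics(pitch_hz, energy_db):
    """Map raw acoustic measurements to coarse annotations."""
    return {
        "high_pitch": pitch_hz > 200,
        "loud": energy_db > 70,
    }

def recognize_intent(text, acoustic_annotations):
    """Toy intent model: lexical cues plus acoustic cues.

    The same words can map to different intents depending on how
    they were spoken, which is the point of feeding both streams
    into one model.
    """
    if "help" in text.lower():
        if acoustic_annotations["loud"] and acoustic_annotations["high_pitch"]:
            return "urgent_assistance"
        return "general_question"
    return "unknown"

annotations = annotate_acoustics(pitch_hz=240, energy_db=78)
intent = recognize_intent("Help me please", annotations)
```

The design point is that the intent decision conditions jointly on what was said and how it was said, rather than on the transcript alone.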
  • Publication number: 20220122580
    Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text results and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
    Type: Application
    Filed: December 24, 2021
    Publication date: April 21, 2022
    Inventors: Pei ZHAO, Kaisheng YAO, Max LEUNG, Bo YAN, Jian LUAN, Yu SHI, Malone MA, Mei-Yuh HWANG
  • Patent number: 11238842
    Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text results and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: February 1, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
  • Publication number: 20210225357
    Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text results and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
    Type: Application
    Filed: June 7, 2017
    Publication date: July 22, 2021
    Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Pei ZHAO, Kaisheng YAO, Max LEUNG, Bo YAN, Jian LUAN, Yu SHI, Malone MA, Mei-Yuh HWANG
  • Patent number: 10867597
    Abstract: Technologies pertaining to slot filling are described herein. A deep neural network, a recurrent neural network, and/or a spatio-temporally deep neural network are configured to assign labels to words in a word sequence set forth in natural language. At least one label is a semantic label that is assigned to at least one word in the word sequence.
    Type: Grant
    Filed: September 2, 2013
    Date of Patent: December 15, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Anoop Deoras, Kaisheng Yao, Xiaodong He, Li Deng, Geoffrey Gerson Zweig, Ruhi Sarikaya, Dong Yu, Mei-Yuh Hwang, Gregoire Mesnil
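The slot-filling abstract above assigns a label, some of them semantic, to each word of a natural-language sequence. The sketch below shows only the input/output shape of that task; a lookup table stands in for the deep or recurrent network, and the IOB-style labels are conventional, not taken from the patent:

```python
# Illustrative slot filling: assign one label per word in a word
# sequence. The lookup table is a stand-in for the neural networks
# described in the patent.

SLOT_LEXICON = {
    "seattle": "B-destination",
    "tomorrow": "B-date",
}

def fill_slots(words):
    """Return one label per word; 'O' marks words outside any slot."""
    return [SLOT_LEXICON.get(w.lower(), "O") for w in words]

labels = fill_slots("show flights to Seattle tomorrow".split())
```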
  • Patent number: 10290299
    Abstract: Systems and methods are utilized for recognizing speech that is partially in a foreign language. The systems and methods receive speech input from a user and detect if a rule or sentence entry grammar structure utilizing a foreign word has been uttered. To recognize the foreign word, a foreign word grammar is utilized. The foreign word grammar includes rules for recognizing the uttered foreign word. Two rules may be included in the foreign word grammar for each legitimate or slang term included in the foreign word grammar. A first rule corresponds to the spoken form of the foreign word, and the second rule corresponds to the spelling form of the foreign word. The foreign word grammar may also utilize a prefix tree. Upon recognizing the foreign word, the recognized foreign word may be sent to an application to retrieve the pronunciation, translation, or definition of the foreign word.
    Type: Grant
    Filed: July 17, 2014
    Date of Patent: May 14, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mei-Yuh Hwang, Hua Zhang
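The grammar described above carries two rules per foreign term: one for its spoken form and one for its spelled-out form. A minimal sketch of that two-rule idea follows (the rule representation is invented for illustration, and the prefix-tree optimization mentioned in the abstract is omitted):

```python
# Sketch of the two-rules-per-term idea: each foreign term gets a
# spoken-form rule and a spelling-form (letter-by-letter) rule.
# The dictionary format is illustrative, not the patented grammar.

def build_rules(terms):
    rules = {}
    for term in terms:
        rules[term] = {
            "spoken": term,              # the word said as a word
            "spelling": " ".join(term),  # the word spelled out
        }
    return rules

def match(utterance, rules):
    """Return the term whose spoken or spelling form was uttered."""
    for term, forms in rules.items():
        if utterance in (forms["spoken"], forms["spelling"]):
            return term
    return None

rules = build_rules(["ciao"])
```

Matching either form to the same term is what lets a downstream application fetch the pronunciation, translation, or definition regardless of how the user uttered the word.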
  • Patent number: 10176168
    Abstract: Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).
    Type: Grant
    Filed: November 15, 2011
    Date of Patent: January 8, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Jianfeng Gao, Mei-Yuh Hwang, Xuedong D. Huang, Christopher Brian Quirk, Zhenghao Wang
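The abstract above combines probabilistic features from multiple query-correction models to score candidates. A hedged sketch of that log-linear scoring follows; the two toy feature functions, the weights, and the counts are stand-ins for the learned models trained from logged query-correction pairs:

```python
# Hedged sketch: score correction candidates by combining features
# from multiple models. The error model and language model below are
# deliberately crude stand-ins.

import math

def error_model_logprob(query, candidate):
    """Toy channel model: penalize each differing character."""
    diffs = sum(a != b for a, b in zip(query, candidate))
    diffs += abs(len(query) - len(candidate))
    return -1.0 * diffs

def language_model_logprob(candidate, unigram_counts, total):
    """Toy LM: log relative frequency with add-one smoothing."""
    return math.log((unigram_counts.get(candidate, 0) + 1) / (total + 1))

def best_correction(query, candidates, counts, total, w_err=1.0, w_lm=1.0):
    """Log-linear combination of the two model scores."""
    return max(
        candidates,
        key=lambda c: w_err * error_model_logprob(query, c)
        + w_lm * language_model_logprob(c, counts, total),
    )

counts = {"seattle": 900, "settle": 100}
fixed = best_correction("seattel", ["seattle", "settle"], counts, 1000)
```

The weighted combination is the part that mirrors the abstract: each model contributes a feature score, and the top-scoring candidates become suggestions or feed the search engine.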
  • Patent number: 9613027
    Abstract: Annotated training data (e.g., sentences) in a first language are used to generate annotated training data for a second language. For example, annotated sentences in English are manually collected first and then used to generate annotated sentences in Chinese. The annotated training data includes slot labels, slot values and carrier phrases. The carrier phrases are the portions of the training data that are outside of a slot. The carrier phrases are translated from the first language to one or more translations in the second language. The translations may include machine translations as well as human translations. Entities for the slot values are determined for the translated sentences using content sources that include locale-dependent entities. The determined entities are used to fill the slots in the translations of the second language. All or a portion of the resulting sentences may be used for training models in the second language.
    Type: Grant
    Filed: November 7, 2013
    Date of Patent: April 4, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mei-Yuh Hwang, Yong Ni
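The carrier-phrase pipeline described above can be sketched in a few lines: translate the text outside the slot, then refill the slot with a locale-appropriate entity. The tiny phrase table and entity list below are fabricated for the example; real systems would use machine or human translation and locale-dependent content sources:

```python
# Illustrative sketch: the carrier phrase (text outside a slot) is
# translated, and the slot is refilled with a locale-appropriate
# entity. The phrase table and entity list are fabricated.

PHRASE_TABLE = {"play some music by": "播放一些音乐，歌手是"}
LOCALE_ENTITIES = {"artist": "周杰伦"}

def translate_annotated(sentence_template, slot_name):
    """Translate the carrier phrase, then fill the slot locally."""
    carrier = sentence_template.replace("<" + slot_name + ">", "").strip()
    translated = PHRASE_TABLE[carrier]
    return translated + " " + LOCALE_ENTITIES[slot_name]

out = translate_annotated("play some music by <artist>", "artist")
```

Refilling slots from locale-dependent sources (rather than translating the slot value itself) is what keeps the generated second-language data natural for its target market.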
  • Publication number: 20160267902
    Abstract: Systems and methods are utilized for recognizing speech that is partially in a foreign language. The systems and methods receive speech input from a user and detect if a rule or sentence entry grammar structure utilizing a foreign word has been uttered. To recognize the foreign word, a foreign word grammar is utilized. The foreign word grammar includes rules for recognizing the uttered foreign word. Two rules may be included in the foreign word grammar for each legitimate or slang term included in the foreign word grammar. A first rule corresponds to the spoken form of the foreign word, and the second rule corresponds to the spelling form of the foreign word. The foreign word grammar may also utilize a prefix tree. Upon recognizing the foreign word, the recognized foreign word may be sent to an application to retrieve the pronunciation, translation, or definition of the foreign word.
    Type: Application
    Filed: July 17, 2014
    Publication date: September 15, 2016
    Applicant: MICROSOFT CORPORATION
    Inventors: Mei-Yuh HWANG, Hua ZHANG
  • Publication number: 20150364127
    Abstract: The technology relates to performing letter-to-sound conversion utilizing recurrent neural networks (RNNs). The RNNs may be implemented as RNN modules for letter-to-sound conversion. The RNN modules receive text input and convert the text to corresponding phonemes. In determining the corresponding phonemes, the RNN modules may analyze the letters of the text and the letters surrounding the text being analyzed. The RNN modules may also analyze the letters of the text in reverse order. The RNN modules may also receive contextual information about the input text. The letter-to-sound conversion may then also be based on the contextual information that is received. The determined phonemes may be utilized to generate synthesized speech from the input text.
    Type: Application
    Filed: June 13, 2014
    Publication date: December 17, 2015
    Applicant: MICROSOFT CORPORATION
    Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Mei-Yuh Hwang, Sheng Zhao, Bo Yan, Geoffrey Zweig, Fileno A. Alleva
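The letter-to-sound abstract above analyzes each letter together with the letters surrounding it. The toy sketch below captures only that windowing idea: a context rule table stands in for the RNN's learned state, and the mini phoneme inventory is invented for illustration:

```python
# Toy letter-to-sound sketch: map each letter to a phoneme using the
# following letter as context, standing in for the RNN's learned
# context. The rule and phoneme tables are illustrative only.

CONTEXT_RULES = {
    ("c", "a"): "K",   # 'c' before 'a' is hard
    ("c", "e"): "S",   # 'c' before 'e' is soft
}
DEFAULT = {"a": "AE", "t": "T", "e": "EH", "l": "L"}

def letters_to_phonemes(word):
    phonemes = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        phonemes.append(CONTEXT_RULES.get((ch, nxt), DEFAULT.get(ch, "?")))
    return phonemes

hard_c = letters_to_phonemes("cat")
soft_c = letters_to_phonemes("cel")
```

The same letter producing different phonemes depending on its neighbors is exactly the context sensitivity the recurrent model is trained to learn, including from the reversed letter order mentioned in the abstract.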
  • Publication number: 20150127319
    Abstract: Annotated training data (e.g., sentences) in a first language are used to generate annotated training data for a second language. For example, annotated sentences in English are manually collected first and then used to generate annotated sentences in Chinese. The annotated training data includes slot labels, slot values and carrier phrases. The carrier phrases are the portions of the training data that are outside of a slot. The carrier phrases are translated from the first language to one or more translations in the second language. The translations may include machine translations as well as human translations. Entities for the slot values are determined for the translated sentences using content sources that include locale-dependent entities. The determined entities are used to fill the slots in the translations of the second language. All or a portion of the resulting sentences may be used for training models in the second language.
    Type: Application
    Filed: November 7, 2013
    Publication date: May 7, 2015
    Applicant: Microsoft Corporation
    Inventors: Mei-Yuh Hwang, Yong Ni
  • Publication number: 20150066496
    Abstract: Technologies pertaining to slot filling are described herein. A deep neural network, a recurrent neural network, and/or a spatio-temporally deep neural network are configured to assign labels to words in a word sequence set forth in natural language. At least one label is a semantic label that is assigned to at least one word in the word sequence.
    Type: Application
    Filed: September 2, 2013
    Publication date: March 5, 2015
    Applicant: Microsoft Corporation
    Inventors: Anoop Deoras, Kaisheng Yao, Xiaodong He, Li Deng, Geoffrey Gerson Zweig, Ruhi Sarikaya, Dong Yu, Mei-Yuh Hwang, Gregoire Mesnil
  • Publication number: 20130124492
    Abstract: Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).
    Type: Application
    Filed: November 15, 2011
    Publication date: May 16, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Jianfeng Gao, Mei-Yuh Hwang, Xuedong D. Huang, Christopher Brian Quirk, Zhenghao Wang
  • Patent number: 8280733
    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.
    Type: Grant
    Filed: September 17, 2010
    Date of Patent: October 2, 2012
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
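The inference described above, deciding whether a user's edit corrects a recognition error or reflects a change of mind, can be sketched as follows. The crude consonant-skeleton key is an assumption standing in for a real acoustic or phonetic comparison:

```python
# Hedged sketch of the edit classification in the abstract: if the
# replacement sounds like what was dictated, treat the edit as a
# recognition-error correction (and adapt the recognizer); otherwise
# assume the user changed their mind. The phonetic key is a crude
# stand-in for a real acoustic comparison.

def phonetic_key(word):
    """Very rough key: lowercase consonant skeleton."""
    return "".join(c for c in word.lower() if c not in "aeiou")

def classify_edit(original, replacement):
    if phonetic_key(original) == phonetic_key(replacement):
        return "recognition_error"    # adapt the recognizer
    return "changed_mind"             # leave the model alone

kind = classify_edit("there", "their")
```

Only edits classified as recognition errors would trigger adaptation, which is how the system learns without asking the user for any extra interaction.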
  • Patent number: 8019602
    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.
    Type: Grant
    Filed: January 20, 2004
    Date of Patent: September 13, 2011
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
  • Publication number: 20110015927
    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.
    Type: Application
    Filed: September 17, 2010
    Publication date: January 20, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
  • Patent number: 7693715
    Abstract: A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.
    Type: Grant
    Filed: March 10, 2004
    Date of Patent: April 6, 2010
    Assignee: Microsoft Corporation
    Inventors: Mei-Yuh Hwang, Li Jiang
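The merging step described above, combining the pair of graphoneme units with the best mutual-information score into a new unit, can be sketched as one greedy iteration. The scoring below uses the standard mutual-information contribution p(x,y)·log(p(x,y)/(p(x)p(y))); the exact measure and the single-letter units are assumptions for illustration:

```python
# Illustrative mutual-information merge step: score adjacent unit
# pairs across a small segmented word list and return the pair that
# would be merged into a new, larger unit.

import math
from collections import Counter

def best_merge(segmented_words):
    """Return the adjacent pair contributing the most mutual information."""
    units, pairs, total = Counter(), Counter(), 0
    for word in segmented_words:
        units.update(word)
        total += len(word)
        pairs.update(zip(word, word[1:]))
    n_pairs = sum(pairs.values())

    def mi(pair):
        p_xy = pairs[pair] / n_pairs
        p_x = units[pair[0]] / total
        p_y = units[pair[1]] / total
        return p_xy * math.log(p_xy / (p_x * p_y))

    return max(pairs, key=mi)

words = [["t", "h", "e"], ["t", "h", "a", "t"],
         ["t", "h", "e", "n"], ["s", "e", "e"]]
pair = best_merge(words)
```

Repeating this merge builds progressively larger units, which is how the patent bootstraps syllable and morpheme inventories from mutual information alone.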
  • Patent number: 7676365
    Abstract: A method and computer-readable medium use syllable-like units (SLUs) to decode a pronunciation into a phonetic description. The syllable-like units are generally larger than a single phoneme but smaller than a word. The present invention provides a means for defining these syllable-like units and for generating a language model based on these syllable-like units that can be used in the decoding process. As SLUs are longer than phonemes, they contain more acoustic contextual clues and better lexical constraints for speech recognition. Thus, the phoneme accuracy produced from SLU recognition is much better than all-phone sequence recognition.
    Type: Grant
    Filed: April 20, 2005
    Date of Patent: March 9, 2010
    Assignee: Microsoft Corporation
    Inventors: Mei-Yuh Hwang, Fileno A. Alleva, Rebecca C. Weiss
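Decoding with syllable-like units, as the abstract above describes, can be sketched as covering a phoneme sequence with the longest units available in an SLU inventory. The greedy longest-match below is a simplification of real decoding (which would use an SLU language model), and the inventory is fabricated:

```python
# Sketch of decoding with syllable-like units (SLUs): greedily cover
# a phoneme sequence with the longest known SLUs, which are larger
# than a phoneme but smaller than a word. The inventory is invented.

SLU_INVENTORY = {"K AE T", "AH", "L AO G", "D AO G"}

def decode_to_slus(phonemes):
    """Greedily cover the phoneme list with the longest known SLUs."""
    slus, i = [], 0
    while i < len(phonemes):
        for length in range(len(phonemes) - i, 0, -1):
            chunk = " ".join(phonemes[i:i + length])
            if chunk in SLU_INVENTORY:
                slus.append(chunk)
                i += length
                break
        else:
            slus.append(phonemes[i])  # back off to a single phoneme
            i += 1
    return slus

units = decode_to_slus(["K", "AE", "T", "AH", "L", "AO", "G"])
```

Because each SLU spans several phonemes, it carries more acoustic context and tighter lexical constraints than decoding phoneme by phoneme, which is the accuracy argument the abstract makes.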
  • Patent number: 7590533
    Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, a plurality of at least two possible phonetic descriptions are generated. One phonetic description is formed by decoding a speech signal representing a user's pronunciation of the word. At least one other phonetic description is generated from the text of the word. The plurality of possible sequences comprising speech-based and text-based phonetic descriptions are aligned and scored in a single graph based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.
    Type: Grant
    Filed: March 10, 2004
    Date of Patent: September 15, 2009
    Assignee: Microsoft Corporation
    Inventor: Mei-Yuh Hwang
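The selection step above, scoring speech-based and text-based phonetic descriptions against the user's pronunciation and keeping the best, can be sketched with an edit-distance score. The Levenshtein comparison is a stand-in for the single-graph alignment the patent describes, and the phoneme sequences are invented:

```python
# Hedged sketch: candidate phonetic descriptions (from the spoken
# example and from letter-to-sound rules) are scored against the
# user's pronunciation; the closest one would enter the lexicon.

def edit_distance(a, b):
    """Standard Levenshtein distance between phoneme lists."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (x != y))
    return dp[-1]

def select_pronunciation(user_phonemes, candidates):
    """Pick the candidate closest to what the user actually said."""
    return min(candidates, key=lambda c: edit_distance(user_phonemes, c))

candidates = [["N", "IY", "S", "AA", "N"],   # text-based guess
              ["N", "IY", "Z", "AH", "M"]]   # speech-based guess
best = select_pronunciation(["N", "IY", "S", "AH", "N"], candidates)
```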
  • Patent number: 7263487
    Abstract: The present invention generates a task-dependent acoustic model from a supervised task-independent corpus and further adapts it with an unsupervised task-dependent corpus. The task-independent corpus includes task-independent training data which has an acoustic representation of words and a sequence of transcribed words corresponding to the acoustic representation. A relevance measure is defined for each of the words in the task-independent data. The relevance measure is used to weight the data associated with each of the words in the task-independent training data. The task-dependent acoustic model is then trained based on the weighted data for the words in the task-independent training data.
    Type: Grant
    Filed: September 29, 2005
    Date of Patent: August 28, 2007
    Assignee: Microsoft Corporation
    Inventor: Mei-Yuh Hwang
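The relevance-weighting step described above can be sketched in a few lines. Here the relevance measure is assumed to be each word's relative frequency in a small task-dependent sample, a stand-in for the measure defined in the patent; the resulting weights would scale that word's data during acoustic-model training:

```python
# Illustrative relevance weighting: each word in the task-independent
# corpus gets a weight reflecting its relevance to the target task.
# The frequency-based measure and the word lists are assumptions.

from collections import Counter

def relevance_weights(task_independent_words, task_dependent_words):
    """Weight each task-independent word by its task-dependent frequency."""
    task_counts = Counter(task_dependent_words)
    total = len(task_dependent_words)
    return {
        w: task_counts.get(w, 0) / total
        for w in set(task_independent_words)
    }

ti = ["open", "window", "play", "music"]
td = ["play", "music", "play", "song"]
weights = relevance_weights(ti, td)
```

Words that matter to the task ("play", "music") receive high weights, so their task-independent data dominates training, while irrelevant words contribute little.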