Patents by Inventor Shuangyu Chang

Shuangyu Chang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

USER QUERY HISTORY EXPANSION FOR IMPROVING LANGUAGE MODEL ADAPTATION

Publication number: 20150325237

Abstract: Query history expansion may be provided. Upon receiving a spoken query from a user, an adapted language model may be applied to convert the spoken query to text. The adapted language model may comprise a plurality of queries interpolated from the user's previous queries and queries associated with other users. The spoken query may be executed and the results of the spoken query may be provided to the user.

Type: Application

Filed: July 23, 2015

Publication date: November 12, 2015

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Shuangyu Chang, Michael Levit, Bruce Melvin Buntschuh
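The abstract above describes an adapted language model built by interpolating the user's own queries with queries from other users. The following is a minimal sketch of that idea, not the patented method: it assumes simple unigram models and linear interpolation, and the names `unigram_model`, `interpolate`, and `user_weight` are hypothetical.

```python
from collections import Counter


def unigram_model(queries):
    """Estimate a unigram distribution from a list of text queries."""
    counts = Counter(word for q in queries for word in q.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def interpolate(user_model, others_model, user_weight=0.7):
    """Linearly interpolate two unigram models.

    user_weight is an illustrative value, not one taken from the abstract.
    """
    vocab = set(user_model) | set(others_model)
    return {
        w: user_weight * user_model.get(w, 0.0)
        + (1.0 - user_weight) * others_model.get(w, 0.0)
        for w in vocab
    }


# Toy data: the user's own history and queries from other users.
user_queries = ["pizza near me", "pizza delivery"]
other_queries = ["sushi near me", "coffee near me"]

adapted = interpolate(unigram_model(user_queries), unigram_model(other_queries))
```

In this sketch, words from the user's history keep most of the probability mass, while words seen only in other users' queries still receive a nonzero probability.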
CONTEXT SPECIFIC LANGUAGE MODEL SCALE FACTORS

Publication number: 20150325236

Abstract: The customization of recognition of speech utilizing context-specific language model scale factors is provided. Training audio may be received from a source in a training phase. The received training audio may be recognized utilizing acoustic and language models being combined utilizing static scale factors. A comparison may then be made of the recognition results to a transcription of the training audio. The recognition results may include one or more hypotheses for recognizing speech. Context specific scale factors may then be generated based on the comparison. The context specific scale factors may then be applied for use in the speech recognition of audio signals in an application phase.

Type: Application

Filed: May 8, 2014

Publication date: November 12, 2015

Applicant: MICROSOFT CORPORATION

Inventors: Michael Levit, Shuangyu Chang, Zhiheng Huang
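The training phase in the abstract above (recognize, compare to a transcription, derive per-context scale factors) can be sketched as follows. This is an illustration under stated assumptions: scores are log scores combined as acoustic + scale * language model, the scale is chosen from a small candidate grid, and the data layout and function names are hypothetical.

```python
def best_hypothesis(hypotheses, scale):
    """Return the text of the hypothesis maximizing acoustic + scale * lm."""
    return max(hypotheses, key=lambda h: h["acoustic"] + scale * h["lm"])["text"]


def fit_scale_factors(training_data, candidates=(0.5, 1.0, 2.0)):
    """For each context, pick the candidate scale that matches the most transcripts."""
    by_context = {}
    for example in training_data:
        by_context.setdefault(example["context"], []).append(example)

    factors = {}
    for context, examples in by_context.items():
        def correct(scale):
            # Count utterances whose top hypothesis equals the transcription.
            return sum(
                best_hypothesis(e["hypotheses"], scale) == e["transcript"]
                for e in examples
            )

        factors[context] = max(candidates, key=correct)
    return factors


# Toy training set with two contexts (all values are made up for illustration).
training_data = [
    {
        "context": "names",
        "transcript": "call ann",
        "hypotheses": [
            {"text": "call ann", "acoustic": -8.0, "lm": -6.0},
            {"text": "call an", "acoustic": -10.0, "lm": -3.0},
        ],
    },
    {
        "context": "commands",
        "transcript": "play music",
        "hypotheses": [
            {"text": "play music", "acoustic": -12.0, "lm": -2.0},
            {"text": "pay music", "acoustic": -9.0, "lm": -4.0},
        ],
    },
]

scale_factors = fit_scale_factors(training_data)
```

In the application phase, a recognizer would look up `scale_factors[context]` instead of using a single static scale.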
Flexible Schema for Language Model Customization

Publication number: 20150278191

Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.

Type: Application

Filed: March 27, 2014

Publication date: October 1, 2015

Applicant: Microsoft Corporation

Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
INCREMENTAL UTTERANCE DECODER COMBINATION FOR EFFICIENT AND ACCURATE DECODING

Publication number: 20150269949

Abstract: An incremental speech recognition system incrementally decodes a spoken utterance, using an additional utterance decoder only when that decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.

Type: Application

Filed: March 19, 2014

Publication date: September 24, 2015

Applicant: MICROSOFT CORPORATION

Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
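The decode, combine, and decide cycle in the abstract above might look like the sketch below. The stopping rule is an assumption made for illustration: each decoder is a hypothetical callable returning `(text, confidence)`, confidences are summed per text, and decoding stops once the leading text reaches `accept_score`.

```python
def incremental_decode(audio, decoders, accept_score=0.9):
    """Run decoders in order and stop once the combined result is good enough.

    Returns the accepted text and the number of decoders that were used.
    """
    if not decoders:
        raise ValueError("at least one decoder is required")

    scores = {}
    best_text, used = None, 0
    for decode in decoders:
        text, confidence = decode(audio)
        # Combine results by accumulating confidence per candidate text.
        scores[text] = scores.get(text, 0.0) + confidence
        used += 1
        best_text = max(scores, key=scores.get)
        # Engage the next decoder only if the current result is not yet accepted.
        if scores[best_text] >= accept_score:
            break
    return best_text, used


# Stub decoders, ordered as a series (outputs are made up for illustration).
decoders = [
    lambda audio: ("call tom", 0.5),
    lambda audio: ("call mom", 0.6),
    lambda audio: ("call mom", 0.7),
]

result = incremental_decode(b"audio", decoders)
```

When the first decoder is already confident, the later and presumably more expensive decoders are never called, which is the efficiency gain the abstract points to.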
User query history expansion for improving language model adaptation

Patent number: 9129606

Abstract: Query history expansion may be provided. Upon receiving a spoken query from a user, an adapted language model may be applied to convert the spoken query to text. The adapted language model may comprise a plurality of queries interpolated from the user's previous queries and queries associated with other users. The spoken query may be executed and the results of the spoken query may be provided to the user.

Type: Grant

Filed: September 23, 2011

Date of Patent: September 8, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Shuangyu Chang, Michael Levit, Bruce Melvin Buntschuh
Automatic semantic evaluation of speech recognition results

Patent number: 9053087

Abstract: A semantic error rate calculation may be provided. After receiving a spoken query from a user, the spoken query may be converted to text according to a first speech recognition hypothesis. A plurality of results associated with the converted query may be received and compared to a second plurality of results associated with the converted query.

Type: Grant

Filed: September 23, 2011

Date of Patent: June 9, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Michael Levit, Shuangyu Chang, Bruce Melvin Buntschuh, Nick Kibre
Dynamically adding personalization features to language models for voice search

Patent number: 8938391

Abstract: A dynamic exponential, feature-based, language model is continually adjusted per utterance by a user, based on the user's usage history. This adjustment of the model is done incrementally per user, over a large number of users, each with a unique history. The user history can include previously recognized utterances, text queries, and other user inputs. The history data for a user is processed to derive features. These features are then added into the language model dynamically for that user.

Type: Grant

Filed: June 12, 2011

Date of Patent: January 20, 2015

Assignee: Microsoft Corporation

Inventors: Geoffrey Zweig, Shuangyu Chang
Recognition using re-recognition and statistical classification

Patent number: 8930179

Abstract: Architecture that employs an overall grammar as a set of context-specific grammars for recognition of an input, each responsible for a specific context, such as subtask category, geographic region, etc. The grammars together cover the entire domain. Moreover, multiple recognitions can be run in parallel against the same input, where each recognition uses one or more of the context-specific grammars. The multiple intermediate recognition results from the different recognizer-grammars are reconciled by running re-recognition using a dynamically composed grammar based on the multiple recognition results and potentially other domain knowledge, or selecting the winner using a statistical classifier operating on classification features extracted from the multiple recognition results and other domain knowledge.

Type: Grant

Filed: June 4, 2009

Date of Patent: January 6, 2015

Assignee: Microsoft Corporation

Inventors: Shuangyu Chang, Michael Levit, Bruce Buntschuh
LANGUAGE MODEL ADAPTATION USING RESULT SELECTION

Publication number: 20140365218

Abstract: A received utterance is recognized using different language models. For example, recognition of the utterance is independently performed using a baseline language model (BLM) and using an adapted language model (ALM). A determination is made as to which results from the different language models are more likely to be accurate. Different features (e.g. language model scores, recognition confidences, acoustic model scores, quality measurements, . . . ) may be used to assist in making the determination. A classifier may be trained and then used in determining whether to select the results using the BLM or to select the results using the ALM. A language model may be automatically trained or re-trained in a way that adjusts a weight of the training data used in training the model in response to differences between the two results obtained from applying the different language models.

Type: Application

Filed: June 7, 2013

Publication date: December 11, 2014

Inventors: Shuangyu Chang, Michael Levit
Automatic Semantic Evaluation of Speech Recognition Results

Publication number: 20130080150

Abstract: A semantic error rate calculation may be provided. After receiving a spoken query from a user, the spoken query may be converted to text according to a first speech recognition hypothesis. A plurality of results associated with the converted query may be received and compared to a second plurality of results associated with the converted query.

Type: Application

Filed: September 23, 2011

Publication date: March 28, 2013

Applicant: Microsoft Corporation

Inventors: Michael Levit, Shuangyu Chang, Bruce Melvin Buntschuh, Nick Kibre
User Query History Expansion for Improving Language Model Adaptation

Publication number: 20130080162

Abstract: Query history expansion may be provided. Upon receiving a spoken query from a user, an adapted language model may be applied to convert the spoken query to text. The adapted language model may comprise a plurality of queries interpolated from the user's previous queries and queries associated with other users. The spoken query may be executed and the results of the spoken query may be provided to the user.

Type: Application

Filed: September 23, 2011

Publication date: March 28, 2013

Applicant: Microsoft Corporation

Inventors: Shuangyu Chang, Michael Levit, Bruce Melvin Buntschuh
DYNAMICALLY ADDING PERSONALIZATION FEATURES TO LANGUAGE MODELS FOR VOICE SEARCH

Publication number: 20120316877

Abstract: A dynamic exponential, feature-based, language model is continually adjusted per utterance by a user, based on the user's usage history. This adjustment of the model is done incrementally per user, over a large number of users, each with a unique history. The user history can include previously recognized utterances, text queries, and other user inputs. The history data for a user is processed to derive features. These features are then added into the language model dynamically for that user.

Type: Application

Filed: June 12, 2011

Publication date: December 13, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Geoffrey Zweig, Shuangyu Chang
Utterance processing for network-based speech recognition utilizing a client-side cache

Patent number: 8224644

Abstract: Embodiments are provided for utilizing a client-side cache for utterance processing to facilitate network based speech recognition. An utterance comprising a query is received in a client computing device. The query is sent from the client to a network server for results processing. The utterance is processed to determine a speech profile. A cache lookup is performed based on the speech profile to determine whether results data for the query is stored in the cache. If the results data is stored in the cache, then a query is sent to cancel the results processing on the network server and the cached results data is displayed on the client computing device.

Type: Grant

Filed: December 18, 2008

Date of Patent: July 17, 2012

Assignee: Microsoft Corporation

Inventors: Andrew K. Krumel, Shuangyu Chang, Robert L. Chambers
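The client-side flow in the abstract above (send the query, derive a speech profile, look up the cache, cancel the server request on a hit) can be sketched as follows. `FakeServer`, `speech_profile`, and `handle_utterance` are hypothetical names; the hash-based profile and the step that fills the cache on a miss are assumptions, not details from the abstract.

```python
import hashlib


class FakeServer:
    """Stand-in for the network recognition service (hypothetical)."""

    def __init__(self):
        self.cancelled = []
        self._next_id = 0

    def submit(self, utterance):
        self._next_id += 1
        return self._next_id

    def cancel(self, request_id):
        self.cancelled.append(request_id)

    def wait(self, request_id):
        return f"server results for request {request_id}"


def speech_profile(utterance):
    """Placeholder profile; a real client would derive it from audio features."""
    return hashlib.sha256(utterance).hexdigest()


def handle_utterance(utterance, server, cache):
    request_id = server.submit(utterance)   # send the query to the server first
    profile = speech_profile(utterance)     # process the utterance locally
    if profile in cache:
        server.cancel(request_id)           # cache hit: cancel server processing
        return cache[profile]
    results = server.wait(request_id)
    cache[profile] = results                # assumption: store results for reuse
    return results


server, cache = FakeServer(), {}
first = handle_utterance(b"weather in seattle", server, cache)
second = handle_utterance(b"weather in seattle", server, cache)
```

Sending the query before the cache lookup means a cache miss costs no extra latency, while a hit lets the client show results immediately and release the server work.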
Sequential speech recognition with two unequal ASR systems

Patent number: 8180641

Abstract: Sequential speech recognition using two unequal automatic speech recognition (ASR) systems may be provided. The system may provide two sets of vocabulary data. A determination may be made as to whether entries in one set of vocabulary data are likely to be confused with entries in the other set of vocabulary data. If confusion is likely, a decoy entry from one set of the vocabulary data may be placed in the other set of vocabulary data so that more efficient and accurate speech recognition processing may take place.

Type: Grant

Filed: September 29, 2008

Date of Patent: May 15, 2012

Assignee: Microsoft Corporation

Inventors: Michael Levit, Shuangyu Chang, Bruce Melvin Buntschuh
RECOGNITION USING RE-RECOGNITION AND STATISTICAL CLASSIFICATION

Publication number: 20100312546

Abstract: Architecture that employs an overall grammar as a set of context-specific grammars for recognition of an input, each responsible for a specific context, such as subtask category, geographic region, etc. The grammars together cover the entire domain. Moreover, multiple recognitions can be run in parallel against the same input, where each recognition uses one or more of the context-specific grammars. The multiple intermediate recognition results from the different recognizer-grammars are reconciled by running re-recognition using a dynamically composed grammar based on the multiple recognition results and potentially other domain knowledge, or selecting the winner using a statistical classifier operating on classification features extracted from the multiple recognition results and other domain knowledge.

Type: Application

Filed: June 4, 2009

Publication date: December 9, 2010

Applicant: Microsoft Corporation

Inventors: Shuangyu Chang, Michael Levit, Bruce Buntschuh
Utterance Processing For Network-Based Speech Recognition Utilizing A Client-Side Cache

Publication number: 20100161328

Abstract: Embodiments are provided for utilizing a client-side cache for utterance processing to facilitate network based speech recognition. An utterance comprising a query is received in a client computing device. The query is sent from the client to a network server for results processing. The utterance is processed to determine a speech profile. A cache lookup is performed based on the speech profile to determine whether results data for the query is stored in the cache. If the results data is stored in the cache, then a query is sent to cancel the results processing on the network server and the cached results data is displayed on the client computing device.

Type: Application

Filed: December 18, 2008

Publication date: June 24, 2010

Applicant: Microsoft Corporation

Inventors: Andrew K. Krumel, Shuangyu Chang, Robert L. Chambers
SEQUENTIAL SPEECH RECOGNITION WITH TWO UNEQUAL ASR SYSTEMS

Publication number: 20100082343

Abstract: Sequential speech recognition using two unequal automatic speech recognition (ASR) systems may be provided. The system may provide two sets of vocabulary data. A determination may be made as to whether entries in one set of vocabulary data are likely to be confused with entries in the other set of vocabulary data. If confusion is likely, a decoy entry from one set of the vocabulary data may be placed in the other set of vocabulary data so that more efficient and accurate speech recognition processing may take place.

Type: Application

Filed: September 29, 2008

Publication date: April 1, 2010

Applicant: Microsoft Corporation

Inventors: Michael Levit, Shuangyu Chang, Bruce Melvin Buntschuh
Speech recognition accuracy with multi-confidence thresholds

Patent number: 7657433

Abstract: A speech recognition system uses multiple confidence thresholds to improve the quality of speech recognition results. The choice of which confidence threshold to use for a particular utterance may be based on one or more features relating to the utterance. In one particular implementation, the speech recognition system includes a speech recognition engine that provides speech recognition results and a confidence score for an input utterance. The system also includes a threshold selection component that determines, based on the received input utterance, a threshold value corresponding to the input utterance. The system further includes a threshold component that accepts the recognition results based on a comparison of the confidence score to the threshold value.

Type: Grant

Filed: September 8, 2006

Date of Patent: February 2, 2010

Assignee: TellMe Networks, Inc.

Inventor: Shuangyu Chang
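The three components named in the abstract above (a recognition engine producing a result and confidence score, a threshold selection component, and a threshold component) can be illustrated with a short sketch. The feature used here, a hypothetical `type` key, and all threshold values are assumptions for illustration only.

```python
def select_threshold(utterance_features, thresholds, default=0.5):
    """Choose a confidence threshold from a feature of the utterance."""
    return thresholds.get(utterance_features.get("type"), default)


def accept(result, confidence, utterance_features, thresholds):
    """Return the recognition result if it clears the selected threshold, else None."""
    threshold = select_threshold(utterance_features, thresholds)
    return result if confidence >= threshold else None


# Illustrative thresholds: digit strings are held to a stricter bar than yes/no answers.
thresholds = {"yes_no": 0.4, "digits": 0.8}

accepted = accept("yes", 0.6, {"type": "yes_no"}, thresholds)
rejected = accept("4 1 5", 0.6, {"type": "digits"}, thresholds)
```

The same confidence score of 0.6 is accepted for one utterance and rejected for the other, which is the behavior a single global threshold cannot provide.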
