Patents by Inventor Shuangyu Chang
Shuangyu Chang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10679610
Abstract: A method for eyes-off training of a dictation system includes translating an audio signal featuring speech audio of a speaker into an initial recognized text using a previously-trained general language model. The initial recognized text is provided to the speaker for error correction. The audio signal is re-translated into an updated recognized text using a specialized language model biased to recognize words included in the corrected text. The general language model is retrained in an “eyes-off” manner, based on the audio signal and the updated recognized text.
Type: Grant
Filed: July 16, 2018
Date of Patent: June 9, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Hemant Malhotra, Shuangyu Chang, Pradip Kumar Fatehpuria
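The biasing step in this abstract (re-decoding with a language model that favors words from the user's corrected text) can be illustrated with a toy, hypothetical sketch. This is not the patented implementation; the unigram scoring, the `bias_weight` parameter, and all names below are illustrative assumptions.

```python
from collections import Counter

def biased_score(candidate_words, base_counts, bias_words, bias_weight=5.0):
    """Score a candidate transcript under a smoothed unigram model,
    boosting words that appeared in the user's corrected text."""
    total = sum(base_counts.values())
    score = 0.0
    for w in candidate_words:
        # Add-one smoothed unigram probability from the general model.
        freq = (base_counts.get(w, 0) + 1) / (total + len(base_counts) + 1)
        if w in bias_words:
            freq *= bias_weight  # bias toward the corrected vocabulary
        score += freq
    return score

# General model prefers "cat"; the user's correction contained "hat".
base = Counter({"the": 10, "cat": 3, "sat": 2, "hat": 1})
corrected = {"hat"}
a = ["the", "cat"]
b = ["the", "hat"]
```

With the bias applied, the candidate containing the corrected word "hat" outscores the one the general model would otherwise prefer, which is the intuition behind re-translating against a biased model.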
-
Patent number: 10650811
Abstract: Disclosed in various examples are methods, systems, and machine-readable mediums for providing improved computer-implemented speech recognition by detecting and correcting speech recognition errors during a speech session. The system recognizes repeated speech commands from a user in a speech session that are similar or identical to each other. To correct these repeated errors, the system creates a customized language model that is then utilized by the language modeler to produce a refined prediction of the meaning of the repeated speech commands. The custom language model may comprise clusters of similar past predictions of speech commands from the speech session of the user.
Type: Grant
Filed: March 13, 2018
Date of Patent: May 12, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Meryem Pinar Donmez Ediz, Ranjitha Gurunath Kulkarni, Shuangyu Chang, Nitin Kamra
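The clustering of similar past command predictions described here can be sketched, purely as a toy illustration, with edit-distance-based grouping. The threshold value and the greedy clustering strategy are assumptions for the example, not details from the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cluster_repeats(commands, threshold=2):
    """Greedily group past command predictions that are within
    `threshold` edits of a cluster's first member."""
    clusters = []
    for cmd in commands:
        for cluster in clusters:
            if edit_distance(cmd, cluster[0]) <= threshold:
                cluster.append(cmd)
                break
        else:
            clusters.append([cmd])
    return clusters

# Two near-identical repeated commands land in one cluster.
clusters = cluster_repeats(["play jazz", "play jass", "set alarm"])
```

A cluster with multiple members signals a repeated, likely misrecognized command, which is the cue the abstract describes for building a custom language model.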
-
Publication number: 20200020319
Abstract: A method for eyes-off training of a dictation system includes translating an audio signal featuring speech audio of a speaker into an initial recognized text using a previously-trained general language model. The initial recognized text is provided to the speaker for error correction. The audio signal is re-translated into an updated recognized text using a specialized language model biased to recognize words included in the corrected text. The general language model is retrained in an “eyes-off” manner, based on the audio signal and the updated recognized text.
Type: Application
Filed: July 16, 2018
Publication date: January 16, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Hemant Malhotra, Shuangyu Chang, Pradip Kumar Fatehpuria
-
Patent number: 10497367
Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
Type: Grant
Filed: December 22, 2016
Date of Patent: December 3, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
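The domain-based "hint" mechanism can be illustrated with a hypothetical sketch in which the hint shifts interpolation mass toward matching components. The 80/20 split, the field names, and the function signature are all assumptions made for the example.

```python
def combine_components(components, hint_domains, hint_mass=0.8):
    """Assign interpolation weights to language modeling components,
    concentrating `hint_mass` on components whose domain matches the hint."""
    matched = [c for c in components if c["domain"] in hint_domains]
    others = [c for c in components if c["domain"] not in hint_domains]
    if not matched or not others:
        # No useful hint signal: fall back to a uniform combination.
        return {c["name"]: 1.0 / len(components) for c in components}
    weights = {}
    for c in matched:
        weights[c["name"]] = hint_mass / len(matched)
    for c in others:
        weights[c["name"]] = (1.0 - hint_mass) / len(others)
    return weights

components = [
    {"name": "sms", "domain": "messaging"},
    {"name": "web-search", "domain": "web"},
    {"name": "voicemail", "domain": "messaging"},
]
weights = combine_components(components, {"messaging"})
```

The weights always sum to one, so the result remains a valid interpolated model regardless of how many components the hint matches.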
-
Publication number: 20190287519
Abstract: Disclosed in various examples are methods, systems, and machine-readable mediums for providing improved computer-implemented speech recognition by detecting and correcting speech recognition errors during a speech session. The system recognizes repeated speech commands from a user in a speech session that are similar or identical to each other. To correct these repeated errors, the system creates a customized language model that is then utilized by the language modeler to produce a refined prediction of the meaning of the repeated speech commands. The custom language model may comprise clusters of similar past predictions of speech commands from the speech session of the user.
Type: Application
Filed: March 13, 2018
Publication date: September 19, 2019
Inventors: Meryem Pinar Donmez Ediz, Ranjitha Gurunath Kulkarni, Shuangyu Chang, Nitin Kamra
-
Patent number: 10192545
Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
Type: Grant
Filed: June 5, 2017
Date of Patent: January 29, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
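The "unspeakable corpus" filtering step (removing typed items whose feature vectors fall within a similarity threshold of any spoken-corpus vector) can be sketched with cosine similarity over toy 2-D vectors. The similarity measure, the threshold value, and the brute-force comparison are assumptions for illustration; the patent does not specify them here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def unspeakable_corpus(typed_vecs, spoken_vecs, threshold=0.9):
    """Keep only typed items that are NOT within the similarity
    threshold of any spoken-corpus feature vector."""
    return [i for i, tv in enumerate(typed_vecs)
            if all(cosine(tv, sv) < threshold for sv in spoken_vecs)]

# Typed item 0 is nearly parallel to the spoken vector (removed);
# typed item 1 is orthogonal to it (kept as "unspeakable").
spoken = [(1.0, 0.0)]
typed = [(1.0, 0.1), (0.0, 1.0)]
kept = unspeakable_corpus(typed, spoken)
```

The surviving items then serve as negative examples for training the discriminative data-selection classifier the abstract describes.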
-
Publication number: 20180330725
Abstract: A method for priming an extensible speech recognition system comprises receiving audio language input from a user. The method also comprises receiving an indication that the audio language input is associated with a first language-based intelligent agent. The first language-based intelligent agent is associated with a first grammar set that is specific to the first language-based intelligent agent. Additionally, the method comprises matching one or more spoken words or phrases within the audio language input to text-based words or phrases within a general grammar set associated with a speech recognition system and the first grammar set. The first grammar set is associated with a higher match bias than the general grammar set, such that the speech recognition system is more likely to match the one or more spoken words or phrases to the text-based words or phrases within the first grammar set.
Type: Application
Filed: August 18, 2017
Publication date: November 15, 2018
Inventors: Padma Varadharajan, Shuangyu Chang, Khuram Shahid, Meryem Pinar Donmez Ediz, Nitin Agarwal
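The "higher match bias" idea can be illustrated with a toy matcher that scores phrases by word overlap and multiplies agent-specific entries by a bias factor. The overlap scoring and the bias value are hypothetical choices for the sketch, not details from the application.

```python
def match_phrase(spoken, agent_grammar, general_grammar, agent_bias=2.0):
    """Pick the best-matching grammar phrase for a list of spoken words,
    giving agent-specific grammar entries a higher match bias."""
    best, best_score = None, 0.0
    for phrase in agent_grammar | general_grammar:
        words = phrase.split()
        overlap = len(set(spoken) & set(words)) / max(len(words), 1)
        # Agent-specific entries are multiplied by the bias factor.
        score = overlap * (agent_bias if phrase in agent_grammar else 1.0)
        if score > best_score:
            best, best_score = phrase, score
    return best

# The general grammar contains an exact match, but the bias makes the
# agent-specific phrase win anyway.
agent = {"call mom now"}
general = {"call mom"}
best = match_phrase(["call", "mom"], agent, general)
```

This mirrors the abstract's point: with the bias in place, the recognizer is more likely to resolve the utterance against the active agent's grammar.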
-
Patent number: 10019984
Abstract: Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is at least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a) perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result; (b) perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and (c) determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result.
Type: Grant
Filed: February 27, 2015
Date of Patent: July 10, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shiun-Zu Kuo, Thomas Reutter, Yifan Gong, Mark T. Hanson, Ye Tian, Shuangyu Chang, Jonathan Hamaker, Qi Miao, Yuancheng Tu
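Combining two independent analyses into an error category can be sketched with a toy diagnoser: one analysis compares lengths (an insertion/deletion signal), the other compares token overlap (a substitution signal). The specific analyses and category names are illustrative assumptions, not the patented ones.

```python
def diagnose(result, reference):
    """Combine two simple, independent error analyses of a recognition
    result into a single error category (toy sketch)."""
    hyp, ref = result.split(), reference.split()
    # First analysis: length comparison flags dropped or extra words.
    length_delta = len(hyp) - len(ref)
    # Second analysis: token overlap flags swapped-in wrong words.
    overlap = len(set(hyp) & set(ref)) / max(len(ref), 1)
    if length_delta < 0:
        return "deletion"
    if length_delta > 0:
        return "insertion"
    return "substitution" if overlap < 1.0 else "correct"

# A dropped word and a swapped word fall into different categories.
d1 = diagnose("the cat", "the cat sat")
d2 = diagnose("the hat sat", "the cat sat")
```

Neither signal alone distinguishes all cases; combining them, as the abstract describes, yields a more specific category than either analysis by itself.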
-
Patent number: 9922654
Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
Type: Grant
Filed: December 13, 2016
Date of Patent: March 20, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
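The stop-early control loop described here can be illustrated with a hypothetical sketch: decoders run in their ordered series, and decoding halts once the best confidence clears a target, on the assumption that further decoders would not add significant benefit. The confidence threshold and decoder interface are invented for the example.

```python
def incremental_decode(audio, decoders, confidence_target=0.9):
    """Run an ordered series of utterance decoders, stopping as soon as
    the combined result is unlikely to improve significantly."""
    best_text, best_conf = None, 0.0
    used = 0
    for decode in decoders:
        text, conf = decode(audio)
        used += 1
        if conf > best_conf:
            best_text, best_conf = text, conf
        if best_conf >= confidence_target:
            break  # the next decoder is unlikely to add significant benefit
    return best_text, used

# Stand-in decoders ordered cheapest-first; the third is never engaged.
decoders = [
    lambda audio: ("helo wrld", 0.60),
    lambda audio: ("hello world", 0.95),
    lambda audio: ("hello, world", 0.99),
]
text, used = incremental_decode(b"<audio bytes>", decoders)
```

Only two of the three decoders run, which is the cost saving the incremental design aims for.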
-
Publication number: 20170270912
Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
Type: Application
Filed: June 5, 2017
Publication date: September 21, 2017
Applicant: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
-
Patent number: 9761220
Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
Type: Grant
Filed: May 13, 2015
Date of Patent: September 12, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
-
Patent number: 9734826
Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
Type: Grant
Filed: March 11, 2015
Date of Patent: August 15, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
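The iterative weight update via posterior probabilities resembles classic EM estimation of mixture weights, which can be sketched in miniature. This is a generic EM sketch under that assumption, not the patented joint-modeling procedure; the data layout (`component_probs[t][k]` = probability component `k` assigns to token `t`) is invented for the example.

```python
def em_interpolation_weights(component_probs, iterations=20):
    """Estimate interpolation weights for component LMs by EM on
    per-token component probabilities from held-out text."""
    k = len(component_probs[0])
    w = [1.0 / k] * k  # start from a uniform interpolation
    for _ in range(iterations):
        counts = [0.0] * k
        for probs in component_probs:
            mix = sum(wi * pi for wi, pi in zip(w, probs))
            for i, (wi, pi) in enumerate(zip(w, probs)):
                counts[i] += wi * pi / mix  # posterior responsibility
        w = [c / len(component_probs) for c in counts]
    return w

# Three held-out tokens, all better explained by component 0.
w = em_interpolation_weights([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]])
```

Each iteration re-weights components by how much of the held-out data they explain, so the interpolation drifts toward the components that fit the target domain, as the abstract describes.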
-
Publication number: 20170103753
Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
Type: Application
Filed: December 22, 2016
Publication date: April 13, 2017
Applicant: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
-
Publication number: 20170092275
Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
Type: Application
Filed: December 13, 2016
Publication date: March 30, 2017
Applicant: Microsoft Technology Licensing, LLC
Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
-
Patent number: 9552817
Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
Type: Grant
Filed: March 19, 2014
Date of Patent: January 24, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
-
Patent number: 9529794
Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
Type: Grant
Filed: March 27, 2014
Date of Patent: December 27, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
-
Publication number: 20160336006
Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
Type: Application
Filed: May 13, 2015
Publication date: November 17, 2016
Applicant: Microsoft Technology Licensing, LLC
Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
-
Publication number: 20160267905
Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
Type: Application
Filed: March 11, 2015
Publication date: September 15, 2016
Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
-
Publication number: 20160253989
Abstract: Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is at least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a) perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result; (b) perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and (c) determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result.
Type: Application
Filed: February 27, 2015
Publication date: September 1, 2016
Inventors: Shiun-Zu Kuo, Thomas Reutter, Yifan Gong, Mark T. Hanson, Ye Tian, Shuangyu Chang, Jon Hamaker, Qi Miao, Yuancheng Tu
-
Patent number: 9299342
Abstract: Query history expansion may be provided. Upon receiving a spoken query from a user, an adapted language model may be applied to convert the spoken query to text. The adapted language model may comprise a plurality of queries interpolated from the user's previous queries and queries associated with other users. The spoken query may be executed and the results of the spoken query may be provided to the user.
Type: Grant
Filed: July 23, 2015
Date of Patent: March 29, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shuangyu Chang, Michael Levit, Bruce Melvin Buntschuh
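Interpolating a model from the user's own query history and other users' queries can be sketched, as a toy illustration, with a weighted unigram mixture. The unigram representation and the `user_weight` value are assumptions for the example, not details from the patent.

```python
from collections import Counter

def adapted_unigram(user_queries, other_queries, user_weight=0.7):
    """Interpolate a unigram model from a user's previous queries and
    queries associated with other users."""
    def normalize(queries):
        counts = Counter(w for q in queries for w in q.split())
        total = sum(counts.values())
        return {w: n / total for w, n in counts.items()}

    user_p = normalize(user_queries)
    other_p = normalize(other_queries)
    vocab = set(user_p) | set(other_p)
    # Linear interpolation of the two distributions.
    return {w: user_weight * user_p.get(w, 0.0)
               + (1.0 - user_weight) * other_p.get(w, 0.0)
            for w in vocab}

# The user's own history dominates, but global queries still contribute.
model = adapted_unigram(["jazz radio"], ["news today"])
```

Because both components are proper distributions, the interpolated model still sums to one while tilting recognition toward vocabulary the user has actually queried before.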