Patents by Inventor Shuangyu Chang

Shuangyu Chang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10679610
    Abstract: A method for eyes-off training of a dictation system includes translating an audio signal featuring speech audio of a speaker into an initial recognized text using a previously-trained general language model. The initial recognized text is provided to the speaker for error correction. The audio signal is re-translated into an updated recognized text using a specialized language model biased to recognize words included in the corrected text. The general language model is retrained in an “eyes-off” manner, based on the audio signal and the updated recognized text.
    Type: Grant
    Filed: July 16, 2018
    Date of Patent: June 9, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Hemant Malhotra, Shuangyu Chang, Pradip Kumar Fatehpuria
  • Patent number: 10650811
    Abstract: Disclosed in various examples are methods, systems, and machine-readable mediums for providing improved computer implemented speech recognition by detecting and correcting speech recognition errors during a speech session. The system recognizes repeated speech commands from a user in a speech session that are similar or identical to each other. To correct these repeated errors, the system creates a customized language model that is then utilized by the language modeler to produce a refined prediction of the meaning of the repeated speech commands. The custom language model may comprise clusters of similar past predictions of speech commands from the speech session of the user.
    Type: Grant
    Filed: March 13, 2018
    Date of Patent: May 12, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Meryem Pinar Donmez Ediz, Ranjitha Gurunath Kulkarni, Shuangyu Chang, Nitin Kamra
  • Publication number: 20200020319
    Abstract: A method for eyes-off training of a dictation system includes translating an audio signal featuring speech audio of a speaker into an initial recognized text using a previously-trained general language model. The initial recognized text is provided to the speaker for error correction. The audio signal is re-translated into an updated recognized text using a specialized language model biased to recognize words included in the corrected text. The general language model is retrained in an “eyes-off” manner, based on the audio signal and the updated recognized text.
    Type: Application
    Filed: July 16, 2018
    Publication date: January 16, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Hemant MALHOTRA, Shuangyu CHANG, Pradip Kumar FATEHPURIA
  • Patent number: 10497367
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: December 3, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
  • Publication number: 20190287519
    Abstract: Disclosed in various examples are methods, systems, and machine-readable mediums for providing improved computer implemented speech recognition by detecting and correcting speech recognition errors during a speech session. The system recognizes repeated speech commands from a user in a speech session that are similar or identical to each other. To correct these repeated errors, the system creates a customized language model that is then utilized by the language modeler to produce a refined prediction of the meaning of the repeated speech commands. The custom language model may comprise clusters of similar past predictions of speech commands from the speech session of the user.
    Type: Application
    Filed: March 13, 2018
    Publication date: September 19, 2019
    Inventors: Meryem Pinar Donmez Ediz, Ranjitha Gurunath Kulkarni, Shuangyu Chang, Nitin Kamra
  • Patent number: 10192545
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: January 29, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
  • Publication number: 20180330725
    Abstract: A method for priming an extensible speech recognition system comprises receiving audio language input from a user. The method also comprises receiving an indication that the audio language input is associated with a first language-based intelligent agent. The first language-based intelligent agent is associated with a first grammar set that is specific to the first language-based intelligent agent. Additionally, the method comprises matching one or more spoken words or phrases within the audio language input to text-based words or phrases within a general grammar set associated with a speech recognition system and the first grammar set. The first grammar set is associated with a higher match bias than the general grammar set, such that the speech recognition system is more likely to match the one or more spoken words or phrases to the text-based words or phrases within the first grammar set.
    Type: Application
    Filed: August 18, 2017
    Publication date: November 15, 2018
    Inventors: Padma VARADHARAJAN, Shuangyu CHANG, Khuram SHAHID, Meryem Pinar DONMEZ EDIZ, Nitin AGARWAL
  • Patent number: 10019984
    Abstract: Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a) perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result; (b) perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and (c) determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: July 10, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shiun-Zu Kuo, Thomas Reutter, Yifan Gong, Mark T. Hanson, Ye Tian, Shuangyu Chang, Jonathan Hamaker, Qi Miao, Yuancheng Tu
  • Patent number: 9922654
    Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: March 20, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
  • Publication number: 20170270912
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Application
    Filed: June 5, 2017
    Publication date: September 21, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
  • Patent number: 9761220
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Grant
    Filed: May 13, 2015
    Date of Patent: September 12, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
  • Patent number: 9734826
    Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: August 15, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
  • Publication number: 20170103753
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Application
    Filed: December 22, 2016
    Publication date: April 13, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
  • Publication number: 20170092275
    Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
    Type: Application
    Filed: December 13, 2016
    Publication date: March 30, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
  • Patent number: 9552817
    Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
    Type: Grant
    Filed: March 19, 2014
    Date of Patent: January 24, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
  • Patent number: 9529794
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Grant
    Filed: March 27, 2014
    Date of Patent: December 27, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
  • Publication number: 20160336006
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Application
    Filed: May 13, 2015
    Publication date: November 17, 2016
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
  • Publication number: 20160267905
    Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
    Type: Application
    Filed: March 11, 2015
    Publication date: September 15, 2016
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
  • Publication number: 20160253989
    Abstract: Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a) perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result; (b) perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and (c) determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result.
    Type: Application
    Filed: February 27, 2015
    Publication date: September 1, 2016
    Inventors: Shiun-Zu Kuo, Thomas Reutter, Yifan Gong, Mark T. Hanson, Ye Tian, Shuangyu Chang, Jon Hamaker, Qi Miao, Yuancheng Tu
  • Patent number: 9299342
    Abstract: Query history expansion may be provided. Upon receiving a spoken query from a user, an adapted language model may be applied to convert the spoken query to text. The adapted language model may comprise a plurality of queries interpolated from the user's previous queries and queries associated with other users. The spoken query may be executed and the results of the spoken query may be provided to the user.
    Type: Grant
    Filed: July 23, 2015
    Date of Patent: March 29, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuangyu Chang, Michael Levit, Bruce Melvin Buntschuh