Patents by Inventor Michael Levit

Michael Levit has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11676576
    Abstract: Systems and methods are provided for acquiring training data and building an organizational-based language model based on the training data. Organizational data generated via one or more applications associated with an organization is collected; the collected organizational data is aggregated and filtered into training data that is used for training an organizational-based language model for speech processing.
    Type: Grant
    Filed: August 11, 2021
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao
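    For illustration only, a minimal sketch of the collect/aggregate/filter/train pipeline this abstract describes; the function names, filtering rule, and toy bigram-count model below are assumptions, not the patented implementation.

```python
# Hypothetical pipeline: collect text produced in an organization's applications,
# aggregate it with previously collected data, filter it into training material,
# and fit a toy bigram language model over the result.
from collections import Counter

def aggregate(new_items, previous_items):
    # Combine newly collected organizational text with earlier collections.
    return previous_items + new_items

def filter_training_data(items, min_words=2):
    # Keep only items that look like usable training sentences (assumed rule).
    return [t for t in items if len(t.split()) >= min_words]

def train_bigram_counts(sentences):
    # Stand-in for "training an organizational-based language model":
    # a bigram count table over the filtered organizational text.
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts

collected = ["quarterly sync with the design team", "ok"]
training_data = filter_training_data(aggregate(collected, ["design review notes"]))
print(train_bigram_counts(training_data).most_common(3))
```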
  • Publication number: 20220392432
    Abstract: Systems and methods for speech recognition correction include receiving a voice recognition input from an individual user and using a trained error correction model to add a new alternative result to a results list based on the received voice input processed by a voice recognition system. The error correction model is trained using contextual information corresponding to the individual user. The contextual information comprises a plurality of historical user correction logs, a plurality of personal class definitions, and an application context. A re-ranker re-ranks the results list with the new alternative result and a top result from the re-ranked results list is output.
    Type: Application
    Filed: June 8, 2021
    Publication date: December 8, 2022
    Inventors: Issac ALPHONSO, Tasos ANASTASAKOS, Michael LEVIT, Nitin AGARWAL
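    A rough sketch of the correction-and-re-ranking flow described above; the trivial lookup "model", the score bonus, and the example n-best list are assumptions made for illustration, not the trained error correction model of the application.

```python
# Hypothetical sketch of the re-ranking step: a user-specific correction model
# proposes a new alternative result, and the n-best list is re-ranked so that
# hypotheses consistent with the user's past corrections are preferred.
def add_correction_alternative(results, user_correction_log):
    # results: list of (hypothesis, score) from the recognizer, best first.
    top_text, top_score = results[0]
    corrected = user_correction_log.get(top_text)
    if corrected and all(corrected != text for text, _ in results):
        results = results + [(corrected, top_score)]
    return results

def rerank(results, user_correction_log):
    # Boost hypotheses the user has previously corrected to (assumed heuristic).
    def score(item):
        text, base = item
        bonus = 0.2 if text in user_correction_log.values() else 0.0
        return base + bonus
    return sorted(results, key=score, reverse=True)

n_best = [("call ann", 0.8), ("call anne", 0.7)]
corrections = {"call ann": "call Hanna"}   # learned from this user's history
reranked = rerank(add_correction_alternative(n_best, corrections), corrections)
print(reranked[0][0])  # top result after re-ranking
```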
  • Publication number: 20220382973
    Abstract: A computer implemented method includes receiving a natural language utterance, generating multiple alternative N-gram contexts for evaluating a next word in the natural language utterance, selecting N-gram context candidates from the multiple alternative N-gram contexts comprising different sets of N-1 words in the natural language utterance for selecting a next word in the natural language utterance, and providing the N-gram context candidates for creating a transcript of the natural language utterance.
    Type: Application
    Filed: May 28, 2021
    Publication date: December 1, 2022
    Inventors: Michael LEVIT, Cem AKSOYLAR
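    The following is an illustrative sketch (not the claimed method) of what "multiple alternative N-gram contexts" can mean in practice: candidate contexts of N-1 words drawn, in order, from the words already seen, rather than only the immediately preceding N-1 words.

```python
# Enumerate alternative N-gram contexts -- different ordered subsets of the
# preceding words -- that could each be used to score the next word.
from itertools import combinations

def alternative_ngram_contexts(preceding_words, n):
    """Return candidate contexts of N-1 words drawn from the words already seen."""
    context_size = n - 1
    if len(preceding_words) < context_size:
        return [tuple(preceding_words)]
    # The conventional context is simply the last N-1 words; alternatives keep
    # other potentially informative words while preserving their order.
    candidates = set()
    for combo in combinations(range(len(preceding_words)), context_size):
        candidates.add(tuple(preceding_words[i] for i in combo))
    return sorted(candidates)

if __name__ == "__main__":
    words = "please call the main office".split()
    for ctx in alternative_ngram_contexts(words, n=3):
        print(ctx)  # each 2-word context is a candidate for scoring the next word
```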
  • Publication number: 20220013109
    Abstract: Provided is a system and method for acquiring training data and building an organizational-based language model based on the training data. In one example, the method may include collecting organizational data that is generated via one or more applications associated with an organization, aggregating the collected organizational data with previously collected organizational data to generate aggregated organizational training data, training an organizational-based language model for speech processing based on the aggregated organizational training data, and storing the trained organizational-based language model.
    Type: Application
    Filed: August 11, 2021
    Publication date: January 13, 2022
    Inventors: Ziad AL BAWAB, Anand U. DESAI, Cem AKSOYLAR, Michael LEVIT, Xin MENG, Shuangyu CHANG, Suyash CHOUDHURY, Dhiresh RAWAL, Tao LI, Rishi GIRISH, Marcus JAGER, Ananth Rampura SHESHAGIRI RAO
  • Patent number: 11120788
    Abstract: Provided is a system and method for acquiring training data and building an organizational-based language model based on the training data. In one example, the method may include collecting organizational data that is generated via one or more applications associated with an organization, aggregating the collected organizational data with previously collected organizational data to generate aggregated organizational training data, training an organizational-based language model for speech processing based on the aggregated organizational training data, and storing the trained organizational-based language model.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: September 14, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao
  • Publication number: 20200349920
    Abstract: Provided is a system and method for acquiring training data and building an organizational-based language model based on the training data. In one example, the method may include collecting organizational data that is generated via one or more applications associated with an organization, aggregating the collected organizational data with previously collected organizational data to generate aggregated organizational training data, training an organizational-based language model for speech processing based on the aggregated organizational training data, and storing the trained organizational-based language model.
    Type: Application
    Filed: June 27, 2019
    Publication date: November 5, 2020
    Inventors: Ziad AL BAWAB, Anand U DESAI, Cem AKSOYLAR, Michael LEVIT, Xin MENG, Shuangyu CHANG, Suyash CHOUDHURY, Dhiresh RAWAL, Tao LI, Rishi GIRISH, Marcus JAGER, Ananth Rampura SHESHAGIRI RAO
  • Patent number: 10497367
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: December 3, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
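    Purely illustrative: one way such a "hint" could be expressed as structured data before being sent to a recognition service provider, and how per-component scores might be combined according to it. The field names, component names, and weights are assumptions, not the format used by any real service.

```python
# A hypothetical hint listing the desired language modeling components and
# per-domain weights, plus a weighted combination of component scores.
hint = {
    "components": ["dictation-general", "contacts", "calendar"],
    "domain_weights": {"dictation-general": 0.6, "contacts": 0.25, "calendar": 0.15},
}

def combine_components(component_scores, hint):
    # Weighted combination of per-component log scores according to the hint.
    weights = hint["domain_weights"]
    return sum(weights[name] * score for name, score in component_scores.items())

print(combine_components({"dictation-general": -4.1, "contacts": -2.3, "calendar": -6.0}, hint))
```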
  • Patent number: 10192545
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: January 29, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
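    A minimal sketch of the data-selection idea in this abstract, assuming simple TF-IDF features and cosine similarity; the actual feature vectors, threshold, and classifier in the patent may differ.

```python
# Filter a typed corpus against a spoken corpus to build an "unspeakable"
# corpus, then train a classifier to separate spoken-like from unspeakable text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

spoken = ["call mom tomorrow morning", "what is the weather today"]
typed = ["weather today please", "asdf qwerty ##", "http://example.com?q=1"]

vectorizer = TfidfVectorizer().fit(spoken + typed)
spoken_vecs = vectorizer.transform(spoken)
typed_vecs = vectorizer.transform(typed)

# Remove typed items that resemble the spoken corpus; what remains is the
# "unspeakable" corpus used as negative training data.
SIMILARITY_THRESHOLD = 0.5  # assumed value for illustration
similarities = cosine_similarity(typed_vecs, spoken_vecs).max(axis=1)
unspeakable = [t for t, s in zip(typed, similarities) if s < SIMILARITY_THRESHOLD]

# Train a classifier for discriminative data selection.
texts = spoken + unspeakable
labels = [1] * len(spoken) + [0] * len(unspeakable)
clf = LogisticRegression().fit(vectorizer.transform(texts), labels)
print(clf.predict(vectorizer.transform(["weather tomorrow", "zzzz ### qwerty"])))
```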
  • Patent number: 9972311
    Abstract: Systems and methods are provided for optimizing language models for in-domain applications through an iterative, joint-modeling approach that expresses training material as alternative representations of higher-level tokens, such as named entities and carrier phrases. From a first language model, an in-domain training corpus may be represented as a set of alternative parses of tokens. Statistical information determined from these parsed representations may be used to produce a second (or updated) language model, which is further optimized for the domain. The second language model may be used to determine another alternative parsed representation of the corpus for a next iteration, and the statistical information determined from this representation may be used to produce a third (or further updated) language model. Through each iteration, a language model may be determined that is further optimized for the domain.
    Type: Grant
    Filed: May 7, 2014
    Date of Patent: May 15, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke
  • Patent number: 9922654
    Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: March 20, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
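    A sketch of the control flow only, with hypothetical decoder objects; real decoders, the hypothesis-combination step, and the benefit estimate are substantially more involved than this simplification.

```python
# Run decoders in series and stop once further decoding is unlikely to help.
from dataclasses import dataclass

@dataclass
class DecodeResult:
    text: str
    confidence: float  # 0..1

def recognize_incrementally(audio, decoders, target_confidence=0.9):
    """Engage decoders in order of expected benefit; stop once additional
    decoding is unlikely to improve the combined result."""
    combined = None
    for decoder in decoders:
        result = decoder(audio)
        if combined is None or result.confidence > combined.confidence:
            combined = result  # stand-in for a real hypothesis-combination step
        if combined.confidence >= target_confidence:
            break  # further decoders are unlikely to add significant benefit
    return combined

if __name__ == "__main__":
    fast = lambda audio: DecodeResult("call bob", 0.85)
    slow = lambda audio: DecodeResult("call rob", 0.93)
    print(recognize_incrementally(b"...", [fast, slow]))
```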
  • Publication number: 20170270912
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Application
    Filed: June 5, 2017
    Publication date: September 21, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
  • Patent number: 9761220
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Grant
    Filed: May 13, 2015
    Date of Patent: September 12, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin
  • Patent number: 9734826
    Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: August 15, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
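    For illustration, a generic expectation-maximization loop for tuning linear-interpolation weights of component language models on held-out text; the context-specific, class-based interpolation described in the abstract is more elaborate than this sketch, and the probabilities below are made up.

```python
# Optimize mixture weights for linearly interpolated component LMs with EM.
def optimize_interpolation_weights(component_probs, iterations=20):
    """component_probs: list of rows, one per held-out word; each row holds the
    probability that every component LM assigns to that word."""
    num_components = len(component_probs[0])
    weights = [1.0 / num_components] * num_components
    for _ in range(iterations):
        expected_counts = [0.0] * num_components
        for row in component_probs:
            mixture = sum(w * p for w, p in zip(weights, row))
            for k in range(num_components):
                expected_counts[k] += weights[k] * row[k] / mixture
        total = sum(expected_counts)
        weights = [c / total for c in expected_counts]
    return weights

if __name__ == "__main__":
    # Two component LMs scoring four held-out words (made-up probabilities).
    dev_probs = [(0.02, 0.001), (0.01, 0.003), (0.004, 0.02), (0.015, 0.002)]
    print(optimize_interpolation_weights(dev_probs))
```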
  • Patent number: 9672202
    Abstract: Various components provide options to re-format an input based on one or more contexts. The input is received that has been submitted to an application (e.g., messaging application, mobile application, word-processing application, web browser, search tool, etc.), and one or more outputs are identified that are possibilities to be provided as options for re-formatting. A respective score of each output is determined by applying a statistical model to a respective combination of the input and each output, the respective score comprising a plurality of context scores that quantify a plurality of contexts of the respective combination. Exemplary contexts include historical-user contexts, domain contexts, and general contexts. One or more suggested outputs are selected from among the one or more outputs based on the respective scores and are provided as options to re-format the input.
    Type: Grant
    Filed: March 20, 2014
    Date of Patent: June 6, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Issac Alphonso, Nick Kibre, Michael Levit, Sarangarajan Parthasarathy
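    An illustrative sketch of combining several context scores into a single ranking score for candidate re-formattings; the context names, weights, and candidate scores are assumptions rather than the patented statistical model.

```python
# Rank candidate re-formattings of an input by a weighted sum of context scores.
def combined_score(context_scores, weights):
    # context_scores: e.g. {"historical_user": ..., "domain": ..., "general": ...}
    return sum(weights[name] * score for name, score in context_scores.items())

def suggest_reformatting(input_text, candidates, weights):
    # candidates: mapping from candidate output to its per-context scores.
    ranked = sorted(candidates.items(),
                    key=lambda item: combined_score(item[1], weights),
                    reverse=True)
    return [text for text, _ in ranked]

weights = {"historical_user": 0.5, "domain": 0.3, "general": 0.2}
candidates = {
    "3:00 PM": {"historical_user": 0.9, "domain": 0.8, "general": 0.6},
    "three pm": {"historical_user": 0.2, "domain": 0.3, "general": 0.7},
}
print(suggest_reformatting("three p m", candidates, weights))
```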
  • Publication number: 20170103753
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Application
    Filed: December 22, 2016
    Publication date: April 13, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
  • Publication number: 20170092275
    Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
    Type: Application
    Filed: December 13, 2016
    Publication date: March 30, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
  • Patent number: 9552817
    Abstract: An incremental speech recognition system. The incremental speech recognition system incrementally decodes a spoken utterance using an additional utterance decoder only when the additional utterance decoder is likely to add significant benefit to the combined result. The available utterance decoders are ordered in a series based on accuracy, performance, diversity, and other factors. A recognition management engine coordinates decoding of the spoken utterance by the series of utterance decoders, combines the decoded utterances, and determines whether additional processing is likely to significantly improve the recognition result. If so, the recognition management engine engages the next utterance decoder and the cycle continues. If the accuracy cannot be significantly improved, the result is accepted and decoding stops.
    Type: Grant
    Filed: March 19, 2014
    Date of Patent: January 24, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuangyu Chang, Michael Levit, Abhik Lahiri, Barlas Oguz, Benoit Dumoulin
  • Patent number: 9542931
    Abstract: On a computing device a speech utterance is received from a user. The speech utterance is a section of a speech dialog that includes a plurality of speech utterances. One or more features from the speech utterance are identified. Each identified feature from the speech utterance is a specific characteristic of the speech utterance. One or more features from the speech dialog are identified. Each identified feature from the speech dialog is associated with one or more events in the speech dialog. The one or more events occur prior to the speech utterance. One or more identified features from the speech utterance and one or more identified features from the speech dialog are used to calculate a confidence score for the speech utterance.
    Type: Grant
    Filed: October 23, 2014
    Date of Patent: January 10, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Bruce Melvin Buntschuh
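    A hypothetical sketch of combining utterance-level features with dialog-history features into a single confidence score via a logistic function; the feature names and weights are made up for illustration.

```python
# Combine per-utterance and per-dialog features into one confidence value.
import math

def confidence_score(utterance_features, dialog_features, weights, bias=0.0):
    features = {**utterance_features, **dialog_features}
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # squash to a 0..1 confidence

utterance_features = {"acoustic_score": 0.7, "num_alternatives": 3}
dialog_features = {"prior_reprompts": 1, "prior_confirmations": 2}
weights = {"acoustic_score": 2.0, "num_alternatives": -0.3,
           "prior_reprompts": -0.8, "prior_confirmations": 0.5}
print(confidence_score(utterance_features, dialog_features, weights))
```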
  • Patent number: 9529794
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Grant
    Filed: March 27, 2014
    Date of Patent: December 27, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
  • Publication number: 20160336006
    Abstract: A computer system for language modeling may collect training data from one or more information sources, generate a spoken corpus containing text of transcribed speech, and generate a typed corpus containing typed text. The computer system may derive feature vectors from the spoken corpus, analyze the typed corpus to determine feature vectors representing items of typed text, and generate an unspeakable corpus by filtering the typed corpus to remove each item of typed text represented by a feature vector that is within a similarity threshold of a feature vector derived from the spoken corpus. The computer system may derive feature vectors from the unspeakable corpus and train a classifier to perform discriminative data selection for language modeling based on the feature vectors derived from the spoken corpus and the feature vectors derived from the unspeakable corpus.
    Type: Application
    Filed: May 13, 2015
    Publication date: November 17, 2016
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Shuangyu Chang, Benoit Dumoulin