Patents by Inventor Sarangarajan Parthasarathy

Sarangarajan Parthasarathy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240135919
    Abstract: Systems and methods are provided for accessing a factorized neural transducer comprising a first set of layers for predicting blank tokens and a second set of layers for predicting vocabulary tokens, the second set of layers comprising a language model that includes a vocabulary predictor which is a separate predictor from the blank predictor, wherein a vocabulary predictor output from the vocabulary predictor and the encoder output are used for predicting a vocabulary token. The second set of layers is selectively modified to facilitate an improvement in an accuracy of the factorized neural transducer in performing automatic speech recognition, the selectively modifying comprising applying a particular modification to the second set of layers while refraining from applying the particular modification to the first set of layers.
    Type: Application
    Filed: November 9, 2022
    Publication date: April 25, 2024
    Inventors: Rui ZHAO, Jian XUE, Sarangarajan PARTHASARATHY, Jinyu LI
  • Publication number: 20230297606
    Abstract: Generally discussed herein are devices, systems, and methods for multi-lingual model generation. A method can include determining, for low-resource languages, a respective language similarity value indicating language similarity between each of the low-resource languages, clustering the low-resource languages into groups based on the respective language similarity value, aggregating training data of languages corresponding to a given group resulting in aggregated training data, and training a re-ranking language model based on the aggregated training data resulting in a trained re-ranking language model.
    Type: Application
    Filed: June 14, 2022
    Publication date: September 21, 2023
    Inventors: Li MIAO, Jian WU, Shuangyu CHANG, Piyush BEHRE, Sarangarajan PARTHASARATHY
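The clustering and aggregation steps in the abstract above can be illustrated with a minimal sketch. It is not the method claimed in the application: the similarity values, the 0.6 threshold, the single-link grouping rule, and the names `cluster_languages` and `aggregate_training_data` are all assumptions made for illustration, and the final step of training the re-ranking language model is omitted.

```python
def cluster_languages(languages, similarity, threshold=0.6):
    """Group languages whose pairwise similarity meets the threshold.

    `similarity` maps frozenset({a, b}) to a score; missing pairs count as 0.0.
    Single-link grouping via union-find is an assumption of this sketch.
    """
    parent = {lang: lang for lang in languages}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(languages):
        for b in languages[i + 1:]:
            if similarity.get(frozenset((a, b)), 0.0) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for lang in languages:
        groups.setdefault(find(lang), []).append(lang)
    return sorted(groups.values())


def aggregate_training_data(groups, corpora):
    """Concatenate the per-language corpora of each group into one training set."""
    return [[sent for lang in group for sent in corpora[lang]] for group in groups]


# Hypothetical similarity scores for four language codes.
similarity = {
    frozenset(("da", "nb")): 0.9,
    frozenset(("da", "sv")): 0.7,
    frozenset(("nb", "sv")): 0.8,
}
corpora = {"da": ["hej"], "nb": ["hei"], "sv": ["hej", "tack"], "sw": ["jambo"]}
groups = cluster_languages(["da", "nb", "sv", "sw"], similarity)
pooled = aggregate_training_data(groups, corpora)
```

Each pooled list would then serve as the training set for one group-level re-ranking model.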
  • Patent number: 11527238
    Abstract: A computer device is provided that includes one or more processors configured to receive an end-to-end (E2E) model that has been trained for automatic speech recognition with training data from a source-domain, and receive an external language model that has been trained with training data from a target-domain. The one or more processors are configured to perform an inference of the probability of an output token sequence given a sequence of input speech features. Performing the inference includes computing an E2E model score, computing an external language model score, and computing an estimated internal language model score for the E2E model. The estimated internal language model score is computed by removing a contribution of an intrinsic acoustic model. The processor is further configured to compute an integrated score based at least on the E2E model score, the external language model score, and the estimated internal language model score.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: December 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhong Meng, Sarangarajan Parthasarathy, Xie Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong
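One way to picture the integrated score described in the abstract above is a log-linear combination in which the estimated internal language model score is subtracted. The abstract only says the integrated score is based on the three scores, so the subtraction, the weights, and the names `integrated_score` and `rescore` are assumptions of this sketch rather than the patented formula.

```python
def integrated_score(e2e_logp, ext_lm_logp, ilm_logp, ext_weight=0.5, ilm_weight=0.3):
    """Combine three log-probabilities for one hypothesis.

    Assumed form: E2E score, plus a weighted external LM score,
    minus a weighted estimate of the E2E model's internal LM score.
    """
    return e2e_logp + ext_weight * ext_lm_logp - ilm_weight * ilm_logp


def rescore(hypotheses, ext_weight=0.5, ilm_weight=0.3):
    """Return the text of the hypothesis with the highest integrated score.

    Each hypothesis is (text, e2e_logp, ext_lm_logp, ilm_logp).
    """
    best = max(
        hypotheses,
        key=lambda h: integrated_score(h[1], h[2], h[3], ext_weight, ilm_weight),
    )
    return best[0]


# Hypothetical scores: the second hypothesis has the better E2E score,
# but the target-domain external LM strongly prefers the first.
hypotheses = [
    ("play jazz", -4.0, -2.0, -3.0),
    ("play chess", -3.8, -6.0, -2.0),
]
best_text = rescore(hypotheses)
```

The subtraction is meant to discount what the E2E model learned about word sequences in the source domain, so the target-domain external model is not counted on top of it.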
  • Publication number: 20220139380
    Abstract: A computer device is provided that includes one or more processors configured to receive an end-to-end (E2E) model that has been trained for automatic speech recognition with training data from a source-domain, and receive an external language model that has been trained with training data from a target-domain. The one or more processors are configured to perform an inference of the probability of an output token sequence given a sequence of input speech features. Performing the inference includes computing an E2E model score, computing an external language model score, and computing an estimated internal language model score for the E2E model. The estimated internal language model score is computed by removing a contribution of an intrinsic acoustic model. The processor is further configured to compute an integrated score based at least on the E2E model score, the external language model score, and the estimated internal language model score.
    Type: Application
    Filed: January 21, 2021
    Publication date: May 5, 2022
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Zhong MENG, Sarangarajan PARTHASARATHY, Xie SUN, Yashesh GAUR, Naoyuki KANDA, Liang LU, Xie CHEN, Rui ZHAO, Jinyu LI, Yifan GONG
  • Patent number: 10607600
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: March 31, 2020
    Assignee: Nuance Communications, Inc.
    Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
  • Patent number: 10529326
    Abstract: Techniques are described herein that are capable of suggesting intent frame(s) for user request(s). For instance, the intent frame(s) may be suggested to elicit a request from a user. An intent frame is a natural language phrase (e.g., a sentence) that includes at least one carrier phrase and at least one slot. A slot in an intent frame is a placeholder that is identified as being replaceable by one or more words that identify an entity and/or an action to indicate an intent of the user. A carrier phrase in an intent frame includes one or more words that suggest a type of entity and/or action that is to be identified by the one or more words that may replace the corresponding slot. In accordance with these techniques, the intent frame(s) are suggested in response to determining that natural language functionality of a processing system is activated.
    Type: Grant
    Filed: January 13, 2017
    Date of Patent: January 7, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shane J. Landry, Anne K. Sullivan, Lisa J. Stifelman, Adam D. Elman, Larry Paul Heck, Sarangarajan Parthasarathy
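The intent frame described in the abstract above, a carrier phrase with replaceable slots, can be modeled as a template string. This is only an illustrative sketch: the brace syntax, the example frame, and the helper names `slot_names` and `fill_frame` are assumptions, not anything specified in the patent.

```python
from string import Formatter


def slot_names(frame):
    """List the slot placeholders in a frame, in order of appearance."""
    return [name for _, name, _, _ in Formatter().parse(frame) if name]


def fill_frame(frame, **values):
    """Replace each slot with the words supplied by the user."""
    return frame.format(**values)


# "Remind me to ... at ..." is the carrier phrase; {action} and {time} are slots.
frame = "Remind me to {action} at {time}"
names = slot_names(frame)
request = fill_frame(frame, action="call Sam", time="5 pm")
```

Here the carrier phrase hints at what kind of entity or action each slot expects, which is the role the abstract assigns to it.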
  • Patent number: 10497367
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: December 3, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
  • Publication number: 20190073994
    Abstract: This disclosure generally relates to a speech pronunciation generation system. The speech pronunciation generation system may be included with or otherwise interact with a speech recognition system, a speech synthesis system, or a combination thereof. The speech pronunciation generation system receives contextual information associated with a named entity, a determined pronunciation of the named entity and feedback associated with the pronunciation. This information may be used to update a pronunciation score associated with the pronunciation. The speech pronunciation generation system may also provide suggested pronunciations of named entities to the input recognition system.
    Type: Application
    Filed: September 5, 2017
    Publication date: March 7, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Sarangarajan PARTHASARATHY, Osama Mohamad Ahmed ABUELSOROUR
  • Patent number: 10176219
    Abstract: Methods and systems are provided for providing alternative query suggestions. For example, a spoken natural language expression may be received and converted to a textual query by a speech recognition component. The spoken natural language expression may include one or more words, terms, and/or phrases. A phonetically confusable segment of the textual query may be identified by a classifier component. The classifier component may determine at least one alternative query based on identifying at least the phonetically confusable segment of the textual query. The classifier may further determine whether to suggest the at least one alternative query based on whether the at least one alternative query is sensical and/or useful. When it is determined to suggest the at least one alternative query, the at least one alternative query may be provided to and displayed on a user interface display.
    Type: Grant
    Filed: March 13, 2015
    Date of Patent: January 8, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Benoit Dumoulin, Ali Ahmadi, Sarangarajan Parthasarathy, Nick Craswell, Umut Ozertem, Milad Shokouhi, Karthik Raghunathan, Rosie Jones
  • Publication number: 20180166070
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
    Type: Application
    Filed: February 12, 2018
    Publication date: June 14, 2018
    Inventors: Sarangarajan PARTHASARATHY, Richard Cameron ROSE
  • Patent number: 9972311
    Abstract: Systems and methods are provided for optimizing language models for in-domain applications through an iterative, joint-modeling approach that expresses training material as alternative representations of higher-level tokens, such as named entities and carrier phrases. From a first language model, an in-domain training corpus may be represented as a set of alternative parses of tokens. Statistical information determined from these parsed representations may be used to produce a second (or updated) language model, which is further optimized for the domain. The second language model may be used to determine another alternative parsed representation of the corpus for a next iteration, and the statistical information determined from this representation may be used to produce a third (or further updated) language model. Through each iteration, a language model may be determined that is further optimized for the domain.
    Type: Grant
    Filed: May 7, 2014
    Date of Patent: May 15, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke
  • Patent number: 9947317
    Abstract: A new pronunciation learning system for dynamically learning new pronunciations assisted by user correction logs. The user correction logs provide a record of speech recognition events and subsequent user behavior that implicitly confirms or rejects the recognition result and/or shows the user's intended words via subsequent input. The system analyzes the correction logs and distills them down to a set of words which lack acceptable pronunciations. Hypothetical pronunciations, constrained by spelling and other linguistic knowledge, are generated for each of the words. Offline recognition determines the hypothetical pronunciations with a good acoustical match to the audio data likely to contain the words. The matching pronunciations are aggregated and adjudicated to select new pronunciations for the words to improve general or personalized recognition models.
    Type: Grant
    Filed: February 13, 2017
    Date of Patent: April 17, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nicholas Kibre, Umut Ozertem, Sarangarajan Parthasarathy, Ziad Al Bawab
  • Patent number: 9892728
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
    Type: Grant
    Filed: March 16, 2016
    Date of Patent: February 13, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
  • Patent number: 9824682
    Abstract: A method, apparatus and machine-readable medium are provided. A phonotactic grammar is utilized to perform speech recognition on received speech and to generate a phoneme lattice. A document shortlist is generated based on using the phoneme lattice to query an index. A grammar is generated from the document shortlist. Data for each of at least one input field is identified based on the received speech and the generated grammar.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: November 21, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Cyril Georges Luc Allauzen, Sarangarajan Parthasarathy
  • Patent number: 9734826
    Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: August 15, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
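The idea of optimizing interpolation weights for a target domain can be sketched with the classic expectation-maximization update for context-independent linear interpolation. This is a simpler stand-in, not the patent's procedure: it works on per-token component probabilities rather than on alternative parses, and the function name, input layout, and iteration count are assumptions.

```python
def em_interpolation_weights(component_probs, iterations=20):
    """Estimate mixture weights for component LMs on in-domain text.

    `component_probs` holds one tuple per token; entry k is the probability
    that component LM k assigns to that token given its history.
    All probabilities are assumed to be strictly positive.
    """
    k = len(component_probs[0])
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(iterations):
        totals = [0.0] * k
        for probs in component_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            for i, (w, p) in enumerate(zip(weights, probs)):
                # E-step: posterior that component i produced this token.
                totals[i] += w * p / mix
        # M-step: new weight is the average posterior over all tokens.
        weights = [t / len(component_probs) for t in totals]
    return weights


# Hypothetical probabilities from two component LMs on three in-domain tokens.
in_domain = [(0.5, 0.1), (0.4, 0.1), (0.6, 0.2)]
weights = em_interpolation_weights(in_domain)
```

Because the first component assigns higher probability to every token in this toy data, its weight grows with each iteration. In the patent's setting the posteriors are computed over parses and the weights may also be context-specific.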
  • Patent number: 9672202
    Abstract: Various components provide options to re-format an input based on one or more contexts. The input is received that has been submitted to an application (e.g., messaging application, mobile application, word-processing application, web browser, search tool, etc.), and one or more outputs are identified that are possibilities to be provided as options for re-formatting. A respective score of each output is determined by applying a statistical model to a respective combination of the input and each output, the respective score comprising a plurality of context scores that quantify a plurality of contexts of the respective combination. Exemplary contexts include historical-user contexts, domain contexts, and general contexts. One or more suggested outputs are selected from among the one or more outputs based on the respective scores and are provided as options to re-format the input.
    Type: Grant
    Filed: March 20, 2014
    Date of Patent: June 6, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Issac Alphonso, Nick Kibre, Michael Levit, Sarangarajan Parthasarathy
  • Publication number: 20170154623
    Abstract: A new pronunciation learning system for dynamically learning new pronunciations assisted by user correction logs. The user correction logs provide a record of speech recognition events and subsequent user behavior that implicitly confirms or rejects the recognition result and/or shows the user's intended words via subsequent input. The system analyzes the correction logs and distills them down to a set of words which lack acceptable pronunciations. Hypothetical pronunciations, constrained by spelling and other linguistic knowledge, are generated for each of the words. Offline recognition determines the hypothetical pronunciations with a good acoustical match to the audio data likely to contain the words. The matching pronunciations are aggregated and adjudicated to select new pronunciations for the words to improve general or personalized recognition models.
    Type: Application
    Filed: February 13, 2017
    Publication date: June 1, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Nicholas Kibre, Umut Ozertem, Sarangarajan Parthasarathy, Ziad Al Bawab
  • Publication number: 20170125018
    Abstract: Techniques are described herein that are capable of suggesting intent frame(s) for user request(s). For instance, the intent frame(s) may be suggested to elicit a request from a user. An intent frame is a natural language phrase (e.g., a sentence) that includes at least one carrier phrase and at least one slot. A slot in an intent frame is a placeholder that is identified as being replaceable by one or more words that identify an entity and/or an action to indicate an intent of the user. A carrier phrase in an intent frame includes one or more words that suggest a type of entity and/or action that is to be identified by the one or more words that may replace the corresponding slot. In accordance with these techniques, the intent frame(s) are suggested in response to determining that natural language functionality of a processing system is activated.
    Type: Application
    Filed: January 13, 2017
    Publication date: May 4, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Shane J. Landry, Anne K. Sullivan, Lisa J. Stifelman, Adam D. Elman, Larry Paul Heck, Sarangarajan Parthasarathy
  • Publication number: 20170103753
    Abstract: The customization of language modeling components for speech recognition is provided. A list of language modeling components may be made available by a computing device. A hint may then be sent to a recognition service provider for combining the multiple language modeling components from the list. The hint may be based on a number of different domains. A customized combination of the language modeling components based on the hint may then be received from the recognition service provider.
    Type: Application
    Filed: December 22, 2016
    Publication date: April 13, 2017
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Hernan Guelman, Shuangyu Chang, Sarangarajan Parthasarathy, Benoit Dumoulin
  • Patent number: 9589562
    Abstract: A new pronunciation learning system for dynamically learning new pronunciations assisted by user correction logs. The user correction logs provide a record of speech recognition events and subsequent user behavior that implicitly confirms or rejects the recognition result and/or shows the user's intended words via subsequent input. The system analyzes the correction logs and distills them down to a set of words which lack acceptable pronunciations. Hypothetical pronunciations, constrained by spelling and other linguistic knowledge, are generated for each of the words. Offline recognition determines the hypothetical pronunciations with a good acoustical match to the audio data likely to contain the words. The matching pronunciations are aggregated and adjudicated to select new pronunciations for the words to improve general or personalized recognition models.
    Type: Grant
    Filed: February 21, 2014
    Date of Patent: March 7, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nicholas Kibre, Umut Ozertem, Sarangarajan Parthasarathy, Ziad Al Bawab