Patents by Inventor Satarupa Guha

Satarupa Guha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11978434
    Abstract: A computer-implemented technique identifies terms in an original reference transcription and original ASR output results that are considered valid variants of each other, even though these terms have different textual forms. Based on this finding, the technique produces a normalized reference transcription and normalized ASR output results in which valid variants are assigned the same textual form. In some implementations, the technique uses the normalized text to develop a model for an ASR system. For example, the technique may generate a word error rate (WER) measure by comparing the normalized reference transcription with the normalized ASR output results, and use the WER measure as guidance in developing the model. Some aspects of the technique involve identifying occasions in which a term can be properly split into component parts. Other aspects can identify other ways in which two terms may vary in spelling, but nonetheless remain valid variants.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: May 7, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Satarupa Guha, Ankur Gupta, Rahul Ambavat, Rupeshkumar Rasiklal Mehta
  • Publication number: 20230094511
    Abstract: A computer-implemented technique identifies terms in an original reference transcription and original ASR output results that are considered valid variants of each other, even though these terms have different textual forms. Based on this finding, the technique produces a normalized reference transcription and normalized ASR output results in which valid variants are assigned the same textual form. In some implementations, the technique uses the normalized text to develop a model for an ASR system. For example, the technique may generate a word error rate (WER) measure by comparing the normalized reference transcription with the normalized ASR output results, and use the WER measure as guidance in developing the model. Some aspects of the technique involve identifying occasions in which a term can be properly split into component parts. Other aspects can identify other ways in which two terms may vary in spelling, but nonetheless remain valid variants.
    Type: Application
    Filed: September 29, 2021
    Publication date: March 30, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Satarupa Guha, Ankur Gupta, Rahul Ambavat, Rupeshkumar Rasiklal Mehta
  • Publication number: 20220358910
    Abstract: A computing system obtains features that have been extracted from an acoustic signal, where the acoustic signal comprises spoken words uttered by a user. The computing system performs automatic speech recognition (ASR) based upon the features and a language model (LM) generated based upon expanded pattern data. The expanded pattern data includes a name of an entity and a search term, where the entity belongs to a segment identified in a knowledge base. The search term has been included in queries for entities belonging to the segment. The computing system identifies a sequence of words corresponding to the features based upon results of the ASR. The computing system transmits computer-readable text to a search engine, where the text includes the sequence of words.
    Type: Application
    Filed: May 6, 2021
    Publication date: November 10, 2022
    Inventors: Ankur Gupta, Satarupa Guha, Rupeshkumar Rasiklal Mehta, Issac John Alphonso, Anastasios Anastasakos, Shuangyu Chang
  • Patent number: 10216724
    Abstract: Performing semantic analysis on a user-generated text string includes training a neural network model with a plurality of known text strings to obtain a first distributed vector representation of the known text strings and a second distributed vector representation of a plurality of words in the known text strings, computing a relevance matrix of the first and second distributed representations based on a cosine distance between each of the plurality of words and the plurality of known text strings, and performing a latent Dirichlet allocation (LDA) operation using the relevance matrix as an input to obtain a distribution of topics associated with the plurality of known text strings.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: February 26, 2019
    Assignee: Conduent Business Services, LLC
    Inventors: Manjira Sinha, Tridib Mukherjee, Preethy Varma, Satarupa Guha
  • Publication number: 20180293978
    Abstract: Performing semantic analysis on a user-generated text string includes training a neural network model with a plurality of known text strings to obtain a first distributed vector representation of the known text strings and a second distributed vector representation of a plurality of words in the known text strings, computing a relevance matrix of the first and second distributed representations based on a cosine distance between each of the plurality of words and the plurality of known text strings, and performing a latent Dirichlet allocation (LDA) operation using the relevance matrix as an input to obtain a distribution of topics associated with the plurality of known text strings.
    Type: Application
    Filed: April 7, 2017
    Publication date: October 11, 2018
    Inventors: Manjira Sinha, Tridib Mukherjee, Preethy Varma, Satarupa Guha
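
Illustrative note on patent 11978434: the abstract describes computing a word error rate (WER) after valid variants in the reference transcription and the ASR output have been given the same textual form. The following is a minimal sketch of that idea only, not the patented method. The variant table, the function names `normalize` and `wer`, and the sample sentences are all hypothetical; the patent describes identifying valid variants automatically, whereas this sketch assumes a hand-written lookup table.

```python
def normalize(tokens, variants):
    """Map each token to a canonical form using a hypothetical variant table.

    A canonical form may contain several words (e.g. "health care"),
    which covers the case of a term being split into component parts.
    """
    out = []
    for tok in tokens:
        canonical = variants.get(tok.lower(), tok.lower())
        out.extend(canonical.split())
    return out


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    n, m = len(reference), len(hypothesis)
    prev = list(range(m + 1))  # distances for an empty reference prefix
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[m] / max(n, 1)


# Hypothetical variant table (an assumption made for this sketch).
VARIANTS = {"healthcare": "health care", "okay": "ok"}

reference = "the healthcare plan is ok".split()
asr_output = "the health care plan is okay".split()

raw_wer = wer(reference, asr_output)
normalized_wer = wer(normalize(reference, VARIANTS), normalize(asr_output, VARIANTS))
print(raw_wer, normalized_wer)
```

In this toy case the unnormalized comparison counts three word errors against a five-word reference, while the normalized comparison counts none, which is the kind of difference the abstract says can guide model development.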