Patents by Inventor Mitchel Weintraub

Mitchel Weintraub has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200134466
    Abstract: Aspects of the present disclosure enable humanly-specified relationships to contribute to a mapping that enables compression of the output structure of a machine-learned model. An exponential model such as a maximum entropy model can leverage a machine-learned embedding and the mapping to produce a classification output. In such fashion, the feature discovery capabilities of machine-learned models (e.g., deep networks) can be synergistically combined with relationships developed based on human understanding of the structural nature of the problem to be solved, thereby enabling compression of model output structures without significant loss of accuracy. These compressed models provide improved applicability to “on device” or other resource-constrained scenarios.
    Type: Application
    Filed: October 16, 2019
    Publication date: April 30, 2020
    Inventors: Mitchel Weintraub, Ananda Theertha Suresh, Ehsan Variani
  • Patent number: 9286894
    Abstract: Recognition techniques may include the following. On a first processing entity, a first recognition process is performed on a first element, where the first recognition process includes: in a first state machine having M (M>1) states, determining a first best path cost in at least a subset of the M states for at least part of the first element. On a second processing entity, a second recognition process is performed on a second element, where the second recognition process includes: in a second state machine having N (N>1) states, determining a second best path cost in at least a subset of the N states for at least part of the second element. At least one of the following is done: (i) passing the first best path cost to the second state machine, or (ii) passing the second best path cost to the first state machine. The foregoing techniques may include one or more of the following features, either alone or in combination.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: March 15, 2016
    Assignee: Google Inc.
    Inventor: Mitchel Weintraub
  • Patent number: 9123331
    Abstract: Respective word frequencies may be determined from a corpus of utterance-to-text-string mappings that contain associations between audio utterances and a respective text string transcription of each audio utterance. Respective compressed word frequencies may be obtained based on the respective word frequencies such that the distribution of the respective compressed word frequencies has a lower variance than the distribution of the respective word frequencies. Sample utterance-to-text-string mappings may be selected from the corpus of utterance-to-text-string mappings based on the compressed word frequencies. An automatic speech recognition (ASR) system may be trained with the sample utterance-to-text-string mappings.
    Type: Grant
    Filed: August 15, 2013
    Date of Patent: September 1, 2015
    Assignee: Google Inc.
    Inventors: Brian Strope, Mitchel Weintraub
  • Patent number: 8775177
    Abstract: A speech recognition process may perform the following operations: performing a preliminary recognition process on first audio to identify candidates for the first audio; generating first templates corresponding to the first audio, where each first template includes a number of elements; selecting second templates corresponding to the candidates, where the second templates represent second audio, and where each second template includes elements that correspond to the elements in the first templates; comparing the first templates to the second templates, where comparing comprises determining similarity metrics between the first templates and corresponding second templates; applying weights to the similarity metrics to produce weighted similarity metrics, where the weights are associated with corresponding second templates; and using the weighted similarity metrics to determine whether the first audio corresponds to the second audio.
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: July 8, 2014
    Assignee: Google Inc.
    Inventors: Georg Heigold, Patrick An Phu Nguyen, Mitchel Weintraub, Vincent O. Vanhoucke
  • Patent number: 8543398
    Abstract: Respective word frequencies may be determined from a corpus of utterance-to-text-string mappings that contain associations between audio utterances and a respective text string transcription of each audio utterance. Respective compressed word frequencies may be obtained based on the respective word frequencies such that the distribution of the respective compressed word frequencies has a lower variance than the distribution of the respective word frequencies. Sample utterance-to-text-string mappings may be selected from the corpus of utterance-to-text-string mappings based on the compressed word frequencies. An automatic speech recognition (ASR) system may be trained with the sample utterance-to-text-string mappings.
    Type: Grant
    Filed: November 1, 2012
    Date of Patent: September 24, 2013
    Assignee: Google Inc.
    Inventors: Brian Strope, Mitchel Weintraub
  • Patent number: 7280963
    Abstract: A computerized method is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The method includes graphing sets of initial pronunciations; thereafter in an ASR subsystem determining a highest-scoring set of initial pronunciations; generating sets of alternate pronunciations, wherein each set of alternate pronunciations includes the highest-scoring set of initial pronunciations with a lowest-probability phone of the highest-scoring initial pronunciation substituted with a unique-substitute phone; graphing the sets of alternate pronunciations; determining in the ASR subsystem a highest-scoring set of alternate pronunciations; and adding to a pronunciation dictionary the highest-scoring set of alternate pronunciations.
    Type: Grant
    Filed: September 12, 2003
    Date of Patent: October 9, 2007
    Assignee: Nuance Communications, Inc.
    Inventors: Francoise Beaufays, Ananth Sankar, Mitchel Weintraub, Shaun Williams
  • Patent number: 7266495
    Abstract: A computerized pronunciation system is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The system includes a word list including at least one word; transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform; a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including: sets of initial pronunciations of the word, a scoring module configured to score pronunciations and to generate phone probabilities, and a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and a pronunciation dictionary configured to receive the highest-scoring set of initial pronunciations and the set of alternate pronunciations.
    Type: Grant
    Filed: September 12, 2003
    Date of Patent: September 4, 2007
    Assignee: Nuance Communications, Inc.
    Inventors: Francoise Beaufays, Ananth Sankar, Mitchel Weintraub, Shaun Williams
  • Patent number: 6804640
    Abstract: A method and apparatus for generating a noise-reduced feature vector representing human speech are provided. Speech data representing an input speech waveform are first input and filtered. Spectral energies of the filtered speech data are determined, and a noise reduction process is then performed. In the noise reduction process, a spectral magnitude is computed for a frequency index of multiple frequency indexes. A noise magnitude estimate is then determined for the frequency index by updating a histogram of spectral magnitude, and then determining the noise magnitude estimate as a predetermined percentile of the histogram. A signal-to-noise ratio is then determined for the frequency index. A scale factor is computed for the frequency index, as a function of the signal-to-noise ratio and the noise magnitude estimate. The noise magnitude estimate is then scaled by the scale factor.
    Type: Grant
    Filed: February 29, 2000
    Date of Patent: October 12, 2004
    Assignee: Nuance Communications
    Inventors: Mitchel Weintraub, Francoise Beaufays
  • Patent number: 6226611
    Abstract: Pronunciation quality is automatically evaluated for an utterance of speech based on one or more pronunciation scores. One type of pronunciation score is based on duration of acoustic units. Examples of acoustic units include phones and syllables. Another type of pronunciation score is based on a posterior probability that a piece of input speech corresponds to a certain model such as an HMM, given the piece of input speech. Speech may be segmented into phones and syllables for evaluation with respect to the models. The utterance of speech may be an arbitrary utterance made up of a sequence of words which had not been encountered before. Pronunciation scores are converted into grades as would be assigned by human graders. Pronunciation quality may be evaluated in a client-server language instruction environment.
    Type: Grant
    Filed: January 26, 2000
    Date of Patent: May 1, 2001
    Assignee: SRI International
    Inventors: Leonardo Neumeyer, Horacio Franco, Mitchel Weintraub, Patti Price, Vassilios Digalakis
  • Patent number: 6055498
    Abstract: Pronunciation quality is automatically evaluated for an utterance of speech based on one or more pronunciation scores. One type of pronunciation score is based on duration of acoustic units. Examples of acoustic units include phones and syllables. Another type of pronunciation score is based on a posterior probability that a piece of input speech corresponds to a certain model, such as a hidden Markov model, given the piece of input speech. Speech may be segmented into phones and syllables for evaluation with respect to the models. The utterance of speech may be an arbitrary utterance made up of a sequence of words which had not been encountered before. Pronunciation scores are converted into grades as would be assigned by human graders. Pronunciation quality may be evaluated in a client-server language instruction environment.
    Type: Grant
    Filed: October 2, 1997
    Date of Patent: April 25, 2000
    Assignee: SRI International
    Inventors: Leonardo Neumeyer, Horacio Franco, Mitchel Weintraub, Patti Price, Vassilios Digalakis
  • Patent number: 5950157
    Abstract: Adverse effects of type mismatch between acoustic input devices used during testing and during training in machine-based recognition of the source of acoustic phenomena are minimized. A normalizing model is matched to a source model based, or dependent, upon an acoustic input device whose transfer characteristics color acoustic characteristics of a source as represented in the source model. An application of the present invention is to speaker recognition, i.e., recognition of the identity of a speaker by the speaker's voice.
    Type: Grant
    Filed: April 18, 1997
    Date of Patent: September 7, 1999
    Assignee: SRI International
    Inventors: Larry P. Heck, Mitchel Weintraub
  • Patent number: 5842163
    Abstract: In a method for determining likelihood of appearance of keywords in a spoken utterance as part of a keyword spotting system of a speech recognizer, a new scoring technique is provided wherein a confidence score is computed as a probability of observing the keyword in a sequence of words given the observations. The corresponding confidence scores are the probability of the keyword appearing in any word sequence given the observations. In a specific embodiment, the technique involves hypothesizing a keyword whenever it appears in any of the "N-best" word lists with a confidence score that is computed by summing the likelihoods for all hypotheses that contain the keyword, normalized by dividing by the sum of all hypothesis likelihoods in the "N-best" list.
    Type: Grant
    Filed: June 7, 1996
    Date of Patent: November 24, 1998
    Assignee: SRI International
    Inventor: Mitchel Weintraub
  • Patent number: 5581655
    Abstract: An automatic speech recognition methodology, wherein words are modeled as probabilistic networks of allophones, collects nodes in the probabilistic network into equivalence classes when those nodes have the same allophonic choices governed by the same phonological rules. The allophonic choices allow for representation of dialectic pronunciation variations between different speakers. Training data is shared among nodes in an equivalence class so that accurate pronunciation probabilities may be determined even for words for which there is only a limited amount of training data. A method is used to determine probabilities for each of a multitude of pronunciation models for each word in the vocabulary, based on automatic extraction of linguistic knowledge from sets of phonological rules, in order to robustly and accurately model dialectal variation.
    Type: Grant
    Filed: January 22, 1996
    Date of Patent: December 3, 1996
    Assignee: SRI International
    Inventors: Michael H. Cohen, Mitchel Weintraub, Patti J. Price, Hy Murveit, Jared C. Bernstein
  • Patent number: 5268990
    Abstract: An automatic speech recognition methodology takes advantage of linguistic constraints wherein words are modeled as probabilistic networks of phonetic segments (herein phones), and each phone is represented as a context-independent hidden Markov phone model mixed with a number of context-dependent phone models. Recognition is based on use of methods to design phonological rule sets based on measures of coverage and overgeneration of pronunciations which achieves high coverage of pronunciations with compact representations. Further, a method estimates probabilities of the different possible pronunciations of words. A further method models cross-word coarticulatory effects. In a specific embodiment of the system, a specific method determines the single most-likely pronunciation of words. In further specific embodiments of the system, methods generate speaker-dependent pronunciation networks.
    Type: Grant
    Filed: January 31, 1991
    Date of Patent: December 7, 1993
    Assignee: SRI International
    Inventors: Michael H. Cohen, Mitchel Weintraub, Patti J. Price, Hy Murveit, Jared C. Bernstein
  • Patent number: 5148489
    Abstract: A method is disclosed for use in preprocessing noisy speech to minimize likelihood of error in estimation for use in a recognizer. The computationally-feasible technique, herein called Minimum-Mean-Log-Spectral-Distance (MMLSD) estimation using mixture models and Markov models, comprises the steps of calculating for each vector of speech in the presence of noise corresponding to a single time frame, an estimate of clean speech, where the basic assumptions of the method of the estimator are that the probability distribution of clean speech can be modeled by a mixture of components each representing a different speech class assuming different frequency channels are uncorrelated within each class and that noise at different frequency channels is uncorrelated.
    Type: Grant
    Filed: March 9, 1992
    Date of Patent: September 15, 1992
    Assignee: SRI International
    Inventors: Adoram Erell, Mitchel Weintraub
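
Illustrative sketch for patent 5842163: the abstract describes a keyword confidence score computed by summing the likelihoods of all N-best hypotheses that contain the keyword and dividing by the sum of all hypothesis likelihoods in the list. The Python below is a minimal sketch of that arithmetic only, not the patented implementation. The function name keyword_confidence, the (words, likelihood) tuple layout, and the sample N-best list are assumptions made for illustration; a real recognizer would likely work with log-likelihoods and would need its own handling of multi-word keywords.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: each N-best entry is (word sequence, likelihood).
Hypothesis = Tuple[Sequence[str], float]


def keyword_confidence(nbest: List[Hypothesis], keyword: str) -> float:
    """Sum the likelihoods of hypotheses containing `keyword`, then
    normalize by the sum of all hypothesis likelihoods in the list."""
    total = sum(likelihood for _, likelihood in nbest)
    if total <= 0.0:
        # Empty list or all-zero likelihoods: no evidence for any keyword.
        return 0.0
    matching = sum(likelihood for words, likelihood in nbest if keyword in words)
    return matching / total


# Made-up N-best list for illustration; likelihoods are not from any real system.
nbest = [
    (["call", "home", "now"], 0.5),
    (["call", "hope", "now"], 0.25),
    (["tall", "home", "now"], 0.25),
]

print(keyword_confidence(nbest, "home"))
print(keyword_confidence(nbest, "hope"))
```

In this sketch a keyword that appears in every hypothesis scores 1.0 and one that appears in none scores 0.0, so the value can be thresholded to decide whether to report the keyword.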