Patents by Inventor Lalit R. Bahl

Lalit R. Bahl has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 6377921
    Abstract: A method of identifying mismatches between acoustic data and a corresponding transcription, the transcription being expressed in terms of basic units, comprises the steps of: aligning the acoustic data with the corresponding transcription; computing a probability score for each instance of a basic unit in the acoustic data with respect to the transcription; generating a distribution for each basic unit; tagging, as mismatches, instances of a basic unit corresponding to a particular range of scores in the distribution for each basic unit based on a threshold value; and correcting the mismatches.
    Type: Grant
    Filed: June 26, 1998
    Date of Patent: April 23, 2002
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Mukund Padmanabhan
  • Patent number: 6343270
    Abstract: In accordance with the present invention, a method for increasing both dialect precision and usability in speech recognition and text-to-speech systems is described. The invention generates non-linear (i.e. encoded) baseform representations for words and phrases from a pronunciation lexicon. The baseform representations are encoded to incorporate both pronunciation variations and dialectal variations. The encoded baseform representations may be later expanded (i.e. decoded) into one or more linear dialect specific baseform representations, utilizing a set of dialect specific phonological rules. The method comprises the steps of: constructing an encoded pronunciation lexicon having a plurality of encoded and unencoded baseforms; inputting one or more user specified dialects; selecting dialect specific phonological rules from a rule set database; and decoding the encoded pronunciation lexicon using the dialect specific phonological rules to yield a dialect specific decoded pronunciation lexicon.
    Type: Grant
    Filed: December 9, 1998
    Date of Patent: January 29, 2002
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Paul S. Cohen
  • Patent number: 5497447
    Abstract: A speech coding apparatus in which measured acoustic feature vectors are each represented by the best matched prototype vector. The prototype vectors are generated by storing a model of a training script comprising a series of elementary models. The value of at least one feature of a training utterance of the training script is measured over each of a series of successive time intervals to produce a series of training feature vectors. A first set of training feature vectors corresponding to a first elementary model in the training script is identified. The feature value of each training feature vector signal in the first set is compared to the parameter value of a first reference vector signal to obtain a first closeness score, and is compared to the parameter value of a second reference vector to obtain a second closeness score for each training feature vector.
    Type: Grant
    Filed: March 8, 1993
    Date of Patent: March 5, 1996
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Ponani S. Gopalakrishnan, Michael A. Picheny, Peter D. De Souza
  • Patent number: 5455889
    Abstract: The present invention relates to labelling of speech in a context-dependent speech recognition system. When labelling speech using context-dependent prototypes the phone context of a frame of speech needs to be aligned with the appropriate acoustic parameter vector. Since aligning a large amount of data is difficult if based upon arc ranks, the present invention aligns the data using context-independent acoustic prototypes. The phonetic context of each phone of the data is known. Therefore after the alignment step the acoustic parameter vectors are tagged with a corresponding phonetic context. Context-dependent prototype vectors exist for each label. For all labels the context-dependent prototype vectors having the same phonetic context as the tagged acoustic parameter vector are determined.
    Type: Grant
    Filed: February 8, 1993
    Date of Patent: October 3, 1995
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter de Souza, P. S. Gopalakrishnan, Michael A. Picheny
  • Patent number: 5333236
    Abstract: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The speech coding apparatus stores a plurality of speech transition models representing speech transitions. At least one speech transition is represented by a plurality of different models. Each speech transition model has a plurality of model outputs, each comprising a prototype match score for a prototype vector signal. Each model output has an output probability. A model match score for a first feature vector signal and each speech transition model comprises the output probability for at least one prototype match score for the first feature vector signal and a prototype vector signal.
    Type: Grant
    Filed: September 10, 1992
    Date of Patent: July 26, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, Michael A. Picheny
  • Patent number: 5280562
    Abstract: In speech recognition and speech coding, the values of at least two features of an utterance are measured during a series of time intervals to produce a series of feature vector signals. A plurality of single-dimension prototype vector signals having only one parameter value are stored. At least two single-dimension prototype vector signals having parameter values representing first feature values, and at least two other single-dimension prototype vector signals have parameter values representing second feature values. A plurality of compound-dimension prototype vector signals have unique identification values and comprise one first-dimension and one second-dimension prototype vector signal. At least two compound-dimension prototype vector signals comprise the same first-dimension prototype vector signal. The feature values of each feature vector signal are compared to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores.
    Type: Grant
    Filed: October 3, 1991
    Date of Patent: January 18, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Edward A. Epstein, John M. Lucassen, David Nahamoo, Michael A. Picheny
  • Patent number: 5278942
    Abstract: A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value are stored. The closeness of the feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature value signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals.
    Type: Grant
    Filed: December 5, 1991
    Date of Patent: January 11, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, Arthur J. Nadas, David Nahamoo, Michael A. Picheny
  • Patent number: 5276766
    Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes a memory for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. An acoustic measure is provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. An acoustic matcher is provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises a cluster processor for clustering the feature vector signals into a plurality of clusters.
    Type: Grant
    Filed: July 16, 1991
    Date of Patent: January 4, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. DeSouza, David Nahamoo, Michael A. Picheny
  • Patent number: 5233681
    Abstract: A speech recognition apparatus and method estimates the next word context for each current candidate word in a speech hypothesis. An initial model of each speech hypothesis comprises a model of a partial hypothesis of zero or more words followed by a model of a candidate word. An initial hypothesis score for each speech hypothesis comprises an estimate of the closeness of a match between the initial model of the speech hypothesis and a sequence of coded representations of the utterance. The speech hypotheses having the best initial hypothesis scores form an initial subset. For each speech hypothesis in the initial subset, the word which is most likely to follow the speech hypothesis is estimated. A revised model of each speech hypothesis in the initial subset comprises a model of the partial hypothesis followed by a revised model of the candidate word. The revised candidate word model is dependent at least on the word which is estimated to be most likely to follow the speech hypothesis.
    Type: Grant
    Filed: April 24, 1992
    Date of Patent: August 3, 1993
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, Michael A. Picheny
  • Patent number: 5195167
    Abstract: Symbol feature values and contextual feature values of each event in a training set of events are measured. At least two pairs of complementary subsets of observed events are selected. In each pair of complementary subsets of observed events, one subset has contextual features with values in a set C_n, and the other subset has contextual features with values in a set C̄_n, where C_n and C̄_n are complementary sets of contextual feature values. For each subset of observed events, the similarity values of the symbol features of the observed events in the subsets are calculated. For each pair of complementary sets of observed events, a "goodness of fit" is the sum of the symbol feature value similarity of the subsets. The sets of contextual feature values associated with the subsets of observed events having the best "goodness of fit" are identified and form context-dependent bases for grouping the observed events into two output sets.
    Type: Grant
    Filed: April 17, 1992
    Date of Patent: March 16, 1993
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, David Nahamoo, Michael A. Picheny
  • Patent number: 5182773
    Abstract: The present invention is related to speech recognition and particularly to a new type of vector quantizer and a new vector quantization technique in which the error rate of associating a sound with an incoming speech signal is drastically reduced. To achieve this end, the technique of the present invention groups the feature vectors in a space into different prototypes at least two of which represent a class of sound. Each of the prototypes may in turn have a number of subclasses or partitions. Each of the prototypes and their subclasses may be assigned respective identifying values. To identify an incoming speech feature vector, at least one of the feature values of the incoming feature vector is compared with the different values of the respective prototypes, or the subclasses of the prototypes.
    Type: Grant
    Filed: March 22, 1991
    Date of Patent: January 26, 1993
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Michael A. Picheny, David Nahamoo, Peter V. de Souza
  • Patent number: 5165007
    Abstract: In a speech recognition system, an apparatus and method for modelling words with label-based Markov models are disclosed. The modelling includes: entering a first speech input, corresponding to words in a vocabulary, into an acoustic processor which converts each spoken word into a sequence of standard labels, where each standard label corresponds to a sound type assignable to an interval of time; representing each standard label as a probabilistic model which has a plurality of states, at least one transition from a state to a state, and at least one settable output probability at some transitions; entering selected acoustic inputs into an acoustic processor which converts the selected acoustic inputs into personalized labels, each personalized label corresponding to a sound type assigned to an interval of time; and setting each output probability as the probability of the standard label represented by a given model producing a particular personalized label at a given transition in the given model.
    Type: Grant
    Filed: June 12, 1989
    Date of Patent: November 17, 1992
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. DeSouza, Robert L. Mercer, Michael A. Picheny
  • Patent number: 5129001
    Abstract: Modeling a word is done by concatenating a series of elemental models to form a word model. At least one elemental model in the series is a composite elemental model formed by combining the starting states of at least first and second primitive elemental models. Each primitive elemental model represents a speech component. The primitive elemental models are combined by a weighted combination of their parameters in proportion to the values of the weighting factors. To tailor the word model to closely represent variations in the pronunciation of the word, the word is uttered a plurality of times by a plurality of different speakers. Constructing word models from composite elemental models, and constructing composite elemental models from primitive elemental models enables word models to represent many variations in the pronunciation of a word.
    Type: Grant
    Filed: April 25, 1990
    Date of Patent: July 7, 1992
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, David Nahamoo, Michael A. Picheny
  • Patent number: 5033087
    Abstract: A continuous speech recognition system includes an automatic phonological rules generator which determines variations in the pronunciation of phonemes based on the context in which they occur. This phonological rules generator associates sequences of labels derived from vocalizations of a training text with respective phonemes inferred from the training text. These sequences are then annotated with their phoneme context from the training text and clustered into groups representing similar pronunciations of each phoneme. A decision tree is generated using the context information of the sequences to predict the clusters to which the sequences belong. The training data is processed by the decision tree to divide the sequences into leaf-groups representing similar pronunciations of each phoneme. The sequences in each leaf-group are clustered into sub-groups representing respectively different pronunciations of their corresponding phoneme in a given context. A Markov model is generated for each sub-group.
    Type: Grant
    Filed: March 14, 1989
    Date of Patent: July 16, 1991
    Assignee: International Business Machines Corp.
    Inventors: Lalit R. Bahl, Peter F. Brown, Peter V. DeSouza, Robert L. Mercer
  • Patent number: 4980918
    Abstract: A continuous speech recognition system having a speech processor and a word recognition computer subsystem, characterized by an element for developing a graph for confluent links between confluent nodes; an element for developing a graph of boundary links between adjacent words; an element for storing an inventory of confluent links and boundary links as a coding inventory; an element for converting an unknown utterance into an encoded sequence of confluent links and boundary links corresponding to recognition sequences stored in the word recognition subsystem recognition vocabulary for speech recognition. The invention also includes a method for achieving continuous speech recognition by characterizing speech as a sequence of confluent links which are matched with candidate words. The invention also applies to isolated word speech recognition as with continuous speech recognition, except that in such case there are no boundary links.
    Type: Grant
    Filed: May 9, 1985
    Date of Patent: December 25, 1990
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Paul S. Cohen, Robert L. Mercer
  • Patent number: 4977599
    Abstract: Apparatus and method for constructing word baseforms which can be matched against a string of generated acoustic labels. A set of phonetic phone machines are formed, wherein each phone machine has (i) a plurality of states, (ii) a plurality of transitions each of which extends from a state to a state, (iii) a stored probability for each transition, and (iv) stored label output probabilities, each label output probability corresponding to the probability of each phone machine producing a corresponding label. The set of phonetic machines is formed to include a subset of onset phone machines. The stored probabilities of each onset phone machine correspond to at least one phonetic element being uttered at the beginning of a speech segment. The set of phonetic machines is formed to include a subset of trailing phone machines. The stored probabilities of each trailing phone machine correspond to at least one single phonetic element being uttered at the end of a speech segment.
    Type: Grant
    Filed: December 15, 1988
    Date of Patent: December 11, 1990
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. DeSouza, Robert L. Mercer, Michael A. Picheny
  • Patent number: 4882759
    Abstract: Apparatus and method for synthesizing word baseforms for words not spoken during a training session, wherein each synthesized baseform represents a series of models from a first set of models, which include: (a) uttering speech during a training session and representing the uttered speech as a sequence of models from a second set of models; (b) for each of at least some of the second set models spoken in a given phonetic model context during the training session, storing a respective string of first set models; and (c) constructing a word baseform of first set models for a word not spoken during the training session, including the step of representing each piece of a word that corresponds to a second set model in a given context by the stored respective string, if any, corresponding thereto.
    Type: Grant
    Filed: April 18, 1986
    Date of Patent: November 21, 1989
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. deSouza, Robert L. Mercer, Michael A. Picheny
  • Patent number: 4852173
    Abstract: In order to determine a next event based upon available data, a binary decision tree is constructed having true or false questions at each node and a probability distribution of the unknown next event based upon available data at each leaf. Starting at the root of the tree, the construction process proceeds from node-to-node towards a leaf by answering the question at each node encountered and following either the true or false path depending upon the answer. The questions are phrased in terms of the available data and are designed to provide as much information as possible about the next unknown event. The process is particularly useful in speech recognition when the next word to be spoken is determined on the basis of the previously spoken words.
    Type: Grant
    Filed: October 29, 1987
    Date of Patent: July 25, 1989
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter F. Brown, Peter V. deSouza, Robert L. Mercer
  • Patent number: 4833712
    Abstract: In a system that (i) defines each word in a vocabulary by a fenemic baseform of fenemic phones, (ii) defines an alphabet of composite phones each of which corresponds to at least one fenemic phone, and (iii) generates a string of fenemes in response to speech input, the method provides for converting a word baseform comprised of fenemic phones into a stunted word baseform of composite phones by (a) replacing each fenemic phone in the fenemic phone word baseform by the composite phone corresponding thereto; and (b) merging together at least one pair of adjacent composite phones by a single composite phone where the adverse effect of the merging is below a predefined threshold.
    Type: Grant
    Filed: May 29, 1985
    Date of Patent: May 23, 1989
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. DeSouza, Robert L. Mercer, Michael A. Picheny
  • Patent number: 4827521
    Abstract: In a word, or speech, recognition system for decoding a vocabulary word from outputs selected from an alphabet of outputs in response to a communicated word input wherein each word in the vocabulary is represented by a baseform of at least one probabilistic finite state model and wherein each probabilistic model has transition probability items and output probability items and wherein a value is stored for each of at least some probability items, the present invention relates to apparatus and method for determining probability values for probability items by biassing at least some of the stored values to enhance the likelihood that outputs generated in response to communication of a known word input are produced by the baseform for the known word relative to the respective likelihood of the generated outputs being produced by the baseform for at least one other word.
    Type: Grant
    Filed: March 27, 1986
    Date of Patent: May 2, 1989
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter F. Brown, Peter V. deSouza, Robert L. Mercer
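
Several of the abstracts above (for example patents 5278942, 5333236, and 5182773) describe coding each measured feature vector by the identification value of its best-matching prototype vector. The following is a minimal sketch of that labelling step, not the patented implementation. The function names, the sample prototypes, and the use of Euclidean distance as the "prototype match score" are all assumptions made for illustration; the patents do not specify them here.

```python
import math


def closest_prototype(feature_vector, prototypes):
    """Return the identification value of the best-matching prototype.

    `prototypes` maps an identification value to a list of parameter values.
    Euclidean distance is an assumed stand-in for the match score
    (smaller distance = better match).
    """
    best_id, best_score = None, math.inf
    for proto_id, params in prototypes.items():
        score = math.dist(feature_vector, params)
        if score < best_score:
            best_id, best_score = proto_id, score
    return best_id


def encode_utterance(feature_vectors, prototypes):
    """Code a series of feature vectors as a series of prototype identifiers."""
    return [closest_prototype(v, prototypes) for v in feature_vectors]


# Hypothetical two-dimensional prototypes and three measured frames.
prototypes = {"P1": [0.0, 0.0], "P2": [1.0, 1.0], "P3": [4.0, 0.0]}
frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.3]]
labels = encode_utterance(frames, prototypes)
print(labels)
```

In this sketch each frame is replaced by a single label, so the output is one identifier per time interval. A real system would typically derive the prototypes from training data rather than hard-coding them.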