Patents by Inventor Lalit Rai Bahl
Lalit Rai Bahl has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 6260014
Abstract: A method for recognizing speech includes the steps of providing a generic model having a baseform representation of a vocabulary of words, identifying a subset of words relating to an application, constructing a task specific model for the subset of words, constructing a composite model by combining the generic and task specific models and modifying the baseform representation of the subset of words such that the subset of words are recognized by the task specific model. A system for recognizing speech includes a composite model having a generic model having a generic baseform representation of a vocabulary of words and a task specific model for recognizing a subset of words relating to an application wherein the subset of words are recognized using a modified baseform representation. A recognizer compares words input thereto with the generic model for words other than the subset of words and with the task specific model for the subset of words.
Type: Grant
Filed: September 14, 1998
Date of Patent: July 10, 2001
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, David Lubensky, Mukund Padmanabhan, Salim Roukos
-
Patent number: 6067517
Abstract: A technique to improve the recognition accuracy when transcribing speech data that contains data from a wide range of environments. Input data in many situations contains data from a variety of sources in different environments. Such classes include: clean speech, speech corrupted by noise (e.g., music), non-speech (e.g., pure music with no speech), telephone speech, and the identity of a speaker. A technique is described whereby the different classes of data are first automatically identified, and then each class is transcribed by a system that is made specifically for it. The invention also describes a segmentation algorithm that is based on making up an acoustic model that characterizes the data in each class, and then using a dynamic programming algorithm (the Viterbi algorithm) to automatically identify segments that belong to each class. The acoustic models are made in a certain feature space, and the invention also describes different feature spaces for use with different classes.
Type: Grant
Filed: February 2, 1996
Date of Patent: May 23, 2000
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Ponani Gopalakrishnan, Ramesh Ambat Gopinath, Stephane Herman Maes, Mukund Panmanabhan, Lazaros Polymenakos
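The segmentation idea in the abstract for patent 6067517 (per-class acoustic models followed by a Viterbi pass that labels segments) can be illustrated with a minimal sketch. This is not the patented method: the per-frame log-likelihoods are assumed to come from class models that are not shown, and the fixed `switch_penalty` for changing class between frames is a hypothetical stand-in for whatever transition structure a real system would use.

```python
def viterbi_segment(frame_scores, switch_penalty):
    """Label each frame with a class using a Viterbi pass.

    frame_scores: list of dicts mapping class name -> log-likelihood of
                  that frame under the class model (assumed given).
    switch_penalty: cost subtracted when the class changes between
                    consecutive frames (hypothetical parameter).
    Returns the best-scoring class label for every frame.
    """
    classes = list(frame_scores[0])
    best = dict(frame_scores[0])   # best path score ending in each class
    back = []                      # back-pointers for frames 1..T-1

    for scores in frame_scores[1:]:
        new_best, pointers = {}, {}
        for c in classes:
            # Best predecessor for class c, paying the penalty on a switch.
            prev = max(
                classes,
                key=lambda p: best[p] - (0 if p == c else switch_penalty),
            )
            cost = 0 if prev == c else switch_penalty
            new_best[c] = best[prev] - cost + scores[c]
            pointers[c] = prev
        best = new_best
        back.append(pointers)

    # Trace back from the best final class.
    state = max(classes, key=lambda c: best[c])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative scores only: frame 2 slightly favors "music" on its own.
frames = [
    {"speech": -1, "music": -6},
    {"speech": -1, "music": -5},
    {"speech": -3, "music": -2},
    {"speech": -1, "music": -5},
    {"speech": -6, "music": -1},
    {"speech": -6, "music": -1},
]
segments = viterbi_segment(frames, switch_penalty=4)
```

With the penalty in place, the isolated frame that weakly prefers music is absorbed into the surrounding speech segment, which is the practical reason for decoding the whole sequence instead of classifying each frame independently.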
-
Patent number: 6058205
Abstract: A system and method are provided which partition the feature space of a classifier by using hyperplanes to construct a binary decision tree or hierarchical data structure for obtaining the class probabilities for a particular feature vector. One objective in the construction of the decision tree is to minimize the average entropy of the empirical class distributions at each successive node or subset, such that the average entropy of the class distributions at the terminal nodes is minimized. First, a linear discriminant vector is computed that maximally separates the classes at any particular node. A threshold is then chosen that can be applied on the value of the projection onto the hyperplane such that all feature vectors that have a projection onto the hyperplane that is less than the threshold are assigned to a child node (say, left child node) and the feature vectors that have a projection greater than or equal to the threshold are assigned to a right child node.
Type: Grant
Filed: January 9, 1997
Date of Patent: May 2, 2000
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Peter Vincent deSouza, David Nahamoo, Mukund Padmanabhan
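The node-splitting step described in the abstract for patent 6058205 (project onto a direction, threshold the projection, and prefer splits with low average entropy) can be sketched as follows. This is an illustration under assumptions, not the patent's procedure: the discriminant `direction` is taken as given, the candidate thresholds are midpoints between sorted projections, and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of the empirical class distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def project(vector, direction):
    """Dot product of a feature vector with the split direction."""
    return sum(a * b for a, b in zip(vector, direction))


def split(vectors, labels, direction, threshold):
    """Labels with projection < threshold go left; the rest go right."""
    left, right = [], []
    for v, y in zip(vectors, labels):
        (left if project(v, direction) < threshold else right).append(y)
    return left, right


def average_entropy(left, right):
    """Size-weighted mean entropy of the two child nodes."""
    n = len(left) + len(right)
    return (len(left) * entropy(left) + len(right) * entropy(right)) / n


def best_threshold(vectors, labels, direction):
    """Pick the midpoint threshold with the lowest average child entropy.

    Returns None when all projections coincide and no split is possible.
    """
    projs = sorted({project(v, direction) for v in vectors})
    candidates = [(p + q) / 2 for p, q in zip(projs, projs[1:])]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda t: average_entropy(*split(vectors, labels, direction, t)),
    )


# Toy data: two classes separable along the first coordinate.
vectors = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = ["a", "a", "b", "b"]
threshold = best_threshold(vectors, labels, direction=(1.0, 0.0))
left, right = split(vectors, labels, (1.0, 0.0), threshold)
```

Applying the same step recursively to each child node would grow the binary tree the abstract describes; the recursion and the computation of the discriminant vector are left out here.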
-
Patent number: 5995931
Abstract: A system and method for recognizing spoken liaisoned words. The method and system identify each word in the vocabulary as a liaison generator and/or liaison receptor. If the word is a liaison receptor, and if the word is preceded by a liaison generator, the most probable recognition result for the word will be the liaison generated by the preceding word plus the word. Liaisons are identified on an immediately preceding word in accordance with rules in a language. A word that ends with an unpronounced consonant phoneme, when followed by a word beginning with a consonant phoneme, and ends with a pronounced phoneme, when followed by a word with a vowel-like phoneme, causes a match list for the current word to be amended with words having liaisons added at their beginnings.
Type: Grant
Filed: February 22, 1999
Date of Patent: November 30, 1999
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Steven Vincent De Gennaro, Peter Vincent deSouza, Edward Adam Epstein, Jean-Michel Le Roux, Burn Lewin Lewis, Claire Waast-Richard
-
Patent number: 5970239
Abstract: Method for performing acoustic model estimation to optimize classification accuracy on speaker derived feature vectors with respect to a plurality of classes corresponding to phones to which a plurality of acoustic models respectively correspond comprises: (a) initializing an acoustic model for each phone; (b) evaluating the merit of the acoustic model initialized for each phone utilizing an objective function having a two component discriminant measure capable of characterizing each phone whereby a first component is defined as a probability that the model for the phone assigns to feature vectors from the phone and a second component is defined as a probability that the model for the phone assigns to feature vectors from other phones; (c) adapting the model for selected phones so as to increase the first component for the phone or decrease the second component for the phone, the adapting step yielding a new model for each selected phone; (d) evaluating the merit of the new models for each phone adapted in st
Type: Grant
Filed: August 11, 1997
Date of Patent: October 19, 1999
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Mukund Padmanabhan
-
Patent number: 5884259
Abstract: A method and apparatus for using a tree structure to constrain a time-synchronous, fast search for candidate words in an acoustic stream is described. A minimum stay of three frames in each graph node visited is imposed by allowing transitions only every third frame. This constraint enables the simplest possible Markov model for each phoneme while enforcing the desired minimum duration. The fast, time-synchronous search for likely words is done for an entire sentence/utterance. The list of hypotheses beginning at each time frame is stored for providing, on-demand, lists of contender/candidate words to the asynchronous, detailed match phase of decoding.
Type: Grant
Filed: February 12, 1997
Date of Patent: March 16, 1999
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Ellen Marie Eide
-
Patent number: 5875426
Abstract: A method and system of recognizing speech. The method and system perform a fast match on a word in the string of speech to be recognized which generates a fast match list representing words in a system vocabulary that most likely match a current word to be recognized. Next, the method and system perform a detailed match on the words in the fast match list and generate a detailed match list representing words that most likely match the current word to be recognized. Then for each word in the detailed match list that can accept a liaison phoneme from a preceding word, where each word is a liaison receptor, adding to the detailed match list a form of the liaison receptor, where the form represents an addition of a liaison phoneme to the liaison receptor, creating a modified detailed match list which is inclusive of the forms of the liaison receptors added to the detailed match list.
Type: Grant
Filed: June 12, 1996
Date of Patent: February 23, 1999
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Steven Vincent De Gennaro, Peter Vincent deSouza, Edward Adam Epstein, Jean-Michel Le Roux, Burn Lewin Lewis, Claire Waast-Richard
-
Patent number: 5787394
Abstract: A system and method for adaptation of a speaker independent speech recognition system for use by a particular user. The system and method gather acoustic characterization data from a test speaker and compare the data with acoustic characterization data generated for a plurality of training speakers. A match score is computed between the test speaker's acoustic characterization for a particular acoustic subspace and each training speaker's acoustic characterization for the same acoustic subspace. The training speakers are ranked for the subspace according to their scores and a new acoustic model is generated for the test speaker based upon the test speaker's acoustic characterization data and the acoustic characterization data of the closest matching training speakers. The process is repeated for each acoustic subspace.
Type: Grant
Filed: December 13, 1995
Date of Patent: July 28, 1998
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Ponani Gopalakrishnan, David Nahamoo, Mukund Padmanabhan
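The per-subspace ranking of training speakers described in the abstract for patent 5787394 can be sketched in a few lines. The abstract does not say how the match score is computed or how the characterization data is represented, so everything concrete here is an assumption: each speaker is summarized by one vector per subspace, the score is negative squared Euclidean distance, and the names `closest_speakers` and `top_k` are hypothetical. The step that builds the new acoustic model from the selected speakers is not shown.

```python
def closest_speakers(test_profile, training_profiles, top_k):
    """Rank training speakers separately for each acoustic subspace.

    test_profile: dict mapping subspace name -> vector for the test speaker.
    training_profiles: dict mapping speaker name -> {subspace: vector}.
    top_k: how many of the closest training speakers to keep per subspace.
    Returns a dict mapping subspace name -> list of speaker names,
    closest first.
    """
    result = {}
    for subspace, test_vec in test_profile.items():

        def score(name):
            # Assumed match score: higher means closer to the test speaker.
            train_vec = training_profiles[name][subspace]
            return -sum((a - b) ** 2 for a, b in zip(test_vec, train_vec))

        ranked = sorted(training_profiles, key=score, reverse=True)
        result[subspace] = ranked[:top_k]
    return result


# Illustrative data only.
test_profile = {"vowels": (1.0, 1.0), "fricatives": (0.0, 2.0)}
training_profiles = {
    "A": {"vowels": (1.1, 0.9), "fricatives": (3.0, 3.0)},
    "B": {"vowels": (5.0, 5.0), "fricatives": (0.1, 2.0)},
    "C": {"vowels": (0.5, 1.5), "fricatives": (0.0, 1.0)},
}
selected = closest_speakers(test_profile, training_profiles, top_k=2)
```

The example shows why the ranking is repeated per subspace: speaker A is the closest match for one subspace but is not selected at all for the other.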
-
Patent number: 4028731
Abstract: An apparatus is disclosed for compressing a p × q image array of two-valued (black/white) sample points. The image array points are serially applied to the apparatus in consecutive raster scan lines. In response, the apparatus simultaneously forms two matrices respectively representing a high order p × q predictive error array and a p × q array of location events (such as the raster leading edges of all objects in the image). Improved compression is achieved by selecting between the more compression efficient of two methods for encoding the position of errors in the prediction error array. These alternative methods are conventional run-length coding and a novel form of reference encoding, used selectively but to significant advantage. Thus, a run-length compression codeword is formed from the count C of non-errors between consecutive errors (in response to the occurrence of each error in the jth bit position of the ith scan line of the predictive error array) upon either C ≤
Type: Grant
Filed: September 29, 1975
Date of Patent: June 7, 1977
Assignee: International Business Machines Corporation
Inventors: Ronald Barthold Arps, Lalit Rai Bahl, Arnold Weinberger
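The conventional run-length half of the scheme in the abstract for patent 4028731 can be illustrated with a short sketch: each error in a row of the prediction error array is represented by the count C of non-errors since the previous error. This covers only that counting step. The reference encoding, the rule for choosing between the two methods, and the actual codeword format are not recoverable from the truncated abstract and are not attempted; the function names and the separate trailing count are assumptions for the sake of a round-trip example.

```python
def encode_runs(error_row):
    """Turn a row of 0/1 prediction errors into run lengths.

    Returns (runs, trailing) where runs[k] is the count of non-errors
    (0s) immediately before the k-th error (1), and trailing is the
    count of non-errors after the last error.
    """
    runs, count = [], 0
    for bit in error_row:
        if bit:
            runs.append(count)
            count = 0
        else:
            count += 1
    return runs, count


def decode_runs(runs, trailing):
    """Rebuild the 0/1 error row from the output of encode_runs."""
    row = []
    for c in runs:
        row.extend([0] * c)   # the run of non-errors
        row.append(1)         # the error that ends the run
    row.extend([0] * trailing)
    return row


# Illustrative row: errors at positions 2, 4 and 5.
row = [0, 0, 1, 0, 1, 1, 0, 0, 0]
runs, trailing = encode_runs(row)
restored = decode_runs(runs, trailing)
```

Because a good predictor leaves few errors, the error row is mostly zeros and the run counts are a compact description of it, which is the motivation for coding the error array instead of the image itself.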