Patents by Inventor Mukund Padmanabhan

Mukund Padmanabhan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7529666
    Abstract: In connection with speech recognition, the design of a linear transformation θ ∈ ℝ^{p×n}, of rank p×n, which projects the features of a classifier x ∈ ℝ^n onto y = θx ∈ ℝ^p such as to achieve minimum Bayes error (or probability of misclassification). Two avenues are explored: the first is to maximize the θ-average divergence between the class densities and the second is to minimize the union Bhattacharyya bound in the range of θ. While both approaches yield similar performance in practice, they outperform standard linear discriminant analysis features and show a 10% relative improvement in the word error rate over known cepstral features on a large vocabulary telephony speech recognition task.
    Type: Grant
    Filed: October 30, 2000
    Date of Patent: May 5, 2009
    Assignee: International Business Machines Corporation
    Inventors: Mukund Padmanabhan, George A. Saon
  • Patent number: 7216077
    Abstract: Methods and arrangements using lattice-based information for unsupervised speaker adaptation. By performing adaptation against a word lattice, correct models are more likely to be used in estimating a transform. Further, a particular type of lattice proposed herein enables the use of a natural confidence measure given by the posterior occupancy probability of a state, that is, the statistics of a particular state will be updated with the current frame only if the a posteriori probability of the state at that particular time is greater than a predetermined threshold.
    Type: Grant
    Filed: September 26, 2000
    Date of Patent: May 8, 2007
    Assignee: International Business Machines Corporation
    Inventors: Mukund Padmanabhan, George A. Saon, Geoffrey G. Zweig
  • Publication number: 20060271365
    Abstract: Methods and apparatus are provided for processing an information signal containing content presented in accordance with at least one modality. In one aspect of the present invention, a method of processing an information signal containing content presented in accordance with at least one modality, comprises the steps of: (i) obtaining the information signal; (ii) performing content detection on the information signal to detect whether the information signal includes particular content presented in accordance with the at least one modality; and (iii) generating a control signal, when the particular content is detected, for use in controlling a rendering property of the particular content and/or implementation of a specific action relating to the particular content.
    Type: Application
    Filed: July 27, 2006
    Publication date: November 30, 2006
    Applicant: International Business Machines Corporation
    Inventors: Stephane Maes, Mukund Padmanabhan, Jeffrey Sorensen
  • Patent number: 7092496
    Abstract: Methods and apparatus are provided for processing an information signal containing content presented in accordance with at least one modality. In one aspect of the present invention, a method of processing an information signal containing content presented in accordance with at least one modality, comprises the steps of: (i) obtaining the information signal; (ii) performing content detection on the information signal to detect whether the information signal includes particular content presented in accordance with the at least one modality; and (iii) generating a control signal, when the particular content is detected, for use in controlling a rendering property of the particular content and/or implementation of a specific action relating to the particular content.
    Type: Grant
    Filed: September 18, 2000
    Date of Patent: August 15, 2006
    Assignee: International Business Machines Corporation
    Inventors: Stephane Herman Maes, Mukund Padmanabhan, Jeffrey Scott Sorensen
  • Patent number: 6920424
    Abstract: Generally, the present invention determines and uses spectral peak information, which preferably augments feature vectors and creates augmented feature vectors. The augmented feature vectors decrease errors in pattern recognition, increase noise immunity for wide-band noise, and reduce reliance on noisy formant features. Illustratively, one way of determining spectral peak information is to split pattern data into a number of frequency ranges and determine spectral peak information for each of the frequency ranges. This allows single peak selection. All of the spectral peak information is then used to augment a feature vector. Another way of determining spectral peak information is to use an adaptive Infinite Impulse Response filter to provide this information. Additionally, the present invention can determine and use incremental information. The incremental information is relatively easy to calculate and helps to determine if additional or changed features are worthwhile.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: July 19, 2005
    Assignee: International Business Machines Corporation
    Inventor: Mukund Padmanabhan
  • Patent number: 6859774
    Abstract: Techniques are described for decreasing the number of errors when consensus decoding is used during speech recognition. A number of corrective rules are applied to confusion sets that are extracted during real-time speech recognition. The corrective rules are determined during training of the speech recognition system, which entails using many training confusion sets. A learning process is used that generates a number of possible rules, called template rules, that can be applied to the training confusion sets. The learning process also determines the corrective rules from the template rules. The corrective rules operate on the real-time confusion sets to select hypothesis words from the confusion sets, where the hypothesis words are not necessarily the words having the highest score.
    Type: Grant
    Filed: May 2, 2001
    Date of Patent: February 22, 2005
    Assignee: International Business Machines Corporation
    Inventors: Lidia Luminita Mangu, Mukund Padmanabhan
  • Patent number: 6842796
    Abstract: Techniques are provided for enumerating regularly identifiable or stereotypical phrases that people commonly use to convey particular information, and where exactly in these phrases the particular information is to be found. In one embodiment, such phrases are referred to as “regular expressions.” Using such enumerated phrases, the invention is able to automatically identify them in an input data stream and then identify and extract the particular information associated with the phrase that is being sought, e.g., important or relevant information.
    Type: Grant
    Filed: July 3, 2001
    Date of Patent: January 11, 2005
    Assignee: International Business Machines Corporation
    Inventors: Geoffrey G. Zweig, Mukund Padmanabhan
  • Patent number: 6609093
    Abstract: The present invention provides a new approach to heteroscedastic linear discriminant analysis (HDA) by defining an objective function which maximizes the class discrimination in the projected subspace while ignoring the rejected dimensions. Moreover, we present a link between discrimination and the likelihood of the projected samples and show that HDA can be viewed as a constrained maximum likelihood (ML) projection for a full covariance gaussian model, the constraint being given by the maximization of the projected between-class scatter volume. The present invention also provides that, under diagonal covariance gaussian modeling constraints, applying a diagonalizing linear transformation (e.g., MLLT—maximum likelihood linear transformation) to the HDA space results in an increased classification accuracy.
    Type: Grant
    Filed: June 1, 2000
    Date of Patent: August 19, 2003
    Assignee: International Business Machines Corporation
    Inventors: Ramesh Ambat Gopinath, Mukund Padmanabhan, George Andrei Saon
  • Patent number: 6603921
    Abstract: An archive system for records with an audio component, which uses automated speech recognition to create a multi-layered archive pyramid. The archive pyramid includes successive layers of data stored at varying data rates such as original video data, compressed video data, original audio, compressed audio data, recognized word-lattices, recognized word-bags and a global word index. The disclosed system uses automatic speech recognition to transcribe from audio to searchable index layers. During a search operation, automatic and semi-automatic techniques are used to search the archive pyramid from the smallest narrowest layers to the largest widest layers, to identify a moderate subset of records. This subset is further refined by a manual survey of regenerated compressed audio. Finally, the selected records are retrieved from the original audio archive layer.
    Type: Grant
    Filed: July 1, 1998
    Date of Patent: August 5, 2003
    Assignee: International Business Machines Corporation
    Inventors: Dimitri Kanevsky, Stephane H. Maes, Mukund Padmanabhan, Arthur R. Zingher
  • Publication number: 20030050782
    Abstract: Techniques are provided for enumerating regularly identifiable or stereotypical phrases that people commonly use to convey particular information, and where exactly in these phrases the particular information is to be found. In one embodiment, such phrases are referred to as “regular expressions.” Using such enumerated phrases, the invention is able to automatically identify them in an input data stream and then identify and extract the particular information associated with the phrase that is being sought, e.g., important or relevant information.
    Type: Application
    Filed: July 3, 2001
    Publication date: March 13, 2003
    Applicant: International Business Machines Corporation
    Inventors: Geoffrey G. Zweig, Mukund Padmanabhan
  • Publication number: 20020165716
    Abstract: Techniques are described for decreasing the number of errors when consensus decoding is used during speech recognition. A number of corrective rules are applied to confusion sets that are extracted during real-time speech recognition. The corrective rules are determined during training of the speech recognition system, which entails using many training confusion sets. A learning process is used that generates a number of possible rules, called template rules, that can be applied to the training confusion sets. The learning process also determines the corrective rules from the template rules. The corrective rules operate on the real-time confusion sets to select hypothesis words from the confusion sets, where the hypothesis words are not necessarily the words having the highest score.
    Type: Application
    Filed: May 2, 2001
    Publication date: November 7, 2002
    Applicant: International Business Machines Corporation
    Inventors: Lidia Luminita Mangu, Mukund Padmanabhan
  • Patent number: 6470314
    Abstract: A method of adapting a speech recognition system to one or more acoustic conditions comprises the steps of: (i) computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system; (ii) computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system; (iii) computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and (iv) applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data.
    Type: Grant
    Filed: April 6, 2000
    Date of Patent: October 22, 2002
    Assignee: International Business Machines Corporation
    Inventors: Satyanarayana Dharanipragada, Mukund Padmanabhan
  • Patent number: 6421641
    Abstract: A method of performing speaker adaptation of acoustic models in a band-quantized speech recognition system, wherein the system includes one or more acoustic models represented by a feature space of multi-dimensional gaussians, whose dimensions are partitioned into bands, and the gaussian means and covariances within each band are quantized into atoms, comprises the following steps. A decoded segment of a speech signal associated with a particular speaker is obtained. Then, at least one adaptation mapping based on the decoded segment is computed. Lastly, the at least one adaptation mapping is applied to the atoms of the acoustic models to generate one or more acoustic models adapted to the particular speaker. Accordingly, a fast speaker adaptation methodology is provided for use in real-time applications.
    Type: Grant
    Filed: November 12, 1999
    Date of Patent: July 16, 2002
    Assignee: International Business Machines Corporation
    Inventors: Jing Huang, Mukund Padmanabhan
  • Patent number: 6385579
    Abstract: A method of forming an augmented textual training corpus with compound words for use with an associated speech recognition system includes computing a measure for a consecutive word pair in the training corpus. The measure is then compared to a threshold value. The consecutive word pair is replaced in the training corpus with a corresponding compound word depending on the result of the comparison between the measure and the threshold value. One or more measures may be employed. A first measure is an average of a direct bigram probability value and a reverse bigram probability value. A second measure is based on mutual information between the words in the pair. A third measure is based on a comparison of the number of times a co-articulated baseform for the pair is preferred over a concatenation of non-co-articulated individual baseforms of the words forming the pair.
    Type: Grant
    Filed: April 29, 1999
    Date of Patent: May 7, 2002
    Assignee: International Business Machines Corporation
    Inventors: Mukund Padmanabhan, George Andrei Saon
  • Patent number: 6377921
    Abstract: A method of identifying mismatches between acoustic data and a corresponding transcription, the transcription being expressed in terms of basic units, comprises the steps of: aligning the acoustic data with the corresponding transcription; computing a probability score for each instance of a basic unit in the acoustic data with respect to the transcription; generating a distribution for each basic unit; tagging, as mismatches, instances of a basic unit corresponding to a particular range of scores in the distribution for each basic unit based on a threshold value; and correcting the mismatches.
    Type: Grant
    Filed: June 26, 1998
    Date of Patent: April 23, 2002
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Mukund Padmanabhan
  • Publication number: 20020010578
    Abstract: Generally, the present invention determines and uses spectral peak information, which preferably augments feature vectors and creates augmented feature vectors. The augmented feature vectors decrease errors in pattern recognition, increase noise immunity for wide-band noise, and reduce reliance on noisy formant features. Illustratively, one way of determining spectral peak information is to split pattern data into a number of frequency ranges and determine spectral peak information for each of the frequency ranges. This allows single peak selection. All of the spectral peak information is then used to augment a feature vector. Another way of determining spectral peak information is to use an adaptive Infinite Impulse Response filter to provide this information. Additionally, the present invention can determine and use incremental information. The incremental information is relatively easy to calculate and helps to determine if additional or changed features are worthwhile.
    Type: Application
    Filed: February 16, 2001
    Publication date: January 24, 2002
    Applicant: International Business Machines Corporation
    Inventor: Mukund Padmanabhan
  • Patent number: 6260014
    Abstract: A method for recognizing speech includes the steps of providing a generic model having a baseform representation of a vocabulary of words, identifying a subset of words relating to an application, constructing a task specific model for the subset of words, constructing a composite model by combining the generic and task specific models and modifying the baseform representation of the subset of words such that the subset of words are recognized by the task specific model. A system for recognizing speech includes a composite model having a generic model having a generic baseform representation of a vocabulary of words and a task specific model for recognizing a subset of words relating to an application wherein the subset of words are recognized using a modified baseform representation. A recognizer compares words input thereto with the generic model for words other than the subset of words and with the task specific model for the subset of words.
    Type: Grant
    Filed: September 14, 1998
    Date of Patent: July 10, 2001
    Assignee: International Business Machines Corporation
    Inventors: Lalit Rai Bahl, David Lubensky, Mukund Padmanabhan, Salim Roukos
  • Patent number: 6219638
    Abstract: A messaging system for receiving speech over a telephone and converting the speech to text includes a first server for receiving speech input by a user, a speech recognition system for converting the speech to text, a speech synthesizer for converting the text to speech for playing back the synthesized speech for correction by the user and a correction mechanism for enabling the user to correct the speech such that the corrected speech is provided as text for transmittal over a communication system.
    Type: Grant
    Filed: November 3, 1998
    Date of Patent: April 17, 2001
    Assignee: International Business Machines Corporation
    Inventors: Mukund Padmanabhan, Michael Picheny, David Nahamoo, Salim Roukos
  • Patent number: 6073096
    Abstract: A method of speech recognition, in accordance with the present invention, includes the steps of grouping acoustics to form classes based on acoustic features, clustering training speakers by the classes to provide class-specific cluster systems, selecting, from the cluster systems, a subset of cluster systems closest to adaptation data from a test speaker, transforming the subset of cluster systems to bring the subset of cluster systems closer to the test speaker based on the adaptation data to form adapted cluster systems and combining the adapted cluster systems to create a speaker adapted system for decoding speech from the test speaker. Systems and methods for building speech recognition systems as well as adapting speaker systems for class-specific speaker clusters are included.
    Type: Grant
    Filed: February 4, 1998
    Date of Patent: June 6, 2000
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Mukund Padmanabhan, Michael Alan Picheny
  • Patent number: 6058205
    Abstract: A system and method are provided which partition the feature space of a classifier by using hyperplanes to construct a binary decision tree or hierarchical data structure for obtaining the class probabilities for a particular feature vector. One objective in the construction of the decision tree is to minimize the average entropy of the empirical class distributions at each successive node or subset, such that the average entropy of the class distributions at the terminal nodes is minimized. First, a linear discriminant vector is computed that maximally separates the classes at any particular node. A threshold is then chosen that can be applied on the value of the projection onto the hyperplane such that all feature vectors that have a projection onto the hyperplane that is less than the threshold are assigned to a child node (say, left child node) and the feature vectors that have a projection greater than or equal to the threshold are assigned to a right child node.
    Type: Grant
    Filed: January 9, 1997
    Date of Patent: May 2, 2000
    Assignee: International Business Machines Corporation
    Inventors: Lalit Rai Bahl, Peter Vincent deSouza, David Nahamoo, Mukund Padmanabhan
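
The node-splitting step described in the abstract of patent 6058205 (project each feature vector onto a direction, then pick a threshold that sends vectors to a left or right child so that the average entropy of the child class distributions is low) can be illustrated with a minimal sketch. This is not the patented implementation. The function names, the toy data, and the use of a simple difference of class means as a stand-in for the linear discriminant vector are all assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the empirical class distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def project(w, x):
    """Scalar projection of feature vector x onto direction w (dot product)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mean_difference_direction(vectors, labels):
    """ASSUMPTION: stand-in for the linear discriminant vector.

    Handles exactly two classes and returns the difference of the
    two class means; the abstract does not specify this choice.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    dim = len(vectors[0])
    means = []
    for c in classes:
        members = [v for v, lab in zip(vectors, labels) if lab == c]
        means.append([sum(v[d] for v in members) / len(members) for d in range(dim)])
    return [b - a for a, b in zip(means[0], means[1])]


def best_split(vectors, labels, w):
    """Choose the threshold on the projection that minimizes the
    size-weighted average entropy of the two child nodes.

    Vectors with projection < threshold go left; the rest go right.
    Returns (average_entropy, threshold, left_labels, right_labels),
    or None if all projections are identical.
    """
    projections = [project(w, v) for v in vectors]
    candidates = sorted(set(projections))
    best = None
    for t in candidates[1:]:  # skip the smallest so the left child is non-empty
        left = [lab for p, lab in zip(projections, labels) if p < t]
        right = [lab for p, lab in zip(projections, labels) if p >= t]
        avg = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if best is None or avg < best[0]:
            best = (avg, t, left, right)
    return best


# Hypothetical toy data: two well-separated classes in two dimensions.
vectors = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 1.0), (0.9, 1.2), (1.1, 0.8)]
labels = ["a", "a", "a", "b", "b", "b"]

w = mean_difference_direction(vectors, labels)
avg_entropy, threshold, left, right = best_split(vectors, labels, w)
print(avg_entropy, threshold, left, right)
```

Applying the same split recursively to each child would grow the binary tree; the sketch stops at a single node to keep the entropy criterion visible.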