Patents by Inventor Mukund Padmanabhan

Mukund Padmanabhan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Minimum bayes error feature selection in speech recognition

Patent number: 7529666

Abstract: In connection with speech recognition, the design of a linear transformation θ ∈ ℝ^(p×n), of rank p ≤ n, which projects the features of a classifier x ∈ ℝ^n onto y = θx ∈ ℝ^p such as to achieve minimum Bayes error (or probability of misclassification). Two avenues are explored: the first is to maximize the θ-average divergence between the class densities and the second is to minimize the union Bhattacharyya bound in the range of θ. While both approaches yield similar performance in practice, they outperform standard linear discriminant analysis features and show a 10% relative improvement in the word error rate over known cepstral features on a large vocabulary telephony speech recognition task.

Type: Grant

Filed: October 30, 2000

Date of Patent: May 5, 2009

Assignee: International Business Machines Corporation

Inventors: Mukund Padmanabhan, George A. Saon

Lattice-based unsupervised maximum likelihood linear regression for speaker adaptation

Patent number: 7216077

Abstract: Methods and arrangements using lattice-based information for unsupervised speaker adaptation. By performing adaptation against a word lattice, correct models are more likely to be used in estimating a transform. Further, a particular type of lattice proposed herein enables the use of a natural confidence measure given by the posterior occupancy probability of a state, that is, the statistics of a particular state will be updated with the current frame only if the a posteriori probability of the state at that particular time is greater than a predetermined threshold.

Type: Grant

Filed: September 26, 2000

Date of Patent: May 8, 2007

Assignee: International Business Machines Corporation

Inventors: Mukund Padmanabhan, George A. Saon, Geoffrey G. Zweig
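The confidence gating described in the abstract of patent 7216077 above (update a state's statistics with the current frame only if its posterior exceeds a threshold) can be illustrated with a minimal sketch. It assumes the per-frame state posteriors have already been computed from a word lattice; the lattice construction and the transform estimation are not shown. The function name `accumulate_gated_stats`, the data layout, the posterior weighting of each frame, and the example threshold are all assumptions made for illustration, not details taken from the patent.

```python
def accumulate_gated_stats(frames, posteriors, threshold):
    """Accumulate per-state statistics, skipping low-confidence frames.

    frames:     list of feature vectors (lists of floats), one per time step
    posteriors: list of dicts mapping state id -> posterior at that time step
    threshold:  a frame contributes to a state only if posterior > threshold
    Returns a dict: state id -> (occupancy count, posterior-weighted feature sum).
    """
    stats = {}
    for x, frame_posteriors in zip(frames, posteriors):
        for state, gamma in frame_posteriors.items():
            if gamma <= threshold:
                continue  # confidence too low: leave this state's statistics untouched
            occupancy, weighted_sum = stats.get(state, (0.0, [0.0] * len(x)))
            # Weighting by gamma is an assumption; the abstract only states the gating rule.
            stats[state] = (
                occupancy + gamma,
                [s + gamma * xi for s, xi in zip(weighted_sum, x)],
            )
    return stats


if __name__ == "__main__":
    frames = [[1.0, 2.0], [3.0, 4.0]]
    posteriors = [{"a": 0.9, "b": 0.1}, {"a": 0.5, "b": 0.5}]
    print(accumulate_gated_stats(frames, posteriors, threshold=0.7))
```

In this toy input only the first frame updates state "a"; the second frame is ambiguous between the two states and is ignored for both.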
Methods and apparatus for processing information signals based on content

Publication number: 20060271365

Abstract: Methods and apparatus are provided for processing an information signal containing content presented in accordance with at least one modality. In one aspect of the present invention, a method of processing an information signal containing content presented in accordance with at least one modality, comprises the steps of: (i) obtaining the information signal; (ii) performing content detection on the information signal to detect whether the information signal includes particular content presented in accordance with the at least one modality; and (iii) generating a control signal, when the particular content is detected, for use in controlling a rendering property of the particular content and/or implementation of a specific action relating to the particular content.

Type: Application

Filed: July 27, 2006

Publication date: November 30, 2006

Applicant: International Business Machines Corporation

Inventors: Stephane Maes, Mukund Padmanabhan, Jeffrey Sorensen

Method and apparatus for processing information signals based on content

Patent number: 7092496

Abstract: Methods and apparatus are provided for processing an information signal containing content presented in accordance with at least one modality. In one aspect of the present invention, a method of processing an information signal containing content presented in accordance with at least one modality, comprises the steps of: (i) obtaining the information signal; (ii) performing content detection on the information signal to detect whether the information signal includes particular content presented in accordance with the at least one modality; and (iii) generating a control signal, when the particular content is detected, for use in controlling a rendering property of the particular content and/or implementation of a specific action relating to the particular content.

Type: Grant

Filed: September 18, 2000

Date of Patent: August 15, 2006

Assignee: International Business Machines Corporation

Inventors: Stephane Herman Maes, Mukund Padmanabhan, Jeffrey Scott Sorensen

Determination and use of spectral peak information and incremental information in pattern recognition

Patent number: 6920424

Abstract: Generally, the present invention determines and uses spectral peak information, which preferably augments feature vectors and creates augmented feature vectors. The augmented feature vectors decrease errors in pattern recognition, increase noise immunity for wide-band noise, and reduce reliance on noisy formant features. Illustratively, one way of determining spectral peak information is to split pattern data into a number of frequency ranges and determine spectral peak information for each of the frequency ranges. This allows single peak selection. All of the spectral peak information is then used to augment a feature vector. Another way of determining spectral peak information is to use an adaptive Infinite Impulse Response filter to provide this information. Additionally, the present invention can determine and use incremental information. The incremental information is relatively easy to calculate and helps to determine if additional or changed features are worthwhile.

Type: Grant

Filed: February 16, 2001

Date of Patent: July 19, 2005

Assignee: International Business Machines Corporation

Inventor: Mukund Padmanabhan

Error corrective mechanisms for consensus decoding of speech

Patent number: 6859774

Abstract: Techniques are described for decreasing the number of errors when consensus decoding is used during speech recognition. A number of corrective rules are applied to confusion sets that are extracted during real-time speech recognition. The corrective rules are determined during training of the speech recognition system, which entails using many training confusion sets. A learning process is used that generates a number of possible rules, called template rules, that can be applied to the training confusion sets. The learning process also determines the corrective rules from the template rules. The corrective rules operate on the real-time confusion sets to select hypothesis words from the confusion sets, where the hypothesis words are not necessarily the words having the highest score.

Type: Grant

Filed: May 2, 2001

Date of Patent: February 22, 2005

Assignee: International Business Machines Corporation

Inventors: Lidia Luminita Mangu, Mukund Padmanabhan

Information extraction from documents with regular expression matching

Patent number: 6842796

Abstract: Techniques are provided for enumerating regularly identifiable or stereotypical phrases that people commonly use to convey particular information, and where exactly in these phrases the particular information is to be found. In one embodiment, such phrases are referred to as “regular expressions.” Using such enumerated phrases, the invention is able to automatically identify them in an input data stream and then identify and extract the particular information associated with the phrase that is being sought, e.g., important or relevant information.

Type: Grant

Filed: July 3, 2001

Date of Patent: January 11, 2005

Assignee: International Business Machines Corporation

Inventors: Geoffrey G. Zweig, Mukund Padmanabhan
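The idea in the abstract of patent 6842796 above (enumerate stereotypical phrases and mark where in each phrase the sought information sits) can be sketched with Python's standard `re` module. The two phrases, the named group `info`, and the function name `extract_info` are hypothetical examples chosen for illustration; the patent's actual phrase inventory is not given in this listing.

```python
import re

# Hypothetical enumerated phrases. The named group "info" marks where in each
# phrase the information of interest (here, a callback number) is found.
PHRASES = [
    re.compile(r"\bcall me (?:back )?at (?P<info>[\d\- ]*\d)", re.IGNORECASE),
    re.compile(r"\bmy number is (?P<info>[\d\- ]*\d)", re.IGNORECASE),
]


def extract_info(text):
    """Return the text captured at the marked position of every matching phrase."""
    found = []
    for pattern in PHRASES:
        for match in pattern.finditer(text):
            found.append(match.group("info"))
    return found


if __name__ == "__main__":
    message = "Hi, it's Sam. Call me back at 555-0134 when you can. My number is 555 0199."
    print(extract_info(message))
```

Results are grouped by phrase rather than by position in the text, which is enough for a sketch; a real system would likely also record which phrase matched and where.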
Methods and apparatus for performing heteroscedastic discriminant analysis in pattern recognition systems

Patent number: 6609093

Abstract: The present invention provides a new approach to heteroscedastic linear discriminant analysis (HDA) by defining an objective function which maximizes the class discrimination in the projected subspace while ignoring the rejected dimensions. Moreover, we present a link between discrimination and the likelihood of the projected samples and show that HDA can be viewed as a constrained maximum likelihood (ML) projection for a full covariance gaussian model, the constraint being given by the maximization of the projected between-class scatter volume. The present invention also provides that, under diagonal covariance gaussian modeling constraints, applying a diagonalizing linear transformation (e.g., MLLT—maximum likelihood linear transformation) to the HDA space results in an increased classification accuracy.

Type: Grant

Filed: June 1, 2000

Date of Patent: August 19, 2003

Assignee: International Business Machines Corporation

Inventors: Ramesh Ambat Gopinath, Mukund Padmanabhan, George Andrei Saon

Audio/video archive system and method for automatic indexing and searching

Patent number: 6603921

Abstract: An archive system for records with an audio component, which uses automated speech recognition to create a multi-layered archive pyramid. The archive pyramid includes successive layers of data stored at varying data rates such as original video data, compressed video data, original audio, compressed audio data, recognized word-lattices, recognized word-bags and a global word index. The disclosed system uses automatic speech recognition to transcribe from audio to searchable index layers. During a search operation, automatic and semi-automatic techniques are used to search the archive pyramid from the smallest narrowest layers to the largest widest layers, to identify a moderate subset of records. This subset is further refined by a manual survey of regenerated compressed audio. Finally, the selected records are retrieved from the original audio archive layer.

Type: Grant

Filed: July 1, 1998

Date of Patent: August 5, 2003

Assignee: International Business Machines Corporation

Inventors: Dimitri Kanevsky, Stephane H. Maes, Mukund Padmanabhan, Arthur R. Zingher

Information extraction from documents with regular expression matching

Publication number: 20030050782

Abstract: Techniques are provided for enumerating regularly identifiable or stereotypical phrases that people commonly use to convey particular information, and where exactly in these phrases the particular information is to be found. In one embodiment, such phrases are referred to as “regular expressions.” Using such enumerated phrases, the invention is able to automatically identify them in an input data stream and then identify and extract the particular information associated with the phrase that is being sought, e.g., important or relevant information.

Type: Application

Filed: July 3, 2001

Publication date: March 13, 2003

Applicant: International Business Machines Corporation

Inventors: Geoffrey G. Zweig, Mukund Padmanabhan

Error corrective mechanisms for consensus decoding of speech

Publication number: 20020165716

Abstract: Techniques are described for decreasing the number of errors when consensus decoding is used during speech recognition. A number of corrective rules are applied to confusion sets that are extracted during real-time speech recognition. The corrective rules are determined during training of the speech recognition system, which entails using many training confusion sets. A learning process is used that generates a number of possible rules, called template rules, that can be applied to the training confusion sets. The learning process also determines the corrective rules from the template rules. The corrective rules operate on the real-time confusion sets to select hypothesis words from the confusion sets, where the hypothesis words are not necessarily the words having the highest score.

Type: Application

Filed: May 2, 2001

Publication date: November 7, 2002

Applicant: International Business Machines Corporation

Inventors: Lidia Luminita Mangu, Mukund Padmanabhan

Method and apparatus for rapid adapt via cumulative distribution function matching for continuous speech

Patent number: 6470314

Abstract: A method of adapting a speech recognition system to one or more acoustic conditions comprises the steps of: (i) computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system; (ii) computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system; (iii) computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and (iv) applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data.

Type: Grant

Filed: April 6, 2000

Date of Patent: October 22, 2002

Assignee: International Business Machines Corporation

Inventors: Satyanarayana Dharanipragada, Mukund Padmanabhan
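Steps (i) through (iv) in the abstract of patent 6470314 above can be sketched for a single feature dimension using empirical cumulative distribution functions. This is a minimal illustration under stated assumptions: the CDFs are estimated by sorting the samples, the mapping sends each test value to the training value at the same CDF level, and the function name `cdf_match` is hypothetical. For multi-dimensional speech vectors the same mapping would be applied to each dimension separately.

```python
import math
from bisect import bisect_right


def cdf_match(train_values, test_values):
    """Map each test value to the training value at the same empirical CDF level."""
    train_sorted = sorted(train_values)   # (i) empirical CDF of the training data
    test_sorted = sorted(test_values)     # (ii) empirical CDF of the test data
    n_train, n_test = len(train_sorted), len(test_sorted)
    mapped = []
    for v in test_values:
        # CDF level of v under the test distribution, in (0, 1].
        level = bisect_right(test_sorted, v) / n_test
        # (iii) inverse training CDF at that level: a nonlinear, monotone mapping.
        index = min(n_train - 1, max(0, math.ceil(level * n_train) - 1))
        # (iv) the transformed value now follows the training distribution.
        mapped.append(train_sorted[index])
    return mapped


if __name__ == "__main__":
    train = [0.0, 1.0, 2.0, 3.0]
    test = [10.0, 30.0, 20.0, 40.0]   # same shape as train, different offset and scale
    print(cdf_match(train, test))
```

Because the mapping depends only on rank within the test data, it undoes any monotone distortion of a dimension, not just shifts and scalings.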
Methods and apparatus for fast adaptation of a band-quantized speech decoding system

Patent number: 6421641

Abstract: A method of performing speaker adaptation of acoustic models in a band-quantized speech recognition system, wherein the system includes one or more acoustic models represented by a feature space of multi-dimensional gaussians, whose dimensions are partitioned into bands, and the gaussian means and covariances within each band are quantized into atoms, comprises the following steps. A decoded segment of a speech signal associated with a particular speaker is obtained. Then, at least one adaptation mapping based on the decoded segment is computed. Lastly, the at least one adaptation mapping is applied to the atoms of the acoustic models to generate one or more acoustic models adapted to the particular speaker. Accordingly, a fast speaker adaptation methodology is provided for use in real-time applications.

Type: Grant

Filed: November 12, 1999

Date of Patent: July 16, 2002

Assignee: International Business Machines Corporation

Inventors: Jing Huang, Mukund Padmanabhan

Methods and apparatus for forming compound words for use in a continuous speech recognition system

Patent number: 6385579

Abstract: A method of forming an augmented textual training corpus with compound words for use with an associated speech recognition system includes computing a measure for a consecutive word pair in the training corpus. The measure is then compared to a threshold value. The consecutive word pair is replaced in the training corpus with a corresponding compound word depending on the result of the comparison between the measure and the threshold value. One or more measures may be employed. A first measure is an average of a direct bigram probability value and a reverse bigram probability value. A second measure is based on mutual information between the words in the pair. A third measure is based on a comparison of the number of times a co-articulated baseform for the pair is preferred over a concatenation of non-co-articulated individual baseforms of the words forming the pair.

Type: Grant

Filed: April 29, 1999

Date of Patent: May 7, 2002

Assignee: International Business Machines Corporation

Inventors: Mukund Padmanabhan, George Andrei Saon
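The first measure in the abstract of patent 6385579 above (the average of a direct and a reverse bigram probability, compared against a threshold) can be sketched as follows. The probabilities are estimated here from raw relative frequencies, the pair is merged when the measure is greater than or equal to the threshold, and compounds are written with an underscore; all three choices, along with the function names, are assumptions made for illustration.

```python
from collections import Counter


def bigram_measure(tokens):
    """Average of direct and reverse bigram probabilities for each consecutive pair."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), count in bigrams.items():
        direct = count / unigrams[w1]    # estimate of P(next word is w2 | w1)
        reverse = count / unigrams[w2]   # estimate of P(previous word is w1 | w2)
        scores[(w1, w2)] = 0.5 * (direct + reverse)
    return scores


def merge_compounds(tokens, threshold):
    """Replace consecutive pairs whose measure meets the threshold with a compound."""
    scores = bigram_measure(tokens)
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and scores[(tokens[i], tokens[i + 1])] >= threshold:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2  # both words were consumed by the compound
        else:
            merged.append(tokens[i])
            i += 1
    return merged


if __name__ == "__main__":
    corpus = "in new york today in new york now in town".split()
    print(merge_compounds(corpus, threshold=0.9))
```

On a corpus this small, pairs of words that each occur once also score 1.0, so raw relative frequencies are only meaningful with realistic amounts of text.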
Identifying mismatches between assumed and actual pronunciations of words

Patent number: 6377921

Abstract: A method of identifying mismatches between acoustic data and a corresponding transcription, the transcription being expressed in terms of basic units, comprises the steps of: aligning the acoustic data with the corresponding transcription; computing a probability score for each instance of a basic unit in the acoustic data with respect to the transcription; generating a distribution for each basic unit; tagging, as mismatches, instances of a basic unit corresponding to a particular range of scores in the distribution for each basic unit based on a threshold value; and correcting the mismatches.

Type: Grant

Filed: June 26, 1998

Date of Patent: April 23, 2002

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Mukund Padmanabhan

Determination and use of spectral peak information and incremental information in pattern recognition

Publication number: 20020010578

Abstract: Generally, the present invention determines and uses spectral peak information, which preferably augments feature vectors and creates augmented feature vectors. The augmented feature vectors decrease errors in pattern recognition, increase noise immunity for wide-band noise, and reduce reliance on noisy formant features. Illustratively, one way of determining spectral peak information is to split pattern data into a number of frequency ranges and determine spectral peak information for each of the frequency ranges. This allows single peak selection. All of the spectral peak information is then used to augment a feature vector. Another way of determining spectral peak information is to use an adaptive Infinite Impulse Response filter to provide this information. Additionally, the present invention can determine and use incremental information. The incremental information is relatively easy to calculate and helps to determine if additional or changed features are worthwhile.

Type: Application

Filed: February 16, 2001

Publication date: January 24, 2002

Applicant: International Business Machines Corporation

Inventor: Mukund Padmanabhan

Specific task composite acoustic models

Patent number: 6260014

Abstract: A method for recognizing speech includes the steps of providing a generic model having a baseform representation of a vocabulary of words, identifying a subset of words relating to an application, constructing a task specific model for the subset of words, constructing a composite model by combining the generic and task specific models and modifying the baseform representation of the subset of words such that the subset of words are recognized by the task specific model. A system for recognizing speech includes a composite model having a generic model having a generic baseform representation of a vocabulary of words and a task specific model for recognizing a subset of words relating to an application wherein the subset of words are recognized using a modified baseform representation. A recognizer compares words input thereto with the generic model for words other than the subset of words and with the task specific model for the subset of words.

Type: Grant

Filed: September 14, 1998

Date of Patent: July 10, 2001

Assignee: International Business Machines Corporation

Inventors: Lalit Rai Bahl, David Lubensky, Mukund Padmanabhan, Salim Roukos

Telephone messaging and editing system

Patent number: 6219638

Abstract: A messaging system for receiving speech over a telephone and converting the speech to text includes a first server for receiving speech input by a user, a speech recognition system for converting the speech to text, a speech synthesizer for converting the text to speech for playing back the synthesized speech for correction by the user and a correction mechanism for enabling the user to correct the speech such that the corrected speech is provided as text for transmittal over a communication system.

Type: Grant

Filed: November 3, 1998

Date of Patent: April 17, 2001

Assignee: International Business Machines Corporation

Inventors: Mukund Padmanabhan, Michael Picheny, David Nahamoo, Salim Roukos

Speaker adaptation system and method based on class-specific pre-clustering training speakers

Patent number: 6073096

Abstract: A method of speech recognition, in accordance with the present invention, includes the steps of grouping acoustics to form classes based on acoustic features, clustering training speakers by the classes to provide class-specific cluster systems, selecting from the cluster systems a subset of cluster systems closest to adaptation data from a test speaker, transforming the subset of cluster systems to bring the subset of cluster systems closer to the test speaker based on the adaptation data to form adapted cluster systems and combining the adapted cluster systems to create a speaker adapted system for decoding speech from the test speaker. Systems and methods for building speech recognition systems as well as adapting speaker systems for class-specific speaker clusters are included.

Type: Grant

Filed: February 4, 1998

Date of Patent: June 6, 2000

Assignee: International Business Machines Corporation

Inventors: Yuqing Gao, Mukund Padmanabhan, Michael Alan Picheny

System and method for partitioning the feature space of a classifier in a pattern classification system

Patent number: 6058205

Abstract: A system and method are provided which partition the feature space of a classifier by using hyperplanes to construct a binary decision tree or hierarchical data structure for obtaining the class probabilities for a particular feature vector. One objective in the construction of the decision tree is to minimize the average entropy of the empirical class distributions at each successive node or subset, such that the average entropy of the class distributions at the terminal nodes is minimized. First, a linear discriminant vector is computed that maximally separates the classes at any particular node. A threshold is then chosen that can be applied on the value of the projection onto the hyperplane such that all feature vectors that have a projection onto the hyperplane that is less than the threshold are assigned to a child node (say, left child node) and the feature vectors that have a projection greater than or equal to the threshold are assigned to a right child node.

Type: Grant

Filed: January 9, 1997

Date of Patent: May 2, 2000

Assignee: International Business Machines Corporation

Inventors: Lalit Rai Bahl, Peter Vincent deSouza, David Nahamoo, Mukund Padmanabhan
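The node-splitting step in the abstract of patent 6058205 above (project feature vectors, then choose a threshold that sends smaller projections to the left child and the rest to the right child so that the average entropy of the class distributions is low) can be sketched as below. The sketch assumes the discriminant vector `w` has already been computed and is simply passed in; the exhaustive search over observed projection values and the function names are likewise assumptions, not details from the patent.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of the empirical class distribution of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(features, labels, w):
    """Return (threshold, average child entropy) for the best split along w.

    Vectors with projection < threshold go to the left child and vectors with
    projection >= threshold go to the right child. Returns None if all
    projections are identical, since no split is then possible.
    """
    projections = [sum(wi * xi for wi, xi in zip(w, x)) for x in features]
    best = None
    # Skip the smallest projection value so the left child is never empty.
    for t in sorted(set(projections))[1:]:
        left = [lab for p, lab in zip(projections, labels) if p < t]
        right = [lab for p, lab in zip(projections, labels) if p >= t]
        average = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if best is None or average < best[1]:
            best = (t, average)
    return best


if __name__ == "__main__":
    features = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
    labels = ["a", "a", "b", "b"]
    print(best_threshold(features, labels, w=[1.0, 0.0]))
```

Applying the same step recursively to each child, with a new discriminant vector per node, would yield the binary decision tree the abstract describes.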
