Patents by Inventor Michael Picheny

Michael Picheny has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and apparatus for translating natural-language speech using multiple output phrases

Patent number: 6859778

Abstract: A multi-lingual translation system that provides multiple output sentences for a given word or phrase. Each output sentence for a given word or phrase reflects, for example, a different emotional emphasis, dialect, accent, loudness or rate of speech. A given output sentence could be selected automatically, or manually as desired, to create a desired effect. For example, the same output sentence for a given word or phrase can be recorded three times, to selectively reflect excitement, sadness or fear. The multi-lingual translation system includes a phrase-spotting mechanism, a translation mechanism, a speech output mechanism and optionally, a language understanding mechanism or an event measuring mechanism or both. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases.

Type: Grant

Filed: March 16, 2000

Date of Patent: February 22, 2005

Assignees: International Business Machines Corporation, OIPENN, Inc.

Inventors: Raimo Bakis, Mark Edward Epstein, William Stuart Meisel, Miroslav Novak, Michael Picheny, Ridley M. Whitaker

Model-based voice activity detection system and method using a log-likelihood ratio and pitch

Patent number: 6615170

Abstract: A system and method for voice activity detection, in accordance with the invention, includes the steps of inputting data including frames of speech and noise, and deciding if the frames of the input data include speech or noise by employing a log-likelihood ratio test statistic and pitch. The frames of the input data are tagged based on the log-likelihood ratio test statistic and pitch characteristics of the input data as being most likely noise or most likely speech. The tags are counted in a plurality of frames to determine if the input data is speech or noise.

Type: Grant

Filed: March 7, 2000

Date of Patent: September 2, 2003

Assignee: International Business Machines Corporation

Inventors: Fu-Hua Liu, Michael A. Picheny
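
The frame-tagging and tag-counting steps in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the single-Gaussian speech and noise models over one scalar feature, the `pitch_bonus` added to the log-likelihood ratio for voiced frames, and the majority-style window rule are all assumptions chosen for illustration, as are the function names.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian (assumed model form)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def tag_frame(feature, has_pitch, speech=(5.0, 1.0), noise=(0.0, 1.0),
              threshold=0.0, pitch_bonus=1.0):
    """Tag one frame as 'speech' or 'noise'.

    The log-likelihood ratio compares a speech model with a noise model.
    Adding pitch_bonus when pitch is detected is an illustrative way of
    combining the two cues; the abstract does not specify the rule.
    """
    llr = gaussian_log_pdf(feature, *speech) - gaussian_log_pdf(feature, *noise)
    if has_pitch:
        llr += pitch_bonus
    return "speech" if llr > threshold else "noise"


def classify_window(frames, min_speech_fraction=0.5):
    """Count tags over a window of (feature, has_pitch) frames."""
    if not frames:
        raise ValueError("window must contain at least one frame")
    tags = [tag_frame(feature, has_pitch) for feature, has_pitch in frames]
    speech_count = tags.count("speech")
    return "speech" if speech_count >= min_speech_fraction * len(tags) else "noise"
```

Counting tags over several frames, rather than trusting any single frame, is what keeps an isolated misclassified frame from flipping the overall decision.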

Method and apparatus for time-synchronized translation and synthesis of natural-language speech

Patent number: 6556972

Abstract: A multi-lingual time-synchronized translation system and method provide automatic time-synchronized spoken translations of spoken phrases. The multi-lingual time-synchronized translation system includes a phrase-spotting mechanism, optionally, a language understanding mechanism, a translation mechanism, a speech output mechanism and an event measuring mechanism. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases. The translation mechanism maps the formal phrase onto a well-formed phrase in one or more target languages. The speech output mechanism produces high-quality output speech using the output of the event measuring mechanism for time synchronization. The event-measuring mechanism measures the duration of various key events in the source phrase.

Type: Grant

Filed: March 16, 2000

Date of Patent: April 29, 2003

Assignee: International Business Machines Corporation

Inventors: Raimo Bakis, Mark Edward Epstein, William Stuart Meisel, Miroslav Novak, Michael Picheny, Ridley M. Whitaker

Method and system for text-to-speech caching

Publication number: 20030046077

Abstract: In a text-to-speech system, a method of converting text-to-speech can include receiving a text input and comparing the received text input to at least one entry in a text-to-speech cache memory. Each entry in the text-to-speech cache memory can specify a corresponding spoken output. If the text input matches one of the entries in the text-to-speech cache memory, the cached speech output specified by the matching entry can be provided.

Type: Application

Filed: August 29, 2001

Publication date: March 6, 2003

Applicant: International Business Machines Corporation

Inventors: Raimo Bakis, Hari Chittaluru, Edward A. Epstein, Steven J. Friedland, Abraham Ittycheriah, Stephen G. Lawrence, Michael A. Picheny, Charles Rutherfoord, Maria E. Smith
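
The cache lookup described in the abstract above can be sketched roughly as follows. The class name, the exact-match dictionary, and the choice to store newly synthesized output on a miss are assumptions for illustration; the `synthesize` callable is a hypothetical stand-in for a real text-to-speech engine.

```python
class CachingSynthesizer:
    """Serve cached speech output when the text input matches a cache entry."""

    def __init__(self, synthesize):
        # synthesize: hypothetical callable mapping text -> audio bytes.
        self._synthesize = synthesize
        self._cache = {}

    def speak(self, text):
        # Compare the received text against the cache entries (exact match).
        if text in self._cache:
            return self._cache[text]
        # On a miss, fall back to synthesis and remember the result
        # (populating the cache here is an assumption, not from the abstract).
        audio = self._synthesize(text)
        self._cache[text] = audio
        return audio
```

A cache of this kind pays off mainly for prompts that repeat often, such as fixed menu phrases, because a hit skips synthesis entirely.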

Methods and apparatus for conversational name dialing systems

Publication number: 20020196911

Abstract: Techniques for providing an automated conversational name dialing system for placing a call in response to an input by a user. One technique begins with the step of analyzing an input from a user, wherein the input includes information directed to identifying an intended recipient of a telephone call from the user. At least one candidate for the intended recipient is identified in response to the input, wherein the at least one candidate represents at least one potential match between the intended recipient and a predetermined vocabulary. A confidence measure indicative of a likelihood that the at least one candidate is the intended recipient is determined, and additional information is obtained from the user to increase the likelihood that the at least one candidate is the intended recipient, based on the determined confidence measure.

Type: Application

Filed: May 3, 2002

Publication date: December 26, 2002

Applicant: International Business Machines Corporation

Inventors: Yuqing Gao, Bhuvana Ramabhadran, Chengjun Julian Chen, Hakan Erdogan, Michael A. Picheny

Enhanced likelihood computation using regression in a speech recognition system

Patent number: 6493667

Abstract: In order to achieve low error rates in a speech recognition system, for example, in a system employing rank-based decoding, we discriminate the most confusable incorrect leaves from the correct leaf by lowering their ranks. That is, we increase the likelihood of the correct leaf of a frame, while decreasing the likelihoods of the confusable leaves. In order to do this, we use the auxiliary information from the prediction of the neighboring frames to augment the likelihood computation of the current frame. We then use the residual errors in the predictions of neighboring frames to discriminate between the correct (best) and incorrect leaves of a given frame. We present a new methodology that incorporates prediction error likelihoods into the overall likelihood computation to improve the rank position of the correct leaf.

Type: Grant

Filed: August 5, 1999

Date of Patent: December 10, 2002

Assignee: International Business Machines Corporation

Inventors: Peter V. de Souza, Yuqing Gao, Michael Picheny, Bhuvana Ramabhadran

Apparatus and method for robust pattern recognition

Publication number: 20020152069

Abstract: N sets of feature vectors are generated from a set of observation vectors which are indicative of a pattern which it is desired to recognize. At least one of the sets of feature vectors is different than at least one other of the sets of feature vectors, and is preselected for purposes of containing at least some complementary information with regard to the at least one other set of feature vectors. The N sets of feature vectors are combined in a manner to obtain an optimized set of feature vectors which best represents the pattern. The combination is performed via one of a weighted likelihood combination scheme and a rank-based state-selection scheme; preferably, it is done in accordance with an equation set forth herein. In one aspect, a weighted likelihood combination can be employed, while in another aspect, rank-based state selection can be employed. An apparatus suitable for performing the method is described, and implementation in a computer program product is also contemplated.

Type: Application

Filed: October 1, 2001

Publication date: October 17, 2002

Applicant: International Business Machines Corporation

Inventors: Yuqing Gao, Michael A. Picheny, Bhuvana Ramabhadran

Audio-visual data collection system

Publication number: 20020120643

Abstract: Methods and apparatus for obtaining visual data in connection with speech recognition. An image capture device captures visible images, a text-supplying device supplies text, and a substantially fully frontal image of a human face is captured during the reading of text from the text-supplying device.

Type: Application

Filed: February 28, 2001

Publication date: August 29, 2002

Applicant: IBM Corporation

Inventors: Giridharan Iyengar, Chalapathy Neti, Michael A. Picheny, Gerasimos Potamianos

Non-leaf node penalty score assignment system and method for improving acoustic fast match speed in large vocabulary systems

Patent number: 6275801

Abstract: A method for fast match processing comprises two stages: a pre-processing stage and an on-line stage. The pre-processing stage comprises the steps of computing an a-priori probability of occurrence for each word from an acoustic vocabulary, and deriving a penalty score for each word from said acoustic vocabulary based on each word's a-priori probability of occurrence in an input text. The on-line stage operates on an input text stream and comprises the steps of computing a path score for each word from said input text, combining the computed path score with the derived penalty score to form a combined score, and testing the combined score against a threshold to determine top ranking candidate words.

Type: Grant

Filed: November 3, 1998

Date of Patent: August 14, 2001

Assignee: International Business Machines Corporation

Inventors: Miroslav Novak, Michael Picheny
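
The two stages in the abstract above can be sketched under simple assumptions. Here the penalty is taken to be the log of each word's relative frequency in some training text, the path scores are supplied as precomputed log-scores, and combining means adding the two; none of these choices, nor the function names or the `unseen_penalty` default, come from the patent.

```python
import math
from collections import Counter


def derive_penalties(training_words):
    """Pre-processing stage: one penalty per word from its a-priori probability.

    Using log relative frequency as the penalty is an assumption.
    """
    counts = Counter(training_words)
    total = sum(counts.values())
    return {word: math.log(count / total) for word, count in counts.items()}


def top_candidates(path_scores, penalties, threshold, unseen_penalty=-20.0):
    """On-line stage: combine path score and penalty, then apply a threshold.

    path_scores maps each candidate word to a log-score assumed to be
    computed elsewhere. Words without a stored penalty get unseen_penalty.
    """
    combined = {
        word: score + penalties.get(word, unseen_penalty)
        for word, score in path_scores.items()
    }
    survivors = [word for word, score in combined.items() if score >= threshold]
    return sorted(survivors, key=combined.get, reverse=True)
```

Because the penalties are computed once ahead of time, the on-line stage adds only a dictionary lookup and one addition per candidate word.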

Telephone messaging and editing system

Patent number: 6219638

Abstract: A messaging system for receiving speech over a telephone and converting the speech to text includes a first server for receiving speech input by a user, a speech recognition system for converting the speech to text, a speech synthesizer for converting the text to speech for playing back the synthesized speech for correction by the user and a correction mechanism for enabling the user to correct the speech such that the corrected speech is provided as text for transmittal over a communication system.

Type: Grant

Filed: November 3, 1998

Date of Patent: April 17, 2001

Assignee: International Business Machines Corporation

Inventors: Mukund Padmanabhan, Michael Picheny, David Nahamoo, Salim Roukos

System and method for sampling rate transformation in speech recognition

Patent number: 6199041

Abstract: A method and system for transforming a sampling rate in speech recognition systems, in accordance with the present invention, includes the steps of providing cepstral based data including utterances comprised of segments at a reference frequency, the segments being represented by cepstral vector coefficients, converting the cepstral vector coefficients to energy bands in logarithmic spectra, filtering the energy bands of the logarithmic spectra to remove energy bands having a frequency above a predetermined portion of a target frequency and converting the filtered logarithmic spectra to modified cepstral vector coefficients at the target frequency. Another method and system convert system prototypes for speech recognition systems from a reference frequency to a target frequency.

Type: Grant

Filed: November 20, 1998

Date of Patent: March 6, 2001

Assignee: International Business Machines Corporation

Inventors: Fu-Hua Liu, Michael A. Picheny

Speech recognition using dynamic features

Patent number: 5615299

Abstract: A speech recognition technique utilizes a set of N different principal discriminant matrices. Each principal discriminant matrix is associated with a distinct class. The class is an indication of the proximity of a speech segment to neighboring phones. A technique for speech encoding includes arranging a speech signal into a series of frames. A feature vector is derived which represents the speech signal for a speech segment or series of speech segments for each frame. A set of N different projected vectors are generated for each frame, by multiplying the principal discriminant matrices by the vector. This speech encoding technique is capable of being used in speech recognition systems by utilizing models, in which each model transition is tagged with one of the N classes. The projected vector is utilized with the corresponding tag to compute the probability that at least one particular speech part is present in said frame.

Type: Grant

Filed: June 20, 1994

Date of Patent: March 25, 1997

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Peter V. de Souza, Ponani Gopalakrishnan, Michael A. Picheny

Speech coding apparatus and method for generating acoustic feature vector component values by combining values of the same features for multiple time intervals

Patent number: 5544277

Abstract: A speech coding apparatus and method measures the values of at least first and second different features of an utterance during each of a series of successive time intervals. For each time interval, a feature vector signal has a first component value equal to a first weighted combination of the values of only one feature of the utterance for at least two time intervals. The feature vector signal has a second component value equal to a second weighted combination, different from the first weighted combination, of the values of only one feature of the utterance for at least two time intervals. The resulting feature vector signals for a series of successive time intervals form a coded representation of the utterance. In one embodiment, a first weighted mixture signal has a value equal to a first weighted mixture of the values of the features of the utterance during a single time interval.

Type: Grant

Filed: July 28, 1993

Date of Patent: August 6, 1996

Assignee: International Business Machines Corporation

Inventors: Raimo Bakis, Ponani S. Gopalakrishnan, Dimitri Kanevsky, Arthur J. Nadas, David Nahamoo, Michael A. Picheny, Jan Sedivy

Speech coding apparatus and method using classification rules

Patent number: 5522011

Abstract: A speech coding apparatus and method uses classification rules to code an utterance while consuming fewer computing resources. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. The classification rules comprise at least first and second sets of classification rules. The first set of classification rules map each feature vector signal from a set of all possible feature vector signals to exactly one of at least two disjoint subsets of feature vector signals. The second set of classification rules map each feature vector signal in a subset of feature vector signals to exactly one of at least two different classes of prototype vector signals. Each class contains a plurality of prototype vector signals. According to the classification rules, a first feature vector signal is mapped to a first class of prototype vector signals.

Type: Grant

Filed: September 27, 1993

Date of Patent: May 28, 1996

Assignee: International Business Machines Corporation

Inventors: Mark E. Epstein, Ponani S. Gopalakrishnan, David Nahamoo, Michael A. Picheny, Jan Sedivy

Speech coding apparatus having acoustic prototype vectors generated by tying to elementary models and clustering around reference vectors

Patent number: 5497447

Abstract: A speech coding apparatus in which measured acoustic feature vectors are each represented by the best matched prototype vector. The prototype vectors are generated by storing a model of a training script comprising a series of elementary models. The value of at least one feature of a training utterance of the training script is measured over each of a series of successive time intervals to produce a series of training feature vectors. A first set of training feature vectors corresponding to a first elementary model in the training script is identified. The feature value of each training feature vector signal in the first set is compared to the parameter value of a first reference vector signal to obtain a first closeness score, and is compared to the parameter value of a second reference vector to obtain a second closeness score for each training feature vector.

Type: Grant

Filed: March 8, 1993

Date of Patent: March 5, 1996

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Ponani S. Gopalakrishnan, Michael A. Picheny, Peter D. De Souza

Labelling speech using context-dependent acoustic prototypes

Patent number: 5455889

Abstract: The present invention relates to labelling of speech in a context-dependent speech recognition system. When labelling speech using context-dependent prototypes, the phone context of a frame of speech needs to be aligned with the appropriate acoustic parameter vector. Since aligning a large amount of data is difficult if based upon arc ranks, the present invention aligns the data using context-independent acoustic prototypes. The phonetic context of each phone of the data is known. Therefore, after the alignment step, the acoustic parameter vectors are tagged with a corresponding phonetic context. Context-dependent prototype vectors exist for each label. For all labels, the context-dependent prototype vectors having the same phonetic context as the tagged acoustic parameter vector are determined.

Type: Grant

Filed: February 8, 1993

Date of Patent: October 3, 1995

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Peter de Souza, P. S. Gopalakrishnan, Michael A. Picheny

Speech recognizer having a speech coder for an acoustic match based on context-dependent speech-transition acoustic models

Patent number: 5333236

Abstract: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The speech coding apparatus stores a plurality of speech transition models representing speech transitions. At least one speech transition is represented by a plurality of different models. Each speech transition model has a plurality of model outputs, each comprising a prototype match score for a prototype vector signal. Each model output has an output probability. A model match score for a first feature vector signal and each speech transition model comprises the output probability for at least one prototype match score for the first feature vector signal and a prototype vector signal.

Type: Grant

Filed: September 10, 1992

Date of Patent: July 26, 1994

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, Michael A. Picheny

Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer

Patent number: 5280562

Abstract: In speech recognition and speech coding, the values of at least two features of an utterance are measured during a series of time intervals to produce a series of feature vector signals. A plurality of single-dimension prototype vector signals having only one parameter value are stored. At least two single-dimension prototype vector signals have parameter values representing first feature values, and at least two other single-dimension prototype vector signals have parameter values representing second feature values. A plurality of compound-dimension prototype vector signals have unique identification values and comprise one first-dimension and one second-dimension prototype vector signal. At least two compound-dimension prototype vector signals comprise the same first-dimension prototype vector signal. The feature values of each feature vector signal are compared to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores.

Type: Grant

Filed: October 3, 1991

Date of Patent: January 18, 1994

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Edward A. Epstein, John M. Lucassen, David Nahamoo, Michael A. Picheny

Speech coding apparatus having speaker dependent prototypes generated from nonuser reference data

Patent number: 5278942

Abstract: A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value are stored. The closeness of the feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature value signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals.

Type: Grant

Filed: December 5, 1991

Date of Patent: January 11, 1994

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, Arthur J. Nadas, David Nahamoo, Michael A. Picheny
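
The best-match coding step described in the abstract above (compare each feature vector with the stored prototypes, then output the identification value of the closest one) can be illustrated as follows. Squared Euclidean distance is an assumed closeness measure, and the function names and data layout are hypothetical; the sketch does not cover how the speaker-dependent prototypes are generated.

```python
def squared_distance(a, b):
    """Assumed closeness measure: squared Euclidean distance (lower is closer)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def code_utterance(feature_vectors, prototypes):
    """Return, for each feature vector, the ID of the best-matching prototype.

    prototypes maps an identification value to a tuple of parameter values.
    """
    labels = []
    for feature_vector in feature_vectors:
        best_id = min(
            prototypes,
            key=lambda pid: squared_distance(feature_vector, prototypes[pid]),
        )
        labels.append(best_id)
    return labels
```

The resulting sequence of identification values is the coded representation that a downstream recognizer would consume in place of the raw feature vectors.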

Fast algorithm for deriving acoustic prototypes for automatic speech recognition

Patent number: 5276766

Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes a memory for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. An acoustic measure is provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. An acoustic matcher is provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises a cluster processor for clustering the feature vector signals into a plurality of clusters.

Type: Grant

Filed: July 16, 1991

Date of Patent: January 4, 1994

Assignee: International Business Machines Corporation

Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. DeSouza, David Nahamoo, Michael A. Picheny
