Patents by Inventor Michael A. Picheny

Michael A. Picheny has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20160267906
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Application
    Filed: March 11, 2015
    Publication date: September 15, 2016
    Inventors: Brian E.D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
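
    Illustrative aside (not part of the patent record): the abstract above describes indexing a time-marked word list and searching it for keyword hits. The minimal Python sketch below approximates only the surrounding workflow with a plain inverted index; it is not the patented word-loop WFST construction, and all data, names, and structures are assumptions.

    ```python
    # Hypothetical sketch: index a time-marked word list and search it for
    # keyword hits. A plain inverted index stands in for the patented
    # word-loop weighted finite state transducer per utterance.
    from collections import defaultdict

    # Assumed shape: one (word, start_sec, end_sec, score) tuple per
    # recognized word, grouped by utterance id.
    tmwl = {
        "utt1": [("please", 0.00, 0.31, 0.97), ("call", 0.31, 0.58, 0.94),
                 ("michael", 0.58, 1.02, 0.91)],
        "utt2": [("call", 0.10, 0.40, 0.88), ("home", 0.40, 0.75, 0.95)],
    }

    def build_index(time_marked_word_list):
        """Map each word to a list of (utterance, start, end, score) hits."""
        index = defaultdict(list)
        for utt_id, words in time_marked_word_list.items():
            for word, start, end, score in words:
                index[word].append((utt_id, start, end, score))
        return index

    def search(index, queries):
        """Return keyword hits for each query, best-scoring first."""
        return {q: sorted(index.get(q, []), key=lambda h: -h[3]) for q in queries}

    index = build_index(tmwl)
    print(search(index, ["call", "michael", "weather"]))
    ```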
  • Patent number: 9195650
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Grant
    Filed: September 23, 2014
    Date of Patent: November 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick A. Hamilton, II, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael A. Picheny, Bhuvana Ramabhadran, Tara N. Sainath
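
    Illustrative aside (not part of the patent record): the abstract above, shared by several entries in this family, describes a three-step conversion from spoken to written language. The sketch below mirrors those steps with toy placeholder rules; the disfluency list, contraction table, and stub recognizer are assumptions, not the patented mappings.

    ```python
    # Hypothetical three-stage pipeline: recognize speech, map each spoken
    # utterance to a formal utterance, then format it as written text.
    DISFLUENCIES = {"um", "uh"}
    CONTRACTIONS = {"gonna": "going to", "wanna": "want to", "kinda": "kind of"}

    def transcribe(audio):
        # Stand-in for an ASR system; returns one string per spoken utterance.
        return ["um so we're gonna meet at three", "uh yeah that kinda works"]

    def to_formal(utterance):
        # Drop disfluencies and expand informal contractions.
        words = [CONTRACTIONS.get(w, w) for w in utterance.split()
                 if w not in DISFLUENCIES]
        return " ".join(words)

    def to_written(formal):
        # Stylistic formatting: capitalize and add terminal punctuation.
        return formal[:1].upper() + formal[1:] + "."

    for spoken in transcribe(audio=None):
        print(to_written(to_formal(spoken)))
    ```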
  • Publication number: 20150120275
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: September 23, 2014
    Publication date: April 30, 2015
    Applicant: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick A. Hamilton, II, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael A. Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 8924210
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Grant
    Filed: May 28, 2014
    Date of Patent: December 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 8856004
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Grant
    Filed: May 13, 2011
    Date of Patent: October 7, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Publication number: 20140278410
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: May 28, 2014
    Publication date: September 18, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Publication number: 20120290299
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: May 13, 2011
    Publication date: November 15, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 7716052
    Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
    Type: Grant
    Filed: April 7, 2005
    Date of Patent: May 11, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Andrew S. Aaron, Ellen M. Eide, Wael M. Hamza, Michael A. Picheny, Charles T. Rutherfoord, Zhi Wei Shuang, Maria E. Smith
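
    Illustrative aside (not part of the patent record): the abstract above concerns selecting and concatenating pre-recorded segments from multiple speakers under a cost function, with speaker identity carried in an attribute vector. The toy sketch below shows one plausible cost-minimizing selection under invented costs and data; the patent's actual cost functions and attributes may differ.

    ```python
    # Hypothetical unit-selection sketch: choose one candidate segment per
    # target unit so that target cost plus a cross-speaker join penalty is
    # minimized. All candidates, costs, and attributes are invented.
    import itertools

    # Each candidate carries an attribute vector that includes the speaker id.
    candidates = {
        "HH": [{"speaker": "A", "pitch": 120}, {"speaker": "B", "pitch": 180}],
        "EH": [{"speaker": "A", "pitch": 118}, {"speaker": "B", "pitch": 182}],
        "L":  [{"speaker": "B", "pitch": 179}],
        "OW": [{"speaker": "A", "pitch": 121}, {"speaker": "B", "pitch": 178}],
    }
    target_pitch = 125

    def target_cost(seg):
        return abs(seg["pitch"] - target_pitch)

    def join_cost(prev, seg):
        # Penalize concatenating segments taken from different speakers.
        return 0 if prev["speaker"] == seg["speaker"] else 50

    def select(units):
        best = None
        for path in itertools.product(*(candidates[u] for u in units)):
            cost = sum(target_cost(s) for s in path)
            cost += sum(join_cost(a, b) for a, b in zip(path, path[1:]))
            if best is None or cost < best[0]:
                best = (cost, path)
        return best

    cost, path = select(["HH", "EH", "L", "OW"])
    print(cost, [s["speaker"] for s in path])
    ```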
  • Patent number: 7702510
    Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing each of the three scores, selecting a score based on a criterion, and selecting one of the three waveforms based on the selected score.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: April 20, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
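
    Illustrative aside (not part of the patent record): the abstract above describes synthesizing the same text with three TTS systems, scoring each result, and keeping the waveform whose score best satisfies a criterion. A minimal sketch of that selection logic, with stub engines and invented scores, follows.

    ```python
    # Hypothetical selection among three TTS engines. Each stub returns a
    # (waveform, score) pair; real systems would return audio plus a
    # confidence or quality score.
    def tts_concatenative(text): return (b"wav-concat", 0.82)
    def tts_formant(text):       return (b"wav-formant", 0.64)
    def tts_hmm(text):           return (b"wav-hmm", 0.77)

    def select_waveform(text, criterion=max):
        engines = (tts_concatenative, tts_formant, tts_hmm)
        results = [engine(text) for engine in engines]
        scores = [score for _, score in results]
        best_score = criterion(scores)           # compare the three scores
        return results[scores.index(best_score)][0]

    print(select_waveform("Hello, world."))
    ```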
  • Patent number: 7475015
    Abstract: A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.
    Type: Grant
    Filed: September 5, 2003
    Date of Patent: January 6, 2009
    Assignee: International Business Machines Corporation
    Inventors: Mark E. Epstein, Hakan Erdogan, Yuqing Gao, Michael A. Picheny, Ruhi Sarikaya
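
    Illustrative aside (not part of the patent record): the abstract above describes a two-pass scheme in which first-pass hypotheses are rescored with semantic structured language models. The toy sketch below only illustrates the rescoring step; the "semantic" score is a fake slot heuristic and all scores are invented.

    ```python
    # Hypothetical rescoring of first-pass hypotheses with an added
    # semantic score; not the patented semantic structured language model.
    hypotheses = [
        {"text": "book a flight to boston", "asr_score": -12.4},
        {"text": "book a fight to boston",  "asr_score": -12.1},
    ]

    SEMANTIC_SLOTS = {"flight", "hotel", "car"}

    def semantic_lm_score(text):
        # Reward hypotheses whose parse would fill a known semantic slot.
        return 5.0 if SEMANTIC_SLOTS & set(text.split()) else 0.0

    def rescore(hyps, lm_weight=1.0):
        return max(hyps, key=lambda h: h["asr_score"]
                   + lm_weight * semantic_lm_score(h["text"]))

    print(rescore(hypotheses)["text"])   # -> "book a flight to boston"
    ```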
  • Publication number: 20080172234
    Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing each of the three scores, selecting a score based on a criterion, and selecting one of the three waveforms based on the selected score.
    Type: Application
    Filed: January 12, 2007
    Publication date: July 17, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
  • Publication number: 20080167876
    Abstract: A method and computer program product for providing paraphrasing in a text-to-speech (TTS) system is provided. The method includes receiving an input text, parsing the input text, and determining a paraphrase of the input text. The method also includes synthesizing the paraphrase into synthesized speech. The method further includes selecting synthesized speech to output, which includes: assigning a score to each synthesized speech associated with each paraphrase, comparing the score of each synthesized speech associated with each paraphrase, and selecting the top-scoring synthesized speech to output. Furthermore, the method includes outputting the selected synthesized speech.
    Type: Application
    Filed: January 4, 2007
    Publication date: July 10, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Raimo Bakis, Ellen M. Eide, Wael Hamza, Michael A. Picheny
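
    Illustrative aside (not part of the patent record): the abstract above describes generating paraphrases of the input text, synthesizing each, scoring the syntheses, and outputting the top-scoring one. The sketch below follows that flow with an assumed paraphrase table and an arbitrary stand-in for the synthesis score.

    ```python
    # Hypothetical paraphrase-and-select flow for a TTS front end.
    PARAPHRASES = {
        "turn left in 200 meters": [
            "turn left in 200 meters",
            "in 200 meters, turn left",
            "make a left in 200 meters",
        ],
    }

    def synthesize(text):
        return b"fake-audio:" + text.encode()

    def synthesis_score(text):
        # Stand-in for a synthesizer's join/prosody score; shorter texts
        # with fewer pauses are (arbitrarily) scored higher here.
        return 1.0 / (len(text) + 5 * text.count(","))

    def speak(input_text):
        # Score every paraphrase and synthesize the top-scoring one.
        best = max(PARAPHRASES.get(input_text, [input_text]), key=synthesis_score)
        return synthesize(best)

    print(speak("turn left in 200 meters"))
    ```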
  • Publication number: 20060229873
    Abstract: A technique for producing speech output in an automatic dialog system is provided. Communication is received from a user at the automatic dialog system. A context of the communication from the user is detected in a context detector of the automatic dialog system. A message is provided to the user from a text-to-speech system of the automatic dialog system in communication with the context detector, wherein the message is provided in accordance with the detected context of the communication.
    Type: Application
    Filed: March 29, 2005
    Publication date: October 12, 2006
    Applicant: International Business Machines Corporation
    Inventors: Ellen Eide, Wael Hamza, Michael Picheny
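
    Illustrative aside (not part of the patent record): the abstract above describes a context detector that conditions the dialog system's spoken message on the user's communication. The sketch below uses an assumed keyword-based detector and canned responses; a real system might instead adapt prosody or speaking style.

    ```python
    # Hypothetical context detection plus context-conditioned response.
    def detect_context(user_text):
        text = user_text.lower()
        if any(w in text for w in ("wrong", "not what", "frustrat")):
            return "frustrated"
        if "?" in text:
            return "inquiry"
        return "neutral"

    RESPONSES = {
        "frustrated": "I'm sorry about that. Let me connect you with an agent.",
        "inquiry":    "Sure, here is the information you asked for.",
        "neutral":    "Okay, your request has been processed.",
    }

    def respond(user_text):
        # The message handed to the TTS system depends on detected context.
        return RESPONSES[detect_context(user_text)]

    print(respond("That is the wrong account again!"))
    ```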
  • Publication number: 20060229876
    Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
    Type: Application
    Filed: April 7, 2005
    Publication date: October 12, 2006
    Inventors: Andrew Aaron, Ellen Eide, Wael Hamza, Michael Picheny, Charles Rutherfoord, Zhi Shuang, Maria Smith
  • Patent number: 7054810
    Abstract: N sets of feature vectors are generated from a set of observation vectors which are indicative of a pattern which it is desired to recognize. At least one of the sets of feature vectors is different than at least one other of the sets of feature vectors, and is preselected for purposes of containing at least some complementary information with regard to the at least one other set of feature vectors. The N sets of feature vectors are combined in a manner to obtain an optimized set of feature vectors which best represents the pattern. The combination is performed via one of a weighted likelihood combination scheme and a rank-based state-selection scheme; preferably, it is done in accordance with an equation set forth herein. In one aspect, a weighted likelihood combination can be employed, while in another aspect, rank-based state selection can be employed. An apparatus suitable for performing the method is described, and implementation in a computer program product is also contemplated.
    Type: Grant
    Filed: October 1, 2001
    Date of Patent: May 30, 2006
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Michael A. Picheny, Bhuvana Ramabhadran
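
    Illustrative aside (not part of the patent record): the abstract above mentions a weighted likelihood combination over multiple feature streams. The numeric sketch below shows one plausible reading, combining per-stream log-likelihoods with fixed stream weights; the values and weights are invented and the patent's actual combination equation may differ.

    ```python
    # Hypothetical weighted likelihood combination over two feature streams.
    # log-likelihoods of each candidate state under two feature streams
    loglik_stream1 = {"s1": -4.1, "s2": -3.0, "s3": -5.2}
    loglik_stream2 = {"s1": -2.9, "s2": -3.8, "s3": -4.0}
    weights = (0.6, 0.4)   # assumed stream weights, summing to 1

    def combined_loglik(state):
        return (weights[0] * loglik_stream1[state]
                + weights[1] * loglik_stream2[state])

    best_state = max(loglik_stream1, key=combined_loglik)
    print(best_state, round(combined_loglik(best_state), 3))   # -> s2 -3.32
    ```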
  • Patent number: 7043432
    Abstract: In a text-to-speech system, a method of converting text-to-speech can include receiving a text input and comparing the received text input to at least one entry in a text-to-speech cache memory. Each entry in the text-to-speech cache memory can specify a corresponding spoken output. If the text input matches one of the entries in the text-to-speech cache memory, the cached speech output specified by the matching entry can be provided.
    Type: Grant
    Filed: August 29, 2001
    Date of Patent: May 9, 2006
    Assignee: International Business Machines Corporation
    Inventors: Raimo Bakis, Hari Chittaluru, Edward A. Epstein, Steven J. Friedland, Abraham Ittycheriah, Stephen G. Lawrence, Michael A. Picheny, Charles Rutherfoord, Maria E. Smith
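
    Illustrative aside (not part of the patent record): the abstract above describes checking incoming text against a text-to-speech cache and returning the stored spoken output on a match. A minimal sketch of that lookup-or-synthesize step, with a stub synthesizer, follows.

    ```python
    # Hypothetical TTS cache: return stored audio on a hit, otherwise
    # synthesize (stubbed here) and store the result for next time.
    class TTSCache:
        def __init__(self):
            self._cache = {}

        def synthesize(self, text):
            if text in self._cache:             # cache hit: reuse spoken output
                return self._cache[text]
            audio = self._full_synthesis(text)  # cache miss: synthesize, store
            self._cache[text] = audio
            return audio

        def _full_synthesis(self, text):
            return b"pcm-audio-for:" + text.encode()

    tts = TTSCache()
    tts.synthesize("Your call is important to us.")   # miss, synthesized
    tts.synthesize("Your call is important to us.")   # hit, served from cache
    ```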
  • Publication number: 20060074634
    Abstract: A method, apparatus and computer instructions are provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and an SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the effort required for human annotation is reduced.
    Type: Application
    Filed: October 6, 2004
    Publication date: April 6, 2006
    Applicant: International Business Machines Corporation
    Inventors: Yuqing Gao, Michael Picheny, Ruhi Sarikaya
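
    Illustrative aside (not part of the patent record): the abstract above has three engines propose annotations that a rover combines with a confidence, so that humans only review uncertain cases. The sketch below replaces the parser/similarity/SVM trio with stub label sequences and uses a simple vote; it illustrates the combination idea only, under those assumptions.

    ```python
    # Hypothetical rover: vote over three engines' labels per word and
    # route low-agreement words to a human annotator.
    from collections import Counter

    words = ["fly", "to", "boston", "tomorrow"]
    engine_outputs = {
        "parser":     ["ACTION", "O", "CITY", "DATE"],
        "similarity": ["ACTION", "O", "CITY", "O"],
        "svm":        ["O",      "O", "CITY", "TIME"],
    }

    def rover(words, outputs, min_votes=2):
        combined, needs_review = [], []
        for i, word in enumerate(words):
            votes = Counter(out[i] for out in outputs.values())
            label, count = votes.most_common(1)[0]
            combined.append((word, label, count / len(outputs)))
            if count < min_votes:
                needs_review.append(word)
        return combined, needs_review

    annotated, review = rover(words, engine_outputs)
    print(annotated)   # each word with its voted label and a confidence
    print(review)      # words a human annotator should still check
    ```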
  • Patent number: 6925154
    Abstract: Techniques for providing an automated conversational name dialing system for placing a call in response to an input by a user. One technique begins with the step of analyzing an input from a user, wherein the input includes information directed to identifying an intended recipient of a telephone call from the user. At least one candidate for the intended recipient is identified in response to the input, wherein the at least one candidate represents at least one potential match between the intended recipient and a predetermined vocabulary. A confidence measure indicative of a likelihood that the at least one candidate is the intended recipient is determined, and additional information is obtained from the user to increase the likelihood that the at least one candidate is the intended recipient, based on the determined confidence measure.
    Type: Grant
    Filed: May 3, 2002
    Date of Patent: August 2, 2005
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Bhuvana Ramabhadran, Chengjun Julian Chen, Hakan Erdogan, Michael A. Picheny
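
    Illustrative aside (not part of the patent record): the abstract above describes matching the caller's request against a vocabulary, computing a confidence for the best candidate, and asking for more information when the confidence is low. The sketch below uses difflib string similarity as a stand-in confidence; the directory, threshold, and matching method are assumptions.

    ```python
    # Hypothetical name-dialing dialog: match, score confidence, and either
    # dial or ask a clarifying question.
    import difflib

    DIRECTORY = {"michael picheny": "x1234", "michael pitney": "x5678",
                 "michaela pinsky": "x9012"}

    def best_candidates(spoken_name, n=3):
        names = list(DIRECTORY)
        matches = difflib.get_close_matches(spoken_name, names, n=n, cutoff=0.0)
        scored = [(m, difflib.SequenceMatcher(None, spoken_name, m).ratio())
                  for m in matches]
        return sorted(scored, key=lambda x: -x[1])

    def dial(spoken_name, threshold=0.95):
        name, confidence = best_candidates(spoken_name)[0]
        if confidence >= threshold:
            return f"Dialing {name} at {DIRECTORY[name]}"
        # Low confidence: obtain additional information from the user.
        return f"Did you mean {name}? Please also say the department."

    print(dial("michael picheny"))   # high confidence, call placed
    print(dial("mike pichini"))      # low confidence, clarification requested
    ```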
  • Publication number: 20050119885
    Abstract: In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
    Type: Application
    Filed: November 28, 2003
    Publication date: June 2, 2005
    Inventors: Scott Axelrod, Sreeram Balakrishnan, Stanley Chen, Yuqing Gao, Ramesh Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Picheny, George Saon, Geoffrey Zweig
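
    Illustrative aside (not part of the patent record): the abstract above models the posterior probability of a linguistic unit given observed speech features with a log-linear model. The numeric sketch below shows the general log-linear form P(w | x) proportional to exp of a weighted feature sum, normalized over candidates; the features, weights, and candidates are invented for illustration.

    ```python
    # Hypothetical log-linear posterior over candidate words given a set of
    # (possibly overlapping, non-independent) feature values.
    import math

    CANDIDATES = ["yes", "no"]
    WEIGHTS = {"acoustic_match": 1.5, "lm_score": 0.8, "duration_ok": 0.3}

    def features(word, observation):
        # Feature functions may overlap and need not be independent.
        return {
            "acoustic_match": observation["acoustic"][word],
            "lm_score": observation["lm"][word],
            "duration_ok": 1.0 if observation["duration"] > 0.2 else 0.0,
        }

    def posterior(observation):
        scores = {w: sum(WEIGHTS[k] * v
                         for k, v in features(w, observation).items())
                  for w in CANDIDATES}
        z = sum(math.exp(s) for s in scores.values())
        return {w: math.exp(s) / z for w, s in scores.items()}

    obs = {"acoustic": {"yes": 0.9, "no": 0.2},
           "lm": {"yes": 0.4, "no": 0.6},
           "duration": 0.35}
    print(posterior(obs))   # normalized posterior over the candidate words
    ```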
  • Publication number: 20050055209
    Abstract: A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.
    Type: Application
    Filed: September 5, 2003
    Publication date: March 10, 2005
    Inventors: Mark Epstein, Hakan Erdogan, Yuqing Gao, Michael Picheny, Ruhi Sarikaya