Patents by Inventor Michael Picheny

Michael Picheny has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 6859778
    Abstract: A multi-lingual translation system that provides multiple output sentences for a given word or phrase. Each output sentence for a given word or phrase reflects, for example, a different emotional emphasis, dialect, accents, loudness or rates of speech. A given output sentence could be selected automatically, or manually as desired, to create a desired effect. For example, the same output sentence for a given word or phrase can be recorded three times, to selectively reflect excitement, sadness or fear. The multi-lingual translation system includes a phrase-spotting mechanism, a translation mechanism, a speech output mechanism and optionally, a language understanding mechanism or an event measuring mechanism or both. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases.
    Type: Grant
    Filed: March 16, 2000
    Date of Patent: February 22, 2005
    Assignees: International Business Machines Corporation, OIPENN, Inc.
    Inventors: Raimo Bakis, Mark Edward Epstein, William Stuart Meisel, Miroslav Novak, Michael Picheny, Ridley M. Whitaker
  • Patent number: 6615170
    Abstract: A system and method for voice activity detection, in accordance with the invention, includes the steps of inputting data including frames of speech and noise, and deciding if the frames of the input data include speech or noise by employing a log-likelihood ratio test statistic and pitch. The frames of the input data are tagged based on the log-likelihood ratio test statistic and pitch characteristics of the input data as being most likely noise or most likely speech. The tags are counted in a plurality of frames to determine if the input data is speech or noise.
    Type: Grant
    Filed: March 7, 2000
    Date of Patent: September 2, 2003
    Assignee: International Business Machines Corporation
    Inventors: Fu-Hua Liu, Michael A. Picheny
  • Patent number: 6556972
    Abstract: A multi-lingual time-synchronized translation system and method provide automatic time-synchronized spoken translations of spoken phrases. The multi-lingual time-synchronized translation system includes a phrase-spotting mechanism, optionally, a language understanding mechanism, a translation mechanism, a speech output mechanism and an event measuring mechanism. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases. The translation mechanism maps the formal phrase onto a well-formed phrase in one or more target languages. The speech output mechanism produces high-quality output speech using the output of the event measuring mechanism for time synchronization. The event-measuring mechanism measures the duration of various key events in the source phrase.
    Type: Grant
    Filed: March 16, 2000
    Date of Patent: April 29, 2003
    Assignee: International Business Machines Corporation
    Inventors: Raimo Bakis, Mark Edward Epstein, William Stuart Meisel, Miroslav Novak, Michael Picheny, Ridley M. Whitaker
  • Publication number: 20030046077
    Abstract: In a text-to-speech system, a method of converting text-to-speech can include receiving a text input and comparing the received text input to at least one entry in a text-to-speech cache memory. Each entry in the text-to-speech cache memory can specify a corresponding spoken output. If the text input matches one of the entries in the text-to-speech cache memory, the cached speech output specified by the matching entry can be provided.
    Type: Application
    Filed: August 29, 2001
    Publication date: March 6, 2003
    Applicant: International Business Machines Corporation
    Inventors: Raimo Bakis, Hari Chittaluru, Edward A. Epstein, Steven J. Friedland, Abraham Ittycheriah, Stephen G. Lawrence, Michael A. Picheny, Charles Rutherfoord, Maria E. Smith
  • Publication number: 20020196911
    Abstract: Techniques for providing an automated conversational name dialing system for placing a call in response to an input by a user. One technique begins with the step of analyzing an input from a user, wherein the input includes information directed to identifying an intended recipient of a telephone call from the user. At least one candidate for the intended recipient is identified in response to the input, wherein the at least one candidate represents at least one potential match between the intended recipient and a predetermined vocabulary. A confidence measure indicative of a likelihood that the at least one candidate is the intended recipient is determined, and additional information is obtained from the user to increase the likelihood that the at least one candidate is the intended recipient, based on the determined confidence measure.
    Type: Application
    Filed: May 3, 2002
    Publication date: December 26, 2002
    Applicant: International Business Machines Corporation
    Inventors: Yuqing Gao, Bhuvana Ramabhadran, Chengjun Julian Chen, Hakan Erdogan, Michael A. Picheny
  • Patent number: 6493667
    Abstract: In order to achieve low error rates in a speech recognition system, for example, in a system employing rank-based decoding, we discriminate the most confusable incorrect leaves from the correct leaf by lowering their ranks. That is, we increase the likelihood of the correct leaf of a frame, while decreasing the likelihoods of the confusable leaves. In order to do this, we use the auxiliary information from the prediction of the neighboring frames to augment the likelihood computation of the current frame. We then use the residual errors in the predictions of neighboring frames to discriminate between the correct (best) and incorrect leaves of a given frame. We present a new methodology that incorporates prediction error likelihoods into the overall likelihood computation to improve the rank position of the correct leaf.
    Type: Grant
    Filed: August 5, 1999
    Date of Patent: December 10, 2002
    Assignee: International Business Machines Corporation
    Inventors: Peter V. de Souza, Yuqing Gao, Michael Picheny, Bhuvana Ramabhadran
  • Publication number: 20020152069
    Abstract: N sets of feature vectors are generated from a set of observation vectors which are indicative of a pattern which it is desired to recognize. At least one of the sets of feature vectors is different than at least one other of the sets of feature vectors, and is preselected for purposes of containing at least some complementary information with regard to the at least one other set of feature vectors. The N sets of feature vectors are combined in a manner to obtain an optimized set of feature vectors which best represents the pattern. The combination is performed via one of a weighted likelihood combination scheme and a rank-based state-selection scheme; preferably, it is done in accordance with an equation set forth herein. In one aspect, a weighted likelihood combination can be employed, while in another aspect, rank-based state selection can be employed. An apparatus suitable for performing the method is described, and implementation in a computer program product is also contemplated.
    Type: Application
    Filed: October 1, 2001
    Publication date: October 17, 2002
    Applicant: International Business Machines Corporation
    Inventors: Yuqing Gao, Michael A. Picheny, Bhuvana Ramabhadran
  • Publication number: 20020120643
    Abstract: Methods and apparatus for obtaining visual data in connection with speech recognition. An image capture device captures visible images, a text-supplying device supplies text, and a substantially fully frontal image of a human face is captured during the reading of text from the text-supplying device.
    Type: Application
    Filed: February 28, 2001
    Publication date: August 29, 2002
    Applicant: IBM Corporation
    Inventors: Giridharan Iyengar, Chalapathy Neti, Michael A. Picheny, Gerasimos Potamianos
  • Patent number: 6275801
    Abstract: A method for fast match processing, comprising two stages, a pre-processing stage and an on-line stage. The pre-processing stage comprises the steps of computing an a-priori probability of occurrence for each word from an acoustic vocabulary, and deriving a penalty score for each word from said acoustic vocabulary based on each word's a-priori probability of occurrence in an input text. The on-line stage operates on an input text stream, comprising the steps of computing a path score for each word from said input text, combining the computed path score with the derived penalty score to form a combined score, and testing the combined score against a threshold to determine top ranking candidate words.
    Type: Grant
    Filed: November 3, 1998
    Date of Patent: August 14, 2001
    Assignee: International Business Machines Corporation
    Inventors: Miroslav Novak, Michael Picheny
  • Patent number: 6219638
    Abstract: A messaging system for receiving speech over a telephone and converting the speech to text includes a first server for receiving speech input by a user, a speech recognition system for converting the speech to text, a speech synthesizer for converting the text to speech for playing back the synthesized speech for correction by the user and a correction mechanism for enabling the user to correct the speech such that the corrected speech is provided as text for transmittal over a communication system.
    Type: Grant
    Filed: November 3, 1998
    Date of Patent: April 17, 2001
    Assignee: International Business Machines Corporation
    Inventors: Mukund Padmanabhan, Michael Picheny, David Nahamoo, Salim Roukos
  • Patent number: 6199041
    Abstract: A method and system for transforming a sampling rate in speech recognition systems, in accordance with the present invention, includes the steps of providing cepstral based data including utterances comprised of segments at a reference frequency, the segments being represented by cepstral vector coefficients, converting the cepstral vector coefficients to energy bands in logarithmic spectra, filtering the energy bands of the logarithmic spectra to remove energy bands having a frequency above a predetermined portion of a target frequency and converting the filtered logarithmic spectra to modified cepstral vector coefficients at the target frequency. Another method and system convert system prototypes for speech recognition systems from a reference frequency to a target frequency.
    Type: Grant
    Filed: November 20, 1998
    Date of Patent: March 6, 2001
    Assignee: International Business Machines Corporation
    Inventors: Fu-Hua Liu, Michael A. Picheny
  • Patent number: 5615299
    Abstract: A speech recognition technique utilizes a set of N different principal discriminant matrices. Each principal discriminant matrix is associated with a distinct class. The class is an indication of the proximity of a speech segment to neighboring phones. A technique for speech encoding includes arranging a speech signal into a series of frames. A feature vector is derived which represents the speech signal for a speech segment or series of speech segments for each frame. A set of N different projected vectors are generated for each frame, by multiplying the principal discriminant matrices by the vector. This speech encoding technique is capable of being used in speech recognition systems by utilizing models, in which each model transition is tagged with one of the N classes. The projected vector is utilized with the corresponding tag to compute the probability that at least one particular speech part is present in said frame.
    Type: Grant
    Filed: June 20, 1994
    Date of Patent: March 25, 1997
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. de Souza, Ponani Gopalakrishnan, Michael A. Picheny
  • Patent number: 5544277
    Abstract: A speech coding apparatus and method measures the values of at least first and second different features of an utterance during each of a series of successive time intervals. For each time interval, a feature vector signal has a first component value equal to a first weighted combination of the values of only one feature of the utterance for at least two time intervals. The feature vector signal has a second component value equal to a second weighted combination, different from the first weighted combination, of the values of only one feature of the utterance for at least two time intervals. The resulting feature vector signals for a series of successive time intervals form a coded representation of the utterance. In one embodiment, a first weighted mixture signal has a value equal to a first weighted mixture of the values of the features of the utterance during a single time interval.
    Type: Grant
    Filed: July 28, 1993
    Date of Patent: August 6, 1996
    Assignee: International Business Machines Corporation
    Inventors: Raimo Bakis, Ponani S. Gopalakrishnan, Dimitri Kanevsky, Arthur J. Nadas, David Nahamoo, Michael A. Picheny, Jan Sedivy
  • Patent number: 5522011
    Abstract: A speech coding apparatus and method uses classification rules to code an utterance while consuming fewer computing resources. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. The classification rules comprise at least first and second sets of classification rules. The first set of classification rules map each feature vector signal from a set of all possible feature vector signals to exactly one of at least two disjoint subsets of feature vector signals. The second set of classification rules map each feature vector signal in a subset of feature vector signals to exactly one of at least two different classes of prototype vector signals. Each class contains a plurality of prototype vector signals. According to the classification rules, a first feature vector signal is mapped to a first class of prototype vector signals.
    Type: Grant
    Filed: September 27, 1993
    Date of Patent: May 28, 1996
    Assignee: International Business Machines Corporation
    Inventors: Mark E. Epstein, Ponani S. Gopalakrishnan, David Nahamoo, Michael A. Picheny, Jan Sedivy
  • Patent number: 5497447
    Abstract: A speech coding apparatus in which measured acoustic feature vectors are each represented by the best matched prototype vector. The prototype vectors are generated by storing a model of a training script comprising a series of elementary models. The value of at least one feature of a training utterance of the training script is measured over each of a series of successive time intervals to produce a series of training feature vectors. A first set of training feature vectors corresponding to a first elementary model in the training script is identified. The feature value of each training feature vector signal in the first set is compared to the parameter value of a first reference vector signal to obtain a first closeness score, and is compared to the parameter value of a second reference vector to obtain a second closeness score for each training feature vector.
    Type: Grant
    Filed: March 8, 1993
    Date of Patent: March 5, 1996
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Ponani S. Gopalakrishnan, Michael A. Picheny, Peter V. De Souza
  • Patent number: 5455889
    Abstract: The present invention relates to labelling of speech in a context-dependent speech recognition system. When labelling speech using context-dependent prototypes, the phone context of a frame of speech needs to be aligned with the appropriate acoustic parameter vector. Since aligning a large amount of data is difficult if based upon arc ranks, the present invention aligns the data using context-independent acoustic prototypes. The phonetic context of each phone of the data is known. Therefore, after the alignment step, the acoustic parameter vectors are tagged with a corresponding phonetic context. Context-dependent prototype vectors exist for each label. For all labels, the context-dependent prototype vectors having the same phonetic context as the tagged acoustic parameter vector are determined.
    Type: Grant
    Filed: February 8, 1993
    Date of Patent: October 3, 1995
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter de Souza, P. S. Gopalakrishnan, Michael A. Picheny
  • Patent number: 5333236
    Abstract: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The speech coding apparatus stores a plurality of speech transition models representing speech transitions. At least one speech transition is represented by a plurality of different models. Each speech transition model has a plurality of model outputs, each comprising a prototype match score for a prototype vector signal. Each model output has an output probability. A model match score for a first feature vector signal and each speech transition model comprises the output probability for at least one prototype match score for the first feature vector signal and a prototype vector signal.
    Type: Grant
    Filed: September 10, 1992
    Date of Patent: July 26, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, Michael A. Picheny
  • Patent number: 5280562
    Abstract: In speech recognition and speech coding, the values of at least two features of an utterance are measured during a series of time intervals to produce a series of feature vector signals. A plurality of single-dimension prototype vector signals having only one parameter value are stored. At least two single-dimension prototype vector signals have parameter values representing first feature values, and at least two other single-dimension prototype vector signals have parameter values representing second feature values. A plurality of compound-dimension prototype vector signals have unique identification values and comprise one first-dimension and one second-dimension prototype vector signal. At least two compound-dimension prototype vector signals comprise the same first-dimension prototype vector signal. The feature values of each feature vector signal are compared to the parameter values of the compound-dimension prototype vector signals to obtain prototype match scores.
    Type: Grant
    Filed: October 3, 1991
    Date of Patent: January 18, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Edward A. Epstein, John M. Lucassen, David Nahamoo, Michael A. Picheny
  • Patent number: 5278942
    Abstract: A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value are stored. The closeness of the feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature value signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals.
    Type: Grant
    Filed: December 5, 1991
    Date of Patent: January 11, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, Arthur J. Nadas, David Nahamoo, Michael A. Picheny
  • Patent number: 5276766
    Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes a memory for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. An acoustic measure is provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. An acoustic matcher is provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises a cluster processor for clustering the feature vector signals into a plurality of clusters.
    Type: Grant
    Filed: July 16, 1991
    Date of Patent: January 4, 1994
    Assignee: International Business Machines Corporation
    Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. DeSouza, David Nahamoo, Michael A. Picheny
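
Several of the speech coding abstracts above (for example, patent 5278942) describe comparing each feature vector to stored prototype vectors and outputting the identification value of the best-matching prototype as the coded representation. The following Python sketch illustrates that general idea only. It is not taken from any of the patents: the closeness measure (negative squared Euclidean distance), the function names match_score and code_utterance, and the sample data are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def match_score(feature: Vector, prototype: Vector) -> float:
    """Hypothetical closeness score: negative squared Euclidean distance.

    The abstracts do not specify the closeness measure; this is an assumption.
    A higher score means a closer match.
    """
    if len(feature) != len(prototype):
        raise ValueError("feature and prototype must have the same dimension")
    return -sum((f - p) ** 2 for f, p in zip(feature, prototype))


def code_utterance(features: List[Vector], prototypes: Dict[str, Vector]) -> List[str]:
    """Return, for each feature vector, the ID of the best-matching prototype."""
    labels = []
    for feature in features:
        # Pick the prototype ID whose match score is highest for this frame.
        best_id = max(prototypes, key=lambda pid: match_score(feature, prototypes[pid]))
        labels.append(best_id)
    return labels


# Illustrative data only: two prototypes and three feature vectors,
# one per successive time interval.
prototypes = {"P1": (0.0, 0.0), "P2": (1.0, 1.0)}
features = [(0.1, -0.2), (0.9, 1.2), (0.4, 0.4)]
labels = code_utterance(features, prototypes)
print(labels)
```

In this sketch the sequence of prototype IDs plays the role of the coded representation of the utterance. A real system would typically use a probabilistic score rather than plain Euclidean distance.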