Patents by Inventor Peter V. de Souza
Peter V. de Souza has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 6493667Abstract: In order to achieve low error rates in a speech recognition system, for example, in a system employing rank-based decoding, we discriminate the most confusable incorrect leaves from the correct leaf by lowering their ranks. That is, we increase the likelihood of the correct leaf of a frame, while decreasing the likelihoods of the confusable leaves. In order to do this, we use the auxiliary information from the prediction of the neighboring frames to augment the likelihood computation of the current frame. We then use the residual errors in the predictions of neighboring frames to discriminate between the correct (best) and incorrect leaves of a given frame. We present a new methodology that incorporates prediction error likelihoods into the overall likelihood computation to improve the rank position of the correct leaf.Type: GrantFiled: August 5, 1999Date of Patent: December 10, 2002Assignee: International Business Machines CorporationInventors: Peter V. de Souza, Yuqing Gao, Michael Picheny, Bhuvana Ramabhadran
-
Patent number: 5884261Abstract: Tone-sensitive acoustic models are generated by first generating acoustic vectors which represent the input data. The input data is separated into multiple frames and an acoustic vector is generated for each frame which represents the input data over its corresponding frame. A tone-sensitive parameter is then generated for each of the frames which indicates the tone of the input data at its corresponding frame. Tone-sensitive parameters are generated in accordance with two embodiments. First, a pitch detector may be used to calculate a pitch for each of the frames. If a pitch cannot be detected for a particular frame, then a pitch is created for that frame based on the pitch values of surrounding frames. Second, the cross covariance between the autocorrelation coefficients for each frame and its successive frame may be generated and used as the tone-sensitive parameter.Type: GrantFiled: July 7, 1994Date of Patent: March 16, 1999Assignee: Apple Computer, inc.Inventors: Peter V. de Souza, Adam B. Fineberg, Hsiao-Wuen Hon, Baosheng Yuan
-
Patent number: 5615299Abstract: A speech recognition technique utilizes a set of N different principal discriminant matrices. Each principal discriminant matrix is associated with a distinct class. The class is an indication of the proximity of a speech segment to neighboring phones. A technique for speech encoding includes arranging speech signal into a series of frames. A feature vector is derived which represents the speech signal for a speech segment or series of speech segments for each frame. A set of N different projected vectors are generated for each frame, by multiplying the principal discriminant matrices by the vector. This speech encoding technique is capable of being used in speech recognition systems by utilizing models, in which each model transition is tagged with one of the N classes. The projected vector is utilized with the corresponding tag to compute the probability that at least one particular speech port is present in said frame.Type: GrantFiled: June 20, 1994Date of Patent: March 25, 1997Assignee: International Business Machines CorporationInventors: Lahit R. Bahl, Peter V. de Souza, Ponani Gopalakrishnan, Michael A. Picheny
-
Patent number: 5333236Abstract: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The speech coding apparatus stores a plurality of speech transition models representing speech transitions. At least one speech transition is represented by a plurality of different models. Each speech transition model has a plurality of model outputs, each comprising a prototype match score for a prototype vector signal. Each model output has an output probability. A model match score for a first feature vector signal and each speech transition model comprises the output probability for at least one prototype match score for the first feature vector signal and a prototype vector signal.Type: GrantFiled: September 10, 1992Date of Patent: July 26, 1994Assignee: International Business Machines CorporationInventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, Michael A. Picheny
-
Patent number: 5278942Abstract: A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value are stored. The closeness of the feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature value signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals.Type: GrantFiled: December 5, 1991Date of Patent: January 11, 1994Assignee: International Business Machines CorporationInventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, Arthur J. Nadas, David Nahamoo, Michael A. Picheny
-
Patent number: 5233681Abstract: A speech recognition apparatus and method estimates the next word context for each current candidate word in a speech hypothesis. An initial model of each speech hypothesis comprises a model of a partial hypothesis of zero or more words followed by a model of a candidate word. An initial hypothesis score for each speech hypothesis comprises an estimate of the closeness of a match between the initial model of the speech hypothesis and a sequence of coded representations of the utterance. The speech hypotheses having the best initial hypothesis scores form an initial subset. For each speech hypothesis in the initial subset, the word which is most likely to follow the speech hypothesis is estimated. A revised model of each speech hypothesis in the initial subset comprises a model of the partial hypothesis followed by a revised model of the candidate word. The revised candidate word model is dependent at least on the word which is estimated to be most likely to follow the speech hypothesis.Type: GrantFiled: April 24, 1992Date of Patent: August 3, 1993Assignee: International Business Machines CorporationInventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, Michael A. Picheny
-
Patent number: 5222146Abstract: A speech coding and speech recognition apparatus. The value of at least one feature of an utterance is measured over each of a series of successive time intervals to produce a series of feature vector signals. The closeness of the feature value of each feature vector signal to the parameter value of each of a set of prototype vector signals is determined to obtain prototype match scores for each vector signal and each prototype vector signal. For each feature vector signal, first-rank and second-rank scores are associated with the prototype vector signals having the best and second best prototype match scores, respectively. For each feature vector signal, at least the identification value and the rank score of the first-ranked and second-ranked prototype vector signals are output as a coded utterance representation signal of the feature vector signal, to produce a series of coded utterance representation signals.Type: GrantFiled: October 23, 1991Date of Patent: June 22, 1993Assignee: International Business Machines CorporationInventors: Latit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, Michael A. Picheny
-
Patent number: 5195167Abstract: Symbol feature values and contextual feature values of each event in a training set of events are measured. At least two pairs of complementary subsets of observed events are selected. In each pair of complementary subsets of observed events, one subset has contextual features with values in a set C.sub.n, and the other set has contextual features with values in a set C.sub.n, were the sets in C.sub.n and C.sub.n are complementary sets of contextual feature values. For each subset of observed events, the similarity values of the symbol features of the observed events in the subsets are calculated. For each pair of complementary sets of observed events, a "goodness of fit" is the sum of the symbol feature value similarity of the subsets. The sets of contextual feature values associated with the subsets of observed events having the best "goodness of fit" are identified and form context-dependent bases for grouping the observed events into two output sets.Type: GrantFiled: April 17, 1992Date of Patent: March 16, 1993Assignee: International Business Machines CorporationInventors: Lalit R. Bahl, Peter V. De Souza, Ponani S. Gopalakrishnan, David Nahamoo, Michael A. Picheny
-
Patent number: 5182773Abstract: The present invention is related to speech recognition and particularly to a new type of vector quantizer and a new vector quantization technique in which the error rate of associating a sound with an incoming speech signal is drastically reduced. To achieve this end, the present invention technique groups the feature vectors in a space into different prototypes at least two of which represent a class of sound. Each of the prototypes may in turn have a number of subclasses or partitions. Each of the prototypes and their subclasses may be assigned respective identifying values. To identify an incoming speech feature vector, at least one of the feature values of the incoming feature vector is compared with the different values of the respective prototypes, or the subclasses of the prototypes.Type: GrantFiled: March 22, 1991Date of Patent: January 26, 1993Assignee: International Business Machines CorporationInventors: Lalit R. Bahl, Michael A. Picheny, David Nahamoo, Peter V. de Souza
-
Patent number: 5129001Abstract: Modeling a word is done by concatenating a series of elemental models to form a word model. At least one elemental model in the series is a composite elemental model formed by combining the starting states of at least first and second primitive elemental models. Each primitive elemental model represents a speech component. The primitive elemental models are combined by a weighted combination of their parameters in proportion to the values of the weighting factors. To tailor the word model to closely represent variations in the pronunciation of the word, the word is uttered a plurality of times by a plurality of different speakers. Constructing word models from composite elemental models, and constructing composite elemental models from primitive elemental models enables word models to represent many variations in the pronunciation of a word.Type: GrantFiled: April 25, 1990Date of Patent: July 7, 1992Assignee: International Business Machines CorporationInventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, David Nahamoo, Michael A. Picheny