Specialized Models Patents (Class 704/250)
  • Patent number: 6195637
    Abstract: A method for correcting misrecognition errors comprises the steps of: dictating to a speech application; marking misrecognized words during the dictating step; and, after the dictating and marking steps, displaying and correcting the marked misrecognized words, whereby the correcting of the misrecognized words is deferred until after the dictating step is concluded and the dictating step is not significantly interrupted. The displaying and correcting step can be implemented by invoking a correction tool of the speech application, whereby the correcting of the misrecognized words trains the speech application.
    Type: Grant
    Filed: March 25, 1998
    Date of Patent: February 27, 2001
    Assignee: International Business Machines Corp.
    Inventors: Barbara E. Ballard, Kerry A. Ortega
  • Patent number: 6182038
    Abstract: A method and apparatus for generating a context dependent phoneme network as an intermediate step of encoding speech information. The context dependent phoneme network is generated from speech in a phoneme network generator (48) associated with an operating system (44). The context dependent phoneme network is then transmitted to a first application (52).
    Type: Grant
    Filed: December 1, 1997
    Date of Patent: January 30, 2001
    Assignee: Motorola, Inc.
    Inventors: Sreeram Balakrishnan, Stephen Austin
  • Patent number: 6178401
    Abstract: A method is provided for reducing search complexity in a speech recognition system having a fast match, a detailed match, and a language model. Based on at least one predetermined variable, the fast match is optionally employed to generate candidate words and acoustic scores corresponding to the candidate words. The language model is employed to generate language model scores. The acoustic scores are combined with the language model scores and the combined scores are ranked to determine top ranking candidate words to be later processed by the detailed match, when the fast match is employed. The detailed match is employed to generate detailed match scores for the top ranking candidate words.
    Type: Grant
    Filed: August 28, 1998
    Date of Patent: January 23, 2001
    Assignee: International Business Machines Corporation
    Inventors: Martin Franz, Miroslav Novak
  • Patent number: 6173260
    Abstract: The classification of speech according to emotional content employs acoustic measures in addition to pitch as classification input. In one embodiment, two different kinds of features in a speech signal are analyzed for classification purposes. One set of features is based on pitch information that is obtained from a speech signal, and the other set of features is based on changes in the spectral shape of the speech signal over time. This latter feature is used to distinguish long, smoothly varying sounds from quickly changing sound, which may indicate the emotional state of the speaker. These changes are determined by means of a low-dimensional representation of the speech signal, such as MFCC or LPC. Additional features of the speech signal, such as energy, can also be employed for classification purposes. Different variations of pitch and spectral shape features can be measured and analyzed, to assist in the classification of individual utterances.
    Type: Grant
    Filed: March 31, 1998
    Date of Patent: January 9, 2001
    Assignee: Interval Research Corporation
    Inventor: Malcolm Slaney
  • Patent number: 6107935
    Abstract: A speaker recognition system for selectively permitting access by a requesting speaker to one of a service and facility includes an acoustic front-end for computing at least one feature vector from a speech utterance provided by the requesting speaker; a speaker dependent codebook store for pre-storing sets of acoustic features, in the form of codebooks, respectively corresponding to a pool of previously enrolled speakers; a speaker identifier/verifier module operatively coupled to the acoustic front-end, wherein: the speaker identifier/verifier module identifies, from identifying indicia provided by the requesting speaker, a previously enrolled speaker as a claimed speaker; further, the speaker identifier/verifier module associates, with the claimed speaker, first and second groups of previously enrolled speakers, the first group being defined as speakers whose codebooks are respectively acoustically similar to the claimed speaker (i.e.
    Type: Grant
    Filed: February 11, 1998
    Date of Patent: August 22, 2000
    Assignee: International Business Machines Corporation
    Inventors: Liam David Comerford, Stephane Herman Maes
  • Patent number: 6094632
    Abstract: A speaker recognition device for judging whether or not an unknown speaker is an authentic registered speaker himself/herself executes "text verification using speaker independent speech recognition" and "speaker verification by comparison with a reference pattern of a password of a registered speaker". A presentation section instructs the unknown speaker to input an ID and utter a specified text designated by a text generation section and a password. The "text verification" of the specified text is executed by a text verification section, and the "speaker verification" of the password is executed by a similarity calculation section. The judgment section judges that the unknown speaker is the authentic registered speaker himself/herself if both the results of the "text verification" and the "speaker verification" are affirmative.
    Type: Grant
    Filed: January 29, 1998
    Date of Patent: July 25, 2000
    Assignee: NEC Corporation
    Inventor: Hiroaki Hattori
  • Patent number: 6081660
    Abstract: A method of forming a cohort for use in identification of an individual by comparing a model of characteristics of the individual, such as a model of utterances, with models of the cohort including a model for the client in respect of whom it is desired to test whether the individual is identifiable. Models related to the population excluding the client are tested to determine whether they meet an acceptance threshold test as to identify with a model for the client. Then, from each meeting the threshold test, it is determined whether those models are distributed so as to present at least a substantial probability that models for nonmembers of the population spaced from the client model in all directions will each be closer to a member of the cohort, excluding the client, than to the client. If that probability is less than a predetermined value, a selection is made from the population of another cohort member which will reduce that probability.
    Type: Grant
    Filed: August 25, 1997
    Date of Patent: June 27, 2000
    Assignee: The Australian National University
    Inventors: Iain Donald Graham Macleod, John Bruce Millar, Fangxin Chen, William Laverty
  • Patent number: 6076055
    Abstract: A speaker verification method consists of the following steps: (1) generating a code book (42) covering a number of speakers having a number of training utterances for each of the speakers; (2) receiving a number of test utterances (44) from a speaker; (3) comparing (46) each of the test utterances to each of the training utterances for the speaker to form a number of decisions, one decision for each of the number of test utterances; (4) weighting each of the decisions (48) to form a number of weighted decisions; and (5) combining (50) the plurality of weighted decisions to form a verification decision (52).
    Type: Grant
    Filed: May 27, 1997
    Date of Patent: June 13, 2000
    Assignee: Ameritech
    Inventors: Robert Wesley Bossemeyer, Jr., Rapeepat Ratasuk
  • Patent number: 6055499
    Abstract: A class of features related to voicing parameters that indicate whether the vocal cords are vibrating. Features describing voicing characteristics of speech signals are integrated with an existing 38-dimensional feature vector consisting of first and second order time derivatives of the frame energy and of the cepstral coefficients with their first and second derivatives. Hidden Markov Model (HMM)-based connected digit recognition experiments comparing the traditional and extended feature sets show that voicing features and spectral information are complementary and that improved speech recognition performance is obtained by combining the two sources of information.
    Type: Grant
    Filed: May 1, 1998
    Date of Patent: April 25, 2000
    Assignee: Lucent Technologies Inc.
    Inventors: Rathinavelu Chengalvarayan, David Lynn Thomson
  • Patent number: 6038528
    Abstract: The present invention relates to a robust speech processing method and system which models channel and noise variations with affine transforms to reduce mismatched conditions between training and testing. The affine transform relating the training vectors $c_k$ to the vectors for the testing condition, $c'_k$, is represented by the form $c'^{T}_k = A c^{T}_k + b$ for $k = 1$ to $N$, in which A is a matrix of predictor coefficients representing noise distortions and the vector b represents channel distortions. Alternatively, an affine invariant cepstrum is generated during testing and training for modeling speech to account for noise and channel effects. From the improved speech processing, improved speaker recognition with channel and noise variations is obtained.
    Type: Grant
    Filed: July 17, 1996
    Date of Patent: March 14, 2000
    Assignee: T-Netix, Inc.
    Inventors: Richard Mammone, Xiaoyu Zhang
  • Patent number: 5995927
    Abstract: A method and an apparatus for performing stochastic matching of a set of input test speech data with a corresponding set of training speech data. In particular, a set of input test speech feature information, having been generated from an input test speech utterance, is transformed so that the stochastic characteristics thereof more closely match the stochastic characteristics of a corresponding set of training speech feature information. The corresponding set of training speech data may, for example, comprise training data which was generated from a speaker having the claimed identity of the speaker of the input test speech utterance. Specifically, in accordance with the present invention, a first covariance matrix representative of stochastic characteristics of input test speech feature information is generated based on the input test speech feature information.
    Type: Grant
    Filed: March 14, 1997
    Date of Patent: November 30, 1999
    Assignee: Lucent Technologies Inc.
    Inventor: Qi P. Li
  • Patent number: 5956702
    Abstract: Each neural element of a column-structured recurrent neural network generates an output from input data and recurrent data provided from a context layer of a corresponding column. One or more candidates for an estimated value is obtained, and an occurrence probability is computed using an internal state by solving an estimation equation determined by the internal state output from the neural network. A candidate having the highest occurrence probability is an estimated value for unknown data. Thus, the internal state of the recurrent neural network is explicitly associated with the estimated value for data, and a data change can be efficiently estimated.
    Type: Grant
    Filed: August 22, 1996
    Date of Patent: September 21, 1999
    Assignee: Fujitsu Limited
    Inventors: Masahiro Matsuoka, Mostefa Golea
  • Patent number: 5950157
    Abstract: Adverse effects of type mismatch between acoustic input devices used during testing and during training in machine-based recognition of the source of acoustic phenomena are minimized. A normalizing model is matched to a source model based, or dependent, upon an acoustic input device whose transfer characteristics color acoustic characteristics of a source as represented in the source model. An application of the present invention is to speaker recognition, i.e., recognition of the identity of a speaker by the speaker's voice.
    Type: Grant
    Filed: April 18, 1997
    Date of Patent: September 7, 1999
    Assignee: SRI International
    Inventors: Larry P. Heck, Mitchel Weintraub
  • Patent number: 5946654
    Abstract: A speech model is produced for use in determining whether a speaker associated with the speech model produced an unidentified speech sample. First a sample of speech of a particular speaker is obtained. Next, the contents of the sample of speech are identified using speech recognition. Finally, a speech model associated with the particular speaker is produced using the sample of speech and the identified contents thereof. The speech model is produced without using an external mechanism to monitor the accuracy with which the contents were identified.
    Type: Grant
    Filed: February 21, 1997
    Date of Patent: August 31, 1999
    Assignee: Dragon Systems, Inc.
    Inventors: Michael Jack Newman, Laurence S. Gillick, Yoshiko Ito
  • Patent number: 5913192
    Abstract: A speaker identification system includes a speaker-independent phrase recognizer. The speaker-independent phrase recognizer scores a password utterance against all the sets of phonetic transcriptions in a lexicon database to determine the N best speaker-independent scores, determines the N best sets of phonetic transcriptions based on the N best speaker-independent scores, and determines the N best possible identities. A speaker-dependent phrase recognizer retrieves the hidden Markov model corresponding to each of the N best possible identities, and scores the password utterance against each of the N hidden Markov models to generate a speaker-dependent score for each of the N best possible identities. A score processor coupled to the outputs of the speaker-independent phrase recognizer and the speaker-dependent phrase recognizer determines a putative identity. A verifier coupled to the score processor authenticates the determined putative identity.
    Type: Grant
    Filed: August 22, 1997
    Date of Patent: June 15, 1999
    Assignee: AT&T Corp
    Inventors: Sarangarajan Parthasarathy, Aaron Edward Rosenberg
  • Patent number: 5899976
    Abstract: A method and system for editing words that have been misrecognized. The system allows a speaker to specify a number of alternative words to be displayed in a correction window by resizing the correction window. The system also displays the words in the correction window in alphabetical order. A preferred system eliminates the possibility, when a misrecognized word is respoken, that the respoken utterance will be again recognized as the same misrecognized word. The system, when operating with a word processor, allows the speaker to specify the amount of speech that is buffered before transferring to the word processor.
    Type: Grant
    Filed: October 31, 1996
    Date of Patent: May 4, 1999
    Assignee: Microsoft Corporation
    Inventor: Michael J. Rozak
  • Patent number: 5839104
    Abstract: A point-of-sale system which uses speech entry to identify and tally items without bar code labels, such as produce items. The system includes a microphone, a scale, and a terminal. The terminal uses the microphone to record words from a checker identifying a produce item during a transaction and prompts an operator to identify the produce item by speaking into the microphone when the produce item is placed upon the scale. The terminal also identifies a price for the item and enters the price into the transaction when the produce item is removed from the scale.
    Type: Grant
    Filed: February 20, 1996
    Date of Patent: November 17, 1998
    Assignee: NCR Corporation
    Inventors: Michael S. Miller, Janet L. Fath, Diego J. Castano, Joe Wahome, Jr.
  • Patent number: 5832429
    Abstract: A method and system for enrolling speed dial names includes providing speaker dependent templates and associated telephone numbers and providing a penalized garbage model for unrecognized speech. When a request for a new template is received, it is determined whether the list of speed dial names is full (Step 201) and, if it is not, whether the name is too similar (Step 205) to a name already on the speed dial list. If so, the name is rejected; if not, it is determined whether the speed dial name is too short (Step 302). If the name is not too short, or if the user wants to enter the short name anyway, the system asks the user to repeat the speed dial name and, if the repetition matches, the name is entered. If it does not match, the system swaps the first and second utterances and compares them again to see whether there is a match.
    Type: Grant
    Filed: September 11, 1996
    Date of Patent: November 3, 1998
    Assignee: Texas Instruments Incorporated
    Inventors: Michele B. Gammel, Thomas Drew Fisher
  • Patent number: 5819219
    Abstract: A digital signal processor usable for speech processing or for some other pattern recognition overcomes the weakness of digital signal processors in the subtraction with following amount formation that must often be implemented in these applications. Auxiliary hardware is provided that holds, in a separate memory, the feature vector that is to be compared to reference feature vectors from the dictionary. The calculating work is performed by a separate arithmetic unit that provides a separate difference-forming and amount-forming unit for each feature comparison. The invention can dramatically reduce the number of clock cycles of the digital signal processor required per comparison. A suitable addressing method ensures that corresponding features of the individual feature vectors are always the ones compared to one another.
    Type: Grant
    Filed: December 11, 1996
    Date of Patent: October 6, 1998
    Assignee: Siemens Aktiengesellschaft
    Inventors: Luc De Vos, Daniel Goryn
  • Patent number: 5794193
    Abstract: A methodology for automated task selection is provided, where the selected task is identified in natural speech of a user making such a selection. A set of meaningful phrases is determined by a grammatical inference algorithm which operates on a predetermined corpus of speech utterances, each such utterance being associated with a specific task objective, and wherein each utterance is marked with its associated task objective. Each meaningful phrase developed by the grammatical inference algorithm can be characterized as having both a Mutual Information value and a Salience value (relative to an associated task objective) above a predetermined threshold.
    Type: Grant
    Filed: September 15, 1995
    Date of Patent: August 11, 1998
    Assignee: Lucent Technologies Inc.
    Inventor: Allen Louis Gorin
  • Patent number: 5774850
    Abstract: A voice characteristics analyzer analyzes the voices of unspecified persons through conventional Karaoke systems. A voice characteristics extraction device extracts spoken voice characteristics when a user speaks specific words. A voice characteristic classification table pre-stores analysis information derived from non-user voices and corresponding to voice characteristics of these specific words. An analysis information output device outputs results of the characteristics analysis that compares current user data with the pre-stored information.
    Type: Grant
    Filed: April 26, 1996
    Date of Patent: June 30, 1998
    Assignee: Fujitsu Limited & Animo Limited
    Inventors: Ichiro Hattori, Akira Suzuki
  • Patent number: 5732393
    Abstract: A sound processor (12) calculates first through third parameters according to an LPC cepstrum, a primary delta cepstrum and a secondary delta cepstrum. The first parameter captures a static characteristic, the second parameter captures a dynamic characteristic with time, and the third parameter captures a locally dynamic characteristic with time. A word dictionary (14) stores first through third parameters for a standard pattern. A DP matching unit (16) then recognizes a voice based on the distance between the three parameters of the input voice and those of the standard pattern.
    Type: Grant
    Filed: December 15, 1995
    Date of Patent: March 24, 1998
    Assignee: Toyota Jidosha Kabushiki Kaisha
    Inventor: Shigeki Aoshima
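
The scheme outlined in the abstract of patent 5732393 (a static cepstrum plus primary and secondary delta cepstra, compared with a stored standard pattern by DP matching) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the cepstral frames are assumed to be computed already, the delta is a simple central difference, the frame distance is Euclidean over the concatenated parameters, and the names `delta`, `stack_features`, `dtw_distance`, and `recognize` are hypothetical.

```python
import math


def delta(frames):
    """Central-difference time derivative of a sequence of feature vectors.

    Edge frames are replicated, so the output has the same length as the input.
    (Assumed form of the delta; the abstract does not specify one.)
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def stack_features(cepstra):
    """Concatenate static, first-delta and second-delta parameters per frame."""
    d1 = delta(cepstra)   # primary delta cepstrum
    d2 = delta(d1)        # secondary delta cepstrum
    return [c + a + b for c, a, b in zip(cepstra, d1, d2)]


def dtw_distance(x, y):
    """Accumulated distance of the best DP (dynamic time warping) alignment."""
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(x[i - 1], y[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # input frame repeated
                                 cost[i][j - 1],      # pattern frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(cepstra, dictionary):
    """Return the dictionary word whose standard pattern is closest by DTW."""
    feats = stack_features(cepstra)
    return min(dictionary, key=lambda word: dtw_distance(feats, dictionary[word]))
```

As a usage example, a dictionary could map each word to `stack_features(...)` of a reference utterance, and `recognize` would then return the word with the smallest accumulated distance to the input. A real system would normally add path constraints and length normalization, which are omitted here for brevity.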