Similarity Patents (Class 704/239)
-
Patent number: 6470314Abstract: A method of adapting a speech recognition system to one or more acoustic conditions comprises the steps of: (i) computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system; (ii) computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system; (iii) computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and (iv) applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data.Type: GrantFiled: April 6, 2000Date of Patent: October 22, 2002Assignee: International Business Machines CorporationInventors: Satyanarayana Dharanipragada, Mukund Padmanabhan
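A minimal sketch of the CDF-based mapping summarized in this abstract, for a single dimension of the speech vectors. It assumes empirical CDFs built from sorted samples and a simple rank-based inverse; the function names (`empirical_cdf`, `inverse_cdf`, `map_dimension`) are hypothetical and are not taken from the patent.

```python
import math
from bisect import bisect_right


def empirical_cdf(sorted_values, x):
    """Fraction of samples that are less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def inverse_cdf(sorted_values, p):
    """Smallest sample whose empirical CDF value is at least p."""
    index = math.ceil(p * len(sorted_values)) - 1
    index = max(0, min(len(sorted_values) - 1, index))
    return sorted_values[index]


def map_dimension(test_values, train_values):
    """Map each test value to the training value with the same CDF rank.

    This is an assumed, simplified form of the nonlinear transformation:
    y = F_train^{-1}(F_test(x)), applied to one vector dimension.
    """
    test_sorted = sorted(test_values)
    train_sorted = sorted(train_values)
    return [
        inverse_cdf(train_sorted, empirical_cdf(test_sorted, x))
        for x in test_values
    ]


# Test data with a different offset and scale than the training data.
train = [0.0, 1.0, 2.0, 3.0]
test = [10.0, 30.0, 20.0, 40.0]
mapped = map_dimension(test, train)
```

In this toy case the mapped test values take on the range of the training values while keeping their relative order, which is the sense in which the transformed vectors become similar to the training data.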
-
Patent number: 6427134Abstract: A voice activity detector suitable for deployment in a mobile phone apparatus is disclosed. An advantage of the voice activity detector is that it is better able to provide a decision (79) as to whether an input signal (19) consists of noise (which it is not desired to transmit) or comprises speech or information tones (which are required to be transmitted), especially in noisy environments. The voice activity detector includes a number of components, in particular an auxiliary voice activity detector (3). The auxiliary voice activity detector (3) distinguishes between noise and speech on the basis that the spectrum of speech changes more rapidly than that of noise. This results in the auxiliary detector (3) rarely mistaking a speech signal to be a noise signal. Hence, a very reliable noise template (421) is obtained. For this reason, the auxiliary detector (3) is also useful in noise reduction applications. The voice activity detector also uses a neural net classifier (7).Type: GrantFiled: September 26, 1998Date of Patent: July 30, 2002Assignee: British Telecommunications public limited companyInventors: Neil Robert Garner, Paul Alexander Barrett
-
Patent number: 6421640Abstract: The invention relates to a method of automatically recognizing speech utterances, in which a recognition result is evaluated by means of a first confidence measure and a plurality of second confidence measures determined for a recognition result is automatically combined for determining the first confidence measure. To reduce the resultant error rate in the assessment of the correctness of a recognition result, the method is characterized in that the determination of the parameters weighting the combination of the second confidence measures is based on a minimization of a cross-entropy-error measure. A further improvement is achieved by means of a post-processing operation based on the maximization of the Gardner-Derrida error function.Type: GrantFiled: September 13, 1999Date of Patent: July 16, 2002Assignee: Koninklijke Philips Electronics N.V.Inventors: Jannes G. A. Dolfing, Andreas Wendemuth
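One way to picture the weighted combination of confidence measures is a logistic model whose weights are fitted by minimizing cross-entropy with plain gradient descent. The sketch below is an assumption made for illustration: the abstract does not state the model form or the optimizer, the Gardner-Derrida post-processing step is omitted, and all names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def combined_confidence(measures, weights, bias):
    """Combine several second confidence measures into a single value."""
    return sigmoid(bias + sum(w * m for w, m in zip(weights, measures)))


def train_combiner(samples, labels, learning_rate=0.5, epochs=500):
    """Fit weights by gradient descent on the mean cross-entropy error.

    samples: one tuple of confidence measures per recognition result
    labels:  1 if the recognition result was correct, otherwise 0
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    count = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for measures, label in zip(samples, labels):
            # Derivative of the cross-entropy with respect to the logit.
            error = combined_confidence(measures, weights, bias) - label
            grad_b += error
            for k, value in enumerate(measures):
                grad_w[k] += error * value
        bias -= learning_rate * grad_b / count
        weights = [w - learning_rate * g / count for w, g in zip(weights, grad_w)]
    return weights, bias


# Made-up training data: two confidence measures per recognition result.
samples = [(0.9, 0.8), (0.8, 0.9), (0.2, 0.3), (0.1, 0.2)]
labels = [1, 1, 0, 0]
weights, bias = train_combiner(samples, labels)
```

After training on this toy data, results with high individual measures receive a combined confidence above 0.5 and results with low measures fall below it.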
-
Patent number: 6411929Abstract: Frames making up an input speech are each collated with a string of phonemes representing speech candidates to be recognized, whereby evaluation values regarding the phonemes are computed. The frames are each compared with part of the phoneme string so as to reduce computations and memory capacity required in recognizing the input speech based on the evaluation values. That is, each frame is compared with a portion of the phoneme string to acquire an evaluation value for each phoneme. If the acquired evaluation value meets a predetermined condition, part of the phonemes to be collated with the next frame are changed. Illustratively, if the evaluation value for the phoneme heading a given portion of collated phonemes is smaller than the evaluation value of the phoneme which terminates that phoneme portion, then the head phoneme is replaced by the next phoneme. The new portion of phonemes obtained by the replacement is used for collation with the next frame.Type: GrantFiled: July 26, 2000Date of Patent: June 25, 2002Assignee: Hitachi, Ltd.Inventors: Kazuyoshi Ishiwatari, Kazuo Kondo, Shinji Wakisaka
-
Patent number: 6408271Abstract: The invention relates to a method and apparatus for generating phrasal transcriptions. The invention provides generating a group of word transcriptions for each vocabulary item in an orthographic phrase. According to a first embodiment, the invention further provides permuting the word transcriptions to generate a plurality of phrasal transcriptions and computing a score for each phrasal transcription in the plurality of phrasal transcriptions. The set of phrasal transcriptions is then selected from the plurality of phrasal transcriptions at least in part on a basis of the score data elements and stored in a format suitable for use by a speech recognition dictionary. As a variant, the phrasal transcriptions may be released in a format suitable for use by a speech synthesizer.Type: GrantFiled: September 24, 1999Date of Patent: June 18, 2002Assignee: Nortel Networks LimitedInventors: Kenneth W. Smith, Michael G. Sabourin
-
Patent number: 6404925
Abstract: Methods for segmenting audio-video recordings of meetings containing slide presentations by one or more speakers are described. These segments serve as indexes into the recorded meeting. If an agenda is provided for the meeting, these segments can be labeled using information from the agenda. The system automatically detects intervals of video that correspond to presentation slides. Under the assumption that only one person is speaking during an interval when slides are displayed in the video, possible speaker intervals are extracted from the audio soundtrack by finding these regions. Since the same speaker may talk across multiple slide intervals, the acoustic data from these intervals is clustered to yield an estimate of the number of distinct speakers and their order.
Type: Grant
Filed: March 11, 1999
Date of Patent: June 11, 2002
Assignees: Fuji Xerox Co., Ltd., Xerox Corporation
Inventors: Jonathan T. Foote, Lynn Wilcox
-
Patent number: 6397086
Abstract: The disclosure relates to an infrared-controlled hands-free operator for cellular phones housed in a vehicle. More particularly, it is concerned with a hands-free operator which can automatically transmit an infrared signal similar to a control signal of a vehicle's audio stereo system to turn the stereo system off or mute it when incoming signals are received by a cellular phone mounted in a vehicle. The hands-free operator can thus automatically cut off the sound of an audio stereo system in operation, so that the cellular phone can be operated free of interference in a vehicle on the move.
Type: Grant
Filed: June 22, 1999
Date of Patent: May 28, 2002
Assignee: E-Lead Electronic Co., Ltd.
Inventor: Tonny Chen
-
Patent number: 6393397Abstract: An apparatus for selecting a cohort model for use in a speaker verification system includes a model generator (108) for determining a target speaker model (114) from a speech sample collected from the target speaker (106). A cohort selector (110) determines a similarity value between each of a number of predetermined existing speaker models from a model pool (112) and the target speaker model (114) and a dissimilarity value between each of the existing speaker models and any previously selected cohort models (116). An existing speaker model which is most similar to the target speaker model, but most dissimilar to previously chosen cohort models, is then chosen as another cohort model for the target speaker.Type: GrantFiled: June 14, 1999Date of Patent: May 21, 2002Assignee: Motorola, Inc.Inventors: Ho Chuen Choi, Xiaoyuan Zhu, Jianming Song
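The selection rule in this abstract can be sketched as a greedy loop. The sketch assumes speaker models are plain coordinate tuples, that similarity is `1 / (1 + Euclidean distance)`, and that the two criteria are combined by subtracting the highest similarity to any cohort model already chosen. None of these choices come from the patent, and all names are hypothetical.

```python
import math


def similarity(model_a, model_b):
    """Assumed similarity in (0, 1]: larger means the models are closer."""
    return 1.0 / (1.0 + math.dist(model_a, model_b))


def select_cohort(target, pool, cohort_size):
    """Greedily pick models similar to the target but unlike earlier picks.

    pool maps a speaker name to that speaker's model.
    """
    candidates = dict(pool)
    cohort = []  # list of (name, model) pairs in selection order

    def score(name):
        model = candidates[name]
        closeness = similarity(model, target)
        # Highest similarity to any previously selected cohort model.
        redundancy = max(
            (similarity(model, chosen) for _, chosen in cohort), default=0.0
        )
        return closeness - redundancy

    while candidates and len(cohort) < cohort_size:
        best = max(candidates, key=score)
        cohort.append((best, candidates.pop(best)))
    return [name for name, _ in cohort]


target_model = (0.0, 0.0)
model_pool = {"a": (1.0, 0.0), "b": (1.1, 0.0), "c": (-1.5, 0.0)}
chosen = select_cohort(target_model, model_pool, cohort_size=2)
```

Here `b` is closer to the target than `c`, but it nearly duplicates `a`, so the second pick is `c`. This is the spread around the target that the dissimilarity criterion is meant to encourage.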
-
Patent number: 6349148Abstract: The invention relates to a device for the verification of time-dependent, user-specific signals which includes means for generating a set of feature vectors which serve to provide an approximative description of an input signal and are associated with selectable sampling intervals of the signal; means for preparing an HMM model for the signal; means for determining a first probability value which describes the probability of occurrence of the set of feature vectors, given the HMM model, and a threshold decider for comparing the first probability value with a threshold value and for deciding on the verification of the signal.Type: GrantFiled: May 26, 1999Date of Patent: February 19, 2002Assignee: U.S. Philips CorporationInventor: Jannes G.A. Dolfing
-
Patent number: 6341263Abstract: A voice recognition system, method and storage medium is provided. The system includes a plurality of storage sections, a selection section, an adaptation section, a plurality of calculation sections, an adaptation section, a normalization section and a decision section. The method includes the steps for performing the functions associated with the sections.Type: GrantFiled: May 17, 1999Date of Patent: January 22, 2002Assignee: NEC CorporationInventors: Eiko Yamada, Hiroaki Hattori
-
Publication number: 20010047257Abstract: A method of assigning a similarity score representative of a similarity between a first speech signal and a second speech signal. The method includes generating a signal transformation responsive to both the first and second signals, determining a transformation score based on at least one characteristic of the generated transformation and calculating the similarity score as a function of the transformation score.Type: ApplicationFiled: January 24, 2001Publication date: November 29, 2001Inventors: Gabriel Artzi, Yaron Paz, Yehuda Hershkovits
-
Patent number: 6314392
Abstract: In a computerized method, a continuous signal is segmented in order to determine statistically stationary units of the signal. The continuous signal is sampled at periodic intervals to produce a timed sequence of digital samples. Fixed numbers of adjacent digital samples are grouped into a plurality of disjoint sets or frames. A statistical distance between adjacent frames is determined. The adjacent sets are merged into a larger set of samples or cluster if the statistical distance is less than a predetermined threshold. In an iterative process, the statistical distance between the adjacent sets is determined, and as long as the distance is less than the predetermined threshold, the sets are iteratively merged to segment the signal into statistically stationary units.
Type: Grant
Filed: September 20, 1996
Date of Patent: November 6, 2001
Assignee: Digital Equipment Corporation
Inventors: Brian S. Eberman, William D. Goldenthal
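The iterative merging described in this abstract can be sketched as follows. The statistical distance used here, the absolute difference of segment means, is a stand-in chosen for brevity; the abstract does not say which distance the patent uses, and the function name is hypothetical.

```python
from statistics import fmean


def segment_signal(samples, frame_size, threshold):
    """Group samples into fixed-size frames, then merge similar neighbours.

    Adjacent segments are merged while the (assumed) distance between
    them, the absolute difference of their means, is below the threshold.
    """
    if not samples:
        return []
    segments = [
        samples[i:i + frame_size] for i in range(0, len(samples), frame_size)
    ]
    merged_any = True
    while merged_any:
        merged_any = False
        result = [segments[0]]
        for current in segments[1:]:
            if abs(fmean(result[-1]) - fmean(current)) < threshold:
                result[-1] = result[-1] + current  # merge into a larger set
                merged_any = True
            else:
                result.append(current)
        segments = result
    return segments


signal = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]
units = segment_signal(signal, frame_size=2, threshold=0.5)
```

With this input the four initial frames collapse into two segments, one per level of the signal. Each pass that merges anything reduces the number of segments, so the loop terminates.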
-
Patent number: 6301559Abstract: To realize a speech recognition method and speech recognition device that reduce erroneous recognition for words that are not to be recognized and ambient sounds, and to improve the recognition capability, characteristic parameters of words to be recognized and characteristic parameters of words that are not to be recognized and ambient sounds are previously entered in a vocabulary template 40. Degrees of similarity are obtained in a speech recognition section 30 between characteristic parameters for input words (or sounds) and all the characteristic parameters stored in the vocabulary template 40. Information indicating one characteristic parameter, from among the characteristic parameters stored in the vocabulary template 40, that is the closest approximation to the characteristic parameter of the input word (or sound), is generated as a recognition result.Type: GrantFiled: November 16, 1998Date of Patent: October 9, 2001Assignee: Oki Electric Industry Co., Ltd.Inventors: Hiroshi Shinotsuka, Noritoshi Hino
-
Patent number: 6292776Abstract: A method and apparatus for first training and then recognizing speech. The method and apparatus use subband cepstral features to improve the recognition string accuracy rates for speech inputs.Type: GrantFiled: March 12, 1999Date of Patent: September 18, 2001Assignee: Lucent Technologies Inc.Inventor: Rathinavelu Chengalvarayan
-
Patent number: 6269335Abstract: A method of identifying homophones of a word uttered by a user from at least a portion of existing words of a vocabulary of a speech recognition engine comprises the steps of: a user uttering the word; decoding the uttered word; computing respective measures between the decoded word and at least a portion of the other existing vocabulary words, the respective measures indicative of acoustic similarity between the word and the at least a portion of other existing words; if at least one measure is within a threshold range, indicating, to the user, results associated with the at least one measure, the results preferably including the decoded word and the other existing vocabulary word associated with the at least one measure; and the user preferably making a selection depending on the word the user intended to utter.Type: GrantFiled: August 14, 1998Date of Patent: July 31, 2001Assignee: International Business Machines CorporationInventors: Abraham Ittycheriah, Stephane Herman Maes, Michael Daniel Monkowski, Jeffrey Scott Sorensen
-
Publication number: 20010010039Abstract: Apparatus for Mandarin Chinese speech recognition by using initial/final phoneme similarity vector, for improving the Chinese speech recognition accuracy and downsizing the needed memory is provided.Type: ApplicationFiled: December 8, 2000Publication date: July 26, 2001Applicant: Matsushita Electrical Industrial Co., Ltd.Inventor: Chung-Ho Yang
-
Patent number: 6263309
Abstract: A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principal component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.
Type: Grant
Filed: April 30, 1998
Date of Patent: July 17, 2001
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Patrick Nguyen, Roland Kuhn, Jean-Claude Junqua
-
Patent number: 6260012Abstract: An apparatus and method for performing improved speech recognition in a communication terminal, e.g., a mobile phone with a hands-free voice dialing function. In a speech recognition mode, a user's input speech such as a desired called party name, number or a phone command, is converted to feature data and compared to individual pre-stored feature data sets corresponding to pre-recorded speech obtained during a registration process. Difference values representing the respective differences between the current user's input speech and the respective data sets are computed. A first closest (most similar) and second closest feature data set correspond to the first smallest and second smallest difference values so obtained. A closeness threshold is computed as the sum of a small, predetermined threshold and a differential value between the first and second difference values.Type: GrantFiled: March 1, 1999Date of Patent: July 10, 2001Assignee: Samsung Electronics Co., LTDInventor: Joung-Kyou Park
-
Patent number: 6256630Abstract: A database accessing system for processing a request to access a database including a multiplicity of entries, each entry including at least one word, the request including a sequence of representations of possibly erroneous user inputs, the system including a similar word finder operative, for at least one interpretation of each representation, to find at least one database word which is at least similar to that interpretation, and a database entry evaluator operative, for each database word found by the similar word finder, to assign similarity values for relevant entries in the database, said values representing the degree of similarity between each database entry and the request.Type: GrantFiled: June 17, 1999Date of Patent: July 3, 2001Assignee: Phonetic Systems Ltd.Inventors: Atzmon Gilai, Hezi Resnekov
-
Patent number: 6246982Abstract: A method for computing a distance between collections of distributions or finite mixture models of features. Data is processed so as to define at least first and second collections of distributions of features. For each distribution of the first collection, the distance to each distribution of the second collection is measured to determine which distribution of the second collection is the closest (most similar). The same procedure is performed for the distributions of the second collection. Based on the closest distance measures, a final distance is computed representing the distance between the first and second collections. This final distance may be a weighted sum of the closest distances. The distance measure may be used in a number of applications such as [speaker classification,] speaker recognition and audio segmentation.Type: GrantFiled: January 26, 1999Date of Patent: June 12, 2001Assignee: International Business Machines CorporationInventors: Homayoon S. M. Beigi, Stephane H. Maes, Jeffrey S. Sorensen
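A small sketch of the closest-distance scheme in this abstract. It assumes each distribution is summarized by a single number (for example a mean), that the distance between two distributions is supplied by the caller, and that the final distance is a weighted average of the closest distances in both directions. The normalization and all names are assumptions, not details from the patent.

```python
def collection_distance(first, second, distance,
                        first_weights=None, second_weights=None):
    """Distance between two collections of distributions.

    For every distribution in one collection, find the closest distribution
    in the other collection, then combine those closest distances as a
    weighted average (an assumed form of the weighted sum).
    """
    first_weights = first_weights or [1.0] * len(first)
    second_weights = second_weights or [1.0] * len(second)

    # Closest distance from each member of `first` into `second`.
    forward = [min(distance(a, b) for b in second) for a in first]
    # Closest distance from each member of `second` into `first`.
    backward = [min(distance(b, a) for a in first) for b in second]

    total = sum(w * d for w, d in zip(first_weights, forward))
    total += sum(w * d for w, d in zip(second_weights, backward))
    return total / (sum(first_weights) + sum(second_weights))


def mean_gap(a, b):
    """Toy distance between two distributions summarized by their means."""
    return abs(a - b)


speaker_a = [0.0, 10.0]
speaker_b = [1.0, 10.0, 20.0]
gap = collection_distance(speaker_a, speaker_b, mean_gap)
```

Running the search in both directions keeps the measure symmetric, so `collection_distance(a, b, ...)` and `collection_distance(b, a, ...)` agree when the weights are swapped accordingly.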
-
Publication number: 20010003173Abstract: A method for increasing voice recognition rate in a voice recognition system comprising the steps of: establishing a reference model for user voices subjected to recognition; receiving the user voices for voice recognition commands; detecting the range and characteristics of the received voice data; comparing the range and characteristics of the detected voice data with the characteristics of the previously obtained reference voice model to retrieve a word having the largest similarity; comparing the similarity of the retrieved word with the similarity reference value to report a voice recognition failure when the compared result is below the reference value, and to report a voice recognition success and perform the command corresponding to the recognized word when the compared result is at least the reference value; and modifying the characteristics of the voice data which succeeded in the voice recognition into the reference voice model which was used in the corresponding voice recognition.Type: ApplicationFiled: December 6, 2000Publication date: June 7, 2001Applicant: LG Electronics Inc.Inventor: Keun Ok Lim
-
Patent number: 6236964Abstract: A speech recognition method and apparatus in which a speech section is sliced by the unit of a word by spotting and candidate words are selected. Next, in a second stage, matching is conducted by the unit of a phoneme. Consequently, selection of the candidate words and slicing of the speech section can be performed concurrently. Furthermore, narrowing of the candidate words is facilitated. Furthermore, since reference phoneme patterns under a plurality of environments are prepared, recognition of an input speech under a larger number of conditions is possible using a smaller amount of data when compared with the case in which reference word patterns under a plurality of environments are prepared.Type: GrantFiled: February 14, 1994Date of Patent: May 22, 2001Assignee: Canon Kabushiki KaishaInventors: Junichi Tamura, Tetsuo Kosaka, Atsushi Sakurai
-
Patent number: 6211876Abstract: Methods and apparatus are provided for accessing an experience journal which includes unstructured text items relating to a topic, such as a medical condition. The method is implemented in a computer system including a processor, a storage device, a video display unit having a display screen, and a user interface. The unstructured text items are stored in the storage device. Similarities among the unstructured text items are determined, and icons, one corresponding to each of the unstructured text items, are displayed on the display screen. The icons are positioned on the display screen relative to each other, such that the distances between icons are representative of the determined similarities among the unstructured text items. In response to user selection of one of the icons, the corresponding unstructured text item is displayed on the display screen.Type: GrantFiled: June 22, 1998Date of Patent: April 3, 2001Assignee: Mitsubishi Electric Research Laboratories, Inc.Inventors: Edith Ackermann, Dennis Nathan Bromley, David Ray DeMaso, Sara Frances Frisken Gibson, Joseph Gonzalez-Heydrich, Judith Galler Karlin, Joseph Marks, Chia Shen, Carol Strohecker
-
Patent number: 6195634Abstract: Assessing decoys for use in an audio recognition process for identifying predetermined sounds in an unknown input audio signal, involves a test recognition process for matching known training audio signals to models representing the predetermined sounds and the decoys and determining for each of the decoys, from the results of the test recognition process, a score representing the effect of the respective decoy in the recognition of any of the known training audio signals. An advantage arising from generating scores for decoys is that the chance of a poor selection of decoys can be reduced. Thus the possibility of poor recognition performance arising from poorly selected decoys can be reduced. Furthermore, the requirement for expert input into the decoy creation process, which may be time consuming, can be reduced. This can make it easier, or quicker, or less expensive to install.Type: GrantFiled: December 24, 1997Date of Patent: February 27, 2001Assignee: Nortel Networks CorporationInventors: Martin Dudemaine, Claude Pelletier
-
Patent number: 6192337Abstract: A method of training at least one new word for addition to a vocabulary of a speech recognition engine containing existing words comprises the steps of: a user uttering the at least one new word; computing respective measures between the at least one newly uttered word and at least a portion of the existing vocabulary words, the respective measures indicative of acoustic similarity between the at least one word and the at least a portion of existing words; if no measure is within the threshold range, automatically adding the at least one newly uttered word to the vocabulary; and if at least one measure is within a threshold range, refraining from automatically adding the at least one newly uttered word to the vocabulary.Type: GrantFiled: August 14, 1998Date of Patent: February 20, 2001Assignee: International Business Machines CorporationInventors: Abraham Ittycheriah, Stephane H. Maes
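The enrollment rule in this abstract can be sketched with a stand-in similarity measure. Here acoustic similarity is approximated by the edit distance between phoneme sequences, and "within the threshold range" is taken to mean a distance at or below a small limit. Both are assumptions for illustration, and the names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (x != y),  # substitution or match
            ))
        previous = current
    return previous[-1]


def try_add_word(vocabulary, word, phonemes, max_confusable_distance=1):
    """Add the word only if no existing word is acoustically too close.

    Returns the list of confusable existing words; the word is added
    automatically only when that list is empty.
    """
    confusable = [
        existing for existing, existing_phonemes in vocabulary.items()
        if edit_distance(phonemes, existing_phonemes) <= max_confusable_distance
    ]
    if not confusable:
        vocabulary[word] = phonemes
    return confusable


vocabulary = {"right": ("r", "ay", "t")}
clash = try_add_word(vocabulary, "write", ("r", "ay", "t"))
no_clash = try_add_word(vocabulary, "left", ("l", "eh", "f", "t"))
```

The homophone "write" is held back because it matches "right", while "left" is added automatically.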
-
Patent number: 6182039Abstract: The speech recognizer incorporates a language model that reduces the number of acoustic pattern matching sequences that must be performed by the recognizer. The language model is based on knowledge of a pre-defined set of syntactically defined content and includes a data structure that organizes the content according to acoustic confusability. A spelled name recognition system based on the recognizer employs a language model based on classes of letters that the recognizer frequently confuses for one another. The language model data structure is optionally an N-gram data structure, a tree data structure, or an incrementally configured network that is built during a training sequence. The incrementally configured network has nodes that are selected based on acoustic distance from a predetermined lexicon.Type: GrantFiled: March 24, 1998Date of Patent: January 30, 2001Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Luca Rigazio, Jean-Claude Junqua, Michael Galler
-
Patent number: 6157911
Abstract: A method and a system substantially eliminate erroneous voice recognition of repetitive elements in word spotting. One preferred embodiment according to the current invention eliminates erroneous voice recognition of repetitive elements by selectively prolonging a response time of words containing repetitive elements. In order to substantially eliminate the errors, in another preferred embodiment according to the current invention, words containing repetitive elements are marked by a silent key word.
Type: Grant
Filed: March 27, 1998
Date of Patent: December 5, 2000
Assignee: Ricoh Company, Ltd.
Inventor: Masaru Kuroda
-
Patent number: 6151575Abstract: A source-adapted model for use in speech recognition is generated by defining a linear relationship between a first element of an initial model and a first element of the source-adapted model. Thereafter, speech data that corresponds to the first element of the initial model is assembled from a set of speech data for a particular source associated with the source-adapted model. A linear transform that maps between the assembled speech data and the first element of the initial model is then determined. Finally, a first element of the source-adapted model is produced from the first element of the initial model using the linear transform.Type: GrantFiled: October 28, 1997Date of Patent: November 21, 2000Assignee: Dragon Systems, Inc.Inventors: Michael Jack Newman, Laurence S. Gillick, Venkatesh Nagesha
-
Patent number: 6151576
Abstract: Methods and apparatus for processing, storing and transmitting an original data stream of digitized speech samples. The method converts a stream of digitized speech samples to a stream of text and associated reliability measures. A mixed-media data stream is created with the stream of text as a text component and selected portions of the digitized stream of speech as a speech component. The selected portions are those whose corresponding reliability measures fall below a threshold. The threshold can be changed to change the amount of storage or bandwidth used by the mixed-media data stream. The mixed-media data stream can be searched and the results can be spoken as synthetic speech derived from the text component or as speech samples taken from the digitized speech component.
Type: Grant
Filed: August 11, 1998
Date of Patent: November 21, 2000
Assignee: Adobe Systems Incorporated
Inventors: John E. Warnock, T. V. Raman
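A rough sketch of how such a mixed-media stream might be assembled. It assumes the recognizer output is a list of `(text, reliability, audio_bytes)` tuples and that the stream is a list of dictionaries. These data shapes and the function name are hypothetical and are not described in the abstract.

```python
def build_mixed_stream(recognized, threshold):
    """Keep text for every segment; keep audio only where text is unreliable.

    recognized: list of (text, reliability, audio_bytes) tuples
    threshold:  audio is retained when reliability falls below this value
    """
    stream = []
    for text, reliability, audio in recognized:
        keep_audio = reliability < threshold
        stream.append({"text": text, "audio": audio if keep_audio else None})
    return stream


def audio_size(stream):
    """Total number of audio bytes retained in the stream."""
    return sum(len(item["audio"]) for item in stream if item["audio"] is not None)


recognized = [
    ("hello", 0.95, b"\x01\x02\x03\x04"),
    ("wurld", 0.40, b"\x05\x06\x07\x08"),
    ("today", 0.70, b"\x09\x0a\x0b\x0c"),
]
compact = build_mixed_stream(recognized, threshold=0.5)
cautious = build_mixed_stream(recognized, threshold=0.8)
```

Raising the threshold from 0.5 to 0.8 retains audio for two segments instead of one, which illustrates the trade-off between storage or bandwidth and fidelity mentioned in the abstract.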
-
Patent number: 6134527Abstract: A method of testing a new vocabulary word is performed using any set of enrollment utterances provided by the user or from an available database. The present method preferably does not use separate training and similarity test utterances. This allows any or all available repetitions of a vocabulary word being enrolled to be used for training (204), therefore improving the robustness of the trained models. Likewise, any or all training repetitions can also be utilized for similarity analysis (212), providing additional test samples which should further improve the detection of acoustically similar words. Additionally, the similarity analysis progresses incrementally and does not need to continue if a confusable word is found. Finally, first and second thresholds could be employed (212, 302) to provide greater flexibility for a user training a speech recognition system.Type: GrantFiled: January 30, 1998Date of Patent: October 17, 2000Assignee: Motorola, Inc.Inventors: Jeffrey Arthur Meunier, Edward Srenger, Steven Albrecht
-
Patent number: 6094632Abstract: A speaker recognition device for judging whether or not an unknown speaker is an authentic registered speaker himself/herself executes `text verification using speaker independent speech recognition` and `speaker verification by comparison with a reference pattern of a password of a registered speaker`. A presentation section instructs the unknown speaker to input an ID and utter a specified text designated by a text generation section and a password. The `text verification` of the specified text is executed by a text verification section, and the `speaker verification` of the password is executed by a similarity calculation section. The judgment section judges that the unknown speaker is the authentic registered speaker himself/herself if both the results of the `text verification` and the `speaker verification` are affirmative.Type: GrantFiled: January 29, 1998Date of Patent: July 25, 2000Assignee: NEC CorporationInventor: Hiroaki Hattori
-
Patent number: 6052662Abstract: Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.Type: GrantFiled: January 29, 1998Date of Patent: April 18, 2000Assignee: Regents of the University of CaliforniaInventor: John E. Hogden
-
Patent number: 6018736Abstract: A database accessing system for processing a request to access a database including a multiplicity of entries, each entry including at least one word, the request including a sequence of representations of possibly erroneous user inputs, the system including a similar word finder operative, for at least one interpretation of each representation, to find at least one database word which is at least similar to that interpretation, and a database entry evaluator operative, for each database word found by the similar word finder, to assign similarity values for relevant entries in the database, said values representing the degree of similarity between each database entry and the request.Type: GrantFiled: November 20, 1996Date of Patent: January 25, 2000Assignee: Phonetic Systems Ltd.Inventors: Atzmon Gilai, Hezi Resnekov
-
Patent number: 6006182Abstract: Systems and methods consistent with the present invention determine whether to accept one of a plurality of intermediate recognition results output by a speech recognition system as a final recognition result. The system first combines a plurality of speech rejection features into a feature function in which weights are assigned to each rejection feature in accordance with a recognition accuracy of each rejection feature. Feature values are then calculated for each of the rejection features using the plurality of intermediate recognition results. The system next computes the feature function according to the calculated feature values to determine a rejection decision value. Finally, one of the plurality of intermediate recognition results is accepted as the final recognition result according to the rejection decision value.Type: GrantFiled: September 22, 1997Date of Patent: December 21, 1999Assignee: Northern Telecom LimitedInventors: Waleed Fakhr, Serge Robillard, Vishwa Gupta, Real Tremblay, Michael Sabourin, Jean-Francois Crespo
-
Patent number: 5999902Abstract: A recognizer is provided with a priori probability values (e.g., from some previous recognition) indicating how likely the various words of the recognizer's vocabulary are to occur in the particular context, and recognition "scores" are weighted by these values before a result (or results) is chosen. The recognizer also employs "pruning" whereby low-scoring partial results are discarded, so as to speed the recognition process. To avoid premature pruning of the more likely words, probability values are applied before the pruning decisions are made. A method of applying these probability values is described.Type: GrantFiled: July 16, 1997Date of Patent: December 7, 1999Assignee: British Telecommunications public Limited CompanyInventors: Francis James Scahill, Alison Diane Simons, Steven John Whittaker
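The ordering issue raised in this abstract, applying the a priori values before pruning rather than after, can be illustrated with a toy beam pruner. Scores are assumed to be log-likelihoods, priors are added as log-probabilities, and pruning keeps anything within a fixed beam of the best score. These conventions and the function name are assumptions rather than details from the patent.

```python
import math


def prune(partial_scores, beam_width, priors=None):
    """Discard partial results scoring more than beam_width below the best.

    If priors are given, each score is first weighted by adding the log of
    the word's a priori probability, so likely words are not pruned early.
    """
    if priors is not None:
        scores = {
            word: score + math.log(priors[word])
            for word, score in partial_scores.items()
        }
    else:
        scores = dict(partial_scores)
    best = max(scores.values())
    return {word: s for word, s in scores.items() if s >= best - beam_width}


# Made-up partial acoustic log-scores and a priori probabilities.
acoustic = {"yes": -10.5, "no": -9.0, "maybe": -9.5}
priors = {"yes": 0.7, "no": 0.2, "maybe": 0.1}

without_priors = prune(acoustic, beam_width=1.0)
with_priors = prune(acoustic, beam_width=1.0, priors=priors)
```

Without the priors, the contextually likely word "yes" is pruned because its early acoustic score is weak; with the priors applied first, it survives and the unlikely "maybe" is discarded instead.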
-
Apparatus and method for processing natural language and apparatus and method for speech recognition
Patent number: 5991721Abstract: An apparatus and a method for processing a natural language arranged so as to improve the speech recognition rate. In an example search section, the degree of similarity between each of a plurality of examples of the actual use of the language stored in an example data base and each of a plurality of probable recognition results output from a recognition section is calculated, and one of the examples corresponding to the highest degree of similarity is selected. A final speech recognition result is obtained by using the selected example. The example search section calculates the degree of similarity by weighting the degree of similarity on the basis of a context according to at least one of the examples previously selected.Type: GrantFiled: May 29, 1996Date of Patent: November 23, 1999Assignee: Sony CorporationInventors: Yasuharu Asano, Masao Watari, Makoto Akabane, Tetsuya Kagami, Kazuo Ishii, Miyuki Tanaka, Yasuhiko Kato, Hiroshi Kakuda, Hiroaki Ogawa
-
Patent number: 5987411Abstract: Methods and systems consistent with the present invention enroll a candidate phrase uttered by a user in a dictionary having at least one previously enrolled phrase. The system receives utterances of the candidate phrase and determines whether the first utterance is confusingly similar to a previously enrolled phrase and whether they are consistent with each other. The system then enrolls the candidate phrase in the dictionary according to these determinations.Type: GrantFiled: December 17, 1997Date of Patent: November 16, 1999Assignee: Northern Telecom LimitedInventors: Marco Petroni, Hung S. Ma
-
Patent number: 5970450Abstract: A speech recognition system, in which partial reference patterns, and cumulative similarities of these patterns, are stored in a temporary pattern memory. The partial reference patterns are to be used as subjects of a similarity computation with an input speech pattern that has its feature quantities extracted by a speech analyzing unit. A counting unit counts partial reference patterns having corresponding cumulative similarities that are higher than a threshold value stored in a threshold memory. A threshold computing unit computes a threshold of pruning from a correspondence relation between the number of partial reference patterns that have corresponding cumulative similarities that exceed the threshold, and the threshold. A similarity computing unit computes a similarity, with respect to the feature quantities, of partial reference patterns with corresponding cumulative similarities that are greater than the threshold of pruning.Type: GrantFiled: November 24, 1997Date of Patent: October 19, 1999Assignee: NEC CorporationInventor: Hiroaki Hattori
-
Patent number: 5953699Abstract: A speech recognition apparatus has an analysis section that outputs features of input speech as a time sequence of feature vectors defined for discrete time points corresponding to a processed speech frame. Reference paradigm utterances are converted into a time sequence of standard (reference) feature vectors. The possible continuous variation of standard feature vectors at each point in time is expressed by a line segment, or set of line segments, connecting the feature vectors for the two end points of the "movable" range within which the feature can change, rather than using a larger set of reference vectors as in a conventional multitemplate approach to speech recognition. For example, the continuous range of possible background noise levels in input speech defines a line segment connecting the two feature vectors at the two SNR value limits.Type: GrantFiled: October 28, 1997Date of Patent: September 14, 1999Assignee: NEC CorporationInventor: Keizaburo Takagi
-
Patent number: 5878390Abstract: A speech recognition apparatus which includes a speech recognition section for performing a speech recognition process on an uttered speech with reference to a predetermined statistical language model, based on a series of speech signals of the uttered speech sentence composed of a series of input words. The speech recognition section calculates a functional value of a predetermined erroneous sentence judging function with respect to speech recognition candidates, the erroneous sentence judging function representing a degree of unsuitability for the speech recognition candidates. When the calculated functional value exceeds a predetermined threshold value, the speech recognition section performs the speech recognition process by eliminating the speech recognition candidate corresponding to the calculated functional value.Type: GrantFiled: June 23, 1997Date of Patent: March 2, 1999Assignee: ATR Interpreting Telecommunications Research LaboratoriesInventors: Jun Kawai, Yumi Wakita
-
Patent number: 5848389Abstract: In a speech recognizing apparatus, a grammatical qualification of a proposed speech recognition result candidate is judged without using a grammatical rule. The speech recognizing apparatus for performing sentence/speech recognition is comprised of an analyzing unit for acoustically analyzing speech inputted therein to extract a feature parameter of the inputted speech; a recognizing unit for recognizing the inputted speech based upon the feature parameter outputted from said analyzing unit to thereby output a plurality of proposed recognition result candidates; an example data base for storing therein a plurality of examples; and an example retrieving unit for calculating a resemblance degree between each of said plurality of proposed recognition result candidates and each of the plural examples stored in the example data base and for obtaining the speech recognition result based on said calculated resemblance degree.Type: GrantFiled: April 5, 1996Date of Patent: December 8, 1998Assignee: Sony CorporationInventors: Yasuharu Asano, Hiroaki Ogawa, Yasuhiko Kato, Tetsuya Kagami, Masao Watari, Makoto Akabane, Kazuo Ishii, Miyuki Tanaka, Hiroshi Kakuda
-
Patent number: 5848388Abstract: A recognition system includes a speech recognition processing unit for processing input speech signals to indicate similarity to predetermined patterns to be recognized. The recognition processing unit is arranged to repeatedly partition the input speech signal into a pattern-containing portion and, preceding and following the pattern-containing portions, noise or silence portions, and to identify a pattern corresponding to the pattern containing portion. An output supplies a recognition signal indicating recognition of one of the patterns. A pause detector detects the noise or silence portion which follows the pattern-containing portion. In response to its detection, a signal identifying the pattern currently corresponding to the pattern portion is supplied to the output. Also provided are similarly operating rejection portions.Type: GrantFiled: December 19, 1995Date of Patent: December 8, 1998Assignee: British Telecommunications plcInventors: Kevin Joseph Power, Stephen Howard Johnson, Francis James Scahill, Simon Patrick Ringland, John Edward Talintyre
-
Patent number: 5842161Abstract: A recognition criterion or set of recognition criteria are updated automatically, over time, in accordance with the speech input of the user(s). Each input utterance is compared to one or more models of speech to determine a similarity metric for each such comparison. A model of speech which most closely matches the utterance is determined based on the one or more similarity metrics. The similarity metric corresponding to the most closely matching model of speech is analyzed to determine whether the similarity metric satisfies the selected set of recognition criteria. The recognition criteria are automatically altered during use or "on-the-fly", so that more appropriate criteria (and associated thresholds) may be used to either increase the probability of recognition or decrease the incidence of false positive results. Illustratively, if a voice sample results in a near miss of a template, a more liberal criterion is thereafter employed to increase the probability of recognition for subsequent input.Type: GrantFiled: June 25, 1996Date of Patent: November 24, 1998Assignee: Lucent Technologies Inc.Inventors: Paul Wesley Cohrs, Mitra P. Deldar, Donald Marion Keen, Ellen Anne Keen
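The near-miss behavior in the final sentence of this abstract can be sketched as a threshold that relaxes after a close failure. The class name, the margin, the step size, and the floor are all hypothetical parameters chosen for illustration; the abstract does not say how the criterion is altered.

```python
class AdaptiveThreshold:
    """Recognition criterion that becomes more liberal after a near miss.

    Assumptions: the criterion is a single similarity threshold, a near miss
    is a score within `near_miss_margin` below it, and each near miss lowers
    the threshold by `step`, never below `floor`.
    """

    def __init__(self, threshold: float, near_miss_margin: float,
                 step: float, floor: float) -> None:
        self.threshold = threshold
        self.near_miss_margin = near_miss_margin
        self.step = step
        self.floor = floor

    def check(self, similarity: float) -> bool:
        """Return True if the best-matching model is accepted."""
        if similarity >= self.threshold:
            return True
        if similarity >= self.threshold - self.near_miss_margin:
            # Near miss: relax the criterion for subsequent input.
            self.threshold = max(self.floor, self.threshold - self.step)
        return False


criterion = AdaptiveThreshold(threshold=0.8, near_miss_margin=0.1,
                              step=0.05, floor=0.6)
first = criterion.check(0.76)   # near miss: rejected, threshold relaxed
second = criterion.check(0.76)  # same score is now accepted
```

A symmetric rule that tightens the threshold after false positives would follow the same pattern; it is omitted here to keep the sketch short.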
-
Patent number: 5828998Abstract: A discriminant or identification function is used for pattern recognition in which the highest performance can be offered when adaptation is made. Learning is carried out while a discriminant or identification function is adapted to a learning sample. For example, a standard pattern of the character "A" used as an identification function is learned such that when the character "A" slanting in the right or left direction is input, the standard pattern of the character "A" is rotated (adapted) in accordance with the slanting of the input learning sample.Type: GrantFiled: September 24, 1996Date of Patent: October 27, 1998Assignee: Sony CorporationInventor: Naoto Iwahashi
-
Patent number: 5822728Abstract: The multistage word recognizer uses a word reference representation based on reliably detected peaks of phoneme similarity values. The word reference representation captures the basic features of the words by targets that describe the location and shape of stable peaks of phoneme similarity values. The first stage of the word hypothesizer represents each reference word with statistical information on the number of high similarity regions over a predefined number of time intervals. The second stage represents each word by a prototype that consists of a series of phoneme targets and global statistics, namely the average word duration and average match rate. These represent the degree of fit of the word prototype to its training data. Word recognition scores generated in the two stages are converted to dimensionless normalized values and combined by averaging for use in selecting the most probable word candidates.Type: GrantFiled: September 8, 1995Date of Patent: October 13, 1998Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Ted H. Applebaum, Philippe R. Morin
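The final combination step, converting the two stage scores to dimensionless values and averaging them, can be sketched as below. Using a z-score for the normalization is an assumption; the abstract says only that the scores are normalized and made dimensionless.

```python
from statistics import mean, pstdev
from typing import Dict, List


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw stage scores to dimensionless z-scores (assumed method)."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {word: 0.0 for word in scores}
    return {word: (s - mu) / sigma for word, s in scores.items()}


def top_candidates(stage1: Dict[str, float], stage2: Dict[str, float],
                   n: int) -> List[str]:
    """Average the normalized scores of both stages and return the n best words."""
    n1, n2 = normalize(stage1), normalize(stage2)
    combined = {word: (n1[word] + n2[word]) / 2 for word in stage1}
    return sorted(combined, key=combined.get, reverse=True)[:n]


# Hypothetical scores on different scales: region counts vs. match rates.
stage1_scores = {"cat": 10.0, "cap": 8.0, "cut": 6.0}
stage2_scores = {"cat": 0.7, "cap": 0.9, "cut": 0.2}
best_two = top_candidates(stage1_scores, stage2_scores, n=2)
```

Normalizing first matters because the two stages score on unrelated scales; averaging the raw values would let the stage with the larger numeric range dominate.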
-
Patent number: 5819219Abstract: A digital signal processor employable for utilization for speech processing or for some other pattern recognition overcomes the weaknesses of digital signal processors given the subtraction with following amount formation that must often be implemented in these applications, an auxiliary hardware is provided that contains the feature vector that is to be compared to reference feature vectors from the dictionary in a separate memory. The calculating work is thereby implemented by a separate arithmetic unit that provides a separate difference-forming and amount-forming unit for each feature comparison. The number of clock cycles of the digital signal processor required per comparison can be dramatically reduced by the invention. A suitable addressing method thereby assures that it is always corresponding features of the individual feature vectors that can be compared to one another.Type: GrantFiled: December 11, 1996Date of Patent: October 6, 1998Assignee: Siemens AktiengesellschaftInventors: Luc De Vos, Daniel Goryn
-
Patent number: 5809461Abstract: A speech recognition apparatus using a neural network is provided. A neuron-like element stores a value of its inner conditions. The neuron-like element also updates a value of its internal status on the basis of an output from the neuron-like element itself, outputs from other neuron-like elements and an external input outside. The neuron-like element also converts a value of its internal status into an external output. Accordingly, the neuron-like element itself can retain the history of input data. This enables time series data, such as speech, to be processed without providing any special devices in the neural network.Type: GrantFiled: June 7, 1995Date of Patent: September 15, 1998Assignee: Seiko Epson CorporationInventor: Mitsuhiro Inazumi
-
Patent number: 5806024Abstract: Harmonics coefficients are estimated in primary coefficients of an orthogonal transform of a speech or a music input signal by using a pitch frequency extracted from the input signal and are quantized into a harmonics code vector. Residue coefficients are calculated by removing the harmonics coefficients from the primary coefficients and quantized into residue code vectors and gain code vectors. It is possible to search harmonics excitation pulses at the harmonics locations for harmonics quantization into the harmonics code vector. On the other hand, it is possible to estimate the harmonics coefficients or excitation pulses by using quantized LSP parameters and to calculate secondary coefficients for use in weighting the harmonics quantization and residue quantization and, if applicable, in excitation pulse search.Type: GrantFiled: December 23, 1996Date of Patent: September 8, 1998Assignee: NEC CorporationInventor: Kazunori Ozawa
-
Method and device for rating of speech quality by calculating time delays from onset of vowel sounds
Patent number: 5806028Abstract: A method and device for determining quality of speech. The speech to be evaluated is listened to by a person who reproduces the speech. The onsets of vowel sounds in the produced and reproduced speech, respectively, are determined. The difference between the onsets of the vowel sounds is registered. From the obtained time differences an average value is determined. The average value indicates the quality of the produced speech. The invention can be used for evaluation of different speech sources.Type: GrantFiled: February 14, 1996Date of Patent: September 8, 1998Assignee: Telia ABInventor: Bertil Lyberg
-
Patent number: 5799274Abstract: A speech recognition system and method having an increased recognition accuracy for a compound word composed of a first word and a second word. Standard information corresponding to each of the registered words is stored. The standard information includes predetermined feature information and time information with respect to each of the registered words. The time information represents a continuous time length for pronouncing each of the registered words at a normal speed. Feature information extracted from an input word is compared with the standard information to obtain a similarity between the feature information and the standard information corresponding to one of the registered words. A determination time is set to determine a result of recognition when the compound word is input and when a first degree of similarity is obtained from the first word at a first time and a maximum degree of similarity is obtained from one of the second word and the compound word at a second time.Type: GrantFiled: September 18, 1996Date of Patent: August 25, 1998Assignee: Ricoh Company, Ltd.Inventor: Masaru Kuroda