Similarity Patents (Class 704/239)
  • Patent number: 7356466
    Abstract: A method and apparatus for calculating an observation probability includes a first operation unit that subtracts a mean of a first plurality of parameters of an input voice signal from a second parameter of an input voice signal, and multiplies the subtraction result to obtain a first output. The first output is squared and accumulated N times in a second operation unit to obtain a second output. A third operation unit subtracts a given weighted value from the second output to obtain a third output, and a comparator stores the third output in order to extract L outputs therefrom, and stores the L extracted outputs based on an order of magnitude of the extracted L outputs.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: April 8, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Byung-Ho Min, Tae-Su Kim, Hyun-Woo Park, Ho-Rang Jang, Keun-Cheol Hong, Sung-Jae Kim
  • Patent number: 7349576
    Abstract: A method for recognition of a handwritten character comprises the steps of determining a plurality of position features defining the handwritten character, and comparing the handwritten character to reference characters stored in a database in order to find the closest matching reference character. The step of comparing comprises the steps of computing a difference between one of the plurality of position features of the handwritten character and a corresponding position feature of one of the reference characters, determining, by lookup in a predefined table, a distance measure based on the computed difference and determining a distance measure for each of the plurality of position features of the handwritten character, and computing a cost function based on the determined distance measures. A device and a computer program for implementing the method are also described.
    Type: Grant
    Filed: January 11, 2002
    Date of Patent: March 25, 2008
    Assignee: Zi Decuma AB
    Inventor: Anders Holtsberg
  • Patent number: 7310600
    Abstract: A dynamic programming technique is provided for matching two sequences of phonemes both of which may be generated from text or speech. The scoring of the dynamic programming matching technique uses phoneme confusion scores, phoneme insertion scores and phoneme deletion scores which are obtained in advance in a training session and, if appropriate, confidence data generated by a recognition system if the sequences are generated from speech.
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: December 18, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
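    Example (illustrative only, not taken from the patent): a minimal sketch of dynamic programming matching between two phoneme sequences, scored with confusion, insertion and deletion scores. The function name `align_score`, the log-score convention (0.0 for an exact match, negative otherwise) and the use of a single constant for all insertions and deletions are assumptions made for brevity; the abstract describes scores obtained in a training session.

```python
def align_score(seq_a, seq_b, confusion, insertion=-1.0, deletion=-1.0, mismatch=-2.0):
    """Return the best cumulative score for aligning two phoneme sequences.

    confusion maps (phoneme_a, phoneme_b) pairs to a score; pairs that are
    missing from the table score 0.0 when identical and `mismatch` otherwise.
    All numeric values here are hypothetical placeholders.
    """
    n, m = len(seq_a), len(seq_b)
    neg_inf = float("-inf")
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                a, b = seq_a[i - 1], seq_b[j - 1]
                pair = confusion.get((a, b), 0.0 if a == b else mismatch)
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + pair)
            if i > 0:
                # Phoneme of seq_a has no counterpart in seq_b (deletion).
                best[i][j] = max(best[i][j], best[i - 1][j] + deletion)
            if j > 0:
                # Phoneme of seq_b has no counterpart in seq_a (insertion).
                best[i][j] = max(best[i][j], best[i][j - 1] + insertion)
    return best[n][m]
```

    A higher return value indicates a closer match under these assumed scores.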
  • Patent number: 7299187
    Abstract: When a user issued voice command does not match grammars registered in advance, the voice command is identified as a sentence (step S305). This sentence is compared with the registered grammars to calculate a similarity (step S307). When the similarity is higher than a first threshold value (TH1), the voice command is executed (step S315). When the similarity is equal to or lower than the first threshold value (TH1) and higher than a second threshold value (TH2), command choices are displayed for the user and the user is permitted to select a command to be executed (step S319). When the similarity is equal to or lower than the second threshold value (TH2), the command is not executed (step S321). Furthermore, once a command has been executed it is added as a grammar, so that it can be identified when next it is used.
    Type: Grant
    Filed: February 10, 2003
    Date of Patent: November 20, 2007
    Assignee: International Business Machines Corporation
    Inventors: Yoshinori Tahara, Daisuke Tomoda, Kikuo Mitsubo, Yoshinori Atake
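    Example (illustrative only, not taken from the patent): the two-threshold decision described in the abstract can be sketched as below. The function name `decide` and the returned labels are hypothetical; only the comparisons against TH1 and TH2 come from the abstract.

```python
def decide(similarity, th1, th2):
    """Map a similarity value to an action using two thresholds (th1 > th2)."""
    if th2 >= th1:
        raise ValueError("th1 must be greater than th2")
    if similarity > th1:
        return "execute"   # step S315: run the command directly
    if similarity > th2:
        return "ask_user"  # step S319: show choices, let the user pick
    return "reject"        # step S321: do not execute
```

    A similarity exactly equal to TH1 falls into the "ask_user" band, and one exactly equal to TH2 is rejected, matching the "equal to or lower than" wording.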
  • Patent number: 7284012
    Abstract: A system selects a plurality of attributes common to objects being compared and defines at least one value space for each selected attribute. Each selected attribute can have a value space that is different from a value space of the same attribute associated with a different object. The system defines an ordering system for each value space including selecting whether the value space consists of non-ordered values, partially ordered values, or fully ordered values. An objective function is defined for each ordering system and the objective function maps a pair of values of a value space to a number value. The system normalizes each objective function and defines a mapping from the plurality of objective functions to a first general objective function.
    Type: Grant
    Filed: January 24, 2003
    Date of Patent: October 16, 2007
    Assignee: International Business Machines Corporation
    Inventors: Dikran S. Meliksetian, Nianjun Zhou
  • Publication number: 20070225979
    Abstract: A similarity degree estimation method is performed by two processes. In a first process, an inter-band correlation matrix is created from spectral data of an input voice such that the spectral data are divided into a plurality of discrete bands which are separated from each other with spaces therebetween along a frequency axis, a plurality of envelope components of the spectral data are obtained from the plurality of the discrete bands, and elements of the inter-band correlation matrix are correlation values between the respective envelope components of the input voice. In a second process, a degree of similarity is calculated between a pair of input voices to be compared with each other by using respective inter-band correlation matrices obtained for the pair of the input voices through the inter-band correlation matrix creation process.
    Type: Application
    Filed: March 20, 2007
    Publication date: September 27, 2007
    Applicants: YAMAHA CORPORATION, WASEDA UNIVERSITY
    Inventors: Mikio Tohyama, Michiko Kazama, Satoru Goto, Takehiko Kawahara, Yasuo Yoshioka
  • Patent number: 7246061
    Abstract: A method for the voice-operated identification of the user of a telecommunications line in a telecommunications network is provided in the course of a dialog with a voice-operated dialog system. Utterances spoken by a caller from a group of callers limited to one telecommunications line are used during a human-to-human and/or human-to-machine dialog to apply a reference pattern for the caller. For each reference pattern, a user identifier is stored which is activated once the caller is identified, and, together with the CLI and/or ANI identifier of the telecommunications line, is made available to a server having a voice-controlled dialog system. On the basis of the CLI, including the user identifier, data previously stored for this user are ascertained by the system and made available for the dialog interface with the customer.
    Type: Grant
    Filed: December 15, 2000
    Date of Patent: July 17, 2007
    Assignee: Deutsche Telekom AG
    Inventors: Fred Runge, Christel Mueller, Marian Trinkel, Thomas Ziem
  • Patent number: 7219059
    Abstract: A method and apparatus for generating a pronunciation score by receiving a user phrase intended to conform to a reference phrase and processing the user phrase in accordance with at least one of an articulation-scoring engine, a duration scoring engine and an intonation-scoring engine to derive thereby the pronunciation score.
    Type: Grant
    Filed: July 3, 2002
    Date of Patent: May 15, 2007
    Assignee: Lucent Technologies Inc.
    Inventors: Sunil K. Gupta, Ziyi Lu, Fengguang Zhao
  • Patent number: 7200558
    Abstract: A prosody generation apparatus capable of suppressing distortion that occurs when generating prosodic patterns and therefore generating a natural prosody is provided. A prosody changing point extraction unit in this apparatus extracts a prosody changing point located at the beginning and the ending of a sentence, the beginning and the ending of a breath group, an accent position and the like. A selection rule and a transformation rule of a prosodic pattern including the prosody changing point are generated by means of a statistical or learning technique, and the rules thus generated are stored in a representative prosodic pattern selection rule table and a transformation rule table beforehand. A pattern selection unit selects a representative prosodic pattern from the representative prosodic pattern selection rule table according to the selection rule.
    Type: Grant
    Filed: March 8, 2002
    Date of Patent: April 3, 2007
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Yumiko Kato, Takahiro Kamai
  • Patent number: 7177402
    Abstract: Embodiments of the invention are directed to automated reception systems that may receive voice information indicating action to be taken by the system. The automated reception system may receive a call and transmit a speech message to the caller identifying actions that the caller may ask the system to take. The caller may verbally select an action for the system to execute. Possible actions may depend upon the system context and information about possible actions may be provided to the caller through dynamically generated messages. The caller may also access voicemail or electronic mail messages using embodiments of the invention. Furthermore, in some embodiments, a caller may be able to control a separate communication session using voice or other commands input during a telephone session.
    Type: Grant
    Filed: March 1, 2001
    Date of Patent: February 13, 2007
    Assignee: Applied Voice & Speech Technologies, Inc.
    Inventor: Michael D. Metcalf
  • Patent number: 7177863
    Abstract: A system and associated method for tuning a data clustering program to a clustering task determine at least one internal parameter of a data clustering program. The determination of one or more of the internal parameters of the data clustering program occurs before the clustering begins. Consequently, clustering does not need to be performed iteratively, thus improving clustering program performance in terms of the required processing time and processing resources. The system provides pairs of data records; the user indicates whether or not these data records should belong to the same cluster. The similarity values of the records of the selected pairs are calculated based on the default parameters of the clustering program. From the resulting similarity values, an optimal similarity threshold is determined. When the optimization criterion does not yield a single optimal similarity threshold range, equivalent candidate ranges are selected.
    Type: Grant
    Filed: March 14, 2003
    Date of Patent: February 13, 2007
    Assignee: International Business Machines Corporation
    Inventors: Boris Charpiot, Barbara Hartel, Christoph Lingenfelder, Thilo Maier
  • Patent number: 7171360
    Abstract: A speaker identification system includes a speaker model generator 110 for generating a plurality of speaker models. To this end, the generator records training utterances from a plurality of speakers in the background, without prior knowledge of the speakers who spoke the utterances. The generator performs a blind clustering of the training utterances based on a predetermined criterion. For each of the clusters a corresponding speaker model is trained. A speaker identifier 130 identifies a speaker by determining a most likely one of the speaker models for an utterance received from the speaker. The speaker associated with the most likely speaker model is identified as the speaker of the test utterance.
    Type: Grant
    Filed: May 7, 2002
    Date of Patent: January 30, 2007
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Chao-Shih Huang, Ya-Cherng Chu, Wei-Ho Tsai, Jyh-Min Cheng
  • Patent number: 7165031
    Abstract: A speech recognition method and apparatus is disclosed which outputs a confidence score indicative of the posterior probability of an utterance being correctly matched to a word model. The confidence score for the matching of an utterance to a word model is determined directly from the generated values indicative of the goodness of match between the utterance and stored word models utilizing the following equation: $\text{confidence} = \frac{\exp(-2\sigma S(x|w))}{\sum_{\text{words}} \exp(-2\sigma S(x|w))}$ where S(x|w) is the match score for the correlation between a signal x and word w and $\sigma$ is an experimentally determined constant.
    Type: Grant
    Filed: November 6, 2002
    Date of Patent: January 16, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventor: David Llewellyn Rees
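    Example (illustrative only, not taken from the patent): a minimal sketch of turning per-word match scores into confidence values, assuming the confidence is exp(-2·sigma·S(x|w)) normalized by the sum of the same term over all candidate words. The function name `confidences`, the parameter name `sigma`, the dictionary input and the normalization sign are assumptions; subtracting the largest exponent is a standard numerical-stability step that does not change the result.

```python
import math


def confidences(match_scores, sigma):
    """Convert match scores S(x|w) per word into values that sum to 1.

    match_scores: dict mapping each candidate word to its match score,
    where a lower score is assumed to mean a better match.
    sigma: the experimentally determined constant (value is up to the user).
    """
    exponents = {w: -2.0 * sigma * s for w, s in match_scores.items()}
    top = max(exponents.values())
    # Shift by the largest exponent so math.exp never overflows.
    weights = {w: math.exp(e - top) for w, e in exponents.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}
```

    Under this assumption, the word with the lowest match score receives the highest confidence.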
  • Patent number: 7139706
    Abstract: A comprehensive system is provided for designing a voice activated user interface (VA UI) having a semantic and syntactic structure adapted to the culture and conventions of spoken language for the intended users. The system poses, to at least one respondent, a hypothetical task to be performed; asks each of the at least one respondent for a word that the respondent would use to command the hypothetical task to be performed; receives, from each of the at least one respondent, a command word; develops a list of command words from the received command word; and rejects the received command word, if the received command word is acoustically similar to another word in the list of command words. The approach is general across languages and encompasses universal variables of language and culture. Also provided are prompting grammar and error handling methods adapted to such user interfaces.
    Type: Grant
    Filed: August 12, 2002
    Date of Patent: November 21, 2006
    Assignee: Comverse, Inc.
    Inventor: Matthew John Yuschik
  • Patent number: 7054812
    Abstract: A system is provided for determining a sequence of sub-word units representative of at least two words output by a word recognition unit in response to an input word to be recognized. In a preferred embodiment, the word alternatives output by the recognition unit are converted into sequences of phonemes. An optimum alignment between these sequences is then determined using a dynamic programming alignment technique. The sequence of phonemes representative of the input sequences is then determined using this optimum alignment.
    Type: Grant
    Filed: April 25, 2001
    Date of Patent: May 30, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Jason Peter Andrew Charlesworth, Philip Neil Garner
  • Patent number: 7031917
    Abstract: The present invention relates to a speech recognition apparatus and a speech recognition method for speech recognition with improved accuracy. A distance calculator 47 determines the distance from a microphone 21 to a user uttering. Data indicating the determined distance is supplied to a speech recognition unit 41B. The speech recognition unit 41B has plural sets of acoustic models produced from speech data obtained by capturing speeches uttered at various distances. From those sets of acoustic models, the speech recognition unit 41B selects a set of acoustic models produced from speech data uttered at a distance closest to the distance determined by the distance calculator 47, and the speech recognition unit 41B performs speech recognition using the selected set of acoustic models.
    Type: Grant
    Filed: October 21, 2002
    Date of Patent: April 18, 2006
    Assignee: Sony Corporation
    Inventor: Yasuharu Asano
  • Patent number: 7006970
    Abstract: Disclosed is a method for obtaining a precise detected value of a similarity between voices or the like. Standard and input pattern matrices, each having a voice feature amount as a component, are prepared (S1 and S2). A reference shape having a variance different for each specified component of the pattern matrices is prepared, and positive and negative reference pattern vectors, each having a value of the reference shape as a component, are prepared. Then, while the specified component (a center of the reference shape) is moved to each component position j1=1 to m1, j2=1 to m2 of the standard pattern matrix, a shape change between the standard and input pattern matrices is substituted for shape changes of the positive and negative reference pattern vectors. An amount of change in kurtosis of each reference pattern vector is numerically evaluated to obtain a shape change amount Dj1j2 (S3). Then, a value of a geometric distance between the pattern matrices is calculated from Dj1j2 (S4).
    Type: Grant
    Filed: September 11, 2001
    Date of Patent: February 28, 2006
    Assignee: Entropy Software Laboratory, Inc.
    Inventors: Michihiro Jinnai, Hiroshi Yamaguchi
  • Patent number: 7003456
    Abstract: A computer-based method of routing a message to a system includes receiving a message, and processing the message using large-vocabulary continuous speech recognition to generate a string of text corresponding to the message. The method includes generating a confidence estimate of the string of text corresponding to the message and comparing the confidence estimate to a predetermined threshold. If the confidence estimate satisfies the predetermined threshold, the string of text is forwarded to the system. If the confidence estimate does not satisfy the predetermined threshold, the information relating to the message is forwarded to a transcriptionist. The message may include one or more utterances. Each utterance in the message may be separately or jointly processed. In this way, a confidence estimate may be generated and evaluated for each utterance or for the whole message. Information relating to each utterance may be separately or jointly forwarded based on the results of the generation and evaluation.
    Type: Grant
    Filed: June 12, 2001
    Date of Patent: February 21, 2006
    Inventors: Laurence S. Gillick, Robert Roth, Linda Manganaro, Barbara R. Peskin, David C. Petty, Ashwin Rao
  • Patent number: 6996527
    Abstract: A common requirement in automatic speech recognition is to recognize a set of words for any speaker without training the system for each new speaker. A speech recognition system is provided utilizing linear discriminant based phonetic similarities with inter-phonetic unit value normalization. Linear discriminant analysis is utilized using training data with both in-class and out-class sample training utterances for generating linear discriminant vectors for each of the phonetic units. The dot product of each linear discriminant vector and the time spectral pattern vectors generated from the input speech are computed. The resultant raw similarity vectors are then normalized utilizing normalization look-up tables for providing similarity vectors which are utilized by a word matcher for word recognition.
    Type: Grant
    Filed: July 26, 2001
    Date of Patent: February 7, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Robert C. Boman, Philippe R. Morin, Ted H. Applebaum
  • Patent number: 6961701
    Abstract: An extended-word selecting section calculates a score for a phoneme string formed of one or more phonemes, corresponding to a user's speech, and searches a large-vocabulary dictionary for a word having one or more phonemes equal to or similar to those of a phoneme string having a score equal to or higher than a predetermined value. A matching section calculates scores for the word searched for by the extended-word selecting section in addition to a word preliminarily selected by a preliminary word-selecting section. A control section determines a word string as the result of recognition of the speech uttered by the user.
    Type: Grant
    Filed: March 3, 2001
    Date of Patent: November 1, 2005
    Assignee: Sony Corporation
    Inventors: Hiroaki Ogawa, Katsuki Minamino, Yasuharu Asano, Helmut Lucke
  • Patent number: 6947890
    Abstract: A method and system are provided for speech recognition. The speech recognition method includes the steps of preparing training data representing acoustic parameters of each of phonemes at each time frame; receiving an input signal representing a sound to be recognized and converting the input signal to input data; comparing the input data at each frame with the training data of each of the phonemes to derive a similarity measure of the input data with respect to each of the phonemes; and processing the similarity measures obtained in the comparing step using a neural net model governing development of activities of plural cells to conduct speech recognition of the input signal.
    Type: Grant
    Filed: May 30, 2000
    Date of Patent: September 20, 2005
    Inventors: Tetsuro Kitazoe, Sung-Ill Kim, Tomoyuki Ichiki
  • Patent number: 6937982
    Abstract: A speech recognition apparatus recognizes a speech signal received from a speaker and provides the result of recognition for an external device. In the apparatus, a pattern matching section performs pattern matching between each of reference patterns in a vocabulary and characteristic parameters extracted from the speech signal. The vocabulary includes reference patterns corresponding to words. Further the apparatus has a similar sound group which includes reference patterns corresponding to the sound similar to that of a specific word. The specific word is a word in response to which the external device performs an operation which cannot be easily undone. The speech signal is rerecognized by using the similar sound group. As a result, the pattern matching section outputs a word other than the specific word, if one of the reference patterns in the similar sound group has a high similarity with the characteristic parameters.
    Type: Grant
    Filed: July 19, 2001
    Date of Patent: August 30, 2005
    Assignee: Denso Corporation
    Inventors: Norihide Kitaoka, Hiroshi Ohno
  • Patent number: 6910012
    Abstract: A method for performing speech recognition can include the steps of providing a grammar including entries comprising a parent word and a pseudo word being substantially phonetically equivalent to the parent word. The grammar can provide a translation from the pseudo word to the parent word. The parent word can be received as speech and the speech can be compared to the grammar entries. Additionally, the speech can be matched to the pseudo word and the pseudo word can be translated to the parent word.
    Type: Grant
    Filed: May 16, 2001
    Date of Patent: June 21, 2005
    Assignee: International Business Machines Corporation
    Inventors: Matthew W. Hartley, David E. Reich
  • Patent number: 6907367
    Abstract: A method for segmenting a signal into segments having similar spectral characteristics is provided. Initially the method generates a table of previous values from older signal values that contains a scoring value for the best segmentation of previous values and a segment length of the last previously identified segment. The method then receives a new sample of the signal and computes a new spectral characteristic function for the signal based on the received sample. A new scoring function is computed from the spectral characteristic function. Segments of the signal are recursively identified based on the newly computed scoring function and the table of previous values. The spectral characteristic function can be a selected one of an autocorrelation function and a discrete Fourier transform. An example is provided for segmenting a speech signal.
    Type: Grant
    Filed: August 31, 2001
    Date of Patent: June 14, 2005
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventor: Paul M. Baggenstoss
  • Patent number: 6882970
    Abstract: A system is provided for comparing an input query with a number of stored annotations to identify information to be retrieved from a database. The comparison technique divides the input query into a number of fixed-size fragments and identifies how many times each of the fragments occurs within each annotation using a dynamic programming matching technique. The frequencies of occurrence of the fragments in both the query and the annotation are then compared to provide a measure of the similarity between the query and the annotation. The information to be retrieved is then determined from the similarity measures obtained for all the annotations.
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: April 19, 2005
    Assignee: Canon Kabushiki Kaisha
    Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
  • Patent number: 6839667
    Abstract: A method for performing speech recognition can include receiving user speech and determining a plurality of potential candidates. Each of the candidates can provide a textual interpretation of the speech. Confidence scores can be calculated for the candidates. The confidence scores can be compared to a predetermined threshold. Also, selected ones of the plurality of candidates can be presented to the user as alternative interpretations of the speech when none of the confidence scores is greater than the predetermined threshold. The selected ones of the plurality of candidates can have confidence scores above a predetermined minimum threshold, and thus can have confidence scores within a predetermined range.
    Type: Grant
    Filed: May 16, 2001
    Date of Patent: January 4, 2005
    Assignee: International Business Machines Corporation
    Inventor: David E. Reich
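    Example (illustrative only, not taken from the patent): a minimal sketch of presenting alternative interpretations when no candidate exceeds the confidence threshold. The function name `select`, the tuple layout and the "accept" branch are assumptions; the abstract only states what happens when none of the scores is greater than the threshold.

```python
def select(candidates, threshold, minimum):
    """Pick an interpretation or a list of alternatives.

    candidates: non-empty list of (text, confidence) tuples.
    threshold:  confidence a candidate must exceed to be accepted outright.
    minimum:    confidence a candidate must exceed to be offered as a choice.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_text, best_conf = ranked[0]
    if best_conf > threshold:
        # Assumed behavior: accept the top candidate directly.
        return ("accept", [best_text])
    # No candidate is confident enough: offer those within the range
    # (minimum, threshold] as alternatives, best first.
    return ("choose", [text for text, conf in ranked if conf > minimum])
```

    For example, with a threshold of 0.8 and a minimum of 0.3, candidates scoring 0.6 and 0.4 would both be offered to the user, while one scoring 0.1 would be dropped.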
  • Publication number: 20040249637
    Abstract: A method of speech recognition obtains acoustic data from a plurality of conversations. A plurality of pairs of utterances are selected from the plurality of conversations. At least one portion of the first utterance of the pair of utterances is dynamically aligned with at least one portion of the second utterance of the pair of utterances, and an acoustic similarity is computed. At least one pair that includes a first portion from a first utterance and a second portion from a second utterance is chosen, based on a criterion of acoustic similarity. A common pattern template is created from the first portion and the second portion.
    Type: Application
    Filed: June 2, 2004
    Publication date: December 9, 2004
    Applicant: Aurilab, LLC
    Inventor: James K. Baker
  • Patent number: 6823308
    Abstract: A speech recognition method for use in a multimodal input system comprises receiving a multimodal input comprising digitized speech as a first modality input and data in at least one further modality input. Features in the speech and in the data in at least one further modality are identified. The identified features in the speech and in the data are used in the recognition of words by comparing the identified features with states in models for the words. The models have states for the recognition of speech and for words having features in at least one further modality associated with the words, the models also have states for the recognition of events in the further modality or each further modality.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: November 23, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventors: Robert Alexander Keiller, Nicolas David Fortescue
  • Publication number: 20040215454
    Abstract: A speech recognizer 300 built into a navigation apparatus 100 includes a noise estimator 320 which calculates a noise model based on a microphone input signal, and an adaptive processor 330 which performs an adaptive process on each keyword model and each non-keyword model stored in an HMM database 310 based on the noise model. The adaptive processor 330 performs a data adaptation process on each keyword model and non-keyword model based on the noise model, and word spotting is performed based on the keyword models and the non-keyword models subjected to the data adaptation process.
    Type: Application
    Filed: April 22, 2004
    Publication date: October 28, 2004
    Inventors: Hajime Kobayashi, Kengo Hanai
  • Publication number: 20040204939
    Abstract: A speaker change detection system performs speaker change detection on an input audio stream. The speaker change detection system includes a segmentation component [401], a phone classification decode component [402], and a speaker change detection component [403]. The segmentation component [401] segments the audio stream into segments [501-504] of predetermined length intervals. The segments may overlap one another. The phone classification decode component decodes the intervals to produce a set of phone classes corresponding to each of the intervals. The speaker change detection component detects locations of speaker changes in the audio stream based on a similarity value calculated at phone class boundaries.
    Type: Application
    Filed: October 16, 2003
    Publication date: October 14, 2004
    Inventors: Daben Liu, Francis G. Kubala
  • Patent number: 6778957
    Abstract: Disclosed is a method of automated handset identification, comprising receiving a sample speech input signal from a sample handset; deriving a cepstral covariance sample matrix from said first sample speech signal; calculating, with a distance metric, all distances between said sample matrix and one or more cepstral covariance handset matrices, wherein each said handset matrix is derived from a plurality of speech signals taken from different speakers through the same handset; and determining if the smallest of said distances is below a predetermined threshold value.
    Type: Grant
    Filed: August 21, 2001
    Date of Patent: August 17, 2004
    Assignee: International Business Machines Corporation
    Inventors: Zhong-Hua Wang, David Lubensky, Cheng Wu
  • Patent number: 6766294
    Abstract: A performance gauge for use in conjunction with a transcription system including a speech processor linked to at least one speech recognition engine and at least one transcriptionist. The speech processor includes an input for receiving speech files and storage means for storing the received speech files until such a time that they are forwarded to a selected speech recognition engine or transcriptionist for processing. The system includes a transcriptionist text file database in which manually transcribed transcriptionist text files are stored, each stored transcriptionist text file including time stamped data indicative of position within an original speech file. The system further includes a recognition engine text file database in which recognition engine text files transcribed via the at least one speech recognition engine are stored, each stored recognition engine text file including time stamped data indicative of position within an original speech file.
    Type: Grant
    Filed: November 30, 2001
    Date of Patent: July 20, 2004
    Assignee: Dictaphone Corporation
    Inventors: Andrew MacGinite, James Cyr, Martin Hold, Channell Greene, Regina Kuhnen
  • Patent number: 6725196
    Abstract: A method and apparatus is provided for matching a first sequence of patterns representative of a first signal with a second sequence of patterns representative of a second signal. The system uses a plurality of different pruning thresholds (th) to control the propagation of paths which represent possible matchings between a sequence of second signal patterns and a sequence of first signal patterns ending at the current first signal pattern. In particular, the pruning threshold used for a given path during the processing of a current first signal pattern depends upon the position, within the sequence of patterns representing the second signal, of the second signal pattern which is at the end of the given path.
    Type: Grant
    Filed: March 20, 2001
    Date of Patent: April 20, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventors: Robert Alexander Keiller, Eli Tzirkel-Hancock, Julian Richard Seward
  • Patent number: 6701292
    Abstract: A speech-recognizing apparatus for recognizing input speech comprises an analysis unit for computing a characteristic vector for each of frames of the input speech, a correction-value storage unit for storing a correction distance in advance, a vector-to-vector-distance-computing unit for computing a vector-to-vector distance between the characteristic vector and the phoneme characteristic vector, an average-value-computing unit for computing an average value of vector-to-vector distances for one of the frames, a correction unit for computing a corrected vector-to-vector distance as a value of an expression of (the vector-to-vector distance - the average value + the correction distance), and a recognition unit for cumulating corrected vector-to-vector distances into a cumulative vector-to-vector distance and comparing the cumulative vector-to-vector distance with the word standard pattern in order to recognize the input speech.
    Type: Grant
    Filed: October 30, 2000
    Date of Patent: March 2, 2004
    Assignee: Fujitsu Limited
    Inventors: Chiharu Kawai, Hiroshi Katayama, Takehiro Nakai
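    The correction expression in the abstract above (distance - average + correction distance) can be illustrated for a single frame with a minimal sketch. The function name `corrected_distances`, the use of Euclidean distance, and the sample values are assumptions made for illustration; the abstract does not specify the distance measure.

```python
import math


def corrected_distances(frame, phoneme_vectors, correction):
    """Return one corrected distance per phoneme vector for a single frame.

    Assumption: Euclidean distance stands in for the unspecified
    vector-to-vector distance.
    """
    # Distance from the frame's characteristic vector to each phoneme vector.
    distances = [math.dist(frame, p) for p in phoneme_vectors]
    # Average of the distances computed for this one frame.
    average = sum(distances) / len(distances)
    # corrected = distance - average + correction distance
    return [d - average + correction for d in distances]


if __name__ == "__main__":
    print(corrected_distances([0.0, 0.0], [[3.0, 4.0], [0.0, 0.0]], 1.0))
```

    Subtracting the per-frame average removes whatever offset is common to all phonemes in that frame, so only the relative differences are accumulated.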
  • Publication number: 20040019483
    Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.
    Type: Application
    Filed: October 9, 2002
    Publication date: January 29, 2004
    Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J.R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
  • Publication number: 20030200086
    Abstract: A speech recognition apparatus comprises a speech analyzer which extracts feature patterns of spontaneous speech divided into frames; a keyword model database which prestores keyword models that represent feature patterns of a plurality of keywords to be recognized; a garbage model database which prestores feature patterns of components of extraneous speech to be identified; a first likelihood calculator which calculates likelihood of feature values based on the feature value patterns of each frame and the keywords; and a second likelihood calculator which calculates likelihood of feature values based on the feature value patterns of each frame and the extraneous speech. The device recognizes keywords contained in the spontaneous speech by calculating cumulative likelihood based on the calculated likelihood, adding a predetermined correction value in the second likelihood calculator.
    Type: Application
    Filed: April 15, 2003
    Publication date: October 23, 2003
    Applicant: PIONEER CORPORATION
    Inventors: Yoshihiro Kawazoe, Hajime Kobayashi
  • Patent number: 6631349
    Abstract: Frames making up an input speech are each collated with a string of phonemes representing speech candidates to be recognized, whereby evaluation values regarding the phonemes are computed. The frames are each compared with part of the phoneme string so as to reduce computations and memory capacity required in recognizing the input speech based on the evaluation values. That is, each frame is compared with a portion of the phoneme string to acquire an evaluation value for each phoneme. If the acquired evaluation value meets a predetermined condition, part of the phonemes to be collated with the next frame are changed. Illustratively, if the evaluation value for the phoneme heading a given portion of collated phonemes is smaller than the evaluation value of the phoneme which terminates that phoneme portion, then the head phoneme is replaced by the next phoneme. The new portion of phonemes obtained by the replacement is used for collation with the next frame.
    Type: Grant
    Filed: May 9, 2000
    Date of Patent: October 7, 2003
    Assignee: Hitachi, Ltd.
    Inventors: Kazuyoshi Ishiwatari, Kazuo Kondo, Shinji Wakisaka
  • Patent number: 6625600
    Abstract: The invention concerns a method and apparatus for processing a user's communication. The invention may include receiving a list of recognized symbol strings of one or more recognized entries. The list of recognized symbol strings may include a first similarity score associated with each recognized entry. From each recognized symbol string one or more contiguous sequences of N-symbols may be extracted. One of the extracted contiguous sequences of N-symbols may be matched with at least one stored contiguous sequence of N-symbols from a first database. A preliminary set of symbol strings and associated second similarity scores may be generated. The preliminary set of symbol strings may include one or more stored symbol strings from a second database that correspond to the at least one matched contiguous sequence of N-symbols. A third similarity score associated with the one or more stored symbol strings included in the preliminary set of symbol strings may be computed.
    Type: Grant
    Filed: May 1, 2001
    Date of Patent: September 23, 2003
    Assignee: Telelogue, Inc.
    Inventors: Yevgenly Lyudovyk, Esther Levin
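    The N-symbol matching step described in the abstract above can be sketched as follows. This is an illustrative sketch only: the names `ngrams`, `build_index`, and `candidates`, the choice of N = 3, and the sample strings are hypothetical, and the three similarity scores mentioned in the abstract are not modeled.

```python
from collections import defaultdict


def ngrams(text, n):
    """Return the set of contiguous n-symbol sequences in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(stored_strings, n):
    """Map each n-gram to the stored strings that contain it."""
    index = defaultdict(set)
    for s in stored_strings:
        for gram in ngrams(s, n):
            index[gram].add(s)
    return index


def candidates(recognized_strings, index, n):
    """Collect stored strings sharing at least one n-gram with any recognized string."""
    found = set()
    for text in recognized_strings:
        for gram in ngrams(text, n):
            found.update(index.get(gram, ()))
    return found


if __name__ == "__main__":
    idx = build_index(["boston", "austin"], 3)
    print(sorted(candidates(["bostin"], idx, 3)))
```

    A real system would go on to score the preliminary candidates; here the result is only the unscored candidate set.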
  • Patent number: 6618697
    Abstract: A computer implemented method which does not require a stored dictionary for correcting spelling errors in a sequence of words comprises storing a plurality of spelling rules defined as regular expressions for matching a potentially illegal n-gram which may comprise less than all letters in the word and for replacing an illegal n-gram with a legal n-gram to return a corrected word, submitting a word from said sequence of words to the spelling rules and replacing a word in the string of words with a corrected word.
    Type: Grant
    Filed: May 14, 1999
    Date of Patent: September 9, 2003
    Assignee: Justsystem Corporation
    Inventors: Mark Kantrowitz, Shumeet Baluja
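    A dictionary-free corrector of the kind described in the abstract above might look roughly like the sketch below. The two regular-expression rules are invented purely for illustration and are not taken from the patent; `correct_word` and `correct_text` are hypothetical names.

```python
import re

# Hypothetical spelling rules: (pattern matching an illegal n-gram, replacement).
RULES = [
    (re.compile(r"q(?!u)"), "qu"),            # treat "q" not followed by "u" as illegal
    (re.compile(r"([a-z])\1\1+"), r"\1\1"),   # collapse three or more repeated letters to two
]


def correct_word(word):
    """Apply every rule in order and return the (possibly) corrected word."""
    for pattern, replacement in RULES:
        word = pattern.sub(replacement, word)
    return word


def correct_text(text):
    """Correct each whitespace-separated word in a sequence of words."""
    return " ".join(correct_word(w) for w in text.split())


if __name__ == "__main__":
    print(correct_text("a qick helllo"))
```

    Because each rule inspects only a short letter sequence, no word list needs to be stored; the trade-off is that overly broad rules can alter words that were already correct.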
  • Patent number: 6609094
    Abstract: Improvements in speech recognition systems are achieved by considering projections of the high dimensional data on lower dimensional subspaces, subsequently by estimating the univariate probability densities via known univariate techniques, and then by reconstructing the density in the original higher dimensional space from the collection of univariate densities so obtained. The reconstructed density is by no means unique unless further restrictions on the estimated density are imposed. The variety of choices of candidate univariate densities as well as the choices of subspaces on which to project the data including their number further add to this non-uniqueness. Probability density functions are then considered that maximize a certain optimality criterion as a solution to this problem. Specifically, those probability density functions that either maximize the entropy functional, or alternatively, the likelihood associated with the data are considered.
    Type: Grant
    Filed: May 22, 2000
    Date of Patent: August 19, 2003
    Assignee: International Business Machines Corporation
    Inventors: Sankar Basu, Charles A. Micchelli, Peder Olsen
  • Publication number: 20030154077
    Abstract: When a user-issued voice command does not match grammars registered in advance, the voice command is identified as a sentence (step S305). This sentence is compared with the registered grammars to calculate a similarity (step S307). When the similarity is higher than a first threshold value (TH1), the voice command is executed (step S315). When the similarity is equal to or lower than the first threshold value (TH1) and higher than a second threshold value (TH2), command choices are displayed for the user and the user is permitted to select a command to be executed (step S319). When the similarity is equal to or lower than the second threshold value (TH2), the command is not executed (step S321). Furthermore, once a command has been executed it is added as a grammar, so that it can be identified when it is next used.
    Type: Application
    Filed: February 10, 2003
    Publication date: August 14, 2003
    Applicant: International Business Machines Corporation
    Inventors: Yoshinori Tahara, Daisuke Tomoda, Kikuo Mitsubo, Yoshinori Atake
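    The two-threshold decision in the abstract above (steps S315, S319, S321) reduces to a small branch, sketched here. The function name `decide`, the returned labels, and the numeric thresholds are illustrative assumptions; how the similarity is computed is left out.

```python
def decide(similarity, th1, th2):
    """Choose an action from a similarity score, assuming th1 > th2."""
    if similarity > th1:
        return "execute"        # step S315: run the command directly
    if similarity > th2:
        return "offer_choices"  # step S319: let the user pick a command
    return "reject"             # step S321: do not execute


if __name__ == "__main__":
    for score in (0.9, 0.6, 0.3):
        print(score, decide(score, th1=0.8, th2=0.5))
```

    Note that a score exactly equal to TH1 falls into the middle band and a score exactly equal to TH2 is rejected, matching the "equal to or lower than" wording of the abstract.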
  • Patent number: 6594392
    Abstract: The present invention is a method and apparatus to determine a similarity measure between first and second patterns. First and second storages store first and second feature vectors which represent the first and second patterns, respectively. A similarity estimator is coupled to the first and second storages to compute a similarity probability of the first and second feature vectors using a piecewise linear probability density function (PDF). The similarity probability corresponds to the similarity measure.
    Type: Grant
    Filed: May 17, 1999
    Date of Patent: July 15, 2003
    Assignee: Intel Corporation
    Inventor: Umberto Santoni
  • Patent number: 6577996
    Abstract: A method and apparatus for objectively evaluating sound quality of a signal processor or transmission channel. The present invention analyzes the distortion in a series of test sound frames compared to a series of sample sound frames. The invention detects sequences of test sound frames having distortion levels that are greater than a temporal distortion threshold and calculates an average length and a maximum length of these sequences. The present invention also detects individual test sound frames having distortion levels that are greater than an outlier distortion threshold and calculates a percentage of these frames present in the series of test sound frames. Further, the present invention calculates the average distortion level in the series of test sound frames and a variance of the distortion level in the test sound frames.
    Type: Grant
    Filed: December 8, 1998
    Date of Patent: June 10, 2003
    Assignee: Cisco Technology, Inc.
    Inventor: Ramanathan T. Jagadeesan
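    The statistics listed in the abstract above can be computed from a series of per-frame distortion levels, as in this sketch. It assumes the distortion levels are already available as numbers; the name `distortion_summary`, the dictionary keys, and the use of population variance are assumptions not stated in the abstract.

```python
from itertools import groupby
from statistics import fmean, pvariance


def distortion_summary(levels, temporal_threshold, outlier_threshold):
    """Summarize per-frame distortion levels of a series of test sound frames."""
    # Lengths of consecutive runs of frames whose distortion exceeds the temporal threshold.
    runs = [
        sum(1 for _ in group)
        for above, group in groupby(levels, key=lambda d: d > temporal_threshold)
        if above
    ]
    # Count of individual frames whose distortion exceeds the outlier threshold.
    outliers = sum(1 for d in levels if d > outlier_threshold)
    return {
        "avg_run_length": fmean(runs) if runs else 0.0,
        "max_run_length": max(runs, default=0),
        "outlier_percent": 100.0 * outliers / len(levels),
        "mean_distortion": fmean(levels),
        "distortion_variance": pvariance(levels),  # assumption: population variance
    }


if __name__ == "__main__":
    print(distortion_summary([0.1, 0.6, 0.7, 0.2, 0.9, 0.1], 0.5, 0.8))
```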
  • Patent number: 6560575
    Abstract: An apparatus is provided for checking the consistency between two training words which can be used in, for example, a speech recognition or verification system. Two training examples are aligned using a dynamic programming alignment process and an average frame score is calculated from the alignment results together with the worst score in a number of consecutive frames. These values are then compared with similar values obtained from training examples which are known to be consistent to determine if the training examples are consistent.
    Type: Grant
    Filed: September 30, 1999
    Date of Patent: May 6, 2003
    Assignee: Canon Kabushiki Kaisha
    Inventor: Robert Alexander Keiller
  • Publication number: 20030065510
    Abstract: A similarity evaluation program capable of determining similarity between probability models at a high speed (with little calculation) is disclosed. The similarity evaluation program is implemented on an apparatus such as a computer for evaluating similarity between a pair of probability model information each including a plurality of probability information constituted by a plurality of types of data, and this apparatus is provided with a dynamic programming operation unit for performing arithmetic processing based on dynamic programming techniques using a similarity value indicating similarity between probability information included in one of the pair of probability model information and probability information included in the other of the pair of probability model information as an indicator for selecting a path.
    Type: Application
    Filed: March 28, 2002
    Publication date: April 3, 2003
    Applicant: Fujitsu Limited
    Inventor: Makihiko Sato
  • Patent number: 6539351
    Abstract: A method is provided for generating a high dimensional density model within an acoustic model for one of a speech and a speaker recognition system. Acoustic data obtained from at least one speaker is transformed into high dimensional feature vectors. The density model is formed to model the feature vectors by a mixture of compound Gaussians with a linear transform, wherein each compound Gaussian is associated with a compound Gaussian prior and models each coordinate of each component of the density model independently by a univariate Gaussian mixture comprising a univariate Gaussian prior, variance, and mean. An iterative expectation maximization (EM) method is applied to the feature vectors. The EM method includes the step of computing an auxiliary function Q of the EM method.
    Type: Grant
    Filed: May 5, 2000
    Date of Patent: March 25, 2003
    Assignee: International Business Machines Corporation
    Inventors: Scott Shaobing Chen, Ramesh Ambat Gopinath
  • Patent number: 6535850
    Abstract: In a speech training and recognition system, the current invention detects and warns the user about similar-sounding entries to the vocabulary and permits entry of such confusingly similar terms, which are marked along with the stored similar terms to identify the similar words. In addition, the states in similar words are weighted to apply more emphasis to the differences between similar words than the similarities of such words. Another aspect of the current invention is to use a modified scoring algorithm to improve the recognition performance in the case where confusing entries were made to the vocabulary despite the warning. Yet another aspect of the current invention is to detect and warn the user about potential problems with new entries such as short words and two or more word entries with long silence periods in between words. Finally, the current invention also includes alerting the user about the dissimilarity of the multiple tokens of the same vocabulary item in the case of multiple-token training.
    Type: Grant
    Filed: March 9, 2000
    Date of Patent: March 18, 2003
    Assignee: Conexant Systems, Inc.
    Inventor: Aruna Bayya
  • Patent number: 6507815
    Abstract: A group of words to be registered in a word dictionary is sorted in order of sound models to produce a word list. A tree-structure word dictionary, in which sound models at the head part of the words are shared among the words, is prepared using this word list. Each node having a different set of reachable words from a parent node holds word information including the minimum of the word IDs of words reachable from that node, and the number of words reachable from that node. For searching for a word matching with speech input, language likelihoods are looked ahead using this word information. The word matching with the speech input can be recognized efficiently, using such a tree-structure word dictionary and a look-ahead method of language likelihood.
    Type: Grant
    Filed: March 29, 2000
    Date of Patent: January 14, 2003
    Assignee: Canon Kabushiki Kaisha
    Inventor: Hiroki Yamamoto
  • Patent number: 6496800
    Abstract: A speaker verification system using the voice of a user uttering a continuous, random length digit string is provided. The speaker verification system includes a random digit generator for generating a continuous, random length digit string; a user interface for providing the continuous, random length digit string; a feature extractor for extracting voice features from the user's voice uttering the continuous, random length digit string; a digit voice verification unit for comparing the voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and for determining whether the derived digit string is identical to the digit string provided to the user via the user interface; and a speaker verification unit for comparing the voice features with a speaker-dependent model of the user to measure the similarity between them.
    Type: Grant
    Filed: May 1, 2000
    Date of Patent: December 17, 2002
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Byung-goo Kong, Sang-ryong Kim
  • Patent number: 6490559
    Abstract: The distance computation represents a central, constantly recurrent task in sample and speech recognition. It is used in speech recognition as a degree of similarity between a part of a speech utterance and a speech reference. In picture processing and sample recognition, it is used for data compression. The distance computation requires the longest computation time so that a reduction of the computation time results in a considerable efficiency improvement. A reduction of the computation time is achieved by the integration of the distance computation in a memory module in which particularly the reference data are stored. Due to this integration, the other components of the overall system are relieved of this constantly recurrent task and are available for more complex processes in this period of time. This integration makes the distance computation essentially shorter because the communication between memory sections and computation unit takes place directly without utilizing a bus system.
    Type: Grant
    Filed: October 13, 1998
    Date of Patent: December 3, 2002
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Wolfgang O. Budde, Volker Steinbiss