Similarity Patents (Class 704/239)

Method and apparatus for performing observation probability calculations

Patent number: 7356466

Abstract: A method and apparatus for calculating an observation probability includes a first operation unit that subtracts a mean of a first plurality of parameters of an input voice signal from a second parameter of an input voice signal, and multiplies the subtraction result to obtain a first output. The first output is squared and accumulated N times in a second operation unit to obtain a second output. A third operation unit subtracts a given weighted value from the second output to obtain a third output, and a comparator stores the third output in order to extract L outputs therefrom, and stores the L extracted outputs based on an order of magnitude of the extracted L outputs.

Type: Grant

Filed: June 20, 2003

Date of Patent: April 8, 2008

Assignee: Samsung Electronics Co., Ltd.

Inventors: Byung-Ho Min, Tae-Su Kim, Hyun-Woo Park, Ho-Rang Jang, Keun-Cheol Hong, Sung-Jae Kim
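
The data path described in the abstract of patent 7356466 can be sketched in software as follows. This is only an illustration under assumptions: the abstract does not say what the subtraction result is multiplied by, so a per-dimension scale factor (`scales`) is assumed, and it does not say whether the comparator keeps the largest or the smallest values, so the smallest are kept here. The names `observation_score` and `top_l` are hypothetical.

```python
def observation_score(features, means, scales, weight):
    """Score one model component for one feature vector (hypothetical layout)."""
    # First unit: subtract the mean and multiply by an assumed per-dimension scale.
    # Second unit: square each product and accumulate over the N dimensions.
    accumulated = sum(((x - m) * s) ** 2
                      for x, m, s in zip(features, means, scales))
    # Third unit: subtract the given weighted value from the accumulated sum.
    return accumulated - weight


def top_l(scores, l):
    """Comparator stage: keep L scores, stored in ascending order (assumption)."""
    return sorted(scores)[:l]


if __name__ == "__main__":
    scores = [
        observation_score([1.0, 2.0], [0.0, 0.0], [1.0, 0.5], 0.5),
        observation_score([1.0, 2.0], [1.0, 2.0], [1.0, 1.0], 0.0),
        observation_score([1.0, 2.0], [4.0, 6.0], [1.0, 1.0], 1.0),
    ]
    print(top_l(scores, 2))
```
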
Method, device and computer program for recognition of a handwritten character

Patent number: 7349576

Abstract: A method for recognition of a handwritten character comprises the steps of determining a plurality of position features defining the handwritten character, and comparing the handwritten character to reference characters stored in a database in order to find the closest matching reference character. The step of comparing comprises the steps of computing a difference between one of the plurality of position features of the handwritten character and a corresponding position feature of one of the reference characters, determining, by lookup in a predefined table, a distance measure based on the computed difference and determining a distance measure for each of the plurality of position features of the handwritten character, and computing a cost function based on the determined distance measures. A device and a computer program for implementing the method are also described.

Type: Grant

Filed: January 11, 2002

Date of Patent: March 25, 2008

Assignee: Zi Decuma AB

Inventor: Anders Holtsberg
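
The table-lookup comparison in the abstract of patent 7349576 might look roughly like the sketch below. The table contents (squared differences), the clamp value `MAX_DIFF`, the summed cost function, and all function names are assumptions made for illustration; the abstract only states that a distance measure is looked up from a predefined table and that a cost function is computed from those measures.

```python
# Hypothetical predefined table: distance measure indexed by absolute difference.
MAX_DIFF = 255
DISTANCE_TABLE = [d * d for d in range(MAX_DIFF + 1)]


def cost(char_features, ref_features):
    """Sum table-lookup distances over corresponding position features."""
    total = 0
    for a, b in zip(char_features, ref_features):
        diff = min(abs(a - b), MAX_DIFF)   # clamp so the index stays in the table
        total += DISTANCE_TABLE[diff]      # lookup replaces on-the-fly arithmetic
    return total


def closest_reference(char_features, references):
    """Return the name of the reference character with the lowest cost."""
    return min(references, key=lambda name: cost(char_features, references[name]))


if __name__ == "__main__":
    refs = {"a": [0, 0, 0], "b": [10, 10, 10]}
    print(closest_reference([1, 1, 2], refs))
```
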
Language recognition using a similarity measure

Patent number: 7310600

Abstract: A dynamic programming technique is provided for matching two sequences of phonemes both of which may be generated from text or speech. The scoring of the dynamic programming matching technique uses phoneme confusion scores, phoneme insertion scores and phoneme deletion scores which are obtained in advance in a training session and, if appropriate, confidence data generated by a recognition system if the sequences are generated from speech.

Type: Grant

Filed: October 25, 2000

Date of Patent: December 18, 2007

Assignee: Canon Kabushiki Kaisha

Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
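
A dynamic programming match of the kind described in the abstract of patent 7310600 can be sketched as below. The score tables are passed in as plain dictionaries of log-domain scores; their layout, the function name `align_score`, and the omission of the recognizer confidence data are all simplifying assumptions, not details taken from the patent.

```python
def align_score(seq_a, seq_b, confusion, insertion, deletion):
    """Best total score for aligning two phoneme sequences.

    confusion[(p, q)]: score for matching phoneme p with phoneme q
    insertion[q]:      score for q appearing only in seq_b
    deletion[p]:       score for p appearing only in seq_a
    """
    n, m = len(seq_a), len(seq_b)
    neg_inf = float("-inf")
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:   # match or substitution
                step = confusion[(seq_a[i - 1], seq_b[j - 1])]
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + step)
            if i > 0:             # phoneme of seq_a left unmatched
                best[i][j] = max(best[i][j],
                                 best[i - 1][j] + deletion[seq_a[i - 1]])
            if j > 0:             # phoneme of seq_b left unmatched
                best[i][j] = max(best[i][j],
                                 best[i][j - 1] + insertion[seq_b[j - 1]])
    return best[n][m]
```

In a trained system the three tables would hold the scores obtained in the training session mentioned in the abstract; here they are simply inputs.
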
Voice command processing system and computer therefor, and voice command processing method

Patent number: 7299187

Abstract: When a user issued voice command does not match grammars registered in advance, the voice command is identified as a sentence (step S305). This sentence is compared with the registered grammars to calculate a similarity (step S307). When the similarity is higher than a first threshold value (TH1), the voice command is executed (step S315). When the similarity is equal to or lower than the first threshold value (TH1) and higher than a second threshold value (TH2), command choices are displayed for the user and the user is permitted to select a command to be executed (step S319). When the similarity is equal to or lower than the second threshold value (TH2), the command is not executed (step S321). Furthermore, once a command has been executed it is added as a grammar, so that it can be identified when next it is used.

Type: Grant

Filed: February 10, 2003

Date of Patent: November 20, 2007

Assignee: International Business Machines Corporation

Inventors: Yoshinori Tahara, Daisuke Tomoda, Kikuo Mitsubo, Yoshinori Atake
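
The two-threshold decision in the abstract of patent 7299187 reduces to a small function. The function name, the return labels, and the example threshold values are hypothetical; the comparisons follow the abstract's wording (strictly higher than TH1 executes, equal to or lower than TH2 rejects).

```python
def decide(similarity, th1, th2):
    """Map a similarity score to an action, assuming th1 > th2."""
    if similarity > th1:
        return "execute"        # step S315: run the command directly
    if similarity > th2:
        return "offer_choices"  # step S319: let the user pick a command
    return "reject"             # step S321: do not execute


if __name__ == "__main__":
    for s in (0.9, 0.8, 0.6, 0.5, 0.1):
        print(s, decide(s, th1=0.8, th2=0.5))
```
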
Multiple attribute object comparison based on quantitative distance measurement

Patent number: 7284012

Abstract: A system selects a plurality of attributes common to objects being compared and defines at least one value space for each selected attribute. Each selected attribute can have a value space that is different from a value space of the same attribute associated with a different object. The system defines an ordering system for each value space including selecting whether the value space consists of non-ordered values, partially ordered values, or fully ordered values. An objective function is defined for each ordering system and the objective function maps a pair of values of a value space to a number value. The system normalizes each objective function and defines a mapping from the plurality of objective functions to a first general objective function.

Type: Grant

Filed: January 24, 2003

Date of Patent: October 16, 2007

Assignee: International Business Machines Corporation

Inventors: Dikran S. Meliksetian, Nianjun Zhou
Method and apparatus for estimating degree of similarity between voices

Publication number: 20070225979

Abstract: A similarity degree estimation method is performed by two processes. In a first process, an inter-band correlation matrix is created from spectral data of an input voice such that the spectral data are divided into a plurality of discrete bands which are separated from each other with spaces therebetween along a frequency axis, a plurality of envelope components of the spectral data are obtained from the plurality of the discrete bands, and elements of the inter-band correlation matrix are correlation values between the respective envelope components of the input voice. In a second process, a degree of similarity is calculated between a pair of input voices to be compared with each other by using respective inter-band correlation matrices obtained for the pair of the input voices through the inter-band correlation matrix creation process.

Type: Application

Filed: March 20, 2007

Publication date: September 27, 2007

Applicants: YAMAHA CORPORATION, WASEDA UNIVERSITY

Inventors: Mikio Tohyama, Michiko Kazama, Satoru Goto, Takehiko Kawahara, Yasuo Yoshioka
Method for the voice-operated identification of the user of a telecommunications line in a telecommunications network in the course of a dialog with a voice-operated dialog system

Patent number: 7246061

Abstract: A method for the voice-operated identification of the user of a telecommunications line in a telecommunications network is provided in the course of a dialog with a voice-operated dialog system. Utterances spoken by a caller from a group of callers limited to one telecommunications line are used during a human-to-human and/or human-to-machine dialog to apply a reference pattern for the caller. For each reference pattern, a user identifier is stored which is activated once the caller is identified, and, together with the CLI and/or ANI identifier of the telecommunications line, are made available to a server having a voice-controlled dialog system. On the basis of the CLI, including the user identifier, data previously stored for this user are ascertained by the system and made available for the dialog interface with the customer.

Type: Grant

Filed: December 15, 2000

Date of Patent: July 17, 2007

Assignee: Deutsche Telekom AG

Inventors: Fred Runge, Christel Mueller, Marian Trinkel, Thomas Ziem
Automatic pronunciation scoring for language learning

Patent number: 7219059

Abstract: A method and apparatus for generating a pronunciation score by receiving a user phrase intended to conform to a reference phrase and processing the user phrase in accordance with at least one of an articulation-scoring engine, a duration scoring engine and an intonation-scoring engine to derive thereby the pronunciation score.

Type: Grant

Filed: July 3, 2002

Date of Patent: May 15, 2007

Assignee: Lucent Technologies Inc.

Inventors: Sunil K. Gupta, Ziyi Lu, Fengguang Zhao
Prosody generating device, prosody generating method, and program

Patent number: 7200558

Abstract: A prosody generation apparatus capable of suppressing distortion that occurs when generating prosodic patterns and therefore generating a natural prosody is provided. A prosody changing point extraction unit in this apparatus extracts a prosody changing point located at the beginning and the ending of a sentence, the beginning and the ending of a breath group, an accent position and the like. A selection rule and a transformation rule of a prosodic pattern including the prosody changing point are generated by means of a statistical or learning technique and the thus generated rules are stored in a representative prosodic pattern selection rule table and a transformation rule table beforehand. A pattern selection unit selects a representative prosodic pattern from the representative prosodic pattern selection rule table according to the selection rule.

Type: Grant

Filed: March 8, 2002

Date of Patent: April 3, 2007

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Yumiko Kato, Takahiro Kamai
Voice-activated interactive multimedia information processing system

Patent number: 7177402

Abstract: Embodiments of the invention are directed to automated reception systems that may receive voice information indicating action to be taken by the system. The automated reception system may receive a call and transmit a speech message to the caller identifying actions that the caller may ask the system to take. The caller may verbally select an action for the system to execute. Possible actions may depend upon the system context and information about possible actions may be provided to the caller through dynamically generated messages. The caller may also access voicemail or electronic mail messages using embodiments of the invention. Furthermore, in some embodiments, a caller may be able to control a separate communication session using voice or other commands input during a telephone session.

Type: Grant

Filed: March 1, 2001

Date of Patent: February 13, 2007

Assignee: Applied Voice & Speech Technologies, Inc.

Inventor: Michael D. Metcalf
System and method for determining internal parameters of a data clustering program

Patent number: 7177863

Abstract: A system and associated method for tuning a data clustering program to a clustering task determine at least one internal parameter of a data clustering program. The determination of one or more of the internal parameters of the data clustering program occurs before the clustering begins. Consequently, clustering does not need to be performed iteratively, thus improving clustering program performance in terms of the required processing time and processing resources. The system provides pairs of data records; the user indicates whether or not these data records should belong to the same cluster. The similarity values of the records of the selected pairs are calculated based on the default parameters of the clustering program. From the resulting similarity values, an optimal similarity threshold is determined. When the optimization criterion does not yield a single optimal similarity threshold range, equivalent candidate ranges are selected.

Type: Grant

Filed: March 14, 2003

Date of Patent: February 13, 2007

Assignee: International Business Machines Corporation

Inventors: Boris Charpiot, Barbara Hartel, Christoph Lingenfelder, Thilo Maier
Background learning of speaker voices

Patent number: 7171360

Abstract: A speaker identification system includes a speaker model generator 110 for generating a plurality of speaker models. To this end, the generator records training utterances from a plurality of speakers in the background, without prior knowledge of the speakers who spoke the utterances. The generator performs a blind clustering of the training utterances based on a predetermined criterion. For each of the clusters a corresponding speaker model is trained. A speaker identifier 130 identifies a speaker by determining a most likely one of the speaker models for an utterance received from the speaker. The speaker associated with the most likely speaker model is identified as the speaker of the test utterance.

Type: Grant

Filed: May 7, 2002

Date of Patent: January 30, 2007

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Chao-Shih Huang, Ya-Cherng Chu, Wei-Ho Tsai, Jyh-Min Cheng
Speech processing apparatus and method using confidence scores

Patent number: 7165031

Abstract: A speech recognition method and apparatus is disclosed which outputs a confidence score indicative of the posterior probability of an utterance being correctly matched to a word model. The confidence score for the matching of an utterance to a word model is determined directly from the generated values indicative of the goodness of match between the utterance and stored word models utilizing the following equation: $\text{confidence} = \frac{\exp(-2\sigma S(x|w))}{\sum_{\text{words}} \exp(-2\sigma S(x|w))}$ where S(x|w) is the match score for the correlation between a signal x and word w and $\sigma$ is an experimentally determined constant.

Type: Grant

Filed: November 6, 2002

Date of Patent: January 16, 2007

Assignee: Canon Kabushiki Kaisha

Inventor: David Llewellyn Rees
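
The confidence score of patent 7165031 can be read as a normalized exponential of the match scores, and a sketch under that reading is given below. It assumes that a lower match score S(x|w) means a better match and that the same exponent is used in the numerator and in the sum over words; the function name, the dictionary layout, and the min-shift for numerical stability are illustrative choices, not taken from the patent.

```python
import math


def confidences(match_scores, sigma):
    """Turn match scores S(x|w) into confidences that sum to 1.

    match_scores: dict mapping each candidate word to its match score.
    sigma:        the experimentally determined constant.
    """
    # Shifting every score by the minimum leaves the ratio unchanged
    # but keeps the exponentials in a safe numeric range.
    s_min = min(match_scores.values())
    weights = {w: math.exp(-2.0 * sigma * (s - s_min))
               for w, s in match_scores.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}


if __name__ == "__main__":
    print(confidences({"yes": 10.0, "no": 12.0, "maybe": 15.0}, sigma=0.5))
```
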
System and method of developing automatic speech recognition vocabulary for voice activated services

Patent number: 7139706

Abstract: A comprehensive system is provided for designing a voice activated user interface (VA UI) having a semantic and syntactic structure adapted to the culture and conventions of spoken language for the intended users. The system poses, to at least one respondent, a hypothetical task to be performed; asks each of the at least one respondent for a word that the respondent would use to command the hypothetical task to be performed; receives, from each of the at least one respondent, a command word; develops a list of command words from the received command word; and rejects the received command word, if the received command word is acoustically similar to another word in the list of command words. The approach is general across languages and encompasses universal variables of language and culture. Also provided are prompting grammar and error handling methods adapted to such user interfaces.

Type: Grant

Filed: August 12, 2002

Date of Patent: November 21, 2006

Assignee: Comverse, Inc.

Inventor: Matthew John Yuschik
Database annotation and retrieval

Patent number: 7054812

Abstract: A system is provided for determining a sequence of sub-word units representative of at least two words output by a word recognition unit in response to an input word to be recognized. In a preferred embodiment, the word alternatives output by the recognition unit are converted into sequences of phonemes. An optimum alignment between these sequences is then determined using a dynamic programming alignment technique. The sequence of phonemes representative of the input sequences is then determined using this optimum alignment.

Type: Grant

Filed: April 25, 2001

Date of Patent: May 30, 2006

Assignee: Canon Kabushiki Kaisha

Inventors: Jason Peter Andrew Charlesworth, Philip Neil Garner
Speech recognition apparatus using distance based acoustic models

Patent number: 7031917

Abstract: The present invention relates to a speech recognition apparatus and a speech recognition method for speech recognition with improved accuracy. A distance calculator 47 determines the distance from a microphone 21 to a user uttering. Data indicating the determined distance is supplied to a speech recognition unit 41B. The speech recognition unit 41B has plural sets of acoustic models produced from speech data obtained by capturing speeches uttered at various distances. From those sets of acoustic models, the speech recognition unit 41B selects a set of acoustic models produced from speech data uttered at a distance closest to the distance determined by the distance calculator 47, and the speech recognition unit 41B performs speech recognition using the selected set of acoustic models.

Type: Grant

Filed: October 21, 2002

Date of Patent: April 18, 2006

Assignee: Sony Corporation

Inventor: Yasuharu Asano
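
The model selection step in the abstract of patent 7031917 amounts to a nearest-key lookup. In this sketch the acoustic model sets are stand-in strings keyed by the recording distance in meters; the function name, the units, and the data layout are assumptions for illustration only.

```python
def select_model_set(measured_distance, model_sets):
    """Pick the model set whose recording distance is closest to the measurement.

    model_sets: dict mapping recording distance -> acoustic model set.
    """
    nearest = min(model_sets, key=lambda d: abs(d - measured_distance))
    return model_sets[nearest]


if __name__ == "__main__":
    sets = {0.3: "near-field models", 1.0: "mid-range models", 3.0: "far-field models"}
    print(select_model_set(0.9, sets))
```
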
Method for detecting similarity between standard information and input information and method for judging the input information by use of detected result of the similarity

Patent number: 7006970

Abstract: Disclosed is a method for obtaining a precise detected value of a similarity between voices or the like. Standard and input pattern matrices, each having a voice feature amount as a component, are prepared (S1 and S2). A reference shape having a variance different for each specified component of the pattern matrices is prepared, and positive and negative reference pattern vectors, each having a value of the reference shape as a component, are prepared. Then, while the specified component (a center of the reference shape) is moved to each component position j1=1 to m1, j2=1 to m2 of the standard pattern matrix, a shape change between the standard and input pattern matrices is substituted for shape changes of the positive and negative reference pattern vectors. An amount of change in kurtosis of each reference pattern vector is then numerically evaluated to obtain a shape change amount Dj1j2 (S3). Finally, a value of a geometric distance between the pattern matrices is calculated from Dj1j2 (S4).

Type: Grant

Filed: September 11, 2001

Date of Patent: February 28, 2006

Assignee: Entropy Software Laboratory, Inc.

Inventors: Michihiro Jinnai, Hiroshi Yamaguchi
Methods and systems of routing utterances based on confidence estimates

Patent number: 7003456

Abstract: A computer-based method of routing a message to a system includes receiving a message, and processing the message using large-vocabulary continuous speech recognition to generate a string of text corresponding to the message. The method includes generating a confidence estimate of the string of text corresponding to the message and comparing the confidence estimate to a predetermined threshold. If the confidence estimate satisfies the predetermined threshold, the string of text is forwarded to the system. If the confidence estimate does not satisfy the predetermined threshold, the information relating to the message is forwarded to a transcriptionist. The message may include one or more utterances. Each utterance in the message may be separately or jointly processed. In this way, a confidence estimate may be generated and evaluated for each utterance or for the whole message. Information relating to each utterance may be separately or jointly forwarded based on the results of the generation and evaluation.

Type: Grant

Filed: June 12, 2001

Date of Patent: February 21, 2006

Inventors: Laurence S. Gillick, Robert Roth, Linda Manganaro, Barbara R. Peskin, David C. Petty, Ashwin Rao
Linear discriminant based sound class similarities with unit value normalization

Patent number: 6996527

Abstract: A common requirement in automatic speech recognition is to recognize a set of words for any speaker without training the system for each new speaker. A speech recognition system is provided utilizing linear discriminant based phonetic similarities with inter-phonetic unit value normalization. Linear discriminant analysis is utilized using training data with both in-class and out-class sample training utterances for generating linear discriminant vectors for each of the phonetic units. The dot product of each linear discriminant vector and the time spectral pattern vectors generated from the input speech are computed. The resultant raw similarity vectors are then normalized utilizing normalization look-up tables for providing similarity vectors which are utilized by a word matcher for word recognition.

Type: Grant

Filed: July 26, 2001

Date of Patent: February 7, 2006

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Robert C. Boman, Philippe R. Morin, Ted H. Applebaum
Voice recognition apparatus and method, and recording medium

Patent number: 6961701

Abstract: An extended-word selecting section calculates a score for a phoneme string formed of one or more phonemes, corresponding to a user's speech, and searches a large-vocabulary-dictionary for a word having one or more phonemes equal to or similar to those of a phoneme string having a score equal to or higher than a predetermined value. A matching section calculates scores for the word searched for by the extended-word selecting section in addition to a word preliminarily selected by a preliminary word-selecting section. A control section determines a word string as the result of recognition of the speech uttered by the user.

Type: Grant

Filed: March 3, 2001

Date of Patent: November 1, 2005

Assignee: Sony Corporation

Inventors: Hiroaki Ogawa, Katsuki Minamino, Yasuharu Asano, Helmut Lucke
Acoustic speech recognition method and system using stereo vision neural networks with competition and cooperation

Patent number: 6947890

Abstract: A method and system are provided for speech recognition. The speech recognition method includes the steps of preparing training data representing acoustic parameters of each of phonemes at each time frame; receiving an input signal representing a sound to be recognized and converting the input signal to input data; comparing the input data at each frame with the training data of each of the phonemes to derive a similarity measure of the input data with respect to each of the phonemes; and processing the similarity measures obtained in the comparing step using a neural net model governing development of activities of plural cells to conduct speech recognition of the input signal.

Type: Grant

Filed: May 30, 2000

Date of Patent: September 20, 2005

Inventors: Tetsuro Kitazoe, Sung-Ill Kim, Tomoyuki Ichiki
Speech recognition apparatus and method using two opposite words

Patent number: 6937982

Abstract: A speech recognition apparatus recognizes a speech signal received from a speaker and provides the result of recognition for an external device. In the apparatus, a pattern matching section performs pattern matching between each of reference patterns in a vocabulary and characteristic parameters extracted from the speech signal. The vocabulary includes reference patterns corresponding to words. Further the apparatus has a similar sound group which includes reference patterns corresponding to the sound similar to that of a specific word. The specific word is a word in response to which the external device performs an operation which cannot be easily undone. The speech signal is rerecognized by using the similar sound group. As a result, the pattern matching section outputs a word other than the specific word, if one of the reference patterns in the similar sound group has a high similarity with the characteristic parameters.

Type: Grant

Filed: July 19, 2001

Date of Patent: August 30, 2005

Assignee: Denso Corporation

Inventors: Norihide Kitaoka, Hiroshi Ohno
Method and system for speech recognition using phonetically similar word alternatives

Patent number: 6910012

Abstract: A method for performing speech recognition can include the steps of providing a grammar including entries comprising a parent word and a pseudo word being substantially phonetically equivalent to the parent word. The grammar can provide a translation from the pseudo word to the parent word. The parent word can be received as speech and the speech can be compared to the grammar entries. Additionally, the speech can be matched to the pseudo word and the pseudo word can be translated to the parent word.

Type: Grant

Filed: May 16, 2001

Date of Patent: June 21, 2005

Assignee: International Business Machines Corporation

Inventors: Matthew W. Hartley, David E. Reich
Time-series segmentation

Patent number: 6907367

Abstract: A method for segmenting a signal into segments having similar spectral characteristics is provided. Initially the method generates a table of previous values from older signal values that contains a scoring value for the best segmentation of previous values and a segment length of the last previously identified segment. The method then receives a new sample of the signal and computes a new spectral characteristic function for the signal based on the received sample. A new scoring function is computed from the spectral characteristic function. Segments of the signal are recursively identified based on the newly computed scoring function and the table of previous values. The spectral characteristic function can be a selected one of an autocorrelation function and a discrete Fourier transform. An example is provided for segmenting a speech signal.

Type: Grant

Filed: August 31, 2001

Date of Patent: June 14, 2005

Assignee: The United States of America as represented by the Secretary of the Navy

Inventor: Paul M. Baggenstoss
Language recognition using sequence frequency

Patent number: 6882970

Abstract: A system is provided for comparing an input query with a number of stored annotations to identify information to be retrieved from a database. The comparison technique divides the input query into a number of fixed-size fragments and identifies how many times each of the fragments occurs within each annotation using a dynamic programming matching technique. The frequencies of occurrence of the fragments in both the query and the annotation are then compared to provide a measure of the similarity between the query and the annotation. The information to be retrieved is then determined from the similarity measures obtained for all the annotations.

Type: Grant

Filed: October 25, 2000

Date of Patent: April 19, 2005

Assignee: Canon Kabushiki Kaisha

Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
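
The fragment-frequency comparison in the abstract of patent 6882970 can be approximated as below. Two substitutions are made and should be read as assumptions: fragments are counted by exact matching rather than by the dynamic programming matching the abstract describes, and the frequency profiles are compared with a cosine measure, which the abstract does not specify. Overlapping fragments and all names are likewise illustrative.

```python
import math
from collections import Counter


def fragments(seq, size):
    """All overlapping fixed-size fragments of a phoneme sequence."""
    return [tuple(seq[i:i + size]) for i in range(len(seq) - size + 1)]


def similarity(query, annotation, size=3):
    """Cosine similarity between fragment counts, over the query's fragments."""
    q_counts = Counter(fragments(query, size))
    a_counts = Counter(fragments(annotation, size))
    dot = sum(q_counts[f] * a_counts[f] for f in q_counts)
    q_norm = math.sqrt(sum(c * c for c in q_counts.values()))
    a_norm = math.sqrt(sum(a_counts[f] ** 2 for f in q_counts))
    if q_norm == 0.0 or a_norm == 0.0:
        return 0.0   # no shared fragments, or query shorter than one fragment
    return dot / (q_norm * a_norm)


def best_annotation(query, annotations, size=3):
    """Name of the stored annotation most similar to the query."""
    return max(annotations, key=lambda k: similarity(query, annotations[k], size))
```
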
Method of speech recognition by presenting N-best word candidates

Patent number: 6839667

Abstract: A method for performing speech recognition can include receiving user speech and determining a plurality of potential candidates. Each of the candidates can provide a textual interpretation of the speech. Confidence scores can be calculated for the candidates. The confidence scores can be compared to a predetermined threshold. Also, selected ones of the plurality of candidates can be presented to the user as alternative interpretations of the speech when none of the confidence scores is greater than the predetermined threshold. The selected ones of the plurality of candidates can have confidence scores above a predetermined minimum threshold, and thus can have confidence scores within a predetermined range.

Type: Grant

Filed: May 16, 2001

Date of Patent: January 4, 2005

Assignee: International Business Machines Corporation

Inventor: David E. Reich
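
The selection logic in the abstract of patent 6839667 can be sketched as follows. The function name, the `(accepted, alternatives)` return shape, and the example thresholds are hypothetical. The abstract does not say what happens when some score exceeds the threshold, so accepting the top candidate in that case is an assumption.

```python
def choose(candidates, accept_threshold, min_threshold):
    """Return (accepted_text, alternatives) for a list of (text, confidence)."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if ranked and ranked[0][1] > accept_threshold:
        return ranked[0][0], []          # confident enough: accept outright
    # No score beats the threshold: offer those above the minimum,
    # i.e. candidates whose confidence lies within the predetermined range.
    alternatives = [text for text, conf in ranked if conf > min_threshold]
    return None, alternatives


if __name__ == "__main__":
    print(choose([("call bob", 0.7), ("call rob", 0.6), ("tall bob", 0.2)], 0.8, 0.3))
```
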
Detecting repeated phrases and inference of dialogue models

Publication number: 20040249637

Abstract: A method of speech recognition obtains acoustic data from a plurality of conversations. A plurality of pairs of utterances are selected from the plurality of conversations. At least one portion of the first utterance of the pair of utterances is dynamically aligned with at least one portion of the second utterance of the pair of utterances, and an acoustic similarity is computed. At least one pair that includes a first portion from a first utterance and a second portion from a second utterance is chosen, based on a criterion of acoustic similarity. A common pattern template is created from the first portion and the second portion.

Type: Application

Filed: June 2, 2004

Publication date: December 9, 2004

Applicant: Aurilab, LLC

Inventor: James K. Baker
Speech recognition accuracy in a multimodal input system

Patent number: 6823308

Abstract: A speech recognition method for use in a multimodal input system comprises receiving a multimodal input comprising digitized speech as a first modality input and data in at least one further modality input. Features in the speech and in the data in at least one further modality are identified. The identified features in the speech and in the data are used in the recognition of words by comparing the identified features with states in models for the words. The models have states for the recognition of speech and for words having features in at least one further modality associated with the words, the models also have states for the recognition of events in the further modality or each further modality.

Type: Grant

Filed: February 16, 2001

Date of Patent: November 23, 2004

Assignee: Canon Kabushiki Kaisha

Inventors: Robert Alexander Keiller, Nicolas David Fortescue
Speech recognition apparatus, speech recognition method, and recording medium on which speech recognition program is computer-readable recorded

Publication number: 20040215454

Abstract: A speech recognizer 300 built into a navigation apparatus 100 includes a noise estimator 320 which calculates a noise model based on a microphone input signal, and an adaptive processor 330 which performs an adaptive process on each keyword model and each non-keyword model stored in an HMM database 310 based on the noise model. The adaptive processor 330 performs a data adaptation process on each keyword model and non-keyword model based on the noise model, and word spotting is performed based on the keyword models and the non-keyword models subjected to the data adaptation process.

Type: Application

Filed: April 22, 2004

Publication date: October 28, 2004

Inventors: Hajime Kobayashi, Kengo Hanai
Systems and methods for speaker change detection

Publication number: 20040204939

Abstract: A speaker change detection system performs speaker change detection on an input audio stream. The speaker change detection system includes a segmentation component [401], a phone classification decode component [402], and a speaker change detection component [403]. The segmentation component [401] segments the audio stream into segments [501-504] of predetermined length intervals. The segments may overlap one another. The phone classification decode component decodes the intervals to produce a set of phone classes corresponding to each of the intervals. The speaker change detection component detects locations of speaker changes in the audio stream based on a similarity value calculated at phone class boundaries.

Type: Application

Filed: October 16, 2003

Publication date: October 14, 2004

Inventors: Daben Liu, Francis G. Kubala
Method and apparatus for handset detection

Patent number: 6778957

Abstract: Disclosed is a method of automated handset identification, comprising receiving a sample speech input signal from a sample handset; deriving a cepstral covariance sample matrix from said first sample speech signal; calculating, with a distance metric, all distances between said sample matrix and one or more cepstral covariance handset matrices, wherein each said handset matrix is derived from a plurality of speech signals taken from different speakers through the same handset; and determining if the smallest of said distances is below a predetermined threshold value.

Type: Grant

Filed: August 21, 2001

Date of Patent: August 17, 2004

Assignee: International Business Machines Corporation

Inventors: Zhong-Hua Wang, David Lubensky, Cheng Wu
Performance gauge for a distributed speech recognition system

Patent number: 6766294

Abstract: A performance gauge for use in conjunction with a transcription system including a speech processor linked to at least one speech recognition engine and at least one transcriptionist. The speech processor includes an input for receiving speech files and storage means for storing the received speech files until such a time that they are forwarded to a selected speech recognition engine or transcriptionist for processing. The system includes a transcriptionist text file database in which manually transcribed transcriptionist text files are stored, each stored transcriptionist text file including time stamped data indicative of position within an original speech file. The system further includes a recognition engine text file database in which recognition engine text files transcribed via the at least one speech recognition engine are stored, each stored recognition engine text file including time stamped data indicative of position within an original speech file.

Type: Grant

Filed: November 30, 2001

Date of Patent: July 20, 2004

Assignee: Dictaphone Corporation

Inventors: Andrew MacGinite, James Cyr, Martin Hold, Channell Greene, Regina Kuhnen
Pattern matching method and apparatus

Patent number: 6725196

Abstract: A method and apparatus is provided for matching a first sequence of patterns representative of a first signal with a second sequence of patterns representative of a second signal. The system uses a plurality of different pruning thresholds (th) to control the propagation of paths which represent possible matchings between a sequence of second signal patterns and a sequence of first signal patterns ending at the current first signal pattern. In particular, the pruning threshold used for a given path during the processing of a current first signal pattern depends upon the position, within the sequence of patterns representing the second signal, of the second signal pattern which is at the end of the given path.

Type: Grant

Filed: March 20, 2001

Date of Patent: April 20, 2004

Assignee: Canon Kabushiki Kaisha

Inventors: Robert Alexander Keiller, Eli Tzirkel-Hancock, Julian Richard Seward
Speech recognizing apparatus

Patent number: 6701292

Abstract: A speech-recognizing apparatus for recognizing input speech comprises an analysis unit for computing a characteristic vector for each of frames of the input speech, a correction-value storage unit for storing a correction distance in advance, a vector-to-vector-distance-computing unit for computing a vector-to-vector distance between the characteristic vector and the phoneme characteristic vector, an average-value-computing unit for computing an average value of vector-to-vector distances for one of the frames, a correction unit for computing a corrected vector-to-vector distance as a value of an expression of (the vector-to-vector distance - the average value + the correction distance), and a recognition unit for cumulating corrected vector-to-vector distances into a cumulative vector-to-vector distance and comparing the cumulative vector-to-vector distance with the word standard pattern in order to recognize the input speech.

Type: Grant

Filed: October 30, 2000

Date of Patent: March 2, 2004

Assignee: Fujitsu Limited

Inventors: Chiharu Kawai, Hiroshi Katayama, Takehiro Nakai
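
The correction expression in the abstract of patent 6701292 above (distance minus the frame average plus a stored correction distance) can be illustrated with a minimal sketch. The function names are hypothetical, and accumulating the corrected distances per phoneme across frames is a simplifying assumption; the abstract does not describe how the cumulative distance is matched against the word standard pattern.

```python
from statistics import mean

def corrected_distances(distances, correction):
    """Correct one frame's vector-to-vector distances.

    distances: distances between the frame's characteristic vector and each
    phoneme characteristic vector.
    correction: the correction distance stored in advance.
    """
    frame_average = mean(distances)
    return [d - frame_average + correction for d in distances]

def cumulative_distances(frames, correction):
    """Sum corrected distances per phoneme over all frames (assumed scheme)."""
    totals = [0.0] * len(frames[0])
    for distances in frames:
        for i, value in enumerate(corrected_distances(distances, correction)):
            totals[i] += value
    return totals
```

Subtracting the per-frame average removes a frame-wide offset, so a frame in which every distance is large contributes no more to the cumulative total than a frame in which every distance is small.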
Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Publication number: 20040019483

Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.

Type: Application

Filed: October 9, 2002

Publication date: January 29, 2004

Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J.R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
Speech recognition apparatus, speech recognition method, and computer-readable recording medium in which speech recognition program is recorded

Publication number: 20030200086

Abstract: A speech recognition apparatus comprises a speech analyzer which extracts feature patterns of spontaneous speech divided into frames; a keyword model database which prestores feature patterns of a plurality of keywords to be recognized; a garbage model database which prestores feature patterns of components of extraneous speech to be identified; a first likelihood calculator which calculates likelihood of feature values based on the feature patterns of each frame and the keywords; and a second likelihood calculator which calculates likelihood of feature values based on the feature patterns of each frame and the extraneous speech. The device recognizes keywords contained in the spontaneous speech by calculating cumulative likelihood based on the calculated likelihood, adding a predetermined correction value in the second likelihood calculator.

Type: Application

Filed: April 15, 2003

Publication date: October 23, 2003

Applicant: PIONEER CORPORATION

Inventors: Yoshihiro Kawazoe, Hajime Kobayashi
Speech recognition method and system

Patent number: 6631349

Abstract: Frames making up an input speech are each collated with a string of phonemes representing speech candidates to be recognized, whereby evaluation values regarding the phonemes are computed. The frames are each compared with part of the phoneme string so as to reduce computations and memory capacity required in recognizing the input speech based on the evaluation values. That is, each frame is compared with a portion of the phoneme string to acquire an evaluation value for each phoneme. If the acquired evaluation value meets a predetermined condition, part of the phonemes to be collated with the next frame are changed. Illustratively, if the evaluation value for the phoneme heading a given portion of collated phonemes is smaller than the evaluation value of the phoneme which terminates that phoneme portion, then the head phoneme is replaced by the next phoneme. The new portion of phonemes obtained by the replacement is used for collation with the next frame.

Type: Grant

Filed: May 9, 2000

Date of Patent: October 7, 2003

Assignee: Hitachi, Ltd.

Inventors: Kazuyoshi Ishiwatari, Kazuo Kondo, Shinji Wakisaka
Method and apparatus for automatically processing a user's communication

Patent number: 6625600

Abstract: The invention concerns a method and apparatus for processing a user's communication. The invention may include receiving a list of recognized symbol strings of one or more recognized entries. The list of recognized symbol strings may include a first similarity score associated with each recognized entry. From each recognized symbol string one or more contiguous sequences of N-symbols may be extracted. One of the extracted contiguous sequences of N-symbols may be matched with at least one stored contiguous sequence of N-symbols from a first database. A preliminary set of symbol strings and associated second similarity scores may be generated. The preliminary set of symbol strings may include one or more stored symbol strings from a second database that correspond to the at least one matched contiguous sequence of N-symbols. A third similarity score associated with the one or more stored symbol strings included in the preliminary set of symbol strings may be computed.

Type: Grant

Filed: May 1, 2001

Date of Patent: September 23, 2003

Assignee: Telelogue, Inc.

Inventors: Yevgenly Lyudovyk, Esther Levin
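
The N-symbol matching step in the abstract of patent 6625600 above can be sketched roughly as follows. The abstract does not say how the similarity scores are computed, so counting shared N-grams here is purely an assumption, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

def ngrams(symbols, n):
    """All contiguous sequences of n symbols in a string."""
    return [symbols[i:i + n] for i in range(len(symbols) - n + 1)]

def build_index(stored_strings, n):
    """Map each stored N-gram to the stored strings that contain it."""
    index = defaultdict(set)
    for s in stored_strings:
        for gram in ngrams(s, n):
            index[gram].add(s)
    return index

def preliminary_set(recognized, index, n):
    """Stored strings sharing at least one N-gram with the recognized string.

    The count of shared distinct N-grams serves as an assumed similarity score.
    """
    counts = Counter()
    for gram in set(ngrams(recognized, n)):
        for s in index.get(gram, ()):
            counts[s] += 1
    return counts
```

For example, with stored strings "boston", "austin" and "houston" and N = 3, the recognized string "bostin" shares two trigrams each with "boston" and "austin" and none with "houston".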
Method for rule-based correction of spelling and grammar errors

Patent number: 6618697

Abstract: A computer implemented method which does not require a stored dictionary for correcting spelling errors in a sequence of words comprises storing a plurality of spelling rules defined as regular expressions for matching a potentially illegal n-gram which may comprise less than all letters in the word and for replacing an illegal n-gram with a legal n-gram to return a corrected word, submitting a word from said sequence of words to the spelling rules and replacing a word in the string of words with a corrected word.

Type: Grant

Filed: May 14, 1999

Date of Patent: September 9, 2003

Assignee: Justsystem Corporation

Inventors: Mark Kantrowitz, Shumeet Baluja
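
The dictionary-free approach in the abstract of patent 6618697 above, in which regular expressions match an illegal n-gram and replace it with a legal one, can be illustrated with a small sketch. The three rules below are hypothetical examples invented for illustration and are not taken from the patent; they assume lower-case input.

```python
import re

# Hypothetical spelling rules: (pattern for an illegal n-gram, legal replacement).
RULES = [
    (re.compile(r"cieve"), "ceive"),          # "recieve" -> "receive"
    (re.compile(r"^teh$"), "the"),            # whole-word transposition
    (re.compile(r"([a-z])\1\1+"), r"\1\1"),   # three or more repeats -> two
]

def correct_word(word):
    """Submit one word to every rule in turn and return the corrected word."""
    for pattern, replacement in RULES:
        word = pattern.sub(replacement, word)
    return word

def correct_text(text):
    """Correct each whitespace-separated word in a sequence of words."""
    return " ".join(correct_word(w) for w in text.split())
```

Because each rule targets an n-gram rather than a whole word, a single rule such as the first one also covers "decieve" and "concieve" without any stored word list.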
Maximum entropy and maximum likelihood criteria for feature selection from multivariate data

Patent number: 6609094

Abstract: Improvements in speech recognition systems are achieved by considering projections of the high dimensional data on lower dimensional subspaces, subsequently by estimating the univariate probability densities via known univariate techniques, and then by reconstructing the density in the original higher dimensional space from the collection of univariate densities so obtained. The reconstructed density is by no means unique unless further restrictions on the estimated density are imposed. The variety of choices of candidate univariate densities as well as the choices of subspaces on which to project the data including their number further add to this non-uniqueness. Probability density functions are then considered that maximize a certain optimality criterion as a solution to this problem. Specifically, those probability density functions that either maximize the entropy functional, or alternatively, the likelihood associated with the data are considered.

Type: Grant

Filed: May 22, 2000

Date of Patent: August 19, 2003

Assignee: International Business Machines Corporation

Inventors: Sankar Basu, Charles A. Micchelli, Peder Olsen
Voice command processing system and computer therefor, and voice command processing method

Publication number: 20030154077

Abstract: When a user issued voice command does not match grammars registered in advance, the voice command is identified as a sentence (step S305). This sentence is compared with the registered grammars to calculate a similarity (step S307). When the similarity is higher than a first threshold value (TH1), the voice command is executed (step S315). When the similarity is equal to or lower than the first threshold value (TH1) and higher than a second threshold value (TH2), command choices are displayed for the user and the user is permitted to select a command to be executed (step S319). When the similarity is equal to or lower than the second threshold value (TH2), the command is not executed (step S321). Furthermore, once a command has been executed it is added as a grammar, so that it can be identified when next it is used.

Type: Application

Filed: February 10, 2003

Publication date: August 14, 2003

Applicant: International Business Machines Corporation

Inventors: Yoshinori Tahara, Daisuke Tomoda, Kikuo Mitsubo, Yoshinori Atake
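
The two-threshold decision in the abstract of publication 20030154077 above (steps S315, S319 and S321) can be sketched as below. The function name and the returned labels are hypothetical, and the check that TH1 is greater than TH2 is an assumption implied by, but not stated in, the abstract.

```python
def decide(similarity, th1, th2):
    """Map a similarity score to one of three actions.

    th1 is the first (higher) threshold and th2 the second (lower) one.
    """
    if th2 >= th1:
        raise ValueError("th1 is assumed to be greater than th2")
    if similarity > th1:
        return "execute"          # step S315: run the command
    if similarity > th2:
        return "offer_choices"    # step S319: let the user pick a command
    return "reject"               # step S321: do not execute
```

A score exactly equal to TH1 falls into the middle band and a score exactly equal to TH2 is rejected, matching the "equal to or lower than" wording of the abstract.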
Pattern recognition based on piecewise linear probability density function

Patent number: 6594392

Abstract: The present invention is a method and apparatus to determine a similarity measure between first and second patterns. First and second storages store first and second feature vectors which represent the first and second patterns, respectively. A similarity estimator is coupled to the first and second storages to compute a similarity probability of the first and second feature vectors using a piecewise linear probability density function (PDF). The similarity probability corresponds to the similarity measure.

Type: Grant

Filed: May 17, 1999

Date of Patent: July 15, 2003

Assignee: Intel Corporation

Inventor: Umberto Santoni
Method and apparatus for objective sound quality measurement using statistical and temporal distribution parameters

Patent number: 6577996

Abstract: A method and apparatus for objectively evaluating sound quality of a signal processor or transmission channel. The present invention analyzes the distortion in a series of test sound frames compared to a series of sample sound frames. The invention detects sequences of test sound frames having distortion levels that are greater than a temporal distortion threshold and calculates an average length and a maximum length of these sequences. The present invention also detects individual test sound frames having distortion levels that are greater than an outlier distortion threshold and calculates a percentage of these frames present in the series of test sound frames. Further, the present invention calculates the average distortion level in the series of test sound frames and a variance of the distortion level in the test sound frames.

Type: Grant

Filed: December 8, 1998

Date of Patent: June 10, 2003

Assignee: Cisco Technology, Inc.

Inventor: Ramanathan T. Jagadeesan
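
The statistics listed in the abstract of patent 6577996 above can be computed along these lines. This sketch assumes the per-frame distortion levels have already been measured (the abstract does not say how), and the function and key names are hypothetical. Population variance is used here as one possible reading of "variance".

```python
from statistics import mean, pvariance

def distortion_summary(distortions, temporal_threshold, outlier_threshold):
    """Summarize per-frame distortion levels of a series of test sound frames."""
    if not distortions:
        raise ValueError("at least one frame is required")
    # Lengths of consecutive runs of frames above the temporal threshold.
    runs, current = [], 0
    for d in distortions:
        if d > temporal_threshold:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    outliers = sum(1 for d in distortions if d > outlier_threshold)
    return {
        "avg_run_length": mean(runs) if runs else 0.0,
        "max_run_length": max(runs, default=0),
        "outlier_percent": 100.0 * outliers / len(distortions),
        "mean_distortion": mean(distortions),
        "distortion_variance": pvariance(distortions),
    }
```

The run lengths capture how long distortion persists, while the outlier percentage captures how often isolated frames are badly distorted, which is why the two thresholds are kept separate.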
Speech processing apparatus and method

Patent number: 6560575

Abstract: An apparatus is provided for checking the consistency between two training words which can be used in, for example, a speech recognition or verification system. Two training examples are aligned using a dynamic programming alignment process and an average frame score is calculated from the alignment results together with the worst score in a number of consecutive frames. These values are then compared with similar values obtained from training examples which are known to be consistent to determine if the training examples are consistent.

Type: Grant

Filed: September 30, 1999

Date of Patent: May 6, 2003

Assignee: Canon Kabushiki Kaisha

Inventor: Robert Alexander Keiller
Similarity evaluation method, similarity evaluation program and similarity evaluation apparatus

Publication number: 20030065510

Abstract: A similarity evaluation program capable of determining similarity between probability models at a high speed (with little calculation) is disclosed. The similarity evaluation program is implemented on an apparatus such as a computer for evaluating similarity between a pair of probability model information each including a plurality of probability information constituted by a plurality of types of data, and this apparatus is provided with a dynamic programming operation unit for performing arithmetic processing based on dynamic programming techniques using a similarity value indicating similarity between probability information included in one of the pair probability model information and probability information included in the other of the pair of probability model information as an indicator for selecting a path.

Type: Application

Filed: March 28, 2002

Publication date: April 3, 2003

Applicant: Fujitsu Limited

Inventor: Makihiko Sato
High dimensional acoustic modeling via mixtures of compound gaussians with linear transforms

Patent number: 6539351

Abstract: A method is provided for generating a high dimensional density model within an acoustic model for one of a speech and a speaker recognition system. Acoustic data obtained from at least one speaker is transformed into high dimensional feature vectors. The density model is formed to model the feature vectors by a mixture of compound Gaussians with a linear transform, wherein each compound Gaussian is associated with a compound Gaussian prior and models each coordinate of each component of the density model independently by a univariate Gaussian mixture comprising a univariate Gaussian prior, variance, and mean. An iterative expectation maximization (EM) method is applied to the feature vectors. The EM method includes the step of computing an auxiliary function Q of the EM method.

Type: Grant

Filed: May 5, 2000

Date of Patent: March 25, 2003

Assignee: International Business Machines Corporation

Inventors: Scott Shaobing Chen, Ramesh Ambat Gopinath
Smart training and smart scoring in SD speech recognition system with user defined vocabulary

Patent number: 6535850

Abstract: In a speech training and recognition system, the current invention detects and warns the user about the similar sounding entries to vocabulary and permits entry of such confusingly similar terms which are marked along with the stored similar terms to identify the similar words. In addition, the states in similar words are weighted to apply more emphasis to the differences between similar words than the similarities of such words. Another aspect of the current invention is to use modified scoring algorithm to improve the recognition performance in the case where confusing entries were made to the vocabulary despite the warning. Yet another aspect of the current invention is to detect and warn the user about potential problems with new entries such as short words and two or more word entries with long silence periods in between words. Finally, the current invention also includes alerting the user about the dissimilarity of the multiple tokens of the same vocabulary item in the case of multiple-token training.

Type: Grant

Filed: March 9, 2000

Date of Patent: March 18, 2003

Assignee: Conexant Systems, Inc.

Inventor: Aruna Bayya
Speech recognition apparatus and method

Patent number: 6507815

Abstract: A group of words to be registered in a word dictionary are sorted in order of sound models to produce a word list. A tree-structure word dictionary, in which sound models at the head part of the words are shared among the words, is prepared using this word list. Each node having a different set of reachable words from a parent node holds word information including a minimum out of word IDs of words reachable from that node, and the number of words reachable from that node. For searching for a word matching with speech input, language likelihoods are looked ahead using this word information. The word matching with the speech input can be recognized efficiently, using such a tree-structure word dictionary and a look-ahead method of language likelihood.

Type: Grant

Filed: March 29, 2000

Date of Patent: January 14, 2003

Assignee: Canon Kabushiki Kaisha

Inventor: Hiroki Yamamoto
Speaker verification system and method using spoken continuous, random length digit string

Patent number: 6496800

Abstract: A speaker verification system using the voice of a user uttering a continuous, random length digit string is provided. The speaker verification system includes a random digit generator for generating a continuous, random length digit string; a user interface for providing the continuous, random length digit string; a feature extractor for extracting voice features from the user's voice uttering the continuous, random length digit string; a digit voice verification unit for comparing the voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and for determining whether the derived digit string is identical to the digit string provided to the user via the user interface; and a speaker verification unit for comparing the voice features with a speaker-dependent model of the user to measure the similarity between them.

Type: Grant

Filed: May 1, 2000

Date of Patent: December 17, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventors: Byung-goo Kong, Sang-ryong Kim
Memory having speech recognition support, by integrated local distance-computation/reference-vector-storage, for applications with general-purpose microprocessor systems

Patent number: 6490559

Abstract: The distance computation represents a central, constantly recurrent task in sample and speech recognition. It is used in speech recognition as a degree of similarity between a part of a speech utterance and a speech reference. In picture processing and sample recognition, it is used for data compression. The distance computation requires the longest computation time so that a reduction of the computation time results in a considerable efficiency improvement. A reduction of the computation time is achieved by the integration of the distance computation in a memory module in which particularly the reference data are stored. Due to this integration, the other components of the overall system are relieved of this constantly recurrent task and are available for more complex processes in this period of time. This integration makes the distance computation essentially shorter because the communication between memory sections and computation unit takes place directly without utilizing a bus system.

Type: Grant

Filed: October 13, 1998

Date of Patent: December 3, 2002

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Wolfgang O. Budde, Volker Steinbiss
