Clustering Patents (Class 704/245)
-
Patent number: 6243674
Abstract: A sound compression system adaptively switches codebooks in and out based on a calculation carried out with the output of the codebook. The system uses three separate codebooks: adaptive vector quantization codebook, real pitch codebook, and noise codebook. The perceptually-weighted filter is generated adaptively using the predictive coefficients from the current sub-frame.
Type: Grant
Filed: March 2, 1998
Date of Patent: June 5, 2001
Assignee: American Online, Inc.
Inventor: Alfred Yu
-
Patent number: 6212500
Abstract: In a method for determining the similarities of sounds across different languages, hidden Markov modelling of multilingual phonemes is employed wherein language-specific as well as language-independent properties are identified by combining the probability densities for different hidden Markov sound models in various languages.
Type: Grant
Filed: March 9, 1999
Date of Patent: April 3, 2001
Assignee: Siemens Aktiengesellschaft
Inventor: Joachim Köhler
-
Patent number: 6205424
Abstract: Speech signals from speakers having known identities are used to create sets of acoustic models. The acoustic models along with their corresponding identities are stored in a memory. A plurality of sets of cohort models that characterize the speech signals are selected from the stored sets of acoustic models, and linked to the set of acoustic models of each identified speaker. During a testing session speech signals produced by an unknown speaker having a claimed identity are processed to generate processed speech signals. The processed speech signals are compared to the set of models of the claimed speaker to produce first scores. The processed speech signals are also compared to the sets of cohort models to produce second scores. A subset of scores are dynamically selected from the second scores according to predetermined criteria.
Type: Grant
Filed: July 31, 1996
Date of Patent: March 20, 2001
Assignee: Compaq Computer Corporation
Inventors: William D. Goldenthal, Brian S. Eberman
-
Patent number: 6182037
Abstract: Fast and detailed match techniques for speaker recognition are combined into a hybrid system in which speakers are associated in groups when potential confusion is detected between a speaker being enrolled and a previously enrolled speaker. Thus the detailed match techniques are invoked only at the potential onset of saturation of the fast match technique while the detailed match is facilitated by limitation of comparisons to the group and the development of speaker-dependent models which principally function to distinguish between members of a group rather than to more fully characterize each speaker. Thus storage and computational requirements are limited and fast and accurate speaker recognition can be extended over populations of speakers which would degrade or saturate fast match systems and degrade performance of detailed match systems.
Type: Grant
Filed: May 6, 1997
Date of Patent: January 30, 2001
Assignee: International Business Machines Corporation
Inventor: Stephane Herman Maes
-
Patent number: 6163769
Abstract: A text-to-speech system includes a storage device for storing a clustered set of context-dependent phoneme-based units of a target speaker. In one embodiment, decision trees are used wherein each decision tree based context-dependent phoneme-based unit is arranged based on context of at least one immediately preceding and succeeding phoneme. At least one of the context-dependent phoneme-based units represents other non-stored context-dependent phoneme units of similar sound due to similar contexts. A text analyzer obtains a string of phonetic symbols representative of text to be converted to speech. A concatenation module selects stored decision tree based context-dependent phoneme-based units from the set of decision tree based context-dependent phoneme-based units based on the context of the phonetic symbols and synthesizes the selected phoneme-based units to generate speech corresponding to the text.
Type: Grant
Filed: October 2, 1997
Date of Patent: December 19, 2000
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, Hsiao-Wuen Hon, Xuedong D. Huang
-
Patent number: 6141641
Abstract: The present invention includes a system for recognizing speech based on an input data stream. The system includes an acoustic model which has a model size. The model is adjustable to a desired size based on characteristics of a computer system on which the recognition system is run.
Type: Grant
Filed: April 15, 1998
Date of Patent: October 31, 2000
Assignee: Microsoft Corporation
Inventors: Mei-Yuh Hwang, Xuedong D. Huang
-
Patent number: 6122612
Abstract: A method and apparatus for matching at least a first input identifier with a reference identifier. A user provides an input identifier into a system, and the system produces a recognized identifier based on the input identifier. The system of the present invention performs a check-sum operation to determine whether the recognized identifier was recognized correctly. If the check-sum operation reveals that the recognized identifier is incorrect, the system of the present invention generates a plurality of substitute identifiers. The substitute identifiers are compared to a set of pre-stored reference identifiers. If a match is found between a reference identifier and a substitute identifier, the matched reference identifier is selected as corresponding to the input identifier provided by the user.
Type: Grant
Filed: November 20, 1997
Date of Patent: September 19, 2000
Assignee: AT&T Corp
Inventor: Randy G. Goldberg
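The check-and-substitute flow described in this abstract can be sketched roughly as follows. This is only an illustration: the mod-10 digit-sum check, the `CONFUSABLE` table, and the function names are assumptions made for the example, not details taken from the patent.

```python
from itertools import product

# Hypothetical confusion sets: digits a recognizer might mix up.
# Each entry lists the recognized digit first, then its look-alikes.
CONFUSABLE = {"5": "59", "9": "95", "1": "17", "7": "71"}

def checksum_ok(identifier: str) -> bool:
    """Assumed check: the last digit equals the sum of the others mod 10."""
    *body, check = (int(ch) for ch in identifier)
    return sum(body) % 10 == check

def substitutes(identifier: str):
    """Yield every identifier formed by swapping in confusable digits."""
    options = [CONFUSABLE.get(ch, ch) for ch in identifier]
    for combo in product(*options):
        yield "".join(combo)

def resolve(recognized: str, references: set):
    """Accept a recognized identifier that passes the check; otherwise
    return the first substitute found among the references, or None."""
    if checksum_ok(recognized):
        return recognized
    for candidate in substitutes(recognized):
        if candidate in references:
            return candidate
    return None
```

For instance, `resolve("7236", {"1236", "4480"})` fails the assumed check on "7236", tries the substitute "1236", and finds it among the references.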
-
Patent number: 6107935
Abstract: A speaker recognition system for selectively permitting access by a requesting speaker to one of a service and facility includes an acoustic front-end for computing at least one feature vector from a speech utterance provided by the requesting speaker; a speaker dependent codebook store for pre-storing sets of acoustic features, in the form of codebooks, respectively corresponding to a pool of previously enrolled speakers; a speaker identifier/verifier module operatively coupled to the acoustic front-end, wherein: the speaker identifier/verifier module identifies, from identifying indicia provided by the requesting speaker, a previously enrolled speaker as a claimed speaker; further, the speaker identifier/verifier module associates, with the claimed speaker, first and second groups of previously enrolled speakers, the first group being defined as speakers whose codebooks are respectively acoustically similar to the claimed speaker (i.e.
Type: Grant
Filed: February 11, 1998
Date of Patent: August 22, 2000
Assignee: International Business Machines Corporation
Inventors: Liam David Comerford, Stephane Herman Maes
-
Patent number: 6073096
Abstract: A method of speech recognition, in accordance with the present invention, includes the steps of grouping acoustics to form classes based on acoustic features, clustering training speakers by the classes to provide class-specific cluster systems, selecting from the cluster systems, a subset of cluster systems closest to adaptation data from a test speaker, transforming the subset of cluster systems to bring the subset of cluster systems closer to the test speaker based on the adaptation data to form adapted cluster systems and combining the adapted cluster systems to create a speaker adapted system for decoding speech from the test speaker. System and methods for building speech recognition systems as well as adapting speaker systems for class-specific speaker clusters are included.
Type: Grant
Filed: February 4, 1998
Date of Patent: June 6, 2000
Assignee: International Business Machines Corporation
Inventors: Yuqing Gao, Mukund Padmanabhan, Michael Alan Picheny
-
Patent number: 6064958
Abstract: A pattern recognition scheme using probabilistic models that are capable of reducing a calculation cost for the output probability while improving a recognition performance even when a number of mixture component distributions of respective states is small, by arranging distributions with low calculation cost and high expressive power as the mixture component distribution. In this pattern recognition scheme, a probability of each probabilistic model expressing features of each recognition category with respect to each input feature vector derived from each input signal is calculated, where the probabilistic model represents a feature parameter subspace in which feature vectors of each recognition category exist and the feature parameter subspace is expressed by using mixture distributions of one-dimensional discrete distributions with arbitrary distribution shapes which are arranged in respective dimensions.
Type: Grant
Filed: September 19, 1997
Date of Patent: May 16, 2000
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Satoshi Takahashi, Shigeki Sagayama
-
Patent number: 6061652
Abstract: An HMM device, and a DP matching device, capable of performing word spotting accurately with a small amount of calculation is provided. For that purpose, a code book is provided in which representative vectors of respective clusters are stored in a form searchable by their labels, wherein in HMM, similarity degrees based on Kullback-Leibler Divergence of distributions of occurrence probabilities of the clusters under respective states and distribution of degrees of input feature vectors to be recognized to the respective clusters are rendered occurrence degrees of the feature vectors from the states and in DP matching, similarity degrees based on Kullback-Leibler Divergence of distributions of membership degrees of feature vectors forming reference patterns to the respective clusters and distribution of degrees of input feature vectors to the respective clusters are rendered inter-frame similarity degrees of frames of the input patterns and frames of corresponding reference patterns.
Type: Grant
Filed: June 18, 1996
Date of Patent: May 9, 2000
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Eiichi Tsuboka, Junichi Nakahashi
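As a rough illustration of the quantity this abstract builds on, the Kullback-Leibler divergence between two discrete distributions over the same set of clusters can be computed as below. The function name, the `eps` floor, and the choice of which distribution plays which role are assumptions for the example; the abstract does not specify them.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) for two discrete distributions over the same clusters.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    (an assumed safeguard) so the logarithm stays finite.
    """
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)
```

A smaller divergence means the two distributions are closer, so a similarity degree could be any decreasing function of this value; the exact mapping used in the patent is not given in the abstract.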
-
Patent number: 6038528
Abstract: The present invention relates to a robust speech processing method and system which models channel and noise variations with affine transforms to reduce mismatched conditions between training and testing. The affine transform relating the training vectors $c_k$ with the vectors for the testing condition $c'_k$ is represented by the form: $c'^{T}_k = A c_k^{T} + b$ for k = 1 to N, in which A is a matrix of predicator coefficients representing noise distortions and vector b represents channel distortions. Alternatively, an affine invariant cepstrum is generated during testing and training for modeling speech to account for noise and channel effects. From the improved speech processing, improved speaker recognition with channel and noise variations is obtained.
Type: Grant
Filed: July 17, 1996
Date of Patent: March 14, 2000
Assignee: T-Netix, Inc.
Inventors: Richard Mammone, Xiaoyu Zhang
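To make the affine relation c' = A c + b concrete, the sketch below applies such a transform to a vector and, under the simplifying assumption that A is diagonal, fits the scale and offset per dimension by least squares from paired training and testing vectors. The function names and the diagonal restriction are assumptions for illustration; the abstract does not describe how A and b are estimated.

```python
def apply_affine(A, b, c):
    """Return A c + b for a matrix A (list of rows) and vectors b, c."""
    return [sum(a_ij * c_j for a_ij, c_j in zip(row, c)) + b_i
            for row, b_i in zip(A, b)]

def fit_diagonal_affine(train, test):
    """Least-squares fit of c' = a * c + b in each dimension.

    Assumes a diagonal A and that each dimension of `train`
    varies across vectors (otherwise the variance is zero).
    """
    dims, n = len(train[0]), len(train)
    scales, offsets = [], []
    for d in range(dims):
        xs = [v[d] for v in train]
        ys = [v[d] for v in test]
        mx, my = sum(xs) / n, sum(ys) / n
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        a = cov / var
        scales.append(a)
        offsets.append(my - a * mx)
    return scales, offsets
```

Given training vectors and their distorted counterparts, `fit_diagonal_affine` recovers the per-dimension scale and offset, which can then be used to map between the two conditions.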
-
Patent number: 6009392
Abstract: A method is provided which trains acoustic models in an automatic speech recognizer ("ASR") without explicitly matching decoded scripts with correct scripts from which acoustic training data is generated. In the method, audio data is input and segmented to produce audio segments. The audio segments are clustered into groups of clustered audio segments such that the clustered audio segments in each of the groups have similar characteristics. Also, the groups respectively form audio similarity classes. Then, audio segment probability distributions for the clustered audio segments in the audio similarity classes are calculated, and audio segment frequencies for the clustered audio segments are determined based on the audio segment probability distributions. The audio segment frequencies are matched to known audio segment frequencies for at least one of letters, combination of letters, and words to determine frequency matches, and a textual corpus of words is formed based on the frequency matches.
Type: Grant
Filed: January 15, 1998
Date of Patent: December 28, 1999
Assignee: International Business Machines Corporation
Inventors: Dimitri Kanevsky, Wlodek Wlodzimierz Zadrozny
-
Patent number: 6009390
Abstract: In a speech recognition system, tied-mixture hidden Markov models (HMMs) are used to match, in the maximum likelihood sense, the phonemes of spoken words given the acoustic input thereof. In a well known manner, such speech recognition requires computation of state observation likelihoods (SOLs). Because of the use of HMMs, each SOL computation involves a substantial number of Gaussian kernels and mixture component weights. In accordance with the invention, the number of Gaussian kernels is cut down to reduce the computational complexity and increase the efficiency of memory access to the kernels. For example, only the non-zero mixture component weights and the Gaussian kernels associated therewith are considered in the SOL computation. In accordance with an aspect of the invention, only a subset of the Gaussian kernels of significant values, regardless of the values of the associated mixture component weights, are considered in the SOL computation.
Type: Grant
Filed: September 11, 1997
Date of Patent: December 28, 1999
Assignee: Lucent Technologies Inc.
Inventors: Sunil K. Gupta, Raziel Haimi-Cohen, Frank K. Soong
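The first pruning idea in this abstract, skipping kernels whose mixture weight is zero, can be illustrated with a small sketch. The diagonal-covariance Gaussian form, the data layout, and the function names are assumptions made for the example rather than details from the patent.

```python
import math

def gaussian(x, mean, var):
    """Diagonal-covariance Gaussian density at x (assumed kernel form)."""
    log_p = 0.0
    for xi, m, v in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
    return math.exp(log_p)

def state_likelihood(x, weights, kernels):
    """Return (sum of w_k * N_k(x), number of kernels evaluated).

    Kernels whose weight is exactly zero are skipped, since they
    cannot change the sum.
    """
    total, evaluated = 0.0, 0
    for w, (mean, var) in zip(weights, kernels):
        if w == 0.0:
            continue
        total += w * gaussian(x, mean, var)
        evaluated += 1
    return total, evaluated
```

With shared kernels and sparse per-state weights, the second return value shows how many density evaluations the skip avoids while the likelihood itself is unchanged.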
-
Patent number: 6006184
Abstract: In a speaker recognition system, a tree-structured reference pattern storing unit has first through M-th node stages each of which has nodes that respectively store a reference pattern of inhibiting speakers. The reference pattern of each node of the (N-1)-th node stage represents acoustic features in the reference patterns of predetermined ones of the nodes of the N-th node stage. An analysis unit analyzes input speech and converts the input speech into feature vectors. A similarities calculating unit calculates similarities between the feature vectors and the reference patterns of all of the inhibiting speakers. An inhibiting speaker selecting unit sorts the similarities and selects a predetermined number of inhibiting speakers.
Type: Grant
Filed: January 28, 1998
Date of Patent: December 21, 1999
Assignee: NEC Corporation
Inventors: Eiko Yamada, Hiroaki Hattori
-
Patent number: 5995930
Abstract: A method and apparatus for processing a sequence of words in a speech signal for speech recognition. The method includes the steps of sampling, at recurrent instants, said speech signal for generating a series of test signals. Signal-by-signal matching and scoring is generated between the test signals and a series of reference signals, where each of the series of reference signals forms one of a plurality of vocabulary words arranged as a vocabulary tree. The vocabulary tree includes a root and a plurality of tree branches wherein any tree branch has a predetermined number of reference signals and is assigned to a speech element and any vocabulary word is assigned to a particular branch junction or branch end. Acoustic recombination determines both continuations of branches and the most probable partial hypotheses within a word because of the use of a vocabulary built up as a tree with branches having reference signals.
Type: Grant
Filed: November 19, 1996
Date of Patent: November 30, 1999
Assignee: U.S. Philips Corporation
Inventors: Reinhold Häb-Umbach, Hermann Ney
-
Patent number: 5987414
Abstract: A vocabulary sub-set is selected from a large speech recognition dictionary. The selected vocabulary sub-set may be used in a real time directory assistance system to improve the system's real-time performance. The selection process is effected on the basis of the cost-benefit ratio, the benefit being measured in savings in operator working time. On the other hand, the cost is measured in terms of hardware limitations, namely processor throughput. Typically, the vocabulary sub-set is limited to a maximum number of orthographies that would enable the system to achieve real-time performance.
Type: Grant
Filed: October 31, 1996
Date of Patent: November 16, 1999
Assignee: Nortel Networks Corporation
Inventors: Michael George Sabourin, Jeff Marcus
-
Patent number: 5983178
Abstract: A speaker clustering apparatus generates HMMs for clusters based on feature quantities of a vocal-tract configuration of speech waveform data, and a speech recognition apparatus provided with the speaker clustering apparatus. In response to the speech waveform data of N speakers, an estimator estimates feature quantities of vocal-tract configurations, with reference to correspondence between vocal-tract configuration parameters and Formant frequencies predetermined based on a predetermined vocal tract model of a standard speaker. Further, a clustering processor calculates speaker-to-speaker distances between the N speakers based on the feature quantities of the vocal-tract configurations of the N speakers as estimated, and clusters the vocal-tract configurations of the N speakers using a clustering algorithm based on calculated speaker-to-speaker distances, thereby generating K clusters.
Type: Grant
Filed: December 10, 1998
Date of Patent: November 9, 1999
Assignee: ATR Interpreting Telecommunications Research Laboratories
Inventors: Masaki Naito, Li Deng, Yoshinori Sagisaka
-
Patent number: 5960396
Abstract: The invention provides a standard pattern production system which produces an optimum recognition unit in terms of an information criterion to given learning data using an information criterion in learning of a standard pattern in pattern recognition. An input pattern production section holds an input pattern, and a standard pattern producing parameter production section calculates and outputs parameters necessary to produce standard patterns of individual categories. A cluster set production section divides a category set into cluster sets. A common standard pattern production section calculates standard patterns of individual clusters of the cluster sets. An optimum cluster selection section receives a plurality of cluster sets and common standard patterns and selects an optimum cluster using an information criterion. A standard pattern storage section stores the common standard pattern of the optimum cluster set as a standard pattern for the individual categories.
Type: Grant
Filed: April 21, 1997
Date of Patent: September 28, 1999
Assignee: NEC Corporation
Inventor: Koichi Shinoda
-
Patent number: 5895448
Abstract: Methods and apparatus for generating and using both speaker dependent and speaker independent garbage models in speaker dependent speech recognition applications are described. The present invention recognizes that in some speech recognition systems, e.g., systems where multiple speech recognition operations are performed on the same signal, it may be desirable to recognize and treat words or phrases in one part of the speech recognition system as garbage or out of vocabulary utterances with the understanding that the very same words or phrases will be recognized and treated as in-vocabulary by another portion of the system. In accordance with the present invention, in systems where both speaker independent and speaker dependent speech recognition operations are performed independently, e.g.
Type: Grant
Filed: April 30, 1997
Date of Patent: April 20, 1999
Assignee: Nynex Science and Technology, Inc.
Inventors: George J. Vysotsky, Vijay R. Raman
-
Patent number: 5890114
Abstract: HMM training method comprising a first parameter predicting step, a centroid state set calculating step, a reconstructing step, a second parameter predicting step and a control step. In the first parameter predicting step, a parameter of an HMM (hidden Markov model) is predicted based on training data. In the centroid state set calculating step, a centroid state set is calculated by clustering the state of said HMM whose parameter is predicted in the first parameter predicting step. In the reconstructing step, an HMM is reconstructed using the centroid state calculated in the centroid state set calculating step. In the second parameter predicting step, a parameter of the HMM reconstructed in the reconstructing step is predicted using the training data. And, the centroid step is reexecuted by the control step in the case that a likelihood of the HMM whose parameter is predicted in the second parameter predicting step does not satisfy a predetermined condition.
Type: Grant
Filed: February 28, 1997
Date of Patent: March 30, 1999
Assignee: Oki Electric Industry Co., Ltd.
Inventor: Jie Yi
-
Patent number: 5890110
Abstract: A variable dimension vector quantization method that uses a single "universal" codebook. The method can be given the interpretation of sampling full-dimensioned codevectors in the universal codebook and generating subcodevectors of the same dimension as the input data subvector, which dimension may vary in time. A subcodevector is selected from the codebook to have minimum distortion between it and the input data subvector. The subcodevector with minimum distortion corresponds to the representative, full-dimensioned codevector in the codebook. The codebook is designed by inverse sampling of training subvectors to obtain full-dimension vectors, then iteratively clustering the training set until a stable centroid vector is obtained.
Type: Grant
Filed: March 27, 1995
Date of Patent: March 30, 1999
Assignee: The Regents of the University of California
Inventors: Allen Gersho, Amitava Das, Ajit Venkat Rao
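The encoding side of this scheme (sample each full-dimension codevector down to the input's dimension, then pick the one with minimum distortion) might be sketched as follows. The evenly spaced sampling rule, the squared-error distortion, and the function names are assumptions for illustration; the abstract does not fix them.

```python
def subsample(codevector, dim):
    """Pick `dim` evenly spaced components (assumed sampling rule).

    Requires 1 <= dim <= len(codevector).
    """
    full = len(codevector)
    return [codevector[(i * full) // dim] for i in range(dim)]

def quantize(subvector, codebook):
    """Return the index of the full-dimension codevector whose sampled
    subcodevector has the smallest squared error to `subvector`."""
    def distortion(cv):
        sampled = subsample(cv, len(subvector))
        return sum((s - x) ** 2 for s, x in zip(sampled, subvector))
    return min(range(len(codebook)), key=lambda k: distortion(codebook[k]))
```

Because the returned index refers to a full-dimension codevector, the same universal codebook serves inputs of any dimension up to the codebook's own.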
-
Patent number: 5884258
Abstract: A method and system for editing words that have been misrecognized. The system allows a speaker to specify a number of alternative words to be displayed in a correction window by resizing the correction window. The system also displays the words in the correction window in alphabetical order. A preferred system eliminates the possibility, when a misrecognized word is respoken, that the respoken utterance will be again recognized as the same misrecognized word. The system, when operating with a word processor, allows the speaker to specify the amount of speech that is buffered before transferring to the word processor.
Type: Grant
Filed: October 31, 1996
Date of Patent: March 16, 1999
Assignee: Microsoft Corporation
Inventors: Michael J. Rozak, Fileno A. Alleva
-
Patent number: 5875425
Abstract: A speech recognition system for recognizing a system user's speech can shorten a recognition period by reducing the amount of necessary calculations without deteriorating the accuracy rate of recognition. The speech recognition system successively calculates statistical probabilities of acoustic models, outputs a one sentence recognition result corresponding to acoustic models having the highest reliability when the one sentence is detected and stops the following calculations.
Type: Grant
Filed: December 23, 1996
Date of Patent: February 23, 1999
Assignee: Kokusai Denshin Denwa Co., Ltd.
Inventors: Makoto Nakamura, Naomi Inoue, Fumihiro Yato, Seiichi Yamamoto
-
Patent number: 5864807
Abstract: A method and apparatus for training a system to assess the identity of a person through the audio characteristics of their voice. The system inserts an audio input (10) into an A/D Converter (20) for processing in a digital signal processor (30). The system then applies neural network type processing by using a polynomial pattern classifier (60) for training the speaker recognition system.
Type: Grant
Filed: February 25, 1997
Date of Patent: January 26, 1999
Assignee: Motorola, Inc.
Inventors: William Michael Campbell, Khaled Talal Assaleh
-
Patent number: 5860063
Abstract: A system and method for automated task selection is provided where a selected task is identified from the natural speech of the user making the selection. The system and method incorporate the selection of meaningful phrases through the use of a test for significance. The selected meaningful phrases are then clustered. The meaningful phrase clusters are input to a speech recognizer that determines whether any meaningful phrase clusters are present in the input speech. Task-type decisions are then made on the basis of the recognized meaningful phrase clusters.
Type: Grant
Filed: July 11, 1997
Date of Patent: January 12, 1999
Assignee: AT&T Corp
Inventors: Allen Louis Gorin, Jeremy Huntley Wright
-
Patent number: 5854999
Abstract: Compensatory values for compensating a reference pattern to match with an utterance environment of an input speech are employed for determining an environmental variation index to be input to a secondary matching controller, which is responsible for magnitudes of the index smaller than a threshold to hold a second matching section inoperative so that a recognition result of a primary matching of a previous compensated reference pattern is output, and for magnitudes of the index larger than the threshold to operate the second matching section to output a recognition result of a second matching based on a current compensated reference pattern to be stored as a subsequent reference pattern.
Type: Grant
Filed: June 24, 1996
Date of Patent: December 29, 1998
Assignee: NEC Corporation
Inventor: Hiroshi Hirayama
-
Patent number: 5852804
Abstract: A speech recognizing apparatus compares a speech command from a user with one of registration patterns stored in a storage unit in turn. Then if the speech command coincides with one of the registration patterns, the speech recognizing apparatus controls a predetermined electronic apparatus associated with an operation related to the registration pattern. If the speech command does not coincide with any one of the registration patterns, the speech recognizing apparatus stores into a memory the speech command as a new registration pattern in which the speech command is related to a manipulation of the electronic apparatus produced by the user immediately after the speech command is produced.
Type: Grant
Filed: April 11, 1997
Date of Patent: December 22, 1998
Assignee: Fujitsu Limited
Inventor: Kazuya Sako
-
Patent number: 5819221
Abstract: Improved speech recognition is achieved according to the present invention by use of between word and/or between phrase coarticulation. The increase in the number of phonetic models required to model this additional vocabulary is reduced by clustering (19, 20) the inter-word/phrase models and grammar into only a few classes. By using one class for consonant inter-word context and two classes for vowel contexts, the accuracy for Japanese was almost as good as for unclustered models while the number of models was reduced by more than half.
Type: Grant
Filed: August 31, 1994
Date of Patent: October 6, 1998
Assignee: Texas Instruments Incorporated
Inventors: Kazuhiro Kondo, Ikuo Kudo, Yu-Hung Kao, Barbara J. Wheatley
-
Patent number: 5812975
Abstract: A method of designing a state transition model capable of high speed voice recognition and a voice recognition method and apparatus using the state transition model is provided. The methods provide a state transition model in which a state shared structure of the state transition model is designed. The method includes a step of setting the states of a triphone state transition model in an acoustic space as initial clusters, a clustering step of generating a cluster containing the initial clusters by top-down clustering, a step of determining a state shared structure by assigning a short distance cluster among clusters generated by the clustering step, to the state transition model, and a step of learning a state shared model by analyzing the states of the triphones in accordance with the determined state shared structure.
Type: Grant
Filed: June 18, 1996
Date of Patent: September 22, 1998
Assignee: Canon Kabushiki Kaisha
Inventors: Yasuhiro Komori, Yasunori Ohora
-
Patent number: 5806029
Abstract: Hierarchical signal bias removal (HSBR) signal conditioning uses a codebook constructed from the set of recognition models and is updated as the recognition models are modified during recognition model training. As a result, HSBR signal conditioning and recognition model training are based on the same set of recognition model parameters, which provides significant reduction in recognition error rate for the speech recognition system.
Type: Grant
Filed: September 15, 1995
Date of Patent: September 8, 1998
Assignee: AT&T Corp
Inventors: Eric Rolfe Buhrke, Wu Chou, Mazin G. Rahim
-
Patent number: 5806030
Abstract: The clustering technique produces a low complexity and yet high accuracy speech representation for use with speech recognizers. The task database comprising the test speech to be modeled is segmented into subword units such as phonemes and labeled to indicate each phoneme in its left and right context (triphones). Hidden Markov Models are constructed for each context-independent phoneme and trained. Then the center states are tied for all phonemes of the same class. Triphones are trained and all poorly-trained models are eliminated by merging their training data with the nearest well-trained model using a weighted divergence computation to ascertain distance. Before merging, the threshold for each class is adjusted until the number of good models for each phoneme class is within predetermined upper and lower limits. Finally, if desired, the number of mixture components used to represent each model may be increased and the models retrained. This latter step increases the accuracy.
Type: Grant
Filed: May 6, 1996
Date of Patent: September 8, 1998
Inventor: Jean-Claude Junqua
-
Patent number: 5787394
Abstract: A system and method for adaptation of a speaker independent speech recognition system for use by a particular user. The system and method gather acoustic characterization data from a test speaker and compare the data with acoustic characterization data generated for a plurality of training speakers. A match score is computed between the test speaker's acoustic characterization for a particular acoustic subspace and each training speaker's acoustic characterization for the same acoustic subspace. The training speakers are ranked for the subspace according to their scores and a new acoustic model is generated for the test speaker based upon the test speaker's acoustic characterization data and the acoustic characterization data of the closest matching training speakers. The process is repeated for each acoustic subspace.
Type: Grant
Filed: December 13, 1995
Date of Patent: July 28, 1998
Assignee: International Business Machines Corporation
Inventors: Lalit Rai Bahl, Ponani Gopalakrishnan, David Nahamoo, Mukund Padmanabhan
-
Patent number: 5787395
Abstract: A voice recognizing method in which a plurality of voice recognition objective words are provided. Scores are accumulated for an unknown input voice signal as compared to the voice recognition objective words by using parameters which are calculated in advance. Upon receipt of an unknown voice signal, a corresponding voice recognition objective word is extracted and recognized. The voice recognition objective words are structured into an overlapping hierarchical structure by using correlation values between each pair of voice recognition objective words. This correlation may be computed from acoustic features, HMM parameters or the like. Score calculation is performed on the unknown input voice signal by using a dictionary of the voice recognition objective words structured in the hierarchical structure. Upon preliminary recognition, the dictionary of the voice recognition objective words is resorted without recalculation of the correlation values.
Type: Grant
Filed: July 18, 1996
Date of Patent: July 28, 1998
Assignee: Sony Corporation
Inventor: Katsuki Minamino
-
Patent number: 5778336
Abstract: A joint data (features) and channel (bias) estimation framework for robust processing of speech received over a channel is described. A trellis encoded vector quantizer is used as a pre-processor to estimate the channel bias using blind maximum likelihood sequence estimation. Sequential constraint in the feature vector sequence of a speech signal is applied for the selection of the quantized signal constellation and for the decoding process in joint data and channel estimation. A two state trellis encoded vector quantizer is designed for signal bias removal applications.
Type: Grant
Filed: October 1, 1996
Date of Patent: July 7, 1998
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Nambirajan Seshadri
-
Patent number: 5749072
Abstract: A communications device (20) that is responsive to voice commands is provided. The communications device (20) can be a two-way radio, cellular telephone, PDA, or pager. The communications device (20) includes an interface (22) for allowing a user to access a communications channel according to a control signal and a speech-recognition system (24) for producing the control signal in response to a voice command. Included in the speech recognition system (24) are a feature extractor (26) and one or more classifiers (28) utilizing polynomial discriminant functions.
Type: Grant
Filed: December 28, 1995
Date of Patent: May 5, 1998
Assignee: Motorola Inc.
Inventors: Theodore Mazurkiewicz, Gil E. Levendel, Shay-Ping Thomas Wang
-
Patent number: 5749066
Abstract: An automated speech recognition system converts a speech signal into a compact, coded representation that correlates to a speech phoneme set. A number of different neural network pattern matching schemes may be used to perform the necessary speech coding. An integrated user interface guides a user unfamiliar with the details of speech recognition or neural networks to quickly develop and test a neural network for phoneme recognition. To train the neural network, digitized voice data containing known phonemes that the user wants the neural network to ultimately recognize are processed by the integrated user interface. The digitized speech is segmented into phonemes with each segment being labelled with a corresponding phoneme code. Based on a user selected transformation method and transformation parameters, each segment is transformed into a series of multiple dimension vectors representative of the speech characteristics of that segment.
Type: Grant
Filed: April 24, 1995
Date of Patent: May 5, 1998
Assignee: Ericsson Messaging Systems Inc.
Inventor: Paul A. Nussbaum
-
Patent number: 5664058
Abstract: To train a speech recognizer, a new voice message (one or a few isolated words), after being spoken by a user, is converted into a token. The token is then compared with a plurality of templates stored in the recognizer and a recognition score is obtained each time. The templates previously stored in the recognizer include templates for previously trained voice messages and one or more previously formed templates of the new voice message. Three tests are applied to the recognition scores to determine if the token and one of the previously formed templates of the new voice message can become paradigm templates, if the new voice message is too close in pronunciation to a voice message the recognizer has been previously trained to recognize, or if the user should repeat the new voice message to form another token. This training procedure provides a certain level of automatic control over the training process of a speaker dependent speech recognizer in an otherwise unsupervised environment.
Type: Grant
Filed: May 12, 1993
Date of Patent: September 2, 1997
Assignee: NYNEX Science & Technology
Inventor: George Vysotsky