Probability Patents (Class 704/240)

Decoding multiple HMM sets using a single sentence grammar

Patent number: 7269558

Abstract: For a given sentence grammar, speech recognizers are often required to decode M sets of HMMs, each of which models a specific acoustic environment. In order to match input acoustic observations to each of the environments, recognition search methods typically require a network of M sub-networks. A new speech recognition search method is described here, which needs a network that is only the size of a single sub-network and yet gives the same recognition performance, thus reducing the memory requirement for network storage by (M-1)/M.

Type: Grant

Filed: July 26, 2001

Date of Patent: September 11, 2007

Assignee: Texas Instruments Incorporated

Inventor: Yifan Gong
Pattern recognition

Patent number: 7269556

Abstract: Pattern recognition, wherein a sequence of feature vectors is formed from a digitized incoming signal, the feature vectors comprising feature vector components, and at least one feature vector is compared with templates of candidate patterns by computing a distortion measure. A control signal based on at least one time-dependent variable of the recognition process is formulated, and the distortion measure is computed using only a subset of the vector components of the feature vector, the subset being chosen in accordance with said control signal. This reduces the computational complexity of the computation, as the dimensionality of the vectors involved in the computation is effectively reduced. Although such a dimension reduction decreases the computational need, it has been found not to significantly impair the classification performance.

Type: Grant

Filed: March 26, 2003

Date of Patent: September 11, 2007

Assignee: Nokia Corporation

Inventors: Imre Kiss, Marcel Vasilache
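The abstract of patent 7269556 above describes computing a distortion measure over only a subset of feature vector components, with the subset chosen by a control signal. A minimal sketch of that idea follows. The squared Euclidean distortion, the `load` control variable and the `choose_subset` rule are all assumptions made for illustration; the abstract does not specify them.

```python
def subset_distortion(feature, template, active):
    """Squared Euclidean distortion over the component indices in `active` only."""
    return sum((feature[i] - template[i]) ** 2 for i in active)


def choose_subset(dim, load):
    """Hypothetical control rule: keep fewer leading components as `load` (0..1) rises."""
    keep = max(1, round(dim * (1.0 - load)))
    return range(keep)


feature = [1.0, 2.0, 3.0, 4.0]
template = [0.0, 0.0, 0.0, 0.0]

full = subset_distortion(feature, template, range(len(feature)))
reduced = subset_distortion(feature, template, choose_subset(len(feature), load=0.5))
```

With half the components dropped, the loop does half the work per template comparison, which is the saving the abstract refers to.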
Method and system for learning linguistically valid word pronunciations from acoustic data

Patent number: 7266495

Abstract: A computerized pronunciation system is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The system includes a word list including at least one word; transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform; a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including: sets of initial pronunciations of the word, a scoring module configured to score pronunciations and to generate phone probabilities, and a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and a pronunciation dictionary configured to receive the highest-scoring set of initial pronunciations and the set of alternate pronunciations.

Type: Grant

Filed: September 12, 2003

Date of Patent: September 4, 2007

Assignee: Nuance Communications, Inc.

Inventors: Francoise Beaufays, Ananth Sankar, Mitchel Weintraub, Shaun Williams
Training machine learning by sequential conditional generalized iterative scaling

Patent number: 7266492

Abstract: A system and method facilitating training machine learning systems utilizing sequential conditional generalized iterative scaling is provided. The invention includes an expected value update component that modifies an expected value based, at least in part, upon a feature function of an input vector and an output value, a sum of lambda variable and a normalization variable. The invention further includes an error calculator that calculates an error based, at least in part, upon the expected value and an observed value. The invention also includes a parameter update component that modifies a trainable parameter based, at least in part, upon the error. A variable update component that updates at least one of the sum of lambda variable and the normalization variable based, at least in part, upon the error is also provided.

Type: Grant

Filed: August 16, 2006

Date of Patent: September 4, 2007

Assignee: Microsoft Corporation

Inventor: Joshua Theodore Goodman
Method and apparatus for identifying noise environments from noisy signals

Patent number: 7266494

Abstract: A method and apparatus are provided for identifying a noise environment for a frame of an input signal based on at least one feature for that frame. To identify the noise environment, a probability for a noise environment is determined by applying the noisy input feature vector to a distribution of noisy training feature vectors. In one embodiment, each noisy training feature vector in the distribution is formed by modifying a set of clean training feature vectors. In one embodiment, the probabilities of the noise environments for past frames are included in the identification of an environment for a current frame. In one embodiment, a correction vector is then selected based on the identified noise environment.

Type: Grant

Filed: November 10, 2004

Date of Patent: September 4, 2007

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Li Deng
Robust detection and classification of objects in audio using limited training data

Patent number: 7263485

Abstract: A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio segment comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally, the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).

Type: Grant

Filed: May 28, 2003

Date of Patent: August 28, 2007

Assignee: Canon Kabushiki Kaisha

Inventor: Timothy John Wark
Method, apparatus, and medium for measuring confidence about speech recognition in speech recognizer

Publication number: 20070185712

Abstract: A method of measuring confidence of speech recognition in a speech recognizer compares a phase change point with a phoneme string change point and uses a difference between the phase change point and the phoneme string change point together with a likelihood ratio; an apparatus using the method is also provided. That is, the method of the present invention includes detecting a phase change point of a speech signal; detecting a phoneme string change point according to a result of speech recognition; and calculating confidence of the speech recognition by using a difference between the detected phase change point and phoneme string change point. According to the present invention, the performance of confidence measurement may be improved by taking into consideration not only a likelihood ratio but also the result of comparing a phase change point with a phoneme string change point.

Type: Application

Filed: June 30, 2006

Publication date: August 9, 2007

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jae-Hoon Jeong, Kwang Cheol Oh
Method and apparatus for distribution-based language model adaptation

Patent number: 7254529

Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e. task-specific training data set) and the relative frequency of n-grams in a large training set (i.e. out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.

Type: Grant

Filed: September 13, 2005

Date of Patent: August 7, 2007

Assignee: Microsoft Corporation

Inventors: Jianfeng Gao, Mingjing Li
Nonlinear mapping for feature extraction in automatic speech recognition

Patent number: 7254538

Abstract: The present invention successfully combines neural-net discriminative feature processing with Gaussian-mixture distribution modeling (GMM). By training one or more neural networks to generate subword probability posteriors, then using transformations of these estimates as the base features for a conventionally-trained Gaussian-mixture based system, substantial error rate reductions may be achieved. The present invention effectively has two acoustic models in tandem: first a neural net and then a GMM. By using a variety of combination schemes available for connectionist models, various systems based upon multiple feature streams can be constructed with even greater error rate reductions.

Type: Grant

Filed: November 16, 2000

Date of Patent: August 7, 2007

Assignee: International Computer Science Institute

Inventors: Hynek Hermansky, Sangita Sharma, Daniel Ellis
Distribution goodness-of-fit test device, consumable goods supply timing judgment device, image forming device, distribution goodness-of-fit test method and distribution goodness-of-fit test program

Patent number: 7231315

Abstract: A distribution goodness-of-fit test device for testing whether measured data matches an estimated probability distribution has a counting section determination unit, a counting unit and a goodness-of-fit test unit. The counting section determination unit determines according to the number of the measured data, widths of counting sections for counting the measured data. The counting unit counts the numbers of data in the respective determined counting sections. Also, the goodness-of-fit test unit performs a goodness-of-fit test based on the numbers of data in the respective counting sections.

Type: Grant

Filed: December 3, 2004

Date of Patent: June 12, 2007

Assignee: Fuji Xerox Co., Ltd.

Inventor: Masakazu Fujimoto
Speech recognition system having multiple speech recognizers

Patent number: 7228275

Abstract: A speech recognition system recognizes an input speech signal by using a first speech recognizer and a second speech recognizer each coupled to a decision module. Each of the first and second speech recognizers outputs first and second recognized speech texts and first and second associated confidence scores, respectively, and the decision module selects either the first or the second speech text depending upon which of the first or second confidence score is higher. The decision module may also adjust the first and second confidence scores to generate first and second adjusted confidence scores, respectively, and select either the first or second speech text depending upon which of the first or second adjusted confidence scores is higher. The first and second confidence scores may be adjusted based upon the location of a speaker, the identity or accent of the speaker, the context of the speech, and the like.

Type: Grant

Filed: January 13, 2003

Date of Patent: June 5, 2007

Assignees: Toyota InfoTechnology Center Co., Ltd., iAnywhere Solutions, Inc.

Inventors: Norikazu Endo, John R. Brookes, Benjamin K. Reaves, Babak Hodjat, Masahiko Funaki
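The decision module described in the abstract of patent 7228275 above picks whichever recognizer reports the higher (optionally adjusted) confidence score. The sketch below illustrates that selection. The `Hypothesis` type, the additive `bias` adjustment and the tie-breaking rule (ties go to the first recognizer) are assumptions for illustration, not details given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str          # recognized speech text
    confidence: float  # confidence score reported by the recognizer


def select(first, second, first_bias=0.0, second_bias=0.0):
    """Return the hypothesis with the higher adjusted confidence.

    The biases stand in for adjustments based on speaker location,
    accent, context and the like (assumed here to be additive).
    """
    adjusted_first = first.confidence + first_bias
    adjusted_second = second.confidence + second_bias
    return first if adjusted_first >= adjusted_second else second


a = Hypothesis("call home", 0.6)
b = Hypothesis("call Rome", 0.7)

plain_choice = select(a, b)                     # picks b: 0.7 > 0.6
biased_choice = select(a, b, first_bias=0.2)    # picks a once its score is raised
```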
Automatic pronunciation scoring for language learning

Patent number: 7219059

Abstract: A method and apparatus for generating a pronunciation score by receiving a user phrase intended to conform to a reference phrase and processing the user phrase in accordance with at least one of an articulation-scoring engine, a duration scoring engine and an intonation-scoring engine to derive thereby the pronunciation score.

Type: Grant

Filed: July 3, 2002

Date of Patent: May 15, 2007

Assignee: Lucent Technologies Inc.

Inventors: Sunil K. Gupta, Ziyi Lu, Fengguang Zhao
Lattice-based unsupervised maximum likelihood linear regression for speaker adaptation

Patent number: 7216077

Abstract: Methods and arrangements using lattice-based information for unsupervised speaker adaptation. By performing adaptation against a word lattice, correct models are more likely to be used in estimating a transform. Further, a particular type of lattice proposed herein enables the use of a natural confidence measure given by the posterior occupancy probability of a state, that is, the statistics of a particular state will be updated with the current frame only if the a posteriori probability of the state at that particular time is greater than a predetermined threshold.

Type: Grant

Filed: September 26, 2000

Date of Patent: May 8, 2007

Assignee: International Business Machines Corporation

Inventors: Mukund Padmanabhan, George A. Saon, Geoffrey G. Zweig
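The abstract of patent 7216077 above says that a state's statistics are updated with the current frame only if the state's posterior probability at that time exceeds a predetermined threshold. A minimal sketch of that gating step follows. Scalar frames, the dictionary layout of `posteriors`, the posterior-weighted accumulation and the threshold value 0.9 are assumptions chosen for brevity.

```python
def accumulate(frames, posteriors, threshold=0.9):
    """Accumulate per-state statistics from frames whose posterior passes the threshold.

    frames:      list of scalar observations (real features would be vectors)
    posteriors:  posteriors[t][state] = posterior occupancy of `state` at frame t
    Returns {state: (occupancy_sum, weighted_observation_sum)}.
    """
    stats = {}
    for x, post in zip(frames, posteriors):
        for state, gamma in post.items():
            if gamma > threshold:  # confidence gate from the abstract
                occ, total = stats.get(state, (0.0, 0.0))
                stats[state] = (occ + gamma, total + gamma * x)
    return stats


frames = [1.0, 2.0]
posteriors = [{"a": 0.95, "b": 0.05}, {"a": 0.5, "b": 0.5}]
stats = accumulate(frames, posteriors)
```

Only the first frame contributes here, and only to state `a`; the ambiguous second frame is ignored, so it cannot pull the transform estimate toward a wrong model.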
Excitation codebook search method in a speech coding system

Patent number: 7206739

Abstract: A method for searching an excitation (or fixed) codebook in a speech coding system. In a speech coding system including a synthesis filter for synthesizing a speech signal, a fixed codebook searcher according to the present invention segments a speech signal frame into a plurality of subframes to generate an excitation signal to be used in a synthesis filter, segments again each of the subframes into a plurality of subgroups, and searches the respective subframes each comprised of a plurality of pulse position/amplitude combinations for pulses. The fixed codebook searcher searches the respective subgroups for a predetermined number of pulses having non-zero amplitude, and generates the searched pulses as an initial vector. Next, the fixed codebook searcher selects a pulse combination including at least one pulse among the pulses of the initial vector, and then substitutes pulses of the selected pulse combination for pulses in other positions in the subgroups.

Type: Grant

Filed: May 23, 2002

Date of Patent: April 17, 2007

Assignee: Samsung Electronics Co., Ltd.

Inventor: Dae-Ryong Lee
Method of determining uncertainty associated with acoustic distortion-based noise reduction

Patent number: 7174292

Abstract: A method and apparatus are provided for determining uncertainty in noise reduction based on a parametric model of speech distortion. The method is first used to reduce noise in a noisy signal. In particular, noise is reduced from a representation of a portion of a noisy signal to produce a representation of a cleaned signal by utilizing an acoustic environment model. The uncertainty associated with the noise reduction process is then computed. In one embodiment, the uncertainty of the noise reduction process is used, in conjunction with the noise-reduced signal, to decode a pattern state.

Type: Grant

Filed: September 5, 2002

Date of Patent: February 6, 2007

Assignee: Microsoft Corporation

Inventors: Li Deng, Alejandro Acero, James G. Droppo
Speech recognition over lossy networks with rejection threshold

Patent number: 7171359

Abstract: Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. Potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.

Type: Grant

Filed: July 29, 2004

Date of Patent: January 30, 2007

Assignee: AT&T Corp.

Inventors: Richard Vandervoort Cox, Stephen Michael Marcus, Mazin G. Rahim, Nambirajan Seshadri, Robert Douglas Sharp
Compression of language model structures and word identifiers for automated speech recognition systems

Patent number: 7171358

Abstract: A method compresses one or more ordered arrays of integer values. The integer values can represent a vocabulary of a language model, in the form of an N-gram, of an automated speech recognition system. For each ordered array A[.] to be compressed, an inverse array I[.] is defined. One or more split inverse arrays are also defined for each ordered array. The minimum and optimum number of bits required to store the array A[.] in terms of the split arrays and split inverse arrays are determined. Then, the original array is stored in such a way that the total amount of memory used is minimized.

Type: Grant

Filed: January 13, 2003

Date of Patent: January 30, 2007

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Edward W. D. Whittaker, Bhiksha Ramakrishnan
Speech processing apparatus and method using confidence scores

Patent number: 7165031

Abstract: A speech recognition method and apparatus are disclosed which output a confidence score indicative of the posterior probability of an utterance being correctly matched to a word model. The confidence score for the matching of an utterance to a word model is determined directly from the generated values indicative of the goodness of match between the utterance and stored word models utilizing the following equation: $\mathrm{confidence} = \frac{\exp(-2\sigma S(x|w))}{\sum_{\mathrm{words}} \exp(-2\sigma S(x|w))}$, where S(x|w) is the match score for the correlation between a signal x and word w and $\sigma$ is an experimentally determined constant.

Type: Grant

Filed: November 6, 2002

Date of Patent: January 16, 2007

Assignee: Canon Kabushiki Kaisha

Inventor: David Llewellyn Rees
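Reading the equation in the abstract of patent 7165031 above as a normalised exponential over all word models, the confidence score can be sketched as below. The name `sigma` for the experimentally determined constant, the use of the same exponent in numerator and denominator, and the convention that a lower match score means a better match are assumptions. Subtracting the minimum score before exponentiating is an added numerical-stability step that does not change the result.

```python
import math


def confidences(scores, sigma):
    """Map match scores S(x|w) to confidences that sum to 1 over all words.

    scores: {word: match score}, lower assumed to mean a closer match
    sigma:  tuning constant (name assumed for illustration)
    """
    best = min(scores.values())  # shift scores so the largest exponent is 0
    weights = {w: math.exp(-2.0 * sigma * (s - best)) for w, s in scores.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}


equal = confidences({"yes": 1.0, "no": 1.0}, sigma=0.5)
skewed = confidences({"yes": 1.0, "no": 3.0}, sigma=0.5)
```

Two equally good matches share the confidence evenly, while a clearly better match for one word pushes its confidence toward 1.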
Probabalistic networks for detecting signal content

Patent number: 7136813

Abstract: A method and apparatus using a probabilistic network to estimate probability values each representing a probability that at least part of a signal represents content, such as voice activity, and to combine the probability values into an overall probability value. The invention may conform itself to particular system and/or signal characteristics by using some probability estimates and discarding other probability estimates.

Type: Grant

Filed: September 25, 2001

Date of Patent: November 14, 2006

Assignee: Intel Corporation

Inventors: Maxim Likhachev, Murat Eren
Method and apparatus using spectral addition for speaker recognition

Patent number: 7133826

Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.

Type: Grant

Filed: February 24, 2005

Date of Patent: November 7, 2006

Assignee: Microsoft Corporation

Inventors: Xuedong Huang, Michael D. Plumpe
Dynamic semantic control of a speech recognition system

Patent number: 7127393

Abstract: A method and apparatus are provided for automatically recognizing words of spoken speech using a computer-based speech recognition system according to a dynamic semantic model. In an embodiment, the speech recognition system recognizes speech and generates one or more word strings, each of which is a hypothesis of the speech, and creates and stores a probability value or score for each of the word strings. The word strings are ordered by probability value. The speech recognition system also creates and stores, for each of the word strings, one or more keyword-value pairs that represent semantic elements and semantic values of the semantic elements for the speech that was spoken. One or more dynamic semantic rules are defined that specify how a probability value of a word string should be modified based on information about external conditions, facts, or the environment of the application in relation to the semantic values of that word string.

Type: Grant

Filed: February 10, 2003

Date of Patent: October 24, 2006

Assignee: Speech Works International, Inc.

Inventors: Michael S. Phillips, Etienne Barnard, Jean-Guy Dahan, Michael J. Metzger
Training machine learning by sequential conditional generalized iterative scaling

Patent number: 7107207

Abstract: A system and method facilitating training machine learning systems utilizing sequential conditional generalized iterative scaling is provided. The invention includes an expected value update component that modifies an expected value based, at least in part, upon a feature function of an input vector and an output value, a sum of lambda variable and a normalization variable. The invention further includes an error calculator that calculates an error based, at least in part, upon the expected value and an observed value. The invention also includes a parameter update component that modifies a trainable parameter based, at least in part, upon the error. A variable update component that updates at least one of the sum of lambda variable and the normalization variable based, at least in part, upon the error is also provided.

Type: Grant

Filed: June 19, 2002

Date of Patent: September 12, 2006

Assignee: Microsoft Corporation

Inventor: Joshua Theodore Goodman
Automatic recognition of company names in speech utterances

Patent number: 7103546

Abstract: A method of automatic recognition of company names in speech utterances includes generating at least one word sequence hypothesis by a speech recognizer from a speech utterance consisting of one or more words, comparing the word sequence hypothesis with the entries representing company names stored in a database, and selecting, in dependence on the result of the comparison, one company name as a recognition result.

Type: Grant

Filed: August 7, 2001

Date of Patent: September 5, 2006

Assignee: Koninklijke Philips Electronics, N.V.

Inventor: Georg Rose
Method of pattern recognition using noise reduction uncertainty

Patent number: 7103540

Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. In the meantime, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by the amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.

Type: Grant

Filed: May 20, 2002

Date of Patent: September 5, 2006

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Li Deng
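The abstract of patent 7103540 above describes increasing the variance of each Gaussian by the estimated variance of the cleaned signal before scoring. A minimal one-dimensional sketch of that variance inflation follows; the function name and the scalar (single-dimension) Gaussian are assumptions made to keep the example short.

```python
import math


def log_likelihood(x, mean, var, uncertainty=0.0):
    """Log density of x under a 1-D Gaussian whose variance is inflated
    by the noise-removal uncertainty (variance of the cleaned estimate)."""
    v = var + uncertainty
    return -0.5 * (math.log(2.0 * math.pi * v) + (x - mean) ** 2 / v)


# A cleaned observation far from the state mean is penalised less
# when the noise-removal step reports high uncertainty.
confident = log_likelihood(3.0, mean=0.0, var=1.0)
uncertain = log_likelihood(3.0, mean=0.0, var=1.0, uncertainty=3.0)
```

The effect is that unreliable cleaned frames contribute flatter, less decisive scores during decoding.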
System and method for speech verification using a robust confidence measure

Patent number: 7103543

Abstract: The present invention comprises a system and method for speech verification using a robust confidence measure, and includes a speech verifier which compares a confidence measure for a recognized word to a predetermined threshold value in order to determine whether the recognized word is valid, where a recognized word corresponds to a word model that produces a highest recognition score. In accordance with the present invention, the foregoing confidence measure may be calculated using the recognition score for the recognized word, a background score of a worst recognition candidate, and a pseudo filler score that may be based upon selected average recognition scores from an N-best list of recognition candidates.

Type: Grant

Filed: August 13, 2002

Date of Patent: September 5, 2006

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal
Method and apparatus for predicting word error rates from text

Patent number: 7103544

Abstract: A method of modeling a speech recognition system includes decoding a speech signal produced from a training text to produce a sequence of predicted speech units. The training text comprises a sequence of actual speech units that is used with the sequence of predicted speech units to form a confusion model. In further embodiments, the confusion model is used to decode a text to identify an error rate that would be expected if the speech recognition system decoded speech based on the text.

Type: Grant

Filed: June 6, 2005

Date of Patent: September 5, 2006

Assignee: Microsoft Corporation

Inventors: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela J. R. Gunawardana, Ciprian Chelba
Method and apparatus for performing packet loss or frame erasure concealment

Patent number: 7047190

Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.

Type: Grant

Filed: April 19, 2000

Date of Patent: May 16, 2006

Assignee: AT&T Corp.

Inventor: David A. Kapilow
Method and apparatus for distribution-based language model adaptation

Patent number: 7043422

Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e. task-specific training data set) and the relative frequency of n-grams in a large training set (i.e. out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.

Type: Grant

Filed: September 4, 2001

Date of Patent: May 9, 2006

Assignee: Microsoft Corporation

Inventors: Jianfeng Gao, Mingjing Li
Control using multiple speech receptors in an in-vehicle speech recognition system

Patent number: 7016836

Abstract: Disclosed are a speech recognition system which comprises the following components, and a speech recognition method for this speech recognition system. The speech recognition system comprises a plurality of voice pickup sections for picking up uttered voices, a determination section for determining a speech signal suitable for speech recognition from speech signals output from the plurality of voice pickup sections, and a speech recognizer for performing speech recognition based on the speech signal determined by the determination section.

Type: Grant

Filed: August 30, 2000

Date of Patent: March 21, 2006

Assignee: Pioneer Corporation

Inventor: Shoutarou Yoda
Method of phrase verification with probabilistic confidence tagging

Patent number: 7010484

Abstract: A method of phrase verification to verify a phrase not only according to its confidence measures but also according to neighboring concepts and their confidence tags. First, an utterance is received, and the received utterance is parsed to find a concept sequence. Subsequently, a plurality of tag sequences corresponding to the concept sequence is produced. Then, a first score of each of the tag sequences is calculated. Finally, the tag sequence of the highest first score is selected as the most probable tag sequence, and the tags contained therein are selected as the most probable confidence tags, respectively corresponding to the concepts in the concept sequence.

Type: Grant

Filed: December 12, 2001

Date of Patent: March 7, 2006

Assignee: Industrial Technology Research Institute

Inventor: Yi-Chung Lin
Speech processing system

Patent number: 7010483

Abstract: A speech processing system is provided which is operable to receive sets of signal values representative of a speech signal generated by a speech source. The system is operable to determine a measure of the quality of the speech signal by performing a statistical analysis of the received sets of signal values. The system stores data defining a predetermined function derived from a signal model which models the speech source and which defines a probability density function which gives, for a given set of model parameters, the probability that the signal model has those model parameters given that the signal model is assumed to have generated the received set of signal values. The system applies a current set of received signal values to the stored probability density function and then draws samples from it using a Gibbs sampler.

Type: Grant

Filed: May 30, 2001

Date of Patent: March 7, 2006

Assignee: Canon Kabushiki Kaisha

Inventor: Jebu Jacob Rajan
Method and system for predicting understanding errors in automated dialog systems

Patent number: 7003459

Abstract: This invention concerns a method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications. The method includes determining whether a probability of understanding the user's input communication exceeds a first threshold, where if the first threshold is exceeded, further dialog is conducted with the user. Otherwise, the user may be directed to a human for assistance. In another possible embodiment, the method operates as above except that if the probability also exceeds a second threshold, the second threshold being higher than the first, then further dialog is conducted with the user using the current dialog strategy. However, if the probability falls between the first threshold and the second threshold, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user.

Type: Grant

Filed: January 22, 2001

Date of Patent: February 21, 2006

Assignee: AT&T Corp.

Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
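The two-threshold embodiment in the abstract of patent 7003459 above amounts to a three-way decision on the probability of understanding. It can be sketched as follows; the threshold values 0.3 and 0.7 and the action labels are hypothetical, chosen only to make the example concrete.

```python
def next_action(p_understood, low=0.3, high=0.7):
    """Choose a dialog action from the probability that the user was understood.

    low / high are the first and second thresholds (values assumed).
    """
    if p_understood > high:
        return "continue"           # keep the current dialog strategy
    if p_understood > low:
        return "adapt"              # adapt the dialog strategy
    return "transfer_to_human"      # direct the user to a human


actions = [next_action(p) for p in (0.9, 0.5, 0.1)]
```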
Method and apparatus for an adaptive speech recognition system utilizing HMM models

Patent number: 7003460

Abstract: In speech recognition, phonemes of a language are modelled by a hidden Markov model, whereby each status of the hidden Markov model is described by a probability density function. For speech recognition of a modified vocabulary, the probability density function is split into a first and into a second probability density function. As a result thereof, it is possible to compensate variations in the speaking habits of a speaker or to add a new word to the vocabulary of the speech recognition unit and thereby assure that this new word is distinguished with adequate quality from the words already present in the speech recognition unit and is thus recognized.

Type: Grant

Filed: May 3, 1999

Date of Patent: February 21, 2006

Assignee: Siemens Aktiengesellschaft

Inventors: Udo Bub, Harald Höge
Selecting one of multiple speech recognizers in a system based on performance predictions resulting from experience

Patent number: 6996525

Abstract: A method for selecting a speech recognizer from a number of speech recognizers in a speech recognition system. The speech recognition system receives an audio stream from an application and derives enabling information. The speech recognition system then enables at least some of the speech recognizers and receives their results. It derives selection information and uses it to select the best speech recognizer and its results and returns those results back to the application.

Type: Grant

Filed: June 15, 2001

Date of Patent: February 7, 2006

Assignee: Intel Corporation

Inventors: Steven M. Bennett, Andrew V. Anderson
Managing power consumption based on utilization statistics

Patent number: 6996728

Abstract: The present invention, in various embodiments, provides techniques for managing system power. In one embodiment, system compute loads and/or system resources invoked by services running on the system consume power. To better manage power consumption, the spare capacity of a system resource is periodically measured, and if this spare capacity is outside a predefined range, then the resource operation is adjusted, e.g., the CPU speed is increased or decreased, so that the spare capacity is within the range. Further, the spare capacity is kept as close to zero as practical, and this spare capacity is determined based on the statistical distribution of a number of utilization values of the resources, which is also taken periodically. The spare capacity is also calculated based on considerations of the probability that the system resources are saturated.
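
A minimal sketch of the control loop this abstract describes, under stated assumptions: spare capacity is estimated here as one minus (mean utilization plus `k` standard deviations), which is an illustrative choice and not a formula from the patent. The range bounds, step size, and the names `spare_capacity` and `adjust_speed` are likewise hypothetical.

```python
from statistics import mean, pstdev


def spare_capacity(utilizations, k=2.0):
    """Estimate spare capacity from recent utilization samples (fractions of 1.0).

    Assumption: demand is summarized as mean + k * standard deviation.
    """
    return 1.0 - (mean(utilizations) + k * pstdev(utilizations))


def adjust_speed(speed, utilizations, lo=0.0, hi=0.1, step=0.1,
                 min_speed=0.2, max_speed=1.0, k=2.0):
    """Return a new relative CPU speed that nudges spare capacity toward [lo, hi]."""
    spare = spare_capacity(utilizations, k)
    if spare < lo:       # too little headroom: speed the resource up
        return min(max_speed, speed + step)
    if spare > hi:       # too much headroom: slow down to save power
        return max(min_speed, speed - step)
    return speed         # already inside the target range
```

Calling `adjust_speed` once per measurement period mirrors the periodic measurement in the abstract; keeping `lo` at zero and `hi` small reflects the stated goal of holding spare capacity close to zero.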

Type: Grant

Filed: April 26, 2002

Date of Patent: February 7, 2006

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Jitendra K. Singh
Distance measure for probability distribution function of mixture type

Patent number: 6993452

Abstract: In accordance with our invention, for two mixture-type probability distribution functions (PDFs) G and H, $G(x) = \sum_{i=1}^{N} \mu_i g_i(x)$ and $H(x) = \sum_{k=1}^{K} \gamma_k h_k(x)$, where G is a mixture of N component PDFs $g_i(x)$, H is a mixture of K component PDFs $h_k(x)$, and $\mu_i$ and $\gamma_k$ are the corresponding weights, which satisfy $\sum_{i=1}^{N} \mu_i = 1$ and $\sum_{k=1}^{K} \gamma_k = 1$, we define their distance $D_M(G, H)$ as $D_M(G, H) = \min_{w = [\omega_{ik}]} \sum_{i=1}^{N} \sum_{k=1}^{K} \omega_{ik}\, d(g_i, h_k)$, where $d(g_i, h_k)$ is the element distance between component PDFs $g_i$ and $h_k$, and $w$ satisfies $\omega_{ik} \ge 0$ for $1 \le i \le N$, $1 \le k \le K$; $\sum_{k=1}^{K} \omega_{ik} = \mu_i$ for $1 \le i \le N$; and $\sum_{i=1}^{N} \omega_{ik} = \gamma_k$ for $1 \le k \le K$. The application of this definition of distance to various sets of real world data is demonstrated.
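
The minimization in this definition has the form of a transportation-type linear program over the matrix w. The sketch below does not solve that program; it only checks whether a candidate w meets the constraints and evaluates the objective for it, so any feasible w yields an upper bound on the distance. The names `mu` and `gamma` (the two weight vectors), `d` (the matrix of element distances), and all function names are labels chosen for this illustration.

```python
def coupling_cost(w, d):
    """Objective value: sum over i, k of w[i][k] * d[i][k]."""
    return sum(w_ik * d_ik
               for w_row, d_row in zip(w, d)
               for w_ik, d_ik in zip(w_row, d_row))


def is_feasible(w, mu, gamma, tol=1e-9):
    """Check non-negativity plus the row-sum and column-sum constraints."""
    if any(w_ik < -tol for row in w for w_ik in row):
        return False
    rows_ok = all(abs(sum(row) - m) <= tol for row, m in zip(w, mu))
    cols_ok = all(abs(sum(row[k] for row in w) - g) <= tol
                  for k, g in enumerate(gamma))
    return rows_ok and cols_ok


def independent_coupling(mu, gamma):
    """w[i][k] = mu[i] * gamma[k] is feasible whenever both weight vectors sum to 1."""
    return [[m * g for g in gamma] for m in mu]
```

For example, with `mu = gamma = [0.5, 0.5]` and `d = [[0, 1], [1, 0]]`, the independent coupling is feasible but costs more than the diagonal coupling `[[0.5, 0], [0, 0.5]]`; finding the true minimum in general would need a linear-programming solver.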

Type: Grant

Filed: May 4, 2001

Date of Patent: January 31, 2006

Assignee: AT&T Corp.

Inventors: Qian Huang, Zhu Liu
Method and apparatus using spectral addition for speaker recognition

Patent number: 6990446

Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.

Type: Grant

Filed: October 10, 2000

Date of Patent: January 24, 2006

Assignee: Microsoft Corporation

Inventors: Xuedong Huang, Michael D. Plumpe
Method and apparatus for denoising and deverberation using variational inference and strong speech models

Patent number: 6990447

Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.

Type: Grant

Filed: November 15, 2001

Date of Patent: January 24, 2006

Assignee: Microsoft Corporation

Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
Method and apparatus for removing noise from feature vectors

Patent number: 6985858

Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. The method is based on variational inference techniques. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Further aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Additional aspects of the invention include using a variance for the noisy signal feature vector conditioned on fixed values of noise, channel transfer function, and clean speech, when identifying the clean signal feature vector.

Type: Grant

Filed: March 20, 2001

Date of Patent: January 10, 2006

Assignee: Microsoft Corporation

Inventors: Brendan J. Frey, Alejandro Acero, Li Deng
Robust word-spotting system using an intelligibility criterion for reliable keyword detection under adverse and unknown noisy environments

Patent number: 6985859

Abstract: A method and system for spotting words in a speech signal having adverse and unknown noisy environments is provided. The method removes the dynamic bias introduced by the environment (i.e., noise and channel effect) that is specific to each word of the lexicon. The method includes the step of generating a first recognition score based on the speech signal and a lexicon entry for a word. The recognition score tracks an absolute likelihood that the word is in the speech signal. A background score is estimated based on the first recognition score. The method further provides for calculating a confidence score based on a matching ratio between a minimum recognition value and the background score. The method and system can be implemented for any number of words, depending upon the application. The confidence scores therefore track noise-corrected likelihoods that the words are in the speech signal.

Type: Grant

Filed: March 28, 2001

Date of Patent: January 10, 2006

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventor: Philippe R. Morin
Source normalization training for HMM modeling of speech

Patent number: 6980952

Abstract: A maximum likelihood (ML) linear regression (LR) solution to environment normalization is provided where the environment is modeled as a hidden (non-observable) variable. By application of an expectation maximization algorithm and extension of Baum-Welch forward and backward variables (Steps 23a–23d) a source normalization is achieved such that it is not necessary to label a database in terms of environment such as speaker identity, channel, microphone and noise type.

Type: Grant

Filed: June 7, 2000

Date of Patent: December 27, 2005

Assignee: Texas Instruments Incorporated

Inventor: Yifan Gong
Machine apparatus and its driving method, and recorded medium

Patent number: 6980956

Abstract: Drive means for performing a behavior based on a behavioral model prescribing a behavior is controlled, and the behavioral model is changed depending on a predetermined stimulus. Therefore, by giving the stimulus, the behavioral model can be freely changed, and a mechanical device, etc. having an improved amusing element can be realized.

Type: Grant

Filed: January 7, 2000

Date of Patent: December 27, 2005

Assignee: Sony Corporation

Inventors: Tsuyoshi Takagi, Masanori Omote
Attribute-based word modeling

Patent number: 6963837

Abstract: An attribute-based speech recognition system is described. A speech pre-processor receives input speech and produces a sequence of acoustic observations representative of the input speech. A database of context-dependent acoustic models characterize a probability of a given sequence of sounds producing the sequence of acoustic observations. Each acoustic model includes phonetic attributes and suprasegmental non-phonetic attributes. A finite state language model characterizes a probability of a given sequence of words being spoken. A one-pass decoder compares the sequence of acoustic observations to the acoustic models and the language model, and outputs at least one word sequence representative of the input speech.

Type: Grant

Filed: October 6, 2000

Date of Patent: November 8, 2005

Assignee: Multimodal Technologies, Inc.

Inventors: Michael Finke, Jurgen Fritsch, Detleff Koll, Alex Waibel
Method of speech recognition using empirically determined word candidates

Patent number: 6963834

Abstract: A method for performing speech recognition can include determining a recognition result for received user speech. The recognition result can include recognized text and a corresponding confidence score. The confidence score of the recognition result can correspond to a predetermined minimum threshold. If the confidence score does not exceed the predetermined minimum threshold, the user can be presented with at least one empirically determined alternate word candidate corresponding to the recognition result.

Type: Grant

Filed: May 29, 2001

Date of Patent: November 8, 2005

Assignee: International Business Machines Corporation

Inventors: Matthew W. Hartley, James R. Lewis, David E. Reich
Including the category of environmental noise when processing speech signals

Patent number: 6959276

Abstract: A method and apparatus are provided for identifying a noise environment for a frame of an input signal based on at least one feature for that frame. Under one embodiment, the noise environment is identified by determining the probability of each of a set of possible noise environments. For some embodiments, the probabilities of the noise environments for past frames are included in the identification of an environment for a current frame. In one particular embodiment, a count is generated for each environment that indicates the number of past frames for which the environment was the most probable environment. The environment with the highest count is then selected as the environment for the current frame.
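
The counting embodiment in this abstract can be illustrated with a short sketch. It assumes a fixed-length window of recent frames and includes the current frame's winner in the count; both choices, along with the class name `EnvironmentVoter` and the `history` parameter, are assumptions made for illustration.

```python
from collections import Counter, deque
from typing import Mapping


class EnvironmentVoter:
    """Smooth per-frame noise-environment decisions by majority count."""

    def __init__(self, history: int = 10):
        # Keep only the most probable environment of each recent frame.
        self._winners = deque(maxlen=history)

    def update(self, probs: Mapping[str, float]) -> str:
        """Record this frame's most probable environment and return the
        environment with the highest count over the window."""
        self._winners.append(max(probs, key=probs.get))
        return Counter(self._winners).most_common(1)[0][0]
```

With `history=3`, two frames won by "car" followed by one frame won by "office" still select "car", so a single outlier frame does not flip the decision.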

Type: Grant

Filed: September 27, 2001

Date of Patent: October 25, 2005

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Li Deng
Method of speech recognition using variational inference with switching state space models

Patent number: 6931374

Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.

Type: Grant

Filed: April 1, 2003

Date of Patent: August 16, 2005

Assignee: Microsoft Corporation

Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
Determining near-optimal block size for incremental-type expectation maximization (EM) algorithms

Patent number: 6922660

Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
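
The search loop in this abstract amounts to keeping the block size with the greatest measured speed increase. A minimal sketch follows; the `speed_increase` callback is hypothetical and stands in for whatever short, non-converged EM run is used to measure the speed-up for a given block size.

```python
from typing import Callable, Iterable


def pick_block_size(candidates: Iterable[int],
                    speed_increase: Callable[[int], float]) -> int:
    """Return the candidate block size with the largest measured speed increase."""
    best_size = None
    best_gain = float("-inf")
    for size in candidates:
        gain = speed_increase(size)   # e.g. measured on a partial EM run
        if gain > best_gain:          # greatest so far: new target block size
            best_size, best_gain = size, gain
    if best_size is None:
        raise ValueError("no candidate block sizes were supplied")
    return best_size
```

How the candidate block sizes are generated is left open here; the abstract only says the process repeats until no new block sizes can be determined.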

Type: Grant

Filed: December 1, 2000

Date of Patent: July 26, 2005

Assignee: Microsoft Corporation

Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
Method for calculating HMM output probability and speech recognition apparatus

Patent number: 6901365

Abstract: The invention enables even a CPU having low processing performance to find an HMM output probability by simplifying arithmetic operations. The dimensions of an input vector are grouped into several sets, and tables are created for the sets. When an output probability is calculated, codes corresponding to the first dimension to the n-th dimension of the input vector are sequentially obtained, and for each code, by referring to the corresponding table, output values for each table are obtained. By substituting the output values for each table into a formula for finding an output probability, the output probability is found.
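
To illustrate the general idea of replacing arithmetic with table lookups, the sketch below assumes a diagonal-covariance Gaussian output distribution and a single shared set of scalar quantization levels, with one table per dimension. These are simplifying assumptions for illustration; the patent's actual grouping of dimensions into sets and its formula are not reproduced here, and all function names are hypothetical.

```python
import math


def build_tables(mean, var, levels):
    """tables[d][c] = log N(levels[c]; mean[d], var[d]), precomputed once."""
    return [[-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
             for x in levels]
            for m, v in zip(mean, var)]


def quantize(x, levels):
    """Return the code (index) of the quantization level nearest to x."""
    return min(range(len(levels)), key=lambda c: abs(levels[c] - x))


def log_output_prob(vec, tables, levels):
    """Approximate log output probability as a sum of table lookups."""
    return sum(tables[d][quantize(x, levels)] for d, x in enumerate(vec))
```

At run time only a quantization step and additions are needed per dimension; the logarithms and divisions are paid once when the tables are built. The result is exact when every input component falls on a quantization level and approximate otherwise.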

Type: Grant

Filed: September 19, 2001

Date of Patent: May 31, 2005

Assignee: Seiko Epson Corporation

Inventor: Yasunaga Miyazawa
Voice actuation with contextual learning for intelligent machine control

Patent number: 6895380

Abstract: An interactive voice actuated control system for a testing machine such as a tensile testing machine is described. Voice commands are passed through a user-command predictor and integrated with a graphical user interface control panel to allow hands-free operation. The user-command predictor learns operator command patterns on-line and predicts the most likely next action. It assists less experienced operators by recommending the next command, and it adds robustness to the voice command interpreter by verbally asking the operator to repeat unlikely commanded actions. The voice actuated control system applies to industrial machines whose normal operation is characterized by a nonrandom series of commands.
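
A user-command predictor of the kind described in this abstract could be sketched as a first-order model over command pairs. The first-order assumption, the `min_prob` cutoff for asking the operator to repeat a command, and the class and method names are all choices made for this illustration, not details from the patent.

```python
from collections import Counter, defaultdict


class CommandPredictor:
    """Learn which command tends to follow which, one observation at a time."""

    def __init__(self, min_prob: float = 0.1):
        self._next = defaultdict(Counter)   # previous command -> next-command counts
        self._prev = None
        self.min_prob = min_prob

    def observe(self, command: str) -> None:
        """Update the counts on-line with an executed command."""
        if self._prev is not None:
            self._next[self._prev][command] += 1
        self._prev = command

    def probability(self, command: str):
        """Estimated probability of `command` coming next, or None if unknown."""
        counts = self._next.get(self._prev)
        if not counts:
            return None
        return counts[command] / sum(counts.values())

    def predict(self):
        """Most likely next command, or None if nothing has been learned yet."""
        counts = self._next.get(self._prev)
        return counts.most_common(1)[0][0] if counts else None

    def should_confirm(self, command: str) -> bool:
        """True when a recognized command looks unlikely enough to re-ask."""
        p = self.probability(command)
        return p is not None and p < self.min_prob
```

After observing a repeated load, start, stop cycle, `predict()` following a "load" returns "start", and an out-of-pattern "stop" at that point would trigger `should_confirm`.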

Type: Grant

Filed: March 2, 2001

Date of Patent: May 17, 2005

Assignee: Electro Standards Laboratories

Inventor: Raymond Sepe, Jr.
Speech recognition system and method for generating phonetic estimates

Patent number: 6868380

Abstract: A speech recognition system for transforming an acoustic signal into a stream of phonetic estimates includes a frequency analyzer for generating a short-time frequency representation of the acoustic signal. A novelty processor separates background components of the representation from region of interest components of the representation. The output of the novelty processor includes the region of interest components of the representation according to the novelty parameters. An attention processor produces a gating signal as a function of the novelty output according to attention parameters. A coincidence processor produces information regarding co-occurrences between samples of the novelty output over time and frequency. The coincidence processor selectively gates the coincidence output as a function of the gating signal according to one or more coincidence parameters.

Type: Grant

Filed: March 23, 2001

Date of Patent: March 15, 2005

Assignee: Eliza Corporation

Inventor: John Kroeker
