Probability Patents (Class 704/240)
  • Patent number: 7269558
    Abstract: For a given sentence grammar, speech recognizers are often required to decode M sets of HMMs, each of which models a specific acoustic environment. In order to match input acoustic observations to each of the environments, typically recognition search methods require a network of M sub-networks. A new speech recognition search method is described here, which needs a network that is only the size of a single sub-network and yet gives the same recognition performance, thus reducing memory requirement for network storage by (M−1)/M.
    Type: Grant
    Filed: July 26, 2001
    Date of Patent: September 11, 2007
    Assignee: Texas Instruments Incorporated
    Inventor: Yifan Gong
  • Patent number: 7269556
    Abstract: Pattern recognition, wherein a sequence of feature vectors is formed from a digitized incoming signal, the feature vectors comprising feature vector components, and at least one feature vector is compared with templates of candidate patterns by computing a distortion measure. A control signal based on at least one time-dependent variable of the recognition process is formulated, and the distortion measure is computed using only a subset of the vector components of the feature vector, the subset being chosen in accordance with said control signal. This reduces the computational complexity of the computation, as the dimensionality of the vectors involved in the computation is effectively reduced. Although such a dimension reduction decreases the computational need, it has been found not to significantly impair the classification performance.
    Type: Grant
    Filed: March 26, 2003
    Date of Patent: September 11, 2007
    Assignee: Nokia Corporation
    Inventors: Imre Kiss, Marcel Vasilache
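The component-subset idea in the abstract above can be sketched as follows. This is only an illustration: the squared Euclidean distortion and the rule that the control signal is simply the number of leading components to keep (`keep`) are assumptions, since the abstract specifies neither.

```python
def subset_distortion(feature, template, keep):
    """Squared Euclidean distortion over the first `keep` components only.

    `keep` stands in for the control signal (hypothetical): a smaller value
    means fewer vector components enter the computation.
    """
    if len(feature) != len(template):
        raise ValueError("feature and template must have the same length")
    if not 0 < keep <= len(feature):
        raise ValueError("keep must be between 1 and the vector length")
    # Only the selected components contribute, so the cost scales with `keep`.
    return sum((f - t) ** 2 for f, t in zip(feature[:keep], template[:keep]))
```

Calling it with a smaller `keep` touches fewer components, which is where the reduction in computation comes from.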
  • Patent number: 7266495
    Abstract: A computerized pronunciation system is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The system includes a word list including at least one word; transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform; a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including: sets of initial pronunciations of the word, a scoring module configured to score pronunciations and to generate phone probabilities, and a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and a pronunciation dictionary configured to receive the highest-scoring set of initial pronunciations and the set of alternate pronunciations.
    Type: Grant
    Filed: September 12, 2003
    Date of Patent: September 4, 2007
    Assignee: Nuance Communications, Inc.
    Inventors: Francoise Beaufays, Ananth Sankar, Mitchel Weintraub, Shaun Williams
  • Patent number: 7266492
    Abstract: A system and method facilitating training machine learning systems utilizing sequential conditional generalized iterative scaling is provided. The invention includes an expected value update component that modifies an expected value based, at least in part, upon a feature function of an input vector and an output value, a sum of lambda variable and a normalization variable. The invention further includes an error calculator that calculates an error based, at least in part, upon the expected value and an observed value. The invention also includes a parameter update component that modifies a trainable parameter based, at least in part, upon the error. A variable update component that updates at least one of the sum of lambda variable and the normalization variable based, at least in part, upon the error is also provided.
    Type: Grant
    Filed: August 16, 2006
    Date of Patent: September 4, 2007
    Assignee: Microsoft Corporation
    Inventor: Joshua Theodore Goodman
  • Patent number: 7266494
    Abstract: A method and apparatus are provided for identifying a noise environment for a frame of an input signal based on at least one feature for that frame. To identify the noise environment, a probability for a noise environment is determined by applying the noisy input feature vector to a distribution of noisy training feature vectors. In one embodiment, each noisy training feature vector in the distribution is formed by modifying a set of clean training feature vectors. In one embodiment, the probabilities of the noise environments for past frames are included in the identification of an environment for a current frame. In one embodiment, a correction vector is then selected based on the identified noise environment.
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: September 4, 2007
    Assignee: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Li Deng
  • Patent number: 7263485
    Abstract: A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).
    Type: Grant
    Filed: May 28, 2003
    Date of Patent: August 28, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventor: Timothy John Wark
  • Publication number: 20070185712
    Abstract: A method of measuring confidence of speech recognition in a speech recognizer compares a phase change point with a phoneme string change point and uses a difference between the phase change point and the phoneme string change point and a likelihood ratio, and an apparatus using the method is provided. That is, the method of the present invention includes detecting a phase change point of a speech signal; detecting a phoneme string change point according to a result of speech recognition; calculating confidence of the speech recognition by using a difference between the detected phase change point and phoneme string change point. According to the present invention, a performance of measuring confidence may become improved by simultaneously taking not only a likelihood ratio, but also taking a comparison result of a phase change point with a phoneme string change point into consideration.
    Type: Application
    Filed: June 30, 2006
    Publication date: August 9, 2007
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jae-Hoon Jeong, Kwang Cheol Oh
  • Patent number: 7254529
    Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e. task-specific training data set) and the relative frequency of n-grams in a large training set (i.e. out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: August 7, 2007
    Assignee: Microsoft Corporation
    Inventors: Jianfeng Gao, Mingjing Li
  • Patent number: 7254538
    Abstract: The present invention successfully combines neural-net discriminative feature processing with Gaussian-mixture distribution modeling (GMM). By training one or more neural networks to generate subword probability posteriors, then using transformations of these estimates as the base features for a conventionally-trained Gaussian-mixture based system, substantial error rate reductions may be achieved. The present invention effectively has two acoustic models in tandem—first a neural net and then a GMM. By using a variety of combination schemes available for connectionist models, various systems based upon multiple features streams can be constructed with even greater error rate reductions.
    Type: Grant
    Filed: November 16, 2000
    Date of Patent: August 7, 2007
    Assignee: International Computer Science Institute
    Inventors: Hynek Hermansky, Sangita Sharma, Daniel Ellis
  • Patent number: 7231315
    Abstract: A distribution goodness-of-fit test device for testing whether measured data matches an estimated probability distribution has a counting section determination unit, a counting unit and a goodness-of-fit test unit. The counting section determination unit determines according to the number of the measured data, widths of counting sections for counting the measured data. The counting unit counts the numbers of data in the respective determined counting sections. Also, the goodness-of-fit test unit performs a goodness-of-fit test based on the numbers of data in the respective counting sections.
    Type: Grant
    Filed: December 3, 2004
    Date of Patent: June 12, 2007
    Assignee: Fuji Xerox Co., Ltd.
    Inventor: Masakazu Fujimoto
  • Patent number: 7228275
    Abstract: A speech recognition system recognizes an input speech signal by using a first speech recognizer and a second speech recognizer each coupled to a decision module. Each of the first and second speech recognizers outputs first and second recognized speech texts and first and second associated confidence scores, respectively, and the decision module selects either the first or the second speech text depending upon which of the first or second confidence score is higher. The decision module may also adjust the first and second confidence scores to generate first and second adjusted confidence scores, respectively, and select either the first or second speech text depending upon which of the first or second adjusted confidence scores is higher. The first and second confidence scores may be adjusted based upon the location of a speaker, the identity or accent of the speaker, the context of the speech, and the like.
    Type: Grant
    Filed: January 13, 2003
    Date of Patent: June 5, 2007
    Assignees: Toyota InfoTechnology Center Co., Ltd., iAnywhere Solutions, Inc.
    Inventors: Norikazu Endo, John R. Brookes, Benjamin K. Reaves, Babak Hodjat, Masahiko Funaki
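The decision module described above can be sketched in a few lines. This is a minimal illustration, assuming that the confidence adjustment is a simple additive offset per recognizer and that ties go to the first recognizer; the abstract states neither, and the function and parameter names are hypothetical.

```python
def select_text(result_1, result_2, adjust_1=0.0, adjust_2=0.0):
    """Pick the recognized text whose (adjusted) confidence score is higher.

    Each result is a (text, confidence) pair. The additive adjustments are an
    assumed stand-in for context such as speaker location or accent.
    """
    text_1, conf_1 = result_1
    text_2, conf_2 = result_2
    # Ties are resolved in favour of the first recognizer (an assumption).
    if conf_1 + adjust_1 >= conf_2 + adjust_2:
        return text_1
    return text_2
```

With no adjustment the raw scores decide; a positive `adjust_1` can flip the choice toward the first recognizer when, say, it is known to handle the current speaker better.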
  • Patent number: 7219059
    Abstract: A method and apparatus for generating a pronunciation score by receiving a user phrase intended to conform to a reference phrase and processing the user phrase in accordance with at least one of an articulation-scoring engine, a duration scoring engine and an intonation-scoring engine to derive thereby the pronunciation score.
    Type: Grant
    Filed: July 3, 2002
    Date of Patent: May 15, 2007
    Assignee: Lucent Technologies Inc.
    Inventors: Sunil K. Gupta, Ziyi Lu, Fengguang Zhao
  • Patent number: 7216077
    Abstract: Methods and arrangements using lattice-based information for unsupervised speaker adaptation. By performing adaptation against a word lattice, correct models are more likely to be used in estimating a transform. Further, a particular type of lattice proposed herein enables the use of a natural confidence measure given by the posterior occupancy probability of a state, that is, the statistics of a particular state will be updated with the current frame only if the a posteriori probability of the state at that particular time is greater than a predetermined threshold.
    Type: Grant
    Filed: September 26, 2000
    Date of Patent: May 8, 2007
    Assignee: International Business Machines Corporation
    Inventors: Mukund Padmanabhan, George A. Saon, Geoffrey G. Zweig
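The thresholded update described in the abstract above might look like the sketch below. It assumes scalar frame values and statistics weighted by the posterior (a running weight and weighted sum per state); the abstract only says that a state's statistics are updated when its posterior exceeds the threshold, so the weighting and all names here are illustrative.

```python
def accumulate(stats, state, frame_value, posterior, threshold):
    """Add one frame to a state's statistics only if its posterior clears the threshold.

    `stats` maps a state id to (total_weight, weighted_sum). Returns True when
    the frame was used and False when it was skipped.
    """
    if posterior <= threshold:
        return False  # low-confidence frame: leave the statistics untouched
    weight, weighted_sum = stats.get(state, (0.0, 0.0))
    # Posterior-weighted accumulation is an assumption, not taken from the abstract.
    stats[state] = (weight + posterior, weighted_sum + posterior * frame_value)
    return True
```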
  • Patent number: 7206739
    Abstract: A method for searching an excitation (or fixed) codebook in a speech coding system. In a speech coding system including a synthesis filter for synthesizing a speech signal, a fixed codebook searcher according to the present invention segments a speech signal frame into a plurality of subframes to generate an excitation signal to be used in a synthesis filter, segments again each of the subframes into a plurality of subgroups, and searches the respective subframes each comprised of a plurality of pulse position/amplitude combinations for pulses. The fixed codebook searcher searches the respective subgroups for a predetermined number of pulses having non-zero amplitude, and generates the searched pulses as an initial vector. Next, the fixed codebook searcher selects a pulse combination including at least one pulse among the pulses of the initial vector, and then substitutes pulses of the selected pulse combination for pulses in other positions in the subgroups.
    Type: Grant
    Filed: May 23, 2002
    Date of Patent: April 17, 2007
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Dae-Ryong Lee
  • Patent number: 7174292
    Abstract: A method and apparatus are provided for determining uncertainty in noise reduction based on a parametric model of speech distortion. The method is first used to reduce noise in a noisy signal. In particular, noise is reduced from a representation of a portion of a noisy signal to produce a representation of a cleaned signal by utilizing an acoustic environment model. The uncertainty associated with the noise reduction process is then computed. In one embodiment, the uncertainty of the noise reduction process is used, in conjunction with the noise-reduced signal, to decode a pattern state.
    Type: Grant
    Filed: September 5, 2002
    Date of Patent: February 6, 2007
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Alejandro Acero, James G. Droppo
  • Patent number: 7171359
    Abstract: Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. Potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.
    Type: Grant
    Filed: July 29, 2004
    Date of Patent: January 30, 2007
    Assignee: AT&T Corp.
    Inventors: Richard Vandervoort Cox, Stephen Michael Marcus, Mazin G. Rahim, Nambirajan Seshadri, Robert Douglas Sharp
  • Patent number: 7171358
    Abstract: A method compresses one or more ordered arrays of integer values. The integer values can represent a vocabulary of a language model, in the form of an N-gram, of an automated speech recognition system. For each ordered array A[.] to be compressed, an inverse array I[.] is defined. One or more split inverse arrays are also defined for each ordered array. The minimum and optimum number of bits required to store the array A[.] in terms of the split arrays and split inverse arrays are determined. Then, the original array is stored in such a way that the total amount of memory used is minimized.
    Type: Grant
    Filed: January 13, 2003
    Date of Patent: January 30, 2007
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Edward W. D. Whittaker, Bhiksha Ramakrishnan
  • Patent number: 7165031
    Abstract: A speech recognition method and apparatus is disclosed which outputs a confidence score indicative of the posterior probability of an utterance being correctly matched to a word model. The confidence score for the matching of an utterance to a word model is determined directly from the generated values indicative of the goodness of match between the utterance and stored word models utilizing the following equation: $\text{confidence} = \dfrac{\exp(-2\sigma S(x|w))}{\sum_{\text{words}} \exp(-2\sigma S(x|w))}$, where S(x|w) is the match score for the correlation between a signal x and word w and $\sigma$ is an experimentally determined constant.
    Type: Grant
    Filed: November 6, 2002
    Date of Patent: January 16, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventor: David Llewellyn Rees
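The confidence computation in the abstract above can be illustrated with a short sketch. It reads the formula as a normalized exponential with the same negative exponent in numerator and denominator, and treats a lower match score S(x|w) as a better match; both readings, along with the name `sigma` for the experimentally determined constant, are assumptions.

```python
import math

def confidence_scores(match_scores, sigma):
    """Turn per-word match scores S(x|w) into confidences that sum to 1.

    `match_scores` maps each candidate word to its score; `sigma` is the
    tuning constant (name assumed).
    """
    # Subtracting the minimum score avoids overflow and cancels in the ratio.
    s_min = min(match_scores.values())
    weights = {w: math.exp(-2.0 * sigma * (s - s_min))
               for w, s in match_scores.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}
```

A larger `sigma` sharpens the result toward the best-scoring word, while a `sigma` near zero makes all candidates almost equally likely.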
  • Patent number: 7136813
    Abstract: A method and apparatus using a probabilistic network to estimate probability values each representing a probability that at least part of a signal represents content, such as voice activity, and to combine the probability values into an overall probability value. The invention may conform itself to particular system and/or signal characteristics by using some probability estimates and discarding other probability estimates.
    Type: Grant
    Filed: September 25, 2001
    Date of Patent: November 14, 2006
    Assignee: Intel Corporation
    Inventors: Maxim Likhachev, Murat Eren
  • Patent number: 7133826
    Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.
    Type: Grant
    Filed: February 24, 2005
    Date of Patent: November 7, 2006
    Assignee: Microsoft Corporation
    Inventors: Xuedong Huang, Michael D. Plumpe
  • Patent number: 7127393
    Abstract: A method and apparatus are provided for automatically recognizing words of spoken speech using a computer-based speech recognition system according to a dynamic semantic model. In an embodiment, the speech recognition system recognizes speech and generates one or more word strings, each of which is a hypothesis of the speech, and creates and stores a probability value or score for each of the word strings. The word strings are ordered by probability value. The speech recognition system also creates and stores, for each of the word strings, one or more keyword-value pairs that represent semantic elements and semantic values of the semantic elements for the speech that was spoken. One or more dynamic semantic rules are defined that specify how a probability value of a word string should be modified based on information about external conditions, facts, or the environment of the application in relation to the semantic values of that word string.
    Type: Grant
    Filed: February 10, 2003
    Date of Patent: October 24, 2006
    Assignee: Speech Works International, Inc.
    Inventors: Michael S. Phillips, Etienne Barnard, Jean-Guy Dahan, Michael J. Metzger
  • Patent number: 7107207
    Abstract: A system and method facilitating training machine learning systems utilizing sequential conditional generalized iterative scaling is provided. The invention includes an expected value update component that modifies an expected value based, at least in part, upon a feature function of an input vector and an output value, a sum of lambda variable and a normalization variable. The invention further includes an error calculator that calculates an error based, at least in part, upon the expected value and an observed value. The invention also includes a parameter update component that modifies a trainable parameter based, at least in part, upon the error. A variable update component that updates at least one of the sum of lambda variable and the normalization variable based, at least in part, upon the error is also provided.
    Type: Grant
    Filed: June 19, 2002
    Date of Patent: September 12, 2006
    Assignee: Microsoft Corporation
    Inventor: Joshua Theodore Goodman
  • Patent number: 7103546
    Abstract: A method of automatic recognition of company names in speech utterances includes generating at least one word sequence hypothesis by a speech recognizer from a speech utterance consisting of one or more words, comparing the word sequence hypothesis with the entries representing company names stored in a database, and selecting, in dependence on the result of the comparison, one company name as a recognition result.
    Type: Grant
    Filed: August 7, 2001
    Date of Patent: September 5, 2006
    Assignee: Koninklijke Philips Electronics, N.V.
    Inventor: Georg Rose
  • Patent number: 7103540
    Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. In the meantime, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by the amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.
    Type: Grant
    Filed: May 20, 2002
    Date of Patent: September 5, 2006
    Assignee: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Li Deng
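The variance adjustment mentioned in the abstract above can be sketched for a single one-dimensional Gaussian. The function below is an illustration only: it assumes scalar features and that the uncertainty is supplied as a variance (`uncertainty_var`, a hypothetical name) which is added to the model variance before scoring.

```python
import math

def log_likelihood(x, mean, var, uncertainty_var=0.0):
    """Log density of a 1-D Gaussian whose variance is widened by the
    estimated variance of the cleaned signal."""
    total_var = var + uncertainty_var
    if total_var <= 0.0:
        raise ValueError("total variance must be positive")
    return -0.5 * (math.log(2.0 * math.pi * total_var)
                   + (x - mean) ** 2 / total_var)
```

Widening the variance penalizes an observation far from the mean less harshly, so a state is not ruled out on the strength of a cleaned feature that the noise-removal step itself was unsure about.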
  • Patent number: 7103543
    Abstract: The present invention comprises a system and method for speech verification using a robust confidence measure, and includes a speech verifier which compares a confidence measure for a recognized word to a predetermined threshold value in order to determine whether the recognized word is valid, where a recognized word corresponds to a word model that produces a highest recognition score. In accordance with the present invention, the foregoing confidence measure may be calculated using the recognition score for the recognized word, a background score of a worst recognition candidate, and a pseudo filler score that may be based upon selected average recognition scores from an N-best list of recognition candidates.
    Type: Grant
    Filed: August 13, 2002
    Date of Patent: September 5, 2006
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal
  • Patent number: 7103544
    Abstract: A method of modeling a speech recognition system includes decoding a speech signal produced from a training text to produce a sequence of predicted speech units. The training text comprises a sequence of actual speech units that is used with the sequence of predicted speech units to form a confusion model. In further embodiments, the confusion model is used to decode a text to identify an error rate that would be expected if the speech recognition system decoded speech based on the text.
    Type: Grant
    Filed: June 6, 2005
    Date of Patent: September 5, 2006
    Assignee: Microsoft Corporation
    Inventors: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela J. R. Gunawardana, Ciprian Chelba
  • Patent number: 7047190
    Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
    Type: Grant
    Filed: April 19, 2000
    Date of Patent: May 16, 2006
    Assignee: AT&T Corp.
    Inventor: David A. Kapilow
  • Patent number: 7043422
    Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e. task-specific training data set) and the relative frequency of n-grams in a large training set (i.e. out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.
    Type: Grant
    Filed: September 4, 2001
    Date of Patent: May 9, 2006
    Assignee: Microsoft Corporation
    Inventors: Jianfeng Gao, Mingjing Li
  • Patent number: 7016836
    Abstract: Disclosed are a speech recognition system which comprises the following components, and a speech recognition method for this speech recognition system. The speech recognition system comprises a plurality of voice pickup sections for picking up uttered voices, a determination section for determining a speech signal suitable for speech recognition from speech signals output from the plurality of voice pickup sections, and a speech recognizer for performing speech recognition based on the speech signal determined by the determination section.
    Type: Grant
    Filed: August 30, 2000
    Date of Patent: March 21, 2006
    Assignee: Pioneer Corporation
    Inventor: Shoutarou Yoda
  • Patent number: 7010484
    Abstract: A method of phrase verification to verify a phrase not only according to its confidence measures but also according to neighboring concepts and their confidence tags. First, an utterance is received, and the received utterance is parsed to find a concept sequence. Subsequently, a plurality of tag sequences corresponding to the concept sequence is produced. Then, a first score of each of the tag sequences is calculated. Finally, the tag sequence of the highest first score is selected as the most probable tag sequence, and the tags contained therein are selected as the most probable confidence tags, respectively corresponding to the concepts in the concept sequence.
    Type: Grant
    Filed: December 12, 2001
    Date of Patent: March 7, 2006
    Assignee: Industrial Technology Research Institute
    Inventor: Yi-Chung Lin
  • Patent number: 7010483
    Abstract: A speech processing system is provided which is operable to receive sets of signal values representative of a speech signal generated by a speech source. The system is operable to determine a measure of the quality of the speech signal by performing a statistical analysis of the received sets of signal values. The system stores data defining a predetermined function derived from a signal model which models the speech source and which defines a probability density function which gives, for a given set of model parameters, the probability that the signal model has those model parameters given that the signal model is assumed to have generated the received set of signal values. The system applies a current set of received signal values to the stored probability density function and then draws samples from it using a Gibbs sampler.
    Type: Grant
    Filed: May 30, 2001
    Date of Patent: March 7, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventor: Jebu Jacob Rajan
  • Patent number: 7003459
    Abstract: This invention concerns a method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications. The method includes determining whether a probability of understanding the user's input communication exceeds a first threshold, where if the first threshold is exceeded, further dialog is conducted with the user. Otherwise, the user may be directed to a human for assistance. In another possible embodiment, the method operates as above except that if the probability also exceeds a second threshold, the second threshold being higher than the first, then further dialog is conducted with the user using the current dialog strategy. However, if the probability falls between a first threshold and a second threshold, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user.
    Type: Grant
    Filed: January 22, 2001
    Date of Patent: February 21, 2006
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
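The two-threshold embodiment described above amounts to a three-way routing decision, sketched below. The action labels and the function name are hypothetical, and how the probability of understanding is estimated is outside the scope of this sketch.

```python
def next_action(p_understood, first_threshold, second_threshold):
    """Route a dialog turn based on the probability that the user was understood."""
    if not first_threshold < second_threshold:
        raise ValueError("second threshold must be higher than the first")
    if p_understood > second_threshold:
        return "continue"   # keep the current dialog strategy
    if p_understood > first_threshold:
        return "adapt"      # keep talking, but change the dialog strategy
    return "transfer"       # hand the user over to a human
```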
  • Patent number: 7003460
    Abstract: In speech recognition, phonemes of a language are modelled by a hidden Markov model, whereby each state of the hidden Markov model is described by a probability density function. For speech recognition of a modified vocabulary, the probability density function is split into a first and into a second probability density function. As a result thereof, it is possible to compensate variations in the speaking habits of a speaker or to add a new word to the vocabulary of the speech recognition unit and thereby assure that this new word is distinguished with adequate quality from the words already present in the speech recognition unit and is thus recognized.
    Type: Grant
    Filed: May 3, 1999
    Date of Patent: February 21, 2006
    Assignee: Siemens Aktiengesellschaft
    Inventors: Udo Bub, Harald Höge
  • Patent number: 6996525
    Abstract: A method for selecting a speech recognizer from a number of speech recognizers in a speech recognition system. The speech recognition system receives an audio stream from an application and derives enabling information. The speech recognition system then enables at least some of the speech recognizers and receives their results. It derives selection information and uses it to select the best speech recognizer and its results and returns those results back to the application.
    Type: Grant
    Filed: June 15, 2001
    Date of Patent: February 7, 2006
    Assignee: Intel Corporation
    Inventors: Steven M. Bennett, Andrew V. Anderson
  • Patent number: 6996728
    Abstract: The present invention, in various embodiments, provides techniques for managing system power. In one embodiment, system compute loads and/or system resources invoked by services running on the system consume power. To better manage power consumption, the spare capacity of a system resource is periodically measured, and if this spare capacity is outside a predefined range, then the resource operation is adjusted, e.g., the CPU speed is increased or decreased, so that the spare capacity is within the range. Further, the spare capacity is kept as close to zero as practical, and this spare capacity is determined based on the statistical distribution of a number of utilization values of the resources, which is also taken periodically. The spare capacity is also calculated based on considerations of the probability that the system resources are saturated.
    Type: Grant
    Filed: April 26, 2002
    Date of Patent: February 7, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Jitendra K. Singh
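The control loop in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the patented method: spare capacity is taken as one minus (mean plus `z` standard deviations) of recent utilization samples, and the names `spare_capacity`, `adjust_speed`, `low`, `high`, and `step` are hypothetical.

```python
from statistics import mean, pstdev


def spare_capacity(utilizations, z=2.0):
    """Estimate spare capacity from recent utilization samples (fractions of 1.0).

    Assumption: the statistical bound is mean + z * standard deviation.
    """
    return 1.0 - (mean(utilizations) + z * pstdev(utilizations))


def adjust_speed(speed, utilizations, low=0.0, high=0.2, step=0.1, z=2.0):
    """Return a new relative CPU speed that nudges spare capacity into [low, high]."""
    spare = spare_capacity(utilizations, z)
    if spare < low:
        # Too little headroom: saturation is likely, so speed up.
        return speed + step
    if spare > high:
        # Too much headroom: power is wasted, so slow down (never below one step).
        return max(step, speed - step)
    return speed


if __name__ == "__main__":
    print(adjust_speed(1.0, [0.95, 1.0, 0.9]))  # little headroom, speed is raised
    print(adjust_speed(1.0, [0.5, 0.5, 0.5]))   # ample headroom, speed is lowered
```

Setting `low` to zero and `high` to a small value mirrors the stated goal of keeping spare capacity as close to zero as practical.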
  • Patent number: 6993452
    Abstract: In accordance with our invention, for two mixture-type probability distribution functions (PDFs), G and H, $G(x) = \sum_{i=1}^{N} \mu_i g_i(x)$, $H(x) = \sum_{k=1}^{K} \gamma_k h_k(x)$, where G is a mixture of N component PDFs $g_i(x)$, H is a mixture of K component PDFs $h_k(x)$, and $\mu_i$ and $\gamma_k$ are corresponding weights that satisfy $\sum_{i=1}^{N} \mu_i = 1$ and $\sum_{k=1}^{K} \gamma_k = 1$; we define their distance, $D_M(G, H)$, as $D_M(G, H) = \min_{w = [\omega_{ik}]} \sum_{i=1}^{N} \sum_{k=1}^{K} \omega_{ik}\, d(g_i, h_k)$, where $d(g_i, h_k)$ is the element distance between component PDFs $g_i$ and $h_k$ and w satisfies $\omega_{ik} \ge 0$, $1 \le i \le N$, $1 \le k \le K$; and $\sum_{k=1}^{K} \omega_{ik} = \mu_i$, $1 \le i \le N$, $\sum_{i=1}^{N} \omega_{ik} = \gamma_k$, $1 \le k \le K$. The application of this definition of distance to various sets of real world data is demonstrated.
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: January 31, 2006
    Assignee: AT&T Corp.
    Inventors: Qian Huang, Zhu Liu
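The distance defined in this abstract is a transportation problem: the component weights of G must be moved onto the component weights of H at minimum total cost, where the cost of each pairing is the element distance between the two components. The following is a rough sketch, not the patent's procedure. It assumes the element distances are already available as a matrix `d`, and it solves the problem with a simple successive-shortest-path min-cost flow. The function name `mixture_distance` and its parameters are hypothetical.

```python
def mixture_distance(mu, gamma, d, eps=1e-12):
    """Minimum cost of transporting weights mu (mixture G) onto gamma (mixture H).

    d[i][k] is the element distance between component i of G and component k of H.
    """
    n, k = len(mu), len(gamma)
    src, snk, size = n + k, n + k + 1, n + k + 2
    edges = []                      # each edge: [to, capacity, cost]; e ^ 1 is its reverse
    adj = [[] for _ in range(size)]

    def add_edge(u, v, cap, cost):
        adj[u].append(len(edges)); edges.append([v, cap, cost])
        adj[v].append(len(edges)); edges.append([u, 0.0, -cost])

    for i in range(n):
        add_edge(src, i, mu[i], 0.0)
        for j in range(k):
            add_edge(i, n + j, float("inf"), d[i][j])
    for j in range(k):
        add_edge(n + j, snk, gamma[j], 0.0)

    total = 0.0
    while True:
        # Bellman-Ford: cheapest residual path from source to sink.
        dist = [float("inf")] * size
        prev = [-1] * size
        dist[src] = 0.0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == float("inf"):
                    continue
                for e in adj[u]:
                    v, cap, cost = edges[e]
                    if cap > eps and dist[u] + cost < dist[v] - 1e-15:
                        dist[v], prev[v], changed = dist[u] + cost, e, True
            if not changed:
                break
        if prev[snk] == -1:
            return total            # no remaining weight can be moved
        # Find the bottleneck along the path, then push that much weight.
        push, v = float("inf"), snk
        while v != src:
            push = min(push, edges[prev[v]][1])
            v = edges[prev[v] ^ 1][0]
        v = snk
        while v != src:
            e = prev[v]
            edges[e][1] -= push
            edges[e ^ 1][1] += push
            total += push * edges[e][2]
            v = edges[e ^ 1][0]


if __name__ == "__main__":
    # One component split across two: 0.25 * 2.0 + 0.75 * 4.0
    print(mixture_distance([1.0], [0.25, 0.75], [[2.0, 4.0]]))  # 3.5
```

For larger mixtures a general linear-programming solver would be the more usual choice; the flow formulation is used here only to keep the sketch free of external dependencies.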
  • Patent number: 6990446
    Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.
    Type: Grant
    Filed: October 10, 2000
    Date of Patent: January 24, 2006
    Assignee: Microsoft Corporation
    Inventors: Xuedong Huang, Michael D. Plumpe
  • Patent number: 6990447
    Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.
    Type: Grant
    Filed: November 15, 2001
    Date of Patent: January 24, 2006
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
  • Patent number: 6985858
    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. The method is based on variational inference techniques. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Further aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Additional aspects of the invention include using a variance for the noisy signal feature vector conditioned on fixed values of noise, channel transfer function, and clean speech, when identifying the clean signal feature vector.
    Type: Grant
    Filed: March 20, 2001
    Date of Patent: January 10, 2006
    Assignee: Microsoft Corporation
    Inventors: Brendan J. Frey, Alejandro Acero, Li Deng
  • Patent number: 6985859
    Abstract: A method and system for spotting words in a speech signal having adverse and unknown noisy environments is provided. The method removes the dynamic bias introduced by the environment (i.e., noise and channel effect) that is specific to each word of the lexicon. The method includes the step of generating a first recognition score based on the speech signal and a lexicon entry for a word. The recognition score tracks an absolute likelihood that the word is in the speech signal. A background score is estimated based on the first recognition score. The method further provides for calculating a confidence score based on a matching ratio between a minimum recognition value and the background score. The method and system can be implemented for any number of words, depending upon the application. The confidence scores therefore track noise-corrected likelihoods that the words are in the speech signal.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: January 10, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Philippe R. Morin
  • Patent number: 6980952
    Abstract: A maximum likelihood (ML) linear regression (LR) solution to environment normalization is provided where the environment is modeled as a hidden (non-observable) variable. By application of an expectation maximization algorithm and extension of Baum-Welch forward and backward variables (Steps 23a–23d) a source normalization is achieved such that it is not necessary to label a database in terms of environment such as speaker identity, channel, microphone and noise type.
    Type: Grant
    Filed: June 7, 2000
    Date of Patent: December 27, 2005
    Assignee: Texas Instruments Incorporated
    Inventor: Yifan Gong
  • Patent number: 6980956
    Abstract: Drive means for performing a behavior based on a behavioral model prescribing a behavior is controlled, and the behavioral model is changed depending on a predetermined stimulus. Therefore, by giving the stimulus, the behavioral model can be freely changed, and a mechanical device, etc. having an improved amusing element can be realized.
    Type: Grant
    Filed: January 7, 2000
    Date of Patent: December 27, 2005
    Assignee: Sony Corporation
    Inventors: Tsuyoshi Takagi, Masanori Omote
  • Patent number: 6963837
    Abstract: An attribute-based speech recognition system is described. A speech pre-processor receives input speech and produces a sequence of acoustic observations representative of the input speech. A database of context-dependent acoustic models characterizes a probability of a given sequence of sounds producing the sequence of acoustic observations. Each acoustic model includes phonetic attributes and suprasegmental non-phonetic attributes. A finite state language model characterizes a probability of a given sequence of words being spoken. A one-pass decoder compares the sequence of acoustic observations to the acoustic models and the language model, and outputs at least one word sequence representative of the input speech.
    Type: Grant
    Filed: October 6, 2000
    Date of Patent: November 8, 2005
    Assignee: Multimodal Technologies, Inc.
    Inventors: Michael Finke, Jurgen Fritsch, Detleff Koll, Alex Waibel
  • Patent number: 6963834
    Abstract: A method for performing speech recognition can include determining a recognition result for received user speech. The recognition result can include recognized text and a corresponding confidence score. The confidence score of the recognition result can be compared to a predetermined minimum threshold. If the confidence score does not exceed the predetermined minimum threshold, the user can be presented with at least one empirically determined alternate word candidate corresponding to the recognition result.
    Type: Grant
    Filed: May 29, 2001
    Date of Patent: November 8, 2005
    Assignee: International Business Machines Corporation
    Inventors: Matthew W. Hartley, James R. Lewis, David E. Reich
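The thresholding step described in this abstract can be sketched in a few lines. The function name `present_result`, the default threshold, and the `alternates` lookup table (standing in for the empirically determined alternate word candidates) are all hypothetical.

```python
def present_result(text, confidence, alternates, min_confidence=0.6):
    """Return the recognized text plus any alternates to offer the user.

    alternates maps a recognized word to its alternate candidates
    (a stand-in for empirically collected confusion data).
    """
    if confidence > min_confidence:
        # Confidence exceeds the threshold: accept the result as is.
        return text, []
    # Otherwise offer the stored alternates, if any exist for this word.
    return text, alternates.get(text, [])


if __name__ == "__main__":
    table = {"write": ["right", "rite"]}
    print(present_result("write", 0.9, table))  # ('write', [])
    print(present_result("write", 0.4, table))  # ('write', ['right', 'rite'])
```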
  • Patent number: 6959276
    Abstract: A method and apparatus are provided for identifying a noise environment for a frame of an input signal based on at least one feature for that frame. Under one embodiment, the noise environment is identified by determining the probability of each of a set of possible noise environments. For some embodiments, the probabilities of the noise environments for past frames are included in the identification of an environment for a current frame. In one particular embodiment, a count is generated for each environment that indicates the number of past frames for which the environment was the most probable environment. The environment with the highest count is then selected as the environment for the current frame.
    Type: Grant
    Filed: September 27, 2001
    Date of Patent: October 25, 2005
    Assignee: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Li Deng
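The counting embodiment in this abstract amounts to a majority vote over recent frames. Here is a minimal sketch under stated assumptions: the per-frame probabilities arrive as a dictionary, the vote covers a fixed-length window that includes the current frame, and the name `select_environment` and the window length are hypothetical.

```python
from collections import Counter, deque


def select_environment(frame_probs, history):
    """Pick the noise environment for the current frame by majority vote.

    frame_probs: dict mapping environment name to its probability for this frame.
    history: deque of the most probable environment for each recent frame.
    """
    # Record which environment is most probable for this frame alone.
    history.append(max(frame_probs, key=frame_probs.get))
    # Count how often each environment won in the window; take the top count.
    return Counter(history).most_common(1)[0][0]


if __name__ == "__main__":
    history = deque(maxlen=5)  # assumed window of five frames
    frames = [
        {"car": 0.7, "office": 0.3},
        {"car": 0.6, "office": 0.4},
        {"car": 0.1, "office": 0.9},  # a single outlier frame
    ]
    for probs in frames:
        print(select_environment(probs, history))
```

The vote smooths over isolated frames: in the usage above, the third frame favors "office" on its own, yet "car" still holds the highest count.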
  • Patent number: 6931374
    Abstract: A method is developed which includes 1) defining a switching state space model for a continuous valued hidden production-related parameter and the observed speech acoustics, and 2) approximating a posterior probability that provides the likelihood of a sequence of the hidden production-related parameters and a sequence of speech units based on a sequence of observed input values. In approximating the posterior probability, the boundaries of the speech units are not fixed but are optimally determined. Under one embodiment, a mixture of Gaussian approximation is used. In another embodiment, an HMM posterior approximation is used.
    Type: Grant
    Filed: April 1, 2003
    Date of Patent: August 16, 2005
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Leo Jingyu Lee, Li Deng
  • Patent number: 6922660
    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
    Type: Grant
    Filed: December 1, 2000
    Date of Patent: July 26, 2005
    Assignee: Microsoft Corporation
    Inventors: Bo Thiesson, Christopher A. Meek, David E. Heckerman
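The search loop in this abstract keeps whichever candidate block size shows the greatest measured speed increase. A minimal sketch follows. The `speedup` callable is a hypothetical stand-in for running a partial (non-converged) pass of the incremental algorithm and timing it against standard EM, and the candidate list is assumed to be supplied by the caller.

```python
def choose_block_size(candidates, speedup):
    """Return the candidate block size with the greatest measured speed increase.

    speedup: callable taking a block size and returning the observed speed
    increase (assumed to come from a short, non-converged trial run).
    """
    best_size, best_gain = None, float("-inf")
    for size in candidates:
        gain = speedup(size)
        if gain > best_gain:
            # Greatest speed increase so far: this becomes the target size.
            best_size, best_gain = size, gain
    return best_size


if __name__ == "__main__":
    # Toy measurement that peaks at a block size of 100.
    print(choose_block_size([1, 10, 100, 1000], lambda b: -abs(b - 100)))  # 100
```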
  • Patent number: 6901365
    Abstract: The invention enables even a CPU having low processing performance to find an HMM output probability by simplifying arithmetic operations. The dimensions of an input vector are grouped into several sets, and tables are created for the sets. When an output probability is calculated, codes corresponding to the first dimension to the n-th dimension of the input vector are sequentially obtained, and for each code, by referring to the corresponding table, output values for each table are obtained. By substituting the output values for each table into a formula for finding an output probability, the output probability is found.
    Type: Grant
    Filed: September 19, 2001
    Date of Patent: May 31, 2005
    Assignee: Seiko Epson Corporation
    Inventor: Yasunaga Miyazawa
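The table-lookup idea can be illustrated with a simplified sketch. Several assumptions go beyond the abstract: each dimension is treated as its own set, values are scalar-quantized to the nearest of a few shared levels, the state's output density is a diagonal-covariance Gaussian, and the combining formula is a sum of log terms. The names `build_table`, `quantize`, and `log_output_prob` are hypothetical.

```python
import math


def build_table(mean, var, levels):
    """Precompute one dimension's Gaussian log-likelihood term for every code."""
    return [-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
            for x in levels]


def quantize(x, levels):
    """Return the code (index) of the quantization level nearest to x."""
    return min(range(len(levels)), key=lambda c: abs(levels[c] - x))


def log_output_prob(vec, tables, levels):
    """Sum table entries over dimensions instead of evaluating the density."""
    return sum(tables[dim][quantize(x, levels)] for dim, x in enumerate(vec))


if __name__ == "__main__":
    levels = [-1.0, 0.0, 1.0]
    means, variances = [0.0, 1.0], [1.0, 0.5]
    tables = [build_table(m, v, levels) for m, v in zip(means, variances)]
    print(log_output_prob([0.9, 0.1], tables, levels))
```

At run time only a quantization step, table lookups, and additions are needed, which is the kind of saving the abstract attributes to the method.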
  • Patent number: 6895380
    Abstract: An interactive voice actuated control system for a testing machine such as a tensile testing machine is described. Voice commands are passed through a user-command predictor and integrated with a graphical user interface control panel to allow hands-free operation. The user-command predictor learns operator command patterns on-line and predicts the most likely next action. It assists less experienced operators by recommending the next command, and it adds robustness to the voice command interpreter by verbally asking the operator to repeat unlikely commanded actions. The voice actuated control system applies to industrial machines whose normal operation is characterized by a nonrandom series of commands.
    Type: Grant
    Filed: March 2, 2001
    Date of Patent: May 17, 2005
    Assignee: Electro Standards Laboratories
    Inventor: Raymond Sepe, Jr.
  • Patent number: 6868380
    Abstract: A speech recognition system for transforming an acoustic signal into a stream of phonetic estimates includes a frequency analyzer for generating a short-time frequency representation of the acoustic signal. A novelty processor separates background components of the representation from region of interest components of the representation. The output of the novelty processor includes the region of interest components of the representation according to the novelty parameters. An attention processor produces a gating signal as a function of the novelty output according to attention parameters. A coincidence processor produces information regarding co-occurrences between samples of the novelty output over time and frequency. The coincidence processor selectively gates the coincidence output as a function of the gating signal according to one or more coincidence parameters.
    Type: Grant
    Filed: March 23, 2001
    Date of Patent: March 15, 2005
    Assignee: Eliza Corporation
    Inventor: John Kroeker