Probability Patents (Class 704/240)
  • Patent number: 7487088
    Abstract: This invention concerns a method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications. The method may include determining whether a probability of understanding the user's input communication exceeds a first threshold. If the first threshold is exceeded, further dialog is conducted with the user. Otherwise, the user may be directed to a human for assistance. The method also illustratively determines whether the probability exceeds a second threshold, the second threshold being higher than the first. If so, then further dialog is conducted with the user using the current dialog strategy. However, if the probability falls between the first and second thresholds, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 12, 2006
    Date of Patent: February 3, 2009
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
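    Illustrative sketch (not part of the patent): a minimal Python rendering of the two-threshold decision logic summarized in the abstract above; the function name and threshold values are assumptions chosen for the example.
      def choose_dialog_action(p_understanding, low_threshold=0.3, high_threshold=0.7):
          """Pick a dialog action from the probability of having understood the user."""
          if p_understanding <= low_threshold:
              return "route_to_human"                  # understanding too unlikely: hand off to a human
          if p_understanding >= high_threshold:
              return "continue_current_strategy"       # confident: keep the current dialog strategy
          return "continue_with_adapted_strategy"      # in between: adapt the dialog strategy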
  • Publication number: 20090030686
    Abstract: In a confidence computing method and system, a processor may interpret speech signals as a text string or directly receive a text string as input, generate a syntactical parse tree representing the interpreted string and including a plurality of sub-trees which each represents a corresponding section of the interpreted text string, determine for each sub-tree whether the sub-tree is accurate, obtain replacement speech signals for each sub-tree determined to be inaccurate, and provide output based on corresponding text string sections of at least one sub-tree determined to be accurate.
    Type: Application
    Filed: July 27, 2007
    Publication date: January 29, 2009
    Inventors: Fuliang Weng, Feng Lin, Zhe Feng
  • Patent number: 7480614
    Abstract: The present invention provides an energy feature extraction method for noisy speech recognition. First, the noisy speech energy of an input noisy speech signal is computed. Next, the noise energy in the input signal is estimated. Then, the estimated noise energy is subtracted from the noisy speech energy to obtain an estimate of the clean speech energy. Finally, delta operations are performed on the log of the estimated clean speech energy to determine the energy derivative features for the noisy speech. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: January 20, 2009
    Assignee: Industrial Technology Research Institute
    Inventor: Tai-Huei Huang
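    Illustrative sketch (not part of the patent): a minimal numpy version of the described pipeline, estimating the noise energy from the leading frames; the function and parameter names are assumptions made for the example.
      import numpy as np

      def energy_delta_features(frames, n_noise_frames=10, floor=1e-10):
          """frames: 2-D array (num_frames, samples_per_frame) of a noisy speech signal."""
          noisy_energy = np.sum(frames ** 2, axis=1)                     # per-frame noisy speech energy
          noise_energy = noisy_energy[:n_noise_frames].mean()            # crude noise-energy estimate
          clean_energy = np.maximum(noisy_energy - noise_energy, floor)  # subtract and floor the result
          log_energy = np.log(clean_energy)
          delta = np.gradient(log_energy)                                # delta (derivative) features
          return log_energy, delta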
  • Patent number: 7480615
    Abstract: A method of efficiently setting posterior probability parameters for a switching state space model begins by defining a window containing at least two but fewer than all of the frames. A separate posterior probability parameter is determined for each frame in the window. The window is then shifted sequentially from left to right in time so that it includes one or more subsequent frames in the sequence of frames. A separate posterior probability parameter is then determined for each frame in the shifted window. This method closely approximates a more rigorous solution but reduces the computational cost by two to three orders of magnitude. Further, a method of determining the optimal discrete state sequence in the switching state space model is provided that directly exploits the observation vector on a frame-by-frame basis and operates from left to right in time.
    Type: Grant
    Filed: January 20, 2004
    Date of Patent: January 20, 2009
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Li Deng, Leo Lee
  • Patent number: 7475012
    Abstract: Robust signal detection against various types of background noise is implemented. A signal detection apparatus extracts the feature amount of an input signal sequence and the feature amount of a noise component contained in the signal sequence. The first likelihood, indicating the probability that the signal sequence is detected, and the second likelihood, indicating the probability that the noise component is detected, are then calculated on the basis of a predetermined signal-to-noise ratio and the extracted feature amount of the signal sequence. Additionally, a likelihood ratio indicating the ratio between the first likelihood and the second likelihood is calculated. Detection of the signal sequence is determined on the basis of the likelihood ratio. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: December 9, 2004
    Date of Patent: January 6, 2009
    Assignee: Canon Kabushiki Kaisha
    Inventors: Philip Garner, Toshiaki Fukada, Yasuhiro Komori
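    Illustrative sketch (not part of the patent): a minimal likelihood-ratio test with single-Gaussian models standing in for the signal and noise feature models; the dependence on a predetermined signal-to-noise ratio is omitted and all names are assumptions for the example.
      import math

      def gaussian_log_likelihood(x, mean, var):
          return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

      def detect_signal(feature, signal_model, noise_model, threshold=0.0):
          """Each model is a (mean, variance) pair for a scalar feature amount."""
          ll_signal = gaussian_log_likelihood(feature, *signal_model)    # first likelihood
          ll_noise = gaussian_log_likelihood(feature, *noise_model)      # second likelihood
          log_ratio = ll_signal - ll_noise                               # log likelihood ratio
          return log_ratio > threshold, log_ratio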
  • Patent number: 7473838
    Abstract: A sound identification apparatus which reduces the chance of a drop in the identification rate, including: a frame sound feature extraction unit which extracts a sound feature per frame of an inputted audio signal; a frame likelihood calculation unit which calculates a frame likelihood of the sound feature in each frame, for each of a plurality of sound models; a confidence measure judgment unit which judges a confidence measure based on the frame likelihood; a cumulative likelihood output unit time determination unit which determines a cumulative likelihood output unit time based on the confidence measure; a cumulative likelihood calculation unit which calculates a cumulative likelihood in which the frame likelihoods of the frames included in the cumulative likelihood output unit time are cumulated, for each sound model; a sound type candidate judgment unit which determines, for each cumulative likelihood output unit time, a sound type corresponding to the sound model that has a maximum cumulative likelihood
    Type: Grant
    Filed: April 9, 2007
    Date of Patent: January 6, 2009
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Tetsu Suzuki, Yoshihisa Nakatoh, Shinichi Yoshizawa
  • Patent number: 7472060
    Abstract: This invention concerns a method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications in a dialog with the user. The probability of conducting a successful dialog with the user is determined based, at least in part, on understanding data from at least one prior dialog exchange of the dialog.
    Type: Grant
    Filed: September 6, 2005
    Date of Patent: December 30, 2008
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
  • Patent number: 7472062
    Abstract: Methods and arrangements for facilitating data clustering. From a set of input data, a predetermined number of non-overlapping subsets are created. The input data is split recursively to create the subsets.
    Type: Grant
    Filed: January 4, 2002
    Date of Patent: December 30, 2008
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Jiri Navratil, Ganesh N. Ramaswamy
  • Publication number: 20080312921
    Abstract: In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition. (An illustrative sketch follows this entry.)
    Type: Application
    Filed: August 20, 2008
    Publication date: December 18, 2008
    Inventors: Scott E. Axelrod, Sreeram Viswanath Balakrishnan, Stanley F. Chen, Yuqing Gao, Ramesh A. Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Alan Picheny, George A. Saon, Geoffrey G. Zweig
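    Illustrative sketch (not part of the patent application): how a log-linear model turns a set of possibly overlapping feature scores into a posterior over linguistic units; the array shapes and names are assumptions for the example.
      import numpy as np

      def log_linear_posteriors(feature_values, weights):
          """feature_values: (num_units, num_features) scores; weights: (num_features,) parameters."""
          scores = feature_values @ weights        # weighted sum of (possibly overlapping) features
          scores -= scores.max()                   # stabilise the exponentials
          unnorm = np.exp(scores)
          return unnorm / unnorm.sum()             # posterior probability of each linguistic unit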
  • Patent number: 7464033
    Abstract: For a given sentence grammar, speech recognizers are often required to decode M sets of HMMs each of which models a specific acoustic environment. In order to match input acoustic observations to each of the environments, typically recognition search methods require a network of M sub-networks. A new speech recognition search method is described here, which needs a network that is only the size of a single sub-network and yet provides the same recognition performance, thus reducing the memory requirements for network storage by (M-1)/M.
    Type: Grant
    Filed: February 4, 2005
    Date of Patent: December 9, 2008
    Assignee: Texas Instruments Incorporated
    Inventor: Yifan Gong
  • Patent number: 7464031
    Abstract: In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
    Type: Grant
    Filed: November 28, 2003
    Date of Patent: December 9, 2008
    Assignee: International Business Machines Corporation
    Inventors: Scott E. Axelrod, Sreeram Viswanath Balakrishnan, Stanley F. Chen, Yuqing Gao, Ramesh A. Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Alan Picheny, George A. Saon, Geoffrey G. Zweig
  • Patent number: 7460992
    Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. At the same time, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution by increasing the variance of each Gaussian distribution by an amount equal to the estimated variance of the cleaned signal, which is then used in decoding the phonetic state sequence in a pattern recognition task. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 16, 2006
    Date of Patent: December 2, 2008
    Assignee: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Li Deng
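    Illustrative sketch (not part of the patent): scoring a cleaned feature against a diagonal-Gaussian state model with the variance inflated by the estimated variance of the noise removal, as the abstract describes; the names are assumptions for the example.
      import numpy as np

      def log_gaussian_with_uncertainty(x_hat, mean, var, enhancement_var):
          """x_hat: cleaned feature vector; var: model variances; enhancement_var: cleanup uncertainty."""
          total_var = var + enhancement_var        # add the noise-removal uncertainty to each variance
          diff = x_hat - mean
          return -0.5 * np.sum(np.log(2 * np.pi * total_var) + diff ** 2 / total_var)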
  • Patent number: 7457748
    Abstract: A method of automatically processing a speech signal comprises the steps of: determining a sequence of probability models corresponding to a given text; determining a sequence of acoustic strings corresponding to the diction of the given text; aligning the sequence of acoustic strings with the sequence of models; and determining a confidence index of acoustic alignment for each association between a model and an acoustic segment. Each step of determining an alignment confidence index is carried out based at least on a combination of the model probability, the a priori model probabilities, and the average duration of occupancy of the models.
    Type: Grant
    Filed: August 12, 2003
    Date of Patent: November 25, 2008
    Assignee: France Telecom
    Inventors: Samir Nefti, Olivier Boeffard
  • Patent number: 7454337
    Abstract: The present invention is a method of modeling a single class of data from data containing multiple classes of the same type. First, a collection of data is received that includes data from multiple classes of the same type, where the amount of data belonging to the single class exceeds that of any other class. A first statistical model of the received collection of data is generated. The collection of data is divided into subsets. Each subset of the collection is scored using the first statistical model. A set of scores is selected. The subsets corresponding to the selected scores are identified. The identified subsets are combined. A second statistical model of the same type as the first statistical model is generated for the combined subsets and used as the model of the single class of data. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 13, 2004
    Date of Patent: November 18, 2008
    Assignee: The United States of America as represented by the Director, National Security Agency
    Inventors: David C. Smith, Daniel J. Richman
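    Illustrative sketch (not part of the patent): the subset-selection loop with a single Gaussian standing in for the statistical model; the subset count, kept fraction, and names are assumptions for the example.
      import numpy as np

      def model_dominant_class(data, num_subsets=10, keep_fraction=0.5):
          """data: 1-D array drawn mostly from one class; returns (mean, variance) of the second model."""
          mean, var = data.mean(), data.var() + 1e-12                      # first statistical model
          subsets = np.array_split(data, num_subsets)                      # divide the collection into subsets
          scores = [(-0.5 * (np.log(2 * np.pi * var) + (s - mean) ** 2 / var)).mean()
                    for s in subsets]                                      # score each subset with the first model
          keep = np.argsort(scores)[::-1][: max(1, int(keep_fraction * num_subsets))]
          combined = np.concatenate([subsets[i] for i in keep])            # combine the identified subsets
          return combined.mean(), combined.var()                           # second model of the single class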
  • Patent number: 7454336
    Abstract: A system and method are provided that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of a segmental switching state space model, which employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector. The model parameters are modified based, at least in part, upon a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithms for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussians (MOG) posterior and/or an approximate hidden Markov model (HMM) posterior using a variational technique.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: November 18, 2008
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Li Deng, Leo J. Lee
  • Patent number: 7451083
    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.
    Type: Grant
    Filed: July 20, 2005
    Date of Patent: November 11, 2008
    Assignee: Microsoft Corporation
    Inventors: Brendan J. Frey, Alejandro Acero, Li Deng
  • Patent number: 7447626
    Abstract: A method of extracting significant phrases from one or more documents stored in a computer-readable medium. A sequence of words is read from the one or more documents and a score is determined for each word in the sequence based on the length of the word. The score for each word in the sequence is compared against a threshold score. The sequence of words is indicated to be a significant phrase if the number of words in the sequence that have a score greater than the threshold score equals or exceeds a predetermined number. A sentence containing the sequence of words is retrieved from the document if the sequence of words is a significant phrase. An abstract of the document is searched to determine whether the sentence has already been included in the abstract. If not, the sentence is added to the abstract. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: December 21, 2004
    Date of Patent: November 4, 2008
    Assignee: UDICO Holdings
    Inventors: Garnet R. Chaney, Robert F. Richardson, Seymour I. Rubinstein
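    Illustrative sketch (not part of the patent): word scores are taken to be word lengths, matching the abstract's length-based scoring; the thresholds are assumptions chosen for the example.
      def is_significant_phrase(words, length_threshold=6, min_scoring_words=2):
          """A phrase is significant if enough of its words score above the threshold."""
          scoring_words = sum(1 for w in words if len(w) > length_threshold)
          return scoring_words >= min_scoring_words

      def add_to_abstract(abstract_sentences, sentence, words):
          """Add the sentence containing a significant phrase to the abstract if it is not already there."""
          if is_significant_phrase(words) and sentence not in abstract_sentences:
              abstract_sentences.append(sentence)
          return abstract_sentences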
  • Patent number: 7444284
    Abstract: A system, method and computer program product are provided for speech recognition. During operation, a database of words is maintained. Initially, a probability is assigned to each of the words indicating the prevalence of use of the word. Further, an utterance is received for speech recognition purposes. The utterance is matched with one of the words in the database based at least in part on the probability. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: November 15, 2004
    Date of Patent: October 28, 2008
    Assignee: BeVocal, Inc.
    Inventor: Bertrand A. Damiba
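    Illustrative sketch (not part of the patent): combining an acoustic match score with the prevalence-of-use probability assigned to each word; the multiplicative combination and the names are assumptions for the example.
      def best_word_match(acoustic_scores, word_priors):
          """acoustic_scores: match score per word; word_priors: prevalence-of-use probability per word."""
          return max(word_priors,
                     key=lambda w: acoustic_scores.get(w, 0.0) * word_priors[w])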
  • Patent number: 7440893
    Abstract: This invention concerns a method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications. The method illustratively determines whether a probability of understanding the user's input communication exceeds a first threshold. If the first threshold is exceeded, further dialog is conducted with the user. Otherwise, the user may be directed to a human for assistance. The method also illustratively determines whether the probability exceeds a second threshold, the second threshold being higher than the first. If so, then further dialog is conducted with the user using the current dialog strategy. However, if the probability falls between the first and second thresholds, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user.
    Type: Grant
    Filed: September 6, 2005
    Date of Patent: October 21, 2008
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
  • Patent number: 7437288
    Abstract: A speech recognition apparatus using a probability model that employs a mixed distribution, the apparatus formed by a standard pattern storage means for storing a standard pattern; a recognition means for outputting recognition results corresponding to an input speech by using the standard pattern; a standard pattern generating means for inputting learning speech and generating the standard pattern; and a standard pattern adjustment means, provided between the standard pattern generating means and the standard pattern storage means, for adjusting the number of element distributions of the mixed distribution of the standard pattern.
    Type: Grant
    Filed: March 11, 2002
    Date of Patent: October 14, 2008
    Assignee: NEC Corporation
    Inventor: Koichi Shinoda
  • Publication number: 20080243502
    Abstract: The invention discloses prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance-level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below it. Values can be stored in the data fields mapped to the first set, and a prompt for input for the second set can be played. Accordingly, data fields are partially filled in based upon the original speech utterance, and a second prompt for the unfilled fields is played. (An illustrative sketch follows this entry.)
    Type: Application
    Filed: March 28, 2007
    Publication date: October 2, 2008
    Applicant: International Business Machines Corporation
    Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
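    Illustrative sketch (not part of the patent application): splitting the derived elements into confidently filled fields and fields that need a follow-up prompt; the threshold and names are assumptions for the example.
      def fill_fields(derived_elements, element_threshold=0.6):
          """derived_elements: {field_name: (value, confidence)} parsed from one utterance."""
          filled, needs_prompt = {}, []
          for field, (value, confidence) in derived_elements.items():
              if confidence >= element_threshold:
                  filled[field] = value              # first set: confident enough to store
              else:
                  needs_prompt.append(field)         # second set: play a prompt for these fields only
          return filled, needs_prompt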
  • Publication number: 20080235007
    Abstract: A method and system for speaker recognition and identification includes transforming features of a speaker utterance in a first condition state to match a second condition state and provide a transformed utterance. A discriminative criterion is used to generate a transform that maps an utterance to obtain a computed result. The discriminative criterion is maximized over a plurality of speakers to obtain a best transform for recognizing speech and/or identifying a speaker under the second condition state. Speech recognition and speaker identity may be determined by employing the best transform for decoding speech to reduce channel mismatch.
    Type: Application
    Filed: June 3, 2008
    Publication date: September 25, 2008
    Inventors: Jiri Navratil, Jason Pelecanos, Ganesh N. Ramaswamy
  • Patent number: 7421387
    Abstract: A method for reducing recognition errors. The method includes receiving an N-best list associated with an input of a computer-based recognition system. The N-best list includes one or more hypotheses and associated confidence values. The input is classified in response to the N-best list, resulting in a classification. A re-scoring algorithm that is tuned for the classification is selected. The re-scoring algorithm is applied to the N-best list to create a re-scored N-best list. A hypothesis for the value of the input is selected based on the re-scored N-best list. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 18, 2004
    Date of Patent: September 2, 2008
    Assignee: General Motors Corporation
    Inventor: Kurt S. Godden
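    Illustrative sketch (not part of the patent): selecting a class-tuned re-scoring algorithm for an N-best list and returning the best re-scored hypothesis; the data layout and names are assumptions for the example.
      def rescore_nbest(nbest, classify, rescoring_algorithms):
          """nbest: list of (hypothesis, confidence); rescoring_algorithms: {class_label: rescorer}."""
          label = classify(nbest)                          # classify the input from its N-best list
          rescored = rescoring_algorithms[label](nbest)    # apply the algorithm tuned for that class
          return max(rescored, key=lambda hc: hc[1])[0]    # hypothesis with the best re-scored confidence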
  • Publication number: 20080189109
    Abstract: Boundary points for speech in an audio signal are determined based on posterior probabilities for the boundary points given a set of possible segmentations of the audio signal. The boundary point posterior probability is determined based on a set of level posterior probabilities that each provide the probability of a sequence of feature vectors given one of the segmentations in the set of possible segmentations.
    Type: Application
    Filed: February 5, 2007
    Publication date: August 7, 2008
    Applicant: Microsoft Corporation
    Inventors: Yu Shi, Frank Kao-Ping Soong
  • Patent number: 7409342
    Abstract: A speech recognizing device. Natural speech recognizing means recognizes speech input in an application program by dictation. Recognition result converting means converts a recognition result from said natural speech recognizing means into a final recognition result processable by said application program on the basis of a grammar to be used for recognizing said input speech in a grammar method. The recognition result converting means further comprises candidate sentence generating means for evolving said grammar to generate candidate sentences that are candidates for said final recognition result; and matching means for selecting a candidate sentence as said final recognition result among the candidate sentences by matching said candidate sentences generated by said candidate sentence generating means against the recognition result by said natural speech recognizing means.
    Type: Grant
    Filed: March 31, 2004
    Date of Patent: August 5, 2008
    Assignee: International Business Machines Corporation
    Inventors: Hiroaki Kashima, Yoshinori Tahara, Daisuke Tomoda
  • Patent number: 7406416
    Abstract: A method and apparatus are provided for storing parameters of a deleted interpolation language model as parameters of a backoff language model. In particular, the parameters of the deleted interpolation language model are stored in the standard ARPA format. Under one embodiment, the deleted interpolation language model parameters are formed using fractional counts.
    Type: Grant
    Filed: March 26, 2004
    Date of Patent: July 29, 2008
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero
  • Patent number: 7406408
    Abstract: Method of recognizing phones in speech of any language. Acquire phones for all languages and a set of languages. Acquire a pronunciation dictionary, a transcript of speech for the set of languages, and speech for the transcript. Receive speech containing unknown phones. If the speech's language is unknown, compare it to the phones for all languages to determine the phones. If the language is known but no phones were acquired in that language, compare the speech to the phones for all languages to determine the phones. If phones were acquired in the speech's language but no corresponding pronunciation dictionary was acquired, compare the speech to the phones for all languages to determine the phones. If a pronunciation dictionary was acquired for the phones in the speech's language but no transcript was acquired then compare the speech to the phones for all languages to determine the phones.
    Type: Grant
    Filed: August 24, 2004
    Date of Patent: July 29, 2008
    Assignee: The United States of America as represented by the Director, National Security Agency
    Inventors: Bradley C. Lackey, Patrick J. Schone, Brenton D. Walker
  • Publication number: 20080154595
    Abstract: A system and method for classifying a voice signal to one of a set of predefined categories, based upon a statistical analysis of features extracted from the voice signal. The system includes an acoustic processor and a classifier. The acoustic processor extracts features that are characteristic of the voice signal and generates feature vectors using the extracted spectral features. The classifier uses the feature vectors to compute the probability that the voice signal belongs to each of the predefined categories and classifies the voice signal to a predefined category that is associated with the highest probability.
    Type: Application
    Filed: March 4, 2008
    Publication date: June 26, 2008
    Applicant: International Business Machines Corporation
    Inventor: Israel Nelken
  • Patent number: 7389230
    Abstract: A system and method for classifying a voice signal to one of a set of predefined categories, based upon a statistical analysis of features extracted from the voice signal. The system includes an acoustic processor and a classifier. The acoustic processor extracts features that are characteristic of the voice signal and generates feature vectors using the extracted spectral features. The classifier uses the feature vectors to compute the probability that the voice signal belongs to each of the predefined categories and classifies the voice signal to a predefined category that is associated with the highest probability.
    Type: Grant
    Filed: April 22, 2003
    Date of Patent: June 17, 2008
    Assignee: International Business Machines Corporation
    Inventor: Israel Nelken
  • Publication number: 20080140399
    Abstract: Provided is a method and system for high-speed speech recognition. On the basis of a continuous density hidden Markov model (CDHMM) using a Gaussian mixture model (GMM) for the observation probability, the method and system add only the K Gaussian components that contribute most to a state-specific observation probability for an input feature vector and use the result as that observation probability. Thus, with respect to the recognition rate, the degree of approximation of the state-specific observation probability increases, thereby minimizing deterioration of speech recognition performance. With respect to the amount of computation, the number of addition operations required for computing an observation probability is reduced in comparison with conventional speech recognition, which adds all Gaussian probabilities of an input feature vector and uses the sum as the state-specific observation probability; the total amount of computation required for speech recognition is thereby reduced. (An illustrative sketch follows this entry.)
    Type: Application
    Filed: July 30, 2007
    Publication date: June 12, 2008
    Inventor: Hoon Chung
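    Illustrative sketch (not part of the patent application): approximating a GMM state observation log-probability from only the K largest-contributing diagonal-Gaussian components; the shapes, names, and value of K are assumptions for the example.
      import numpy as np

      def topk_observation_logprob(x, weights, means, variances, k=4):
          """x: (D,) feature vector; weights: (M,); means, variances: (M, D) diagonal Gaussians."""
          log_comp = (np.log(weights)
                      - 0.5 * np.sum(np.log(2 * np.pi * variances)
                                     + (x - means) ** 2 / variances, axis=1))
          top = np.sort(log_comp)[-k:]                     # keep the K largest log-contributions
          m = top.max()
          return m + np.log(np.sum(np.exp(top - m)))       # log-sum-exp over the selected components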
  • Patent number: 7386438
    Abstract: A system and method for identifying language attributes through probabilistic analysis is described. A set of language classes and a plurality of training documents are defined. Each language class identifies a language and a character set encoding. Occurrences of one or more document properties within each training document are evaluated. For each language class, a probability for the document property set conditioned on the occurrence of the language class is calculated. Byte occurrences within each training document are evaluated. For each language class, a probability for the byte occurrences conditioned on the occurrence of the language class is calculated. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: August 4, 2003
    Date of Patent: June 10, 2008
    Assignee: Google Inc.
    Inventors: Alexander Franz, Brian Milch, Eric Jackson, Jenny Zhou, Benjamin Diament
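    Illustrative sketch (not part of the patent): a naive-Bayes-style byte model per (language, encoding) class; the smoothing and names are assumptions for the example, and the document-property features described in the abstract are omitted.
      import math
      from collections import Counter

      def train_byte_model(training_documents):
          """training_documents: list of byte strings for one language class; add-one smoothed log-probs."""
          counts = Counter(b for doc in training_documents for b in doc)
          total = sum(counts.values()) + 256
          return {b: math.log((counts.get(b, 0) + 1) / total) for b in range(256)}

      def classify(document, class_models, class_log_priors):
          """Pick the (language, character set encoding) class with the highest score for the document bytes."""
          def score(cls):
              return class_log_priors[cls] + sum(class_models[cls][b] for b in document)
          return max(class_models, key=score)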
  • Patent number: 7379867
    Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
    Type: Grant
    Filed: June 3, 2003
    Date of Patent: May 27, 2008
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Alejandro Acero, Milind Mahajan
  • Publication number: 20080097758
    Abstract: An opinion system infers the opinion of a sentence of a product review based on a probability that the sentence contains certain sequences of parts of speech that are commonly used to express an opinion as indicated by the training data and the probabilities of the training data. When provided with the sentence, the opinion system identifies possible sequences of parts of speech of the sentence that are commonly used to express an opinion and the probability that the sequence is the correct sequence for the sentence. For each sequence, the opinion system then retrieves a probability derived from the training data that the sequence contains an opinion word that expresses an opinion. The opinion system then retrieves a probability from the training data that the opinion words of the sentence are used to express an opinion. The opinion system then combines the probabilities to generate an overall probability that the sentence with that sequence expresses an opinion.
    Type: Application
    Filed: October 23, 2006
    Publication date: April 24, 2008
    Applicant: Microsoft Corporation
    Inventors: Hua Li, Jian-Lai Zhou, Zheng Chen, Jian Wang, Dongmei Zhang
  • Publication number: 20080091424
    Abstract: Hidden Markov Model (HMM) parameters are updated using update equations based on growth transformation optimization of a minimum classification error objective function. Using the list of N-best competitor word sequences obtained by decoding the training data with the current-iteration HMM parameters, the current HMM parameters are updated iteratively. The updating procedure involves using weights for each competitor word sequence that can take any positive real value. The updating procedure is further extended to the case where a decoded lattice of competitors is used. In this case, updating the model parameters relies on determining the probability for a state at a time point based on the word that spans the time point instead of the entire word sequence. This word-bound span of time is shorter than the duration of the entire word sequence and thus reduces the computing time.
    Type: Application
    Filed: October 16, 2006
    Publication date: April 17, 2008
    Applicant: Microsoft Corporation
    Inventors: Xiaodong He, Li Deng
  • Patent number: 7356466
    Abstract: A method and apparatus for calculating an observation probability includes a first operation unit that subtracts the mean of a first plurality of parameters of an input voice signal from a second parameter of the input voice signal and multiplies the subtraction result to obtain a first output. The first output is squared and accumulated N times in a second operation unit to obtain a second output. A third operation unit subtracts a given weighted value from the second output to obtain a third output, and a comparator stores the third output in order to extract L outputs therefrom, storing the L extracted outputs in order of magnitude. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: April 8, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Byung-Ho Min, Tae-Su Kim, Hyun-Woo Park, Ho-Rang Jang, Keun-Cheol Hong, Sung-Jae Kim
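    Illustrative sketch (not part of the patent): one reading of the described pipeline as a diagonal-Gaussian distance computation that keeps the L best values; the interpretation of the multiply step and all names are assumptions for the example.
      import numpy as np

      def top_l_outputs(x, means, inverse_stddevs, weights, l=4):
          """x: (N,) parameters of the input voice signal; means, inverse_stddevs: per-component (M, N)."""
          first = (x - means) * inverse_stddevs        # subtract the mean, then multiply (first output)
          second = np.sum(first ** 2, axis=1)          # square and accumulate N times (second output)
          third = second - weights                     # subtract the given weighted value (third output)
          return np.sort(third)[:l]                    # keep L outputs, ordered by magnitude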
  • Patent number: 7346509
    Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
    Type: Grant
    Filed: September 26, 2003
    Date of Patent: March 18, 2008
    Assignee: Callminer, Inc.
    Inventor: Jeffrey A. Gallino
  • Publication number: 20080040111
    Abstract: A device of the present invention obtains a character string of a speech recognition result and a confidence factor thereof. A time monitor monitors time and determines whether or not processing is delayed by checking the confidence factor and time status. When the processing is not delayed, a checker is asked to perform manual judgment. In this event, speech is processed and the manual judgment of the speech recognition result is performed on the basis of the processed speech. When the processing is delayed, automatic judgment is performed by use of the confidence factor. When the character string is judged to be correct as a result of the manual judgment or the automatic judgment, the character string is displayed as a confirmed character string. When the character string is judged to be incorrect, automatic correction is performed by matching on the basis of a next candidate obtained by the speech recognition, texts and attributes of the presentation, a script text, and the like.
    Type: Application
    Filed: March 21, 2007
    Publication date: February 14, 2008
    Inventors: Kohtaroh Miyamoto, Kenichi Arakawa, Toshiya Ohgane
  • Patent number: 7324940
    Abstract: Systems and methods for determining a confidence score associated with a decoding output of a speech recognition engine. In one embodiment, a method of determining the confidence score comprises arranging time frame and acoustic score data into an array, determining a phoneme sequence in the array that yields the highest sum of acoustic scores under certain constraints, e.g., minimum number of time frames and order of phonemes in a phoneme string. A relative score is derived by applying a functional relationship between the acoustic score and different sums comprising acoustic scores from the array. The confidence score, in some embodiments, depends at least in part on the relative score and a measure of ambiguity associated with similar sounding phrases being included in different concepts of a specified grammar.
    Type: Grant
    Filed: February 27, 2004
    Date of Patent: January 29, 2008
    Assignee: Lumen Vox, LLC
    Inventors: Edward S. Miller, James F. Blake, II, Kyle N. Danielson, Keith C. Herold
  • Patent number: 7324927
    Abstract: A method to select features for maximum entropy modeling in which the gains for all candidate features are determined during an initialization stage and gains for only top-ranked features are determined during each feature selection stage. The candidate features are ranked in an ordered list based on the determined gains, a top-ranked feature in the ordered list with a highest gain is selected, and the model is adjusted using the selected top-ranked feature.
    Type: Grant
    Filed: July 3, 2003
    Date of Patent: January 29, 2008
    Assignees: Robert Bosch GmbH, The Board Of Trustees Of The Leland Stanford Junior University
    Inventors: Fuliang Weng, Yaqian Zhou
  • Patent number: 7318028
    Abstract: For determining an estimate of the need for information units for encoding a signal, a measure of the distribution of the energy within a frequency band is taken into account in addition to the admissible interference for the frequency band and the energy of the frequency band. With this, a better estimate of the need for information units is obtained, so that coding can be done more efficiently and more accurately.
    Type: Grant
    Filed: August 31, 2006
    Date of Patent: January 8, 2008
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Michael Schug, Johannes Hilpert, Stefan Geyersberger, Max Neuendorf
  • Patent number: 7310601
    Abstract: The present invention provides a speech recognition apparatus which appropriately performs speech recognition by generating, in real time, language models adapted to a new topic even in the case where topics are changed.
    Type: Grant
    Filed: December 8, 2005
    Date of Patent: December 18, 2007
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Makoto Nishizaki, Yoshihisa Nakatoh, Maki Yamada, Shinichi Yoshizawa
  • Patent number: 7310599
    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. Aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.
    Type: Grant
    Filed: July 20, 2005
    Date of Patent: December 18, 2007
    Assignee: Microsoft Corporation
    Inventors: Brendan J. Frey, Alejandro Acero, Li Deng
  • Patent number: 7310600
    Abstract: A dynamic programming technique is provided for matching two sequences of phonemes, both of which may be generated from text or speech. The scoring of the dynamic programming matching technique uses phoneme confusion scores, phoneme insertion scores and phoneme deletion scores which are obtained in advance in a training session and, if appropriate, confidence data generated by a recognition system if the sequences are generated from speech. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: December 18, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
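    Illustrative sketch (not part of the patent): a dynamic-programming alignment over two phoneme sequences using pre-trained confusion, insertion, and deletion scores, where larger scores are assumed to be better; the names and score conventions are assumptions for the example.
      def align_phonemes(seq_a, seq_b, confusion, insertion, deletion):
          """confusion[(a, b)], insertion[b], deletion[a]: phoneme-level scores learned in training."""
          rows, cols = len(seq_a) + 1, len(seq_b) + 1
          score = [[0.0] * cols for _ in range(rows)]
          for i in range(1, rows):                     # leading deletions
              score[i][0] = score[i - 1][0] + deletion[seq_a[i - 1]]
          for j in range(1, cols):                     # leading insertions
              score[0][j] = score[0][j - 1] + insertion[seq_b[j - 1]]
          for i in range(1, rows):
              for j in range(1, cols):
                  score[i][j] = max(
                      score[i - 1][j - 1] + confusion[(seq_a[i - 1], seq_b[j - 1])],
                      score[i - 1][j] + deletion[seq_a[i - 1]],
                      score[i][j - 1] + insertion[seq_b[j - 1]])
          return score[-1][-1]                         # overall match score of the two sequences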
  • Patent number: 7299187
    Abstract: When a user issued voice command does not match grammars registered in advance, the voice command is identified as a sentence (step S305). This sentence is compared with the registered grammars to calculate a similarity (step S307). When the similarity is higher than a first threshold value (TH1), the voice command is executed (step S315). When the similarity is equal to or lower than the first threshold value (TH1) and higher than a second threshold value (TH2), command choices are displayed for the user and the user is permitted to select a command to be executed (step S319). When the similarity is equal to or lower than the second threshold value (TH2), the command is not executed (step S321). Furthermore, once a command has been executed it is added as a grammar, so that it can be identified when next it is used.
    Type: Grant
    Filed: February 10, 2003
    Date of Patent: November 20, 2007
    Assignee: International Business Machines Corporation
    Inventors: Yoshinori Tahara, Daisuke Tomoda, Kikuo Mitsubo, Yoshinori Atake
  • Patent number: 7295978
    Abstract: A system for recognizing speech receives an input speech vector and identifies a Gaussian distribution. The system determines an address from the input speech vector (610) and uses the address to retrieve a distance value for the Gaussian distribution from a table (620). The system then determines the probability of the Gaussian distribution using the distance value (630) and recognizes the input speech vector based on the determined probability (640).
    Type: Grant
    Filed: September 5, 2000
    Date of Patent: November 13, 2007
    Assignees: Verizon Corporate Services Group Inc., BBN Technologies Corp.
    Inventors: Richard Mark Schwartz, Jason Charles Davenport, James Donald Van Sciver, Long Nguyen
  • Patent number: 7289956
    Abstract: The present invention employs user modeling to model a user's behavior patterns. The user's behavior patterns are then used to influence named entity (NE) recognition.
    Type: Grant
    Filed: May 27, 2003
    Date of Patent: October 30, 2007
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Peter K. L. Mau, Kuansan Wang, Milind Mahajan, Alejandro Acero
  • Patent number: 7289955
    Abstract: A method and apparatus are provided for determining uncertainty in noise reduction based on a parametric model of speech distortion. The method is first used to reduce noise in a noisy signal. In particular, noise is reduced from a representation of a portion of a noisy signal to produce a representation of a cleaned signal by utilizing an acoustic environment model. The uncertainty associated with the noise reduction process is then computed. In one embodiment, the uncertainty of the noise reduction process is used, in conjunction with the noise-reduced signal, to decode a pattern state.
    Type: Grant
    Filed: December 20, 2006
    Date of Patent: October 30, 2007
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Alejandro Acero, James G. Droppo
  • Patent number: 7280963
    Abstract: A computerized method is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The method includes graphing sets of initial pronunciations; thereafter in an ASR subsystem determining a highest-scoring set of initial pronunciations; generating sets of alternate pronunciations, wherein each set of alternate pronunciations includes the highest-scoring set of initial pronunciations with a lowest-probability phone of the highest-scoring initial pronunciation substituted with a unique-substitute phone; graphing the sets of alternate pronunciations; determining in the ASR subsystem a highest-scoring set of alternate pronunciations; and adding to a pronunciation dictionary the highest-scoring set of alternate pronunciations.
    Type: Grant
    Filed: September 12, 2003
    Date of Patent: October 9, 2007
    Assignee: Nuance Communications, Inc.
    Inventors: Francoise Beaufays, Ananth Sankar, Mitchel Weintraub, Shaun Williams
  • Publication number: 20070225980
    Abstract: A speech recognition apparatus includes a first-candidate selecting unit that selects a recognition result of a first speech from first recognition candidates based on the likelihood of the first recognition candidates; a second-candidate selecting unit that extracts recognition candidates of an object word contained in the first speech and recognition candidates of a clue word from second recognition candidates, acquires the relevance ratio associated with the semantic relation between the extracted recognition candidates of the object word and the extracted recognition candidates of the clue word, and selects a recognition result of the second speech based on the acquired relevance ratio; a correction-portion identifying unit that identifies a portion corresponding to the object word in the first speech; and a correcting unit that corrects the word on the identified portion.
    Type: Application
    Filed: March 1, 2007
    Publication date: September 27, 2007
    Inventor: Kazuo Sumita
  • Patent number: 7269560
    Abstract: A system and method facilitating speech detection and/or enhancement utilizing audio/video fusion is provided. The present invention fuses audio and video in a probabilistic generative model that implements cross-modal, self-supervised learning, enabling rapid adaptation to audio-visual data. The system can learn to detect and enhance speech in noise given only a short (e.g., 30 second) sequence of audio-visual data. In addition, it automatically learns to track the lips as they move around in the video.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: September 11, 2007
    Assignee: Microsoft Corporation
    Inventors: John R. Hershey, Trausti Thor Kristjansson, Hagai Attias, Nebojsa Jojic