Hidden Markov Model (HMM) (EPO) Patents (Class 704/256.1)
  • Patent number: 7424427
    Abstract: An audio classification system classifies sounds in an audio stream as belonging to one of a relatively small number of classes. The audio classification system includes a signal analysis component [301] and a decoder [302]. The decoder [302] includes a number of models [310-316] for performing the audio classifications. In one implementation, the possible classifications include: vowels, fricatives, narrowband, wideband, coughing, gender, and silence. The classified audio may be used to enhance speech recognition of the audio stream.
    Type: Grant
    Filed: October 16, 2003
    Date of Patent: September 9, 2008
    Assignees: Verizon Corporate Services Group Inc., BBN Technologies Corp.
    Inventors: Daben Liu, Francis G. Kubala
  • Publication number: 20080140403
    Abstract: Improvement in the reliability of segmentation of a signal, such as an ECG signal, is disclosed through the use of duration constraints. The signal is analysed using a hidden Markov model. The duration constraints specify minimum allowed durations for specific states of the model. The duration constraints can be incorporated either in the model itself or in a Viterbi algorithm used to compute the most probable state sequence given a conventional model. Also disclosed is the derivation of a confidence measure from the model which can be used to assess the quality and robustness of the segmentation and to identify any signals for which the segmentation is unreliable, for example due to the presence of noise or abnormality in the signal.
    Type: Application
    Filed: May 6, 2005
    Publication date: June 12, 2008
    Applicant: ISIS INNOVATION LIMITED
    Inventors: Nicholas Hughes, Lionel Tarassenko, Stephen Roberts
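A minimal sketch of the second option named in the abstract above, placing the minimum-duration constraint in the Viterbi search rather than in the model. It is an illustration under stated assumptions, not the method claimed in the application: the function name, the log-probability inputs, and the state-expansion trick (tracking how long the path has stayed in a state, capped at the minimum) are all hypothetical choices. It assumes at least two states and also requires the final segment to meet its minimum.

```python
import math

def viterbi_min_duration(log_init, log_trans, log_emit, min_dur):
    """Best state path in which every visit to state s lasts >= min_dur[s] frames.

    log_init[s], log_trans[r][s] and log_emit[t][s] are log-probabilities.
    Hypothetical helper; assumes at least two states.
    """
    n = len(log_init)
    # Expanded state (s, d): in state s for d consecutive frames, d capped at min_dur[s].
    states = [(s, d) for s in range(n) for d in range(1, min_dur[s] + 1)]
    score = {e: -math.inf for e in states}
    for s in range(n):
        score[(s, 1)] = log_init[s] + log_emit[0][s]
    backptrs = []
    for t in range(1, len(log_emit)):
        new_score, ptr = {}, {}
        for s, d in states:
            cands = []
            if d == 1:
                # Entering s is only allowed from a state that has served its minimum.
                cands += [((r, min_dur[r]), log_trans[r][s]) for r in range(n) if r != s]
            else:
                # Still serving the minimum: must have been in s one frame less.
                cands.append(((s, d - 1), log_trans[s][s]))
            if d == min_dur[s]:
                # Minimum already served: ordinary self-loop at the cap.
                cands.append(((s, d), log_trans[s][s]))
            prev, step = max(cands, key=lambda c: score[c[0]] + c[1])
            new_score[(s, d)] = score[prev] + step + log_emit[t][s]
            ptr[(s, d)] = prev
        score = new_score
        backptrs.append(ptr)
    # The last segment must also have served its minimum duration.
    last = max(((s, min_dur[s]) for s in range(n)), key=lambda e: score[e])
    if score[last] == -math.inf:
        raise ValueError("no state path satisfies the duration constraints")
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return [s for s, _ in path], score[last]

# Toy data (made up): frame 2 looks like state 1, every other frame like state 0.
log = math.log
init = [log(0.5), log(0.5)]
trans = [[log(0.8), log(0.2)], [log(0.2), log(0.8)]]
favoured = [0, 0, 1, 0, 0, 0]
emit = [[log(0.99) if s == f else log(0.01) for s in range(2)] for f in favoured]

free_path, _ = viterbi_min_duration(init, trans, emit, [1, 1])
held_path, _ = viterbi_min_duration(init, trans, emit, [2, 2])
```

With no constraint (`min_dur` of 1 everywhere) this reduces to ordinary Viterbi decoding; with a minimum of two frames per state, the one-frame excursion in the toy data can no longer be selected.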
  • Publication number: 20080140404
    Abstract: A method of equalization used to estimate a transmitted signal given a received output is presented herein. The equalization method involves modeling a transmission channel as a Hidden Markov Model (HMM). The HMM channel is evaluated as a finite state machine. A Markov Chain Monte Carlo technique of sampling and computation is then utilized to estimate the transmitted signal.
    Type: Application
    Filed: October 25, 2007
    Publication date: June 12, 2008
    Inventors: Henk Wymeersch, Moe Z. Win, Faisal M. Kashif
  • Publication number: 20080114595
    Abstract: An automatic speech recognition method for identifying words from an input speech signal includes providing at least one hypothesis recognition based on the input speech signal, the hypothesis recognition being an individual hypothesis word or a sequence of individual hypothesis words, and computing a confidence measure for the hypothesis recognition, based on the input speech signal, wherein computing a confidence measure includes computing differential contributions to the confidence measure, each as a difference between a constrained acoustic score and an unconstrained acoustic score, weighting each differential contribution by applying thereto a cumulative distribution function of the differential contribution, so as to make the distributions of the confidence measures homogeneous in terms of rejection capability, as the language, vocabulary and grammar vary, and computing the confidence measure by averaging the weighted differential contributions.
    Type: Application
    Filed: December 28, 2004
    Publication date: May 15, 2008
    Inventors: Claudio Vair, Daniele Colibro
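One possible reading of the confidence computation described in the abstract above, sketched in a few lines. The abstract does not say how the cumulative distribution function is obtained, so the empirical CDF built from a calibration sample, along with the function names and the toy scores, are assumptions made purely for illustration.

```python
from bisect import bisect_right

def empirical_cdf(samples):
    """Return F(x) = fraction of calibration samples <= x (assumed calibration step)."""
    ordered = sorted(samples)

    def cdf(x):
        return bisect_right(ordered, x) / len(ordered)

    return cdf

def confidence(constrained, unconstrained, cdf):
    """Average of CDF-weighted differential contributions.

    Each differential contribution is a constrained acoustic score minus the
    matching unconstrained acoustic score; the CDF maps it into [0, 1].
    """
    diffs = [c - u for c, u in zip(constrained, unconstrained)]
    weighted = [cdf(d) for d in diffs]
    return sum(weighted) / len(weighted)

# Made-up log-score differences observed on some calibration set.
cdf = empirical_cdf([-4.0, -3.0, -2.0, -1.0])
# Two units of a hypothesis: differences of -1.0 and -4.0.
value = confidence([-10.0, -12.0], [-9.0, -8.0], cdf)
```

Passing each difference through its own distribution function puts all contributions on a common 0-to-1 scale, which is in keeping with the abstract's stated aim of making the measure behave consistently as language, vocabulary and grammar vary.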
  • Patent number: 7353174
    Abstract: The present invention comprises a system and method for effectively implementing a Mandarin Chinese speech recognition dictionary, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may efficiently be implemented by utilizing an allophone and phonemic variation technique. In addition, the foregoing vocabulary dictionary may be implemented by utilizing unified dictionary optimization techniques to provide robust and accurate speech recognition. Furthermore, the vocabulary dictionary may be implemented as an optimized dictionary to accurately recognize either Northern Mandarin Chinese speech or Southern Mandarin Chinese speech during the speech recognition procedure.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: April 1, 2008
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
  • Patent number: 7353173
    Abstract: The present invention comprises a system and method for implementing a Mandarin Chinese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Mandarin Chinese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Mandarin Chinese speech during the speech recognition procedure.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: April 1, 2008
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
  • Patent number: 7353172
    Abstract: The present invention comprises a system and method for implementing a Cantonese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Cantonese phone set. The optimized Cantonese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Cantonese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Cantonese speech during the speech recognition procedure.
    Type: Grant
    Filed: March 24, 2003
    Date of Patent: April 1, 2008
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Michael Emonts, Xavier Menendez-Pidal, Lex Olorenshaw
  • Patent number: 7319960
    Abstract: A speech recognition system uses a phoneme counter to determine the length of a word to be recognized. The result is used to split a lexicon into one or more sub-lexicons containing only words which have the same or similar length to that of the word to be recognized, so restricting the search space significantly. In another aspect, a phoneme counter is used to estimate the number of phonemes in a word so that a transition bias can be calculated. This bias is applied to the transition probabilities between phoneme models in an HNN based recognizer to improve recognition performance for relatively short or long words.
    Type: Grant
    Filed: December 19, 2001
    Date of Patent: January 15, 2008
    Assignee: Nokia Corporation
    Inventors: Soren Riis, Konstantinos Koumpis
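The lexicon-splitting idea in the abstract above can be sketched as a simple length filter. This is only an illustration: the dictionary layout, the `tolerance` parameter, and the toy pronunciations are hypothetical, and the phoneme counter itself (which the patent treats as a separate component) is represented here by a plain integer estimate.

```python
def sub_lexicon(lexicon, estimated_phonemes, tolerance=1):
    """Keep only words whose phoneme count is within `tolerance` of the estimate.

    lexicon maps each word to its phoneme sequence (a list of phoneme labels).
    """
    return {
        word: phones
        for word, phones in lexicon.items()
        if abs(len(phones) - estimated_phonemes) <= tolerance
    }

# Toy lexicon with made-up phoneme labels.
lexicon = {
    "no": ["n", "ow"],
    "yes": ["y", "eh", "s"],
    "seven": ["s", "eh", "v", "ah", "n"],
    "telephone": ["t", "eh", "l", "ah", "f", "ow", "n"],
}

# Suppose the phoneme counter estimated three phonemes for the spoken word.
candidates = sub_lexicon(lexicon, estimated_phonemes=3)
```

Only the surviving entries would then be passed to the recognizer, which is how the restriction of the search space arises.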
  • Patent number: 7313269
    Abstract: A method learns a structure of a video, in an unsupervised setting, to detect events in the video consistent with the structure. Sets of features are selected from the video. Based on the selected features, a hierarchical statistical model is updated, and an information gain of the hierarchical statistical model is evaluated. Redundant features are then filtered, and the hierarchical statistical model is updated, based on the filtered features. A Bayesian information criterion is applied to each model and feature set pair, which can then be rank ordered according to the criterion to detect the events in the video.
    Type: Grant
    Filed: December 12, 2003
    Date of Patent: December 25, 2007
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Lexing Xie, Ajay Divakaran, Shih-Fu Chang
  • Patent number: 7308443
    Abstract: A query is received. The query may be an object containing temporal information. A query model including static and temporal components is then determined for the object. A weighting for static and temporal components is also determined. The query model is then compared with one or more search models. The search models also include static and temporal components. Search results are then determined based on the comparison. In one embodiment, the comparison may compare the static and temporal components of the query model and the search model. A weighting of the differences between the static and temporal components may be used to determine the ranking for the search results.
    Type: Grant
    Filed: December 23, 2004
    Date of Patent: December 11, 2007
    Assignee: Ricoh Company, Ltd.
    Inventors: Dar-Shyang Lee, Jonathan J. Hull, Gregory J. Wolff
  • Patent number: 7308030
    Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in the video indexing and recognition field, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.
    Type: Grant
    Filed: April 12, 2005
    Date of Patent: December 11, 2007
    Assignees: Samsung Electronics Co., Ltd., The Regents of the University of California
    Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen
  • Patent number: 7254538
    Abstract: The present invention successfully combines neural-net discriminative feature processing with Gaussian-mixture distribution modeling (GMM). By training one or more neural networks to generate subword probability posteriors, then using transformations of these estimates as the base features for a conventionally-trained Gaussian-mixture based system, substantial error rate reductions may be achieved. The present invention effectively has two acoustic models in tandem: first a neural net and then a GMM. By using a variety of combination schemes available for connectionist models, various systems based upon multiple feature streams can be constructed with even greater error rate reductions.
    Type: Grant
    Filed: November 16, 2000
    Date of Patent: August 7, 2007
    Assignee: International Computer Science Institute
    Inventors: Hynek Hermansky, Sangita Sharma, Daniel Ellis
  • Patent number: 7231019
    Abstract: A method and apparatus are provided for identifying a caller of a call from the caller to a recipient. A voice input is received from the caller, and characteristics of the voice input are applied to a plurality of acoustic models, which include a generic acoustic model and acoustic models of any previously identified callers, to obtain a plurality of respective acoustic scores. The caller is identified as one of the previously identified callers or as a new caller based on the plurality of acoustic scores. If the caller is identified as a new caller, a new acoustic model is generated for the new caller, which is specific to the new caller.
    Type: Grant
    Filed: February 12, 2004
    Date of Patent: June 12, 2007
    Assignee: Microsoft Corporation
    Inventor: Andrei Pascovici
  • Patent number: 7203368
    Abstract: A pattern recognition procedure forms a hierarchical statistical model using a hidden Markov model and a coupled hidden Markov model. The hierarchical statistical model supports a parent layer having multiple supernodes and a child layer having multiple nodes associated with each supernode of the parent layer. After training, the hierarchical statistical model uses observation vectors extracted from a data set to find a substantially optimal state sequence segmentation.
    Type: Grant
    Filed: January 6, 2003
    Date of Patent: April 10, 2007
    Assignee: Intel Corporation
    Inventor: Ara V. Nefian
  • Patent number: 7181399
    Abstract: A system for recognizing connected digits in natural spoken dialogue includes a speech recognition processor that receives unconstrained fluent input speech and produces a string of words that can include a numeric language, and a numeric understanding processor that converts the string of words into a sequence of digits based on a set of rules. An acoustic model database utilized by the speech recognition processor includes a first set of hidden Markov models that characterize the acoustic features of numeric words and phrases, a second set of hidden Markov models that characterize the acoustic features of the remaining vocabulary words, and a filler model that characterizes the acoustic features of out-of-vocabulary utterances. An utterance verification processor verifies the accuracy of the string of words. A validation database stores a grammar, and a string validation processor outputs validity information based on a comparison of the sequence of digits with the grammar.
    Type: Grant
    Filed: May 19, 1999
    Date of Patent: February 20, 2007
    Assignee: AT&T Corp.
    Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
  • Patent number: 7171043
    Abstract: An image processing system useful for facial recognition and security identification obtains an array of observation vectors from a facial image to be identified. A Viterbi algorithm is applied to the observation vectors given the parameters of a hierarchical statistical model for each object, and a face is identified by finding a highest matching score between an observation sequence and the hierarchical statistical model.
    Type: Grant
    Filed: October 11, 2002
    Date of Patent: January 30, 2007
    Assignee: Intel Corporation
    Inventor: Ara V. Nefian
  • Patent number: 7165029
    Abstract: A speech recognition method includes use of synchronous or asynchronous audio and video data to enhance speech recognition probabilities. A two stream coupled hidden Markov model is trained and used to identify speech. At least one stream is derived from audio data and a second stream is derived from mouth pattern data. Gestural or other suitable data streams can optionally be combined to reduce speech recognition error rates in noisy environments.
    Type: Grant
    Filed: May 9, 2002
    Date of Patent: January 16, 2007
    Assignee: Intel Corporation
    Inventor: Ara V. Nefian
  • Patent number: 7076102
    Abstract: A method and apparatus are disclosed for automatically learning and identifying events in image data using hierarchical HMMs to define and detect one or more events. The hierarchical HMMs include multiple paths that encompass variations of the same event. Hierarchical HMMs provide a framework for defining events that may be exhibited in various ways. Each event is modeled in the hierarchical HMM with a set of sequential states that describe the paths in a high-dimensional feature space. These models can then be used to analyze video sequences to segment and recognize each individual event to be recognized. The hierarchical HMM is generated during a training phase, by processing a number of images of the event of interest in various ways, typically observed from multiple viewpoints.
    Type: Grant
    Filed: June 27, 2002
    Date of Patent: July 11, 2006
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Yun-Ting Lin, Srinivas Gutta, Tomas Brodsky, Vasanth Philomin
  • Patent number: 6961703
    Abstract: A speech verification process involves comparison of enrollment and test speech data and an improved method of comparing the data is disclosed, wherein segmented frames of speech are analyzed jointly, rather than independently. The enrollment and test speech are both subjected to a feature extraction process to derive fixed-length feature vectors, and the feature vectors are compared, using a linear discriminant analysis and having no dependence upon the order of the words spoken or the speaking rate. The discriminant analysis is made possible, despite a relatively high dimensionality of the feature vectors, by a mathematical procedure provided for finding an eigenvector to simultaneously diagonalize the between-speaker and between-channel covariances of the enrollment and test data.
    Type: Grant
    Filed: September 13, 2000
    Date of Patent: November 1, 2005
    Assignee: ITT Manufacturing Enterprises, Inc.
    Inventors: Alan Lawrence Higgins, Lawrence George Bahler