Hidden Markov Model (HMM) (EPO) Patents (Class 704/256.1)

Training of HMM (EPO) (Class 704/256.2)

With insufficient amount of training data, e.g., state sharing, tying, deleted interpolation (EPO) (Class 704/256.3)

Duration modeling in HMM, e.g., semi HMM, segmental models, transition probabilities (EPO) (Class 704/256.4)

Hidden Markov (HM) network (EPO) (Class 704/256.5)

State emission probability (EPO) (Class 704/256.6)

Continuous density, e.g., Gaussian distribution, Laplace (EPO) (Class 704/256.7)
Discrete density, e.g., Vector Quantization preprocessor, look up tables (EPO) (Class 704/256.8)

Systems and methods for classifying audio into broad phoneme classes

Patent number: 7424427

Abstract: An audio classification system classifies sounds in an audio stream as belonging to one of a relatively small number of classes. The audio classification system includes a signal analysis component [301] and a decoder [302]. The decoder [302] includes a number of models [310-316] for performing the audio classifications. In one implementation, the possible classifications include: vowels, fricatives, narrowband, wideband, coughing, gender, and silence. The classified audio may be used to enhance speech recognition of the audio stream.

Type: Grant

Filed: October 16, 2003

Date of Patent: September 9, 2008

Assignees: Verizon Corporate Services Group Inc., BBN Technologies Corp.

Inventors: Daben Liu, Francis G. Kubala
Signal Analysis Method

Publication number: 20080140403

Abstract: Improvement in the reliability of segmentation of a signal, such as an ECG signal, is disclosed through the use of duration constraints. The signal is analysed using a hidden Markov model. The duration constraints specify minimum allowed durations for specific states of the model. The duration constraints can be incorporated either in the model itself or in a Viterbi algorithm used to compute the most probable state sequence given a conventional model. Also disclosed is the derivation of a confidence measure from the model which can be used to assess the quality and robustness of the segmentation and to identify any signals for which the segmentation is unreliable, for example due to the presence of noise or abnormality in the signal.

Type: Application

Filed: May 6, 2005

Publication date: June 12, 2008

Applicant: ISIS INNOVATION LIMITED

Inventors: Nicholas Hughes, Lionel Tarassenko, Stephen Roberts
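
The duration constraints described in the Signal Analysis Method abstract above can be illustrated with a minimal sketch. It assumes the "in the model itself" variant: each state is replaced by a chain of sub-states that must be traversed before the state can be left, and an ordinary log-domain Viterbi pass is run on the expanded model. The function names (`viterbi`, `expand_min_duration`, `segment`) and the data layout are hypothetical and are not taken from the application.

```python
import math

NEG_INF = -math.inf


def viterbi(log_init, log_trans, log_emit):
    """Standard log-domain Viterbi; log_emit[t][s] scores frame t in state s."""
    n = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n)]
    back = []
    for t in range(1, len(log_emit)):
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log_trans[r][s])
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
            ptr.append(best)
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):  # trace the best path backwards
        state = ptr[state]
        path.append(state)
    return path[::-1]


def expand_min_duration(log_init, log_trans, min_dur):
    """Replace state s by a left-to-right chain of min_dur[s] >= 1 sub-states."""
    n = len(min_dur)
    owner = [s for s, d in enumerate(min_dur) for _ in range(d)]
    first = [owner.index(s) for s in range(n)]
    last = [first[s] + min_dur[s] - 1 for s in range(n)]
    m = len(owner)
    init = [NEG_INF] * m
    trans = [[NEG_INF] * m for _ in range(m)]
    for s in range(n):
        init[first[s]] = log_init[s]          # a state is always entered at its first sub-state
        for k in range(first[s], last[s]):
            trans[k][k + 1] = 0.0             # forced step through the chain (probability 1)
        for r in range(n):
            # only the last sub-state keeps the original self-loop and exits
            target = last[s] if r == s else first[r]
            trans[last[s]][target] = log_trans[s][r]
    return init, trans, owner


def segment(log_init, log_trans, log_emit, min_dur):
    """Most probable state sequence under per-state minimum durations."""
    init, trans, owner = expand_min_duration(log_init, log_trans, min_dur)
    emit = [[frame[owner[k]] for k in range(len(owner))] for frame in log_emit]
    return [owner[k] for k in viterbi(init, trans, emit)]
```

In this sketch only the final segment of a sequence can be shorter than its minimum duration, because the observation sequence may end while the path is still inside a chain. The confidence measure mentioned in the abstract is not covered here.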
Method and apparatus for determining inputs to a finite state system

Publication number: 20080140404

Abstract: A method of equalization used to estimate a transmitted signal given a received output is presented herein. The equalization method involves modeling a transmission channel as a Hidden Markov Model (HMM). The HMM channel is evaluated as a finite state machine. A Markov Chain Monte Carlo technique of sampling and computation is then utilized to estimate the transmitted signal.

Type: Application

Filed: October 25, 2007

Publication date: June 12, 2008

Inventors: Henk Wymeersch, Moe Z. Win, Faisal M. Kashif
Automatic Speech Recognition System and Method

Publication number: 20080114595

Abstract: An automatic speech recognition method for identifying words from an input speech signal includes providing at least one hypothesis recognition based on the input speech signal, the hypothesis recognition being an individual hypothesis word or a sequence of individual hypothesis words, and computing a confidence measure for the hypothesis recognition, based on the input speech signal, wherein computing a confidence measure includes computing differential contributions to the confidence measure, each as a difference between a constrained acoustic score and an unconstrained acoustic score, weighting each differential contribution by applying thereto a cumulative distribution function of the differential contribution, so as to make the distributions of the confidence measures homogeneous in terms of rejection capability, as the language, vocabulary and grammar vary, and computing the confidence measure by averaging the weighted differential contributions.

Type: Application

Filed: December 28, 2004

Publication date: May 15, 2008

Inventors: Claudio Vair, Daniele Colibro
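
The confidence computation outlined in the Automatic Speech Recognition System and Method abstract above can be sketched as follows. This is one possible reading, not the applicants' implementation: it assumes that "applying a cumulative distribution function" means mapping each score difference to its CDF value, and that the CDF is estimated empirically from a reference sample of differences. The helper names `empirical_cdf` and `confidence` are hypothetical.

```python
from bisect import bisect_right


def empirical_cdf(reference_diffs):
    """Build a CDF from a reference sample of score differences (assumed available)."""
    ordered = sorted(reference_diffs)

    def cdf(x):
        # fraction of reference differences that are <= x
        return bisect_right(ordered, x) / len(ordered)

    return cdf


def confidence(constrained_scores, unconstrained_scores, cdf):
    """Average of the CDF-weighted differential contributions."""
    # differential contribution: constrained minus unconstrained acoustic score
    diffs = [c - u for c, u in zip(constrained_scores, unconstrained_scores)]
    weighted = [cdf(d) for d in diffs]
    return sum(weighted) / len(weighted)
```

Because every weighted contribution lies in [0, 1] whatever the raw score scale, a single rejection threshold can in principle be reused across languages, vocabularies, and grammars, which is the homogeneity the abstract aims for.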
System and method for effectively implementing a Mandarin Chinese speech recognition dictionary

Patent number: 7353174

Abstract: The present invention comprises a system and method for effectively implementing a Mandarin Chinese speech recognition dictionary, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may efficiently be implemented by utilizing an allophone and phonemic variation technique. In addition, the foregoing vocabulary dictionary may be implemented by utilizing unified dictionary optimization techniques to provide robust and accurate speech recognition. Furthermore, the vocabulary dictionary may be implemented as an optimized dictionary to accurately recognize either Northern Mandarin Chinese speech or Southern Mandarin Chinese speech during the speech recognition procedure.

Type: Grant

Filed: March 31, 2003

Date of Patent: April 1, 2008

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
System and method for Mandarin Chinese speech recognition using an optimized phone set

Patent number: 7353173

Abstract: The present invention comprises a system and method for implementing a Mandarin Chinese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Mandarin Chinese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Mandarin Chinese speech during the speech recognition procedure.

Type: Grant

Filed: March 31, 2003

Date of Patent: April 1, 2008

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
System and method for Cantonese speech recognition using an optimized phone set

Patent number: 7353172

Abstract: The present invention comprises a system and method for implementing a Cantonese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Cantonese phone set. The optimized Cantonese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Cantonese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Cantonese speech during the speech recognition procedure.

Type: Grant

Filed: March 24, 2003

Date of Patent: April 1, 2008

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Michael Emonts, Xavier Menendez-Pidal, Lex Olorenshaw
Speech recognition method and system

Patent number: 7319960

Abstract: A speech recognition system uses a phoneme counter to determine the length of a word to be recognized. The result is used to split a lexicon into one or more sub-lexicons containing only words which have the same or similar length to that of the word to be recognized, so restricting the search space significantly. In another aspect, a phoneme counter is used to estimate the number of phonemes in a word so that a transition bias can be calculated. This bias is applied to the transition probabilities between phoneme models in an HNN based recognizer to improve recognition performance for relatively short or long words.

Type: Grant

Filed: December 19, 2001

Date of Patent: January 15, 2008

Assignee: Nokia Corporation

Inventors: Soren Riis, Konstantinos Koumpis
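
The lexicon-splitting step in the Speech recognition method and system abstract above can be sketched as follows. The sketch assumes the lexicon is a mapping from words to phoneme lists and that "similar length" means within a fixed tolerance of the estimated phoneme count; the function name `sub_lexicon` and the `tolerance` parameter are hypothetical. The transition bias is not sketched, since the abstract does not say how it is calculated.

```python
def sub_lexicon(lexicon, estimated_phonemes, tolerance=1):
    """Keep only words whose phoneme count is close to the estimated count."""
    return {
        word: phones
        for word, phones in lexicon.items()
        if abs(len(phones) - estimated_phonemes) <= tolerance
    }


# Toy lexicon with made-up phoneme strings, for illustration only.
LEXICON = {
    "no": ["n", "ow"],
    "yes": ["y", "eh", "s"],
    "seven": ["s", "eh", "v", "ax", "n"],
    "telephone": ["t", "eh", "l", "ax", "f", "ow", "n"],
}

candidates = sub_lexicon(LEXICON, estimated_phonemes=3)
```

With an estimate of three phonemes and a tolerance of one, only "no" and "yes" remain as candidates, so the recognizer searches two entries instead of four.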
Unsupervised learning of video structures in videos using hierarchical statistical models to detect events

Patent number: 7313269

Abstract: A method learns a structure of a video, in an unsupervised setting, to detect events in the video consistent with the structure. Sets of features are selected from the video. Based on the selected features, a hierarchical statistical model is updated, and an information gain of the hierarchical statistical model is evaluated. Redundant features are then filtered, and the hierarchical statistical model is updated, based on the filtered features. A Bayesian information criterion is applied to each model and feature set pair, which can then be rank ordered according to the criterion to detect the events in the video.

Type: Grant

Filed: December 12, 2003

Date of Patent: December 25, 2007

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Lexing Xie, Ajay Divakaran, Shih-Fu Chang
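
The ranking step in the unsupervised video structure abstract above can be sketched with the textbook form of the Bayesian information criterion, BIC = k ln(n) - 2 ln(L), where lower values are better. The abstract does not state which form the patent uses, so the formula, the function names, and the tuple layout below are assumptions.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """Textbook BIC: penalizes parameter count, rewards likelihood; lower is better."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


def rank_pairs(pairs):
    """Sort (label, log_likelihood, num_params, num_samples) tuples, best first."""
    return sorted(pairs, key=lambda p: bic(p[1], p[2], p[3]))
```

A pair with a slightly lower likelihood can still rank first when it uses far fewer parameters, which is how the criterion trades model fit against model complexity.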
Techniques for video retrieval based on HMM similarity

Patent number: 7308443

Abstract: A query is received. The query may be an object containing temporal information. A query model including static and temporal components is then determined for the object. A weighting for static and temporal components is also determined. The query model is then compared with one or more search models. The search models also include static and temporal components. Search results are then determined based on the comparison. In one embodiment, the comparison may compare the static and temporal components of the query model and the search model. A weighting of the differences between the static and temporal components may be used to determine the ranking for the search results.

Type: Grant

Filed: December 23, 2004

Date of Patent: December 11, 2007

Assignee: Ricoh Company, Ltd.

Inventors: Dar-Shyang Lee, Jonathan J. Hull, Gregory J. Wolff
Object activity modeling method

Patent number: 7308030

Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in the field of video indexing and recognition, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.

Type: Grant

Filed: April 12, 2005

Date of Patent: December 11, 2007

Assignees: Samsung Electronics Co., Ltd., The Regents of the University of California

Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen
Nonlinear mapping for feature extraction in automatic speech recognition

Patent number: 7254538

Abstract: The present invention successfully combines neural-net discriminative feature processing with Gaussian-mixture distribution modeling (GMM). By training one or more neural networks to generate subword probability posteriors, then using transformations of these estimates as the base features for a conventionally-trained Gaussian-mixture based system, substantial error rate reductions may be achieved. The present invention effectively has two acoustic models in tandem—first a neural net and then a GMM. By using a variety of combination schemes available for connectionist models, various systems based upon multiple features streams can be constructed with even greater error rate reductions.

Type: Grant

Filed: November 16, 2000

Date of Patent: August 7, 2007

Assignee: International Computer Science Institute

Inventors: Hynek Hermansky, Sangita Sharma, Daniel Ellis
Automatic identification of telephone callers based on voice characteristics

Patent number: 7231019

Abstract: A method and apparatus are provided for identifying a caller of a call from the caller to a recipient. A voice input is received from the caller, and characteristics of the voice input are applied to a plurality of acoustic models, which include a generic acoustic model and acoustic models of any previously identified callers, to obtain a plurality of respective acoustic scores. The caller is identified as one of the previously identified callers or as a new caller based on the plurality of acoustic scores. If the caller is identified as a new caller, a new acoustic model is generated for the new caller, which is specific to the new caller.

Type: Grant

Filed: February 12, 2004

Date of Patent: June 12, 2007

Assignee: Microsoft Corporation

Inventor: Andrei Pascovici
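
The decision logic in the caller identification abstract above can be sketched as follows. As a stand-in for real acoustic models, the sketch assumes one-dimensional Gaussian models given as `(mean, variance)` pairs and scalar features; the rule "new caller when no caller model outscores the generic model" is one reading of the abstract. All names (`avg_loglik`, `fit`, `identify_caller`, the `caller_N` labels) are hypothetical.

```python
import math

VAR_FLOOR = 1e-3  # keeps a newly fitted model from collapsing to zero variance


def avg_loglik(frames, mean, var):
    """Average per-frame Gaussian log-likelihood, used as the acoustic score."""
    total = sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in frames
    )
    return total / len(frames)


def fit(frames):
    """Estimate a caller-specific (mean, variance) model from the voice input."""
    mean = sum(frames) / len(frames)
    var = sum((x - mean) ** 2 for x in frames) / len(frames)
    return mean, max(var, VAR_FLOOR)


def identify_caller(frames, generic, callers):
    """Return a known caller's name, or enroll and return a new caller."""
    generic_score = avg_loglik(frames, *generic)
    scores = {name: avg_loglik(frames, *model) for name, model in callers.items()}
    best = max(scores, key=scores.get) if scores else None
    if best is not None and scores[best] > generic_score:
        return best
    # No known caller beats the generic model: treat the voice as a new caller.
    new_name = f"caller_{len(callers) + 1}"
    callers[new_name] = fit(frames)
    return new_name
```

The `callers` dictionary is updated in place, so a voice enrolled on one call is available as a candidate model on the next call.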
Embedded Bayesian network for pattern recognition

Patent number: 7203368

Abstract: A pattern recognition procedure forms a hierarchical statistical model using a hidden Markov model and a coupled hidden Markov model. The hierarchical statistical model supports a parent layer having multiple supernodes and a child layer having multiple nodes associated with each supernode of the parent layer. After training, the hierarchical statistical model uses observation vectors extracted from a data set to find a substantially optimal state sequence segmentation.

Type: Grant

Filed: January 6, 2003

Date of Patent: April 10, 2007

Assignee: Intel Corporation

Inventor: Ara V. Nefian
Recognizing the numeric language in natural spoken dialogue

Patent number: 7181399

Abstract: A system for recognizing connected digits in natural spoken dialogue includes a speech recognition processor that receives unconstrained fluent input speech and produces a string of words that can include a numeric language, and a numeric understanding processor that converts the string of words into a sequence of digits based on a set of rules. An acoustic model database utilized by the speech recognition processor includes a first set of hidden Markov models that characterize the acoustic features of numeric words and phrases, a second set of hidden Markov models that characterize the acoustic features of the remaining vocabulary words, and a filler model that characterizes the acoustic features of out-of-vocabulary utterances. An utterance verification processor verifies the accuracy of the string of words. A validation database stores a grammar, and a string validation processor outputs validity information based on a comparison of the sequence of digits with the grammar.

Type: Grant

Filed: May 19, 1999

Date of Patent: February 20, 2007

Assignee: AT&T Corp.

Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
Image recognition using hidden Markov models and coupled hidden Markov models

Patent number: 7171043

Abstract: An image processing system useful for facial recognition and security identification obtains an array of observation vectors from a facial image to be identified. A Viterbi algorithm is applied to the observation vectors given the parameters of a hierarchical statistical model for each object, and a face is identified by finding a highest matching score between an observation sequence and the hierarchical statistical model.

Type: Grant

Filed: October 11, 2002

Date of Patent: January 30, 2007

Assignee: Intel Corporation

Inventor: Ara V. Nefian
Coupled hidden Markov model for audiovisual speech recognition

Patent number: 7165029

Abstract: A speech recognition method includes use of synchronous or asynchronous audio and video data to enhance speech recognition probabilities. A two-stream coupled hidden Markov model is trained and used to identify speech. At least one stream is derived from audio data and a second stream is derived from mouth pattern data. Gestural or other suitable data streams can optionally be combined to reduce speech recognition error rates in noisy environments.

Type: Grant

Filed: May 9, 2002

Date of Patent: January 16, 2007

Assignee: Intel Corporation

Inventor: Ara V. Nefian
Video monitoring system employing hierarchical hidden Markov model (HMM) event learning and classification

Patent number: 7076102

Abstract: A method and apparatus are disclosed for automatically learning and identifying events in image data using hierarchical HMMs to define and detect one or more events. The hierarchical HMMs include multiple paths that encompass variations of the same event. Hierarchical HMMs provide a framework for defining events that may be exhibited in various ways. Each event is modeled in the hierarchical HMM with a set of sequential states that describe the paths in a high-dimensional feature space. These models can then be used to analyze video sequences to segment and recognize each individual event to be recognized. The hierarchical HMM is generated during a training phase, by processing a number of images of the event of interest in various ways, typically observed from multiple viewpoints.

Type: Grant

Filed: June 27, 2002

Date of Patent: July 11, 2006

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Yun-Ting Lin, Srinivas Gutta, Tomas Brodsky, Vasanth Philomin
Method for speech processing involving whole-utterance modeling

Patent number: 6961703

Abstract: A speech verification process involves comparison of enrollment and test speech data and an improved method of comparing the data is disclosed, wherein segmented frames of speech are analyzed jointly, rather than independently. The enrollment and test speech are both subjected to a feature extraction process to derive fixed-length feature vectors, and the feature vectors are compared, using a linear discriminant analysis and having no dependence upon the order of the words spoken or the speaking rate. The discriminant analysis is made possible, despite a relatively high dimensionality of the feature vectors, by a mathematical procedure provided for finding an eigenvector to simultaneously diagonalize the between-speaker and between-channel covariances of the enrollment and test data.

Type: Grant

Filed: September 13, 2000

Date of Patent: November 1, 2005

Assignee: ITT Manufacturing Enterprises, Inc.

Inventors: Alan Lawrence Higgins, Lawrence George Bahler
