Abstract: A method models voice units and produces reference segments for modeling voice units. The reference segments describe voice modules by characteristic vectors, the characteristic vectors being stored in the order in which they are found in a training voice signal. Alternative characteristic vectors are associated with each characteristic vector. The reference segments for describing the voice modules are combined during the modeling of larger voice units. In the event of identification, the respectively best adapted characteristic vector alternatives are used to determined the distance between a test utterance and the larger vocal units.
Abstract: A speech recognition system uses a phoneme counter to determine the length of a word to be recognized. The result is used to split a lexicon into one or more sub-lexicons containing only words which have the same or similar length to that of the word to be recognized, so restricting the search space significantly. In another aspect, a phoneme counter is used to estimate the number of phonemes in a word so that a transition bias can be calculated. This bias is applied to the transition probabilities between phoneme models in an HNN based recognizer to improve recognition performance for relatively short or long words.
Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The object activity modeling method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector for a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transition. According to the modeling method, in video indexing and recognition field, complex activities such as human activities can be efficiently modeled and recognized without segmenting objects.
Type:
Grant
Filed:
April 12, 2005
Date of Patent:
December 11, 2007
Assignees:
Samsung Electronics Co., Ltd., The Regents of the University of California
Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and a HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of speech segments on the basis of the computed HMMs of the phonemes, and when the phoneme of the segment recognition result is equal to a phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
Abstract: A method of reconstructing a damaged sequence of symbols where some symbols are missing is provided in which statistical parameters of the sequence are used with confidence windowing techniques to quickly and efficiently reconstruct the damaged sequence to its original form. Confidence windowing techniques are provided that are equivalent to generalized hidden semi-Markov models but which are more easily used to determine the most likely missing symbol at a given point in the damaged sequence being reconstructed. The method can be used to reconstruct communications consisting of speech, music, digital transmission symbols and others having a bounded symbol set which can be described by statistical behaviors in the symbol stream.