Normalizing Patents (Class 704/234)
-
Patent number: 6157909
Abstract: A process and device for blind equalization of the effects of a transmission channel on a speech signal. The speech signal is transformed into cepstral vectors which are representative of the speech signal over a given horizon. A reference cepstrum consisting of a constant cepstrum signal representative of the long-term cepstrum of the speech signal is calculated for each cepstral vector. Each of the cepstral vectors is subjected to adaptive filtering by LMS on the basis of the reference cepstrum so as to generate a set of equalized cepstral vectors on the basis of the calculation of an error signal between the reference cepstrum and equalized cepstral vectors. The error signal is expressed as the difference between the reference cepstrum component of a given rank and the component of the same rank of the equalized cepstral vector.
Type: Grant
Filed: July 20, 1998
Date of Patent: December 5, 2000
Assignee: France Telecom
Inventors: Laurent Mauuary, Jean Monne
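The LMS step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes the adaptive filter is a single additive bias per cepstral component, and the function name `lms_equalize` and the step size `mu` are hypothetical. The error is the reference component minus the equalized component of the same rank, as the abstract states.

```python
def lms_equalize(cepstra, reference, mu=0.1):
    """Equalize cepstral vectors toward a constant reference cepstrum.

    Assumption: the adaptive filter is one additive bias term per
    cepstral component (rank), updated by the LMS rule.
    """
    bias = [0.0] * len(reference)
    equalized = []
    for frame in cepstra:
        # Apply the current filter state to the incoming cepstral vector.
        out = [c + b for c, b in zip(frame, bias)]
        # Error per rank: reference component minus equalized component.
        error = [r - o for r, o in zip(reference, out)]
        # LMS update: move the bias a small step along the error.
        bias = [b + mu * e for b, e in zip(bias, error)]
        equalized.append(out)
    return equalized, bias
```

With a stationary channel offset, the bias converges geometrically toward the value that cancels the offset, so later frames sit close to the reference.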
-
Patent number: 6151573
Abstract: A maximum likelihood (ML) linear regression (LR) solution to environment normalization is provided where the environment is modeled as a hidden (non-observable) variable. By application of an expectation maximization algorithm and extension of Baum-Welch forward and backward variables (Steps 23a-23d) a source normalization is achieved such that it is not necessary to label a database in terms of environment such as speaker identity, channel, microphone and noise type.
Type: Grant
Filed: August 15, 1998
Date of Patent: November 21, 2000
Assignee: Texas Instruments Incorporated
Inventor: Yifan Gong
-
Patent number: 6138095
Abstract: Speech recognition in which the log probabilities of the null and alternative hypothesis are computed for an input speech sample by comparison with specific stored speech vocabularies/grammars and with general speech characteristics. The difference in probabilities is normalized by the magnitude of the null hypothesis to derive a likelihood factor which is compared with a rejection threshold that is utterance-length dependent. Advantageously, a high-order polynomial representation of the rejection threshold length dependency may be simplified by a series of piece-wise constants which are stored as rejection thresholds to be selected in accordance with the length of the input speech sample.
Type: Grant
Filed: September 3, 1998
Date of Patent: October 24, 2000
Assignee: Lucent Technologies Inc.
Inventors: Sunil K. Gupta, Frank Kao-Ping Soong
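The rejection test in the abstract above can be illustrated with a short sketch. The length boundaries, threshold values, function names, and the direction of the comparison (accept when the factor meets the threshold) are all assumptions made for illustration; the abstract only states that the normalized difference is compared with a stored, length-dependent, piece-wise constant threshold.

```python
import bisect

# Hypothetical piece-wise constant thresholds, indexed by utterance length.
LENGTH_BOUNDS = [50, 100, 200]         # upper edges of the length bins, in frames
THRESHOLDS = [0.30, 0.20, 0.12, 0.08]  # one threshold per bin (len(bounds) + 1)


def likelihood_factor(log_p_null, log_p_alt):
    """Difference of log probabilities, normalized by the null magnitude."""
    return (log_p_null - log_p_alt) / abs(log_p_null)


def rejection_threshold(num_frames):
    """Select the stored constant for the bin containing num_frames."""
    return THRESHOLDS[bisect.bisect_right(LENGTH_BOUNDS, num_frames)]


def accept(log_p_null, log_p_alt, num_frames):
    """Assumed decision rule: accept when the factor reaches the threshold."""
    return likelihood_factor(log_p_null, log_p_alt) >= rejection_threshold(num_frames)
```

A table lookup of this kind replaces evaluating a high-order polynomial in the utterance length at run time.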
-
Patent number: 6098040
Abstract: The invention relates to a method and apparatus for generating noise-attenuated feature vectors for use in recognizing speech, more particularly to a system and method providing a feature set for speech recognition that is robust to adverse noise conditions. This is done by receiving, through an input, a set of signal frames, at least some containing speech sounds, and then classifying the frames in the set of signal frames into classification groups on the basis of their energy levels. Each classification group is characterized by a mean energy value. In a specific example of implementation, the invention makes use of channel energy values to condition the frames in the set of signal frames. The frames in the set of signal frames are attenuated or noise reduced by altering the energy of the frames on the basis of the frames containing non-speech sounds.
Type: Grant
Filed: November 7, 1997
Date of Patent: August 1, 2000
Assignee: Nortel Networks Corporation
Inventors: Marco Petroni, Steven Douglas Peters
-
Patent number: 6038530
Abstract: In a speech communication network in which a transmitter (1) transmits speech signals via a network (4) to a receiver (8), it can happen that more traffic is offered to the network (4) than it can handle. In order to reduce the network load, at least one node (24) is arranged to perform bitrate reduction to delete some of the prediction parameters representing the speech signal. It can also be the case that a receiver comprises a speech decoder having insufficient computational power available for decoding the encoded speech signal. In such case, the speech decoder is arranged for using only a part of the prediction parameters available. This results in a lower complexity of the synthesis filter (60).
Type: Grant
Filed: February 9, 1998
Date of Patent: March 14, 2000
Assignee: U.S. Philips Corporation
Inventors: Rakesh Taori, Andreas J. Gerrits
-
Patent number: 6032115
Abstract: In the sound recognition apparatus of the present invention, a user's utterance or a sound provided by an output section using previously stored sound waveforms is simultaneously inputted through a basic microphone of known frequency characteristics and an input microphone of unknown frequency characteristics. An analysis section respectively analyzes the frequency of the input speech through the basic microphone and the input microphone. A frequency characteristics calculation section calculates first difference data between the frequencies of the input speech of the basic microphone and the input microphone, and calculates frequency characteristics of the input microphone according to the first difference data and the frequency characteristics of the basic microphone.
Type: Grant
Filed: September 26, 1997
Date of Patent: February 29, 2000
Assignee: Kabushiki Kaisha Toshiba
Inventors: Hiroshi Kanazawa, Takehiko Isaka, Yoshifumi Nagata, Hiroyuki Tsuboi
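The calculation in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: spectra and responses are taken to be per-band log magnitudes in dB, so that microphone effects add, and the function and parameter names are hypothetical.

```python
def estimate_input_mic_response(basic_spectrum_db, input_spectrum_db, basic_response_db):
    """Estimate the unknown microphone's per-band response in dB.

    Assumption: both microphones record the same sound simultaneously and
    all quantities are log magnitudes, so the difference between the two
    recorded spectra equals the difference between the two responses.
    """
    # First difference data: input-microphone spectrum minus basic-microphone spectrum.
    difference = [i - b for i, b in zip(input_spectrum_db, basic_spectrum_db)]
    # Unknown response = known response of the basic microphone + difference.
    return [r + d for r, d in zip(basic_response_db, difference)]
```

Because the same sound reaches both microphones, the sound's own spectrum cancels in the difference and only the two microphone responses remain.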
-
Patent number: 5950157
Abstract: Adverse effects of type mismatch between acoustic input devices used during testing and during training in machine-based recognition of the source of acoustic phenomena are minimized. A normalizing model is matched to a source model based, or dependent, upon an acoustic input device whose transfer characteristics color acoustic characteristics of a source as represented in the source model. An application of the present invention is to speaker recognition, i.e., recognition of the identity of a speaker by the speaker's voice.
Type: Grant
Filed: April 18, 1997
Date of Patent: September 7, 1999
Assignee: SRI International
Inventors: Larry P. Heck, Mitchel Weintraub
-
Patent number: 5946653
Abstract: An improved method of training a SISRS uses less processing and memory resources by operating on vectors instead of matrices which represent spoken commands. Memory requirements are linearly proportional to the number of spoken commands for storing each command model. A spoken command is identified from the set of spoken commands by a command recognition procedure (200). The command recognition procedure (200) includes sampling the speaker's speech, deriving cepstral coefficients and delta-cepstral coefficients, and performing a polynomial expansion on cepstral coefficients. The identified spoken command is selected using the dot product of the command model data and the average command structure representing the unidentified spoken command.
Type: Grant
Filed: October 1, 1997
Date of Patent: August 31, 1999
Assignee: Motorola, Inc.
Inventors: William Michael Campbell, John Eric Kleider, Charles Conway Broun, Carl Steven Gifford, Khaled Assaleh
-
Patent number: 5924066
Abstract: A system and method for classifying a speech signal within a likely speech signal class of a plurality of speech signal classes are provided. Stochastic models include a plurality of states having state transitions and output probabilities to generate state sequences which model evolutionary characteristics and durational variability of a speech signal. The method includes extracting a frame sequence, and determining a state sequence for each stochastic model with each state sequence having full state segmentation. Representative frames are determined to provide speech signal time normalization. A likely speech signal class is determined from a neural network having a plurality of inputs receiving the representative frames and a plurality of outputs corresponding to the plurality of speech signal classes. An output signal is generated based on the likely stochastic model.
Type: Grant
Filed: September 26, 1997
Date of Patent: July 13, 1999
Assignees: U S WEST, Inc., MediaOne, Inc.
Inventor: Amlan Kundu
-
Patent number: 5915235
Abstract: The present invention teaches an equalizer preprocessor for a mobile telephone speech coder that adapts to the characteristics of its input transducer. The equalizer determines the frequency response of the input transducer by measuring the long term characteristics of the input signal and estimating the spectral envelope of that signal. The equalizer then adapts so that the output signal has a spectral response closer to a perceptually ideal response in accordance with the calculated spectral envelope. In a first embodiment of the present invention, the adaptive equalizer is implemented using digital filtering techniques. The equalizer determines a set of long term autocorrelation coefficient values and from these values generates a set of filter taps which serve to whiten or flatten the spectral response of the input signal. This whitened signal is then passed through a target filter which impresses upon the whitened signal the target spectral response.
Type: Grant
Filed: October 17, 1997
Date of Patent: June 22, 1999
Inventors: Andrew P. DeJaco, John A. Miller
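One common way to turn autocorrelation coefficients into whitening filter taps is the Levinson-Durbin recursion followed by a prediction-error filter. The sketch below uses that standard technique as an assumption; the abstract above does not say which procedure the patent uses, and the target filter stage is omitted. The function names are hypothetical.

```python
def levinson_durbin(r, order):
    """Solve for prediction coefficients a[0..order-1] from autocorrelation r.

    The predictor is x[n] ~ sum(a[k] * x[n - 1 - k]).
    Assumes r[0] > 0 and a valid (positive definite) autocorrelation sequence.
    """
    a = []
    err = r[0]
    for i in range(1, order + 1):
        # Correlation not yet explained by the current predictor.
        acc = r[i] - sum(a[k] * r[i - 1 - k] for k in range(len(a)))
        k_i = acc / err  # reflection coefficient for this order
        # Update the lower-order coefficients, then append the new one.
        a = [a[k] - k_i * a[i - 2 - k] for k in range(len(a))] + [k_i]
        err *= 1.0 - k_i * k_i
    return a


def whiten(signal, a):
    """Apply the prediction-error (whitening) filter e[n] = x[n] - prediction."""
    out = []
    for n, x in enumerate(signal):
        pred = sum(a[k] * signal[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        out.append(x - pred)
    return out
```

For a signal whose autocorrelation decays as 0.5 to the power of the lag, the recursion yields a single effective tap of 0.5, and filtering a matching decaying sequence leaves only its first sample.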
-
Patent number: 5890113
Abstract: An analyzing unit 1 converts an input speech into a feature vector time series. A reference pattern storing unit 3 stores the feature vector time series obtained by the same manner as in the analyzing unit. A matching unit 2 correlates for time axis the input speech feature vector time series and the reference patterns to one another. An environmental adapting unit 4 performs the environmental adaptation between the input speech feature vector time series and the reference patterns according to the result of matching in the matching unit 2. A speaker adapting unit 6 performs the adaptation concerning the speaker between the environmentally adapted reference patterns from the environmental adapting unit 4 and the input speech feature vector time series.
Type: Grant
Filed: December 13, 1996
Date of Patent: March 30, 1999
Assignee: NEC Corporation
Inventor: Keizaburo Takagi
-
Patent number: 5878392
Abstract: A circuit arrangement for speech recognition carries out an analysis of a speech signal, extracting characteristic features. The extracted features are represented by spectral feature vectors which are compared with reference feature vectors stored for the speech signal to be recognized. The reference feature vectors are determined during a training phase in which a speech signal is recorded several times. A recognition result essentially depends on a quality of the spectral feature vectors and reference feature vectors. A recursive high-pass filtering is performed in the time domain on the spectral feature vectors. Influences of noise signals on the recognition result are reduced by this and a high degree of speaker independence of the recognition is achieved.
Type: Grant
Filed: May 27, 1997
Date of Patent: March 2, 1999
Assignee: U.S. Philips Corporation
Inventors: Peter Meyer, Hans-Wilhelm Ruhl
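A recursive high-pass filter applied along the time axis of a feature-vector sequence can be sketched as below. The first-order form, the coefficient `a`, and the function name are assumptions chosen for illustration; the abstract above does not give the filter order or coefficients.

```python
def highpass_features(frames, a=0.97):
    """Filter each feature component over time with a first-order recursion.

    Assumed filter: y[n] = x[n] - x[n-1] + a * y[n-1], applied separately
    to every component of the spectral feature vectors.
    """
    if not frames:
        return []
    prev_in = list(frames[0])          # start from the first frame to avoid a jump
    prev_out = [0.0] * len(prev_in)
    filtered = []
    for frame in frames:
        out = [x - px + a * py for x, px, py in zip(frame, prev_in, prev_out)]
        filtered.append(out)
        prev_in, prev_out = list(frame), out
    return filtered
```

A constant offset in a feature component, such as a fixed channel or microphone coloration in a log-spectral feature, produces zero output, while changes over time pass through and then decay.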
-
Patent number: 5864806
Abstract: For equalizing a speech signal constituted by an observed sequence of successive input sound frames, which speech signal is liable to be affected by disturbances, the speech signal is modelled by means of a hidden Markov model and, at each instant t: equalization filters are constituted in association with the paths in the Markov sense at instant t; at least a plurality of the equalization filters are applied to the frames to obtain, at instant t, a plurality of filtered sound frame sequences and an utterance probability for each of the paths respectively associated with the equalization filters applied; the equalization filter corresponding to the most probable path in the Markov sense is selected; and the filtered frame supplied by the selected equalization filter is selected as the equalized frame.
Type: Grant
Filed: May 5, 1997
Date of Patent: January 26, 1999
Assignee: France Telecom
Inventors: Chafic Mokbel, Denis Jouvet, Jean Monne
-
Patent number: 5842162
Abstract: A sound recognizer uses a feature value normalization process to substantially increase the accuracy of recognizing acoustic signals in noise. The sound recognizer includes a feature vector device which determines a number of feature values for a number of analysis frames, a min/max device which determines a minimum and maximum feature value for each of a number of frequency bands, a normalizer which normalizes each of the feature values with the minimum and maximum feature values resulting in normalized feature vectors, and a comparator which compares the normalized feature vectors with template feature vectors to identify one of the template feature vectors that most resembles the normalized feature vectors.
Type: Grant
Filed: September 23, 1997
Date of Patent: November 24, 1998
Assignee: Motorola, Inc.
Inventor: Adam B. Fineberg
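The min/max normalization described in the abstract above can be sketched as follows. It assumes that each position in a feature vector corresponds to one frequency band and that values are scaled to the range 0 to 1; the function name and the handling of a constant band are illustrative choices, not details from the patent.

```python
def minmax_normalize(frames):
    """Scale every band to [0, 1] using that band's min and max over all frames."""
    if not frames:
        return []
    bands = list(zip(*frames))            # regroup values by frequency band
    lows = [min(band) for band in bands]
    highs = [max(band) for band in bands]
    return [
        [
            # A band that never varies carries no information; map it to 0.0.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(frame, lows, highs)
        ]
        for frame in frames
    ]
```

The normalized vectors would then be compared with template vectors that were normalized the same way.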
-
Patent number: 5839103
Abstract: The present invention relates to a pattern recognition system which uses data fusion to combine data from a plurality of extracted features and a plurality of classifiers. Speaker patterns can be accurately verified with the combination of discriminant based and distortion based classifiers. A novel approach using a training set of a "leave one out" data can be used for training the system with a reduced data set. Extracted features can be improved with a pole filtered method for reducing channel effects and an affine transformation for improving the correlation between training and testing data.
Type: Grant
Filed: June 7, 1995
Date of Patent: November 17, 1998
Assignee: Rutgers, The State University of New Jersey
Inventors: Richard J. Mammone, Kevin Farrell, Manish Sharma, Devang Naik, Xiaoyu Zhang, Khaled Assaleh, Han-Seng Liou
-
Patent number: 5812972
Abstract: The present invention provides a speech recognizer that creates and updates the equalization vector as input speech is provided to the recognizer. The present invention includes a speech analyzer which transforms an input speech signal into a series of feature vectors or observation sequence. Each feature vector is then provided to a speech recognizer which modifies the feature vector by subtracting a previously determined equalization vector therefrom. The recognizer then performs segmentation and matches the modified feature vector to a stored model vector which is defined as the segmentation vector. The recognizer then, from time to time, determines a new equalization vector, the new equalization vector being defined based on the difference between one or more input feature vectors and their respective segmentation vectors.
Type: Grant
Filed: December 30, 1994
Date of Patent: September 22, 1998
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, David Mansour, Jay Gordon Wilpon
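The subtract, match, and update cycle in the abstract above can be illustrated with a simplified sketch. It swaps the recognizer's segmentation for a nearest-neighbour search over a small set of model vectors, and it assumes the new equalization vector is the plain average of the differences; both choices, and all names, are illustrative assumptions rather than the patented procedure.

```python
def nearest(vector, models):
    """Return the stored model vector closest to `vector` (squared Euclidean)."""
    return min(models, key=lambda m: sum((v - x) ** 2 for v, x in zip(vector, m)))


def update_equalization(features, models, eq):
    """Run one batch through the cycle and return a new equalization vector."""
    if not features:
        return list(eq)
    diffs = []
    for feature in features:
        # Modify the input by subtracting the previously determined vector.
        modified = [x - e for x, e in zip(feature, eq)]
        # Stand-in for segmentation: pick the best-matching model vector.
        segmentation = nearest(modified, models)
        # Difference between the raw input and its segmentation vector.
        diffs.append([x - s for x, s in zip(feature, segmentation)])
    # Assumed update rule: average the differences over the batch.
    return [sum(column) / len(diffs) for column in zip(*diffs)]
```

When every input carries the same additive offset relative to its model vector, one pass recovers that offset, and later passes subtract it before matching.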
-
Patent number: 5790978
Abstract: A system and method are provided for automatically computing local pitch contours from textual input to produce pitch contours that closely mimic those found in natural speech. The methodology of the invention incorporates parameterized equations whose parameters can be estimated directly from natural speech recordings. That methodology incorporates a model based on the premise that pitch contours instantiating a particular pitch contour class can be described as distortions in the temporal and frequency domains of a single, underlying contour. After the nature of the pitch contour for different pitch contour classes has been established, a pitch contour can be predicted that closely models a natural speech contour for a synthetic speech utterance by adding the individual contours of the different intonational classes and adjusting the boundaries of these to match the boundaries of the adjacent intonation curves.
Type: Grant
Filed: September 15, 1995
Date of Patent: August 4, 1998
Assignee: Lucent Technologies, Inc.
Inventors: Joseph Philip Olive, Jan Pieter VanSanten