Abstract: A circuit arrangement for speech recognition analyzes a speech signal and extracts characteristic features. The extracted features are represented by spectral feature vectors, which are compared with reference feature vectors stored for the speech signal to be recognized. The reference feature vectors are determined during a training phase in which a speech signal is recorded several times. The recognition result depends essentially on the quality of the spectral feature vectors and the reference feature vectors. Recursive high-pass filtering is performed in the time domain on the spectral feature vectors. This reduces the influence of noise signals on the recognition result and achieves a high degree of speaker independence.
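The recursive high-pass filtering described in this abstract can be illustrated with a minimal sketch. It assumes a first-order recursion, y[n] = x[n] - x[n-1] + c * y[n-1], applied independently to each component of the feature vectors along the time axis. The function name `highpass_features`, the coefficient `c = 0.95`, and the choice to start the filter state at the first frame are illustrative assumptions, not details taken from the patent.

```python
def highpass_features(frames, c=0.95):
    """Filter each feature component over time (hypothetical first-order form).

    frames: list of feature vectors, one per time step, all the same length.
    c:      feedback coefficient, an assumed value; values closer to 1 give a
            lower cut-off frequency.
    """
    if not frames:
        return []
    dim = len(frames[0])
    # Assumption: start from the first frame so a constant input yields zeros.
    prev_x = list(frames[0])
    prev_y = [0.0] * dim
    out = []
    for x in frames:
        # y[n] = x[n] - x[n-1] + c * y[n-1], per component
        y = [x[k] - prev_x[k] + c * prev_y[k] for k in range(dim)]
        out.append(y)
        prev_x, prev_y = list(x), y
    return out


if __name__ == "__main__":
    # A constant offset is removed entirely by the filter.
    print(highpass_features([[1.0, 2.0]] * 3))
```

Because the filter responds only to changes over time, a stationary component added to every frame (for example a fixed channel or speaker characteristic in a log-spectral representation) does not change the output, which is one plausible reading of the robustness the abstract claims.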
Abstract: A method for decoding and synthesizing a synthetic digital speech signal from digital bits of the type produced by dividing a speech signal into frames and encoding the speech signal with an MBE-based encoder. The method includes the steps of decoding the bits to provide spectral envelope and voicing information for each of the frames; processing the spectral envelope information to determine regenerated spectral phase information for each of the frames based on local envelope smoothness; and determining from the voicing information whether frequency bands for a particular frame are voiced or unvoiced. The method further includes synthesizing speech components for voiced frequency bands using the regenerated spectral phase information, synthesizing a speech component representing the speech signal in at least one unvoiced frequency band, and synthesizing the speech signal by combining the synthesized speech components for voiced and unvoiced frequency bands.
Abstract: A speech coding system employing an adaptive codebook model of periodicity is augmented with a pitch-predictive filter (PPF). This PPF has a delay equal to the integer component of the pitch period and a gain that adapts to a measure of periodicity of the speech signal. In accordance with an embodiment of the present invention, speech processing systems that include a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier, and a second portion comprising a fixed codebook coupled to a pitch filter, are adapted to delay the adaptive codebook gain, determine the pitch filter gain based on the delayed adaptive codebook gain, and amplify samples of a signal in the pitch filter based on the determined pitch filter gain. The adaptive codebook gain is delayed for one subframe. The pitch filter gain equals the delayed adaptive codebook gain, except when the adaptive codebook gain is either less than 0.2 or greater than 0.8.
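The gain rule in this abstract can be sketched as follows. The abstract states only that the pitch filter gain equals the delayed adaptive codebook gain unless that gain is below 0.2 or above 0.8; it does not say what value is used in those cases. Clamping to the nearest bound is an assumption made here for illustration. The recursive filter form y[n] = x[n] + g * y[n - T], and the names `pitch_filter_gain`, `apply_pitch_filter`, and `process_subframes`, are likewise hypothetical.

```python
def pitch_filter_gain(delayed_acb_gain, lo=0.2, hi=0.8):
    """Return the pitch filter gain for the current subframe.

    Assumption: out-of-range gains are clamped to the nearest bound.
    """
    return min(max(delayed_acb_gain, lo), hi)


def apply_pitch_filter(samples, delay, gain):
    """Apply y[n] = x[n] + gain * y[n - delay] within one subframe.

    delay is the integer component of the pitch period, in samples.
    """
    out = list(samples)
    for n in range(delay, len(out)):
        out[n] += gain * out[n - delay]
    return out


def process_subframes(subframes, acb_gains, delays, initial_gain=0.0):
    """Filter each subframe using the previous subframe's adaptive codebook gain.

    initial_gain stands in for the (unknown) gain before the first subframe.
    """
    prev_gain = initial_gain
    results = []
    for samples, acb_gain, delay in zip(subframes, acb_gains, delays):
        gain = pitch_filter_gain(prev_gain)  # one-subframe delay
        results.append(apply_pitch_filter(samples, delay, gain))
        prev_gain = acb_gain
    return results
```

The one-subframe delay appears in `process_subframes` as `prev_gain`: each subframe is filtered with the gain computed for the preceding one, so the filter never depends on a gain that has not yet been determined.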
Abstract: A speech recognition technique utilizes a set of N different principal discriminant matrices. Each principal discriminant matrix is associated with a distinct class. The class is an indication of the proximity of a speech segment to neighboring phones. A technique for speech encoding includes arranging a speech signal into a series of frames. For each frame, a feature vector is derived that represents the speech signal for a speech segment or series of speech segments. A set of N different projected vectors is generated for each frame by multiplying the principal discriminant matrices by the feature vector. This speech encoding technique can be used in speech recognition systems by utilizing models in which each model transition is tagged with one of the N classes. The projected vector is utilized with the corresponding tag to compute the probability that at least one particular speech part is present in said frame.
Type: Grant
Filed: June 20, 1994
Date of Patent: March 25, 1997
Assignee: International Business Machines Corporation
Inventors: Lahit R. Bahl, Peter V. de Souza, Ponani Gopalakrishnan, Michael A. Picheny
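The projection step in the abstract above (N class-specific matrices applied to one feature vector per frame, with the class tag of a model transition selecting which projected vector to use) can be sketched in a few lines. The function names, the matrix sizes, and the use of a zero-based integer as the class tag are assumptions for illustration; how the probability is then computed from the selected vector is not specified in the abstract and is left out.

```python
def project(matrix, vector):
    """Multiply one discriminant matrix (list of rows) by a feature vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def project_all(matrices, vector):
    """Return the N projected vectors for one frame, one per class."""
    return [project(matrix, vector) for matrix in matrices]


def vector_for_transition(projected, class_tag):
    """Pick the projected vector matching a transition's class tag.

    class_tag is assumed to be an index in range(N).
    """
    return projected[class_tag]


if __name__ == "__main__":
    # Two hypothetical 2x3 matrices (N = 2) and one 3-dimensional frame vector.
    matrices = [
        [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
        [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]],
    ]
    frame = [1.0, 2.0, 3.0]
    projected = project_all(matrices, frame)
    print(vector_for_transition(projected, 1))
```

Computing all N projections once per frame and then indexing by tag keeps the per-transition cost to a lookup, which is one reasonable way to organize the computation the abstract describes.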