Vector Quantization Patents (Class 704/222)
  • Patent number: 6151414
    Abstract: A new encoder is provided which operates on a database of n-dimensional signal vectors where such signal vectors are restricted to being non-negative. In the establishment of a codebook for this encoder, p signal vectors (of dimension n) are placed in an n×p matrix X, and an approximate factorization X ≈ WV is constructed, where W and V are matrices whose entries are non-negative. Each column of the n×r matrix W is characterized as a "feature", such features having been "learned" from the database of n-dimensional signal vectors.
    Type: Grant
    Filed: January 30, 1998
    Date of Patent: November 21, 2000
    Assignee: Lucent Technologies Inc.
    Inventors: Daniel D. Lee, Hyunjune Sebastian Seung
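
The factorization X ≈ WV in the abstract of patent 6151414 can be illustrated with a minimal sketch. The abstract does not say how the factorization is constructed; the code below assumes the widely used multiplicative update rule for the squared-error objective, which keeps W and V non-negative. The function names (`nmf`, `frob_error`), the random initialization, and the small `eps` guard are illustrative assumptions, not details from the patent.

```python
import random

def matmul(A, B):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(X, r, iters=200, seed=0, eps=1e-9):
    """Approximate the non-negative n x p matrix X by W (n x r) times V (r x p).

    Assumption: multiplicative updates for the squared-error objective.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    # Strictly positive starting point so the updates never divide by zero.
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
    V = [[rng.random() + 0.1 for _ in range(p)] for _ in range(r)]
    for _ in range(iters):
        Wt = transpose(W)
        num = matmul(Wt, X)               # r x p
        den = matmul(matmul(Wt, W), V)    # r x p
        V = [[V[i][j] * num[i][j] / (den[i][j] + eps) for j in range(p)]
             for i in range(r)]
        Vt = transpose(V)
        num = matmul(X, Vt)               # n x r
        den = matmul(W, matmul(V, Vt))    # n x r
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(n)]
    return W, V

def frob_error(X, W, V):
    # Frobenius norm of the residual X - WV.
    WV = matmul(W, V)
    return sum((x - y) ** 2 for rx, ry in zip(X, WV) for x, y in zip(rx, ry)) ** 0.5
```

Each column of the returned W plays the role of a learned "feature", and each column of V holds the non-negative weights that combine those features to approximate one signal vector.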
  • Patent number: 6148288
    Abstract: A scalable audio coding/decoding method and apparatus are provided. The coding method includes the steps of (a) signal-processing input audio signals and quantizing the same for each predetermined coding band, (b) coding the quantized data corresponding to the base layer within a predetermined layer size, (c) coding the quantized data corresponding to the next enhancement layer of the coded base layer and the remaining quantized data uncoded and belonging to the enhancement layer, within a predetermined layer size, and (d) sequentially performing the layer coding steps for all layers, wherein the steps (b), (c) and (d) each include the steps of (e) representing the quantized data corresponding to a layer to be coded by digits of a predetermined same number, and (f) coding the most significant digit sequences composed of most significant digits of the magnitude data composing the represented digital data.
    Type: Grant
    Filed: April 2, 1998
    Date of Patent: November 14, 2000
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Sung-hee Park
  • Patent number: 6148283
    Abstract: A multi-path, split, multi-stage vector quantizer (MPSMS-VQ) having multiple paths between stages which result in a robust and flexible quantizer. By varying parameters, the MPSMS-VQ meets design requirements, such as: (1) the number of bits used to represent the input vector (i.e., uses the same or less total bits than the given number of bits, N); (2) the dimension of the input vector, the performance (distortion as noted by WMSE or SD); (3) complexity (i.e., total complexity can be adjusted to be within a complexity constraint); and (4) memory usage (i.e., total number of words M in the codebook memory can be adjusted to be equal to, or less than, the memory constraint M_d). Therefore, the disclosed method and apparatus works well in many conditions (i.e., offers a very robust performance across a wide range of inputs).
    Type: Grant
    Filed: September 23, 1998
    Date of Patent: November 14, 2000
    Assignee: Qualcomm Inc.
    Inventor: Amitav Das
  • Patent number: 6141640
    Abstract: A digital transmitter/receiver communications system transmits audio voice signals over a channel with increased quality for a specified bit rate. The method of encoding takes advantage of spherical symmetry of error vectors associated with encoding Line Spectral Frequency (LSF) coefficients, to reduce the information transmitted. Errors in encoding the LSF coefficient sets, vectors J, are modeled by a number of vectors J.sub.p having all positive components, and a sign vector s indicating the polarity of each component of the vector. Each LSF vector J intended to be transmitted is approximated by a positive vector J.sub.p and a sign vector s. An index I.sub.p of the positive vector J.sub.p and the sign vector corresponding to vector J are transmitted, along with other audio information to a receiver/decoder where the signal is decoded into an audio signal closely representing the original signal intended to be transmitted.
    Type: Grant
    Filed: February 20, 1998
    Date of Patent: October 31, 2000
    Assignee: General Electric Company
    Inventor: Peter Warren Moo
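
The decomposition in the abstract of patent 6141640, where a vector J is approximated by a positive vector J_p plus a sign vector s, can be sketched as follows. The codebook contents, the squared-error match, and the function names (`encode`, `decode`) are hypothetical. The abstract does not state how the positive vector is selected, so the nearest-neighbour search is an assumption.

```python
def encode(J, positive_codebook):
    """Return (index of best positive vector, sign vector) for J.

    Assumption: the positive vector is chosen by minimum squared error
    against the component magnitudes of J.
    """
    signs = [-1 if x < 0 else 1 for x in J]
    magnitudes = [abs(x) for x in J]

    def sq_err(candidate):
        return sum((m - c) ** 2 for m, c in zip(magnitudes, candidate))

    index = min(range(len(positive_codebook)),
                key=lambda k: sq_err(positive_codebook[k]))
    return index, signs

def decode(index, signs, positive_codebook):
    # Reapply the transmitted polarity to the selected positive vector.
    return [s * c for s, c in zip(signs, positive_codebook[index])]
```

Because polarity travels separately, the codebook only has to cover the all-positive region, which is how the symmetry mentioned in the abstract reduces the information transmitted.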
  • Patent number: 6141637
    Abstract: A speech encoding and decoding system comprises a speech coding apparatus and a speech decoding apparatus. The speech encoding apparatus orthogonally transforms an input speech signal represented in a time domain into a signal represented in a frequency domain in units of predetermined blocks, smoothes the resulting orthogonal transform coefficients by auxiliary information obtained by analyzing the speech signal, vector-quantizes the smoothed orthogonal transform coefficients to generate a quantization index, extracts a vector quantization error of low frequency components of the vector-quantized smoothed orthogonal transform coefficients, scalar-quantizes the vector quantization error to determine low frequency range correction information, and outputs the auxiliary information, quantization index, and low frequency range correction information.
    Type: Grant
    Filed: October 6, 1998
    Date of Patent: October 31, 2000
    Assignee: Yamaha Corporation
    Inventor: Kazunobu Kondo
  • Patent number: 6134520
    Abstract: A 1200 b/s vocoder providing a high degree of speech intelligibility and natural voice quality includes a tenth-order linear prediction analyzer, a split vector quantizer for line spectral frequencies, circuitry providing voicing classification and pitch estimation, a differential pitch and gain quantizer and a multiplexer for producing an encoded word transmitted to a receptive demultiplexer. The vocoder provides a characteristic encoded word including a first codeword, a second codeword, a pitch codeword and a gain codeword, wherein the first and second codewords are selected from respective first and second codebooks having an equal number of codewords and wherein the first and second codewords represent unequal numbers of elements of respective first and second sub-vectors. A codebook populating method for a split vector quantizer vocoder is also utilized.
    Type: Grant
    Filed: December 26, 1995
    Date of Patent: October 17, 2000
    Assignee: Comsat Corporation
    Inventor: Channasandra Ravishankar
  • Patent number: 6131083
    Abstract: On the basis of an autocorrelation coefficient calculated by an autocorrelation coefficient computation section from an input speech signal, an LSF computation section computes LSF parameters F(k) (k = 1, 2, ..., N). A modified logarithmic transformation section performs on the LSF parameters a logarithmic transformation with offset defined by f(k) = log_C(1 + A×F(k)) to obtain modified logarithmic LSF parameters f(k). The resulting modified logarithmic LSF parameters are quantized by a quantization section to provide quantized LSF parameters fq(k). Codes representing the quantized LSF parameters fq(k) are outputted. An inverse transformation defined by Fq(k) = (C^{fq(k)} - 1)/A is performed on the LSF parameters fq(k) to output LSF parameters Fq(k) on the general frequency scale.
    Type: Grant
    Filed: December 23, 1998
    Date of Patent: October 10, 2000
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kimio Miseki, Katsumi Tsuchiya
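
The transform pair in the abstract of patent 6131083, f(k) = log_C(1 + A×F(k)) and Fq(k) = (C^{fq(k)} - 1)/A, can be checked numerically with a short sketch. The values chosen for A and C, the uniform quantizer, and its step size are hypothetical. The abstract gives only the two formulas, not the constants or the quantizer design.

```python
import math

A = 0.5    # hypothetical offset scale; not given in the abstract
C = 10.0   # hypothetical logarithm base; not given in the abstract

def to_log_lsf(F, a=A, c=C):
    # Forward transform: f(k) = log_C(1 + A * F(k)).
    return [math.log(1.0 + a * x, c) for x in F]

def from_log_lsf(f, a=A, c=C):
    # Inverse transform: F(k) = (C ** f(k) - 1) / A.
    return [(c ** y - 1.0) / a for y in f]

def quantize_uniform(f, step=0.01):
    # Assumption: a plain uniform scalar quantizer stands in for the
    # quantization section, whose design the abstract does not describe.
    return [round(y / step) * step for y in f]
```

A uniform step in the logarithmic domain corresponds to coarser steps at high frequencies and finer steps at low frequencies once the values are mapped back to the ordinary frequency scale.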
  • Patent number: 6131084
    Abstract: Speech is encoded into a 90 millisecond frame of bits for transmission across a satellite communication channel. A speech signal is digitized into digital speech samples that are then divided into subframes. Model parameters that include a set of spectral magnitude parameters that represent spectral information for the subframe are estimated for each subframe. Two consecutive subframes from the sequence of subframes are combined into a block and their spectral magnitude parameters are jointly quantized. The joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from the previous block, computing the residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters from both of the subframes within the block, and using vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits.
    Type: Grant
    Filed: March 14, 1997
    Date of Patent: October 10, 2000
    Assignee: Digital Voice Systems, Inc.
    Inventor: John C. Hardwick
  • Patent number: 6122608
    Abstract: A new method for quantization of the LPC coefficients in a speech coder includes an improved form of switched predictive multi-stage vector quantization. The switched predictive quantization includes at least a pair of codebook sets in an MSVQ quantizer and first and second prediction matrices 24a and 24b, with prediction matrix 1 used with codebook set 1 and prediction matrix 2 used with codebook set 2. The encoder determines which prediction matrix/codebook set produces the minimum quantization error at detector 35, and control 29 gates the indices with the minimum error out of the speech coder.
    Type: Grant
    Filed: August 15, 1998
    Date of Patent: September 19, 2000
    Assignee: Texas Instruments Incorporated
    Inventor: Alan V. McCree
  • Patent number: 6122618
    Abstract: A scalable audio coding/decoding method and apparatus are provided. The coding method includes the steps of (a) signal-processing input audio signals and quantizing the same for each predetermined coding band, (b) coding the quantized data corresponding to the base layer within a predetermined layer size, (c) coding the quantized data corresponding to the next enhancement layer of the coded base layer and the remaining quantized data uncoded and belonging to the enhancement layer, within a predetermined layer size, and (d) sequentially performing the layer coding steps for all layers, wherein the steps (b), (c) and (d) each include the steps of (e) representing the quantized data corresponding to a layer to be coded by digits of a predetermined same number, and (f) coding the most significant digit sequences composed of most significant digits of the magnitude data composing the represented digital data.
    Type: Grant
    Filed: November 26, 1997
    Date of Patent: September 19, 2000
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Sung-hee Park
  • Patent number: 6104991
    Abstract: The present invention relates to a speech encoder/decoder system employing digital transmission in which the encoding and decoding operations are complementary, and these operations make use of sets of parameters which may be optimized for a speaker and for a particular digital radio link. A number of sets of parameters are determined experimentally, for example, by employing human sample groups in which perceived audio and transmission quality are tested. The encoder/decoder system then employs a group, or number, of sets of parameters serving all speakers rather than employing one fixed set of parameters. The particular set of parameters for a speaker in the encoder of a first transceiver is determined by a processor which receives values based on an analysis of the input audio signal, and then a parameter set identifier is sent within the digital signal for use by a decoder of a second transceiver.
    Type: Grant
    Filed: February 27, 1998
    Date of Patent: August 15, 2000
    Assignee: Lucent Technologies, Inc.
    Inventors: Paul B. Newland, Albert V. Franceschi, Howard Lenn
  • Patent number: 6098037
    Abstract: A method of quantizing harmonic amplitudes (FIG. 3), used in a speech encoder (10). The method compares variable dimension input vectors to fixed dimension codebook vectors, by first sampling each codebook vector so that it is converted to a vector having the same dimension as the input vector (FIG. 3, step 35). The resulting codebook vector is compared to the input vector (step 37). The difference (error) is weighted in favor of low frequency harmonics. Also, the weighting favors formant amplitudes so that they are quantized more accurately than formant nulls (FIG. 3, step 38; FIG. 5).
    Type: Grant
    Filed: May 19, 1998
    Date of Patent: August 1, 2000
    Assignee: Texas Instruments Incorporated
    Inventor: Suat Yeldener
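
The comparison step in the abstract of patent 6098037, where a fixed-dimension codebook vector is sampled down to the dimension of the input vector and then scored with a weighted error, can be sketched as follows. The sampling rule (evenly spaced indices), the weight formula, and the function names are all hypothetical. The abstract says only that the weighting favors low-frequency harmonics and formant amplitudes, and this sketch models only the low-frequency emphasis.

```python
def resample(codevector, dim):
    """Pick `dim` evenly spaced entries from a fixed-dimension codevector.

    Assumption: index-based sampling; the abstract does not specify the rule.
    """
    n = len(codevector)
    if dim == 1:
        return [codevector[0]]
    return [codevector[round(i * (n - 1) / (dim - 1))] for i in range(dim)]

def weighted_error(x, y, weights):
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))

def search(input_vec, codebook):
    """Return the index of the codevector with the smallest weighted error."""
    dim = len(input_vec)
    # Hypothetical weights that decay with harmonic index, so errors in
    # low-frequency harmonics cost more than errors in high-frequency ones.
    weights = [1.0 / (1 + i) for i in range(dim)]
    return min(
        range(len(codebook)),
        key=lambda k: weighted_error(input_vec, resample(codebook[k], dim), weights),
    )
```

Resampling the codebook entry rather than the input keeps a single stored codebook usable for every harmonic count the encoder encounters.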
  • Patent number: 6092039
    Abstract: The device and method of the invention receives a digital speech signal, which is processed by an Acoustic Processor to produce a Mel-Cepstrum Vector and Pitch. This is recalibrated and encoded. The encoded signal is transmitted over a narrow-band Channel, then decoded, split and recalibrated. From the split signals, one signal feeds a Statistical Processor which produces Recognized Text. Another signal feeds a Regenerator, which produces Regenerated Speech.
    Type: Grant
    Filed: October 31, 1997
    Date of Patent: July 18, 2000
    Assignee: International Business Machines Corporation
    Inventor: Arthur Richard Zingher
  • Patent number: 6091773
    Abstract: A method and apparatus for measuring the "perceptual distance" between an approximate, reconstructed representation of a sensory signal (such as an audio or video signal) and the original sensory signal is provided. The perceptual distance in this context is a direct quantitative measure of the likelihood that a human observer can distinguish the original audio or video signal from the reconstructed approximation to the original audio or video signal. The method described herein applies to noisy compression techniques; the method provides the ability to predict the likelihood that the reconstructed noisy representation of the original signal will be distinguishable by a human observer from the original input representation. The method can be used to allocate bits in audio and video compression algorithms such that the signal reconstructed from compressed representation is perceptually similar to the original input signal when judged by a human observer.
    Type: Grant
    Filed: November 12, 1997
    Date of Patent: July 18, 2000
    Inventor: Mark R. Sydorenko
  • Patent number: 6078881
    Abstract: Speech encoding using searching of a code book for a code that matches an input speech signal, and speech decoding using the code book are disclosed. A random series of code samples is stored in a buffer memory such as a ring buffer memory, and a basic vector generation unit generates basic vectors by applying an arbitrary shift to each of code series retrieved from the random series. Generation of the basic vectors may be performed according to, for example, an overlapping vector generation process. A code book generation unit extends the basic vectors contained in a basic vector unit according to a structuring process so as to produce a tree-structured delta code book. The basic vector generation unit may extend the basic vectors based on pitch parameters or a center clipping threshold.
    Type: Grant
    Filed: March 2, 1998
    Date of Patent: June 20, 2000
    Assignee: Fujitsu Limited
    Inventors: Yasuji Ota, Hitoshi Matsuzawa, Masanao Suzuki
  • Patent number: 6070136
    Abstract: A speech recognition system utilizes both matrix and vector quantizers as front ends to a second stage speech classifier. Matrix quantization exploits input signal information in both frequency and time domains, and the vector quantizer primarily operates on frequency domain information. However, in some circumstances, time domain information may be substantially limited which may introduce error into the matrix quantization. Information derived from vector quantization may be utilized by a hybrid decision generator to error compensate information derived from matrix quantization. Additionally, fuzzy methods of quantization and robust distance measures may be introduced to also enhance speech recognition accuracy. Furthermore, other speech classification stages may be used, such as hidden Markov models which introduce probabilistic processes to further enhance speech recognition accuracy.
    Type: Grant
    Filed: October 27, 1997
    Date of Patent: May 30, 2000
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lin Cong, Safdar M. Asghar
  • Patent number: 6067515
    Abstract: A speech recognition system utilizes both split matrix and split vector quantizers as front ends to a second stage speech classifier such as hidden Markov models (HMMs) to, for example, efficiently utilize processing resources and improve speech recognition performance. Fuzzy split matrix quantization (FSMQ) exploits the "evolution" of the speech short-term spectral envelopes as well as frequency domain information, and fuzzy split vector quantization (FSVQ) primarily operates on frequency domain information. Time domain information may be substantially limited which may introduce error into the matrix quantization, and the FSVQ may provide error compensation. Additionally, acoustic noise influence may affect particular frequency domain subbands. This system also, for example, exploits the localized noise by efficiently allocating enhanced processing technology to target noise-affected input signal parameters and minimize noise influence.
    Type: Grant
    Filed: October 27, 1997
    Date of Patent: May 23, 2000
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lin Cong, Safdar M. Asghar
  • Patent number: 6061648
    Abstract: In a speech coding apparatus, an input device inputs a mixed speech signal of a plurality of speakers. A separating device analyzes period characteristics of the input mixed speech signal, and separates the same signal into a plurality of single speech signals each associated with a corresponding one of the speakers, based on a result of the analysis. A first extracting device extracts source speech characteristic parameters included in each of the single speech signals. A second extracting device extracts a generic vocal-tract characteristic parameter from the input mixed speech signal. In a speech decoding apparatus, a first input device inputs the source speech characteristic parameters for each of the speakers. A second input device inputs the vocal-tract characteristic parameter.
    Type: Grant
    Filed: February 26, 1998
    Date of Patent: May 9, 2000
    Assignee: Yamaha Corporation
    Inventor: Akitoshi Saito
  • Patent number: 6055496
    Abstract: A process for generation of codevectors in the production of synthetic speech in a communication system employing code-excited linear prediction (CELP) is implemented by dividing frames of sampled speech into sub-frames for which are generated codevectors suitable for excitation of synthesizer filters in the low-bit mode of signal transmission. Vector quantization (VQ) is employed with an algebraic representation of the CELP. A reduction of a sub-frame of 6.7 milliseconds to a vector representation of only 8 pulses results in an insufficiency of candidate codevectors, which insufficiency is overcome by a circular shifting of the codevectors at a cyclical rate equal to the pitch of the original voice signal.
    Type: Grant
    Filed: February 27, 1998
    Date of Patent: April 25, 2000
    Assignee: Nokia Mobile Phones, Ltd.
    Inventors: Alireza Ryan Heidari, Fenghua Liu
  • Patent number: 6052661
    Abstract: A speech encoding apparatus capable of averting the deterioration of synthesis speech quality in encoding the input speech and of generating a high-quality synthesis output speech through small quantities of computation. The apparatus includes a target speech generation part for generating from the input speech a target speech vector of a vector length corresponding to a delay parameter; an adaptive codebook for generating from previously generated excitation signals an adaptive vector of the vector length corresponding to the delay parameter; an adaptive code search part for evaluating the distortion of a synthesis vector obtained from the adaptive vector with respect to the target speech vector so as to search for the adaptive vector conducive to the least distortion; and a frame code generation part for generating an excitation signal of a frame length from the adaptive vector conducive to the least distortion.
    Type: Grant
    Filed: December 31, 1996
    Date of Patent: April 18, 2000
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Tadashi Yamaura, Hirohisa Tasaki, Shinya Takahashi
  • Patent number: 6044343
    Abstract: One embodiment of a speech recognition system is organized with speech input signal preprocessing and feature extraction followed by a fuzzy matrix quantizer (FMQ) designed with respective codebook sets at multiple signal to noise ratios. The FMQ quantizes various training words from a set of vocabulary words and produces observation sequences O output data to train hidden Markov model (HMM) processes λj and produces fuzzy distance measure output data for each vocabulary word codebook. A fuzzy Viterbi algorithm is used by a processor to compute maximum likelihood probabilities PR(O|λj) for each vocabulary word. The fuzzy distance measures and maximum likelihood probabilities are mixed in a variety of ways to preferably optimize speech recognition accuracy and speech recognition speed performance.
    Type: Grant
    Filed: June 27, 1997
    Date of Patent: March 28, 2000
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lin Cong, Safdar M. Asghar
  • Patent number: 6026122
    Abstract: A transfer process in which an original vector signal is precoded to an intermediately-precoded vector signal, and the extended modulo operation is performed when the intermediately-precoded vector signal is located outside a predetermined extended-modulo limit area, and the precoded vector signal is transferred through a system having a predetermined filtering characteristic. From the transferred vector signal, the original vector signal is detected, based on a relationship between the vector components of the original vector signal and the transferred vector signal.
    Type: Grant
    Filed: September 9, 1997
    Date of Patent: February 15, 2000
    Assignee: Fujitsu Limited
    Inventors: Takashi Kaku, Kyoko Hirao, Hideo Miyazawa
  • Patent number: 6023672
    Abstract: An excitation quantizer 60 in a speech encoder includes a divider, which divides M pulses representing in combination a speech signal into groups each of L pulses, L being smaller than M. The amplitude of pulses, i.e., L pulses as each unit, is quantized, using spectral parameter. The quantization is executed on at least one quantization candidate, which is selected through distortion evaluation made through addition of the evaluation value based on an adjacent group quantization candidate output value and the evaluation value based on the pertinent group quantization value.
    Type: Grant
    Filed: April 16, 1997
    Date of Patent: February 8, 2000
    Assignee: NEC Corporation
    Inventor: Kazunori Ozawa
  • Patent number: 6018707
    Abstract: The code vector search for vector-quantizing a variable-dimension input vector is to be improved in precision. Via a terminal are entered a variable number of data, that is a variable-dimension vector v, representing, for example, the amplitudes of spectral components of the harmonics of speech. The variable-dimension vector v is converted by a variable/fixed dimension conversion circuit into the vector x of a fixed dimension, such as 44-dimension vector, which is sent to a selection circuit. From plural fixed-dimension vectors, such a code vector as minimizes a weighted error is selected from a codebook. The code vector of fixed dimension obtained by the codebook is converted by a fixed/variable dimension converting circuit into the same variable dimension as that of the original variable-dimension vector v. The converted variable dimension code vector is sent to a variable-dimension selection circuit for selecting from the codebook such code vector as minimizes the weighted error from the input vector v.
    Type: Grant
    Filed: September 5, 1997
    Date of Patent: January 25, 2000
    Assignee: Sony Corporation
    Inventors: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto
  • Patent number: 6016469
    Abstract: A process for the vector quantization of low bit rate vocoders, including determining a coding region by surrounding with an envelope a scatter of points of an autocorrelation matrix of reflection coefficients of a filter configured to model a vocal tract, wherein the envelope has a shape selected from the group consisting of a hyperellipsoid shape and a pyramidal shape, the envelope being centered at the barycenter of the scatter of points; determining principal axes of the volume of points inside the envelope; projecting area coefficients of the autocorrelation matrix onto the principal axes; partitioning the interior volume of the envelope into elementary volumes; and coding partition coefficients resulting from partitioning the interior volume on the basis of coordinates of said partition coefficients in a space defined by the principal axes of the volume of the points inside the envelope, while allocating as code values only values corresponding to locations of the elementary volumes in which said partit
    Type: Grant
    Filed: March 5, 1998
    Date of Patent: January 18, 2000
    Assignee: Thomson-CSF
    Inventor: Pierre Andre Laurent
  • Patent number: 6014623
    Abstract: A method of synthetic speech, wherein the method forms a speech data base, the speech data base includes plural syllables, each of the syllables having a total frame number of the syllable and plural frame parameters. Each of the frame parameters is formed using an energy amount, a speech pitch period, and 10 Line Spectrum Pair (LSP) speech parameters. Thereafter, each LSP speech parameter is encoded using 4 bit Differential Quantization.
    Type: Grant
    Filed: June 12, 1997
    Date of Patent: January 11, 2000
    Assignee: United Microelectronics Corp.
    Inventors: Xingjun Wu, Yihe Sun
  • Patent number: 6014618
    Abstract: A method and apparatus for reducing the complexity of linear prediction analysis-by-synthesis (LPAS) speech coders. The method and apparatus include product code vector quantization (PCVQ) of multi-tap pitch predictor coefficients, which reduces the search and quantization complexity of an adaptive codebook. Further included is a procedure for generating and selecting code vectors consisting of ternary (1,0,-1) values, for optimizing a fixed codebook. Serial optimization of the adaptive codebook first and then the fixed codebook, produces a low complexity LPAS speech coder of the present invention.
    Type: Grant
    Filed: August 6, 1998
    Date of Patent: January 11, 2000
    Assignee: DSP Software Engineering, Inc.
    Inventors: Jayesh S. Patel, Douglas E. Kolb
  • Patent number: 6009384
    Abstract: For coding human speech for subsequent audio reproduction thereof, a plurality of speech segments is derived from speech received, and systematically stored in a data base for later concatenated readout. After the deriving, respective speech segments are fragmented into temporally consecutive source frames, similar source frames as governed by a predetermined similarity measure thereamongst that is based on an underlying parameter set are joined, and joined source frames are collectively mapped onto a single storage frame. Respective segments are stored as containing sequenced referrals to storage frames for therefrom reconstituting the segment in question.
    Type: Grant
    Filed: May 20, 1997
    Date of Patent: December 28, 1999
    Assignee: U.S. Philips Corporation
    Inventors: Raymond N. J. Veldhuis, Paul A. P. Kaufholz
  • Patent number: 6009391
    Abstract: One embodiment of a speech recognition system is organized with speech input signal preprocessing and feature extraction followed by a fuzzy matrix quantizer (FMQ). Frames of the speech input signal are represented in a matrix by a vector f of line spectral pair frequencies and energy coefficients and are fuzzy matrix quantized to respective vector f entries of a matrix codeword in a codebook of the FMQ. The energy coefficients include the original energy and the first and second derivatives of the original energy which increase recognition accuracy by, for example, being generally distinctive speech input signal parameters and providing noise signal suppression especially when the noise signal has a relatively constant energy over at least two time frame intervals. To reduce data while maintaining sufficient resolution, the energy coefficients may be normalized and logarithmically represented. A distance measure between f and f, d(f, f), is defined as ##EQU1## where the constants α_1, .alpha..sub.
    Type: Grant
    Filed: August 6, 1997
    Date of Patent: December 28, 1999
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Safdar M. Asghar, Lin Cong
  • Patent number: 6009123
    Abstract: A transfer process in which an original vector signal is precoded to an intermediately-precoded vector signal, and the extended modulo operation is performed when the intermediately-precoded vector signal is located outside a predetermined extended-modulo limit area, and the precoded vector signal is transferred through a system having a predetermined filtering characteristic. From the transferred vector signal, the original vector signal is detected, based on a relationship between the vector components of the original vector signal and the transferred vector signal.
    Type: Grant
    Filed: September 8, 1997
    Date of Patent: December 28, 1999
    Assignee: Fujitsu Limited
    Inventors: Takashi Kaku, Kyoko Hirao, Hideo Miyazawa
  • Patent number: 6009387
    Abstract: Apparatus for processing acoustic features extracted from a sample of speech data forming a feature vector signal every frame period includes a first linear prediction analyzer, a vector quantizer, at least one partitioned vector quantizer and a scalar quantizer. The first linear prediction analyzer performs a linear prediction analysis on the feature vector signal to generate a first error vector signal. Next, the vector quantizer performs a vector quantization on the first error signal thereby generating a first index corresponding to a first prestored vector signal which is an approximation of the first error vector signal. The vector quantizer also generates a residual vector signal which is the difference between the first error vector signal and the first prestored approximation vector signal.
    Type: Grant
    Filed: March 20, 1997
    Date of Patent: December 28, 1999
    Assignee: International Business Machines Corporation
    Inventors: Ganesh Nachiappa Ramaswamy, Ponani Gopalakrishnan, Joseph Morris
  • Patent number: 6006178
    Abstract: In a speech encoder, a gain codebook switching circuit is supplied with short-term prediction gains from a short-term prediction gain calculator circuit and with mode information through an input terminal and compares the short-term prediction gains with a predetermined threshold value when the mode information indicates a predetermined mode. As a result of comparison, the gain codebook switching circuit produces gain codebook switching information which is delivered to a gain quantizer circuit. The gain codebook quantizer circuit is supplied with adaptive code vectors, excitation code vectors, impulse response information, and the gain codebook switching information, and gain code vectors from a particular gain codebook connected to one of a plurality of input terminals that is selected by the gain codebook switching information.
    Type: Grant
    Filed: July 26, 1996
    Date of Patent: December 21, 1999
    Assignee: NEC Corporation
    Inventors: Shin-Ichi Taumi, Kazunori Ozawa
  • Patent number: 6006174
    Abstract: The generation of multipulse excitation codes by digitizing an original speech, partitioning the digitized signal into a number of samples, pre-emphasizing the samples, producing linear predictive reflection coefficients from said samples, quantizing these reflection coefficients, converting the quantized reflection coefficients to spectral coefficients and subjecting the spectral coefficients to pitch analysis to obtain a spectral residual signal.
    Type: Grant
    Filed: October 15, 1997
    Date of Patent: December 21, 1999
    Assignee: InterDigital Technology Corporation
    Inventors: Daniel Lin, Brian M. McCarthy
  • Patent number: 6006177
    Abstract: The invention provides a speech coding apparatus wherein a perceptual weighting filter is realized with a comparatively small amount of calculation. The speech coding apparatus includes a weighting circuit which in turn includes a coefficient code book in which weighting coefficients are stored, a coefficient determination section which selects and outputs one of the weighting coefficients which corresponds to a short-term prediction code, and a weighting section for performing weighting calculation of a speech signal with the selected weighting coefficient.
    Type: Grant
    Filed: April 18, 1996
    Date of Patent: December 21, 1999
    Assignee: NEC Corporation
    Inventor: Keiichi Funaki
  • Patent number: 6006179
    Abstract: An audio coder/decoder ("codec") that is suitable for real-time applications due to reduced computational complexity, and a novel adaptive sparse vector quantization (ASVQ) scheme and algorithms for general purpose data quantization. The codec provides low bit-rate compression for music and speech, while being applicable to higher bit-rate audio compression. The codec includes an in-path implementation of psychoacoustic spectral masking, and frequency domain quantization using the novel ASVQ scheme and algorithms specific to audio compression. More particularly, the inventive audio codec employs frequency domain quantization with critically sampled subband filter banks to maintain time domain continuity across frame boundaries. The input audio signal is transformed into the frequency domain in which in-path spectral masking can be directly applied. This in-path spectral masking usually results in sparse vectors.
    Type: Grant
    Filed: October 28, 1997
    Date of Patent: December 21, 1999
    Assignee: America Online, Inc.
    Inventors: Shuwu Wu, John Mantegna
  • Patent number: 5999899
    Abstract: Audio source data is subjected to a pre-emphasis step (302) to perform gross decorrelation, followed by an adaptive linear prediction (306) to perform further decorrelation. A transform is performed on the residual of the linear prediction, to obtain transform coefficients representing the residual in the frequency domain. A number of tonal components are identified (310), subtracted from the transform coefficients and encoded by vector quantization. The transform coefficients are then grouped into sub-bands, and each sub-band encoded in the frequency domain by vector quantization. The sub-bands are of uniform width on an auditory scale, so that each vector may comprise a different number of transform coefficients.
    Type: Grant
    Filed: October 20, 1997
    Date of Patent: December 7, 1999
    Assignee: SoftSound Limited
    Inventor: Anthony John Robinson
  • Patent number: 5978758
    Abstract: A first vector quantizer generates output codevectors corresponding in number to a number determined by a predetermined number of bits through linear coupling of integer coefficients of a predetermined number of base vectors stored in a base vector memory. A second vector quantizer determines coefficients of the base vectors according to at least one of output indexes of the output codevectors.
    Type: Grant
    Filed: July 10, 1997
    Date of Patent: November 2, 1999
    Assignee: NEC Corporation
    Inventor: Shigeru Ono
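    Illustration (not part of the patent record): a rough sketch of generating codevectors as integer-weighted combinations of stored base vectors, so that B base vectors give 2**B codevectors. The +1/-1 coefficient mapping, the function name `codevector`, and the sample base vectors are assumptions; the abstract only says that the coefficients are integers.

```python
def codevector(index, base_vectors):
    """Build one codevector as a sum of +/-1 weighted base vectors.

    Bit k of `index` selects the coefficient of base vector k:
    1 -> +1, 0 -> -1 (an assumed mapping, not taken from the patent).
    """
    dim = len(base_vectors[0])
    out = [0] * dim
    for k, base in enumerate(base_vectors):
        coeff = 1 if (index >> k) & 1 else -1
        for d in range(dim):
            out[d] += coeff * base[d]
    return out


# Hypothetical base vectors: 2 bits -> 2**2 = 4 codevectors.
base_vectors = [[1, 0, 2], [0, 3, 1]]
codebook = [codevector(i, base_vectors) for i in range(2 ** len(base_vectors))]
```

    Under this scheme only the base vectors need to be stored; the codevectors can be rebuilt from the index when needed.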
  • Patent number: 5974378
    Abstract: A method of using a computer (11) to perform a multi-stage vector quantization process (13a). At each stage of the process (13a) subsequent to the first stage, input vectors from the previous stage are used to search a codebook (13b) for code-vectors that minimize distortion. (FIG. 2) The search is structured so that each stage is performed with an outer loop that calculates components of distortion that do not depend on the input vector value. An inner loop, which does depend on input vector values, is used to calculate distortion values and to maintain a list of the current best output code-vectors. (FIG. 3). The first stage is a special case, having only one input vector, but is otherwise performed like the subsequent stages.
    Type: Grant
    Filed: January 6, 1998
    Date of Patent: October 26, 1999
    Assignee: Texas Instruments Incorporated
    Inventors: Wilfrid P. LeBlanc, Alan V. McCree
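    Illustration (not part of the patent record): a simplified sketch of the loop structure described in the abstract above. The outer loop computes the part of the distortion that does not depend on the input vector, and the inner loop scores each surviving input vector and feeds a list of best candidates. The squared-error expansion, the function names, and the sample codebooks are assumptions for illustration.

```python
import heapq


def search_stage(inputs, codebook, keep):
    """Return the `keep` best (distortion, input_idx, code_idx) tuples."""
    input_energy = [sum(x * x for x in vec) for vec in inputs]
    candidates = []
    for c_idx, code in enumerate(codebook):        # outer loop: per code-vector
        code_energy = sum(c * c for c in code)     # independent of the input vector
        for i_idx, vec in enumerate(inputs):       # inner loop: per input vector
            cross = sum(x * c for x, c in zip(vec, code))
            # |x - c|^2 = |x|^2 - 2 x.c + |c|^2
            distortion = input_energy[i_idx] - 2 * cross + code_energy
            candidates.append((distortion, i_idx, c_idx))
    return heapq.nsmallest(keep, candidates)       # sorted, best first


def multistage_search(target, codebooks, keep):
    """Return (indexes, residual) of the best path through all stages."""
    # Each path is (residual, chosen indexes); the first stage has one input.
    paths = [(list(target), [])]
    for codebook in codebooks:
        best = search_stage([p[0] for p in paths], codebook, keep)
        paths = [
            ([x - c for x, c in zip(paths[i][0], codebook[j])], paths[i][1] + [j])
            for _, i, j in best
        ]
    return paths[0][1], paths[0][0]


# Hypothetical two-stage codebooks.
stage_codebooks = [[[0, 0], [1, 1], [2, 2]], [[0, 0], [0, 1], [0, -1]]]
indexes, residual = multistage_search([1, 2], stage_codebooks, keep=2)
```

    Keeping more than one candidate per stage lets a later stage rescue a path that was not the single best choice earlier, at the cost of more inner-loop work.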
  • Patent number: 5970443
    Abstract: An audio encoding-decoding system is constructed between a transmitting station and a receiving station which are connected together through communication lines. The transmitting station corresponds to an encoder which performs an encoding process on audio signals input thereto to produce compressive coded bit streams. Herein, the encoder uses a code book or conjugate structure code books to perform vector quantization on residual signals corresponding to residuals of an analysis of linear predictive coding which is performed on the audio signals. Indexes are produced in response to a result of the vector quantization. The encoder produces the compressive coded bit stream based on the indexes and a result of the analysis of the linear predictive coding. A bit rate mode is determined for the compressive coded bit stream in response to conditions of the communication lines.
    Type: Grant
    Filed: September 22, 1997
    Date of Patent: October 19, 1999
    Assignee: Yamaha Corporation
    Inventor: Shigeki Fujii
  • Patent number: 5970444
    Abstract: An ACELP speech coding method according to ITU-T Recommendation G.729. When coding a random component vector, each of the random component vectors forming together the random codebook is formed of three or less pulses having a unit amplitude for each of a pair of subframes which form together a frame. The positions of the pulses are determined from a plurality of predetermined positions which a pulse can assume in a subframe so that distortion is minimized. The method allows speech coding at a lower bit rate.
    Type: Grant
    Filed: March 11, 1998
    Date of Patent: October 19, 1999
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Shinji Hayashi, Sachiko Kurihara, Akitoshi Kataoka
  • Patent number: 5966688
    Abstract: A speech mode based multi-stage vector quantizer is disclosed which quantizes and encodes line spectral frequency (LSF) vectors that were obtained by transforming the short-term predictor filter coefficients in a speech codec that utilizes linear predictive techniques. The quantizer includes a mode classifier that classifies each speech frame of a speech signal as being associated with one of a voiced, spectrally stationary (Mode A) speech frame, a voiced, spectrally non-stationary (Mode B) speech frame and an unvoiced (Mode C) speech frame. A converter converts each speech frame of the speech signal into an LSF vector and an LSF vector quantizer includes a 12-bit, two-stage, backward predictive vector encoder that encodes the Mode A speech frames and a 22-bit, four-stage backward predictive vector encoder that encodes the Mode B and the Mode C speech frames.
    Type: Grant
    Filed: October 28, 1997
    Date of Patent: October 12, 1999
    Assignee: Hughes Electronics Corporation
    Inventors: Srinivas Nandkumar, Kumar Swaminathan
  • Patent number: 5963896
    Abstract: In a speech coder, an excitation quantizer 360 retrieves the positions of M non-zero amplitude pulses, which together constitute an excitation, by using spectral parameters and with a different gain for each group of the pulses less in number than M.
    Type: Grant
    Filed: August 26, 1997
    Date of Patent: October 5, 1999
    Assignee: NEC Corporation
    Inventor: Kazunori Ozawa
  • Patent number: 5960390
    Abstract: There is provided a coding method which can effectively prevent a pre-echo and a post-echo from being generated and can perform effective coding to which a psycho-acoustic model is applied. A coding apparatus according to the coding method of the present invention detects the attack and release portions of a waveform signal, and performs gain control to a waveform signal before the attack portion and the waveform signal of the release portion by using a gain control amount adaptively calculated according to the characteristics of the waveform signal. A psycho-acoustic model window circuit to an aural model application circuit calculate a masking level based on the psycho-acoustic model from a frequency component obtained by transforming the waveform signal, and a quantization precision determination circuit determines a quantization precision by using the masking level. A window circuit and a transform circuit transform the waveform signal into a plurality of frequency components.
    Type: Grant
    Filed: October 2, 1996
    Date of Patent: September 28, 1999
    Assignee: Sony Corporation
    Inventors: Masatoshi Ueno, Shinji Miyamori
  • Patent number: 5946651
    Abstract: A post-processor 317 and method substantially for enhancing synthesised speech is disclosed. The post-processor 317 operates on a signal ex(n) derived from an excitation generator 211 typically comprising a fixed code book 203 and an adaptive code book 204, the signal ex(n) being formed from the addition of scaled outputs from the fixed code book 203 and adaptive code book 204. The post-processor operates on ex(n) by adding to it a scaled signal pv(n) derived from the adaptive code book 204. A gain or scale factor p is determined by the speech coefficients input to the excitation generator 211. The combined signal ex(n)+pv(n) is normalised by unit 316 and input to an LPC or speech synthesis filter 208, prior to being input to an audio processing unit 209.
    Type: Grant
    Filed: August 18, 1998
    Date of Patent: August 31, 1999
    Assignee: Nokia Mobile Phones
    Inventors: Kari Jarvinen, Tero Honkanen
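    Illustration (not part of the patent record): a small sketch of the combination ex(n) + p·v(n) followed by normalisation, as outlined in the abstract above. The abstract does not say how the normalisation is done; matching the energy of ex(n) is an assumption here, as are the function name `postprocess` and the sample signals. The scale factor p is simply passed in.

```python
import math


def postprocess(ex, v, p):
    """Return ex(n) + p*v(n), rescaled to the energy of ex(n).

    Energy-matching normalisation is an assumption; the abstract only
    says the combined signal is normalised.
    """
    combined = [e + p * a for e, a in zip(ex, v)]
    energy_in = sum(e * e for e in ex)
    energy_out = sum(c * c for c in combined)
    if energy_out == 0.0:
        return combined  # nothing to rescale
    scale = math.sqrt(energy_in / energy_out)
    return [scale * c for c in combined]


# Hypothetical excitation and adaptive code book contribution.
enhanced = postprocess([1.0, 0.0], [0.0, 1.0], p=1.0)
```

    With this choice the enhanced signal keeps the energy of the original excitation, so adding the adaptive code book contribution changes its shape but not its level.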
  • Patent number: 5943644
    Abstract: A digital speech waveform is divided into frames and sub-frames. Spectrum envelope information, pitch elements and stochastic elements are extracted and coded for the frames and sub-frames. A second error signal is calculated as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements. The second error signal is coded so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through discrete cosine transformation and coding coefficients of the transformed domain.
    Type: Grant
    Filed: June 18, 1997
    Date of Patent: August 24, 1999
    Assignee: Ricoh Company, Ltd.
    Inventors: Jun Yamane, Hiroki Uchiyama
  • Patent number: 5943647
    Abstract: A speech recognition method that combines HMMs and vector quantization to model the speech signal and adds spectral derivative information in the speech parameters. Each state of an HMM is modeled by two different VQ-codebooks. One is trained by using the spectral parameters and the second is trained by using the spectral derivative parameters.
    Type: Grant
    Filed: June 5, 1997
    Date of Patent: August 24, 1999
    Assignee: Tecnomen Oy
    Inventor: Jari Ranta
  • Patent number: 5930748
    Abstract: A speaker identification system (10) employs a supervised training process (100) that uses row action projection (RAP) to generate speaker model data for a set of speakers. The training process employing RAP uses less memory and processing resources by operating on a single row of a matrix at a time. Memory requirements are linearly proportional to the number of speakers for storing each speaker's information. A speaker is identified from the set of speakers by sampling the speaker's speech (202), deriving cepstral coefficients (208), and performing a polynomial expansion (212) on cepstral coefficients. The identified speaker (228) is selected using the product of the speaker model data (213) and the polynomial expanded coefficients from the speech sample.
    Type: Grant
    Filed: July 11, 1997
    Date of Patent: July 27, 1999
    Assignee: Motorola, Inc.
    Inventors: John Eric Kleider, Khaled Assaleh
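    Illustration (not part of the patent record): a sketch of the identification step described in the abstract above, in which feature vectors are polynomially expanded and scored against each speaker's model by an inner product. The expansion order, the averaging over frames, the function names, and the toy model weights are all assumptions; the RAP training step is not shown.

```python
from itertools import combinations_with_replacement


def poly_expand(x, order=2):
    """All monomials of the entries of x up to `order`, constant term first."""
    terms = [1.0]
    for degree in range(1, order + 1):
        for combo in combinations_with_replacement(range(len(x)), degree):
            value = 1.0
            for i in combo:
                value *= x[i]
            terms.append(value)
    return terms


def identify(frames, models, order=2):
    """Return the speaker whose model gives the highest average score."""
    expanded = [poly_expand(f, order) for f in frames]

    def score(weights):
        total = sum(sum(w * t for w, t in zip(weights, e)) for e in expanded)
        return total / len(expanded)

    return max(models, key=lambda name: score(models[name]))


# Hypothetical 2-coefficient frame and hand-made model weights.
models = {
    "speaker_a": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    "speaker_b": [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
}
best = identify([[2.0, 3.0]], models)
```

    For a two-element input the second-order expansion has six terms (1, x1, x2, x1², x1·x2, x2²), which is why each toy model holds six weights.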
  • Patent number: 5924065
    Abstract: In a computerized method for processing speech signals, first vectors representing clean speech signals are stored in a vector codebook. Second vectors are determined from dirty speech signals. Noise and distortion parameters are estimated from the second vectors. Third vectors are predicted, based on estimated noise and distortion parameters. The third vectors are used to correct the first vectors. The third vectors can then be applied to the second vectors to produce corrected vectors. The corrected vectors and the first vectors can be compared to identify first vectors which resemble the corrected vectors.
    Type: Grant
    Filed: June 16, 1997
    Date of Patent: July 13, 1999
    Assignee: Digital Equipment Corporation
    Inventors: Brian S. Eberman, Pedro J. Moreno
  • Patent number: 5924062
    Abstract: A codebook correlation matrix comprises a Toeplitz-type (diagonally symmetric) matrix which is calculated from a forty sample subframe of a speech signal, forming a 40×40 matrix. The resulting correlation coefficients which constitute the codes are stored within a DSP's local memory after calculation by dividing the matrix into five predefined x- and y- tracks, each track having a unique set of eight pulse positions. Using the eight pulse positions on each track, fifteen 8×8 sub-matrices are created which include all of the correlation coefficients in the original 40×40 matrix. The sub-matrices are distributed within a 5×5 mapping matrix which is correlated with a structure mapping matrix to determine the configuration of the resulting autocorrelation matrix for storage and searching. The sub-matrices within each column of correlated mapping matrices are searched by directing a multiplex pointer to that particular column.
    Type: Grant
    Filed: July 1, 1997
    Date of Patent: July 13, 1999
    Assignee: Nokia Mobile Phones
    Inventor: Tin Maung
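    Illustration (not part of the patent record): a sketch of why five tracks of eight positions give fifteen 8×8 sub-matrices that together hold every coefficient of a symmetric 40×40 matrix. The interleaved track layout (track t holds positions t, t+5, ..., t+35), the function name `sub_matrices`, and the test matrix are assumptions; the abstract does not list the actual pulse positions.

```python
from itertools import combinations_with_replacement

SUBFRAME = 40
TRACKS = 5
PULSES = SUBFRAME // TRACKS  # 8 positions per track

# Assumed interleaved layout: track t holds positions t, t+5, ..., t+35.
tracks = [list(range(t, SUBFRAME, TRACKS)) for t in range(TRACKS)]


def sub_matrices(phi):
    """Split a symmetric 40x40 matrix into 8x8 blocks, one per track pair."""
    return {
        (a, b): [[phi[i][j] for j in tracks[b]] for i in tracks[a]]
        for a, b in combinations_with_replacement(range(TRACKS), 2)
    }


# Hypothetical symmetric Toeplitz test matrix (not real speech correlations).
phi = [[abs(i - j) for j in range(SUBFRAME)] for i in range(SUBFRAME)]
blocks = sub_matrices(phi)
```

    The count follows from the pairing: 5 same-track pairs plus 10 cross-track pairs give 15 blocks, and because the matrix is symmetric each cross-track block also stands in for its transpose.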
  • Patent number: 5920833
    Abstract: An MPEG audio decoder includes a Vector FIFO buffer and a windowed polyphase filter. Groups of vector samples are zeroed out prior to storage in the Vector FIFO buffer when it is desired to soft-mute an audio output of the decoder.
    Type: Grant
    Filed: January 30, 1996
    Date of Patent: July 6, 1999
    Assignee: LSI Logic Corporation
    Inventor: Gregg Dierke