Vector Quantization Patents (Class 704/222)
-
Patent number: 6151414
Abstract: A new encoder is provided which operates on a database of n-dimensional signal vectors where such signal vectors are restricted to being non-negative. In the establishment of a codebook for this encoder, p signal vectors (of dimension n) are placed in an n × p matrix X, and an approximate factorization X ≈ WV is constructed, where W and V are matrices whose entries are non-negative. Each column of the n × r matrix W is characterized as a "feature", such features having been "learned" from the database of n-dimensional signal vectors.
Type: Grant
Filed: January 30, 1998
Date of Patent: November 21, 2000
Assignee: Lucent Technologies Inc.
Inventors: Daniel D. Lee, Hyunjune Sebastian Seung
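
The abstract of patent 6151414 states only that a non-negative factorization X ≈ WV is constructed; it does not say how. As a rough illustration, the sketch below uses the widely known multiplicative update rule for the squared-error objective, which is an assumption and not taken from the abstract. The function names (`nmf`, `frobenius_error`), the iteration count, the random initialization, and the small constant `eps` are all hypothetical choices made for this example.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(X, W, V):
    """Squared Frobenius norm of X - WV."""
    WV = matmul(W, V)
    return sum((x - y) ** 2 for rx, ry in zip(X, WV) for x, y in zip(rx, ry))


def nmf(X, r, iterations=200, seed=0, eps=1e-9):
    """Approximate a non-negative n x p matrix X as W (n x r) times V (r x p).

    Assumption: multiplicative updates for the squared-error objective.
    Both factors start positive and stay non-negative, because every update
    multiplies an entry by a ratio of non-negative quantities.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
    V = [[rng.random() + 0.1 for _ in range(p)] for _ in range(r)]
    for _ in range(iterations):
        # V <- V * (W^T X) / (W^T W V), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), V)
        V = [[V[i][j] * num[i][j] / (den[i][j] + eps) for j in range(p)]
             for i in range(r)]
        # W <- W * (X V^T) / (W V V^T), element-wise
        Vt = transpose(V)
        num = matmul(X, Vt)
        den = matmul(W, matmul(V, Vt))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(n)]
    return W, V
```

Each column of the returned `W` plays the role of one learned "feature" in the abstract's terminology, and each column of `V` holds the non-negative weights that combine those features to approximate one signal vector.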
-
Patent number: 6148288
Abstract: A scalable audio coding/decoding method and apparatus are provided. The coding method includes the steps of (a) signal-processing input audio signals and quantizing the same for each predetermined coding band, (b) coding the quantized data corresponding to the base layer within a predetermined layer size, (c) coding the quantized data corresponding to the next enhancement layer of the coded base layer and the remaining quantized data uncoded and belonging to the enhancement layer, within a predetermined layer size, and (d) sequentially performing the layer coding steps for all layers, wherein the steps (b), (c) and (d) each include the steps of (e) representing the quantized data corresponding to a layer to be coded by digits of a predetermined same number, and (f) coding the most significant digit sequences composed of most significant digits of the magnitude data composing the represented digital data.
Type: Grant
Filed: April 2, 1998
Date of Patent: November 14, 2000
Assignee: Samsung Electronics Co., Ltd.
Inventor: Sung-hee Park
-
Patent number: 6148283
Abstract: A multi-path, split, multi-stage vector quantizer (MPSMS-VQ) having multiple paths between stages which result in a robust and flexible quantizer. By varying parameters, the MPSMS-VQ meets design requirements, such as: (1) the number of bits used to represent the input vector (i.e., uses the same or less total bits than the given number of bits, N); (2) the dimension of the input vector, the performance (distortion as noted by WMSE or SD); (3) complexity (i.e., total complexity can be adjusted to be within a complexity constraint); and (4) memory usage (i.e., total number of words M in the codebook memory can be adjusted to be equal to, or less than, the memory constraint M_d). Therefore, the disclosed method and apparatus work well in many conditions (i.e., offer a very robust performance across a wide range of inputs).
Type: Grant
Filed: September 23, 1998
Date of Patent: November 14, 2000
Assignee: Qualcomm Inc.
Inventor: Amitav Das
-
Patent number: 6141640
Abstract: A digital transmitter/receiver communications system transmits audio voice signals over a channel with increased quality for a specified bit rate. The method of encoding takes advantage of spherical symmetry of error vectors associated with encoding Line Spectral Frequency (LSF) coefficients, to reduce the information transmitted. Errors in encoding the LSF coefficient sets, vectors J, are modeled by a number of vectors J_p having all positive components, and a sign vector s indicating the polarity of each component of the vector. Each LSF vector J intended to be transmitted is approximated by a positive vector J_p and a sign vector s. An index I_p of the positive vector J_p and the sign vector corresponding to vector J are transmitted, along with other audio information to a receiver/decoder where the signal is decoded into an audio signal closely representing the original signal intended to be transmitted.
Type: Grant
Filed: February 20, 1998
Date of Patent: October 31, 2000
Assignee: General Electric Company
Inventor: Peter Warren Moo
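
The positive-vector-plus-sign-vector idea in the abstract of patent 6141640 can be sketched as follows. This is a minimal illustration, not the patented method: the squared-error nearest-neighbour search, the treatment of zero as positive, and the names `encode`, `decode`, and `positive_codebook` are assumptions introduced here.

```python
def encode(J, positive_codebook):
    """Split J into a sign vector and the index of the closest positive codevector.

    Assumption: closeness is measured by squared Euclidean distance between
    the component magnitudes of J and each all-positive codevector.
    """
    signs = [1 if x >= 0 else -1 for x in J]   # sign vector s (zero counted as +)
    magnitudes = [abs(x) for x in J]

    def distortion(i):
        return sum((m - c) ** 2 for m, c in zip(magnitudes, positive_codebook[i]))

    index = min(range(len(positive_codebook)), key=distortion)  # index I_p
    return index, signs


def decode(index, signs, positive_codebook):
    """Rebuild the approximation of J from the index I_p and the sign vector s."""
    return [s * c for s, c in zip(signs, positive_codebook[index])]
```

Because the codebook only has to cover the all-positive region, one stored codevector stands in for every sign pattern of the same magnitudes, and the sign vector costs one bit per component.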
-
Patent number: 6141637
Abstract: A speech encoding and decoding system comprises a speech coding apparatus and a speech decoding apparatus. The speech encoding apparatus orthogonally transforms an input speech signal represented in a time domain into a signal represented in a frequency domain in units of predetermined blocks, smoothes the resulting orthogonal transform coefficients by auxiliary information obtained by analyzing the speech signal, vector-quantizes the smoothed orthogonal transform coefficients to generate a quantization index, extracts a vector quantization error of low frequency components of the vector-quantized smoothed orthogonal transform coefficients, scalar-quantizes the vector quantization error to determine low frequency range correction information, and outputs the auxiliary information, quantization index, and low frequency range correction information.
Type: Grant
Filed: October 6, 1998
Date of Patent: October 31, 2000
Assignee: Yamaha Corporation
Inventor: Kazunobu Kondo
-
Patent number: 6134520
Abstract: A 1200 b/s vocoder providing a high degree of speech intelligibility and natural voice quality includes a tenth-order linear prediction analyzer, a split vector quantizer for line spectral frequencies, circuitry providing voicing classification and pitch estimation, a differential pitch and gain quantizer and a multiplexer for producing an encoded word transmitted to a receptive demultiplexer. The vocoder provides a characteristic encoded word including a first codeword, a second codeword, a pitch codeword and a gain codeword, wherein the first and second codewords are selected from respective first and second codebooks having an equal number of codewords and wherein the first and second codewords represent unequal numbers of elements of respective first and second sub-vectors. A codebook populating method for a split vector quantizer vocoder is also utilized.
Type: Grant
Filed: December 26, 1995
Date of Patent: October 17, 2000
Assignee: Comsat Corporation
Inventor: Channasandra Ravishankar
-
Patent number: 6131083
Abstract: On the basis of an autocorrelation coefficient calculated by an autocorrelation coefficient computation section from an input speech signal, an LSF computation section computes LSF parameters F(k) (k = 1, 2, ..., N). A modified logarithmic transformation section performs on the LSF parameters a logarithmic transformation with offset defined by f(k) = log_C(1 + A × F(k)) to obtain modified logarithmic LSF parameters f(k). The resulting modified logarithmic LSF parameters are quantized by a quantization section to provide quantized LSF parameters fq(k). Codes representing the quantized LSF parameters fq(k) are outputted. An inverse transformation defined by Fq(k) = (C^(fq(k)) - 1)/A is performed on the LSF parameters fq(k) to output LSF parameters Fq(k) on the general frequency scale.
Type: Grant
Filed: December 23, 1998
Date of Patent: October 10, 2000
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kimio Miseki, Katsumi Tsuchiya
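
The forward and inverse transformations quoted in the abstract of patent 6131083 can be tried out with a short sketch. The two formulas are taken from the abstract; everything else is assumed for illustration, including the function names, the uniform scalar quantizer standing in for the unspecified "quantization section", and any particular values chosen for A, C, and the step size.

```python
import math


def lsf_to_modified_log(F, A, C):
    """Forward transform from the abstract: f(k) = log_C(1 + A * F(k))."""
    return [math.log(1.0 + A * x, C) for x in F]


def modified_log_to_lsf(f, A, C):
    """Inverse transform from the abstract: F(k) = (C ** f(k) - 1) / A."""
    return [(C ** y - 1.0) / A for y in f]


def uniform_quantize(values, step):
    """Hypothetical stand-in quantizer: round each value to a multiple of step."""
    return [round(v / step) * step for v in values]


if __name__ == "__main__":
    A, C = 0.01, 10.0                      # illustrative constants, not from the patent
    F = [250.0, 800.0, 1500.0, 3200.0]     # illustrative LSF values in Hz
    fq = uniform_quantize(lsf_to_modified_log(F, A, C), 0.01)
    print(modified_log_to_lsf(fq, A, C))
```

With a uniform step in the transformed domain, the reconstruction error grows roughly in proportion to 1 + A × F(k), so low frequencies are reproduced more finely than high ones.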
-
Patent number: 6131084
Abstract: Speech is encoded into a 90 millisecond frame of bits for transmission across a satellite communication channel. A speech signal is digitized into digital speech samples that are then divided into subframes. Model parameters that include a set of spectral magnitude parameters that represent spectral information for the subframe are estimated for each subframe. Two consecutive subframes from the sequence of subframes are combined into a block and their spectral magnitude parameters are jointly quantized. The joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from the previous block, computing the residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters from both of the subframes within the block, and using vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits.
Type: Grant
Filed: March 14, 1997
Date of Patent: October 10, 2000
Assignee: Digital Voice Systems, Inc.
Inventor: John C. Hardwick
-
Patent number: 6122608
Abstract: A new method for quantization of the LPC coefficients in a speech coder includes an improved form of switched predictive multi-stage vector quantization. The switched predictive quantization includes at least a pair of codebook sets in an MSVQ quantizer and first and second prediction matrices 24a and 24b, with prediction matrix 1 used with codebook set 1 and prediction matrix 2 used with codebook set 2. The encoder determines at detector 35 which prediction matrix/codebook set produces the minimum quantization error, and control 29 gates the indices with the minimum error out of the speech coder.
Type: Grant
Filed: August 15, 1998
Date of Patent: September 19, 2000
Assignee: Texas Instruments Incorporated
Inventor: Alan V. McCree
-
Patent number: 6122618
Abstract: A scalable audio coding/decoding method and apparatus are provided. The coding method includes the steps of (a) signal-processing input audio signals and quantizing the same for each predetermined coding band, (b) coding the quantized data corresponding to the base layer within a predetermined layer size, (c) coding the quantized data corresponding to the next enhancement layer of the coded base layer and the remaining quantized data uncoded and belonging to the enhancement layer, within a predetermined layer size, and (d) sequentially performing the layer coding steps for all layers, wherein the steps (b), (c) and (d) each include the steps of (e) representing the quantized data corresponding to a layer to be coded by digits of a predetermined same number, and (f) coding the most significant digit sequences composed of most significant digits of the magnitude data composing the represented digital data.
Type: Grant
Filed: November 26, 1997
Date of Patent: September 19, 2000
Assignee: Samsung Electronics Co., Ltd.
Inventor: Sung-hee Park
-
Patent number: 6104991
Abstract: The present invention relates to a speech encoder/decoder system employing digital transmission in which the encoding and decoding operations are complementary, and these operations make use of sets of parameters which may be optimized for a speaker and for a particular digital radio link. A number of sets of parameters are determined experimentally, for example, by employing human sample groups in which perceived audio and transmission quality are tested. The encoder/decoder system then employs a group, or number, of sets of parameters serving all speakers rather than employing one fixed set of parameters. The particular set of parameters for a speaker in the encoder of a first transceiver is determined by a processor which receives values based on an analysis of the input audio signal, and then a parameter set identifier is sent within the digital signal for use by a decoder of a second transceiver.
Type: Grant
Filed: February 27, 1998
Date of Patent: August 15, 2000
Assignee: Lucent Technologies, Inc.
Inventors: Paul B. Newland, Albert V. Franceschi, Howard Lenn
-
Patent number: 6098037
Abstract: A method of quantizing harmonic amplitudes (FIG. 3), used in a speech encoder (10). The method compares variable dimension input vectors to fixed dimension codebook vectors, by first sampling each codebook vector so that it is converted to a vector having the same dimension as the input vector (FIG. 3, step 35). The resulting codebook vector is compared to the input vector (step 37). The difference (error) is weighted in favor of low frequency harmonics. Also, the weighting favors formant amplitudes so that they are quantized more accurately than formant nulls (FIG. 3, step 38; FIG. 5).
Type: Grant
Filed: May 19, 1998
Date of Patent: August 1, 2000
Assignee: Texas Instruments Incorporated
Inventor: Suat Yeldener
-
Patent number: 6092039
Abstract: The device and method of the invention receive a digital speech signal, which is processed by an Acoustic Processor to produce a Mel-Cepstrum Vector and Pitch. This is recalibrated and encoded. The encoded signal is transmitted over a narrow-band Channel, then decoded, split and recalibrated. From the split signals, one signal feeds a Statistical Processor which produces Recognized Text. Another signal feeds a Regenerator, which produces Regenerated Speech.
Type: Grant
Filed: October 31, 1997
Date of Patent: July 18, 2000
Assignee: International Business Machines Corporation
Inventor: Arthur Richard Zingher
-
Patent number: 6091773
Abstract: A method and apparatus for measuring the "perceptual distance" between an approximate, reconstructed representation of a sensory signal (such as an audio or video signal) and the original sensory signal is provided. The perceptual distance in this context is a direct quantitative measure of the likelihood that a human observer can distinguish the original audio or video signal from the reconstructed approximation to the original audio or video signal. The method described herein applies to noisy compression techniques; the method provides the ability to predict the likelihood that the reconstructed noisy representation of the original signal will be distinguishable by a human observer from the original input representation. The method can be used to allocate bits in audio and video compression algorithms such that the signal reconstructed from compressed representation is perceptually similar to the original input signal when judged by a human observer.
Type: Grant
Filed: November 12, 1997
Date of Patent: July 18, 2000
Inventor: Mark R. Sydorenko
-
Patent number: 6078881
Abstract: Speech encoding using searching of a code book for a code that matches an input speech signal, and speech decoding using the code book are disclosed. A random series of code samples is stored in a buffer memory such as a ring buffer memory, and a basic vector generation unit generates basic vectors by applying an arbitrary shift to each of code series retrieved from the random series. Generation of the basic vectors may be performed according to, for example, an overlapping vector generation process. A code book generation unit extends the basic vectors contained in a basic vector unit according to a structuring process so as to produce a tree-structured delta code book. The basic vector generation unit may extend the basic vectors based on pitch parameters or a center clipping threshold.
Type: Grant
Filed: March 2, 1998
Date of Patent: June 20, 2000
Assignee: Fujitsu Limited
Inventors: Yasuji Ota, Hitoshi Matsuzawa, Masanao Suzuki
-
Patent number: 6070136
Abstract: A speech recognition system utilizes both matrix and vector quantizers as front ends to a second stage speech classifier. Matrix quantization exploits input signal information in both frequency and time domains, and the vector quantizer primarily operates on frequency domain information. However, in some circumstances, time domain information may be substantially limited which may introduce error into the matrix quantization. Information derived from vector quantization may be utilized by a hybrid decision generator to error compensate information derived from matrix quantization. Additionally, fuzzy methods of quantization and robust distance measures may be introduced to also enhance speech recognition accuracy. Furthermore, other speech classification stages may be used, such as hidden Markov models which introduce probabilistic processes to further enhance speech recognition accuracy.
Type: Grant
Filed: October 27, 1997
Date of Patent: May 30, 2000
Assignee: Advanced Micro Devices, Inc.
Inventors: Lin Cong, Safdar M. Asghar
-
Patent number: 6067515
Abstract: A speech recognition system utilizes both split matrix and split vector quantizers as front ends to a second stage speech classifier such as hidden Markov models (HMMs) to, for example, efficiently utilize processing resources and improve speech recognition performance. Fuzzy split matrix quantization (FSMQ) exploits the "evolution" of the speech short-term spectral envelopes as well as frequency domain information, and fuzzy split vector quantization (FSVQ) primarily operates on frequency domain information. Time domain information may be substantially limited which may introduce error into the matrix quantization, and the FSVQ may provide error compensation. Additionally, acoustic noise influence may affect particular frequency domain subbands. This system also, for example, exploits the localized noise by efficiently allocating enhanced processing technology to target noise-affected input signal parameters and minimize noise influence.
Type: Grant
Filed: October 27, 1997
Date of Patent: May 23, 2000
Assignee: Advanced Micro Devices, Inc.
Inventors: Lin Cong, Safdar M. Asghar
-
Patent number: 6061648
Abstract: In a speech coding apparatus, an input device inputs a mixed speech signal of a plurality of speakers. A separating device analyzes period characteristics of the input mixed speech signal, and separates the same signal into a plurality of single speech signals each associated with a corresponding one of the speakers, based on a result of the analysis. A first extracting device extracts source speech characteristic parameters included in each of the single speech signals. A second extracting device extracts a generic vocal-tract characteristic parameter from the input mixed speech signal. In a speech decoding apparatus, a first input device inputs the source speech characteristic parameters for each of the speakers. A second input device inputs the vocal-tract characteristic parameter.
Type: Grant
Filed: February 26, 1998
Date of Patent: May 9, 2000
Assignee: Yamaha Corporation
Inventor: Akitoshi Saito
-
Patent number: 6055496
Abstract: A process for generation of codevectors in the production of synthetic speech in a communication system employing code-excited linear prediction (CELP) is implemented by dividing frames of sampled speech into sub-frames for which are generated codevectors suitable for excitation of synthesizer filters in the low-bit mode of signal transmission. Vector quantization (VQ) is employed with an algebraic representation of the CELP. A reduction of a sub-frame of 6.7 milliseconds to a vector representation of only 8 pulses results in an insufficiency of candidate codevectors, which insufficiency is overcome by a circular shifting of the codevectors at a cyclical rate equal to the pitch of the original voice signal.
Type: Grant
Filed: February 27, 1998
Date of Patent: April 25, 2000
Assignee: Nokia Mobile Phones, Ltd.
Inventors: Alireza Ryan Heidari, Fenghua Liu
-
Patent number: 6052661
Abstract: A speech encoding apparatus capable of averting the deterioration of synthesis speech quality in encoding the input speech and of generating a high-quality synthesis output speech through small quantities of computation. The apparatus includes a target speech generation part for generating from the input speech a target speech vector of a vector length corresponding to a delay parameter; an adaptive codebook for generating from previously generated excitation signals an adaptive vector of the vector length corresponding to the delay parameter; an adaptive code search part for evaluating the distortion of a synthesis vector obtained from the adaptive vector with respect to the target speech vector so as to search for the adaptive vector conducive to the least distortion; and a frame code generation part for generating an excitation signal of a frame length from the adaptive vector conducive to the least distortion.
Type: Grant
Filed: December 31, 1996
Date of Patent: April 18, 2000
Assignee: Mitsubishi Denki Kabushiki Kaisha
Inventors: Tadashi Yamaura, Hirohisa Tasaki, Shinya Takahashi
-
Patent number: 6044343
Abstract: One embodiment of a speech recognition system is organized with speech input signal preprocessing and feature extraction followed by a fuzzy matrix quantizer (FMQ) designed with respective codebook sets at multiple signal to noise ratios. The FMQ quantizes various training words from a set of vocabulary words and produces observation sequence O output data to train hidden Markov model (HMM) processes λj, and produces fuzzy distance measure output data for each vocabulary word codebook. A fuzzy Viterbi algorithm is used by a processor to compute maximum likelihood probabilities PR(O|λj) for each vocabulary word. The fuzzy distance measures and maximum likelihood probabilities are mixed in a variety of ways to preferably optimize speech recognition accuracy and speech recognition speed performance.
Type: Grant
Filed: June 27, 1997
Date of Patent: March 28, 2000
Assignee: Advanced Micro Devices, Inc.
Inventors: Lin Cong, Safdar M. Asghar
-
Patent number: 6026122
Abstract: A transfer process in which an original vector signal is precoded to an intermediately-precoded vector signal, and the extended modulo operation is performed when the intermediately-precoded vector signal is located outside a predetermined extended-modulo limit area, and the precoded vector signal is transferred through a system having a predetermined filtering characteristic. From the transferred vector signal, the original vector signal is detected, based on a relationship between the vector components of the original vector signal and the transferred vector signal.
Type: Grant
Filed: September 9, 1997
Date of Patent: February 15, 2000
Assignee: Fujitsu Limited
Inventors: Takashi Kaku, Kyoko Hirao, Hideo Miyazawa
-
Patent number: 6023672
Abstract: An excitation quantizer 60 in a speech encoder includes a divider, which divides M pulses representing in combination a speech signal into groups each of L pulses, L being smaller than M. The amplitude of pulses, i.e., L pulses as each unit, is quantized, using spectral parameter. The quantization is executed on at least one quantization candidate, which is selected through distortion evaluation made through addition of the evaluation value based on an adjacent group quantization candidate output value and the evaluation value based on the pertinent group quantization value.
Type: Grant
Filed: April 16, 1997
Date of Patent: February 8, 2000
Assignee: NEC Corporation
Inventor: Kazunori Ozawa
-
Patent number: 6018707
Abstract: The code vector search for vector-quantizing a variable-dimension input vector is to be improved in precision. Via a terminal is entered a variable number of data, that is, a variable-dimension vector v, representing, for example, the amplitudes of spectral components of the harmonics of speech. The variable-dimension vector v is converted by a variable/fixed dimension conversion circuit into the vector x of a fixed dimension, such as a 44-dimension vector, which is sent to a selection circuit. From plural fixed-dimension vectors, such a code vector as minimizes a weighted error is selected from a codebook. The code vector of fixed dimension obtained by the codebook is converted by a fixed/variable dimension converting circuit into the same variable dimension as that of the original variable-dimension vector v. The converted variable dimension code vector is sent to a variable-dimension selection circuit for selecting from the codebook such a code vector as minimizes the weighted error from the input vector v.
Type: Grant
Filed: September 5, 1997
Date of Patent: January 25, 2000
Assignee: Sony Corporation
Inventors: Masayuki Nishiguchi, Kazuyuki Iijima, Jun Matsumoto
-
Patent number: 6016469
Abstract: A process for the vector quantization of low bit rate vocoders, including determining a coding region by surrounding with an envelope a scatter of points of an autocorrelation matrix of reflection coefficients of a filter configured to model a vocal tract, wherein the envelope has a shape selected from the group consisting of a hyperellipsoid shape and a pyramidal shape, the envelope being centered at the barycenter of the scatter of points; determining principal axes of the volume of points inside the envelope; projecting area coefficients of the autocorrelation matrix onto the principal axes; partitioning the interior volume of the envelope into elementary volumes; and coding partition coefficients resulting from partitioning the interior volume on the basis of coordinates of said partition coefficients in a space defined by the principal axes of the volume of the points inside the envelope, while allocating as code values only values corresponding to locations of the elementary volumes in which said partit…
Type: Grant
Filed: March 5, 1998
Date of Patent: January 18, 2000
Assignee: Thomson-CSF
Inventor: Pierre Andre Laurent
-
Patent number: 6014623
Abstract: A method of synthetic speech, wherein the method forms a speech data base, the speech data base includes plural syllables, each of the syllables having a total frame number of the syllable and plural frame parameters. Each of the frame parameters is formed using an energy amount, a speech pitch period, and 10 Line Spectrum Pair (LSP) speech parameters. Thereafter, each LSP speech parameter is encoded using 4 bit Differential Quantization.
Type: Grant
Filed: June 12, 1997
Date of Patent: January 11, 2000
Assignee: United Microelectronics Corp.
Inventors: Xingjun Wu, Yihe Sun
-
Patent number: 6014618
Abstract: A method and apparatus for reducing the complexity of linear prediction analysis-by-synthesis (LPAS) speech coders. The method and apparatus include product code vector quantization (PCVQ) of multi-tap pitch predictor coefficients, which reduces the search and quantization complexity of an adaptive codebook. Further included is a procedure for generating and selecting code vectors consisting of ternary (1,0,-1) values, for optimizing a fixed codebook. Serial optimization of the adaptive codebook first and then the fixed codebook, produces a low complexity LPAS speech coder of the present invention.
Type: Grant
Filed: August 6, 1998
Date of Patent: January 11, 2000
Assignee: DSP Software Engineering, Inc.
Inventors: Jayesh S. Patel, Douglas E. Kolb
-
Patent number: 6009384
Abstract: For coding human speech for subsequent audio reproduction thereof, a plurality of speech segments is derived from speech received, and systematically stored in a data base for later concatenated readout. After the deriving, respective speech segments are fragmented into temporally consecutive source frames, similar source frames as governed by a predetermined similarity measure thereamongst that is based on an underlying parameter set are joined, and joined source frames are collectively mapped onto a single storage frame. Respective segments are stored as containing sequenced referrals to storage frames for therefrom reconstituting the segment in question.
Type: Grant
Filed: May 20, 1997
Date of Patent: December 28, 1999
Assignee: U.S. Philips Corporation
Inventors: Raymond N. J. Veldhuis, Paul A. P. Kaufholz
-
Patent number: 6009391
Abstract: One embodiment of a speech recognition system is organized with speech input signal preprocessing and feature extraction followed by a fuzzy matrix quantizer (FMQ). Frames of the speech input signal are represented in a matrix by a vector f of line spectral pair frequencies and energy coefficients and are fuzzy matrix quantized to respective vector f entries of a matrix codeword in a codebook of the FMQ. The energy coefficients include the original energy and the first and second derivatives of the original energy which increase recognition accuracy by, for example, being generally distinctive speech input signal parameters and providing noise signal suppression especially when the noise signal has a relatively constant energy over at least two time frame intervals. To reduce data while maintaining sufficient resolution, the energy coefficients may be normalized and logarithmically represented. A distance measure between f and f, d(f, f), is defined as ##EQU1## where the constants α_1, α_…
Type: Grant
Filed: August 6, 1997
Date of Patent: December 28, 1999
Assignee: Advanced Micro Devices, Inc.
Inventors: Safdar M. Asghar, Lin Cong
-
Patent number: 6009123
Abstract: A transfer process in which an original vector signal is precoded to an intermediately-precoded vector signal, and the extended modulo operation is performed when the intermediately-precoded vector signal is located outside a predetermined extended-modulo limit area, and the precoded vector signal is transferred through a system having a predetermined filtering characteristic. From the transferred vector signal, the original vector signal is detected, based on a relationship between the vector components of the original vector signal and the transferred vector signal.
Type: Grant
Filed: September 8, 1997
Date of Patent: December 28, 1999
Assignee: Fujitsu Limited
Inventors: Takashi Kaku, Kyoko Hirao, Hideo Miyazawa
-
Patent number: 6009387
Abstract: Apparatus for processing acoustic features extracted from a sample of speech data forming a feature vector signal every frame period includes a first linear prediction analyzer, a vector quantizer, at least one partitioned vector quantizer and a scalar quantizer. The first linear prediction analyzer performs a linear prediction analysis on the feature vector signal to generate a first error vector signal. Next, the vector quantizer performs a vector quantization on the first error signal thereby generating a first index corresponding to a first prestored vector signal which is an approximation of the first error vector signal. The vector quantizer also generates a residual vector signal which is the difference between the first error vector signal and the first prestored approximation vector signal.
Type: Grant
Filed: March 20, 1997
Date of Patent: December 28, 1999
Assignee: International Business Machines Corporation
Inventors: Ganesh Nachiappa Ramaswamy, Ponani Gopalakrishnan, Joseph Morris
-
Patent number: 6006178
Abstract: In a speech encoder, a gain codebook switching circuit is supplied with short-term prediction gains from a short-term prediction gain calculator circuit and with mode information through an input terminal and compares the short-term prediction gains with a predetermined threshold value when the mode information indicates a predetermined mode. As a result of comparison, the gain codebook switching circuit produces gain codebook switching information which is delivered to a gain quantizer circuit. The gain codebook quantizer circuit is supplied with adaptive code vectors, excitation code vectors, impulse response information, and the gain codebook switching information, and gain code vectors from a particular gain codebook connected to one of a plurality of input terminals that is selected by the gain codebook switching information.
Type: Grant
Filed: July 26, 1996
Date of Patent: December 21, 1999
Assignee: NEC Corporation
Inventors: Shin-Ichi Taumi, Kazunori Ozawa
-
Patent number: 6006174
Abstract: The generation of multipulse excitation codes by digitizing an original speech, partitioning the digitized signal into a number of samples, pre-emphasizing the samples, producing linear predictive reflection coefficients from said samples, quantizing these reflection coefficients, converting the quantized reflection coefficients to spectral coefficients and subjecting the spectral coefficients to pitch analysis to obtain a spectral residual signal.
Type: Grant
Filed: October 15, 1997
Date of Patent: December 21, 1999
Assignee: InterDigital Technology Corporation
Inventors: Daniel Lin, Brian M. McCarthy
-
Patent number: 6006177
Abstract: The invention provides a speech coding apparatus wherein a perceptual weighting filter is realized with a comparatively small amount of calculation. The speech coding apparatus includes a weighting circuit which in turn includes a coefficient code book in which weighting coefficients are stored, a coefficient determination section which selects and outputs one of the weighting coefficients which corresponds to a short-term prediction code, and a weighting section for performing weighting calculation of a speech signal with the selected weighting coefficient.
Type: Grant
Filed: April 18, 1996
Date of Patent: December 21, 1999
Assignee: NEC Corporation
Inventor: Keiichi Funaki
-
Patent number: 6006179Abstract: An audio coder/decoder ("codec") that is suitable for real-time applications due to reduced computational complexity, and a novel adaptive sparse vector quantization (ASVQ) scheme and algorithms for general purpose data quantization. The codec provides low bit-rate compression for music and speech, while being applicable to higher bit-rate audio compression. The codec includes an in-path implementation of psychoacoustic spectral masking, and frequency domain quantization using the novel ASVQ scheme and algorithms specific to audio compression. More particularly, the inventive audio codec employs frequency domain quantization with critically sampled subband filter banks to maintain time domain continuity across frame boundaries. The input audio signal is transformed into the frequency domain in which in-path spectral masking can be directly applied. This in-path spectral masking usually results in sparse vectors.Type: GrantFiled: October 28, 1997Date of Patent: December 21, 1999Assignee: America Online, Inc.Inventors: Shuwu Wu, John Mantegna
-
Patent number: 5999899Abstract: Audio source data is subjected to a pre-emphasis step (302) to perform gross decorrelation, followed by an adaptive linear prediction (306) to perform further decorrelation. A transform is performed on the residual of the linear prediction, to obtain transform coefficients representing the residual in the frequency domain. A number of tonal components are identified (310), subtracted from the transform coefficients and encoded by vector quantization. The transform coefficients are then grouped into sub-bands, and each sub-band encoded in the frequency domain by vector quantization. The sub-bands are of uniform width on an auditory scale, so that each vector may comprise a different number of transform coefficients.Type: GrantFiled: October 20, 1997Date of Patent: December 7, 1999Assignee: SoftSound LimitedInventor: Anthony John Robinson
-
Patent number: 5978758Abstract: A first vector quantizer generates output codevectors corresponding in number to a number determined by a predetermined number of bits through linear coupling of integer coefficients of a predetermined number of base vectors stored in a base vector memory. A second vector quantizer determines coefficients of the base vectors according to at least one of output indexes of the output codevectors.Type: GrantFiled: July 10, 1997Date of Patent: November 2, 1999Assignee: NEC CorporationInventor: Shigeru Ono
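
The idea of building codevectors from a small set of base vectors can be sketched as follows. This is only an illustrative Python sketch, not the patented method: it assumes each integer coefficient is +1 or -1 and is taken from one bit of the codevector index, so M base vectors yield 2^M codevectors. The function names `codevector` and `nearest_index`, the squared-error distortion, and the exhaustive search are all assumptions made for the example.

```python
def codevector(index, base_vectors):
    """Build one codevector as a signed sum of the base vectors.

    Assumption: bit k of `index` selects coefficient +1 (bit set)
    or -1 (bit clear) for base vector k.
    """
    dim = len(base_vectors[0])
    out = [0.0] * dim
    for bit, base in enumerate(base_vectors):
        coeff = 1 if (index >> bit) & 1 else -1
        for k in range(dim):
            out[k] += coeff * base[k]
    return out


def nearest_index(target, base_vectors):
    """Exhaustively search all 2**M indexes for the least squared error."""
    n_codes = 1 << len(base_vectors)

    def distortion(i):
        cv = codevector(i, base_vectors)
        return sum((t - c) ** 2 for t, c in zip(target, cv))

    return min(range(n_codes), key=distortion)
```

With two base vectors, only the two base vectors need to be stored even though four codevectors are available; `nearest_index(target, base_vectors)` returns the 2-bit index of the closest one.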
-
Patent number: 5974378Abstract: A method of using a computer (11) to perform a multi-stage vector quantization process (13a). At each stage of the process (13a) subsequent to the first stage, input vectors from the previous stage are used to search a codebook (13b) for code-vectors that minimize distortion (FIG. 2). The search is structured so that each stage is performed with an outer loop that calculates components of distortion that do not depend on the input vector value. An inner loop, which does depend on input vector values, is used to calculate distortion values and to maintain a list of the current best output code-vectors (FIG. 3). The first stage is a special case, having only one input vector, but is otherwise performed like the subsequent stages.Type: GrantFiled: January 6, 1998Date of Patent: October 26, 1999Assignee: Texas Instruments IncorporatedInventors: Wilfrid P. LeBlanc, Alan V. McCree
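
The loop structure described in this abstract can be illustrated with a short Python sketch. It is a simplified reading, not the patented implementation: it assumes a plain squared-error distortion, expanded as |r|^2 - 2 r.c + |c|^2 so that the code-vector energy |c|^2 is the input-independent term computed in the outer loop. The function name `msvq_search` and the parameter `keep` (how many best candidates survive each stage) are hypothetical.

```python
import heapq


def msvq_search(target, codebooks, keep=2):
    """Return (distortion, index path) of the best multi-stage match."""
    # The first stage has a single input vector: the target itself.
    candidates = [(list(target), [])]
    best_dist = sum(t * t for t in target)
    best_path = []
    for codebook in codebooks:
        scored = []
        # Outer loop over code-vectors: the energy term does not
        # depend on any input vector, so it is computed once here.
        for j, code in enumerate(codebook):
            energy = sum(c * c for c in code)
            # Inner loop over the input vectors (residuals) that
            # survived the previous stage.
            for residual, path in candidates:
                cross = sum(r * c for r, c in zip(residual, code))
                res_energy = sum(r * r for r in residual)
                dist = res_energy - 2.0 * cross + energy
                new_residual = [r - c for r, c in zip(residual, code)]
                scored.append((dist, path + [j], new_residual))
        # Maintain the list of current best candidates.
        best = heapq.nsmallest(keep, scored, key=lambda s: s[0])
        candidates = [(res, path) for _, path, res in best]
        best_dist, best_path = best[0][0], best[0][1]
    return best_dist, best_path
```

Keeping more than one candidate per stage lets a later stage recover from a first-stage choice that looked best in isolation, at the cost of a proportionally longer inner loop.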
-
Patent number: 5970443Abstract: An audio encoding-decoding system is constructed between a transmitting station and a receiving station which are connected together through communication lines. The transmitting station corresponds to an encoder which performs an encoding process on audio signals input thereto to produce compressive coded bit streams. Herein, the encoder uses a code book or conjugate structure code books to perform vector quantization on residual signals corresponding to residuals of an analysis of linear predictive coding which is performed on the audio signals. Indexes are produced in response to a result of the vector quantization. The encoder produces the compressive coded bit stream based on the indexes and a result of the analysis of the linear predictive coding. A bit rate mode is determined for the compressive coded bit stream in response to conditions of the communication lines.Type: GrantFiled: September 22, 1997Date of Patent: October 19, 1999Assignee: Yamaha CorporationInventor: Shigeki Fujii
-
Patent number: 5970444Abstract: An ACELP speech coding method according to ITU-T Recommendation G.729. When coding a random component vector, each of the random component vectors that together form the random codebook is formed of three or fewer pulses having a unit amplitude for each of a pair of subframes which together form a frame. The positions of the pulses are determined from a plurality of predetermined positions which a pulse can assume in a subframe so that distortion is minimized. The method allows speech coding at a lower bit rate.Type: GrantFiled: March 11, 1998Date of Patent: October 19, 1999Assignee: Nippon Telegraph and Telephone CorporationInventors: Shinji Hayashi, Sachiko Kurihara, Akitoshi Kataoka
-
Patent number: 5966688Abstract: A speech mode based multi-stage vector quantizer is disclosed which quantizes and encodes line spectral frequency (LSF) vectors that were obtained by transforming the short-term predictor filter coefficients in a speech codec that utilizes linear predictive techniques. The quantizer includes a mode classifier that classifies each speech frame of a speech signal as being associated with one of a voiced, spectrally stationary (Mode A) speech frame, a voiced, spectrally non-stationary (Mode B) speech frame and an unvoiced (Mode C) speech frame. A converter converts each speech frame of the speech signal into an LSF vector and an LSF vector quantizer includes a 12-bit, two-stage, backward predictive vector encoder that encodes the Mode A speech frames and a 22-bit, four-stage backward predictive vector encoder that encodes the Mode B and the Mode C speech frames.Type: GrantFiled: October 28, 1997Date of Patent: October 12, 1999Assignee: Hughes Electronics CorporationInventors: Srinivas Nandkumar, Kumar Swaminathan
-
Patent number: 5963896Abstract: In a speech coder, an excitation quantizer 360 retrieves the positions of M non-zero amplitude pulses, which together constitute an excitation, by using spectral parameters and with a different gain for each group of the pulses less in number than M.Type: GrantFiled: August 26, 1997Date of Patent: October 5, 1999Assignee: NEC CorporationInventor: Kazunori Ozawa
-
Patent number: 5960390Abstract: There is provided a coding method which can effectively prevent a pre-echo and a post-echo from being generated and can perform effective coding to which a psycho-acoustic model is applied. A coding apparatus according to the coding method of the present invention detects the attack and release portions of a waveform signal, and performs gain control on the waveform signal before the attack portion and on the waveform signal of the release portion by using a gain control amount adaptively calculated according to the characteristics of the waveform signal. A psycho-acoustic model window circuit to an aural model application circuit calculate a masking level based on the psycho-acoustic model from a frequency component obtained by transforming the waveform signal, and a quantization precision determination circuit determines a quantization precision by using the masking level. A window circuit and a transform circuit transform the waveform signal into a plurality of frequency components.Type: GrantFiled: October 2, 1996Date of Patent: September 28, 1999Assignee: Sony CorporationInventors: Masatoshi Ueno, Shinji Miyamori
-
Patent number: 5946651Abstract: A post-processor 317 and method substantially for enhancing synthesised speech is disclosed. The post-processor 317 operates on a signal ex(n) derived from an excitation generator 211 typically comprising a fixed code book 203 and an adaptive code book 204, the signal ex(n) being formed from the addition of scaled outputs from the fixed code book 203 and adaptive code book 204. The post-processor operates on ex(n) by adding to it a scaled signal pv(n) derived from the adaptive code book 204. A gain or scale factor p is determined by the speech coefficients input to the excitation generator 211. The combined signal ex(n)+pv(n) is normalised by unit 316 and input to an LPC or speech synthesis filter 208, prior to being input to an audio processing unit 209.Type: GrantFiled: August 18, 1998Date of Patent: August 31, 1999Assignee: Nokia Mobile PhonesInventors: Kari Jarvinen, Tero Honkanen
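
The signal flow in this abstract, forming ex(n) + p·v(n) and then normalising, can be sketched in Python as below. The abstract does not say how unit 316 normalises the signal, so this sketch assumes the combined signal is rescaled to the energy of the original ex(n); that choice, the function name `enhance_excitation`, and passing p in as a plain number are all assumptions for illustration.

```python
import math


def enhance_excitation(ex, v, p):
    """Add the scaled adaptive-codebook signal to the excitation.

    ex: excitation samples ex(n)
    v:  adaptive code book contribution v(n)
    p:  scale factor (assumed here to be supplied by the caller)
    """
    combined = [e + p * a for e, a in zip(ex, v)]
    energy_in = sum(e * e for e in ex)
    energy_out = sum(c * c for c in combined)
    if energy_out == 0.0:
        # Nothing to rescale; avoid dividing by zero.
        return combined
    # Assumed normalisation: match the energy of the original ex(n).
    scale = math.sqrt(energy_in / energy_out)
    return [scale * c for c in combined]
```

The rescaled output would then be passed to the synthesis filter; with this normalisation the enhancement changes the shape of the excitation without changing its overall level.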
-
Patent number: 5943644Abstract: A digital speech waveform is divided into frames and sub-frames. Spectrum envelope information, pitch elements and stochastic elements are extracted and coded for the frames and sub-frames. A second error signal is calculated as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements. The second error signal is coded so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through discrete cosine transformation and coding coefficients of the transformed domain.Type: GrantFiled: June 18, 1997Date of Patent: August 24, 1999Assignee: Ricoh Company, Ltd.Inventors: Jun Yamane, Hiroki Uchiyama
-
Patent number: 5943647Abstract: A speech recognition method that combines HMMs and vector quantization to model the speech signal and adds spectral derivative information in the speech parameters. Each state of an HMM is modeled by two different VQ-codebooks. One is trained by using the spectral parameters and the second is trained by using the spectral derivative parameters.Type: GrantFiled: June 5, 1997Date of Patent: August 24, 1999Assignee: Tecnomen OyInventor: Jari Ranta
-
Patent number: 5930748Abstract: A speaker identification system (10) employs a supervised training process (100) that uses row action projection (RAP) to generate speaker model data for a set of speakers. The training process employing RAP uses less memory and processing resources by operating on a single row of a matrix at a time. Memory requirements are linearly proportional to the number of speakers for storing each speaker's information. A speaker is identified from the set of speakers by sampling the speaker's speech (202), deriving cepstral coefficients (208), and performing a polynomial expansion (212) on the cepstral coefficients. The identified speaker (228) is selected using the product of the speaker model data (213) and the polynomial expanded coefficients from the speech sample.Type: GrantFiled: July 11, 1997Date of Patent: July 27, 1999Assignee: Motorola, Inc.Inventors: John Eric Kleider, Khaled Assaleh
-
Patent number: 5924065Abstract: In a computerized method for processing speech signals, first vectors representing clean speech signals are stored in a vector codebook. Second vectors are determined from dirty speech signals. Noise and distortion parameters are estimated from the second vectors. Third vectors are predicted, based on estimated noise and distortion parameters. The third vectors are used to correct the first vectors. The third vectors can then be applied to the second vectors to produce corrected vectors. The corrected vectors and the first vectors can be compared to identify first vectors which resemble the corrected vectors.Type: GrantFiled: June 16, 1997Date of Patent: July 13, 1999Assignee: Digital Equipment CorporationInventors: Brian S. Eberman, Pedro J. Moreno
-
Patent number: 5924062Abstract: A codebook correlation matrix comprises a Toeplitz-type (diagonally symmetric) matrix which is calculated from a forty sample subframe of a speech signal, forming a 40×40 matrix. The resulting correlation coefficients which constitute the codes are stored within a DSP's local memory after calculation by dividing the matrix into five predefined x- and y-tracks, each track having a unique set of eight pulse positions. Using the eight pulse positions on each track, fifteen 8×8 sub-matrices are created which include all of the correlation coefficients in the original 40×40 matrix. The sub-matrices are distributed within a 5×5 mapping matrix which is correlated with a structure mapping matrix to determine the configuration of the resulting autocorrelation matrix for storage and searching. The sub-matrices within each column of correlated mapping matrices are searched by directing a multiplex pointer to that particular column.Type: GrantFiled: July 1, 1997Date of Patent: July 13, 1999Assignee: Nokia Mobile PhonesInventor: Tin Maung
-
Patent number: 5920833Abstract: An MPEG audio decoder includes a Vector FIFO buffer and a windowed polyphase filter. Groups of vector samples are zeroed out prior to storage in the Vector FIFO buffer when it is desired to soft-mute an audio output of the decoder.Type: GrantFiled: January 30, 1996Date of Patent: July 6, 1999Assignee: LSI Logic CorporationInventor: Gregg Dierke