Correlation Function Patents (Class 704/216)
  • Patent number: 7283954
    Abstract: A method for determining if one audio signal is derived from another audio signal or if two audio signals are derived from the same audio signal compares reduced-information characterizations of said audio signals, wherein said characterizations are based on auditory scene analysis. The comparison removes from the characterizations or minimizes in the characterizations the effect of temporal shift or delay on the audio signals (5-1), calculates a measure of similarity (5-2), and compares the measure of similarity against a threshold. In one alternative, the effect of temporal shift or delay is removed or minimized by cross-correlating the two characterizations. In another alternative, the effect of temporal shift or delay is removed or minimized by transforming the characterizations into a domain that is independent of temporal delay effects, such as the frequency domain. In both cases, a measure of similarity is calculated by calculating a coefficient of correlation.
    Type: Grant
    Filed: February 22, 2002
    Date of Patent: October 16, 2007
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Brett G. Crockett, Michael J. Smithers
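The cross-correlation alternative in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the inputs are plain numeric sequences standing in for the auditory-scene-analysis characterizations, and the names `best_lag`, `pearson`, `similar`, `max_lag` and `threshold` are hypothetical.

```python
import math

def pearson(a, b):
    """Coefficient of correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_a * var_b)

def best_lag(a, b, max_lag):
    """Lag of b relative to a that maximizes the raw cross-correlation."""
    def xcorr(lag):
        return sum(a[i] * b[i + lag]
                   for i in range(len(a)) if 0 <= i + lag < len(b))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

def similar(a, b, max_lag=8, threshold=0.9):
    """Align b to a, then compare the correlation coefficient to a threshold."""
    lag = best_lag(a, b, max_lag)
    # Drop the leading samples of whichever sequence starts later.
    a_seg, b_seg = (a, b[lag:]) if lag >= 0 else (a[-lag:], b)
    n = min(len(a_seg), len(b_seg))
    score = pearson(a_seg[:n], b_seg[:n])
    return score >= threshold, lag, score

# Usage: b is a copy of a delayed by two samples.
a = [0, 0, 1, 3, 1, 0, 0, 2, 0, 0, 0, 0]
b = [0, 0] + a[:-2]
print(similar(a, b))
```

The threshold value and lag range are placeholders; the abstract does not state them.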
  • Patent number: 7251597
    Abstract: A method for tracking a pitch signal, including receiving a detected pitch signal that consists of a succession of pitch values, and for each current pitch value in the detected signal performing the following steps: constructing sub-sequences of consistent pitch values from neighboring pitch values; calculating the significance of the sub-sequences; and selecting a sub-sequence or a collection of consistent sub-sequences with the highest significance. If the current pitch value is not consistent with the sub-sequence with the highest significance, the current pitch value is smoothed by dividing it or multiplying it by an integer value > 1, so as to render it consistent with the sub-sequence with the highest significance.
    Type: Grant
    Filed: December 27, 2002
    Date of Patent: July 31, 2007
    Assignee: International Business Machines Corporation
    Inventor: Dan Chazan
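The final smoothing step in the abstract above (dividing or multiplying an inconsistent pitch value by an integer greater than 1) might look like the sketch below. It assumes the most significant sub-sequence has already been reduced to a single `reference` pitch, and that "consistent" means within a relative `tolerance`; both simplifications, along with `max_factor`, are assumptions made for illustration.

```python
def smooth_pitch(current, reference, max_factor=4, tolerance=0.1):
    """Return current, or current scaled by an integer factor, so that it
    falls within `tolerance` (relative) of the reference pitch."""
    def consistent(value):
        return abs(value - reference) <= tolerance * reference

    if consistent(current):
        return current
    for k in range(2, max_factor + 1):
        # Try correcting a pitch multiple (divide) or sub-multiple (multiply).
        for candidate in (current / k, current * k):
            if consistent(candidate):
                return candidate
    return current  # no integer factor helps; leave the value unchanged

print(smooth_pitch(200.0, 101.0))  # 100.0 (doubling error corrected)
print(smooth_pitch(50.0, 101.0))   # 100.0 (halving error corrected)
```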
  • Patent number: 7243065
    Abstract: A comfort noise generator (104) suitable for use in a communication system includes a finite impulse response (FIR) filter (136), a random number generator (140), and a coefficient updater (138). The coefficient updater (138) determines an updated set of filter coefficients (142) based on the signal frame of the input signal (102). The updated set of filter coefficients (142) is output to the FIR filter (136). The FIR filter (136) shapes a white noise signal (146) supplied by the random number generator (140) to provide a simulated background noise signal, or comfort noise signal (122). The comfort noise signal (122) is selectively output from an echo suppression system or corresponding method to overwrite or suppress reflected residual echoes.
    Type: Grant
    Filed: April 8, 2003
    Date of Patent: July 10, 2007
    Assignee: Freescale Semiconductor, Inc.
    Inventors: James Allen Stephens, David L. Barron, Sean S. You
  • Patent number: 7236927
    Abstract: A method of searching for an interpolated peak of a Normalized Correlation Square (NCS) signal derived from an audio signal, comprises: producing quadratically interpolated correlation (QIC) signal values at interpolated time lags; squaring each of the QIC signal values to produce square QIC signal values; producing an individual interpolated energy signal value corresponding to each of the square QIC signal values, wherein ratios of the square QIC signal values to their corresponding interpolated energy values represent interpolated NCS signal values; and selecting, as the interpolated peak, a largest interpolated NCS signal value among the interpolated NCS signal values without evaluating the ratios.
    Type: Grant
    Filed: October 31, 2002
    Date of Patent: June 26, 2007
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
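Selecting the largest ratio "without evaluating the ratios", as the abstract above puts it, can be done by cross-multiplication when the energy values are positive. The sketch below shows only that comparison trick; the quadratic interpolation of the correlation and energy values is omitted, and the function name `argmax_ncs` is hypothetical.

```python
def argmax_ncs(corr, energy):
    """Index maximizing corr[i]**2 / energy[i], assuming energy[i] > 0,
    found without performing any division."""
    best = 0
    for i in range(1, len(corr)):
        # c_i^2 / e_i > c_b^2 / e_b  is equivalent to
        # c_i^2 * e_b > c_b^2 * e_i  when both energies are positive.
        if corr[i] ** 2 * energy[best] > corr[best] ** 2 * energy[i]:
            best = i
    return best

# Ratios are 1, 4 and 5, so the last lag wins.
print(argmax_ncs([1.0, -6.0, 5.0], [1.0, 9.0, 5.0]))  # 2
```

Avoiding the division matters mainly on fixed-point processors, where a divide costs far more than two multiplies.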
  • Patent number: 7187730
    Abstract: An apparatus and a method for symbol decoding of baseband data in a wireless communications network is disclosed, and specifically CCK subsymbol prediction and symbol demodulation that occurs at 5.5 Mbps or 11 Mbps. The apparatus is configured to demodulate or predict the data differently, depending on the modulation rate. If the data was modulated at 11 Mbps, the φ3 rotator is rotated through each of its possible phase values and symbol correlation takes four clock cycles to complete. If the data was modulated at 5.5 Mbps, φ3 is not rotated with a set value of 0 within the correlator architecture, thereby saving power and reducing symbol correlation and subsymbol prediction to a single cycle while in such transmission mode.
    Type: Grant
    Filed: September 19, 2002
    Date of Patent: March 6, 2007
    Assignee: Marvell International Ltd.
    Inventors: Guorong Hu, Yungping Hsu
  • Patent number: 7155386
    Abstract: An approach for adaptively adjusting the correlation window for open-loop pitch determination is presented. Correlation between a windowed reference signal (or target signal) and a candidate signal is maximized under most conditions by sliding the reference window by a delta increment in either direction to capture peak energy. The traditional fixed size of the correlation window is maintained. However, the window slides forward and/or backwards to capture peak energy within the window. The position of the adjusting or sliding window is allowed to shift in a small range or increment in either direction to maximize the energy of the windowed signal thus making sure that at least one peak energy is captured within the window.
    Type: Grant
    Filed: March 11, 2004
    Date of Patent: December 26, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
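The sliding-window idea in the abstract above can be illustrated with a short sketch: the window length stays fixed, and its start is shifted by a small amount in either direction to maximize the energy it captures. The function and parameter names are hypothetical, and the subsequent correlation against the candidate signal is not shown.

```python
def best_window_start(signal, nominal_start, length, max_shift):
    """Start index within +/- max_shift of nominal_start whose fixed-length
    window captures the most signal energy."""
    def energy(start):
        return sum(x * x for x in signal[start:start + length])

    lo = max(0, nominal_start - max_shift)
    hi = min(len(signal) - length, nominal_start + max_shift)
    if hi < lo:
        raise ValueError("window does not fit inside the signal")
    # On ties, max() keeps the earliest start.
    return max(range(lo, hi + 1), key=energy)

# The energy peak sits at index 4; the nominal window [5, 8) misses it.
signal = [0, 0, 0, 1, 5, 1, 0, 0, 0, 0]
print(best_window_start(signal, nominal_start=5, length=3, max_shift=2))  # 3
```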
  • Patent number: 7130795
    Abstract: A method is provided for detecting music in a speech signal having a plurality of frames. The method comprises obtaining one or more first pitch correlation candidates from a first frame of the plurality of frames; obtaining one or more second pitch correlation candidates from a second frame of the plurality of frames; selecting a pitch correlation (Rp) from the one or more first pitch correlation candidates and the one or more second pitch correlation candidates; and distinguishing music from background noise based on analyzing the pitch correlation (Rp).
    Type: Grant
    Filed: June 17, 2005
    Date of Patent: October 31, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Patent number: 7130292
    Abstract: A method and apparatus for enhancing the receiving and information identification functions of multiple access communications systems by employing one or more optical processors configured as a bank of 1-D correlators. The present invention is particularly useful in a DS/SS CDMA communications system, resulting in a multiuser CDMA system that approaches carrier to noise performance (C/N) as opposed to being limited by multiple access interference (MAI). The correlators are arranged in parallel to detect and/or demodulate the received signal, in conjunction with one or more complex algorithms to perform near-optimum multiuser detection, perform multipath combining and/or perform carrier Doppler compensation.
    Type: Grant
    Filed: January 19, 2001
    Date of Patent: October 31, 2006
    Assignee: Essex Corporation
    Inventors: Terry M. Turpin, James L. Lafuse
  • Patent number: 7058569
    Abstract: A synthesis method for concatenative speech synthesis is provided for efficiently concatenating waveform segments in the time-domain. A digital waveform provider produces an input sequence of digital waveform segments. A waveform concatenator concatenates the input segments by using waveform blending within a concatenation zone to synchronize, weight, and overlap-add selected portions of the input segments to produce a single digital waveform. The synchronizing includes determining a minimum weighted energy anchor in the selected portion of each input segment and aligning synchronization peaks in a local vicinity of each anchor.
    Type: Grant
    Filed: September 14, 2001
    Date of Patent: June 6, 2006
    Assignee: Nuance Communications, Inc.
    Inventors: Geert Coorman, Bert Van Coile
  • Patent number: 7039582
    Abstract: A computationally efficient and robust pitch detection and tracking system and related methods are presented. According to certain exemplary implementations a method is presented comprising identifying an initial set of pitch period candidates using a first estimation algorithm, filtering the initial set of candidates and passing the filtered candidates through a second, more accurate pitch estimation algorithm to generate a final set of pitch period candidates from which the most likely pitch value is selected.
    Type: Grant
    Filed: February 22, 2005
    Date of Patent: May 2, 2006
    Assignee: Microsoft Corporation
    Inventors: Eric I-Chao Chang, Jian-Lai Zhou
  • Patent number: 7035790
    Abstract: A speech processing system is provided which is operable to receive sets of signal values representative of a speech signal generated by a speech source as distorted by a transmission channel between the speech source and the speech processing system. The system stores data defining a predetermined function derived from a signal model which models both the speech source and the channel and defining a probability density function which gives, for a given set of model parameters, the probability that the signal model has those model parameters given that the signal model is assumed to have generated the received set of signal values. The system applies a current set of received signal values to the stored probability density function and then draws samples from it using a Gibbs sampler. The system then analyses the samples to determine a set of parameter values representative of the speech signal before it was distorted by the channel.
    Type: Grant
    Filed: May 30, 2001
    Date of Patent: April 25, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventor: Jebu Jacob Rajan
  • Patent number: 7027980
    Abstract: A system or method for modeling a signal, such as a speech signal, in which harmonic frequencies and amplitudes are identified and the harmonic magnitudes are interpolated to obtain spectral magnitudes at a set of fixed frequencies. An inverse transform is applied to the spectral magnitudes to obtain a pseudo auto-correlation sequence, from which linear prediction coefficients are calculated. From the linear prediction coefficients, model harmonic magnitudes are generated by sampling the spectral envelope defined by the linear prediction coefficients. A set of scale factors are then calculated as the ratio of the harmonic magnitudes to the model harmonic magnitudes and interpolated to obtain a second set of scale factors at the set of fixed frequencies. The spectral envelope magnitudes at the set of fixed frequencies are multiplied by the second set of scale factors to obtain new spectral magnitudes and the process is iterated to obtain final linear prediction coefficients.
    Type: Grant
    Filed: March 28, 2002
    Date of Patent: April 11, 2006
    Assignee: Motorola, Inc.
    Inventors: Tenkasi V. Ramabadran, Aaron M. Smith, Mark A. Jasiuk
  • Patent number: 7016507
    Abstract: This invention describes a practical application of noise reduction in hearing aids. Although listening in noisy conditions is difficult for persons with normal hearing, hearing impaired individuals are at a considerable further disadvantage. Under light noise conditions, conventional hearing aids amplify the input signal sufficiently to overcome the hearing loss. For a typical sloping hearing loss where there is a loss in high frequency hearing sensitivity, the amount of boost (or gain) rises with frequency. Most frequently, the loss in sensitivity is only for low-level signals; high-level signals are affected minimally or not at all. A compression hearing aid is able to compensate by automatically lowering the gain as the input signal level rises. This compression action is usually compromised under noisy conditions.
    Type: Grant
    Filed: April 16, 1998
    Date of Patent: March 21, 2006
    Assignee: AMI Semiconductor Inc.
    Inventor: Robert Brennan
  • Patent number: 6999922
    Abstract: The present invention (110) permits a user to speed up and slow down speech without changing the speaker's pitch (102, 110, 112, 128, 402–416). It is a user adjustable feature to change the spoken rate to the listener's preferred listening rate or comfort. It can be included on the phone as a customer convenience feature without changing any characteristics of the speaker's voice besides the speaking rate with soft key button (202) combinations (in interconnect or normal). From the user's perspective, it would seem only that the talker changed his speaking rate, and not that the speech was digitally altered in any way. The pitch and general prosody of the speaker are preserved. The following uses of the time expansion/compression feature are listed to complement already existing technologies or applications in progress, including messaging services, messaging applications and games, and a real-time feature to slow down the listening rate.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: February 14, 2006
    Assignee: Motorola, Inc.
    Inventors: Marc Andre Boillot, John Gregory Harris, Thomas Lawrence Reinke
  • Patent number: 6952670
    Abstract: An extraction section extracts a speech signal having ambient noise superimposed thereon as a data segment having a predetermined duration. An autocorrelation function normalizing section determines normalized autocorrelation function vectors. A normalized autocorrelation function count section counts a given number of normalized autocorrelation function vectors. A noise vector region/speech vector region/undefined vector computation section classifies the normalized autocorrelation function vectors into any of a noise vector region, a speech vector region, or undefined vectors. When the latest normalized autocorrelation function vector acquired by a normalized autocorrelation function vector determination section pertains to the noise vector region, the speech signal is determined to be a noise segment. In contrast, when the latest vector does not pertain to the noise vector region, the input signal is determined to be a speech segment.
    Type: Grant
    Filed: July 17, 2001
    Date of Patent: October 4, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Shogo Iizuka, Shigeru Hosoi, Kazuki Hoshino
  • Patent number: 6871175
    Abstract: A voice encoding method includes the steps of encoding a first frame that contains a plurality of voice data into encoded parameters, locally decoding the encoded parameters of the first frame into a second frame, performing a plurality of interpolation recovery processes that generate respective frames approximating to the first frame by using a frame or frames other than the first frame, comparing the second frame with the frames approximating to the first frame generated by the plurality of interpolation recovery processes, calculating a signal to noise ratio of each of the frames approximating to the first frame by treating the second frame as the signal, determining an index number that indicates an interpolation recovery process which provides a highest signal to noise ratio, and multiplexing and transmitting the index number with the encoded parameters.
    Type: Grant
    Filed: March 22, 2001
    Date of Patent: March 22, 2005
    Assignee: Fujitsu Limited
    Inventor: Fumio Amano
  • Patent number: 6842731
    Abstract: A prediction parameter analysis apparatus comprises a windowing part which generates a short time input signal by subjecting an input signal or a signal derived from the input signal to windowing, a component removal part which removes an unnecessary component from the short time input signal to generate a modified short time input signal, an autocorrelation coefficient computation part which computes autocorrelation coefficients based on the modified short time input signal, and a prediction parameter computation part which computes prediction parameters based on the autocorrelation coefficients.
    Type: Grant
    Filed: May 16, 2002
    Date of Patent: January 11, 2005
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kimio Miseki
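The four stages named in the abstract above (windowing, component removal, autocorrelation, prediction parameters) map onto a conventional linear-prediction analysis chain. The sketch below is one plausible reading under stated assumptions: a Hamming window, mean (DC) removal as a stand-in for the unspecified "unnecessary component", and the Levinson-Durbin recursion for the final stage. None of these choices is confirmed by the abstract, and all function names are hypothetical.

```python
import math

def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def autocorr(x, order):
    """Autocorrelation coefficients r[0..order] of sequence x."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order], with x[n] ~ sum_k a[k] * x[n-k],
    and the final prediction error, from autocorrelation r."""
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0:
            break  # degenerate (e.g. all-zero) input
        k_m = (r[m] - sum(a[k] * r[m - k] for k in range(1, m))) / err
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:], err

def prediction_parameters(frame, order):
    """Window, remove the mean (assumed component), then run LPC analysis."""
    short = [x * w for x, w in zip(frame, hamming(len(frame)))]
    mean = sum(short) / len(short)
    modified = [x - mean for x in short]
    return levinson_durbin(autocorr(modified, order), order)

# Usage: second-order analysis of a 64-sample sinusoidal frame.
frame = [math.sin(0.3 * n) for n in range(64)]
coeffs, err = prediction_parameters(frame, order=2)
print(coeffs, err)
```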
  • Publication number: 20040254786
    Abstract: The invention relates to a method for transcoding audio signals in a communications system. In order to improve the inter-operability between units (2,40) capable of handling wideband audio signals and units (3,46) or network components (50) capable of handling narrowband audio signals, it is proposed that first, an audio signal is received in a network element (42) of a communications network via which said audio signal is transmitted. Next, it is determined in said network element (42) whether a transcoding of the received audio signal is required. In case a narrowband-to-wideband transcoding of the received signal is required, the received narrowband audio signal is transcoded into a wideband audio signal in the network element (1,42). The generated wideband audio signal is then forwarded to the receiving terminal (2,40). The invention equally relates to a corresponding communications system and its components.
    Type: Application
    Filed: January 14, 2004
    Publication date: December 16, 2004
    Inventors: Olli Kirla, Henrik Lepanaho, Teemu Himanen
  • Patent number: 6829578
    Abstract: Robust acoustic tone features are achieved first by the introduction of on-line, look-ahead trace back of the fundamental frequency (F0) contour with adaptive pruning; this fundamental frequency serves as the signal preprocessing front-end. The F0 contour is subsequently decomposed into lexical tone effect, phrase intonation effect, and random effect by means of a time-variant, weighted moving average (MA) filter in conjunction with weighted (placing more emphasis on vowels) least squares of the F0 contour. The intonation effect is removed by subtraction of the F0 contour under a superposition assumption. The acoustic tone features are defined in two parts. The first part is the coefficients of the second order weighted regression of the de-intonation of the F0 contour over neighbouring frames. The second part deals with the degree of periodicity of the signal, given by the coefficients of the second order regression of the auto-correlation.
    Type: Grant
    Filed: July 9, 2001
    Date of Patent: December 7, 2004
    Assignee: Koninklijke Philips Electronics, N.V.
    Inventors: Chang-Han Huang, Frank Torsten Bernd Seide
  • Patent number: 6823302
    Abstract: A method for providing real-time perceptual quality measurements of an audio signal (12) in which a quality test signal, including an audio test signal, is received by equipment under test. Playback of a pre-stored representation of the audio signal is coarsely synchronized (20) to the received audio test signal, for example, utilizing a synchronizing pulse in a header of the quality test signal. The playback is then finely synchronized (24) to the received audio signal, for example, by comparing data in a windowed portion of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal and by adjusting a windowed portion of the pre-stored representation of the audio test signal in accordance with results of the comparison. A window of the received audio test signal is then compared (14) to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.
    Type: Grant
    Filed: January 24, 2001
    Date of Patent: November 23, 2004
    Assignee: National Semiconductor Corporation
    Inventors: Ian Atkinson, Martin Lee, Wei Ma, Kambiz Homayounfar
  • Publication number: 20040230423
    Abstract: To select the encoding mode of an audio signal in a multi-channel system, a level of energy of the audio signal associated with each channel is determined, which in turn is used to compute a first value. Next, a second value based on a degree of correlation of the signals of each channel is determined. If the first value is smaller than the second value, the audio signal is encoded using a first encoding mode. Next, a third value defined by the energy levels and a fourth value defined by the correlation are computed. If the first value is greater than the second value, and the third value is smaller than the fourth value, the audio signal is encoded using a second encoding mode. Otherwise the audio signal is encoded using a third encoding mode.
    Type: Application
    Filed: May 16, 2003
    Publication date: November 18, 2004
    Applicant: Divio, Inc.
    Inventors: Christos Chrysafis, Siu-Leong Yu
  • Patent number: 6819275
    Abstract: Estimating a compression gain obtainable in compressing a given audio signal, comprising extracting a signal power in a selected frequency band of the given audio signal, and obtaining an estimation of the compression gain by correlation with the extracted signal power.
    Type: Grant
    Filed: May 8, 2002
    Date of Patent: November 16, 2004
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Derk Reefman, Petrus Antonius Cornelis Maria Nuijten
  • Patent number: 6789059
    Abstract: Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. To reduce the number of computations required to choose the optimal codebook vector, a subset of codevectors are selected based upon optimal pulse locations, wherein the subset of codevectors form a subcodebook. Rather than searching the entire codebook, only the entries of the subcodebook are searched.
    Type: Grant
    Filed: June 6, 2001
    Date of Patent: September 7, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Ananthapadmanabhan Kandhadai, Andrew P. DeJaco, Sharath Manjunath
  • Patent number: 6785645
    Abstract: An efficient and accurate classification method for classifying speech and music signals, or other diverse signal types, is provided. The method and system are especially, although not exclusively, suited for use in real-time applications. Long-term and short-term features are extracted relative to each frame, whereby short-term features are used to detect a potential switching point at which to switch a coder operating mode, and long-term features are used to classify each frame and validate the potential switch at the potential switch point according to the classification and a predefined criterion.
    Type: Grant
    Filed: November 29, 2001
    Date of Patent: August 31, 2004
    Assignee: Microsoft Corporation
    Inventors: Hosam Adel Khalil, Vladimir Cuperman, Tian Wang
  • Patent number: 6766289
    Abstract: Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. In encoding schemes that use forward and backward pitch enhancement, storage and processor load is reduced by approximating a two-dimensional autocorrelation matrix with a one-dimensional autocorrelation vector. The approximation is possible when a cross-correlation element is configured to determine the autocorrelation matrix of an impulse response and a pulse energy determination element is configured to determine the energy of a pulse code vector that incorporates secondary pulse positions.
    Type: Grant
    Filed: June 4, 2001
    Date of Patent: July 20, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Ananthapadmanabhan Kandhadai, Andrew P. DeJaco, Sharath Manjunath
  • Publication number: 20040098254
    Abstract: A search method for a fixed codebook is provided, and more particularly a focused search method and apparatus for a speech codec for Voice over Internet Protocol (VoIP). The focused search method of the fixed codebook includes: calculating absolute values of correlation vectors of respective pulse locations of tracks 0, 1, 2, and 3 and arranging the pulse locations in a descending order of the absolute values; and selecting a predetermined number of pulse locations for each track among the arranged candidate pulse locations and conducting a focused search over the selected locations. Therefore, it is possible to significantly reduce the calculation amount required for fixed codebook search while maintaining tone quality at a similar level.
    Type: Application
    Filed: November 12, 2003
    Publication date: May 20, 2004
    Inventors: Eung Don Lee, Do Young Kim, Bong Tae Kim
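The candidate-selection step described in the abstract above (rank each track's pulse locations by absolute correlation, keep a fixed number per track) can be sketched as below. The track layout, the `keep` count and the function name are hypothetical, and the focused search that would then run over the reduced candidate set is not shown.

```python
def focused_candidates(correlation, tracks, keep):
    """For each track, keep the `keep` pulse locations with the largest
    absolute correlation, in descending order."""
    selected = []
    for positions in tracks:
        ranked = sorted(positions, key=lambda p: abs(correlation[p]),
                        reverse=True)
        selected.append(ranked[:keep])
    return selected

# Toy example: 8 pulse locations interleaved across tracks 0 to 3.
correlation = [0.1, -0.9, 0.3, 0.2, -0.5, 0.4, 0.8, -0.7]
tracks = [[0, 4], [1, 5], [2, 6], [3, 7]]
print(focused_candidates(correlation, tracks, keep=1))  # [[4], [1], [6], [7]]
```

Restricting each track to its top few locations shrinks the number of pulse combinations the later search has to evaluate, which is where the reduction in calculation comes from.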
  • Publication number: 20040093202
    Abstract: Disclosed are a computerized method and system for the identification of identical or similar audio recordings or segments of audio recordings. Identity or similarity between a first audio segment of a first audio stream and at least a second audio segment of an at least second audio stream is determined by digitizing at least the first audio segment and the at least second audio segment of said audio streams, calculating characteristic signatures from at least one local feature of the first audio segment and the at least second audio segment, aligning the at least two characteristic signatures, comparing the at least two aligned characteristic signatures and calculating a distance between the aligned characteristic signatures and determining identity or similarity between the at least two audio segments based on the determined distance.
    Type: Application
    Filed: September 12, 2003
    Publication date: May 13, 2004
    Inventors: Uwe Fischer, Stefan Hoffmann, Werner Kriechbaum, Gerhard Stenzel
  • Patent number: 6718305
    Abstract: Disclosed is a method for use by a speech recognizer. The method includes determining a regression class tree structure for the speech recognizer, wherein determining the tree structure includes: representing word subunits or regression classes as tree leaves; combining the word subunits to form tree nodes using a distance measure for the word subunits in the acoustic space; and combining regression classes into a regression class that lies closer to a tree root of the tree structure using a correlation measure, wherein at least two of the regression classes having the largest correlation parameter are combined into a new regression class that is used in the formation of the regression tree structure, instead of the two combined regression classes, to determine a regression class representing the tree root.
    Type: Grant
    Filed: March 17, 2000
    Date of Patent: April 6, 2004
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Reinhold Häb-Umbach
  • Patent number: 6694010
    Abstract: A fast-converging, computationally simple, method for recognizing a single frequency tone or a sinusoid in a signal without prior knowledge of the tone frequency. The method employs a second order or higher auto-regressive model and includes: (a) sampling the signal at a constant sampling rate, and, for each sample, recursively determining a finite number of correlation coefficients using a time-reversed, exponentially weighted, future sliding equivalent of the signal, wherein the correlation coefficients are determined using pre-existing values of the correlation coefficients determined in a previous iteration, a current sample of the signal and at least two consecutively previous samples of the signal; (b) periodically determining at least the second auto-regressive coefficient modeling the signal using the correlation coefficients; and (c) recognizing the presence of the tone based on the value of the second auto-regressive coefficient.
    Type: Grant
    Filed: April 27, 1999
    Date of Patent: February 17, 2004
    Assignee: Alcatel Canada Inc.
    Inventor: Eric Verreault
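To illustrate why the second auto-regressive coefficient reveals a tone: a pure sinusoid satisfies x[n] = 2cos(ω)·x[n-1] − x[n-2], so in a second-order model x[n] ≈ a1·x[n-1] + a2·x[n-2] the coefficient a2 equals −1 regardless of the frequency. The sketch below estimates a2 from recursively updated, exponentially weighted correlations. It is a simplified stand-in, not the patent's time-reversed formulation, and the sign convention, `forgetting` factor and `tolerance` are assumptions.

```python
import math

def second_ar_coefficient(samples, forgetting=0.98):
    """Least-squares estimate of a2 in x[n] ~ a1*x[n-1] + a2*x[n-2], using
    exponentially weighted correlations updated once per sample."""
    r11 = r12 = r22 = p1 = p2 = 0.0
    for n in range(2, len(samples)):
        x, x1, x2 = samples[n], samples[n - 1], samples[n - 2]
        # Each correlation is its previous value, decayed, plus a new product.
        r11 = forgetting * r11 + x1 * x1
        r12 = forgetting * r12 + x1 * x2
        r22 = forgetting * r22 + x2 * x2
        p1 = forgetting * p1 + x * x1
        p2 = forgetting * p2 + x * x2
    det = r11 * r22 - r12 * r12
    if abs(det) < 1e-12:
        return None  # not enough information to fit the model
    # Cramer's rule on [[r11, r12], [r12, r22]] @ [a1, a2] = [p1, p2].
    return (r11 * p2 - r12 * p1) / det

def is_tone(samples, tolerance=0.05):
    """Report a tone when a2 is close to -1."""
    a2 = second_ar_coefficient(samples)
    return a2 is not None and abs(a2 + 1.0) < tolerance

tone = [math.sin(0.7 * n) for n in range(200)]
mix = [math.sin(0.5 * n) + math.sin(2.0 * n) for n in range(200)]
print(is_tone(tone), is_tone(mix))
```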
  • Patent number: 6687672
    Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.
    Type: Grant
    Filed: March 15, 2002
    Date of Patent: February 3, 2004
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
  • Patent number: 6675114
    Abstract: A system and method (and storage media) for identifying the category of a noise or sound source by using physical factors derived from the autocorrelation function (ACF) and the interaural crosscorrelation function (IACF), which change continually in the time domain, based on a model of the human auditory-brain function system. A method for evaluating a sound comprises the steps of: using a sound recorder to capture and record the acoustic signal; calculating the ACF from the acoustic signal using a CPU; calculating ACF factors extracted from the calculated ACF using the CPU; and identifying the kind of noise based on the ACF factors. An unknown noise source can thereby be identified both by category (such as automobile noise or factory noise) and by type (such as the type of car or the type of machine).
    Type: Grant
    Filed: July 10, 2002
    Date of Patent: January 6, 2004
    Assignee: Kobe University
    Inventors: Yoichi Ando, Hiroyuki Sakai
  • Patent number: 6662153
    Abstract: A time-separated speech coder that codes a transitional signal of voiced/unvoiced sound through harmonic speech coding, the coder including a transitional excitation signal analyzer/synthesizer for coding the transitional signal by extracting the harmonic model parameters of both transitional analyzers after detecting a transitional point and generating sinusoidal waveforms according to a variable transitional point separating both transitional analyzers. By detecting the transitional point at which energy varies abruptly and applying time-separated coding based on that point, the time-separated speech coder can achieve better speech quality than a general harmonic speech coder, because adapting to the variable transitional point increases its capability to represent transitional signals with large energy variation.
    Type: Grant
    Filed: January 24, 2001
    Date of Patent: December 9, 2003
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hyoung Jung Kim, In Sung Lee, Jong Hark Kim, Man Ho Park, Byung Sik Yoon, Song In Choi, Dae Sik Kim
  • Patent number: 6658381
    Abstract: Techniques and systems for identifying coding rates of transmitted frames are described. Unused bits in rate adapted frames are used to carry frame type indicator patterns. Maximal rate frames (i.e., with a highest coding rate) need not include a frame type indicator.
    Type: Grant
    Filed: September 20, 2000
    Date of Patent: December 2, 2003
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Karl Hellwig, Robert Bäuml, Jesus Andonegui
  • Publication number: 20030177003
    Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.
    Type: Application
    Filed: March 15, 2002
    Publication date: September 18, 2003
    Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
  • Patent number: 6622117
    Abstract: In connection with blind source separation, proposed herein, inter alia, are: expectation-maximization equations to iteratively estimate unmixing filters and source density parameters in the context of convolutive independent component analysis where the sources are modeled with mixtures of Gaussians; a scheme to estimate the length of unmixing filters; and two alternative initialization schemes.
    Type: Grant
    Filed: May 14, 2001
    Date of Patent: September 16, 2003
    Assignee: International Business Machines Corporation
    Inventors: Sabine Deligne, Ramesh A. Gopinath
  • Publication number: 20030171918
    Abstract: A method of filtering digital audio data with short delay according to the present invention comprises the steps of: (a) buffering input source digital audio data; (b) calculating digital data that are being filtered from the buffered input source digital audio data; and (c) outputting a portion of the filtered digital data. The source digital audio data are a sequence of digital samples. Buffer contents are shifted to arrange some of the input samples in a buffer in step (a). Step (b) comprises the steps of: (b-1) calculating a correlation matrix; (b-2) decomposing the correlation matrix; (b-3) calculating a filter matrix; and (b-4) calculating an approximation of the filtered digital data.
    Type: Application
    Filed: February 21, 2003
    Publication date: September 11, 2003
    Inventors: Mikhael A. Sall, Sergei N. Gramnitskiy, Alexandr L. Maiboroda, Victor V. Redkov, Anatoli I. Tikhotsky, Andrei B. Viktorov
  • Publication number: 20030125938
    Abstract: The present invention discloses a complete speech recognition system having a training button and a recognition button, and the whole system uses the application specific integrated circuit (ASIC) architecture for the design, and also uses the modular design to divide the speech processing into 4 modules: system control module, autocorrelation and linear predictive coefficient module, cepstrum module, and DTW recognition module. Each module forms an intellectual product (IP) component by itself. Each IP component can work with various products and application requirements for the design reuse to greatly shorten the time to market.
    Type: Application
    Filed: December 24, 2002
    Publication date: July 3, 2003
    Inventors: Jhing-Fa Wang, Jia-Ching Wang, Tai-Lung Chen, Chin-Chan Chang
  • Publication number: 20030101050
    Abstract: An efficient and accurate classification method for classifying speech and music signals, or other diverse signal types, is provided. The method and system are especially, although not exclusively, suited for use in real-time applications. Long-term and short-term features are extracted relative to each frame, whereby short-term features are used to detect a potential switching point at which to switch a coder operating mode, and long-term features are used to classify each frame and validate the potential switch at the potential switch point according to the classification and a predefined criterion.
    Type: Application
    Filed: November 29, 2001
    Publication date: May 29, 2003
    Applicant: Microsoft Corporation
    Inventors: Hosam Adel Khalil, Vladimir Cuperman, Tian Wang
  • Patent number: 6564183
    Abstract: A speech encoding/decoding apparatus. A speech encoding apparatus has a coding portion for receiving input information related to an uncoded signal representative of an original speech signal, the coding portion including a fixed coding portion for receiving the input information and producing a first coded signal estimate, and an adaptive coding portion for receiving the input information and producing a second coded signal estimate. A controller is connected to the fixed coding portion and the adaptive coding portion for receiving information indicative of speech characteristics of the uncoded signal and generates a control signal; and a code modifier receives the first coded signal estimate from the fixed coding portion and the control signal from the controller and produces a modified signal estimate.
    Type: Grant
    Filed: December 22, 1999
    Date of Patent: May 13, 2003
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventors: Roar Hagen, Erik Ekudden
  • Publication number: 20030088405
    Abstract: A method of processing a decoded speech (DS) signal including successive DS frames, each DS frame including DS samples. The method comprises: adaptively filtering the DS signal to produce a filtered signal; gain-scaling the filtered signal with an adaptive gain updated once a DS frame, thereby producing a gain-scaled signal; and performing a smoothing operation to smooth possible waveform discontinuities in the gain-scaled signal.
    Type: Application
    Filed: August 9, 2002
    Publication date: May 8, 2003
    Applicant: Broadcom Corporation
    Inventors: Juin-Hwey Chen, Jes Thyssen, Chris C. Lee
  • Publication number: 20030046066
    Abstract: Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. To reduce the number of computations required to choose the optimal codebook vector, a subset of codevectors are selected based upon optimal pulse locations, wherein the subset of codevectors form a subcodebook. Rather than searching the entire codebook, only the entries of the subcodebook are searched.
    Type: Application
    Filed: June 6, 2001
    Publication date: March 6, 2003
    Inventors: Ananthapadmanabhan Kandhadai, Andrew P. DeJaco, Sharath Manjunath
  • Patent number: 6502067
    Abstract: A method for processing a sound signal y in which redundancy, consisting mainly of near repetitions of signal profiles, is detected and correlations between the signal profiles are determined within segments of the sound signal. Correlated signal components are allocated to a power component and uncorrelated signal components to a noise component of the sound signal. The correlations between the signal profiles are determined by methods of nonlinear noise reduction in deterministic systems in reconstructed vector spaces based on the time domain.
    Type: Grant
    Filed: December 17, 1999
    Date of Patent: December 31, 2002
    Assignee: Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V.
    Inventors: Rainer Hegger, Holger Kantz, Lorenzo Matassini
  • Patent number: 6502068
    Abstract: A speech coding apparatus includes a pulse position candidate table, inter-pulse distortion table, first and second reference address tables, first and second reference address table creation units, and search table creation unit. The pulse position candidate table stores the pulse position candidate of each pulse. The inter-pulse distortion table stores a distortion calculated every pulse interval. The first reference address table creation unit regards the pulse position of the inter-pulse distortion table as a relative distance from the start of the inter-pulse distortion table, calculates a distortion every pulse interval to obtain the absolute address of the inter-pulse distortion table, and stores it in the first reference address table. Table creation unit creates a second reference address table accordingly. A multipulse search table is created using these absolute addresses.
    Type: Grant
    Filed: September 18, 2000
    Date of Patent: December 31, 2002
    Assignee: NEC Corporation
    Inventor: Katsuya Misu
  • Patent number: 6477490
    Abstract: An audio signal compression apparatus for compressively coding an input audio signal comprises a time-to-frequency transformation unit for transforming the input audio signal to a frequency domain signal; a spectrum envelope calculation unit for calculating a spectrum envelope having different resolutions for different frequencies, from the input audio signal, using a weighting function on frequency based on human auditory characteristics; a normalization unit for normalizing the frequency domain signal using the spectrum envelope to obtain a residual signal; a power normalization unit for normalizing the residual signal by the power; an auditory weighting calculation unit for calculating weighting coefficients on frequency, based on the spectrum of the input audio signal and human auditory characteristics; and a multi-stage quantization device having plural stages of vector quantizers connected in series, to which the normalized residual signal is input, and at least one of the vector quantizers quantizing t
    Type: Grant
    Filed: June 28, 2001
    Date of Patent: November 5, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Yoshihisa Nakatoh, Takeshi Norimatsu, Mineo Tsushima, Tomokazu Ishikawa, Mitsuhiko Serikawa, Taro Katayama, Junichi Nakahashi, Yoriko Yagi
  • Patent number: 6463406
    Abstract: An analyzer and synthesizer (500) for human speech using LPC filtering (530) of an excitation of mixed (508-518-520) voiced pulse train (502) and unvoiced noise (512) with fractional sampling period pitch period determination.
    Type: Grant
    Filed: May 20, 1996
    Date of Patent: October 8, 2002
    Assignee: Texas Instruments Incorporated
    Inventor: Alan V. McCree
  • Patent number: 6424942
    Abstract: In a method and arrangement for telecommunication, it is detected (120) whether an incoming signal is speech or background noise, and parameters characterising the incoming signal are encoded (100, 110) and transmitted. In or before (103) the encoding of the background noise, parameters are produced which represent background noise having increased low-frequency components. Thus, the incoming signal can be subjected (103) to a frequency tilting operation. The degree to which the low-frequency components are increased is determined by the maximum long-term correlation of the incoming signal. This method and arrangement provides better generation of comfort noise when the input signal comprises low-frequency sinusoids, such as engine noise from cars and trams.
    Type: Grant
    Filed: October 25, 1999
    Date of Patent: July 23, 2002
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Peter Mustel, Ingemar Johansson
  • Patent number: 6424938
    Abstract: Perceptually relevant non-speech information can be preserved during encoding of an audio signal by determining whether the audio signal includes such information. If so, a speech/noise classification of the audio signal is overridden to prevent misclassification of the audio signal as noise.
    Type: Grant
    Filed: November 5, 1999
    Date of Patent: July 23, 2002
    Assignee: Telefonaktiebolaget L M Ericsson
    Inventors: Ingemar Johansson, Erik Ekudden, Jonas Svedberg, Anders Uvliden
  • Publication number: 20020065648
    Abstract: A voice encoding method includes the steps of encoding a first frame that contains a plurality of voice data into encoded parameters, locally decoding the encoded parameters of the first frame into a second frame, performing a plurality of interpolation recovery processes that generate respective frames approximating to the first frame by using a frame or frames other than the first frame, comparing the second frame with the frames approximating to the first frame generated by the plurality of interpolation recovery processes, calculating a signal to noise ratio of each of the frames approximating to the first frame by treating the second frame as the signal, determining an index number that indicates an interpolation recovery process which provides a highest signal to noise ratio, and multiplexing and transmitting the index number with the encoded parameters.
    Type: Application
    Filed: March 22, 2001
    Publication date: May 30, 2002
    Inventor: Fumio Amano
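    Illustration (not part of the patent record): the selection step in the abstract above, where each candidate recovery is scored against the locally decoded frame and the index of the best one is transmitted, can be sketched roughly as follows. The three recovery processes, the use of a following frame, and all function names are hypothetical stand-ins chosen for illustration; the abstract does not say which interpolation processes are used.

```python
import math

def snr_db(signal, approximation):
    """Signal-to-noise ratio in dB, treating `signal` as the reference."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum((s - a) ** 2 for s, a in zip(signal, approximation))
    if noise_power == 0:
        return float("inf")
    if signal_power == 0:
        return float("-inf")
    return 10 * math.log10(signal_power / noise_power)

# Hypothetical interpolation recovery processes. Each one builds an
# approximation of the current frame from neighboring frames only.
def repeat_previous(prev, nxt):
    return list(prev)

def average_neighbors(prev, nxt):
    return [(p + q) / 2 for p, q in zip(prev, nxt)]

def mute(prev, nxt):
    return [0.0] * len(prev)

PROCESSES = [repeat_previous, average_neighbors, mute]

def best_recovery_index(decoded, prev, nxt):
    """Index of the process whose output has the highest SNR against `decoded`."""
    scores = [snr_db(decoded, process(prev, nxt)) for process in PROCESSES]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy frames: `decoded` plays the role of the locally decoded second frame.
prev = [0.0, 1.0, 2.0, 3.0]
decoded = [1.0, 2.0, 3.0, 4.0]
nxt = [2.0, 3.0, 4.0, 6.0]
index = best_recovery_index(decoded, prev, nxt)
print(index)
```

    In this toy case the neighbor average is closest to the decoded frame, so its index is the one that would be multiplexed with the encoded parameters.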
  • Patent number: 6385548
    Abstract: An apparatus and method to characterize an input communication signal as being a voice, tone or noise signal is provided. The apparatus and method involve measuring variations of pitch over time from a sampled input signal. A minimum value of Average Magnitude Difference Function (AMDF) over a pitch range and an average variation value of the AMDF over sampled intervals are used to determine whether the signal is a voice signal, a tone or noise. Historical data of these values is maintained in a dual buffer arrangement and is used in the determination of signal type by detecting transitions.
    Type: Grant
    Filed: December 12, 1997
    Date of Patent: May 7, 2002
    Assignee: Motorola, Inc.
    Inventors: Satish Ananthaiyer, Eric David Elias
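    Illustration (not part of the patent record): a minimal sketch of the two quantities named in the abstract above, the minimum of the Average Magnitude Difference Function (AMDF) over a pitch range and its variation from frame to frame. The frame length, lag range, test signals, and the reading of "average variation" as the mean absolute change between consecutive frames are assumptions; the abstract gives no thresholds, so no voice/tone/noise decision is attempted here.

```python
import math
import random

def amdf(samples, lag):
    """Average magnitude difference between the frame and itself delayed by `lag`."""
    count = len(samples) - lag
    return sum(abs(samples[n] - samples[n + lag]) for n in range(count)) / count

def min_amdf(samples, min_lag, max_lag):
    """Smallest AMDF value over the candidate pitch lags."""
    return min(amdf(samples, k) for k in range(min_lag, max_lag + 1))

def average_variation(values):
    """Mean absolute change between consecutive per-frame values (assumed definition)."""
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(steps) / len(steps)

# Hypothetical settings: 8 kHz sampling, 30 ms frames, lags of 20 to 100 samples.
FS, FRAME, MIN_LAG, MAX_LAG = 8000, 240, 20, 100

def frame_minima(signal):
    """Minimum AMDF of each non-overlapping frame of `signal`."""
    frames = [signal[i:i + FRAME] for i in range(0, len(signal) - FRAME + 1, FRAME)]
    return [min_amdf(frame, MIN_LAG, MAX_LAG) for frame in frames]

random.seed(0)
tone = [math.sin(2 * math.pi * 200 * n / FS) for n in range(4 * FRAME)]
noise = [random.uniform(-1.0, 1.0) for _ in range(4 * FRAME)]

tone_minima = frame_minima(tone)
noise_minima = frame_minima(noise)
print(max(tone_minima), min(noise_minima))
print(average_variation(tone_minima), average_variation(noise_minima))
```

    A steady tone repeats exactly at its period, so its minimum AMDF stays near zero in every frame, while white noise has no lag at which it repeats and its minimum stays well above zero.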
  • Patent number: RE38889
    Abstract: A pitch period extracting apparatus includes a microcomputer which determines a sampling frequency for an A/D converter, and a range of delay times for calculating autocorrelative values on the basis of the sampling frequency. For example, the delay times are set within a range of 20 samples ≤ k ≤ 100 samples in a case of 8 kHz, and a range of 15 samples ≤ k ≤ 75 samples in a case of 6 kHz. The microcomputer calculates the autocorrelative values of speech signal data stored in a buffer memory, and outputs a delay time at which a maximum autocorrelative value is obtainable as a pitch period of an inputted speech signal.
    Type: Grant
    Filed: October 6, 2000
    Date of Patent: November 22, 2005
    Assignee: Sanyo Electric Co., Ltd.
    Inventor: Takeo Inoue
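    Illustration (not part of the patent record): the search described in the abstract above, which picks the delay with the largest autocorrelation inside a range that depends on the sampling frequency, can be sketched as below. The two lag ranges are the ones quoted in the abstract; the function name, the lookup table, and the sine test signal are assumptions made for the example, and the plain (unnormalized) sum is one possible choice of autocorrelation estimate.

```python
import math

def pitch_period(samples, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_value = min_lag, float("-inf")
    for k in range(min_lag, max_lag + 1):
        # Autocorrelation at delay k over the overlapping part of the buffer.
        r = sum(samples[n] * samples[n + k] for n in range(len(samples) - k))
        if r > best_value:
            best_lag, best_value = k, r
    return best_lag

# Lag ranges quoted in the abstract, keyed by sampling frequency in Hz.
LAG_RANGES = {8000: (20, 100), 6000: (15, 75)}

fs = 8000
f0 = 200.0  # hypothetical test tone with a period of 8000 / 200 = 40 samples
buffer = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]
lag = pitch_period(buffer, *LAG_RANGES[fs])
print(lag, fs / lag)
```

    For this test tone the search settles on a lag of 40 samples, which corresponds to 200 Hz at 8 kHz sampling. Both ranges in the abstract cover the same span of pitch frequencies (80 Hz to 400 Hz), which is presumably why the range scales with the sampling rate.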