Correlation Function Patents (Class 704/216)
-
Patent number: 7283954Abstract: A method for determining if one audio signal is derived from another audio signal or if two audio signals are derived from the same audio signal compares reduced-information characterizations of said audio signals, wherein said characterizations are based on auditory scene analysis. The comparison removes from the characterisations or minimizes in the characterisations the effect of temporal shift or delay on the audio signals (5-1), calculates a measure of similarity (5-2), and compares the measure of similarity against a threshold. In one alternative, the effect of temporal shift or delay is removed or minimized by cross-correlating the two characterizations. In another alternative, the effect of temporal shift or delay is removed or minimized by transforming the characterizations into a domain that is independent of temporal delay effects, such as the frequency domain. In both cases, a measure of similarity is calculated by calculating a coefficient of correlation.Type: GrantFiled: February 22, 2002Date of Patent: October 16, 2007Assignee: Dolby Laboratories Licensing CorporationInventors: Brett G. Crockett, Michael J. Smithers
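The cross-correlation alternative described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the characterizations are assumed to be plain numeric sequences, and the names `best_lag`, `pearson`, `similarity`, and the `max_lag` parameter are hypothetical.

```python
import math


def pearson(a, b):
    """Coefficient of correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def best_lag(a, b, max_lag):
    """Lag of b relative to a that maximizes the raw cross-correlation."""
    def xcorr(lag):
        return sum(a[i] * b[i + lag]
                   for i in range(len(a)) if 0 <= i + lag < len(b))
    return max(range(-max_lag, max_lag + 1), key=xcorr)


def similarity(a, b, max_lag):
    """Remove the estimated delay, then correlate the overlapping parts."""
    lag = best_lag(a, b, max_lag)
    if lag >= 0:
        a_aligned, b_aligned = a, b[lag:]
    else:
        a_aligned, b_aligned = a[-lag:], b
    n = min(len(a_aligned), len(b_aligned))
    return lag, pearson(a_aligned[:n], b_aligned[:n])
```

The returned coefficient would then be compared against a threshold chosen by the application; the abstract does not give a value.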
-
Patent number: 7251597Abstract: A method for tracking a pitch signal, including receiving a detected pitch signal that consists of a succession of pitch values and, for each current pitch value in the detected signal, performing the following steps: constructing sub-sequences of consistent pitch values from neighboring pitch values; calculating the significance of the sub-sequences; and selecting a sub-sequence or a collection of consistent sub-sequences with the highest significance. If the current pitch value is not consistent with the sub-sequence with the highest significance, the current pitch value is smoothed by dividing it or multiplying it by an integer value greater than 1, so as to render it consistent with that sub-sequence.Type: GrantFiled: December 27, 2002Date of Patent: July 31, 2007Assignee: International Business Machines CorporationInventor: Dan Chazan
-
Patent number: 7243065Abstract: A comfort noise generator (104) suitable for use in a communication system includes a finite impulse response (FIR) filter (136), a random number generator (140), and a coefficient updater (138). The coefficient updater (138) determines an updated set of filter coefficients (142) based on the signal frame of the input signal (102). The updated set of filter coefficients (142) is output to the FIR filter (136). The FIR filter (136) shapes a white noise signal (146) supplied by the random number generator (140) to provide a simulated background noise signal, or comfort noise signal (122). The comfort noise signal (122) is selectively output from an echo suppression system or corresponding method to overwrite or suppress reflected residual echoes.Type: GrantFiled: April 8, 2003Date of Patent: July 10, 2007Assignee: FreeScale Semiconductor, IncInventors: James Allen Stephens, David L. Barron, Sean S. You
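The noise-shaping step (white noise passed through an FIR filter) can be sketched as below. How the coefficient updater derives the coefficients from the input frame is not covered here; `comfort_noise`, `coeffs`, and `seed` are hypothetical names, and Gaussian noise is an assumption.

```python
import random


def comfort_noise(coeffs, n_samples, seed=0):
    """Shape white Gaussian noise with an FIR filter.

    `coeffs` stands in for the filter coefficients supplied by the
    coefficient updater.
    """
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    shaped = []
    for i in range(n_samples):
        # Direct-form FIR: y[i] = sum_j coeffs[j] * white[i - j]
        shaped.append(sum(c * white[i - j]
                          for j, c in enumerate(coeffs) if i - j >= 0))
    return shaped
```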
-
Patent number: 7236927Abstract: A method of searching for an interpolated peak of a Normalized Correlation Square (NCS) signal derived from an audio signal, comprises: producing quadratically interpolated correlation (QIC) signal values at interpolated time lags; squaring each of the QIC signal values to produce square QIC signal values; producing an individual interpolated energy signal value corresponding to each of the square QIC signal values, wherein ratios of the square QIC signal values to their corresponding interpolated energy values represent interpolated NCS signal values; and selecting, as the interpolated peak, a largest interpolated NCS signal value among the interpolated NCS signal values without evaluating the ratios.Type: GrantFiled: October 31, 2002Date of Patent: June 26, 2007Assignee: Broadcom CorporationInventor: Juin-Hwey Chen
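One common way to pick the largest ratio without dividing is cross-multiplication, sketched below. The abstract does not state that this is the technique used, so treat it as an assumption; `argmax_ncs` is a hypothetical name, and all energy values are assumed to be strictly positive.

```python
def argmax_ncs(square_corr, energy):
    """Index of the largest square_corr[i] / energy[i], with no division.

    For positive energies, c_i / e_i > c_b / e_b is equivalent to
    c_i * e_b > c_b * e_i, so two multiplications replace each ratio.
    """
    best = 0
    for i in range(1, len(square_corr)):
        if square_corr[i] * energy[best] > square_corr[best] * energy[i]:
            best = i
    return best
```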
-
Patent number: 7187730Abstract: An apparatus and a method for symbol decoding of baseband data in a wireless communications network is disclosed, and specifically CCK subsymbol prediction and symbol demodulation that occurs at 5.5 Mbps or 11 Mbps. The apparatus is configured to demodulate or predict the data differently, depending on the modulation rate. If the data was modulated at 11 Mbps, the φ3 rotator is rotated through each of its possible phase values and symbol correlation takes four clock cycles to complete. If the data was modulated at 5.5 Mbps, φ3 is not rotated with a set value of 0 within the correlator architecture, thereby saving power and reducing symbol correlation and subsymbol prediction to a single cycle while in such transmission mode.Type: GrantFiled: September 19, 2002Date of Patent: March 6, 2007Assignee: Marvell International Ltd.Inventors: Guorong Hu, Yungping Hsu
-
Patent number: 7155386Abstract: An approach for adaptively adjusting the correlation window for open-loop pitch determination is presented. Correlation between a windowed reference signal (or target signal) and a candidate signal is maximized under most conditions by sliding the reference window by a delta increment in either direction to capture peak energy. The traditional fixed size of the correlation window is maintained. However, the window slides forward and/or backwards to capture peak energy within the window. The position of the adjusting or sliding window is allowed to shift in a small range or increment in either direction to maximize the energy of the windowed signal thus making sure that at least one peak energy is captured within the window.Type: GrantFiled: March 11, 2004Date of Patent: December 26, 2006Assignee: Mindspeed Technologies, Inc.Inventor: Yang Gao
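The sliding of a fixed-size window to capture peak energy can be sketched as a small search over start positions. This is an illustrative reading of the abstract; `best_window_start`, `nominal_start`, and `max_shift` are hypothetical names, and the signal is assumed to be at least `length` samples long.

```python
def best_window_start(signal, nominal_start, length, max_shift):
    """Shift a fixed-length window by up to max_shift samples in either
    direction and return the start index with the highest energy."""
    def energy(start):
        return sum(x * x for x in signal[start:start + length])

    lowest = max(0, nominal_start - max_shift)
    highest = min(len(signal) - length, nominal_start + max_shift)
    # On ties, max() keeps the earliest start position.
    return max(range(lowest, highest + 1), key=energy)
```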
-
Patent number: 7130795Abstract: A method is provided for detecting music in a speech signal having a plurality of frames. The method comprises obtaining one or more first pitch correlation candidates from a first frame of the plurality of frames; obtaining one or more second pitch correlation candidates from a second frame of the plurality of frames; selecting a pitch correlation (Rp) from the one or more first pitch correlation candidates and the one or more second pitch correlation candidates; and distinguishing music from background noise based on analyzing the pitch correlation (Rp).Type: GrantFiled: June 17, 2005Date of Patent: October 31, 2006Assignee: Mindspeed Technologies, Inc.Inventor: Yang Gao
-
Patent number: 7130292Abstract: A method and apparatus for enhancing the receiving and information identification functions of multiple access communications systems by employing one or more optical processors configured as a bank of 1-D correlators. The present invention is particularly useful in a DS/SS CDMA communications system, resulting in a multiuser CDMA system that approaches carrier to noise performance (C/N) as opposed to being limited by multiple access interference (MAI). The correlators are arranged in parallel to detect and/or demodulate the received signal, in conjunction with one or more complex algorithms to perform near-optimum multiuser detection, perform multipath combining and/or perform carrier Doppler compensation.Type: GrantFiled: January 19, 2001Date of Patent: October 31, 2006Assignee: Essex CorporationInventors: Terry M. Turpin, James L. Lafuse
-
Patent number: 7058569Abstract: A synthesis method for concatenative speech synthesis is provided for efficiently concatenating waveform segments in the time-domain. A digital waveform provider produces an input sequence of digital waveform segments. A waveform concatenator concatenates the input segments by using waveform blending within a concatenation zone to synchronize, weight, and overlap-add selected portions of the input segments to produce a single digital waveform. The synchronizing includes determining a minimum weighted energy anchor in the selected portion of each input segment and aligning synchronization peaks in a local vicinity of each anchor.Type: GrantFiled: September 14, 2001Date of Patent: June 6, 2006Assignee: Nuance Communications, Inc.Inventors: Geert Coorman, Bert Van Coile
-
Patent number: 7039582Abstract: A computationally efficient and robust pitch detection and tracking system and related methods are presented. According to certain exemplary implementations a method is presented comprising identifying an initial set of pitch period candidates using a first estimation algorithm, filtering the initial set of candidates and passing the filtered candidates through a second, more accurate pitch estimation algorithm to generate a final set of pitch period candidates from which the most likely pitch value is selected.Type: GrantFiled: February 22, 2005Date of Patent: May 2, 2006Assignee: Microsoft CorporationInventors: Eric I-Chao Chang, Jian-Lai Zhou
-
Patent number: 7035790Abstract: A speech processing system is provided which is operable to receive sets of signal values representative of a speech signal generated by a speech source as distorted by a transmission channel between the speech source and the speech processing system. The system stores data defining a predetermined function derived from a signal model which models both the speech source and the channel and defining a probability density function which gives, for a given set of model parameters, the probability that the signal model has those model parameters given that the signal model is assumed to have generated the received set of signal values. The system applies a current set of received signal values to the stored probability density function and then draws samples from it using a Gibbs sampler. The system then analyses the samples to determine a set of parameter values representative of the speech signal before it was distorted by the channel.Type: GrantFiled: May 30, 2001Date of Patent: April 25, 2006Assignee: Canon Kabushiki KaishaInventor: Jebu Jacob Rajan
-
Patent number: 7027980Abstract: A system or method for modeling a signal, such as a speech signal, in which harmonic frequencies and amplitudes are identified and the harmonic magnitudes are interpolated to obtain spectral magnitudes at a set of fixed frequencies. An inverse transform is applied to the spectral magnitudes to obtain a pseudo auto-correlation sequence, from which linear prediction coefficients are calculated. From the linear prediction coefficients, model harmonic magnitudes are generated by sampling the spectral envelope defined by the linear prediction coefficients. A set of scale factors are then calculated as the ratio of the harmonic magnitudes to the model harmonic magnitudes and interpolated to obtain a second set of scale factors at the set of fixed frequencies. The spectral envelope magnitudes at the set of fixed frequencies are multiplied by the second set of scale factors to obtain new spectral magnitudes and the process is iterated to obtain final linear prediction coefficients.Type: GrantFiled: March 28, 2002Date of Patent: April 11, 2006Assignee: Motorola, Inc.Inventors: Tenkasi V. Ramabadran, Aaron M. Smith, Mark A. Jasiuk
-
Patent number: 7016507Abstract: This invention describes a practical application of noise reduction in hearing aids. Although listening in noisy conditions is difficult for persons with normal hearing, hearing impaired individuals are at a considerable further disadvantage. Under light noise conditions, conventional hearing aids amplify the input signal sufficiently to overcome the hearing loss. For a typical sloping hearing loss where there is a loss in high frequency hearing sensitivity, the amount of boost (or gain) rises with frequency. Most frequently, the loss in sensitivity is only for low-level signals; high level signals are affected minimally or not at all. A compression hearing aid is able to compensate by automatically lowering the gain as the input signal level rises. This compression action is usually compromised under noisy conditions.Type: GrantFiled: April 16, 1998Date of Patent: March 21, 2006Assignee: AMI Semiconductor Inc.Inventor: Robert Brennan
-
Patent number: 6999922Abstract: The present invention (110) permits a user to speed up and slow down speech without changing the speaker's pitch (102, 110, 112, 128, 402–416). It is a user-adjustable feature to change the spoken rate to the listener's preferred listening rate or comfort. It can be included on the phone as a customer convenience feature without changing any characteristics of the speaker's voice besides the speaking rate with soft key button (202) combinations (in interconnect or normal). From the user's perspective, it would seem only that the talker changed his speaking rate, and not that the speech was digitally altered in any way. The pitch and general prosody of the speaker are preserved. The following uses of the time expansion/compression feature are listed to complement already existing technologies or applications in progress, including messaging services, messaging applications and games, and a real-time feature to slow down the listening rate.Type: GrantFiled: June 27, 2003Date of Patent: February 14, 2006Assignee: Motorola, Inc.Inventors: Marc Andre Boillot, John Gregory Harris, Thomas Lawrence Reinke
-
Patent number: 6952670Abstract: An extraction section extracts a speech signal having ambient noise superimposed thereon as a data segment having a predetermined duration. An autocorrelation function normalizing section determines normalized autocorrelation function vectors. A normalized autocorrelation function count section counts a given number of normalized autocorrelation function vectors. A noise vector region/speech vector region/undefined vector computation section classifies the normalized autocorrelation function vectors into any of a noise vector region, a speech vector region, or undefined vectors. When the latest normalized autocorrelation function vector acquired by a normalized autocorrelation function vector determination section pertains to the noise vector region, the speech signal is determined to be a noise segment. In contrast, when the latest vector does not pertain to the noise vector region, the input signal is determined to be a speech segment.Type: GrantFiled: July 17, 2001Date of Patent: October 4, 2005Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Shogo Iizuka, Shigeru Hosoi, Kazuki Hoshino
-
Patent number: 6871175Abstract: A voice encoding method includes the steps of encoding a first frame that contains a plurality of voice data into encoded parameters, locally decoding the encoded parameters of the first frame into a second frame, performing a plurality of interpolation recovery processes that generate respective frames approximating to the first frame by using a frame or frames other than the first frame, comparing the second frame with the frames approximating to the first frame generated by the plurality of interpolation recovery processes, calculating a signal to noise ratio of each of the frames approximating to the first frame by treating the second frame as the signal, determining an index number that indicates an interpolation recovery process which provides a highest signal to noise ratio, and multiplexing and transmitting the index number with the encoded parameters.Type: GrantFiled: March 22, 2001Date of Patent: March 22, 2005Assignee: Fujitsu Limited KawasakiInventor: Fumio Amano
-
Patent number: 6842731Abstract: A prediction parameter analysis apparatus comprises a windowing part which generates a short time input signal by subjecting an input signal or a signal derived from the input signal to windowing, a component removal part which removes an unnecessary component from the short time input signal to generate a modified short time input signal, an autocorrelation coefficient computation part which computes autocorrelation coefficients based on the modified short time input signal, and a prediction parameter computation part which computes prediction parameters based on the autocorrelation coefficients.Type: GrantFiled: May 16, 2002Date of Patent: January 11, 2005Assignee: Kabushiki Kaisha ToshibaInventor: Kimio Miseki
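The windowing, autocorrelation, and prediction-parameter stages can be sketched as below. The sketch assumes a Hamming window and the Levinson-Durbin recursion, neither of which is named in the abstract, and it omits the component removal part entirely; all function names are hypothetical.

```python
import math


def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))
            for i in range(n)]


def autocorrelation(x, max_lag):
    """Autocorrelation coefficients r[0..max_lag] of a short-time signal."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order],
    where x[n] is predicted as sum_k a[k] * x[n - k].
    Returns (coefficients, residual prediction error)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err == 0:
            break
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        reflection = acc / err
        updated = a[:]
        updated[m] = reflection
        for k in range(1, m):
            updated[k] = a[k] - reflection * a[m - k]
        a = updated
        err *= 1.0 - reflection * reflection
    return a[1:], err


def prediction_parameters(frame, order):
    """Window the frame, then derive prediction parameters from its
    autocorrelation coefficients."""
    windowed = [w * s for w, s in zip(hamming(len(frame)), frame)]
    return levinson_durbin(autocorrelation(windowed, order), order)
```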
-
Publication number: 20040254786Abstract: The invention relates to a method for transcoding audio signals in a communications system. In order to improve the inter-operability between units (2,40) capable of handling wideband audio signals and units (3,46) or network components (50) capable of handling narrowband audio signals, it is proposed that first, an audio signal is received in a network element (42) of a communications network via which said audio signal is transmitted. Next, it is determined in said network element (42) whether a transcoding of the received audio signal is required. In case a narrowband-to-wideband transcoding of the received signal is required, the received narrowband audio signal is transcoded into a wideband audio signal in the network element (1,42). The generated wideband audio signal is then forwarded to the receiving terminal (2,40). The invention equally relates to a corresponding communications system and its components.Type: ApplicationFiled: January 14, 2004Publication date: December 16, 2004Inventors: Olli Kirla, Henrik Lepanaho, Teemu Himanen
-
Patent number: 6829578Abstract: Robust acoustic tone features are achieved first by the introduction of on-line, look-ahead trace back of the fundamental frequency (F0) contour with adaptive pruning; this fundamental frequency serves as the signal preprocessing front-end. The F0 contour is subsequently decomposed into lexical tone effect, phrase intonation effect, and random effect by means of a time-variant, weighted moving average (MA) filter in conjunction with weighted (placing more emphasis on vowels) least squares of the F0 contour. The intonation effect is removed by subtraction from the F0 contour under a superposition assumption. The acoustic tone features are defined in two parts. The first part is the coefficients of the second-order weighted regression of the de-intonation of the F0 contour over neighbouring frames. The second part deals with the degree of periodicity of the signal, expressed as the coefficients of the second-order regression of the auto-correlation.Type: GrantFiled: July 9, 2001Date of Patent: December 7, 2004Assignee: Koninklijke Philips Electronics, N.V.Inventors: Chang-Han Huang, Frank Torsten Bernd Seide
-
Patent number: 6823302Abstract: A method for providing real-time perceptual quality measurements of an audio signal (12) in which a quality test signal, including an audio test signal, is received by equipment under test. Playback of a pre-stored representation of the audio signal is coarsely synchronized (20) to the received audio test signal, for example, utilizing a synchronizing pulse in a header of the quality test signal. The playback is then finely synchronized (24) to the received audio signal, for example, by comparing data in a windowed portion of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal and by adjusting a windowed portion of the pre-stored representation of the audio test signal in accordance with results of the comparison. A window of the received audio test signal is then compared (14) to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.Type: GrantFiled: January 24, 2001Date of Patent: November 23, 2004Assignee: National Semiconductor CorporationInventors: Ian Atkinson, Martin Lee, Wei Ma, Kambiz Homayounfar
-
Publication number: 20040230423Abstract: To select the encoding mode of an audio signal in a multi-channel system, a level of energy of the audio signal associated with each channel is determined, which in turn is used to compute a first value. Next, a second value based on a degree of correlation of the signals of each channel is determined. If the first value is smaller than the second value, the audio signal is encoded using a first encoding mode. Next, a third value defined by the energy levels and a fourth value defined by the correlation are computed. If the first value is greater than the second value, and the third value is smaller than the fourth value, the audio signal is encoded using a second encoding mode. Otherwise the audio signal is encoded using a third encoding mode.Type: ApplicationFiled: May 16, 2003Publication date: November 18, 2004Applicant: Divio, Inc.Inventors: Christos Chrysafis, Siu-Leong Yu
-
Patent number: 6819275Abstract: Estimating a compression gain obtainable in compressing a given audio signal, comprising extracting a signal power in a selected frequency band of the given audio signal, and obtaining an estimation of the compression gain by correlation with the extracted signal power.Type: GrantFiled: May 8, 2002Date of Patent: November 16, 2004Assignee: Koninklijke Philips Electronics N.V.Inventors: Derk Reefman, Petrus Antonius Cornelis Maria Nuijten
-
Patent number: 6789059Abstract: Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. To reduce the number of computations required to choose the optimal codebook vector, a subset of codevectors are selected based upon optimal pulse locations, wherein the subset of codevectors form a subcodebook. Rather than searching the entire codebook, only the entries of the subcodebook are searched.Type: GrantFiled: June 6, 2001Date of Patent: September 7, 2004Assignee: Qualcomm IncorporatedInventors: Ananthapadmanabhan Kandhadai, Andrew P. DeJaco, Sharath Manjunath
-
Patent number: 6785645Abstract: An efficient and accurate classification method for classifying speech and music signals, or other diverse signal types, is provided. The method and system are especially, although not exclusively, suited for use in real-time applications. Long-term and short-term features are extracted relative to each frame, whereby short-term features are used to detect a potential switching point at which to switch a coder operating mode, and long-term features are used to classify each frame and validate the potential switch at the potential switch point according to the classification and a predefined criterion.Type: GrantFiled: November 29, 2001Date of Patent: August 31, 2004Assignee: Microsoft CorporationInventors: Hosam Adel Khalil, Vladimir Cuperman, Tian Wang
-
Patent number: 6766289Abstract: Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. In encoding schemes that use forward and backward pitch enhancement, storage and processor load is reduced by approximating a two-dimensional autocorrelation matrix with a one-dimensional autocorrelation vector. The approximation is possible when a cross-correlation element is configured to determine the autocorrelation matrix of an impulse response and a pulse energy determination element is configured to determine the energy of a pulse code vector that incorporates secondary pulse positions.Type: GrantFiled: June 4, 2001Date of Patent: July 20, 2004Assignee: Qualcomm IncorporatedInventors: Ananthapadmanabhan Kandhadai, Andrew P. DeJaco, Sharath Manjunath
-
Publication number: 20040098254Abstract: There are provided a search method of a fixed codebook, and more particularly, a focused search method and apparatus thereof, for being applied to a speech codec for Voice over Internet Protocol (VoIP). The focused search method of the fixed codebook includes: calculating absolute values of correlation vectors of respective pulse locations of tracks 0, 1, 2, and 3 and arranging the pulse locations in a descending order of the absolute values; and selecting a predetermined number of pulse locations for each track among candidate pulse locations arranged and conducting focused search of the selected result. Therefore, it is possible to significantly reduce a calculation amount required for fixed codebook search while maintaining tone quality in a similar level.Type: ApplicationFiled: November 12, 2003Publication date: May 20, 2004Inventors: Eung Don Lee, Do Young Kim, Bong Tae Kim
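The candidate-selection stage of the focused search can be sketched as follows. The track layout is passed in as a parameter because the abstract does not specify it; `focused_candidates`, `tracks`, and `keep` are hypothetical names, and the subsequent search over the retained positions is not shown.

```python
def focused_candidates(correlation, tracks, keep):
    """For each track, rank its pulse locations by the absolute value of
    the correlation vector and keep only the strongest `keep` locations.

    `tracks` is a list of position lists, one per track (tracks 0 to 3
    in the abstract).
    """
    selected = []
    for positions in tracks:
        ranked = sorted(positions,
                        key=lambda p: abs(correlation[p]),
                        reverse=True)
        selected.append(ranked[:keep])
    return selected
```

Restricting the later search to these reduced position sets is what cuts the computation, at the cost of possibly missing the globally best pulse combination.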
-
Publication number: 20040093202Abstract: Disclosed are a computerized method and system for the identification of identical or similar audio recordings or segments of audio recordings. Identity or similarity between a first audio segment of a first audio stream and at least a second audio segment of an at least second audio stream is determined by digitizing at least the first audio segment and the at least second audio segment of said audio streams, calculating characteristic signatures from at least one local feature of the first audio segment and the at least second audio segment, aligning the at least two characteristic signatures, comparing the at least two aligned characteristic signatures and calculating a distance between the aligned characteristic signatures and determining identity or similarity between the at least two audio segments based on the determined distance.Type: ApplicationFiled: September 12, 2003Publication date: May 13, 2004Inventors: Uwe Fischer, Stefan Hoffmann, Werner Kriechbaum, Gerhard Stenzel
-
Patent number: 6718305Abstract: Disclosed is a method for use by a speech recognizer. The method includes determining a regression class tree structure for the speech recognizer, wherein the tree structure includes, representing word subunits or regression classes, as tree leaves, combining the word subunits to form tree nodes using a distance measure for the word subunits in the acoustic space, and combining regression classes to a regression class that lies closer to a tree root of the tree structure using a correlation measure, and wherein at least two of regression classes having the largest correlation parameter are combined to a new regression class that is used in the formation of the regression tree structure, instead of the two combined regression classes, to determine a regression class representing the tree root.Type: GrantFiled: March 17, 2000Date of Patent: April 6, 2004Assignee: Koninklijke Philips Electronics N.V.Inventor: Reinhold Häb-Umbach
-
Patent number: 6694010Abstract: A fast-converging, computationally simple, method for recognizing a single frequency tone or a sinusoid in a signal without prior knowledge of the tone frequency. The method employs a second order or higher auto-regressive model and includes: (a) sampling the signal at a constant sampling rate, and, for each sample, recursively determining a finite number of correlation coefficients using a time-reversed, exponentially weighted, future sliding equivalent of the signal, wherein the correlation coefficients are determined using pre-existing values of the correlation coefficients determined in a previous iteration, a current sample of the signal and at least two consecutively previous samples of the signal; (b) periodically determining at least the second auto-regressive coefficient modeling the signal using the correlation coefficients; and (c) recognizing the presence of the tone based on the value of the second auto-regressive coefficient.Type: GrantFiled: April 27, 1999Date of Patent: February 17, 2004Assignee: Alcatel Canada Inc.Inventor: Eric Verreault
-
Patent number: 6687672Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.Type: GrantFiled: March 15, 2002Date of Patent: February 3, 2004Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
-
Patent number: 6675114Abstract: A system and method (and storage media) for identifying the category of a noise or sound source by using physical factors derived from the autocorrelation function (ACF) and the interaural crosscorrelation function (IACF), which change continually in the time domain, based on a model of the human auditory-brain system. A method for evaluating a sound comprises the steps of: using a sound recorder to capture and record the acoustic signal; calculating the ACF from the acoustic signal using a CPU; calculating ACF factors extracted from the calculated ACF using the CPU; and identifying the kind of noise based on the ACF factors. The unknown noise source can thereby be identified (for example, as automobile noise or factory noise), along with its type, such as the type of car or machine.Type: GrantFiled: July 10, 2002Date of Patent: January 6, 2004Assignee: Kobe UniversityInventors: Yoichi Ando, Hiroyuki Sakai
-
Patent number: 6662153Abstract: A time-separated speech coder that codes a transitional signal of voiced/unvoiced sound through harmonic speech coding, the coder including a transitional excitation signal analyzer/synthesizer for coding the transitional signal by extracting the harmonic model parameters of both transitional analyzers after detecting a transitional point and generating sinusoidal waveforms according to a variable transitional point separating both transitional analyzers. By the transitional point at which energy varies abruptly and the time-separated coding based on the transitional point, more improved speech quality than in the general harmonic speech coder can be obtained using the time-separated speech coder by increasing the representation capability of the transitional signal with large energy variation, after adapting it to the variable transitional point.Type: GrantFiled: January 24, 2001Date of Patent: December 9, 2003Assignee: Electronics and Telecommunications Research InstituteInventors: Hyoung Jung Kim, In Sung Lee, Jong Hark Kim, Man Ho Park, Byung Sik Yoon, Song In Choi, Dae Sik Kim
-
Patent number: 6658381Abstract: Techniques and systems for identifying coding rates of transmitted frames are described. Unused bits in rate adapted frames are used to carry frame type indicator patterns. Maximal rate frames (i.e., with a highest coding rate) need not include a frame type indicator.Type: GrantFiled: September 20, 2000Date of Patent: December 2, 2003Assignee: Telefonaktiebolaget LM Ericsson (publ)Inventors: Karl Hellwig, Robert Bäuml, Jesus Andonegui
-
Publication number: 20030177003Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.Type: ApplicationFiled: March 15, 2002Publication date: September 18, 2003Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
-
Patent number: 6622117Abstract: In connection with blind source separation, proposed herein, inter alia, are: expectation-maximization equations to iteratively estimate unmixing filters and source density parameters in the context of convolutive independent component analysis where the sources are modeled with mixtures of Gaussians; a scheme to estimate the length of unmixing filters; and two alternative initialization schemes.Type: GrantFiled: May 14, 2001Date of Patent: September 16, 2003Assignee: International Business Machines CorporationInventors: Sabine Deligne, Ramesh A. Gopinath
-
Publication number: 20030171918Abstract: A method of filtering digital audio data with short delay according to the present invention comprises the steps of: (a) buffering input source digital audio data; (b) calculating digital data that are being filtered from the buffered input source digital audio data; and (c) outputting a portion of the filtered digital data. The source digital audio data are a sequence of digital samples. Buffer contents are shifted to arrange some of the input samples in a buffer in the step (a). The step (b) comprises the steps of: (b-1) calculating a correlation matrix; (b-2) decomposing the correlation matrix; (b-3) calculating a filter matrix; and (b-4) calculating an approximation of the filtered digital data.Type: ApplicationFiled: February 21, 2003Publication date: September 11, 2003Inventors: Mikhael A. Sall, Sergei N. Gramnitskiy, Alexandr L. Maiboroda, Victor V. Redkov, Anatoli I. Tikhotsky, Andrei B. Viktorov
-
Publication number: 20030125938Abstract: The present invention discloses a complete speech recognition system having a training button and a recognition button, and the whole system uses the application specific integrated circuit (ASIC) architecture for the design, and also uses the modular design to divide the speech processing into 4 modules: system control module, autocorrelation and linear predictive coefficient module, cepstrum module, and DTW recognition module. Each module forms an intellectual product (IP) component by itself. Each IP component can work with various products and application requirements for the design reuse to greatly shorten the time to market.Type: ApplicationFiled: December 24, 2002Publication date: July 3, 2003Inventors: Jhing-Fa Wang, Jia-Ching Wang, Tai-Lung Chen, Chin-Chan Chang
-
Publication number: 20030101050Abstract: An efficient and accurate classification method for classifying speech and music signals, or other diverse signal types, is provided. The method and system are especially, although not exclusively, suited for use in real-time applications. Long-term and short-term features are extracted relative to each frame, whereby short-term features are used to detect a potential switching point at which to switch a coder operating mode, and long-term features are used to classify each frame and validate the potential switch at the potential switch point according to the classification and a predefined criterion.Type: ApplicationFiled: November 29, 2001Publication date: May 29, 2003Applicant: Microsoft CorporationInventors: Hosam Adel Khalil, Vladimir Cuperman, Tian Wang
-
Patent number: 6564183Abstract: A speech encoding/decoding apparatus. A speech encoding apparatus has a coding portion for receiving input information related to an uncoded signal representative of an original speech signal, the coding portion including a fixed coding portion for receiving the input information and producing a first coded signal estimate, and an adaptive coding portion for receiving the input information and producing a second coded signal estimate. A controller is connected to the fixed coding portion and the adaptive coding portion for receiving information indicative of speech characteristics of the uncoded signal and generates a control signal; and a code modifier receives the first coded signal estimate from the fixed coding portion and the control signal from the controller and produces a modified signal estimate.Type: GrantFiled: December 22, 1999Date of Patent: May 13, 2003Assignee: Telefonaktiebolaget LM Erricsson (Publ)Inventors: Roar Hagen, Erik Ekudden
-
Publication number: 20030088405Abstract: A method of processing a decoded speech (DS) signal including successive DS frames, each DS frame including DS samples. The method comprises: adaptively filtering the DS signal to produce a filtered signal; gain-scaling the filtered signal with an adaptive gain updated once a DS frame, thereby producing a gain-scaled signal; and performing a smoothing operation to smooth possible waveform discontinuities in the gain-scaled signal.Type: ApplicationFiled: August 9, 2002Publication date: May 8, 2003Applicant: Broadcom CorporationInventors: Juin-Hwey Chen, Jes Thyssen, Chris C. Lee
-
Publication number: 20030046066Abstract: Methods and apparatus for quickly selecting an optimal excitation waveform from a codebook are presented herein. To reduce the number of computations required to choose the optimal codebook vector, a subset of codevectors are selected based upon optimal pulse locations, wherein the subset of codevectors form a subcodebook. Rather than searching the entire codebook, only the entries of the subcodebook are searched.Type: ApplicationFiled: June 6, 2001Publication date: March 6, 2003Inventors: Ananthapadmanabhan Kandhadai, Andrew P. DeJaco, Sharath Manjunath
-
Patent number: 6502067Abstract: A method for processing a sound signal y in which redundancy, consisting mainly of almost repetitions of signal profiles, is detected and correlations between the signal profiles are determined within segments of the sound signal. Correlated signal components are allocated to a power component and uncorrelated signal components to a noise component of the sound signal. The correlations between the signal profiles are determined by methods of nonlinear noise reduction in deterministic systems in reconstructed vector spaces based on the time domain.Type: GrantFiled: December 17, 1999Date of Patent: December 31, 2002Assignee: Max-Planck-Gesellschaft zur Forderung der Wissenschaften e.V.Inventors: Rainer Hegger, Holger Kantz, Lorenzo Matassini
-
Patent number: 6502068Abstract: A speech coding apparatus includes a pulse position candidate table, inter-pulse distortion table, first and second reference address tables, first and second reference address table creation units, and search table creation unit. The pulse position candidate table stores the pulse position candidate of each pulse. The inter-pulse distortion table stores a distortion calculated every pulse interval. The first reference address table creation unit regards the pulse position of the inter-pulse distortion table as a relative distance from the start of the inter-pulse distortion table, calculates a distortion every pulse interval to obtain the absolute address of the inter-pulse distortion table, and stores it in the first reference address table. Table creation unit creates a second reference address table accordingly. A multipulse search table is created using these absolute addresses.Type: GrantFiled: September 18, 2000Date of Patent: December 31, 2002Assignee: NEC CorporationInventor: Katsuya Misu
-
Patent number: 6477490Abstract: An audio signal compression apparatus for compressively coding an input audio signal comprises a time-to-frequency transformation unit for transforming the input audio signal to a frequency domain signal; a spectrum envelope calculation unit for calculating a spectrum envelope having different resolutions for different frequencies, from the input audio signal, using a weighting function on frequency based on human auditory characteristics; a normalization unit for normalizing the frequency domain signal using the spectrum envelope to obtain a residual signal; a power normalization unit for normalizing the residual signal by the power; an auditory weighting calculation unit for calculating weighting coefficients on frequency, based on the spectrum of the input audio signal and human auditory characteristics; and a multi-stage quantization device having plural stages of vector quantizers connected in series, to which the normalized residual signal is input, and at least one of the vector quantizers quantizing tType: GrantFiled: June 28, 2001Date of Patent: November 5, 2002Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Yoshihisa Nakatoh, Takeshi Norimatsu, Mineo Tsushima, Tomokazu Ishikawa, Mitsuhiko Serikawa, Taro Katayama, Junichi Nakahashi, Yoriko Yagi
-
Patent number: 6463406Abstract: An analyzer and synthesizer (500) for human speech using LPC filtering (530) of an excitation of mixed (508-518-520) voiced pulse train (502) and unvoiced noise (512) with fractional sampling period pitch period determination.Type: GrantFiled: May 20, 1996Date of Patent: October 8, 2002Assignee: Texas Instruments IncorporatedInventor: Alan V. McCree
-
Patent number: 6424942Abstract: A method and arrangement for telecommunication comprises that it is detected (120) whether an incoming signal is speech or background noise, and encoding (100, 110) and transmitting parameters characterising the incoming signal. In or before (103) in the encoding of the background noise, parameters are produced, which represent background noise having increased low frequency components. Thus, the incoming signal can be subjected (103) to a frequency tilting operation. The degree of increasing the low frequency components is determined by the maximum long term correlation of the incoming signal. This method and arrangement provides a better generation of comfort noise, when the input signal comprises low frequency sinusoids, such as engine noise from cars and trams.Type: GrantFiled: October 25, 1999Date of Patent: July 23, 2002Assignee: Telefonaktiebolaget LM Ericsson (publ)Inventors: Peter Mustel, Ingemar Johansson
-
Patent number: 6424938Abstract: Perceptually relevant non-speech information can be preserved during encoding of an audio signal by determining whether the audio signal includes such information. If so, a speech/noise classification of the audio signal is overridden to prevent misclassification of the audio signal as noise.Type: GrantFiled: November 5, 1999Date of Patent: July 23, 2002Assignee: Telefonaktiebolaget L M EricssonInventors: Ingemar Johansson, Erik Ekudden, Jonas Svedberg, Anders Uvliden
-
Publication number: 20020065648Abstract: A voice encoding method includes the steps of encoding a first frame that contains a plurality of voice data into encoded parameters, locally decoding the encoded parameters of the first frame into a second frame, performing a plurality of interpolation recovery processes that generate respective frames approximating to the first frame by using a frame or frames other than the first frame, comparing the second frame with the frames approximating to the first frame generated by the plurality of interpolation recovery processes, calculating a signal to noise ratio of each of the frames approximating to the first frame by treating the second frame as the signal, determining an index number that indicates an interpolation recovery process which provides a highest signal to noise ratio, and multiplexing and transmitting the index number with the encoded parameters.Type: ApplicationFiled: March 22, 2001Publication date: May 30, 2002Inventor: Fumio Amano
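The selection step in the abstract above (score each interpolation-recovered frame against the locally decoded frame and transmit the index of the best one) can be illustrated with a minimal sketch. The function names `snr_db` and `best_recovery_index` are hypothetical, frames are assumed to be plain lists of samples, and the SNR definition used here (energy of the decoded frame over energy of the difference) is an assumption rather than the patent's stated formula.

```python
import math

def snr_db(reference, candidate):
    """SNR in dB, treating `reference` as the signal and the difference as noise."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, candidate))
    if noise == 0:
        return math.inf           # candidate reproduces the reference exactly
    if signal == 0:
        return -math.inf          # silent reference, any error dominates
    return 10 * math.log10(signal / noise)

def best_recovery_index(decoded_frame, candidates):
    """Index of the candidate frame with the highest SNR against decoded_frame."""
    return max(range(len(candidates)),
               key=lambda i: snr_db(decoded_frame, candidates[i]))

# Hypothetical data: one locally decoded frame and three recovered candidates.
decoded = [1.0, 2.0, 3.0]
recovered = [[0.0, 0.0, 0.0], [1.0, 2.0, 2.5], [3.0, 3.0, 3.0]]
index = best_recovery_index(decoded, recovered)
```

In this sketch `index` is the value that would be multiplexed with the encoded parameters.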
-
Patent number: 6385548Abstract: An apparatus and method to characterize an input communication signal as being a voice, tone or noise signal is provided. The apparatus and method involve measuring variations of pitch over time from a sampled input signal. A minimum value of Average Magnitude Difference Function (AMDF) over a pitch range and an average variation value of the AMDF over sampled intervals are used to determine whether the signal is a voice signal, a tone or noise. Historical data of these values is maintained in a dual buffer arrangement and is used in the determination of signal type by detecting transitions.Type: GrantFiled: December 12, 1997Date of Patent: May 7, 2002Assignee: Motorola, Inc.Inventors: Satish Ananthaiyer, Eric David Elias
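For readers unfamiliar with the Average Magnitude Difference Function mentioned in the abstract above, the following sketch computes the AMDF at a given lag and its minimum over a lag range. It uses the common textbook definition of the AMDF; the function names, the lag range, and the test tone are assumptions, and the abstract's voice/tone/noise decision logic and dual-buffer history are not modeled because the abstract gives no thresholds.

```python
import math

def amdf(samples, lag):
    """Mean absolute difference between the signal and itself delayed by `lag`."""
    n_terms = len(samples) - lag
    return sum(abs(samples[n] - samples[n + lag]) for n in range(n_terms)) / n_terms

def min_amdf(samples, lag_min, lag_max):
    """Return (minimum AMDF value, lag at which it occurs) over the lag range."""
    return min((amdf(samples, k), k) for k in range(lag_min, lag_max + 1))

# Hypothetical input: a 200 Hz tone sampled at 8 kHz, so one period is 40 samples.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
value, lag = min_amdf(tone, 20, 100)
```

For a periodic input the AMDF dips toward zero at lags that are multiples of the period, which is why its minimum over a pitch range is informative about the signal type.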
-
Patent number: RE38889Abstract: A pitch period extracting apparatus includes a microcomputer which determines a sampling frequency for an A/D converter, and a range of delay times for calculating autocorrelative values on the basis of the sampling frequency. For example, the delay times are set within a range of 20 samples ≤ k ≤ 100 samples in a case of 8 kHz, and a range of 15 samples ≤ k ≤ 75 samples in a case of 6 kHz. The microcomputer calculates the autocorrelative values of speech signal data stored in a buffer memory, and outputs a delay time at which a maximum autocorrelative value is obtainable as a pitch period of an inputted speech signal.Type: GrantFiled: October 6, 2000Date of Patent: November 22, 2005Assignee: Sanyo Electric Co., Ltd.Inventor: Takeo Inoue
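The pitch extraction described in the abstract above can be sketched as follows: choose the delay range from the sampling frequency, compute the autocorrelation at each delay, and report the delay with the largest value. Only the two ranges quoted in the abstract (20 to 100 samples at 8 kHz, 15 to 75 samples at 6 kHz) come from the source; the function names, the unnormalized autocorrelation, and the test tone are assumptions for illustration.

```python
import math

# Delay ranges quoted in the abstract; other sampling rates are not covered there.
LAG_RANGES = {8000: (20, 100), 6000: (15, 75)}

def autocorrelation(samples, lag):
    """Unnormalized autocorrelation of the buffered samples at the given delay."""
    return sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))

def pitch_period(samples, sample_rate_hz):
    """Delay (in samples) at which the autocorrelation is largest."""
    k_min, k_max = LAG_RANGES[sample_rate_hz]
    return max(range(k_min, k_max + 1), key=lambda k: autocorrelation(samples, k))

# Hypothetical input: a 200 Hz tone sampled at 8 kHz, so one period is 40 samples.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
period = pitch_period(tone, fs)
```

Both quoted ranges correspond to delays of 2.5 ms to 12.5 ms, that is, pitch frequencies of 80 Hz to 400 Hz, so scaling the range with the sampling frequency keeps the searched pitch band fixed.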