Correlation Function Patents (Class 704/216)
  • Patent number: 8032361
    Abstract: An audio processing apparatus for processing two sampled audio signals to detect a temporal position of one of the audio signals with respect to the other. The apparatus detects audio power characteristics of each signal in respect of successive continuous temporal portions of each of the two signals, the portions having identical lengths and each portion including at least two audio samples, and correlates the detected audio power characteristics in respect of the two audio signals to establish a most likely temporal offset between the two audio signals.
    Type: Grant
    Filed: October 27, 2006
    Date of Patent: October 4, 2011
    Assignee: Sony United Kingdom Limited
    Inventors: William Edmund Cranstoun Kentish, Nicolas John Haynes
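    Illustrative sketch (not taken from the patent text): the abstract describes measuring audio power over equal-length blocks of each signal and correlating the two power sequences to find the most likely offset. The function names, the mean removal, and the normalisation by overlap length below are assumptions made for this example.

```python
import math

def block_power(samples, block_len):
    """Mean power of each consecutive, equal-length block of samples."""
    n_blocks = len(samples) // block_len
    return [
        sum(s * s for s in samples[i * block_len:(i + 1) * block_len]) / block_len
        for i in range(n_blocks)
    ]

def best_offset(power_a, power_b, max_shift):
    """Shift (in blocks) of b relative to a that maximises the mean product
    of the mean-removed power sequences over their overlapping part."""
    mean_a = sum(power_a) / len(power_a)
    mean_b = sum(power_b) / len(power_b)
    a = [p - mean_a for p in power_a]
    b = [p - mean_b for p in power_b]
    best_shift, best_score = 0, -math.inf
    for shift in range(-max_shift, max_shift + 1):
        pairs = [(a[i], b[i + shift]) for i in range(len(a)) if 0 <= i + shift < len(b)]
        if not pairs:
            continue
        score = sum(x * y for x, y in pairs) / len(pairs)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift
```

    A positive result means the matching content appears later in the second signal; multiplying by the block length converts the result to samples.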
  • Patent number: 8015000
    Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: September 6, 2011
    Assignee: Broadcom Corporation
    Inventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
  • Patent number: 8010350
    Abstract: A method and system for refining a pitch period estimate based on a coarse pitch, useful for performing frame loss concealment in an audio decoder as well as for other applications. A normalized correlation at the coarse pitch lag is computed and used as the current best candidate. The normalized correlation is then evaluated at the midpoint of the refinement pitch range on either side of the current best candidate. If the normalized correlation at either midpoint is greater than that at the current best lag, the midpoint with the maximum correlation is selected as the current best lag. After each iteration, the refinement range is decreased by a factor of two and centered on the current best lag. This bisectional search continues until the pitch has been refined to an acceptable tolerance or until the refinement range has been exhausted. During each step of the bisectional pitch refinement, the signal is decimated to reduce the complexity of computing the normalized correlation.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: August 30, 2011
    Assignee: Broadcom Corporation
    Inventor: Robert W. Zopf
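    Illustrative sketch (not taken from the patent text): the bisectional search in the abstract can be approximated as below. The function names and the integer step schedule are assumptions, and the per-step decimation mentioned in the abstract is omitted for brevity.

```python
import math

def normalized_correlation(x, lag):
    """Normalised correlation between x[n] and x[n - lag] over the valid range."""
    num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
    e_now = sum(x[n] ** 2 for n in range(lag, len(x)))
    e_past = sum(x[n - lag] ** 2 for n in range(lag, len(x)))
    denom = math.sqrt(e_now * e_past)
    return num / denom if denom > 0 else 0.0

def refine_pitch_lag(x, coarse_lag, half_range):
    """Start at the coarse lag, test the midpoints on either side,
    keep the better one, then halve the search step and repeat."""
    best_lag = coarse_lag
    best_corr = normalized_correlation(x, best_lag)
    step = max(half_range // 2, 1)
    while True:
        candidates = [lag for lag in (best_lag - step, best_lag + step)
                      if 1 <= lag < len(x)]
        scored = [(normalized_correlation(x, lag), lag) for lag in candidates]
        if scored:
            corr, lag = max(scored)
            if corr > best_corr:
                best_corr, best_lag = corr, lag
        if step == 1:  # refined to one-sample tolerance
            break
        step //= 2
    return best_lag
```

    With a half range of 8 samples the search evaluates steps of 4, 2, and 1, so it computes at most seven correlations instead of the seventeen an exhaustive scan of the same range would need.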
  • Patent number: 8009966
    Abstract: Digital audio and video files are created corresponding to selected scenes from a creative production and are provided with a processing system that enables dialog to be selected from a scene and replaced by a user's dialog which is automatically synchronized with the original dialog so as to be in synchronism with lip movements displayed by the accompanying video display. The processing further includes a graphical user interface that presents the user with the video, the text of the dialog, and cues for rehearsal and recording of replacement dialog by the user. Replay of the user's dialog is accompanied by the video and part of the original audio except that the original dialog corresponding to the user's dialog is muted so that the user's dialog is heard as a replacement. Singing or other sounds associated with visible action may also be replaced by the same processes.
    Type: Grant
    Filed: October 28, 2003
    Date of Patent: August 30, 2011
    Assignee: Synchro Arts Limited
    Inventors: Phillip Jeffrey Bloom, William John Ellwood
  • Patent number: 7996216
    Abstract: In one embodiment, at least first and second channels in a frame of the audio signal are independently subdivided into blocks if the first and second channels are not correlated with each other. At least two of the blocks have different block lengths. Furthermore, the first and second channels are correspondingly subdivided into blocks such that the lengths of the blocks into which the second channel is subdivided correspond to the lengths of the blocks into which the first channel is subdivided if the first and second channels are correlated with each other. At least two of the blocks have different block lengths.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: August 9, 2011
    Assignee: LG Electronics Inc.
    Inventor: Tilman Liebchen
  • Patent number: 7991045
    Abstract: A device and a method for testing signal-receiving sensitivity of an electronic subassembly are provided. The device includes a control board and a computer. The control board is connected to the electronic subassembly. The computer is connected to the control board and also connected to the electronic subassembly. Signals sent by the computer are compared with signals received by the computer in order to adjust predetermined parameters associated with the signal-receiving sensitivity.
    Type: Grant
    Filed: May 29, 2006
    Date of Patent: August 2, 2011
    Assignee: Hon Hai Precision Industry Co., Ltd.
    Inventor: Shou-Kuo Hsu
  • Publication number: 20110150227
    Abstract: Provided is a signal processing method which calculates a correlation coefficient indicating the degree of relation in a stereo signal and extracts a speech signal from the stereo signal by using the correlation coefficient and the stereo signal.
    Type: Application
    Filed: October 28, 2010
    Publication date: June 23, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Sun-min KIM
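    Illustrative sketch (not taken from the application text): one common way to measure the degree of relation between stereo channels is a zero-lag normalised correlation coefficient. The extraction rule in `extract_center`, which weights the mid signal by the coefficient, is purely a hypothetical stand-in; the abstract does not say how the coefficient is applied.

```python
import math

def correlation_coefficient(left, right):
    """Zero-lag normalised correlation between two equal-length channels."""
    num = sum(l * r for l, r in zip(left, right))
    denom = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / denom if denom > 0 else 0.0

def extract_center(left, right):
    """Hypothetical rule: scale the mid signal (L + R) / 2 by the
    coefficient, clipped at zero so anti-correlated content is dropped."""
    weight = max(correlation_coefficient(left, right), 0.0)
    return [weight * 0.5 * (l + r) for l, r in zip(left, right)]
```

    Content that is identical in both channels, as centre-panned speech often is, yields a coefficient near 1, while uncorrelated ambience yields a value near 0.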
  • Patent number: 7962332
    Abstract: In one embodiment, the method includes receiving an audio data frame having at least first and second channels. The first and second channels are independently subdivided into blocks if the first and second channels are not correlated with each other. The first and second channels are decoded, and the subdivided blocks of the first and second channels are not interleaved if the first and second channels are independently subdivided.
    Type: Grant
    Filed: September 18, 2008
    Date of Patent: June 14, 2011
    Assignee: LG Electronics Inc.
    Inventor: Tilman Liebchen
  • Patent number: 7949521
    Abstract: A fixed codebook searching apparatus which slightly suppresses an increase in the operation amount, even if the filter applied to the excitation pulse has the characteristic that it cannot be represented by a lower triangular matrix and realizes a quasi-optimal fixed codebook search. This fixed codebook searching apparatus is provided with an algebraic codebook that generates a pulse excitation vector; a convolution operation section that convolutes an impulse response of auditory weighted synthesis filter into an impulse response vector that has a value at negative times, to generate a second impulse response vector that has a value at second negative times; a matrix generating section that generates a Toeplitz-type convolution matrix by means of the second impulse response vector; and a convolution operation section that convolutes the matrix generated by matrix generating section into the pulse excitation vector generated by algebraic codebook.
    Type: Grant
    Filed: February 25, 2009
    Date of Patent: May 24, 2011
    Assignee: Panasonic Corporation
    Inventors: Hiroyuki Ehara, Koji Yoshida
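    Illustrative sketch (not taken from the patent text): the abstract refers to a Toeplitz-type convolution matrix built from an impulse response that has values at negative times, which is why the matrix is not lower triangular. The helper names and the tap layout below are assumptions for this example.

```python
def toeplitz_convolution_matrix(taps, n_negative, size):
    """Build a size x size Toeplitz matrix whose entry (i, j) is the
    impulse-response tap at time i - j. `taps` lists the response from
    time -n_negative upward; times outside the list count as zero."""
    def tap(t):
        index = t + n_negative
        return taps[index] if 0 <= index < len(taps) else 0.0
    return [[tap(i - j) for j in range(size)] for i in range(size)]

def apply_matrix(matrix, vector):
    """Multiply a matrix by a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]
```

    For taps `[0.5, 1.0, 0.25]` with one negative-time tap, a single unit pulse at position 1 produces `[0.5, 1.0, 0.25, 0.0]`: the output at position 0 is non-zero because the response starts before the pulse.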
  • Patent number: 7933366
    Abstract: A channel estimation method and system using linear correlation based interference cancellation combined with decision-feedback-equalization (LCIC-DFE) are provided. The channel estimation method includes generating a first correlation sequence by calculating a linear correlation between a baseband sampled complex signal and a locally stored pseudo-noise signal, obtaining a second correlation sequence by iteratively removing inter-path interference from the first correlation sequence, and generating a first channel impulse response (CIR) sequence based on the second correlation sequence. The method further includes obtaining a third correlation sequence by removing random-data interference from the second correlation sequence based on the first CIR sequence and a feedback signal, and generating a second CIR sequence based on the third correlation sequence.
    Type: Grant
    Filed: May 4, 2007
    Date of Patent: April 26, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Guanghui Liu
  • Patent number: 7930184
    Abstract: A lossless audio codec encodes/decodes a lossless variable bit rate (VBR) bitstream with random access point (RAP) capability to initiate lossless decoding at a specified segment within a frame and/or multiple prediction parameter set (MPPS) capability partitioned to mitigate transient effects. This is accomplished with an adaptive segmentation technique that fixes segment start points based on constraints imposed by the existence of a desired RAP and/or detected transient in the frame and selects an optimum segment duration in each frame to reduce encoded frame payload subject to an encoded segment payload constraint. In general, the boundary constraints specify that a desired RAP or detected transient must lie within a certain number of analysis blocks of a segment start point.
    Type: Grant
    Filed: January 30, 2008
    Date of Patent: April 19, 2011
    Assignee: DTS, Inc.
    Inventor: Zoran Fejzo
  • Patent number: 7930172
    Abstract: Portions from time-domain speech segments are extracted. Feature vectors that represent the portions in a vector space are created. The feature vectors incorporate phase information of the portions. A distance between the feature vectors in the vector space is determined. In one aspect, the feature vectors are created by constructing a matrix W from the portions and decomposing the matrix W. In one aspect, decomposing the matrix W comprises extracting global boundary-centric features from the portions. In one aspect, the portions include at least one pitch period. In another aspect, the portions include centered pitch periods.
    Type: Grant
    Filed: December 8, 2009
    Date of Patent: April 19, 2011
    Assignee: Apple Inc.
    Inventor: Jerome R. Bellegarda
  • Patent number: 7930175
    Abstract: A noise reduction system includes a microphone configured to detect an acoustic signal. A first digitizer converts an output of the microphone into a discrete output signal. An acoustic sensor detects structure-borne noise, and a second digitizer converts an output of the acoustic sensor into a discrete acoustic noise reference signal. A noise compensation circuit processes the discrete output signal based on the discrete acoustic noise reference signal.
    Type: Grant
    Filed: June 25, 2007
    Date of Patent: April 19, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Tim Haulick, Martin Roessler, Klaus Alois Haindl
  • Patent number: 7894654
    Abstract: A voice data processing apparatus which converts voice data to voice playback data by an OLA method to correspond to a set magnification of playback velocity, including a voice data block setting device which partitions the voice data to set a plurality of voice data blocks, a segment setting device which sets voice data segments to the voice data to correspond to respective voice data blocks, a segment adjuster which adjusts positions and lengths on a time base, of the voice data segments set by the segment setting device, and a voice playback data generator which combines the respective voice data segments adjusted by the segment adjuster so as to overlap each other along the time base thereby generating the voice playback data.
    Type: Grant
    Filed: July 7, 2009
    Date of Patent: February 22, 2011
    Assignee: GE Medical Systems Global Technology Company, LLC
    Inventors: Shin Hirota, Yoshihiro Oda, Tetsu Miyagawa
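    Illustrative sketch (not taken from the patent text): a plain overlap-add (OLA) time-scaling loop reads windowed segments from the input at one spacing and adds them to the output at another. The function name, the Hann window, and the 50% output overlap are assumptions; the segment position and length adjustment described in the abstract is not modelled here.

```python
import math

def ola_time_scale(samples, speed, seg_len=256):
    """Overlap-add time scaling: speed > 1 shortens playback,
    speed < 1 lengthens it."""
    syn_hop = seg_len // 2                    # output spacing between segments
    ana_hop = max(1, round(syn_hop * speed))  # input spacing between segments
    # Periodic Hann window: copies spaced seg_len / 2 apart sum to one.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len)
              for n in range(seg_len)]
    if len(samples) < seg_len:
        return list(samples)
    n_seg = (len(samples) - seg_len) // ana_hop + 1
    out_len = (n_seg - 1) * syn_hop + seg_len
    acc = [0.0] * out_len    # weighted sum of overlapping segments
    norm = [0.0] * out_len   # sum of window weights, for edge normalisation
    for k in range(n_seg):
        src, dst = k * ana_hop, k * syn_hop
        for n in range(seg_len):
            acc[dst + n] += window[n] * samples[src + n]
            norm[dst + n] += window[n]
    return [a / w if w > 1e-9 else 0.0 for a, w in zip(acc, norm)]
```

    Plain OLA of this kind tends to produce audible phase discontinuities on pitched material, which is one reason a practical system would adjust segment positions before overlapping them.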
  • Patent number: 7848921
    Abstract: An audio encoding apparatus capable of improving a frame cancellation error tolerance without increasing a number of bits of a fixed codebook in a CELP type audio encoding. A linear prediction analyzer analyzes an input digital speech signal and outputs linear predictive coefficients. A linear predictive coefficients quantizer quantizes the linear predictive coefficients. A low-frequency-band component encoder encodes a down-sampled linear-predictive residual signal by a pulse-code-modulation encoder and generates low-frequency-band component encoded information, while a high-frequency-band component encoder encodes an error signal between a linear-predictive residual signal and an up-sampled signal of a decoded down-sampled linear-predictive residual signal by a code-excited-linear-prediction encoder and generates high-frequency-band component encoded information.
    Type: Grant
    Filed: August 29, 2005
    Date of Patent: December 7, 2010
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Ehara
  • Patent number: 7809559
    Abstract: A method for removing periodic noise pulses from a continuous audio signal generated in a pressurized air delivery system includes the steps of: detecting, in a time-windowed segment of the continuous audio signal generated in the pressurized air delivery system, a plurality of the periodic noise pulses having a pulse period and being representable in the form of a plurality of signal components combined by convolution; deconvolving the plurality of signal components to generate a plurality of deconvolved signal components; and removing at least a portion of the periodic noise pulses from the time-windowed segment of the continuous audio signal using the deconvolved signal components.
    Type: Grant
    Filed: July 24, 2006
    Date of Patent: October 5, 2010
    Assignee: Motorola, Inc.
    Inventors: William M. Kushner, Sara M. Harton
  • Patent number: 7756715
    Abstract: Apparatus, method, and medium for processing an audio signal using a correlation between bands are provided. The apparatus includes an encoding unit encoding an input audio signal and a decoding unit decoding the encoded input audio signal.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: July 13, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Junghoe Kim, Dohyung Kim, Sihwa Lee
  • Publication number: 20100174534
    Abstract: A method of encoding speech, the method comprising: receiving a signal representative of speech to be encoded; at each of a plurality of intervals during the encoding, determining a pitch lag between portions of the signal having a degree of repetition; selecting for a set of said intervals a pitch lag vector from a pitch lag codebook of such vectors, each pitch lag vector comprising a set of offsets corresponding to the offset between the pitch lag determined for each said interval and an average pitch lag for said set of intervals, and transmitting an indication of the selected vector and said average over a transmission medium as part of the encoded signal representative of said speech.
    Type: Application
    Filed: June 5, 2009
    Publication date: July 8, 2010
    Inventor: Koen Bernard Vos
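    Illustrative sketch (not taken from the application text): the abstract describes sending an average pitch lag plus the index of a codebook vector of per-interval offsets from that average. The function names, the rounding of the average, the squared-error selection rule, and the codebook contents are assumptions for this example.

```python
def encode_pitch_lags(lags, codebook):
    """Return (average lag, index of the codebook vector whose offsets
    best match each interval's deviation from the average)."""
    average = round(sum(lags) / len(lags))
    offsets = [lag - average for lag in lags]

    def squared_error(index):
        return sum((o - c) ** 2 for o, c in zip(offsets, codebook[index]))

    return average, min(range(len(codebook)), key=squared_error)

def decode_pitch_lags(average, index, codebook):
    """Rebuild the per-interval lags from the average and the chosen vector."""
    return [average + offset for offset in codebook[index]]
```

    Only the average and one index are transmitted per set of intervals, so the decoded lags are an approximation whenever the exact offset pattern is not in the codebook.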
  • Patent number: 7752037
    Abstract: A method of determining a pitch period of an audio signal using a correlation-based signal derived from the audio signal. The correlation-based signal includes known peaks each corresponding to a respective one of known time lags. The known peaks include a global maximum peak. The method comprises: (a) determining if a candidate peak among the local peaks exceeds a peak threshold; (b) determining if a candidate time lag corresponding to the candidate peak is within a predetermined range of at least one integer sub-multiple of the time lag corresponding to the global maximum peak; and (c) setting the pitch period equal to the candidate time lag when the determinations of both steps (a) and (b) are true.
    Type: Grant
    Filed: October 31, 2002
    Date of Patent: July 6, 2010
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
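    Illustrative sketch (not taken from the patent text): steps (a) to (c) in the abstract can be read as a guard against picking a multiple of the true pitch period. The function name, the absolute threshold, the divisor limit, and the fallback to the global maximum lag are assumptions for this example.

```python
def pick_pitch_period(peaks, peak_threshold, lag_tolerance, max_divisor=4):
    """peaks is a list of (time_lag, peak_value) pairs.
    Return the shortest lag that (a) has a peak above the threshold and
    (b) lies near an integer sub-multiple of the global-maximum lag;
    otherwise fall back to the global-maximum lag (an assumption)."""
    global_lag, _ = max(peaks, key=lambda p: p[1])
    for lag, value in sorted(peaks):
        if lag >= global_lag:
            break
        if value <= peak_threshold:              # step (a) fails
            continue
        for divisor in range(2, max_divisor + 1):
            if abs(lag - global_lag / divisor) <= lag_tolerance:  # step (b)
                return lag                       # step (c)
    return global_lag
```

    For peaks at lags 40, 57, 81, and 120 with the largest peak at 81, a strong peak at 40 lies within half a sample of 81 / 2 and is returned as the pitch period.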
  • Patent number: 7752038
    Abstract: Autocorrelation values are determined as a basis for an estimation of a pitch lag in a segment of an audio signal. A first considered delay range for the autocorrelation computations is divided into a first set of sections, and first autocorrelation values are determined for delays in a plurality of sections of this first set of sections. A second considered delay range for the autocorrelation computations is divided into a second set of sections such that sections of the first set and sections of the second set are overlapping. Second autocorrelation values are determined for delays in a plurality of sections of this second set of sections.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: July 6, 2010
    Assignee: Nokia Corporation
    Inventors: Lasse Laaksonen, Anssi Ramo, Adriana Vasilache
  • Patent number: 7734464
    Abstract: An autocorrelation trigger comprising a correlator detector for producing a correlation coefficient by correlating a signal with a time-delayed version of the signal and generating a trigger in real-time when the correlation coefficient corresponds to a predetermined condition is provided. A method of producing a trigger based upon an autocorrelation measurement is also provided. The autocorrelation trigger may be used to produce a trigger based upon the degree to which the autocorrelation relates to an autocorrelation model, such as the degree of randomness in a signal.
    Type: Grant
    Filed: May 20, 2005
    Date of Patent: June 8, 2010
    Assignee: Tektronix, Inc.
    Inventor: Kyle L. Bernard
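    Illustrative sketch (not taken from the patent text): a software analogue of the trigger slides a window over the signal, correlates each window with a delayed copy of itself, and fires when the coefficient satisfies a caller-supplied condition. The function names and the use of a Pearson coefficient are assumptions; a real instrument would do this in hardware in real time.

```python
import math

def autocorr_coefficient(window, delay):
    """Pearson correlation between a window and its copy delayed by
    `delay` samples (requires 0 < delay < len(window))."""
    a = window[delay:]
    b = window[:len(window) - delay]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0

def find_trigger(signal, window_len, delay, condition):
    """Return the start index of the first window whose coefficient
    satisfies `condition`, or None if no window does."""
    for start in range(len(signal) - window_len + 1):
        coeff = autocorr_coefficient(signal[start:start + window_len], delay)
        if condition(coeff):
            return start
    return None
```

    Passing `lambda c: abs(c) < 0.1` as the condition, for example, would fire on windows that look close to random at the chosen delay.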
  • Patent number: 7720230
    Abstract: At an audio encoder, cue codes are generated for one or more audio channels, wherein an envelope cue code is generated by characterizing a temporal envelope in an audio channel. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels, where C>E≥1. Received cue codes include an envelope cue code corresponding to a characterized temporal envelope of an audio channel corresponding to the transmitted channel(s). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein the envelope cue code is applied to an upmixed channel or a synthesized signal to adjust a temporal envelope of the synthesized signal based on the characterized temporal envelope such that the adjusted temporal envelope substantially matches the characterized temporal envelope.
    Type: Grant
    Filed: December 7, 2004
    Date of Patent: May 18, 2010
    Assignees: Agere Systems, Inc., Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Eric Allamanche, Sascha Disch, Christof Faller, Juergen Herre
  • Patent number: 7711553
    Abstract: A method and apparatus performing blind source separation using frequency-domain normalized multichannel blind deconvolution. Multichannel mixed signals are frames of N samples including r consecutive blocks of M samples. The frames are separated using separating filters in frequency domain in an overlap-save manner by discrete Fourier transform (DFT). The separated signals are then converted back into time domain using inverse DFT applied to a nonlinear function. Cross-power spectra between separated signals and nonlinear-transformed signals are computed and normalized by power spectra of both separated signals and nonlinear-transformed signals to have flat spectra. Time domain constraint is then applied to preserve first L cross-correlations. These alias-free normalized cross-power spectra are further constrained by nonholonomic constraints. Then, natural gradient is computed by convolving alias-free normalized cross-power spectra with separating filters.
    Type: Grant
    Filed: February 26, 2005
    Date of Patent: May 4, 2010
    Inventor: Seung Hyon Nam
  • Patent number: 7653537
    Abstract: A system and method is provided for determining whether a data frame of a coded speech signal corresponds to voice or to noise. In one embodiment, a voice activity detector determines a cross-correlation of data. If the cross-correlation is lower than a predetermined cross-correlation value, then the data frame corresponds to noise. If not, then the voice activity detector determines a periodicity of the cross-correlation and a variance of the periodicity. If the variance is less than a predetermined variance value, then the data frame corresponds to voice. In another embodiment, a method determines energy of the data frame and an average energy of the coded speech signal. If the data frame is one of a predetermined number of initial data frames, then a comparison between the average energy to the energy of the data frame is used to determine whether the data frame is noise or voice.
    Type: Grant
    Filed: September 28, 2004
    Date of Patent: January 26, 2010
    Assignee: STMicroelectronics Asia Pacific Pte. Ltd.
    Inventors: Kabi Prakash Padhi, Sapna George
  • Publication number: 20100008556
    Abstract: A voice data processing apparatus which converts voice data to voice playback data by an OLA method to correspond to a set magnification of playback velocity, including a voice data block setting device which partitions the voice data to set a plurality of voice data blocks, a segment setting device which sets voice data segments to the voice data to correspond to respective voice data blocks, a segment adjuster which adjusts positions and lengths on a time base, of the voice data segments set by the segment setting device, and a voice playback data generator which combines the respective voice data segments adjusted by the segment adjuster so as to overlap each other along the time base thereby generating the voice playback data.
    Type: Application
    Filed: July 7, 2009
    Publication date: January 14, 2010
    Inventors: Shin Hirota, Yoshihiro Oda, Tetsu Miyagawa
  • Patent number: 7643990
    Abstract: Portions from time-domain speech segments are extracted. Feature vectors that represent the portions in a vector space are created. The feature vectors incorporate phase information of the portions. A distance between the feature vectors in the vector space is determined. In one aspect, the feature vectors are created by constructing a matrix W from the portions and decomposing the matrix W. In one aspect, decomposing the matrix W comprises extracting global boundary-centric features from the portions. In one aspect, the portions include at least one pitch period. In another aspect, the portions include centered pitch periods.
    Type: Grant
    Filed: October 23, 2003
    Date of Patent: January 5, 2010
    Assignee: Apple Inc.
    Inventor: Jerome R. Bellegarda
  • Patent number: 7636659
    Abstract: In accordance with the present invention, computer implemented methods and systems are provided for representing and modeling the temporal structure of audio signals. In response to receiving a signal, a time-to-frequency domain transformation on at least a portion of the received signal to generate a frequency domain representation is performed. The time-to-frequency domain transformation converts the signal from a time domain representation to the frequency domain representation. A frequency domain linear prediction (FDLP) is performed on the frequency domain representation to estimate a temporal envelope of the frequency domain representation. Based on the temporal envelope, one or more speech features are generated.
    Type: Grant
    Filed: March 25, 2005
    Date of Patent: December 22, 2009
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Marios Athineos, Daniel P. W. Ellis
  • Patent number: 7630889
    Abstract: A code conversion method for converting first code string data conforming to a first speech coding scheme into second code string data conforming to a second speech coding scheme has the steps of decoding the first code string data to generate a first decoded speech, correcting the signal characteristics of the first decoded speech to generate a second decoded speech, and encoding the second decoded speech in accordance with the second speech coding scheme to generate the second code string data.
    Type: Grant
    Filed: March 31, 2004
    Date of Patent: December 8, 2009
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
  • Patent number: 7627482
    Abstract: A sound signal encoder for high efficiency encoding of sound signals from a plurality of channels is provided which includes a to-be-correlated object setter (52), to-be-correlated object selector (56) and a variable-length encoder (58). The to-be-correlated object setter (52) sets, on the basis of left-channel frequency information held in a left-channel frequency information holder (50) and right-channel frequency information held in a right-channel frequency information holder (51), index [i] indicating which ones of sine waves on the left channel are to be correlated with, namely, are to be subtracted from, sine waves on the right channel. The to-be-correlated object selector (56) selects a default value read from a storage unit (55) or index [i]-th amplitude information read from a left-channel amplitude information holder (53) as an object to be subtracted from the i-th amplitude information on the right channel according to the index [i].
    Type: Grant
    Filed: December 5, 2007
    Date of Patent: December 1, 2009
    Assignee: Sony Corporation
    Inventors: Minoru Tsuji, Shiro Suzuki, Keisuke Toyama
  • Patent number: 7590525
    Abstract: A method and system are provided for synthesizing a number of corrupted frames output from a decoder including one or more predictive filters. The corrupted frames are representative of one segment of a decoded signal (sq(n)) output from the decoder. The method comprises determining a first preliminary time lag (ppfe1) based upon examining a predetermined number (K) of samples of another segment of the decoded signal and determining a scaling factor (ptfe) associated with the examined number (K) of samples when the first preliminary time lag (ppfe1) is determined. The method also comprises extrapolating one or more replacement frames based upon the first preliminary time lag (ppfe1) and the scaling factor (ptfe).
    Type: Grant
    Filed: August 19, 2002
    Date of Patent: September 15, 2009
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
  • Publication number: 20090132243
    Abstract: A plurality of pairs of segments to be weighted/added are selected non-linearly with respect to a time axis of audio data. A speed conversion is achieved by performing the weighting/addition on the selected pairs of segments. The non-linear selection is performed by (a) obtaining all possible pairs of segments constituting the audio data, (b) calculating a degree of similarity pertaining to each possible pair, (c) ranking the all possible pairs of segments according to the degrees of similarity, and (d) overlapping at least one of the all possible pairs of segments that holds the highest degree of similarity.
    Type: Application
    Filed: January 23, 2007
    Publication date: May 21, 2009
    Inventor: Ryoji Suzuki
  • Publication number: 20090132242
    Abstract: A portable audio recording and playback system is provided. The portable audio recording and playback system can synchronously display the corresponding lyrics or character contents, and it also can selectively record user's voice or the mixed signal of the user's voice and a background music including music and a human voice of singer during playback. Or the volume of the human voice in the background music can be increased or decreased corresponding to that of the music and be outputted or repeatedly played with changing speech speed without the tone changed for achieving the effects of on-the-go vocal and/or music accompaniment, voice recording, and language learning with changing speech speed without the tone changed.
    Type: Application
    Filed: November 19, 2007
    Publication date: May 21, 2009
    Applicant: COOL-IDEA TECHNOLOGY CORP.
    Inventors: Wei-Sheng WANG, Cheng-Wen TSAI, Szu-Hsuan WANG, Chun-Chang CHANG
  • Patent number: 7536299
    Abstract: Transmitters and receivers in multiple description coding systems use correlating and decorrelating transforms to generate and process multiple descriptions of elements of an input signal. The multiple descriptions include groups of correlating transform coefficients that permit recovery of an inexact facsimile of the signal if some of the correlating transform coefficients are lost or corrupted during transmission. Noiseless implementations of the correlating and decorrelating transforms are described that allow the signal elements to be quantized with different quantizing resolutions. Implementations using the Fast Hadamard Transform are described that reduce the resources needed to perform the transforms.
    Type: Grant
    Filed: December 19, 2005
    Date of Patent: May 19, 2009
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Corey I. Cheng, Claus Bauer
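    Illustrative sketch (not taken from the patent text): the abstract names the Fast Hadamard Transform as a low-cost building block. The routine below is the standard unnormalised butterfly version for power-of-two lengths; how its outputs would be grouped into descriptions is not shown and is left unspecified here.

```python
def fast_hadamard_transform(values):
    """Unnormalised fast Walsh-Hadamard transform.
    The input length must be a power of two; integer inputs give
    integer outputs, so no rounding error is introduced."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b  # butterfly
        half *= 2
    return data
```

    Applying the transform twice returns the input scaled by its length, so dividing by `len(values)` inverts it; the whole computation uses only additions and subtractions.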
  • Patent number: 7529661
    Abstract: A method of attempting to determine a pitch period of an audio signal using a correlation-based signal derived from the audio signal. The correlation-based signal has known peaks, having been quadratically interpolated and filtered with coefficients that are a function of the interpolation ratio, each corresponding to a respective one of known time lags. The method comprises: identifying a time lag among the time lags; determining if there exists another time lag (i) within a time lag range of a respective one of one or more integer multiples of the identified time lag, and (ii) corresponding to a peak exceeding a peak threshold; and if the determination of step (a) passes, then returning the identified time lag as a time lag indicative of the pitch period.
    Type: Grant
    Filed: October 31, 2002
    Date of Patent: May 5, 2009
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
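    Illustrative sketch (not taken from the patent text): the check in the abstract accepts an identified time lag when a sufficiently strong peak exists near one of its integer multiples. The function name, the set of multiples, and the return of `None` on failure are assumptions; the interpolation and filtering of the peaks are not modelled.

```python
def confirm_by_multiples(peaks, candidate_lag, peak_threshold, lag_range,
                         multiples=(2, 3)):
    """peaks is a list of (time_lag, peak_value) pairs.
    Return candidate_lag if some peak above the threshold lies within
    lag_range of an integer multiple of it, otherwise None."""
    for multiple in multiples:
        target = multiple * candidate_lag
        for lag, value in peaks:
            if abs(lag - target) <= lag_range and value > peak_threshold:
                return candidate_lag
    return None
```

    A true pitch period normally repeats at twice and three times its lag, so a candidate with no strong peak near any multiple is more likely to be spurious.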
  • Publication number: 20090089053
    Abstract: Voice activity detection using multiple microphones can be based on a relationship between an energy at each of a speech reference microphone and a noise reference microphone. The energy output from each of the speech reference microphone and the noise reference microphone can be determined. A speech to noise energy ratio can be determined and compared to a predetermined voice activity threshold. In another embodiment, the absolute values of the autocorrelations of the speech and noise reference signals are determined and a ratio based on the autocorrelation values is determined. Ratios that exceed the predetermined threshold can indicate the presence of a voice signal. The speech and noise energies or autocorrelations can be determined using a weighted average or over a discrete frame size.
    Type: Application
    Filed: September 28, 2007
    Publication date: April 2, 2009
    Applicant: QUALCOMM INCORPORATED
    Inventors: Song Wang, Samir Kumar Gupta, Eddie L. T. Choy
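    Illustrative sketch (not taken from the application text): the energy-ratio embodiment in the abstract compares frame energy at a speech reference microphone with frame energy at a noise reference microphone. The function names, the per-frame energy sum, and the small floor that avoids division by zero are assumptions for this example.

```python
def frame_energy(frame):
    """Sum of squared samples over one discrete frame."""
    return sum(s * s for s in frame)

def voice_active(speech_frame, noise_frame, threshold, floor=1e-12):
    """Declare voice activity when the speech-to-noise energy ratio
    exceeds the threshold."""
    ratio = frame_energy(speech_frame) / max(frame_energy(noise_frame), floor)
    return ratio > threshold
```

    A talker close to the speech reference microphone raises its energy far more than the noise reference energy, while diffuse background noise raises both by similar amounts and leaves the ratio near 1.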
  • Patent number: 7509255
    Abstract: An apparatus for processing a speech signal includes a receiver, a speech signal decoder, a speech rate conversion information detector, and a speech rate converting processor. The receiver receives a multiplexed signal of information concerning controls and programs, including speech packets, through a transmission line. The decoder decodes the speech signal of packets out of the received signals. The detector detects speech rate conversion execution information in the received signals. The processor subjects the decoded speech signal to a speech rate conversion process if the speech rate conversion execution information indicates that the speech signal has not been subjected to the speech rate conversion process on the transmitting end, and does not subject the decoded speech signal to the speech rate conversion process if the speech rate conversion execution information indicates that the speech signal has been subjected to the speech rate conversion process on the transmitting end.
    Type: Grant
    Filed: September 28, 2004
    Date of Patent: March 24, 2009
    Assignee: Victor Company of Japan, Limited
    Inventors: Hiroyuki Takeishi, Yutaka Ichinoi
  • Publication number: 20090037167
    Abstract: In one embodiment, the method includes receiving an audio data frame having at least first and second channels. The first and second channels have been independently subdivided into blocks if the first and second channels are not correlated with each other, and the first and second channels have been synchronously subdivided into blocks if the first and second channels are correlated with each other and difference coding is used. The embodiment further includes obtaining subdivision information from the audio data frame. The subdivision information includes first information and second information. The first information indicates whether the first and second channels are independently subdivided or synchronously subdivided, and the second information indicates how the subdividing is performed. The first and second channels are decoded based on the obtained subdivision information.
    Type: Application
    Filed: September 19, 2008
    Publication date: February 5, 2009
    Inventor: Tilman Liebchen
  • Patent number: 7483830
    Abstract: A speech decoder comprises a decoder (103) for converting a linear prediction encoded speech signal into a first sample stream having a first sampling rate and representing a first frequency band. Additionally it comprises a vocoder (105) for converting an input signal into a second sample stream having a second sampling rate and representing a second frequency band, and combination means (107) for combining the first and second sample streams in processed form. It comprises also means (301) for generating a second linear prediction filter, to be used by the vocoder (105) on the second frequency band, on the basis of a first linear prediction filter used by the decoder (103) on the first frequency band. Extrapolation through an infinite impulse response filter is the preferable method of generating the second linear prediction filter.
    Type: Grant
    Filed: March 1, 2001
    Date of Patent: January 27, 2009
    Assignee: Nokia Corporation
    Inventors: Jani Rotola-Pukkila, Janne Vainio, Hannu Mikkola
  • Patent number: 7461002
    Abstract: A method for time aligning audio signals, wherein one signal has been derived from the other or both have been derived from another signal, comprises deriving reduced-information characterizations of the audio signals based on auditory scene analysis. The time offset of one characterization with respect to the other characterization is calculated and the temporal relationship of the audio signals with respect to each other is modified in response to the time offset such that the audio signals are coincident with each other. These principles may also be applied to a method for time aligning a video signal and an audio signal that will be subjected to differential time offsets.
    Type: Grant
    Filed: February 25, 2002
    Date of Patent: December 2, 2008
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Brett G. Crockett, Michael J. Smithers
  • Patent number: 7457745
    Abstract: A fast on-line automatic speaker/environment adaptation suitable for speech/speaker recognition system, method and computer program product are presented. The system comprises a computer system including a processor, a memory coupled with the processor, an input coupled with the processor for receiving acoustic signals, and an output coupled with the processor for outputting recognized words or sounds. The system includes a model-adaptation system and a recognition system, configured to accurately and efficiently recognize on-line distorted sounds or words spoken with different accents, in the presence of randomly changing environmental conditions. The model-adaptation system quickly adapts standard acoustic training models, available on audio recognition systems, by incorporating distortion parameters representative of the changing environmental conditions or the speaker's accent. By adapting models already available to the new environment, the system does not need separate adaptation training data.
    Type: Grant
    Filed: December 3, 2003
    Date of Patent: November 25, 2008
    Assignee: HRL Laboratories, LLC
    Inventors: Shubha Kadambe, Ron Burns, Markus Iseli
  • Publication number: 20080275697
    Abstract: An audio processing apparatus for processing two sampled audio signals to detect a temporal position of one of the audio signals with respect to the other. The apparatus detects audio power characteristics of each signal in respect of successive continuous temporal portions of each of the two signals, the portions having identical lengths and each portion including at least two audio samples, and correlates the detected audio power characteristics in respect of the two audio signals to establish a most likely temporal offset between the two audio signals.
    Type: Application
    Filed: October 27, 2006
    Publication date: November 6, 2008
    Applicant: SONY UNITED KINGDOM LIMITED
    Inventors: William Edmund Cranstoun Kentish, Nicolas John Haynes
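One way to picture the block-power correlation described in the abstract above is the sketch below. It is an illustration under stated assumptions only: the function names, the use of a mean-removed normalized correlation as the matching score, and the exhaustive search over `max_shift` blocks are choices made for this example and are not taken from the application.

```python
import math


def block_powers(samples, block_len):
    """Mean power of each consecutive, equal-length block of samples."""
    count = len(samples) // block_len
    return [
        sum(x * x for x in samples[i * block_len:(i + 1) * block_len]) / block_len
        for i in range(count)
    ]


def correlation(a, b):
    """Mean-removed normalized correlation; None if either input is constant."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


def most_likely_offset(ref, other, block_len, max_shift):
    """Offset in samples (a multiple of block_len) by which `other` lags `ref`.

    Tries every shift from -max_shift to +max_shift blocks and keeps the one
    whose overlapping block powers correlate best. Returns None if no shift
    could be scored.
    """
    p_ref = block_powers(ref, block_len)
    p_other = block_powers(other, block_len)
    best_shift, best_score = None, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        idx = [i for i in range(len(p_ref)) if 0 <= i + shift < len(p_other)]
        if len(idx) < 2:
            continue
        score = correlation([p_ref[i] for i in idx],
                            [p_other[i + shift] for i in idx])
        if score is not None and score > best_score:
            best_shift, best_score = shift, score
    return None if best_shift is None else best_shift * block_len
```

Working on per-block power rather than raw samples shortens the sequences being correlated by a factor of `block_len`, at the cost of resolving the offset only to the nearest block.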
  • Patent number: 7433358
    Abstract: An embodiment may include an apparatus comprising a dejitter buffer to receive packets containing audio data, a codec coupled with the dejitter buffer, the codec to receive coded audio frames from the dejitter buffer and decode them, and a concealed seconds meter coupled with the dejitter buffer, the concealed seconds meter to record concealment events by the decoder to provide an objective measure of media impairment. Another exemplary embodiment may be a method comprising receiving packets containing audio information at a dejitter buffer, decomposing the packets to coded audio frames, sending the coded audio frames to a decoder and decoding the frames, generating a concealment output stream if the decoder does not receive a valid frame from the dejitter buffer, and recording concealment events to provide an objective measure of media impairment.
    Type: Grant
    Filed: July 8, 2005
    Date of Patent: October 7, 2008
    Assignee: Cisco Technology, Inc.
    Inventors: Paul Volkaerts, Kevin Joseph Connor, James C. Frauenthal, Rajesh Kumar
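A concealed seconds meter of the kind mentioned in the abstract above could be approximated as in the following sketch. The definition used here (a second counts as concealed if at least one frame in it had to be concealed), the 20 ms frame duration, and the function name are assumptions for illustration; the abstract does not specify them.

```python
def concealed_seconds(frame_valid_flags, frame_ms=20):
    """Count the one-second intervals that contain at least one concealed frame.

    `frame_valid_flags` holds one boolean per frame slot: True if the decoder
    received a valid frame from the dejitter buffer, False if it had to
    generate concealment output. `frame_ms` is a hypothetical frame duration.
    """
    seconds_with_concealment = set()
    for index, valid in enumerate(frame_valid_flags):
        if not valid:
            seconds_with_concealment.add((index * frame_ms) // 1000)
    return len(seconds_with_concealment)
```

Counting affected seconds rather than individual lost frames yields a figure that is independent of the codec's frame size.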
  • Publication number: 20080235008
    Abstract: In a masking sound generation apparatus, a CPU analyzes a speech utterance speed of a received sound signal. Then, the CPU copies the received sound signal into a plurality of sound signals and performs the following processing on each of the sound signals. Namely, the CPU divides each of the sound signals into frames on the basis of a frame length determined on the basis of the speech utterance speed. A reverse process is performed on each of the frames to replace a waveform of the frame with a reverse waveform, and a windowing process is performed to achieve a smooth connection between the frames. Then, the CPU randomly rearranges the order of the frames and mixes the plurality of sound signals to generate a masking sound signal.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 25, 2008
    Applicant: Yamaha Corporation
    Inventors: Atsuko Ito, Yasushi Shimizu, Akira Miki, Masato Hata
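The frame-reverse, window, shuffle, and mix steps described in the abstract above can be sketched as follows. This is only an illustration: the frame length is passed in directly instead of being derived from an utterance-speed analysis, and the Hann-style window, the number of copies, the seeded random generator, and the function name are assumptions made for the example.

```python
import math
import random


def masking_sound(signal, frame_len, copies=2, seed=0):
    """Build a masking signal from time-reversed, windowed, shuffled frames.

    `frame_len` stands in for the length the abstract derives from the speech
    utterance speed; `copies` and `seed` are hypothetical parameters.
    """
    rng = random.Random(seed)
    n_frames = len(signal) // frame_len
    # Hann-style window so that adjacent frames taper smoothly at the joins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / frame_len)
              for i in range(frame_len)]
    mix = [0.0] * (n_frames * frame_len)
    for _ in range(copies):
        frames = []
        for k in range(n_frames):
            # Reverse the waveform inside the frame, then apply the window.
            reversed_frame = signal[k * frame_len:(k + 1) * frame_len][::-1]
            frames.append([w * x for w, x in zip(window, reversed_frame)])
        rng.shuffle(frames)  # randomly rearrange the frame order
        pos = 0
        for frame in frames:
            for x in frame:
                mix[pos] += x / copies  # equal-weight mix of all copies
                pos += 1
    return mix
```

Each copy gets its own random frame order, so the mixed result overlays several differently scrambled versions of the same speech.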
  • Patent number: 7412380
    Abstract: Modifying an audio signal comprising a plurality of channel signals is disclosed. At least selected ones of the channel signals are transformed into a time-frequency domain. The at least selected ones of the channel signals are compared in the time-frequency domain to identify corresponding portions of the channel signals that are not correlated or are only weakly correlated across channels. The identified corresponding portions of said channel signals are modified.
    Type: Grant
    Filed: December 17, 2003
    Date of Patent: August 12, 2008
    Assignee: Creative Technology Ltd.
    Inventors: Carlos Avendano, Michael Goodwin, Ramkumar Sridharan, Martin Wolters, Jean-Marc Jot
  • Patent number: 7412384
    Abstract: A digital signal processing method and learning method and devices therefor, and a program storage medium which are capable of further improving the waveform reproducibility of a digital signal. Self correlation coefficients are calculated by cutting parts out of the digital signal by multiple windows having different sizes, and the parts are classified based on the calculation results of the self correlation coefficients. Then, the digital signal is converted by the prediction method corresponding to the classified class, so that the conversion further suitable for the features of the digital signal can be conducted.
    Type: Grant
    Filed: July 31, 2001
    Date of Patent: August 12, 2008
    Assignee: Sony Corporation
    Inventors: Tetsujiro Kondo, Tsutomu Watanabe
  • Publication number: 20080154585
    Abstract: In a sound signal processing apparatus, a frame information generation section generates frame information of each frame of a sound signal. A storage stores the frame information generated by the frame information generation section. A first interval determination section determines a first utterance interval in the sound signal. A second interval determination section determines a second utterance interval based on the frame information of the first utterance interval stored in the storage such that the second utterance interval is made shorter than the first utterance interval and confined within the first utterance interval by trimming frames from either of a start point or an end point of the first utterance interval.
    Type: Application
    Filed: December 21, 2007
    Publication date: June 26, 2008
    Applicant: Yamaha Corporation
    Inventor: Yasuo Yoshioka
  • Patent number: 7349849
    Abstract: A speech recognition device with a frequency range with an upper frequency limit fmax is provided. The speech recognition device has more than two microphones with distances between the microphones, wherein the greatest common factor of the distances between the microphones is less than the speed of sound divided by fmax. More particularly, where the microphones are spaced a total distance, the number of the more than two microphones is less than one half the total distance times the upper frequency limit divided by the speed of sound.
    Type: Grant
    Filed: July 25, 2002
    Date of Patent: March 25, 2008
    Assignee: Apple, Inc.
    Inventors: Kim E. Silverman, Devang K. Naik
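The first spacing condition in the abstract above (greatest common factor of the inter-microphone distances less than the speed of sound divided by fmax) can be checked with a short sketch like the one below. Expressing the distances as whole millimetres so that an integer greatest common factor is well defined, the 343 m/s speed of sound, and the function name are assumptions for illustration.

```python
from functools import reduce
from math import gcd

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def spacing_satisfies_gcf_rule(spacings_mm, f_max_hz, c=SPEED_OF_SOUND_M_S):
    """Check that the greatest common factor of the spacings is below c / fmax.

    `spacings_mm` are the distances between microphones in whole millimetres
    (an assumption, so that an integer greatest common factor exists).
    """
    common_factor_m = reduce(gcd, spacings_mm) / 1000.0
    return common_factor_m < c / f_max_hz
```

With fmax = 8000 Hz the bound is roughly 43 mm, so spacings of 30 mm and 50 mm (common factor 10 mm) pass, while spacings of 100 mm and 200 mm (common factor 100 mm) do not.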
  • Patent number: 7305337
    Abstract: The present invention includes a method for speech encoding and decoding and a design of speech coder and decoder. The characteristic of speech encoding method relies on the type of data with high compression rate after the whole speech data is compressed. The present invention is able to lower the bit rate of the original speech from 64 Kbps to 1.6 Kbps and provide a bit rate lower than the traditional compression method. It can provide good speech quality, and attain the function of storing the maximum speech data with minimum memory. As to the speech decoding method, some random noises are appropriately added into the exciting source, so that more speech characteristics can be simulated to produce various speech sounds. In addition, the present invention also discloses a coder and a decoder designed by application specific integrated circuit, and the structural design is optimized according to the software.
    Type: Grant
    Filed: December 24, 2002
    Date of Patent: December 4, 2007
    Assignee: National Cheng Kung University
    Inventors: Jhing-Fa Wang, Jia-Ching Wang, Yun-Fei Chao, Han-Chiang Chen, Ming-Chi Shih
  • Patent number: 7289952
    Abstract: A random code vector reading section and a random codebook of a conventional CELP type speech coder/decoder are respectively replaced with an oscillator for outputting different vector streams in accordance with values of input seeds, and a seed storage section for storing a plurality of seeds. This makes it unnecessary to store fixed vectors as they are in a fixed codebook (ROM), thereby considerably reducing the memory capacity.
    Type: Grant
    Filed: May 7, 2001
    Date of Patent: October 30, 2007
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Kazutoshi Yasunaga, Toshiyuki Morii, Hiroyuki Ehara
  • Patent number: 7284255
    Abstract: A system and method are disclosed for performing audience surveys of broadcast audio from radio and television. A small body-worn portable collection unit samples the audio environment of the survey member and stores highly compressed features of the audio programming. A central computer simultaneously collects the audio outputs from a number of radio and television receivers representing the possible selections that a survey member may choose. On a regular schedule the central computer interrogates the portable units used in the survey and transfers the captured audio feature samples. The central computer then applies a feature pattern recognition technique to identify which radio or television station the survey member was listening to at various times of day. This information is then used to estimate the popularity of the various broadcast stations.
    Type: Grant
    Filed: November 16, 1999
    Date of Patent: October 16, 2007
    Inventors: Steven G. Apel, Stephen C. Kenyon