Psychoacoustic Patents (Class 704/200.1)
  • Patent number: 7725313
    Abstract: A non-iterative and computationally efficient bit allocation technique for perceptual audio coders employing uniform quantization schemes. This is achieved by computing a target MNR for all critical bands in a frame using a target bit rate and associated SMRs. Associated SNRs are then computed for the critical bands using the computed target MNR and the associated SMRs. Bits are then allocated to the critical bands based on the computed associated SNRs.
    Type: Grant
    Filed: September 13, 2004
    Date of Patent: May 25, 2010
    Assignee: Ittiam Systems (P) Ltd.
    Inventors: Preethi Konda, Vinod Prakash
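    The closed-form allocation described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method itself: the 6.02 dB-per-bit rule for uniform quantizers, the function name, and the clamping of negative requirements to zero are all assumptions introduced here.

```python
DB_PER_BIT = 6.02  # assumed SNR gain per quantizer bit (uniform quantizer rule of thumb)


def allocate_bits(smr_db, lines_per_band, bit_budget):
    """Hypothetical helper: derive one target MNR, then per-band SNRs and bits."""
    total_lines = sum(lines_per_band)
    weighted_smr = sum(n * s for n, s in zip(lines_per_band, smr_db))
    # Solve sum(n_b * (MNR + SMR_b) / DB_PER_BIT) = bit_budget for the shared MNR.
    target_mnr = (DB_PER_BIT * bit_budget - weighted_smr) / total_lines
    # SNR each band needs so that its mask-to-noise ratio equals the target.
    snr_db = [target_mnr + s for s in smr_db]
    # Bits per spectral line; negative requirements are clamped to zero (assumption).
    bits = [max(0, round(snr / DB_PER_BIT)) for snr in snr_db]
    return target_mnr, snr_db, bits


if __name__ == "__main__":
    mnr, snr, bits = allocate_bits([12.04, 6.02, 0.0], [4, 4, 4], 36)
    print(mnr, snr, bits)
```

    Because the target MNR comes from a single equation, no rate loop is needed. Rounding and clamping can still leave the total slightly above or below the budget, which a real coder would have to reconcile.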
  • Publication number: 20100121632
    Abstract: Provided is a stereo audio encoding device which can improve the ICP (Inter-channel Prediction) performance of a stereo audio signal while suppressing the bit rate. The device (100) includes: a QMF analysis unit (101) which divides two channel signals constituting a stereo audio signal into a plurality of frequency band signals; a monaural signal generation unit (104) which generates a monaural signal by averaging the two channel signals of the divided frequency bands; parameter band constituting units (102, 105) each of which collects one or more of the continuous frequency bands to constitute a parameter band in such a manner that fewer bands are contained in a lower frequency for the two channel signals and monaural signals of the divided frequency bands; and an ICP analysis unit (106) which performs inter-channel prediction by using the channel signal and the monaural signal of the divided frequency bands.
    Type: Application
    Filed: April 24, 2008
    Publication date: May 13, 2010
    Applicant: PANASONIC CORPORATION
    Inventor: Kok Seng Chong
  • Patent number: 7715567
    Abstract: A method to denoise a stereo signal comprising a stereo sum signal and a stereo difference signal performs a frequency selective stereo to mono blending based on the masking effect of the human auditory system. Therefore, a stereo signal noise reducer, comprising a first filter bank (1) to split the stereo difference signal (l-r) into a plurality of subbands, respective first multipliers (CO, . . . , CN) to weight each of the subbands of the stereo difference signal with a respective corresponding control signal (CO, . . . , CN), and a first adder (3) to sum all weighted subbands of the stereo difference signal (l-r) to build a frequency selective weighted stereo difference signal (diff), within which a number and width of the subbands obtained via the first filter bank (1) are chosen according to the properties of the human auditory system, further comprises a weighting factor determination unit which determines a respective control signal (CO, . . .
    Type: Grant
    Filed: August 18, 2006
    Date of Patent: May 11, 2010
    Assignee: Sony Deutschland GmbH
    Inventor: Jens Wildhagen
  • Patent number: 7716042
    Abstract: Coding an audio signal of a sequence of audio values into a coded signal includes determining first and second listening thresholds for first and second blocks of audio values of the sequence of audio values; calculating versions of first and second parameterizations of the parameterizable filter such that the transfer function thereof roughly corresponds to the inverse of the magnitude of the first and second listening thresholds, respectively; filtering a predetermined block of audio values of the sequence of audio values with the parameterizable filter using a predetermined parameterization which depends on the version of the second parameterization to obtain a block of filtered audio values corresponding to the predetermined block which is quantized; forming a difference between the versions of the first and second parameterizations; integrating information on, inter alia, the difference into the coded signal.
    Type: Grant
    Filed: July 27, 2006
    Date of Patent: May 11, 2010
    Inventors: Gerald Schuller, Stefan Wabnik, Jens Hirschfeld, Manfred Lutzky
  • Patent number: 7711555
    Abstract: Digital audio data are divided into a plurality of frames, each of which includes a desired number of sub-band samples, which are gradually increased in a range between “16” and “1024”, and are then compressed by way of psychoacoustics analysis and quantization, whereby compressed data are realized with a high compression ratio and small tone-generation latency. The compressed data are decoded by way of inverse quantization and sub-band synthesis, so that decoded data are sequentially written into a memory (e.g., a FIFO memory). Decoding is appropriately turned on or off in response to a presently vacant capacity of the memory.
    Type: Grant
    Filed: May 29, 2006
    Date of Patent: May 4, 2010
    Assignee: Yamaha Corporation
    Inventor: Toshihiko Suzuki
  • Patent number: 7707030
    Abstract: A filter bank device for generating a complex spectral representation of a discrete-time signal includes a generator for generating a block-wise real spectral representation, which, for example, implements an MDCT, to obtain temporally successive blocks of real spectral coefficients. The output values of this spectral conversion device are fed to a post-processor for post-processing the block-wise real spectral representation to obtain an approximated complex spectral representation having successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and by a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is determined by combining at least two real spectral coefficients.
    Type: Grant
    Filed: January 26, 2005
    Date of Patent: April 27, 2010
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Bernd Edler, Stefan Geyersberger
  • Patent number: 7702514
    Abstract: A method for audio encoding includes: analyzing an audio frame using a psychoacoustic model to obtain a corresponding masking curve and window information; transforming the audio frame according to the window information to obtain a spectrum, and dividing the spectrum into a plurality of frequency sub-bands; estimating a scale factor for each frequency sub-band; quantizing the frequency sub-bands; encoding the quantized frequency sub-bands; and packing the encoded frequency sub-bands and side information into an audio stream. Each scale factor is estimated based on a quantizable audio intensity of each frequency sub-band, which is adjusted according to a cumulative total amount of buffer space used for storing the encoded frequency sub-bands and an amount of buffer space used for storing a previously encoded audio frame, and a mean of intensities of all signals in the corresponding frequency sub-band and spectrum position of the corresponding frequency sub-band.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: April 20, 2010
    Assignee: Pixart Imaging Incorporation
    Inventors: Chih-Hsin Lin, Hsin-Chia Chen, Chang-Che Tsai, Tzu-Yi Chao
  • Patent number: 7698130
    Abstract: Provided are an audio encoding method and apparatus capable of fast bit rate control. The audio encoding method includes: converting audio sampling data into frequency domain data; adjusting a scalefactor value in each predetermined frequency band based on available bits and allowed distortion of a psychoacoustic model to allocate a number of necessary bits to the frequency domain data and quantize the frequency domain data; and generating a bit stream based on the quantized data.
    Type: Grant
    Filed: September 8, 2005
    Date of Patent: April 13, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Miyoung Kim, Shihwa Lee, Dohyung Kim
  • Patent number: 7693707
    Abstract: A voice and musical tone coding apparatus is provided that can perform high-quality coding by executing vector quantization taking the characteristics of human hearing into consideration. In this voice and musical tone coding apparatus, a quadrature transformation processing section (201) converts a voice and musical tone signal from time components to frequency components. An auditory masking characteristic value calculation section (203) finds an auditory masking characteristic value from a voice and musical tone signal. A vector quantization section (202) performs vector quantization changing a calculation method of a distance between a code vector found from a preset codebook and a frequency component based on an auditory masking characteristic value.
    Type: Grant
    Filed: December 20, 2004
    Date of Patent: April 6, 2010
    Assignee: Panasonic Corporation
    Inventors: Tomofumi Yamanashi, Kaoru Sato, Toshiyuki Morii
  • Patent number: 7689406
    Abstract: Method and system for measuring transmission quality of an audio transmission system under test. Specifically, an input signal (X), such as an original input speech signal, is applied to the audio transmission system which results in an output signal (Y) produced by the transmission system. Both signals X and Y are mutually processed to yield a perceived quality signal. In accordance with the invention, output signal Y and/or input signal X are scaled such that, depending on a ratio of power of these two signals, relatively small deviations of power between these signals are compensated, while relatively larger deviations are only partially compensated. Further, an artificial reference speech signal may be created for which noise levels present in the input speech signal are reduced by a scale factor which reflects a local level of the noise in that input signal.
    Type: Grant
    Filed: February 26, 2003
    Date of Patent: March 30, 2010
    Assignee: Koninklijke KPN N.V.
    Inventor: John Gerard Beerends
  • Patent number: 7680671
    Abstract: AC-3 is a high quality audio compression format widely used in feature films and, more recently, on Digital Versatile Disks (DVD). For consumer applications the algorithm is usually coded into the firmware of a DSP processor, which due to cost considerations may be capable of only fixed point arithmetic. It is generally assumed that 16-bit processing is incapable of delivering the high fidelity audio expected from the AC-3 technology. Double precision computation can be utilized on such processors to provide the high quality, but the computational burden of such an implementation will be beyond the capacity of the processor to enable real-time operation. Through extensive simulation study of a high quality AC-3 encoder implementation, a multi-precision technique for each processing block is presented whereby the quality of the encoder on a 16-bit processor matches the single precision 24-bit implementation very closely without excessive additional computational complexity.
    Type: Grant
    Filed: September 8, 2006
    Date of Patent: March 16, 2010
    Assignee: STMicroelectronics Asia Pacific Pte. Ltd.
    Inventors: Mohammed Javed Absar, Sapna George, Antonio Mario Alvarez-Tinoco
  • Patent number: 7676374
    Abstract: A filtering method and system for a subband-domain is provided. A first analysis filter bank is configured to divide an input signal into a plurality of subbands. A second analysis filter bank divides one or more of the subbands into a second set of subbands. A modification unit accepts the plurality of subbands, the second set of subbands and modification data and outputs a plurality of modified frequency subbands. A first synthesis filter bank synthesizes the plurality of modified subbands. A filter then filters the plurality of modified subbands and the one or more synthesized modified subbands to obtain a plurality of filtered subbands. A second synthesis filter bank synthesizes the plurality of filtered subbands to obtain an output signal.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: March 9, 2010
    Assignee: Nokia Corporation
    Inventor: Mikko Tammi
  • Publication number: 20100042406
    Abstract: A perceptual model based on psychoacoustic auditory experiments is based on the (time domain) roughness of an input signal envelope in particular cochlea filter bands rather than the noise-like vs. tonal nature of the input signal. In illustrative embodiments, frequency domain techniques are used to develop envelope and envelope roughness measures, and such roughness measures are then used to derive Noise Masking Ratio (NMR) values for achieving a high level of noise masking in coder embodiments. Coder embodiments based on present inventive teachings are compatible with well-known AAC coding standards.
    Type: Application
    Filed: March 4, 2002
    Publication date: February 18, 2010
    Inventors: James David Johnston, Shyh-Shiaw Kuo
  • Publication number: 20100042407
    Abstract: In one alternative, an audio signal is analyzed using multiple psychoacoustic criteria to identify a region of the signal in which time scaling and/or pitch shifting processing would be inaudible or minimally audible, and the signal is time scaled and/or pitch shifted within that region. In another alternative, the signal is divided into auditory events, and the signal is time scaled and/or pitch shifted within an auditory event. In a further alternative, the signal is divided into auditory events, and the auditory events are analyzed using a psychoacoustic criterion to identify those auditory events in which the time scaling and/or pitch shifting processing of the signal would be inaudible or minimally audible. Further alternatives provide for multiple channels of audio.
    Type: Application
    Filed: October 26, 2009
    Publication date: February 18, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Brett Graham Crockett
  • Patent number: 7664633
    Abstract: Coding of an audio signal represented by a respective set of sampled signal values for each of a plurality of sequential segments is disclosed. The sampled signal values are analyzed (40) to determine one or more sinusoidal components for each of the plurality of sequential segments. The sinusoidal components are linked (42) across a plurality of sequential segments to provide sinusoidal tracks. For each sinusoidal track, a phase comprising a generally monotonically changing value is determined and an encoded audio stream including sinusoidal codes (r) representing said phase is generated (46).
    Type: Grant
    Filed: November 6, 2003
    Date of Patent: February 16, 2010
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Albertus Cornelis Den Brinker, Andreas Johannes Gerrits, Robert Johannes Sluijter
  • Patent number: 7660720
    Abstract: A lossless audio coding and/or decoding method and apparatus are provided. The coding method includes: mapping the audio signal in the frequency domain having an integer value into a bit-plane signal with respect to the frequency; obtaining a most significant bit and a Golomb parameter for each bit-plane; selecting a binary sample on a bit-plane to be coded in the order from the most significant bit to the least significant bit and from a lower frequency component to a higher frequency component; calculating the context of the selected binary sample by using significances of already coded bit-planes for each of a plurality of frequency lines existing in the vicinity of a frequency line to which the selected binary sample belongs; selecting a probability model by using the obtained Golomb parameter and the calculated contexts; and lossless-coding the binary sample by using the selected probability model.
    Type: Grant
    Filed: March 10, 2005
    Date of Patent: February 9, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ennmi Oh, Junghoe Kim, Miao Lei, Shihwa Lee, Sangwook Kim
  • Publication number: 20100017195
    Abstract: A filter compressor for generating compressed subband filter impulse responses from input subband filter impulse responses corresponding to subbands, which include filter impulse response values at filter taps, includes a processor for examining the filter impulse response values from at least two input subband filter input responses to find filter impulse response values having higher values and at least one filter impulse response value having a value being lower than the higher values, and a filter impulse response constructor for constructing the compressed subband filter impulse responses using the filter impulse response values having the higher values, wherein the compressed subband filter impulse responses do not include filter impulse response values corresponding to filter taps of the at least one filter impulse response value having the lower value or include zero-valued values corresponding to filter taps of the at least one filter impulse response value having the lower value.
    Type: Application
    Filed: July 3, 2007
    Publication date: January 21, 2010
    Inventor: Lars Villemoes
  • Patent number: 7650277
    Abstract: A technique to encode an audio signal based on a perceptual model. In one example embodiment, this is accomplished by shaping quantization noise in the spectral lines on a band-by-band basis using their local gains. The noise shaped spectral lines are then fitted within a predetermined bit rate to form an encoded bit stream.
    Type: Grant
    Filed: September 25, 2003
    Date of Patent: January 19, 2010
    Assignee: Ittiam Systems (P) Ltd.
    Inventors: Vinod Prakash, Ashok I Magadum
  • Patent number: 7650278
    Abstract: A digital signal encoding method and apparatus using a plurality of lookup tables. The method includes: preparing a plurality of lookup tables storing numbers of allocated bits for encoding frequency bands of an input signal according to a characteristic of the input signal in a predetermined number of addresses; dividing an input signal in the time domain into signals in predetermined frequency bands; calculating address values of the frequency bands; selecting one of the plurality of lookup tables according to the characteristic of the input signal; extracting numbers of allocated bits of addresses having the calculated address values from the selected lookup table with respect to the frequency bands and allocating the numbers of bits to the frequency bands; and generating a bitstream by quantizing the input signal according to the numbers of allocated bits.
    Type: Grant
    Filed: March 16, 2005
    Date of Patent: January 19, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Dohyung Kim, Junghoe Kim, Yangseock Seo, Shihwa Lee, Sangwook Kim
  • Publication number: 20100010807
    Abstract: A method and apparatus to encode and decode an audio/speech signal is provided. An inputted audio signal or speech signal may be transformed into at least one of a high frequency resolution signal and a high temporal resolution signal. The signal may be encoded by determining an appropriate resolution, the encoded signal may be decoded, and thus the audio signal, the speech signal, and a mixed signal of the audio signal and the speech signal may be processed.
    Type: Application
    Filed: July 14, 2009
    Publication date: January 14, 2010
    Inventors: Eun Mi Oh, Jung Hoe Kim, Ki Hyun Choo, Ho Sang Sung, Mi Young Kim
  • Patent number: 7647221
    Abstract: Audio level control is provided for compressed audio. Scale factors for the compressed audio are extracted from an MPEG audio data stream, the extracted scale factors are altered without decompressing the compressed audio, and the MPEG audio data stream is updated with the altered scale factors. All of the scale factors in the MPEG audio data stream are altered based on a parameter identifying how the gain levels in the MPEG data stream are to be altered.
    Type: Grant
    Filed: April 30, 2003
    Date of Patent: January 12, 2010
    Assignee: The DIRECTV Group, Inc.
    Inventor: James A. Michener
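    A rough sketch of changing level by editing scale factors only, without decoding the audio samples. It assumes MPEG-1 Layer I/II style scale factor indices (0 to 62, each step a factor of 2^(1/3), about 2 dB, with larger indices meaning smaller scale factors); the abstract does not state this mapping, and the function name and saturating clamp are illustrative choices.

```python
import math

STEP_DB = 20 * math.log10(2 ** (1 / 3))  # about 2.007 dB per index step (assumed Layer I/II table)
MAX_INDEX = 62  # highest valid scale factor index under that assumption


def shift_scalefactor_indices(indices, gain_db):
    """Hypothetical helper: apply a gain change by moving scale factor indices."""
    steps = round(gain_db / STEP_DB)
    # A larger index means a smaller scale factor, so positive gain lowers the index.
    # Indices that would leave the valid range are saturated at the limits.
    return [min(MAX_INDEX, max(0, i - steps)) for i in indices]


if __name__ == "__main__":
    print(shift_scalefactor_indices([10, 3, 1, 60], 6.0))
    print(shift_scalefactor_indices([10, 60, 62], -6.0))
```

    Gain can only be changed in whole index steps under this scheme, and saturation at either end of the index range distorts the relative levels of the affected subbands.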
  • Publication number: 20090326928
    Abstract: Various embodiments provide techniques for allowing an application to opt out of system default audio stream behavior, as well as techniques for notifying applications on a computing device that a communication audio stream has been initiated. The techniques may differentiate between communication-related audio streams and audio streams that are not communication-related. In some embodiments, an application may register to receive notification that a communication stream has been initiated. The application may be configured to comply with system default audio stream handling policies, or it can perform custom behavior in response to the audio stream notification. In some embodiments, an application may register for filtered or unfiltered notification. In a filtered notification scenario, an application is notified that a communication stream has been initiated when an audio stream associated with the application has not already been modified in response to the initiation of a different communication stream.
    Type: Application
    Filed: June 26, 2008
    Publication date: December 31, 2009
    Applicant: Microsoft Corporation
    Inventors: Elliot H. Omiya, Noel R. Cross, Adeel A. Aslam, Lawrence W. Osterman
  • Publication number: 20090319259
    Abstract: Methods and an apparatus for enhancement of source coding systems utilizing high frequency reconstruction (HFR) are introduced. The problem of insufficient noise contents is addressed in a reconstructed highband, by using Adaptive Noise-floor Addition. New methods are also introduced for enhanced performance by means of limiting unwanted noise, interpolation and smoothing of envelope adjustment amplification factors. The methods and apparatus used are applicable to both speech coding and natural audio coding systems.
    Type: Application
    Filed: June 24, 2009
    Publication date: December 24, 2009
    Inventors: Lars G. Liljeryd, Kristofer Kjoerling, Per Ekstrand, Fredrik Henn
  • Patent number: 7636659
    Abstract: In accordance with the present invention, computer implemented methods and systems are provided for representing and modeling the temporal structure of audio signals. In response to receiving a signal, a time-to-frequency domain transformation on at least a portion of the received signal to generate a frequency domain representation is performed. The time-to-frequency domain transformation converts the signal from a time domain representation to the frequency domain representation. A frequency domain linear prediction (FDLP) is performed on the frequency domain representation to estimate a temporal envelope of the frequency domain representation. Based on the temporal envelope, one or more speech features are generated.
    Type: Grant
    Filed: March 25, 2005
    Date of Patent: December 22, 2009
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Marios Athineos, Daniel P. W. Ellis
  • Patent number: 7634400
    Abstract: A mask generation process for use in encoding audio data, including generating linear masking components from the audio data, generating logarithmic masking components from the linear masking components, and generating a global masking threshold from the logarithmic masking components. The process is a psychoacoustic masking process for use in an MPEG-1-L2 encoder, and includes generating energy values from a Fourier transform of the audio data, determining sound pressure level values from the energy values, selecting tonal and non-tonal masking components on the basis of the energy values, generating power values from the energy values, generating masking thresholds on the basis of the masking components and the power values, and generating signal to mask ratios for a quantizer on the basis of the sound pressure level values and the masking thresholds.
    Type: Grant
    Filed: March 8, 2004
    Date of Patent: December 15, 2009
    Assignee: STMicroelectronics Asia Pacific Pte. Ltd.
    Inventors: Charles Averty, Xue Yao, Ranjot Singh
  • Patent number: 7630889
    Abstract: A code conversion method for converting first code string data conforming to a first speech coding scheme into second code string data conforming to a second speech coding scheme has the steps of decoding the first code string data to generate a first decoded speech, correcting the signal characteristics of the first decoded speech to generate a second decoded speech, and encoding the second decoded speech in accordance with the second speech coding scheme to generate the second code string data.
    Type: Grant
    Filed: March 31, 2004
    Date of Patent: December 8, 2009
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
  • Patent number: 7620543
    Abstract: A method, medium, and apparatus for converting compressed audio data, including decoding compressed audio input data, in accordance with a corresponding compression format, coding a result of the decoding, in accordance with a predetermined compression format, and combining a result of the coding with the side information to generate audio output data to be compressed according to the predetermined compression format.
    Type: Grant
    Filed: January 13, 2005
    Date of Patent: November 17, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Dohyung Kim, Sangwook Kim, Ennmi Oh, Junghoe Kim, Yangseock Seo, Shihwa Lee
  • Patent number: 7617100
    Abstract: An improved audio compression scheme is provided. The scheme uses an excitation pattern to more efficiently provide audio signal compression. Under the scheme, an input signal is transformed to the frequency domain. Next, the excitation pattern corresponding to the transformed input signal is calculated. Bit allocation processing is then performed based on the excitation pattern. Frequencies are then coded based on the results of the bit allocation processing. Finally, bitstream packing is performed to generate the output coded audio bit stream. In one exemplary implementation, the audio compression scheme is implemented in an encoder.
    Type: Grant
    Filed: January 10, 2003
    Date of Patent: November 10, 2009
    Assignee: NVIDIA Corporation
    Inventor: Fa-Long Luo
  • Patent number: 7617110
    Abstract: A lossless audio encoding/decoding method, medium, and apparatus. The lossless audio encoding method includes converting an audio signal in a time domain into an audio spectral signal with an integer in a frequency domain, mapping the audio spectral signal in the frequency domain to a bit plane signal according to its frequency, and losslessly encoding binary samples of bit planes using a probability model determined according to a predetermined context. The lossless audio decoding method includes extracting a predetermined lossy bitstream and an error bitstream from error data by demultiplexing an audio bitstream, the error data corresponding to a difference between lossy encoded audio data and an audio spectral signal with an integer in a frequency domain, lossy decoding the extracted encoded lossy bitstream, losslessly decoding the extracted error bitstream, and restoring the original audio frequency spectral signal using the decoded lossy bitstream and error bitstream.
    Type: Grant
    Filed: February 28, 2005
    Date of Patent: November 10, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Junghoo Kim, Miao Lei, Shihwa Lee, Sangwook Kim, Ennmi Oh, Dohyung Kim
  • Patent number: 7613609
    Abstract: To encode multi-channel digital data by adjusting the number of bits allocated to each channel to perform entropy coding of the multi-channel data, there is provided a multi-channel encoder including n encoders for audio data from n channels and an inter-channel bit allocator that allocates the number of bits usable for each channel on the basis of the provisional number of in-use bits from each of the encoders. Each of the encoders performs entropy coding on the basis of the provisional number of quantizing steps, outputs the provisional number of in-use bits resulting from summing of a code length of each unit of coding, and adjusts the number of in-use bits by updating the quantizing steps correspondingly to the number of bits supplied based on the provisional number of in-use bits.
    Type: Grant
    Filed: April 2, 2004
    Date of Patent: November 3, 2009
    Assignee: Sony Corporation
    Inventor: Kenichi Makino
  • Patent number: 7613603
    Abstract: An efficient audio coding device that quantizes and encodes digital audio signals with a reduced amount of computation. A spatial transform unit subjects samples of a given audio signal to a spatial transform, thus obtaining transform coefficients of the signal. With a representative value selected out of the transform coefficients of each subband, a quantization step size calculator estimates quantization noise and calculates, in an approximative way, a quantization step size of each subband from the estimated quantization noise, as well as from a masking power threshold determined from a psycho-acoustic model of the human auditory system. A quantizer then quantizes the transform coefficients, based on the calculated quantization step sizes, thereby producing quantized values of those coefficients. The quantization step sizes are also used by a scalefactor calculator to calculate common and individual scalefactors.
    Type: Grant
    Filed: November 10, 2005
    Date of Patent: November 3, 2009
    Assignee: Fujitsu Limited
    Inventor: Hiroaki Yamashita
  • Patent number: 7610205
    Abstract: In one alternative, an audio signal is analyzed using multiple psychoacoustic criteria to identify a region of the signal in which time scaling and/or pitch shifting processing would be inaudible or minimally audible, and the signal is time scaled and/or pitch shifted within that region. In another alternative, the signal is divided into auditory events, and the signal is time scaled and/or pitch shifted within an auditory event. In a further alternative, the signal is divided into auditory events, and the auditory events are analyzed using a psychoacoustic criterion to identify those auditory events in which the time scaling and/or pitch shifting processing of the signal would be inaudible or minimally audible. Further alternatives provide for multiple channels of audio.
    Type: Grant
    Filed: February 12, 2002
    Date of Patent: October 27, 2009
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Brett Graham Crockett
  • Patent number: 7610195
    Abstract: A decoder (e.g., an AAC-LTP decoder) receives a stream containing coded audio data and prediction data. The coded data is upsampled or downsampled during decoding. Portions of the decoded data are stored in a buffer for use in decoding subsequent coded data. The buffer into which the decoded data is placed has different dimensions than a buffer used in a coder when generating the coded data. A portion of the data in the decoder buffer is identified and modified with interleaved zero values so as to correspond to the dimensions of the prediction coding buffer in the coder.
    Type: Grant
    Filed: June 1, 2006
    Date of Patent: October 27, 2009
    Assignee: Nokia Corporation
    Inventor: Juha Ojanperä
  • Patent number: 7590523
    Abstract: There is provided a speech post-processor for enhancing a speech signal divided into a plurality of sub-bands in frequency domain. The speech post-processor comprises an envelope modification factor generator configured to use frequency domain coefficients representative of an envelope derived from the plurality of sub-bands to generate an envelope modification factor for the envelope derived from the plurality of sub-bands, where the envelope modification factor is generated using FAC = α·ENV/Max + (1 − α), where FAC is the envelope modification factor, ENV is the envelope, Max is the maximum envelope, and α is a value between 0 and 1, where α is a different constant value for each speech coding rate. The speech post-processor further comprises an envelope modifier configured to modify the envelope derived from the plurality of sub-bands by the envelope modification factor corresponding to each of the plurality of sub-bands.
    Type: Grant
    Filed: March 20, 2006
    Date of Patent: September 15, 2009
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
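    The envelope modification in the abstract above can be illustrated with a minimal Python sketch. It assumes the weighting constant in the formula is a single value between 0 and 1 (called alpha here) and that the envelope is given as one non-negative number per sub-band. The function names and the handling of an all-zero envelope are hypothetical and are not taken from the patent.

```python
def envelope_modification_factors(envelope, alpha):
    """Return FAC = alpha * ENV / Max + (1 - alpha) for each sub-band."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    peak = max(envelope)  # "Max" in the abstract: the largest envelope value
    if peak <= 0.0:
        # Assumption: an all-zero envelope is left unmodified.
        return [1.0 for _ in envelope]
    return [alpha * env / peak + (1.0 - alpha) for env in envelope]


def modify_envelope(envelope, alpha):
    """Scale each sub-band envelope value by its own modification factor."""
    factors = envelope_modification_factors(envelope, alpha)
    return [env * fac for env, fac in zip(envelope, factors)]


if __name__ == "__main__":
    # Three sub-bands; the strongest band keeps its level (factor 1.0),
    # weaker bands are attenuated more strongly.
    print(modify_envelope([1.0, 2.0, 4.0], alpha=0.5))
```

    With alpha = 0 every factor is 1 and the envelope is unchanged; larger alpha values attenuate weak sub-bands more relative to the strongest one.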
  • Patent number: 7590543
    Abstract: The present invention proposes a new method for improving the performance of a real-valued, filterbank-based spectral envelope adjuster. Adaptively locking the gain values of adjacent channels, depending on the sign of the channels as defined in the application, reduces aliasing. Furthermore, grouping the channels during gain calculation gives an improved energy estimate of the real-valued subband signals in the filterbank.
    Type: Grant
    Filed: September 21, 2007
    Date of Patent: September 15, 2009
    Assignee: Coding Technologies Sweden AB
    Inventors: Kristofer Kjorling, Lars Villemoes
  • Publication number: 20090226010
    Abstract: An apparatus for mixing a plurality of input data streams is described, which has a processing unit adapted to compare the frames of the plurality of input data streams, and determine, based on the comparison, for a spectral component of an output frame of an output data stream, exactly one input data stream of the plurality of input data streams. The output data stream is generated by copying at least a part of an information of a corresponding spectral component of the frame of the determined data stream. Further or alternatively, the control values of the frames of the first and second input data streams are compared, and, if so, the control value is adopted.
    Type: Application
    Filed: March 4, 2009
    Publication date: September 10, 2009
    Inventors: Markus Schnell, Manfred Lutzky, Markus Multrus
  • Patent number: 7587311
    Abstract: For embedding a binary payload in a carrier signal, which, for example, is an audio signal, a sequence of time-discrete values of the carrier signal is converted to the frequency domain by means of an integer transform algorithm to obtain binary spectral representation values. Bits of the binary spectral representation values with a valency less than a signal limit valency are determined and set according to the payload. The signal limit valency for a spectral representation value is less than the valency of the leading bit of this spectral representation value, so that, with adequate distance, a psychoacoustically transparent insertion of information is achieved. Thus a modified spectral representation with inserted information is generated, which is finally converted back to the time domain using an integer back transform algorithm. For extracting the payload, the time-discrete signal with the inserted information is again converted to a spectral representation with the integer forward transform algorithm.
    Type: Grant
    Filed: November 15, 2005
    Date of Patent: September 8, 2009
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Gerald Schuller, Ralf Geiger, Juergen Koller
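    The bit-setting step in the abstract above can be sketched in Python for a single integer spectral value. This is only an illustration under stated assumptions: the integer transform itself is omitted, the "signal limit valency" is modeled as a fixed number of bit positions (`distance`, a hypothetical parameter) below the leading bit, and unused writable bits are padded with zeros. None of the function names come from the patent.

```python
def writable_bit_count(magnitude, distance):
    """Number of low-order bits lying below the limit valency."""
    leading = magnitude.bit_length() - 1  # position of the leading bit (-1 for 0)
    return max(leading - distance, 0)


def embed_bits(value, payload_bits, distance):
    """Overwrite the writable low-order bits of |value| with payload bits.

    Returns (modified value, number of payload bits consumed).
    """
    if distance < 1:
        raise ValueError("distance must be at least 1 so the leading bit is kept")
    magnitude = abs(value)
    capacity = writable_bit_count(magnitude, distance)
    if capacity == 0:
        return value, 0
    used = payload_bits[:capacity]
    bits = used + [0] * (capacity - len(used))  # assumption: zero padding
    payload = 0
    for bit in bits:
        payload = (payload << 1) | bit
    new_magnitude = ((magnitude >> capacity) << capacity) | payload
    return (-new_magnitude if value < 0 else new_magnitude), len(used)


def extract_bits(value, distance):
    """Read back the writable bits; the leading bit is unchanged by embedding,
    so the extractor can recompute how many bits were written."""
    magnitude = abs(value)
    capacity = writable_bit_count(magnitude, distance)
    return [(magnitude >> shift) & 1 for shift in range(capacity - 1, -1, -1)]


if __name__ == "__main__":
    modified, used = embed_bits(90, [1, 0, 1], distance=3)
    print(modified, used, extract_bits(modified, distance=3))
```

    Because only bits below the leading bit are changed, the extractor sees the same leading bit as the embedder and needs no side information about the number of inserted bits per value.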
  • Patent number: 7583805
    Abstract: A scheme for stereo and multi-channel synthesis of inter-channel correlation (ICC) (normalized cross-correlation) cues for parametric stereo and multi-channel coding. The scheme synthesizes ICC cues such that they approximate those of the original. For that purpose, diffuse audio channels are generated and mixed with the transmitted combined (e.g., sum) signal(s). The diffuse audio channels are preferably generated using relatively long filters with exponentially decaying Gaussian impulse responses. Such impulse responses generate diffuse sound similar to late reverberation. An alternative implementation for reduced computational complexity is proposed, where inter-channel level difference (ICLD), inter-channel time difference (ICTD), and ICC synthesis are all carried out in the domain of a single short-time Fourier transform (STFT), including the filtering for diffuse sound generation.
    Type: Grant
    Filed: April 1, 2004
    Date of Patent: September 1, 2009
    Assignee: Agere Systems Inc.
    Inventors: Frank Baumgarte, Christof Faller
  • Patent number: 7577570
    Abstract: The present invention proposes a new method for improving the performance of a real-valued, filterbank-based spectral envelope adjuster. Adaptively locking the gain values of adjacent channels, depending on the sign of the channels as defined in the application, reduces aliasing. Furthermore, grouping the channels during gain calculation gives an improved energy estimate of the real-valued subband signals in the filterbank.
    Type: Grant
    Filed: August 29, 2003
    Date of Patent: August 18, 2009
    Assignee: Coding Technologies Sweden AB
    Inventors: Kristofer Kjoerling, Lars Villemoes
  • Patent number: 7574355
    Abstract: For determining a quantizer step size for quantizing a signal including audio or video information, a first quantizer step size as well as an interference threshold are provided. Then, the actual interference introduced by the first quantizer step size is determined and compared with the interference threshold. Even though the comparison reveals that the interference actually introduced exceeds the threshold, a second, coarser quantizer step size is nevertheless tried; it is then used for quantization if the interference it introduces falls below the threshold or below the interference introduced by the first quantizer step size. Thus, the quantization interference is reduced while the quantization is coarsened and, thus, the compression gain is increased.
    Type: Grant
    Filed: August 30, 2006
    Date of Patent: August 11, 2009
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernhard Grill, Michael Schug, Bodo Teichmann, Nikolaus Rettelbach
  • Patent number: 7570748
    Abstract: Using a network in which one master station modem 105 and multiple slave station modems 106, 107, 108 are physically connected by rudder connection or bus connection with a telecommunication line 121, the measured S/N ratios between the master station and the slave stations are first collected in the master station at the initialization stage; pairs of modulation method and transmission voltage that all modems can demodulate with high probability are calculated from this data, and the result is transmitted to all slave stations. At the normal transmission stage, information regarding the coordination and control of the network is transmitted in accordance with this setting. Thus, one-to-many communication is realized, guaranteeing an access right.
    Type: Grant
    Filed: December 22, 2004
    Date of Patent: August 4, 2009
    Assignee: Hitachi, Ltd.
    Inventors: Yoshikazu Ishii, Setsuo Arita, Yuji Ichinose, Nao Saito, Daisuke Sinma, Yasuhiro Nakatsuka
  • Patent number: 7562012
    Abstract: A method and apparatus for creating a signature of a sampled work in real-time is disclosed herein. Unique signatures of an unknown audio work are created by segmenting a file into segments having predetermined segment and hop sizes. The signature then may be compared against reference signatures. One aspect may be characterized in that the hop size of the sampled work signature is less than the hop size of reference signatures. A method for identifying an unknown audio work is also disclosed.
    Type: Grant
    Filed: November 3, 2000
    Date of Patent: July 14, 2009
    Assignee: Audible Magic Corporation
    Inventors: Erling H. Wold, Thomas L. Blum, Douglas F. Keislar, James A. Wheaton
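    The segmenting step described in the abstract above can be illustrated with a minimal Python sketch. The function name `segment_samples` and the example sizes are hypothetical, and the abstract does not say how a signature is computed from each segment, so that part is not shown.

```python
def segment_samples(samples, segment_size, hop_size):
    """Split a sequence into fixed-size segments whose starts are hop_size apart.

    A hop size smaller than the segment size yields overlapping segments.
    Trailing samples that do not fill a whole segment are dropped (assumption).
    """
    if segment_size <= 0 or hop_size <= 0:
        raise ValueError("segment_size and hop_size must be positive")
    last_start = len(samples) - segment_size
    return [samples[start:start + segment_size]
            for start in range(0, last_start + 1, hop_size)]


if __name__ == "__main__":
    audio = list(range(10))  # stand-in for audio samples
    reference_segments = segment_samples(audio, segment_size=4, hop_size=4)
    sampled_segments = segment_samples(audio, segment_size=4, hop_size=2)
    # The sampled work uses the smaller hop size, so it produces more segments.
    print(len(reference_segments), len(sampled_segments))
```

    Using a smaller hop size for the sampled work than for the reference, as the abstract describes, gives more closely spaced segments and so more candidate alignments against the reference signatures.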
  • Patent number: 7558389
    Abstract: A method and apparatus utilizing prosody modification of a speech signal output by a text-to-speech (TTS) system to substantially prevent an interactive voice response (IVR) system from understanding the speech signal without significantly degrading the speech signal with respect to human understanding. The present invention involves modifying the prosody of the speech output signal by using the prosody of the user's response to a prompt. In addition, a randomly generated overlay frequency is used to modify the speech signal to further prevent an IVR system from recognizing the TTS output. The randomly generated frequency may be periodically changed using an overlay timer that changes the random frequency signal at predetermined intervals.
    Type: Grant
    Filed: October 1, 2004
    Date of Patent: July 7, 2009
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Joseph DeSimone
  • Patent number: 7555432
    Abstract: Audio steganography methods and apparatus using cepstral domain techniques to make embedded data in audio signals less perceivable. One approach defines a set of frames for a host audio signal, and, for each frame, determines a plurality of masked frequencies as spectral points with power level below a masking threshold for the frame. The two most commonly occurring masked frequencies f1 and f2 in the set of frames are selected, and a cepstrum of each frame is modified to produce complementary changes of the spectrum at f1 and f2 to correspond to a desired bit value. Another aspect of the invention involves determining a masking threshold for a frame, determining masked frequencies within the frame having a power level below the threshold, obtaining a cepstrum of a sinusoid at a selected masked frequency, and modifying the frame by an offset to correspond to an embedded data value, the offset derived from the cepstrum.
    Type: Grant
    Filed: February 10, 2006
    Date of Patent: June 30, 2009
    Assignee: Purdue Research Foundation
    Inventor: Kaliappan Gopalan
  • Publication number: 20090161882
    Abstract: A method of calculating an objective score (NOS) of the perceived quality of an audio signal degraded by the presence of noise and processed by a noise reducing function, said method comprising a preliminary step of obtaining a predefined test audio signal (x[m]) containing a wanted signal free of noise, a signal (xb[m]) affected by noise obtained by adding a predefined noise signal to the test signal (x[m]), and a processed signal (y[m]) obtained by applying the noise reducing function to the signal (xb[m]) affected by noise. This method includes a step (a5) of measuring distances (dYX(m,b)) between perceived loudness densities calculated for the processed signal (y[m]) and perceived loudness densities calculated for the test signal (x[m]); and a step (a6) of comparing said distances (dYX(m,b)) with masking thresholds (Smasking(m,b)) calculated for the test signal (x[m]) and/or the processed signal (y[m]).
    Type: Application
    Filed: December 8, 2006
    Publication date: June 25, 2009
    Inventors: Nicolas Le Faucher, Valerie Gautier-Turbin
  • Publication number: 20090157391
    Abstract: An audio fingerprint is extracted from an audio sample, where the fingerprint contains information that is characteristic of the content in the sample. The fingerprint may be generated by computing an energy spectrum for the audio sample, resampling the energy spectrum logarithmically in the time dimension, transforming the resampled energy spectrum to produce a series of feature vectors, and computing the fingerprint using differential coding of the feature vectors. The generated fingerprint can be compared to a set of reference fingerprints in a database to identify the original audio content.
    Type: Application
    Filed: February 24, 2009
    Publication date: June 18, 2009
    Inventor: Sergiy Bilobrov
  • Patent number: 7548855
    Abstract: An audio processing tool measures the quality of reconstructed audio data. For example, an audio encoder measures the quality of a block of reconstructed frequency coefficient data in a quantization loop. The invention includes several techniques and tools, which can be used in combination or separately. First, before measuring quality, the tool normalizes the block to account for variation in block sizes. Second, for the quality measurement, the tool processes the reconstructed data by critical bands, which can differ from the quantization bands used to compress the data. Third, the tool accounts for the masking effect of the reconstructed data, not just the masking effect of the original data. Fourth, the tool band weights the quality measurement, which can be used to account for noise substitution or band truncation. Finally, the tool changes quality measurement techniques depending on the channel coding mode.
    Type: Grant
    Filed: June 26, 2006
    Date of Patent: June 16, 2009
    Assignee: Microsoft Corporation
    Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee
  • Patent number: 7548850
    Abstract: An audio processing tool measures the quality of reconstructed audio data. For example, an audio encoder measures the quality of a block of reconstructed frequency coefficient data in a quantization loop. The invention includes several techniques and tools, which can be used in combination or separately. First, before measuring quality, the tool normalizes the block to account for variation in block sizes. Second, for the quality measurement, the tool processes the reconstructed data by critical bands, which can differ from the quantization bands used to compress the data. Third, the tool accounts for the masking effect of the reconstructed data, not just the masking effect of the original data. Fourth, the tool band weights the quality measurement, which can be used to account for noise substitution or band truncation. Finally, the tool changes quality measurement techniques depending on the channel coding mode.
    Type: Grant
    Filed: June 26, 2006
    Date of Patent: June 16, 2009
    Assignee: Microsoft Corporation
    Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee
  • Publication number: 20090150143
    Abstract: A post-filtering apparatus and method for speech enhancement in a modified discrete cosine transform (MDCT) domain are disclosed. In the apparatus and method, previous and current MDCT coefficients are used for obtaining a speech spectrum coefficient similar to a real speech spectrum, and a convex function is used for transforming the speech spectrum coefficient and obtaining a post-filter coefficient so that difference can increase in the case where the speech spectrum coefficient is small but decrease in the case where the coefficient is large. Then, the post-filter coefficient is applied to the MDCT coefficient. With this configuration, both the current and previous MDCT values are used, so that it is possible to obtain a spectrum coefficient similar to the real speech spectrum and to obtain a more accurate filter coefficient. Further, the coefficient is adaptively transformed through the convex function, thereby enhancing speech quality.
    Type: Application
    Filed: June 5, 2008
    Publication date: June 11, 2009
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hyun-woo Kim, Jong-mo Sung, Mi-suk Lee, Do-young Kim, Byung-sun Lee
  • Publication number: 20090138259
    Abstract: An inventive method for introducing information into a data stream including data about spectral values representing a short-term spectrum of an audio signal first performs a processing of the data stream to obtain the spectral values of the short-term spectrum of the audio signal. Apart from that, the information to be introduced is combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information is generated, which is then weighted with an established psychoacoustic maskable noise energy to generate a weighted information signal, wherein the energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal are then summed and afterwards processed again to obtain a processed data stream including both audio information and the information to be introduced.
    Type: Application
    Filed: February 5, 2009
    Publication date: May 28, 2009
    Applicant: Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.
    Inventors: Christian Neubauer, Juergen Herre, Karlheinz Brandenburg, Eric Allamanche