Psychoacoustic Patents (Class 704/200.1)
  • Patent number: 7539612
    Abstract: Techniques and tools for representing, coding, and decoding scale factor information are described herein. For example, during encoding of scale factors, an encoder uses one or more of flexible scale factor resolution selection, spatial prediction of scale factors, flexible prediction of scale factors, smoothing of noisy scale factor amplitudes, reordering of scale factor prediction residuals, and prediction of scale factor prediction residuals. Or, during decoding, a decoder uses one or more of flexible scale factor resolution selection, spatial prediction of scale factors, flexible prediction of scale factors, reordering of scale factor prediction residuals, and prediction of scale factor prediction residuals.
    Type: Grant
    Filed: July 15, 2005
    Date of Patent: May 26, 2009
    Assignee: Microsoft Corporation
    Inventors: Naveen Thumpudi, Wei-Ge Chen, Chao He
  • Publication number: 20090132239
    Abstract: Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.
    Type: Application
    Filed: January 26, 2009
    Publication date: May 21, 2009
    Inventors: Michael Finke, Detlef Koll
  • Publication number: 20090132238
    Abstract: An audio encoding system that accepts an audio signal as an input to the system. The system includes a filter bank that splits the audio signal into a plurality of frames, and a bit allocation unit that assigns a number of bits for a current frame of the plurality of frames. The system further includes a scale factor unit that calculates a scale factor, identifies a block type of a first block of a current frame, identifies a block type of a second block consecutive to the first block, and reuses a scale factor of the first block for the second block, when the block type of the first block and the block type of the second block match. The system additionally includes a quantization and coding unit that quantizes and codes the signal, and a bit rate checker that verifies whether a bit rate requirement is satisfied.
    Type: Application
    Filed: October 31, 2008
    Publication date: May 21, 2009
    Inventor: B. SUDHAKAR
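As a rough illustration of the scale factor reuse described in publication 20090132238, the Python sketch below walks consecutive blocks and copies the previous block's scale factor whenever the two block types match. The block layout, the function names, and the peak-magnitude scale factor are assumptions made for illustration; the abstract does not say how a scale factor is computed.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical block layout: (block_type, samples).
Block = Tuple[str, Sequence[float]]


def peak_scale_factor(samples: Sequence[float]) -> float:
    """Stand-in scale factor (assumption): peak magnitude of the block."""
    return max((abs(s) for s in samples), default=0.0)


def scale_factors_with_reuse(
    blocks: Sequence[Block],
    compute: Callable[[Sequence[float]], float] = peak_scale_factor,
) -> Tuple[List[float], int]:
    """Return one scale factor per block and the number actually computed."""
    factors: List[float] = []
    computed = 0
    prev_type: Optional[str] = None
    for block_type, samples in blocks:
        if factors and block_type == prev_type:
            # Block types match: reuse the previous block's scale factor.
            factors.append(factors[-1])
        else:
            factors.append(compute(samples))
            computed += 1
        prev_type = block_type
    return factors, computed
```

In this sketch the saving comes from skipping the scale factor computation for every block whose type repeats.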
  • Publication number: 20090125298
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Application
    Filed: November 3, 2008
    Publication date: May 14, 2009
    Applicant: Melodis Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Patent number: 7523039
    Abstract: A digital audio encoding method using an advanced psychoacoustic model is provided. The audio encoding method includes determining the type of a window according to the characteristic of an input audio signal; generating a complex modified discrete cosine transform (CMDCT) spectrum from the input audio signal according to the determined window type; generating a fast Fourier transform (FFT) spectrum from the input audio signal, by using the determined window type; and performing a psychoacoustic model analysis by using the generated CMDCT spectrum and FFT spectrum.
    Type: Grant
    Filed: September 2, 2003
    Date of Patent: April 21, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Mathew Manu
  • Publication number: 20090099843
    Abstract: A method for determining a speech quality measure of an output speech signal with respect to an input speech signal, wherein the input signal passes through a signal path of a data transmission system resulting in the output signal, includes the steps of pre-processing the output signal; determining at least one of an interruption rate of the pre-processed output signal and a measure for an intensity of musical tones present in the pre-processed output signal; and determining the speech quality measure from at least one of the interruption rate and the measure for the intensity of the musical tones.
    Type: Application
    Filed: September 11, 2008
    Publication date: April 16, 2009
    Applicants: DEUTSCHE TELEKOM AG, FRANCE TELECOM, TECHNISCHE UNIVERSITAET BERLIN
    Inventors: Vincent Barriac, Nicolas Cote, Valerie Gautier-Turbin, Sebastian Moeller, Alexander Raake, Marcel Waeltermann, Ulrich Heute, Kirstin Scholz
  • Patent number: 7516064
    Abstract: Analysis and synthesis filter banks such as those used in audio and video coding systems are each implemented by a hybrid transform that comprises a primary transform in cascade with one or more secondary transforms. The primary transforms for the filter banks implement an analysis/synthesis system in which time-domain aliasing artifacts are cancelled. The secondary transforms, which are in cascade with the primary transforms, are applied to blocks of transform coefficients. The length of the blocks is varied to adapt the time resolution of the analysis and synthesis filter banks.
    Type: Grant
    Filed: February 19, 2004
    Date of Patent: April 7, 2009
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Mark Stuart Vinton, Grant Allen Davidson
  • Publication number: 20090089049
    Abstract: Provided are a method of adaptively determining a quantization step according to a masking effect in a psychoacoustics model and a method of encoding/decoding an audio signal by using the determined quantization step. The method of adaptively determining a quantization step includes calculating a first ratio value indicating an intensity of an input audio signal with respect to a masking threshold; and determining the maximum value of the quantization step in a range in which noise generated when the audio signal is quantized is masked, according to the first ratio value. According to the present invention, quantization noise may be removed and the number of bits required to encode an audio signal may be reduced, by using auditory characteristics of humans.
    Type: Application
    Filed: September 25, 2008
    Publication date: April 2, 2009
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Han-gil MOON, Geon-hyoung LEE
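One way to picture the "maximum quantization step that keeps the noise masked" idea in publication 20090089049 is the sketch below. It assumes a uniform quantizer whose noise power is approximately step²/12 and treats the abstract's first ratio value as a signal-to-mask ratio in dB; both are assumptions for illustration and are not taken from the application itself.

```python
import math


def max_masked_step(signal_power: float, smr_db: float) -> float:
    """Largest uniform quantizer step whose noise stays at or below the mask.

    signal_power: linear power of the band being quantized.
    smr_db: signal-to-mask ratio in dB (signal power over masking threshold).
    """
    if signal_power <= 0.0:
        raise ValueError("signal_power must be positive")
    threshold_power = signal_power / (10.0 ** (smr_db / 10.0))
    # Assumed noise model: uniform quantization noise power ~ step**2 / 12.
    return math.sqrt(12.0 * threshold_power)


def quantize(value: float, step: float) -> float:
    """Round a value to the nearest multiple of the step."""
    return step * round(value / step)
```

A higher signal-to-mask ratio leaves less room for noise, so the permitted step shrinks; a coarser step needs fewer bits, which is where the saving comes from.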
  • Patent number: 7512536
    Abstract: Low-complexity synthesis filter bank for MPEG audio decoding uses a factoring of the 64×32 matrixing for the inverse-quantized subband coefficients. Factoring into non-standard 4-point discrete cosine and sine transforms, point-wise multiplications and combinations, and non-standard 8-point discrete cosine and sine transforms limits memory requirements and computational complexity.
    Type: Grant
    Filed: May 2, 2005
    Date of Patent: March 31, 2009
    Assignee: Texas Instruments Incorporated
    Inventor: Mohamed F. Mansour
  • Publication number: 20090076801
    Abstract: An inventive method for introducing information into a data stream including data about spectral values representing a short-term spectrum of an audio signal first performs a processing of the data stream to obtain the spectral values of the short-term spectrum of the audio signal. Apart from that, the information to be introduced are combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information is generated which will then be weighted with an established psychoacoustic maskable noise energy to generate a weighted information signal, wherein the energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal will then be summed and afterwards processed again to obtain a processed data stream including both audio information and information to be introduced.
    Type: Application
    Filed: September 25, 2008
    Publication date: March 19, 2009
    Inventors: Christian Neubauer, Juergen Herre, Karlheinz Brandenburg, Eric Allamanche
  • Patent number: 7505897
    Abstract: The subject matter includes systems, engines, and methods for generalizing a class of Lempel-Ziv algorithms for lossy compression of multimedia. One implementation of the subject matter compresses audio signals. Because music, especially electronically generated music, has a substantial level of repetitiveness within a single audio clip, the basic Lempel-Ziv compression technique can be generalized to support representing a single window of an audio signal using a linear combination of filtered past windows. Exemplary similarity searches and filtering strategies for finding the past windows are described.
    Type: Grant
    Filed: January 27, 2005
    Date of Patent: March 17, 2009
    Assignee: Microsoft Corporation
    Inventors: Darko Kirovski, Zeph Landau
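The similarity search in patent 7505897 can be sketched in a much simplified form: instead of a linear combination of several filtered past windows, the code below looks for a single past window and a single scalar gain that best approximate the current window in the least-squares sense. The function name and return layout are hypothetical, and the exhaustive search stands in for whatever search strategy the patent actually describes.

```python
from typing import Optional, Sequence, Tuple


def best_past_match(
    history: Sequence[float], window: Sequence[float]
) -> Tuple[Optional[int], float, float]:
    """Return (start offset, gain, residual energy) of the best past window.

    With no usable past window the result is (None, 0.0, window energy).
    """
    n = len(window)
    best_start: Optional[int] = None
    best_gain = 0.0
    best_residual = sum(w * w for w in window)
    for start in range(len(history) - n + 1):
        past = history[start:start + n]
        energy = sum(p * p for p in past)
        if energy == 0.0:
            continue  # an all-zero window cannot be scaled to match anything
        # Least-squares gain for approximating `window` by gain * past.
        gain = sum(p * w for p, w in zip(past, window)) / energy
        residual = sum((w - gain * p) ** 2 for p, w in zip(past, window))
        if residual < best_residual:
            best_start, best_gain, best_residual = start, gain, residual
    return best_start, best_gain, best_residual
```

An encoder built on this idea would transmit the offset and gain, plus a coded residual when the match is not close enough.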
  • Publication number: 20090067644
    Abstract: Measuring the loudness of audio encoded in a bitstream that includes data from which an approximation of the power spectrum of the audio can be derived without fully decoding the audio is performed by deriving the approximation of the power spectrum of the audio from said bitstream without fully decoding the audio, and determining an approximate loudness of the audio in response to the approximation of the power spectrum of the audio. The data may include coarse representations of the audio and associated finer representations of the audio, the approximation of the power spectrum of the audio being derived from the coarse representations of the audio. In the case of subband encoded audio, the coarse representations of the audio may comprise scale factors and the associated finer representations of the audio may comprise sample data associated with each scale factor.
    Type: Application
    Filed: March 23, 2006
    Publication date: March 12, 2009
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Brett Graham Crockett, Michael John Smithers, Alan Jeffrey Seefeldt
  • Publication number: 20090063137
    Abstract: A method and an apparatus of a low-complexity psychoacoustic model applicable for advanced audio coding encoders use a modified discrete cosine transform based (MDCT-based) psychoacoustic model and a simplified look-up table to compute the MDCT-based psychoacoustic model by a logarithm based logarithmic method to simplify the computational complexity, and then computing a quantization loop (Q loop) by the logarithm based logarithmic method to further reduce the computational quantity of the MDCT-based psychoacoustic model, so as to achieve the real-time playback effect by a very low operating frequency.
    Type: Application
    Filed: October 9, 2007
    Publication date: March 5, 2009
    Inventors: Tsung-Han TSAI, Shih-Way Huang, Jia-Her Luo
  • Patent number: 7499852
    Abstract: Coding (1) of an audio signal is provided including estimating (110) a position of a transient signal component in the audio signal, matching (111,112) a shape function on the transient signal component in case the transient signal component is gradually declining after an initial increase, which shape function has a substantially exponential initial behavior and a substantially logarithmic declining behavior; and including (15) the position and shape parameters describing the shape function in an audio stream (AS).
    Type: Grant
    Filed: April 27, 2005
    Date of Patent: March 3, 2009
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Arnoldus Werner Johannes Oomen, Albertus Cornelis Den Brinker
  • Publication number: 20090048826
    Abstract: Provided are an encoding method and apparatus for efficiently encoding a sinusoidal signal whose magnitude is less than a masking value according to a psychoacoustic model, a decoding method and apparatus for decoding an encoded sinusoidal signal, and a computer-readable recording medium having recorded thereon a program for executing the encoding method/the decoding method. By using a particular code indicating that the magnitude of a first sinusoidal signal is less than a masking value according to a psychoacoustic model to encode the first sinusoidal signal, difference coding for a third sinusoidal signal of a next frame, which is connected to the first sinusoidal signal, is performed using a sinusoidal signal or sinusoidal signals selected according to a method to use the particular code, and a decoding apparatus obtains a sum with a transmitted difference using the selected sinusoidal signal(s).
    Type: Application
    Filed: June 2, 2008
    Publication date: February 19, 2009
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-suk LEE, Geon-hyoung LEE, Chul-woo LEE, Han-gil MOON
  • Publication number: 20090037166
    Abstract: An audio encoding method previously estimates better initial iterative values of global-gain and scalefactor for avoiding heavy calculation. The estimating process of the encoding method includes calculating the bit allocation of one frequency sample based on a sampling rate, a bit rate, and the number of audio channels according to an input frame, and the psychoacoustic model, searching one frequency sample having the greatest sample energy in each of a plurality of scalefactor bands, quantizing the frequency sample to comply with the bit allocation and to generate a corresponding scalefactor, searching a maximum scalefactor of all scalefactor bands corresponding to the input frame, and setting initial values of scalefactors and an initial value of global-gain for the quantization iterative loop process according to the corresponding scalefactor and the maximum scalefactor.
    Type: Application
    Filed: July 30, 2008
    Publication date: February 5, 2009
    Inventor: Wen-Haw Wang
  • Patent number: 7483836
    Abstract: The invention concerns audio coding methods and particularly relates to an efficient means by which selected frequency bands of information from an original audio signal which are audible but which are perceptually less relevant need not be encoded, but may be replaced by a noise filling parameter. Those signal bands having content which is perceptually more relevant are, in contrast fully encoded. Encoding bits may be saved in this manner, without leaving voids in the frequency spectrum of the received signal. In this way, this method avoids the annoying bandwidth switching artefacts that can occur when full bandwidth audio is encoded with a bit budget which is too low to represent the signal within each frequency band. Thus, this method allows an increase in the encoded audio bandwidth without introducing annoying bandwidth switching artefacts.
    Type: Grant
    Filed: May 6, 2002
    Date of Patent: January 27, 2009
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Rakesh Taori, Steven Leonardus Josephus Dimphina Elisabeth Van De Par
  • Patent number: 7480692
    Abstract: A method of communicating with an electronic device. The method includes providing an electronic device having an audible sound receiving and generating sub-system including a microphone, transmitting from a source at least one acoustic signal encoded with information, receiving said at least one acoustic signal by said microphone and determining a spatial position, distance or movement of the microphone relative to the source, responsive to the received at least one signal.
    Type: Grant
    Filed: January 25, 2006
    Date of Patent: January 20, 2009
    Assignee: BeepCard Inc.
    Inventors: Alon Atsmon, Amit Antebi, Nathan Altman, Zvi Lev, Moshe Cohen
  • Publication number: 20090006081
    Abstract: Provided are a method and apparatus for encoding or decoding an audio signal or a speech signal. In the encoding method, encoding is performed by performing domain transformation on a received signal in units of frequency bands by applying a psychoacoustic model, encoding the transformation result with respect to predetermined one or more frequency bands by using a high temporal resolution coding tool, and then quantizing the encoding result. In the decoding method, decoding is performed by inversely quantizing signals obtained by encoding in units of frequency bands, decoding one or more signals from among the inversely quantized signals, which are allocated to one or more frequency bands which have a predetermined domain resolution, determined by applying the psychoacoustic model, that is greater than a predetermined value, according to a predetermined method, and then inversely transforming either the inversely quantized or the one or more decoded signals.
    Type: Application
    Filed: February 19, 2008
    Publication date: January 1, 2009
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Eun-mi OH, Ho-sang Sung, Ki-hyun Choo, Jung-hoe Kim, Mi-young Kim
  • Patent number: 7472056
    Abstract: A transcoder for use between speech codecs using different Code-Excited Linear Prediction (CELP) type and a method therefor are disclosed. The transcoder includes a decoding unit of an input CELP codec, a transcoding filter, a transcoding filter design unit, and an encoding unit of an output CELP codec. By substituting a post-filter and a perceptual weighting filter of a prior art with one transcoding filter, the calculation amount of the transcoder is reduced, and speech quality decoded at a receiving end is improved.
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: December 30, 2008
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jongmo Sung, Hyun Woo Kim, Do Young Kim, Jin Kyu Choi, Sung Wan Yoon, Hong Goo Kang, Ki Seung Lee, Dae Hee Youn
  • Publication number: 20080319739
    Abstract: A multi-channel audio decoder provides a reduced complexity processing to reconstruct multi-channel audio from an encoded bitstream in which the multi-channel audio is represented as a coded subset of the channels along with a complex channel correlation matrix parameterization. The decoder translates the complex channel correlation matrix parameterization to a real transform that satisfies the magnitude of the complex channel correlation matrix. The multi-channel audio is derived from the coded subset of channels via channel extension processing using a real value effect signal and real number scaling.
    Type: Application
    Filed: June 22, 2007
    Publication date: December 25, 2008
    Applicant: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Wei-Ge Chen
  • Patent number: 7469208
    Abstract: Automatically normalizing a perceived loudness for a digitally encoded audio track formed of a number of channels during playback on a multimedia asset player is described. A number of auditory selectivity frequency bands are selected and for each channel in the track, a power value for each of the number of selectivity frequency bands is computed. Each of the power values is weighted by a sensitivity weighting factor and a sum value of all the weighted power values is then calculated. For the track, a perceived acoustic power value is calculated based upon the sum value for each of the channels and a normalization gain factor based upon the perceived acoustic power is calculated and associated with the track. During playback, the normalization gain factor is applied to the track.
    Type: Grant
    Filed: May 12, 2006
    Date of Patent: December 23, 2008
    Assignee: Apple Inc.
    Inventor: William S. Kincaid
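The loudness normalization steps in patent 7469208 (per-band power, sensitivity weighting, per-channel sum, track-level gain) can be sketched as follows. The band powers are taken as given, the channel sums are combined by averaging, and the gain is an amplitude factor toward a target power; all three choices, like the function names, are assumptions made for illustration.

```python
import math
from typing import Sequence


def perceived_power(
    channel_band_powers: Sequence[Sequence[float]], weights: Sequence[float]
) -> float:
    """Weight each band power, sum per channel, then average over channels."""
    sums = [
        sum(w * p for w, p in zip(weights, bands))
        for bands in channel_band_powers
    ]
    return sum(sums) / len(sums)


def normalization_gain(perceived: float, target: float) -> float:
    """Amplitude gain that moves the perceived power to the target power."""
    if perceived <= 0.0:
        return 1.0  # silent track: leave it unchanged
    return math.sqrt(target / perceived)
```

The gain would be computed once per track, stored alongside it, and applied to the samples at playback time.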
  • Patent number: 7464027
    Abstract: Quantizing an information signal of a sequence of information values includes frequency-selective filtering the sequence of information values to obtain a sequence of filtered information values and quantizing the filtered information values to obtain a sequence of quantized information values by means of a quantizing step function which maps the filtered information values to the quantized information values and the course of which is steeper below a threshold information value than above the threshold information value.
    Type: Grant
    Filed: July 27, 2006
    Date of Patent: December 9, 2008
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Gerald Schuller, Stefan Wabnik, Jens Hirschfeld, Wolfgang Fiesel
  • Patent number: 7461002
    Abstract: A method for time aligning audio signals, wherein one signal has been derived from the other or both have been derived from another signal, comprises deriving reduced-information characterizations of the audio signals based on auditory scene analysis. The time offset of one characterization with respect to the other characterization is calculated and the temporal relationship of the audio signals with respect to each other is modified in response to the time offset such that the audio signals are coincident with each other. These principles may also be applied to a method for time aligning a video signal and an audio signal that will be subjected to differential time offsets.
    Type: Grant
    Filed: February 25, 2002
    Date of Patent: December 2, 2008
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Brett G. Crockett, Michael J. Smithers
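The offset calculation in patent 7461002 can be illustrated with a plain cross-correlation over two reduced-information characterizations. The sketch assumes each characterization is a short numeric sequence (for example, 1 where an auditory event boundary was detected and 0 elsewhere); that representation and the function name are hypothetical.

```python
from typing import Sequence


def best_offset(
    reference: Sequence[float], other: Sequence[float], max_lag: int
) -> int:
    """Return the lag (in frames) at which `other` best matches `reference`.

    A positive result means `other` is delayed relative to `reference`.
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Once the lag is known, one signal can be shifted by the corresponding number of samples so that the two are aligned.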
  • Patent number: 7460990
    Abstract: Traditional audio encoders may conserve coding bit-rate by encoding fewer than all spectral coefficients, which can produce a blurry low-pass sound in the reconstruction. An audio encoder using wide-sense perceptual similarity improves the quality by encoding a perceptually similar version of the omitted spectral coefficients, represented as a scaled version of already coded spectrum. The omitted spectral coefficients are divided into a number of sub-bands. The sub-bands are encoded as two parameters: a scale factor, which may represent the energy in the band; and a shape parameter, which may represent a shape of the band. The shape parameter may be in the form of a motion vector pointing to a portion of the already coded spectrum, an index to a spectral shape in a fixed code-book, or a random noise vector. The encoding thus efficiently represents a scaled version of a similarly shaped portion of spectrum to be copied at decoding.
    Type: Grant
    Filed: June 29, 2004
    Date of Patent: December 2, 2008
    Assignee: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Wei-Ge Chen
  • Patent number: 7460993
    Abstract: A transform coder adaptively configures window sizes for transform coding in a two-pass process to maximize coding efficiency, while achieving necessary time resolution to avoid pre-echo. In a first pass, the coder places small size windows over detected transient regions of an input signal in an open-loop window configuration process. In a second pass, the coder adjusts the window size configuration according to measurements of the achieved quality in a closed-loop window configuration process. Where quality measurement shows unacceptable quantization noise, the coder increases window size. Where pre-echo is detected, the coder reduces window size within coding bit rate constraints.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: December 2, 2008
    Assignee: Microsoft Corporation
    Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee
  • Patent number: 7457742
    Abstract: A maximum of Nmax bits for encoding is defined for a set of parameters which may be calculated from a signal frame. The parameters for a first sub-set are calculated and encoded with N0 bits, where N0<Nmax. The allocation of Nmax−N0 encoding bits for the parameters of a second sub-set are determined and the encoding bits allocated to the parameters for the second sub-set are classified. The allocation and/or order of classification of the encoding bits are determined as a function of the encoding parameters for the first sub-set. For a total of N available bits for the encoding of the total parameters (N0<N≤Nmax), the parameters for the second sub-set allocated the N−N0 encoding bits classified the first in said order are selected. Said selected parameters are calculated and encoded to give the N−N0 bits. The N0 encoding bits for the first sub-set and the N−N0 encoding bits for the selected parameters for the second sub-set are finally introduced into the output sequence of the encoder.
    Type: Grant
    Filed: December 22, 2003
    Date of Patent: November 25, 2008
    Assignee: France Telecom
    Inventors: Balazs Kovesi, Dominique Massaloux
  • Patent number: 7454327
    Abstract: An inventive method for introducing information into a data stream including data about spectral values representing a short-term spectrum of an audio signal first performs a processing of the data stream to obtain the spectral values of the short-term spectrum of the audio signal. Apart from that, the information to be introduced are combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information is generated which will then be weighted with an established psychoacoustic maskable noise energy to generate a weighted information signal, wherein the energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal will then be summed and afterwards processed again to obtain a processed data stream including both audio information and information to be introduced.
    Type: Grant
    Filed: October 5, 2000
    Date of Patent: November 18, 2008
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Christian Neubauer, Juergen Herre, Karlheinz Brandenburg, Eric Allamanche
  • Patent number: 7447317
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information are calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder, which, in case of a low level decoder only decodes the first and second downmix channels or, in case of a high level decoder provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: October 2, 2003
    Date of Patent: November 4, 2008
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Jürgen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hölzer, Claus Spenger
  • Publication number: 20080270123
    Abstract: The present invention discloses means and method for indicating emotional attitudes of a speaker, either human or animal, according to voice intonation. The invention also discloses a method for advertising, marketing, educating, or lie detecting by indicating emotional attitudes of a speaker and a method of providing remote service by a group comprising at least one observer to at least one speaker. The invention also discloses a system for indicating emotional attitudes of a speaker comprising a glossary of intonations relating intonations to emotional attitudes.
    Type: Application
    Filed: December 20, 2006
    Publication date: October 30, 2008
    Inventors: Yoram Levanon, Lan Lossos
  • Patent number: 7444280
    Abstract: A sound processor including a microphone (1), a pre-amplifier (2), a bank of N parallel filters (3), means for detecting short-duration transitions in the envelope signal of each filter channel, and means for applying gain to the outputs of these filter channels in which the gain is related to a function of the second-order derivative of the slow-varying envelope signal in each filter channel, to assist in perception of low-intensity short-duration speech features in said signal.
    Type: Grant
    Filed: January 18, 2007
    Date of Patent: October 28, 2008
    Assignee: Cochlear Limited
    Inventors: Andrew Vandali, Graeme M. Clark
  • Patent number: 7437299
    Abstract: A method of encoding a multichannel signal, such as a stereophonic audio signal, including at least first and second signal components includes transforming at least the first and second signal components by a predetermined transformation into a principal signal including most of the signal energy and at least one residual signal including less energy than the principal signal. The predetermined transformation is parameterized by at least one transformation parameter. The method further includes representing the multichannel signal at least by the principal signal and the transformation parameter.
    Type: Grant
    Filed: March 20, 2003
    Date of Patent: October 14, 2008
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Ronaldus Maria Aarts, Roy Irwan
  • Patent number: 7433817
    Abstract: The present invention proposes a new method and a new apparatus for enhancement of audio source coding systems utilising high frequency reconstruction (HFR). It utilises adaptive filtering to reduce artifacts due to different tonal characteristics in different frequency ranges of an audio signal upon which HFR is performed. The present invention is applicable to both speech coding and natural audio coding systems.
    Type: Grant
    Filed: October 12, 2005
    Date of Patent: October 7, 2008
    Assignee: Coding Technologies AB
    Inventors: Kristofer Kjörling, Per Ekstrand, Fredrik Henn, Lars Villemoes
  • Patent number: 7428488
    Abstract: A received voice processing apparatus is provided, in which the received voice processing apparatus includes: a target spectrum calculation part for calculating, for each frequency band, a target spectrum on the basis of a compression ratio for a voice spectrum; a gain calculation part for calculating a gain value for amplifying the voice spectrum to the target spectrum; a filter coefficient calculation part for calculating a filter coefficient from the gain value; and a filter part for processing a received voice signal by using the filter coefficient.
    Type: Grant
    Filed: January 16, 2003
    Date of Patent: September 23, 2008
    Assignee: Fujitsu Limited
    Inventor: Mutsumi Saito
  • Patent number: 7426462
    Abstract: The largest absolute value (LAV) is determined within a group of data. Based on the LAV, a difference table is identified that is to be used for selecting an optimal Huffman codebook for the group of data. The difference table is associated with two Huffman codebooks. Further, one or more indexes are calculated for the group of data using an expression associated with the two Huffman codebooks, and a size difference value is determined for the group of data using the calculated indexes and the difference table. Based on the determined size difference value, the optimal Huffman codebook is selected from the above two codebooks.
    Type: Grant
    Filed: September 29, 2003
    Date of Patent: September 16, 2008
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Jeongnam Young, Sundararam Dwarakanath, Cheung Auyeung, Takao Yamazaki
  • Publication number: 20080221875
    Abstract: The present invention relates to a method for encoding an audio signal. In a first embodiment a model relating to temporal masking of sound provided to a human ear is provided. A temporal masking index is determined in dependence upon a received audio signal and the model using a forward and a backward masking function. Using a psychoacoustic model a masking threshold is determined in dependence upon the temporal masking index. Finally, the audio signal is encoded in dependence upon the masking threshold. The method has been implemented using the MPEG-1 psychoacoustic model 2. Semiformal listening tests showed that, using the method for encoding an audio signal according to the present invention, the high subjective quality of the decoded compressed sounds was maintained while the bit rate was reduced by approximately 10%. In a second embodiment, the inharmonic structure of audio signals is modeled and incorporated into the MPEG-1 psychoacoustic model 2.
    Type: Application
    Filed: May 19, 2008
    Publication date: September 11, 2008
    Applicant: Her Majesty in Right of Canada as Represented by the Minister of Industry
    Inventors: Hossein Najaf-Zadeh, Hassan Lahdili, Louis Thibault, William Treurniet
  • Patent number: 7412375
    Abstract: A method and apparatus for assessing the perceptual quality of stereo speech signals transmitted via a telecommunications network and recorded acoustically from an acoustic terminal device, in which a mono reference signal comprising a single channel is aligned with a degraded stereo signal comprising a left and a right channel; a delay between each channel of said degraded signal and said reference signal is estimated; a noise masking indicator is generated in dependence upon said estimated delays; the level of the stereo signals is adjusted in dependence upon said noise masking indicator; a set of perceptually relevant parameters is generated for each of said reference and degraded signals; the perceptually relevant parameters of the reference signal are compared with the perceptually relevant parameters of the degraded signal to generate a disturbance profile; and a speech quality prediction is generated in dependence upon said disturbance profile.
    Type: Grant
    Filed: June 22, 2004
    Date of Patent: August 12, 2008
    Assignee: Psytechnics Limited
    Inventors: Tom Goldstein, Paul Alexander Barrett, Antony William Rix
  • Publication number: 20080177532
    Abstract: A method for improving the intelligibility of an incoming telephone signal, including boosting loudness of at least one band of poorly heard frequencies of the signal within at least one band of intensities of the signal, the band lying below a predetermined intensity level at which telephone standard conformance testing is performed, thereby to generate a differentially boosted telephone signal.
    Type: Application
    Filed: January 22, 2007
    Publication date: July 24, 2008
    Applicant: D.S.P. Group Ltd.
    Inventors: Israel Greiss, Arie Gur
  • Patent number: 7398204
    Abstract: The present invention relates to a method for encoding an audio signal. In a first embodiment a model relating to temporal masking of sound provided to a human ear is provided. A temporal masking index is determined in dependence upon a received audio signal and the model using a forward and a backward masking function. Using a psychoacoustic model a masking threshold is determined in dependence upon the temporal masking index. Finally, the audio signal is encoded in dependence upon the masking threshold. The method has been implemented using the MPEG-1 psychoacoustic model 2. Semiformal listening tests showed that, using the method for encoding an audio signal according to the present invention, the high subjective quality of the decoded compressed sounds was maintained while the bit rate was reduced by approximately 10%. In a second embodiment, the inharmonic structure of audio signals is modeled and incorporated into the MPEG-1 psychoacoustic model 2.
    Type: Grant
    Filed: August 26, 2003
    Date of Patent: July 8, 2008
    Assignee: Her Majesty in Right of Canada as Represented by the Minister of Industry
    Inventors: Hossein Najaf-Zadeh, Hassan Lahdili, Louis Thibault, William Treurniet
  • Publication number: 20080162119
    Abstract: An acoustic signal is subjected to filtration whereby low frequency sounds such as respiration are removed. Intense acoustic sounds such as coughing are also removed, and ultrasonic carrier modulation and demodulation is also performed to increase the saliency of speech sounds. By removing non-speech sounds from an acoustic signal comprising speech, a method is disclosed for improving the functioning of devices such as speech recognition machinery. Devices for implementing such techniques are also disclosed.
    Type: Application
    Filed: January 3, 2008
    Publication date: July 3, 2008
    Inventor: Martin L. Lenhardt
  • Patent number: 7395210
    Abstract: A system and method for lossless and/or progressive to lossless data coding (e.g., audio and/or image) is provided. The system and method employ a multiple factorization reversible transform component that provides quantized coefficients based, at least in part, upon a multiple factorization reversible transform. The multiple factorization reversible transform component can employ an N-point modulated lapped transform in accordance with one aspect of the present invention. The multiple factorization reversible transform component can comprise a modulation stage, a pre-FFT rotation stage, a complex FFT stage and a post-FFT rotation stage.
    Type: Grant
    Filed: November 21, 2002
    Date of Patent: July 1, 2008
    Assignee: Microsoft Corporation
    Inventor: Jin Li
  • Patent number: 7395211
    Abstract: A method of modifying the operation of the encoder function and/or the decoder function of a perceptual coding system in accordance with supplemental information, such as a watermark, so that the supplemental information may be detectable in the output of the decoder function. One or more parameters are modulated in the encoder function and/or the decoder function in response to the supplemental information.
    Type: Grant
    Filed: August 15, 2001
    Date of Patent: July 1, 2008
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Matthew Aubrey Watson, Michael Mead Truman, Stephen Decker Vernon, Brett Graham Crockett
  • Patent number: 7391877
    Abstract: Optimal head related transfer function spatial configurations designed to maximize speech intelligibility in multi-talker speech displays by spatially separating competing speech channels combined with a method of normalizing the relative levels of the different talkers in a multi-talker speech display that improves overall performance even in conventional multi-talker spatial configurations.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: June 24, 2008
    Assignee: United States of America as represented by the Secretary of the Air Force
    Inventor: Douglas S. Brungart
  • Publication number: 20080133223
    Abstract: A method and apparatus to extract an important frequency component of an audio signal and a method and apparatus to encode and/or decode an audio signal by using the same. The method of extracting an important frequency component of an audio signal includes converting an audio signal of a time domain into an audio signal of a frequency domain, selecting a frequency band having a harmonic feature from the converted audio signal of the frequency domain, and extracting an important frequency component from the selected frequency band having the harmonic feature.
    Type: Application
    Filed: October 31, 2007
    Publication date: June 5, 2008
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Chang-yong SON, Bun-mi Oh, Ho-sang Sung, Ki-hyun Choo, Jung-hue Kim, Kang-eun Lee
  • Patent number: 7383297
    Abstract: A method of communicating with an electronic device, comprising: providing a computer having an audible sound receiving and generating sub-system including a microphone; transmitting from a source at least one ultrasonic acoustic signal, encoded with information to the computer; and receiving said at least one signal by said microphone, to be detected by said computer. Preferably, the ultrasonic frequency used is below 50 kHz.
    Type: Grant
    Filed: October 1, 1999
    Date of Patent: June 3, 2008
    Assignee: BeepCard Ltd.
    Inventors: Alon Atsmon, Amit Antebi, Nathan Altman, Zvi Lev, Moshe Cohen
  • Patent number: 7376553
    Abstract: An apparatus for signal processing based on an algorithm for representing harmonics in a fractal lattice. The apparatus includes a plurality of tuned segments, each tuned segment including a transceiver having an intrinsic resonant frequency, the amplitude of the resonant frequency being capable of being modified by either receiving an external input signal, or by internally generating a response to an applied feedback signal. A plurality of signal processing elements are arranged in an array pattern, the signal processing elements including at least one function selected from the group including buffers for storing information, a feedback device for generating a feedback signal, a controller for controlling an output signal, a connection circuit for connecting the plurality of tuned segments to signal processing elements, and a feedback connection circuit for conveying signals from the plurality of signal processing elements in the array to the tuned segments.
    Type: Grant
    Filed: July 8, 2004
    Date of Patent: May 20, 2008
    Inventor: Robert Patel Quinn
  • Patent number: 7373293
    Abstract: A method and apparatus for shaping quantization noise generated when compressing audio data at a low bit rate is disclosed. A predetermined quantization noise threshold allowed during quantization of sampled audio data and quantization noise energy information of a quantized MDCT coefficient are received in all frequency bands of an audio frequency. The quantization noise energy of the quantized MDCT coefficient is attenuated in a predetermined number of frequency bands in which a difference between the predetermined quantization noise threshold and the quantization noise energy of the quantized MDCT coefficient is large.
    Type: Grant
    Filed: November 25, 2003
    Date of Patent: May 13, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Tae-gyu Chang, Heung-yeop Jang
  • Publication number: 20080109215
    Abstract: High frequency components of audio signals are reconstructed from the aspects of envelope and fine detail. The envelopes of the high frequency components are found through linear extrapolation of signals with frequencies lower than a cutoff frequency point. One method of reconstructing high frequency components is based on the linear extrapolation on the logarithm scale magnitudes of the transform coefficients of the audio signal in a frequency domain. The linear extrapolation is a linear approximation based on minimizing least squares of the logarithm scale magnitudes of the transform coefficients of the low frequency components. Another method is based on the linear extrapolation on the logarithm scale magnitudes of the envelope elements of the filterbank signals of the audio signal over a time segment. The linear extrapolation is a linear approximation based on minimizing least squares of the logarithm scale magnitudes of the envelope elements of the low frequency filterbank signals.
    Type: Application
    Filed: June 26, 2006
    Publication date: May 8, 2008
    Inventors: Chi-min Liu, Wen-chieh Lee, Han-Wen Hsu
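    Illustration (not part of the patent record): the abstract above describes estimating a high-frequency envelope by a least-squares linear fit to log-scale magnitudes below a cutoff, then extending that line. A minimal sketch of that idea follows, assuming strictly positive magnitudes indexed by bin number; the function name `extrapolate_envelope` and its parameters are hypothetical, and the fine-detail reconstruction mentioned in the abstract is not modeled.

```python
import math

def extrapolate_envelope(low_mags, n_high):
    """Fit log-magnitude against bin index by least squares, then extend the line.

    low_mags: magnitudes below the cutoff (at least two, all > 0; an assumption).
    n_high:   number of bins above the cutoff to estimate.
    """
    n = len(low_mags)
    xs = list(range(n))
    ys = [math.log(m) for m in low_mags]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form ordinary least squares for a straight line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Evaluate the fitted line at the bins above the cutoff and undo the log.
    return [math.exp(intercept + slope * k) for k in range(n, n + n_high)]
```

    For a spectrum whose magnitude halves at every bin, the fitted line continues the same decay into the estimated bins.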
  • Publication number: 20080103763
    Abstract: Samples of a component having a frequency less than a predetermined frequency in an input audio signal that is a digital signal having a predetermined sampling frequency are written in a memory. A harmonic-overtone signal having a frequency N times a frequency of the input audio signal is generated by repeating an operation N times, where N is an integer more than one, the operation including reading one sample and thinning out (N-1) samples for every N samples from the memory within each cycle period from a first one-direction zero-crossing point to a second one-direction zero-crossing point subsequent to the first one-direction zero-crossing point, each one-direction zero-crossing point being a point at which a level of the input audio signal changes from negative to positive or a point at which the level of the input audio signal changes from positive to negative.
    Type: Application
    Filed: October 24, 2007
    Publication date: May 1, 2008
    Applicant: Sony Corporation
    Inventors: Masaru Shimura, Kazunobu Ohkuri, Taro Nakagami
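    Illustration (not part of the patent record): the abstract above describes building an overtone at N times the input frequency by keeping one sample out of every N within each cycle between same-direction zero crossings, and repeating the thinned cycle N times. A rough sketch under simplifying assumptions follows: only negative-to-positive crossings are used, the low-pass step and memory handling are omitted, and the names `rising_zero_crossings` and `harmonic_overtone` are hypothetical.

```python
def rising_zero_crossings(x):
    """Indices i where the signal goes from negative to non-negative."""
    return [i for i in range(1, len(x)) if x[i - 1] < 0 <= x[i]]

def harmonic_overtone(x, n):
    """Generate a signal at roughly n times the frequency of x (n > 1)."""
    crossings = rising_zero_crossings(x)
    out = []
    # Process each full cycle between consecutive rising crossings.
    for start, end in zip(crossings, crossings[1:]):
        cycle = x[start:end]
        thinned = cycle[::n]      # keep one sample, skip the next n - 1
        for _ in range(n):        # repeat so the cycle length is roughly preserved
            out.extend(thinned)
    return out
```

    Samples before the first crossing and after the last one are dropped in this sketch, so the output is shorter than the input.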
  • Patent number: RE40691
    Abstract: An audio type signal is encoded. The signal is first divided into bands. For each band, a yardstick signal element is selected. The yardstick may be the signal element having the largest magnitude in the band, the second largest, closest to the median magnitude, or having some other selected magnitude. This magnitude is used for various purposes, including assigning bits to the different bands, and for establishing reconstruction levels within a band. The magnitude of non-yardstick signal elements is also quantized. The encoded signal is also decoded. Apparatus for both encoding and decoding are also disclosed. The location of the yardstick element within its band may also be recorded and encoded, and used for efficiently allocating bits to non-yardstick signal elements. Split bands may be established, such that each split band includes a yardstick signal element and each full band includes a major and a minor yardstick signal element.
    Type: Grant
    Filed: June 17, 1999
    Date of Patent: March 31, 2009
    Assignee: Massachusetts Institute of Technology
    Inventor: Jae S. Lim