Psychoacoustic Patents (Class 704/200.1)
  • Patent number: 7136492
    Abstract: A method for the visualization of the hearing capacity of a person. As a function of audiogram data, text is modified in any of several characteristics, including brightness, contrast, sharpness, and/or omission of letters.
    Type: Grant
    Filed: July 11, 2002
    Date of Patent: November 14, 2006
    Assignee: Phonak AG
    Inventor: Jonathan Möller
  • Patent number: 7117145
    Abstract: A cabin communication system for improving clarity of a voice spoken within an interior cabin having ambient noise includes a microphone for receiving the spoken voice and the ambient noise and for converting the spoken voice and the ambient noise into an audio signal, the audio signal having a first component corresponding to the spoken voice and a second component corresponding to the ambient noise, a speech enhancement filter for removing the second component from the audio signal to provide a filtered audio signal, the speech enhancement filter removing the second component by processing the audio signal by a method taking into account elements of psycho-acoustics of a human ear, and a loudspeaker for outputting a clarified voice in response to the filtered audio signal.
    Type: Grant
    Filed: October 19, 2000
    Date of Patent: October 3, 2006
    Assignee: Lear Corporation
    Inventors: Saligrama R. Venkatesh, Alan M. Finn
  • Patent number: 7117146
    Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codec are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech. The overall quality of the system is strongly related to the excitation. In order to enhance the excitation, the system contains a fixed codebook comprising several subcodebooks.
    Type: Grant
    Filed: August 27, 2001
    Date of Patent: October 3, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Patent number: 7110941
    Abstract: The embedded audio coder (EAC) is a fully scalable psychoacoustic audio coder which uses a novel perceptual audio coding approach termed “implicit auditory masking” which is intermixed with a scalable entropy coding process. When encoding and decoding an audio file using the EAC, auditory masking thresholds are not sent to a decoder. Instead, the masking thresholds are automatically derived from already coded coefficients. Furthermore, in one embodiment, rather than quantizing the audio coefficients according to the auditory masking thresholds, the masking thresholds are used to control the order that the coefficients are encoded. In particular, in this embodiment, during the scalable coding, larger audio coefficients are encoded first, as the larger components are the coefficients that contribute most to the audio energy level and lead to a higher auditory masking threshold.
    Type: Grant
    Filed: March 28, 2002
    Date of Patent: September 19, 2006
    Assignee: Microsoft Corporation
    Inventor: Jin Li
  • Patent number: 7107212
    Abstract: A data processing apparatus for data processing an audio signal includes an input terminal (1) for receiving the audio signal, a 1-bit A/D converter (4) for A/D converting the audio signal to form a bitstream signal, a prediction unit (10) for carrying out a prediction step on the bitstream signal to form a predicted bitstream signal, a signal combination unit (42) for combining the bitstream signal and the predicted bitstream signal to form a residual bitstream signal, and an output terminal (14) for supplying the residual bitstream signal.
    Type: Grant
    Filed: November 25, 2002
    Date of Patent: September 12, 2006
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Renatus J. Van Der Vleuten, Alphons A. M. L. Bruekers, Arnoldus W. J. Oomen
  • Patent number: 7103554
    Abstract: In a method for producing a data stream of code words of variable lengths which are divided up into a plurality of sets of code words, wherein a raster having segments is specified for the data stream, wherein two adjacent raster points define a segment, code words of the first set are written into the data stream starting at raster points. Subsequently, code words of the second set are written into the data stream pursuant to a predetermined assignment rule, wherein each code word of the second set is assigned to a different segment. Entire code words or parts of code words which cannot be written according to their assignment are stored and entered into the data stream in further attempts, wherein the assignment by a predetermined regulation is changed from attempt to attempt. This procedure is analogously repeated for any further set which may be present.
    Type: Grant
    Filed: January 17, 2000
    Date of Patent: September 5, 2006
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Ralph Sperschneider, Martin Dietz, Pierre Lauber, Michael Schug
  • Patent number: 7092873
    Abstract: For upgrading a data stream of multimedia data, which comprises features with textual description, a set of phonetic translation hints is included in the data stream, which specifies the phonetic transcription of parts or words of the textual description. The phonetic transcriptions need not be repeated for each occurrence of a word. This reduces the amount of data necessary for storing or transmitting the description text.
    Type: Grant
    Filed: January 7, 2002
    Date of Patent: August 15, 2006
    Assignee: Robert Bosch GmbH
    Inventors: Andreas Engelsberg, Holger Kussmann, Michael Wollborn, Sven Mecke, Andre Mengel
  • Patent number: 7089176
    Abstract: A method, system and computer readable medium for increasing the audio perceptual loudness includes shifting at least one frequency of a first audio signal to create a second audio signal so as to increase the audio perceptual loudness. The power level of the second audio signal is not more than a power level of the first audio signal. The method also includes generating high-audio perceptual loudness tone alert sequences based on psychoacoustic and audiometric data. It further includes acquiring a listener's threshold audio profile; adding the listener's audio profile to the loudness sensitivity curve for producing the listener's tonal sensitivity curve; determining a required dB scaling for critical band tones from the listener's tonal sensitivity curve; normalizing the tonal sensitivity curve for creating a decibel curve; selecting a frequency range of the tones by using the tonal sensitivity curve; and spacing the sequence of tones along a critical band scale.
    Type: Grant
    Filed: March 27, 2003
    Date of Patent: August 8, 2006
    Assignee: Motorola, Inc.
    Inventors: Marc Andre Boillot, Dennis Anson, Audley F. Patterson
  • Patent number: 7080006
    Abstract: A method is provided for the decoding of digital audio data, which is used to perform an error recognition dependent on transmitted reference values, e.g., scale factors. The method includes the comparison of reference values of a subband with preceding reference values of the same subband, in order to produce a characteristic that is compared with a threshold value, and that, if the characteristic is located above the predetermined threshold value, this is indicated by a signaling. In an example embodiment of the present invention, it is provided that in subbands in which no audio data are transmitted a default value is entered, which leads to the result that no characteristic is produced for this subband.
    Type: Grant
    Filed: November 7, 2000
    Date of Patent: July 18, 2006
    Assignee: Robert Bosch GmbH
    Inventors: Claus Kupferschmidt, Torsten Mlasko, Marc Klein Middelink
  • Patent number: 7065485
    Abstract: The method and preprocessor enhance the intelligibility of narrowband speech without essentially lengthening the overall time duration of the signal. Both spectral enhancements and variable-rate time-scaling procedures are implemented to improve the salience of initial consonants, particularly the perceptually important formant transitions. Emphasis is transferred from the dominating vowel to the preceding consonant through adaptation of the phoneme timing structure. In a further embodiment, the technique is applied as a preprocessor to a speech coder.
    Type: Grant
    Filed: January 9, 2002
    Date of Patent: June 20, 2006
    Assignee: AT&T Corp
    Inventors: Nicola R. Chong-White, Richard Vandervoort Cox
  • Patent number: 7062430
    Abstract: A signal processing device includes a biorthogonal filter bank that processes a finite length signal including a left boundary and a right boundary. The biorthogonal filter bank includes an analysis filter bank. The analysis filter bank includes one or more left boundary filters, one or more right boundary filters, and one or more steady-state analysis filters. Each left boundary filter and each right boundary filter includes a row vector.
    Type: Grant
    Filed: August 23, 2002
    Date of Patent: June 13, 2006
    Assignee: Texas Instruments Incorporated
    Inventor: Daniel L. Zelazo
  • Patent number: 7062429
    Abstract: A method and apparatus are disclosed for controlling a buffer in a communication system, such as a digital audio broadcasting (DAB) communication system. A more consistent perceptual quality over time provides for a more pleasing auditory experience to a listener. Thus, the disclosed bit allocation process determines, for each frame, a distortion d[k] at which the frame is to be encoded. Generally, the distortion d[k] is determined to minimize (i) the probability for a buffer overflow, and (ii) the variation of perceived distortion over time. A buffer level is controlled by partitioning a signal into a sequence of successive frames; estimating a distortion rate for a number of frames; and selecting a distortion such that the variance of the buffer level is bounded by a specified value.
    Type: Grant
    Filed: September 7, 2001
    Date of Patent: June 13, 2006
    Assignee: Agere Systems Inc.
    Inventor: Christof Faller
  • Patent number: 7050966
    Abstract: A system and method of improving signal intelligibility over an interference signal is provided. The system includes a psychoacoustic processor having a psychoacoustic model wherein the level of a signal-of-interest is improved so as to be audible above noise and so as not to exceed a predetermined maximum output level. The system can be combined with active noise cancellation.
    Type: Grant
    Filed: August 7, 2002
    Date of Patent: May 23, 2006
    Assignee: AMI Semiconductor, Inc.
    Inventors: Todd Schneider, David Coode, Robert L. Brennan, Peter Olijnyk
  • Patent number: 7050980
    Abstract: A system and method for detecting beats in a compressed audio domain is disclosed where a beat detector functions as part of an error concealment system in an audio decoding section used in audio information transfer and audio download-streaming system terminal devices such as mobile phones. The beat detector includes an MDCT coefficient extractor, a band feature value analyzer, a confidence score calculator, and a converging and storage unit. The method provides beat detection by means of beat information obtained using both MDCT coefficients and window-switching information. A baseline beat position is determined using MDCT coefficients obtained from the audio bitstream which also provides a window-switching pattern. A window-switching beat position is compared with the baseline beat position and, if a predetermined condition is satisfied, the window-switching beat position is validated as a detected beat.
    Type: Grant
    Filed: September 28, 2001
    Date of Patent: May 23, 2006
    Assignee: Nokia Corp.
    Inventors: Ye Wang, Miikka Vilermo
  • Patent number: 7050965
    Abstract: A method of normalizing received digital audio data includes decomposing the digital audio data into a plurality of sub-bands and applying a psycho-acoustic model to the digital audio data to generate a plurality of masking thresholds. The method further includes generating a plurality of transformation adjustment parameters based on the masking thresholds and desired transformation parameters and applying the transformation adjustment parameters to the sub-bands to generate transformed sub-bands.
    Type: Grant
    Filed: June 3, 2002
    Date of Patent: May 23, 2006
    Assignee: Intel Corporation
    Inventor: Alex A. Lopez-Estrada
  • Patent number: 7047184
    Abstract: A speech coding apparatus comprises a repetition period pre-selecting unit for generating a plurality of candidates for the repetition period of a driving excitation source by multiplying the repetition period of an adaptive excitation source by a plurality of constant numbers, respectively, and for pre-selecting a predetermined number of candidates from all the candidates generated. A driving excitation source coding unit provides both excitation source location information and excitation source polarity information that minimize a coding distortion, for each of the predetermined number of candidates, and provides an evaluation value associated with the minimum coding distortion for each of the predetermined number of candidates.
    Type: Grant
    Filed: November 7, 2000
    Date of Patent: May 16, 2006
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hirohisa Tasaki, Tadashi Yamaura
  • Patent number: 7047187
    Abstract: A method for concealing errors in an audio signal includes digitally encoding the audio signal into a plurality of audio data packets representative of the audio signal; determining a perceptually tolerable distortion limit for the audio packets; and altering a value of at least one audio packet by an amount within the perceptually tolerable distortion limit utilizing information representative of a different audio data packet.
    Type: Grant
    Filed: February 27, 2002
    Date of Patent: May 16, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Szeming Cheng, Hong Heather Yu, Zixiang Xiong
  • Patent number: 7035700
    Abstract: Method and apparatus for embedding binary data in audio signals. The magnitude of the power spectrum at the perceptual holes of each frame of a host speech utterance, may be altered so as to embed digital data. The phase spectrum at perceptually masked spectral points of each frame of a host speech utterance, may also be altered so as to embed digital data.
    Type: Grant
    Filed: March 13, 2002
    Date of Patent: April 25, 2006
    Assignee: The United States of America as represented by the Secretary of the Air Force
    Inventors: Kaliappan Gopalan, Daniel S. Benincasa
  • Patent number: 7031905
    Abstract: An audio signal encoding apparatus includes a device for compressing multiple-channel digital audio signals into compression-resultant multiple-channel signals respectively. The multiple-channel digital audio signals relate to a sampling frequency and a quantization bit number. The compression-resultant multiple-channel signals, a signal representative of the sampling frequency, and a signal representative of the quantization bit number are formatted into a formatting-resultant signal. The formatting-resultant signal contains a sub packet and a sync information portion. The sub packet contains at least portions of the compression-resultant multiple-channel signals. The sync information portion contains the signal representative of the sampling frequency and the signal representative of the quantization bit number.
    Type: Grant
    Filed: May 27, 2004
    Date of Patent: April 18, 2006
    Assignee: Victor Company of Japan, Ltd.
    Inventors: Yoshiaki Tanaka, Shoji Ueno, Norihiko Fuchigami
  • Patent number: 7024352
    Abstract: A method of and a device for output based objective speech quality assessment, wherein a degraded output speech signal comprising a speech information portion is compared (5) with a reference signal retrieved from the output speech signal. The reference signal is provided by perceptual approximation of the speech information portion of the output speech signal using a speech recoder (2) producing a reference speech signal of finite bitrate. In a preferred embodiment, the speech recoder (2) is a speech codec.
    Type: Grant
    Filed: September 3, 2001
    Date of Patent: April 4, 2006
    Assignee: Koninklijke KPN N.V.
    Inventors: John Gerard Beerends, Andries Pieter Hekstra
  • Patent number: 7020615
    Abstract: An improved representation of transients in audio signals comprises modifying transient locations in such a way that a transient can occur only at a beginning of a sinusoidal segment. The modification procedure comprises the steps: detecting a beginning and an end of a transient using an energy-based approach with two sliding rectangular windows; moving samples between the beginning and the end of the transient to the locations specified by the segmentation used; and time-warping the signal parts in between the transients in order to fill the intervals between the modified transients.
    Type: Grant
    Filed: November 2, 2001
    Date of Patent: March 28, 2006
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Renat Vafin, Richard Heusdens, Steven Leonardus Josephus Dimphina Elisabeth Van De Par, Willem Bastiaan Kleijn
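The abstract of 7020615 names an energy-based detector with two sliding rectangular windows but does not give its decision rule. The following is a minimal sketch of one plausible reading, which flags a sample when the energy in the window that starts there greatly exceeds the energy in the window just before it. The function name `detect_onsets`, the ratio test, and the `floor` guard are assumptions made for illustration, and the sketch covers only the beginning of a transient, not its end or the later sample-moving and time-warping steps.

```python
def detect_onsets(signal, window, ratio_threshold, floor=1e-12):
    """Return indices n where the energy of signal[n:n+window] exceeds
    ratio_threshold times the energy of signal[n-window:n].

    Both windows are rectangular (unweighted sums of squared samples).
    `floor` is a hypothetical guard against division by zero in silence.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    energies = [s * s for s in signal]
    onsets = []
    for n in range(window, len(signal) - window + 1):
        before = sum(energies[n - window:n]) + floor
        after = sum(energies[n:n + window])
        if after / before > ratio_threshold:
            onsets.append(n)
    return onsets
```

A quiet passage followed by a loud one yields a short cluster of flagged indices around the step, because the leading window begins to overlap the loud region a few samples early. A real detector would presumably reduce such a cluster to a single transient start.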
  • Patent number: 7020603
    Abstract: A system and method of encoding an audio stream includes generation of a distortion threshold templates database that is accessible by a perceptual audio encoder. The audio encoder utilizes the threshold templates to operate a compression algorithm, obviating the need to implement a psycho-acoustic model to generate a distortion threshold for each compression operation. A similar templates database may be used in a transcoding operation, again bypassing a psycho-acoustic modeling operation and promoting system efficiency.
    Type: Grant
    Filed: February 7, 2002
    Date of Patent: March 28, 2006
    Assignee: Intel Corporation
    Inventor: Alex A Lopez-Estrada
  • Patent number: 7016814
    Abstract: According to an objective measurement technique for determining the quality of an output signal (X(t)) of an audio or voice signals-processing and/or -transporting system, the output signal and a reference signal (Y(t)) are mapped on representation signals (R(X), R(Y)) of a suitably chosen perception model. A differential signal (D(t)) of the representation signals is averaged, in a time-averaging device, in two steps by two averaging members connected in series, over time, whereby a quality signal Q is generated. The time averaging is carried out by the first and second averaging members at the local and the global levels, respectively, using various averaging functions, e.g., preferably Lebesgue p-averaging functions having various p-values p1>p2. The invention advantageously provides a high correlation between a generated quality signal Q and the human quality perception, even in the event of brief disturbances.
    Type: Grant
    Filed: December 11, 2000
    Date of Patent: March 21, 2006
    Assignee: Koninklijke KPN N.V.
    Inventors: John Gerard Beerends, Andries Pieter Hekstra
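The two-step time averaging described in the abstract of 7016814 can be sketched as a local Lebesgue p-average over short windows followed by a global p-average over the window results, with a larger exponent at the local level (p1 > p2). The names `lp_mean` and `two_stage_average`, the fixed non-overlapping windows, and the sample values are assumptions for illustration. The abstract does not specify window lengths, exponents, or how the averaged value maps to the quality signal Q.

```python
def lp_mean(values, p):
    """Lebesgue p-average: (mean(|v| ** p)) ** (1 / p)."""
    if not values:
        raise ValueError("values must be non-empty")
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


def two_stage_average(disturbance, window, p_local, p_global):
    """Average within consecutive non-overlapping windows using p_local,
    then average the per-window results using p_global."""
    local = [
        lp_mean(disturbance[i:i + window], p_local)
        for i in range(0, len(disturbance), window)
    ]
    return lp_mean(local, p_global)
```

With a single brief disturbance of 8.0 in an otherwise silent 16-sample signal, a window of 4, and exponents 2 and 1, the two-stage result is 1.0, whereas a plain arithmetic mean gives 0.5. The higher local exponent keeps the short disturbance from being diluted, which matches the abstract's stated aim of handling brief disturbances.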
  • Patent number: 7010480
    Abstract: A method for preparing a speech signal for encoding comprises determining whether the spectral content of an input speech signal is representative of a defined spectral characteristic (e.g., a defined characteristic slope). A frequency specific filter component of a weighting filter is controlled based on the determination of the spectral content of the speech signal or/and its location in the encoder. A core weighting filter component of the weighting filter may be maintained regardless of the spectral content of the speech signal.
    Type: Grant
    Filed: September 13, 2001
    Date of Patent: March 7, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Yang Gao, Huan-Yu Su
  • Patent number: 7003449
    Abstract: A method for encoding an audio signal, including providing a masking function, representative of psychoacoustic masking; setting a quality value for data of the encoded signal; adjusting the masking function dependent upon the quality value; and allocating bits for quantization of the encoded signal based on the incremental masking function.
    Type: Grant
    Filed: October 30, 1999
    Date of Patent: February 21, 2006
    Assignee: STMicroelectronics Asia Pacific PTE Ltd.
    Inventors: Mohammed Javed Absar, Sapna George
  • Patent number: 7003447
    Abstract: In a system in which an auxiliary data signal is conveyed with a decoded audio signal to assist in re-encoding, signature information is provided to assist in detecting changes in the decoded audio signal which would render the auxiliary data signal of little or no use in re-encoding. The signature information is most preferably included in the auxiliary data signal.
    Type: Grant
    Filed: September 23, 1999
    Date of Patent: February 21, 2006
    Assignee: British Broadcasting Corporation
    Inventor: Andrew James Mason
  • Patent number: 7003448
    Abstract: In a method for concealing an error in an encoded audio signal, a set of spectral coefficients is subdivided into at least two sub-bands (14), whereupon the sub-bands are subjected to a reverse transform (16). A specific prediction is performed (18) for each quasi time signal of a sub-band to obtain an estimated temporal representation for a sub-band of a set of spectral coefficients following the current set. A forward transform (20) of the time signal of each sub-band provides estimated spectral coefficients which can be used (28) instead of erroneous spectral coefficients of a following set of spectral coefficients, e.g. in order to conceal transmission errors. Transforming at the sub-band level provides independence from transform characteristics such as block length, window type and MDCT algorithm while at the same time preserving spectral processing for error concealment. Thus the spectral characteristics of audio signals can also be taken into account during error concealment.
    Type: Grant
    Filed: April 12, 2000
    Date of Patent: February 21, 2006
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Pierre Lauber, Martin Dietz, Juergen Herre, Reinhold Boehm, Ralph Sperschneider, Daniel Homm
  • Patent number: 6999919
    Abstract: A method for an improved QSS (bit allocator) algorithm is disclosed. The disclosed method is capable of greatly improving determination time, thereby improving the efficiency of converting a signal from an audio format to an MP3 format. The starting point of the QSS determination for a present frame (N) is the QSS of a previous frame (N-1). This starting point provides for improved efficiency for determining actual QSS of frame N as QSS[N-1] will be closer to QSS[N] than an arbitrary starting point. Thus, fewer iterations are required to determine QSS[N] as compared to conventional encoders. The algorithm of the present invention is more efficient than conventional methods in that it makes use of the fact that audio signal statistics usually do not change abruptly during the period of one audio frame to another.
    Type: Grant
    Filed: February 20, 2001
    Date of Patent: February 14, 2006
    Assignee: Intervideo, Inc.
    Inventors: Shahab Layeghi, Fahri Surucu
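The warm-start idea in the abstract of 6999919 can be sketched as an iterative search that begins at the previous frame's result rather than at a fixed value. Everything concrete below is an assumption for illustration: QSS is treated as an integer quantizer step size, `bits_needed` is a toy cost model, and `find_step` is a simple linear search. None of this is taken from the patent or from any MP3 encoder.

```python
def bits_needed(coeffs, step):
    """Toy cost model (an assumption): each quantized magnitude costs its
    bit length plus one sign bit. Cost never increases as step grows."""
    return sum((abs(c) // step).bit_length() + 1 for c in coeffs)


def find_step(coeffs, budget, start):
    """Find the smallest integer step whose cost fits `budget`, searching
    from `start`. Returns (step, iterations)."""
    if budget < len(coeffs):
        raise ValueError("budget cannot be met at any step size")
    step = max(1, start)
    iterations = 0
    # Coarsen while the frame does not fit the budget.
    while bits_needed(coeffs, step) > budget:
        step += 1
        iterations += 1
    # Refine while a finer step would still fit.
    while step > 1 and bits_needed(coeffs, step - 1) <= budget:
        step -= 1
        iterations += 1
    return step, iterations
```

For two similar frames of integer coefficients, for example `[100, 50, 25, 12]` followed by `[104, 48, 26, 11]` with a budget of 16 bits, starting the second search at the first frame's step reaches the same answer as starting at 1 while taking fewer iterations. This is the effect the abstract attributes to slowly changing signal statistics.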
  • Patent number: 6996521
    Abstract: A method is provided for embedding data into an audio signal and determining data embedded into an audio signal. In the method for embedding data into an audio signal, the audio signal is based on a first set of data and includes a phase component. The method modifies at least a portion of the phase component of the audio signal to embed a second set of data into the audio signal. The modified audio signal can be made to differ with respect to the audio signal in a manner at least one of (i) substantially imperceptible and (ii) imperceptible to a listener of the first set of data depending on the extent that the phase component of the audio signal is modified. In the method for determining data embedded into an audio signal, the audio signal is based on a first set of data of an original audio signal and includes a phase component. The method determines a second set of data embedded into the audio signal based on the phase component of the audio signal.
    Type: Grant
    Filed: October 4, 2001
    Date of Patent: February 7, 2006
    Assignee: The University of Miami
    Inventors: Alexander I. Iliev, Michael S. Scordilis
  • Patent number: 6965859
    Abstract: A method and apparatus for audio compression receives an audio signal. Transform coding is applied to the audio signal to generate a sequence of transform frequency coefficients. The sequence of transform frequency coefficients is partitioned into a plurality of non-uniform width frequency ranges and then zero value frequency coefficients are inserted at the boundaries of the non-uniform width frequency ranges. As a result, certain of the transform frequency coefficients that represent high frequencies are dropped.
    Type: Grant
    Filed: March 3, 2003
    Date of Patent: November 15, 2005
    Assignee: XVD Corporation
    Inventors: Victor D. Kolesnik, Boris D. Kudryashov, Sergey Petrov, Evgeny Ovsyannikov, Boris Trojanovsky, Andrey Trofimov
  • Patent number: 6963842
    Abstract: A memory-efficient system converting a signal from a first transform domain to a second transform domain. The system includes a first mechanism that obtains an input signal expressed via a first transform-domain signal representation. A second mechanism expresses the input signal via a second transform-domain signal representation without intermediate time-domain conversion. In the specific embodiment, the input signal is a Modified Discrete Cosine Transform (MDCT) signal. The second transform-domain signal representation is a Discrete Fourier Transform (DFT) signal. The second mechanism further includes a third mechanism that combines effects of an inverse MDCT, a synthesis window function, and an analysis window function, and provides a first signal in response thereto. A fourth mechanism converts the MDCT signal to the DFT signal based on the first signal.
    Type: Grant
    Filed: September 5, 2001
    Date of Patent: November 8, 2005
    Assignee: Creative Technology Ltd.
    Inventor: Michael M. Goodwin
  • Patent number: 6963649
    Abstract: An improved noise canceling microphone is provided including robust design features and advanced noise control and speech discrimination convergence characteristics. Two adaptive controllers are used to ensure robust performance in quickly changing acoustic environments ensuring an acceptable minimum performance characteristic. Additionally, a new real-time spectral estimation procedure is applied to a noise canceling communications microphone platform that permits continued and optimal adaptation of non-voice bandwidth frequencies during speech transients.
    Type: Grant
    Filed: October 3, 2001
    Date of Patent: November 8, 2005
    Assignee: Adaptive Technologies, Inc.
    Inventors: Michael A. Vaudrey, William R. Saunders
  • Patent number: 6950794
    Abstract: A method of encoding a digital signal, particularly an audio signal, which predicts favorable scalefactors for different frequency subbands of the signal. Distortion thresholds which are associated with each of the frequency subbands of the signal are used, along with transform coefficients, to calculate total scaling values, one for each of the frequency subbands, such that the product of a transform coefficient for a given subband with its respective total scaling value is less than a corresponding one of the distortion thresholds. In an audio encoding application, the distortion thresholds are based on psychoacoustic masking. The invention may use a novel approximation for calculating the total scaling values, which obtains a first term based on a corresponding distortion threshold, and obtains a second term based on a sum of the transform coefficients. Both of these terms may be obtained using lookup tables.
    Type: Grant
    Filed: November 20, 2001
    Date of Patent: September 27, 2005
    Assignee: Cirrus Logic, Inc.
    Inventors: Girish P. Subramaniam, Raghunath K. Rao
  • Patent number: 6947886
    Abstract: Disclosed are scalable quantizers for audio and other signals characterized by a non-uniform, perception-based distortion metric, that operate in a common companded domain which includes both the base-layer and one or more enhancement-layers. The common companded domain is designed to permit use of the same unweighted MSE metric for optimal quantization parameter selection in multiple layers, exploiting the statistical dependence of the enhancement-layer signal on the quantization parameters used in the preceding layer. One embodiment features an asymptotically optimal entropy coded uniform scalar quantizer. Another embodiment is an improved bit rate scalable multi-layer Advanced Audio Coder (AAC) which extends the scalability of the asymptotically optimal entropy coded uniform scalar quantizer to systems with non-uniform base-layer quantization, selecting the enhancement-layer quantization methodology to be used in a particular band based on the preceding layer quantization coefficients.
    Type: Grant
    Filed: February 21, 2003
    Date of Patent: September 20, 2005
    Assignee: The Regents of the University of California
    Inventors: Kenneth Rose, Ashish Aggarwal, Shankar L. Regunathan
  • Patent number: 6934676
    Abstract: A method and system for coding audio signals in a multi-channel sound system, wherein a plurality of MDCT units are used to reduce the audio signals for providing a plurality of MDCT coefficients. The MDCT coefficients are quantized according to the masking threshold calculated from a psychoacoustic model and a plurality of INT (integer-to-integer) DCT modules are used to remove the cross-channel redundancy in the quantized MDCT coefficients. The output from the INT-DCT modules is Huffman coded and written to a bitstream for transmission or storage.
    Type: Grant
    Filed: May 11, 2001
    Date of Patent: August 23, 2005
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Ye Wang, Miikka Vilermo
  • Patent number: 6934677
    Abstract: Quantization matrices facilitate digital audio encoding and decoding. An audio encoder generates and compresses quantization matrices; an audio decoder decompresses and applies the quantization matrices. The invention includes several techniques and tools, which can be used in combination or separately. For example, the audio encoder can generate quantization matrices from critical band patterns for blocks of audio data. The encoder can compute the quantization matrices directly from the critical band patterns, which can be computed from the same audio data that is being compressed. The audio encoder/decoder can use different modes for generating/applying quantization matrices depending on the coding channel mode of multi-channel audio data. The audio encoder/decoder can use different compression/decompression modes for the quantization matrices, including a parametric compression/decompression mode.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: August 23, 2005
    Assignee: Microsoft Corporation
    Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee
  • Patent number: 6931370
    Abstract: DTS Interactive provides a low-cost, fully interactive, immersive digital surround sound environment suitable for 3D gaming and other high-fidelity audio applications, which can be configured to maintain compatibility with the existing infrastructure of Digital Surround Sound decoders. The component audio is stored and mixed in a compressed and simplified format that reduces memory requirements and processor utilization and increases the number of components that can be mixed without degrading audio quality. Techniques are also provided for “looping” compressed audio, which is an important and standard feature in gaming applications that manipulate PCM audio. In addition, decoder sync is ensured by transmitting frames of “silence” whenever mixed audio is not present either due to processing latency or the gaming application.
    Type: Grant
    Filed: November 2, 1999
    Date of Patent: August 16, 2005
    Assignee: Digital Theater Systems, Inc.
    Inventor: Samuel Keith McDowell
  • Patent number: 6925434
    Abstract: Coding (1) of an audio signal is provided including estimating (110) a position of a transient signal component in the audio signal, matching (111,112) a shape function on the transient signal component in case the transient signal component is gradually declining after an initial increase, which shape function has a substantially exponential initial behavior and a substantially logarithmic declining behavior; and including (15) the position and shape parameters describing the shape function in an audio stream (AS).
    Type: Grant
    Filed: March 12, 2001
    Date of Patent: August 2, 2005
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Arnoldus Werner Johannes Oomen, Albertus Cornelis Den Brinker
  • Patent number: 6915255
    Abstract: Herein disclosed is an audio signal encoding apparatus comprising initial maximum scale factor band calculation means for calculating an initial maximum scale factor band for an audio signal inputted therein on the basis of the result made by the frame length determining means and the coded mode information inputted from the coded mode information means with reference to the initial maximum scale factor band information and Signal-to-Mask ratio threshold value information stored in the maximum scale factor band table storage means, and maximum scale factor band calculation means for calculating a maximum scale factor band for the audio signal on the basis of the initial maximum scale factor band calculated by the initial maximum scale factor band calculation means in accordance with the Signal-to-Mask ratio information calculated by the psychoacoustic model analyzing means, thereby making it possible to adaptively calculate the maximum scale factor band for the audio signal in accordance with the coded mode in
    Type: Grant
    Filed: December 21, 2001
    Date of Patent: July 5, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Yasuhito Watanabe
  • Patent number: 6915264
    Abstract: A method and apparatus for determining masked thresholds for a perceptual auditory model used, for example, in a perceptual audio coder, which makes use of a filter bank structure comprising a plurality of filter bank stages which are connected in series, wherein each filter bank stage comprises a plurality of low-pass filters connected in series and a corresponding plurality of high-pass filters applied to the outputs of each of the low-pass filters, and wherein downsampling is advantageously applied between each successive pair of filter bank stages. In accordance with one illustrative embodiment, the filter bank comprises low order IIR filters. The cascade structure advantageously supports sampling rate reduction due to the continuously decreasing cutoff frequency in the cascade. The filter bank coefficients may advantageously be optimized for modeling of masked threshold patterns of narrow-band maskers, and the generated thresholds may be advantageously applied in a perceptual audio coder.
    Type: Grant
    Filed: February 22, 2001
    Date of Patent: July 5, 2005
    Assignee: Lucent Technologies Inc.
    Inventor: Frank Baumgarte
  • Patent number: 6910013
    Abstract: The invention relates first of all to a method for identifying a transient acoustic scene, said method including the extraction, during an extraction phase, of characteristic features from an acoustic signal captured by at least one microphone (2a, 2b), and the identification, during an identification phase, of the transient acoustic scene on the basis of the extracted characteristics. According to the invention, at least auditory-based characteristics are identified in the extraction phase. Also specified are an application of the method per this invention and a hearing device.
    Type: Grant
    Filed: January 5, 2001
    Date of Patent: June 21, 2005
    Assignee: Phonak AG
    Inventors: Sylvia Allegro, Michael Büchler
  • Patent number: 6895374
    Abstract: A method incorporating the use of a filter that accepts simultaneous masking signals and generates a close replica of temporal masking signals derived from the input simultaneous masking signals. The filter output is then added to the filter input to provide a composite masking signal. This composite masking signal may then be used to establish overall masking threshold levels which can be mapped in the appropriate subband to significantly reduce the amount of coding quantization required without significantly affecting the perceived sound of the reconstructed broadband signal. The filter's transfer function and impulse response define a filter the output of which exhibits two principal characteristics of temporal masking. One such characteristic is decay with the logarithm of time. The other is a rate of decay that is inversely proportional to the duration of the corresponding simultaneous masking.
    Type: Grant
    Filed: September 29, 2000
    Date of Patent: May 17, 2005
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventor: Wan-Chieh Pai
  • Patent number: 6873950
    Abstract: MPEG 1 Audio data compression is based on subband coding. A quantization is performed using a psychoacoustic model which is adapted to the masking behavior of the human hearing. Each subband signal is quantized in such a way that the quantization noise introduced by the coding will not exceed the masking curve for that subband. In ISO/IEC 11172-3 two independent psychoacoustic models are defined. The output from these psychoacoustic models is a set of Signal-to-Masking Ratios, $SMR_n$, for every subband $n$. In order to calculate the $SMR_n$ for the psychoacoustic model 2 according to the invention a Fast Fourier Transformation is performed with a length of $L = 1152$ samples by calculating $k$ subtransformations over $2^N$ samples with $k \cdot 2^N = L$ and fitting together the results of the $k$ subtransformations.
    Type: Grant
    Filed: July 31, 2000
    Date of Patent: March 29, 2005
    Assignee: Thomson Licensing S.A.
    Inventors: Walter Voessing, Fei Gao, Andreas Aust
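
    Illustrative sketch (not taken from the patent): the abstract does not say how the results of the k subtransformations are fitted together. One standard way, assumed here, is a decimation-in-time split: the input is divided into k interleaved sub-sequences, each is transformed with a power-of-two FFT, and the results are recombined with twiddle factors. For L = 1152 this works with k = 9 sub-transforms of 128 samples each, which is an assumed reading of the abstract. The names `fft_pow2` and `fft_split` are hypothetical.

```python
import cmath


def fft_pow2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_pow2(x[0::2])
    odd = fft_pow2(x[1::2])
    out = [0j] * n
    for q in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * q / n) * odd[q]
        out[q] = even[q] + t
        out[q + n // 2] = even[q] - t
    return out


def fft_split(x, k):
    """DFT of length L = k * M (M a power of two) built from k length-M FFTs."""
    L = len(x)
    if L == 0 or L % k:
        raise ValueError("len(x) must be a positive multiple of k")
    M = L // k
    if M & (M - 1):
        raise ValueError("len(x) / k must be a power of two")
    # Sub-sequence r holds samples x[r], x[r + k], x[r + 2k], ...
    subs = [fft_pow2(x[r::k]) for r in range(k)]
    out = []
    for q in range(L):
        acc = 0j
        for r in range(k):
            # The twiddle factor shifts sub-transform r back to its offset.
            acc += cmath.exp(-2j * cmath.pi * r * q / L) * subs[r][q % M]
        out.append(acc)
    return out


# Example: a 1152-sample frame handled as 9 FFTs of 128 samples.
frame = [((i * 37) % 101) / 101.0 for i in range(1152)]
spectrum = fft_split(frame, 9)
```

    The recombination loop is written for clarity rather than speed; a production encoder would merge the sub-transforms more economically.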
  • Publication number: 20040254785
    Abstract: In accordance with a specific implementation of the disclosure, a stream of audio frames is received and compressed using psycho-acoustical processing. The signal-to-mask ratio table generated by the psycho-acoustical algorithm is updated using only a portion of the received audio frames.
    Type: Application
    Filed: June 13, 2003
    Publication date: December 16, 2004
    Applicant: VIXS Systems, Inc.
    Inventor: Hong Zeng
  • Patent number: 6832198
    Abstract: The present invention provides a system, computer program product and a method for overlapping boundary data of an audio information track which includes ending data from a previous track and starting data from a succeeding track. By doing this, the ending data from a previous track and the beginning data from a succeeding track are available for the compression process. The compression program then manipulates the additional appended data in order to generate the compressed audio. Later, by severing the overlapping appended additional information before recombining the tracks, the resulting No-Pause CD audio stream is free from distortion and mismatch.
    Type: Grant
    Filed: June 29, 2000
    Date of Patent: December 14, 2004
    Assignee: International Business Machines Corporation
    Inventors: Kha D. Nguyen, Craig Kindell, Kenneth L. Milsted, Marco M. Hurtado
  • Patent number: 6829576
    Abstract: A nonlinear operation method suitable for audio encoding/decoding and an applied hardware thereof. The nonlinear operation method suitable for audio encoding exists in a quantization process for the audio encoding. The nonlinear operation equation is $f(X) = X^{3/4}$, where X represents the frequency-field data. The method comprises following steps. Building a query table that comprises the frequency-field data X and the corresponding value f(X) that corresponds to the frequency-field data X, wherein the query table is represented as a function T(X), and $T(X) = X^{3/4}$, $1 \le X \le S$, where S represents a data range included in the query table. Analyzing and providing a modified error quantity function fa(z) represented by an equation of power of 2, where $z = \lfloor X / 16^n \rfloor$, n = 1, 2 or 3, so that z falls in the data range S.
    Type: Grant
    Filed: October 18, 2002
    Date of Patent: December 7, 2004
    Assignee: National Central University
    Inventors: Tsung-Han Tsai, Chuh-Chu Yen
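
    Illustrative sketch (not taken from the patent): a table lookup for x^(3/4) can be extended beyond the table range because 16^(3/4) = 8 = 2^3, so dividing the input by 16^n and multiplying the looked-up value by 2^(3n) needs only shifts. The sketch below omits the error-correction function that the abstract mentions, so values outside the table are approximations. The names `build_table` and `pow34` are hypothetical, and the loop accepts any n rather than only 1, 2 or 3.

```python
def build_table(S):
    """Table of x ** 0.75 for 0 <= x <= S (entry 0 is 0.0)."""
    return [0.0] + [x ** 0.75 for x in range(1, S + 1)]


def pow34(x, table):
    """Approximate x ** 0.75 for a non-negative integer x using the table."""
    if x < 0:
        raise ValueError("x must be non-negative")
    S = len(table) - 1
    n = 0
    z = x
    while z > S:
        n += 1
        z = x >> (4 * n)  # floor(x / 16**n)
    # 16 ** 0.75 == 8, so scaling back is a left shift by 3 * n bits.
    return table[z] * (1 << (3 * n))


table = build_table(256)
inside = pow34(100, table)    # exact table hit
outside = pow34(5000, table)  # reduced with n = 2, uncorrected
```

    With a 256-entry table the uncorrected result for inputs above the table range stays within a few percent of the true value; the correction term described in the abstract is what would close that gap.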
  • Publication number: 20040243397
    Abstract: A mask generation process for use in encoding audio data, including generating linear masking components from the audio data, generating logarithmic masking components from the linear masking components, and generating a global masking threshold from the logarithmic masking components. The process is a psychoacoustic masking process for use in an MPEG-1-L2 encoder, and includes generating energy values from a Fourier transform of the audio data, determining sound pressure level values from the energy values, selecting tonal and non-tonal masking components on the basis of the energy values, generating power values from the energy values, generating masking thresholds on the basis of the masking components and the power values, and generating signal to mask ratios for a quantizer on the basis of the sound pressure level values and the masking thresholds.
    Type: Application
    Filed: March 8, 2004
    Publication date: December 2, 2004
    Applicant: STMicroelectronics Asia Pacific Pte Ltd
    Inventors: Charles Averty, Xue Yao, Ranjot Singh
  • Patent number: 6826525
    Abstract: A method for detecting a transient in a discrete-time audio signal is performed completely in the time domain and includes the step of segmenting the discrete-time audio signal so as to generate consecutive segments of the same length with unfiltered discrete-time audio signals xs(T−1). The discrete-time audio signal in a current segment is subsequently filtered. Then either the energy of the filtered discrete-time audio signal in the current segment can be compared with the energy of the filtered discrete-time audio signal in a preceding segment or a current relationship between the energy of the filtered discrete-time audio signal in the current segment and the energy of the unfiltered discrete-time audio signal in the current segment can be formed and this current relationship compared with a preceding corresponding relationship. On the basis of the one and/or the other of these comparisons it is detected whether a transient is present in the discrete-time audio signal.
    Type: Grant
    Filed: June 25, 2002
    Date of Patent: November 30, 2004
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Johannes Hilpert, Jürgen Herre, Bernhard Grill, Rainer Buchta, Karlheinz Brandenburg, Heinz Gerhäuser
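
    Illustrative sketch (not taken from the patent): the two comparisons described in the abstract can be tried out on equal-length segments as below. The filter (a first-order difference acting as a crude high-pass), the threshold values, and the name `detect_transients` are all assumptions; the abstract does not specify them.

```python
def detect_transients(signal, seg_len, energy_ratio=4.0, relation_ratio=4.0):
    """Return one boolean per full segment: True where a transient is flagged."""
    eps = 1e-12          # guards against division by zero on silent segments
    flags = []
    prev_filtered = None  # filtered energy of the preceding segment
    prev_relation = None  # filtered/unfiltered energy ratio of that segment
    prev_sample = 0.0     # filter state carried across segment boundaries
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len]
        filtered = []
        for s in seg:
            filtered.append(s - prev_sample)  # assumed high-pass: y[n] = x[n] - x[n-1]
            prev_sample = s
        e_filtered = sum(v * v for v in filtered)
        e_unfiltered = sum(v * v for v in seg)
        relation = e_filtered / (e_unfiltered + eps)
        hit = False
        if prev_filtered is not None:
            # Comparison 1: filtered energy against the preceding segment.
            # Comparison 2: filtered/unfiltered relation against the preceding one.
            hit = (e_filtered > energy_ratio * (prev_filtered + eps)
                   or relation > relation_ratio * (prev_relation + eps))
        flags.append(hit)
        prev_filtered, prev_relation = e_filtered, relation
    return flags
```

    Everything runs in the time domain, so no transform is needed before the decision is made.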
  • Patent number: 6823302
    Abstract: A method for providing real-time perceptual quality measurements of an audio signal (12) in which a quality test signal, including an audio test signal, is received by equipment under test. Playback of a pre-stored representation of the audio signal is coarsely synchronized (20) to the received audio test signal, for example, utilizing a synchronizing pulse in a header of the quality test signal. The playback is then finely synchronized (24) to the received audio signal, for example, by comparing data in a windowed portion of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal and by adjusting a windowed portion of the pre-stored representation of the audio test signal in accordance with results of the comparison. A window of the received audio test signal is then compared (14) to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.
    Type: Grant
    Filed: January 24, 2001
    Date of Patent: November 23, 2004
    Assignee: National Semiconductor Corporation
    Inventors: Ian Atkinson, Martin Lee, Wei Ma, Kambiz Homayounfar
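
    Illustrative sketch (not taken from the patent): the fine synchronization step compares a window of the received signal with a window of the stored reference and adjusts the reference window. A simple way to do that, assumed here, is to slide the reference window over a small range of lags and keep the lag with the highest cross-correlation. The name `fine_lag` and its parameters are hypothetical.

```python
def fine_lag(received, reference, start, win, max_lag):
    """Lag (in samples) that best aligns reference[start + lag:] with received[start:]."""
    window = received[start:start + win]
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        lo = start + lag
        if lo < 0 or lo + win > len(reference):
            continue  # candidate window would fall outside the reference
        score = sum(a * b for a, b in zip(window, reference[lo:lo + win]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

    Coarse synchronization is assumed to have brought the two signals within `max_lag` samples of each other before this search runs.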
  • Patent number: RE39080
    Abstract: A method and apparatus for quantizing audio signals is disclosed which advantageously produces a quantized audio signal which can be encoded within an acceptable range. Advantageously, the quantizer uses a scale factor which is interpolated between a threshold based on the calculated threshold of hearing at a given frequency and the absolute threshold of hearing at the same frequency.
    Type: Grant
    Filed: August 13, 2002
    Date of Patent: April 25, 2006
    Assignee: Lucent Technologies Inc.
    Inventor: James David Johnston
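
    Illustrative sketch (not taken from the patent): interpolating between a calculated threshold and the absolute threshold of hearing can be pictured as a weighted blend of two levels at the same frequency. The linear blend in dB, the weight `alpha`, and the function names are assumptions; the absolute threshold uses Terhardt's widely cited approximation of the threshold in quiet, which the abstract does not mention.

```python
import math


def absolute_threshold_db(freq_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL."""
    f = freq_hz / 1000.0
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def interpolated_threshold_db(calculated_db, freq_hz, alpha):
    """Blend the calculated threshold with the absolute threshold.

    alpha = 1.0 returns the calculated threshold and alpha = 0.0 the
    absolute threshold (assumed linear interpolation in dB).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    quiet_db = absolute_threshold_db(freq_hz)
    return alpha * calculated_db + (1.0 - alpha) * quiet_db


midpoint = interpolated_threshold_db(20.0, 1000.0, 0.5)
```

    Sweeping `alpha` moves the level smoothly between the two thresholds, which gives an encoder a single parameter for trading quantization noise against bit rate.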