Transformation Patents (Class 704/203)
  • Patent number: 8428958
    Abstract: A method of encoding an audio signal, where signals including two or more channel signals are downmixed to a mono signal, the mono signal is divided into a low-frequency signal and a high-frequency signal, the low-frequency signal is encoded through algebraic code excited linear prediction (ACELP) or transform coded excitation (TCX), and the high-frequency signal is encoded using the low-frequency signal. A method of decoding of an audio signal, a low-frequency signal encoded through ACELP or TCX is decoded, a high-frequency signal is decoded using the low-frequency signal, the low-frequency signal and the high-frequency signal are combined to generate a mono signal, and the mono signal is upmixed by decoding spatial parameters regarding signals including two or more channel signals.
    Type: Grant
    Filed: October 7, 2008
    Date of Patent: April 23, 2013
    Assignee: SAMSUNG Electronics Co., Ltd.
    Inventors: Ho-sang Sung, Eun-mi Oh, Jung-hoe Kim, Ki-hyun Choo, Mi-young Kim
  • Patent number: 8423355
    Abstract: A method for encoding audio frames by producing a first frame of coded audio samples by coding a first audio frame in a sequence of frames, producing at least a portion of a second frame of coded audio samples by coding at least a portion of a second audio frame in the sequence of frames, and producing parameters for generating audio gap filler samples, wherein the parameters are representative of either a weighted segment of the first frame of coded audio samples or a weighted segment of the portion of the second frame of coded audio samples.
    Type: Grant
    Filed: July 27, 2010
    Date of Patent: April 16, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Udar Mittal, Jonathan A. Gibbs, James P. Ashley
  • Patent number: 8400336
    Abstract: A method for parallel context modeling through reordering the bits of an input sequence to form groups of bits in accordance with a context model-specific reordering schedule. The reordering schedule is developed such that the groups of bits are formed to satisfy two conditions: first, that the context for each of the bits in a group of bits is different from the context of each of the other bits in that group; and second, that the context of each of the bits in that group is determined independently from each of the other bits in that group. The parallel context modeling may be used in encoding or decoding operations.
    Type: Grant
    Filed: April 19, 2011
    Date of Patent: March 19, 2013
    Assignee: Research In Motion Limited
    Inventors: Dake He, Gaëlle Christine Martin-Cocher, Gergely Ferenc Korodi
  • Patent number: 8392200
    Abstract: A complex analysis filterbank is implemented by obtaining an input audio signal as a plurality of N time-domain input samples. Pair-wise additions and subtractions of the time-domain input samples are performed to obtain first and second groups of intermediate samples, each group having N/2 intermediate samples. The signs of odd-indexed intermediate samples in the second group are then inverted. A first transform is applied to the first group of intermediate samples to obtain a first group of output coefficients in the frequency domain. A second transform is applied to the second group of intermediate samples to obtain an intermediate second group of output coefficients in the frequency domain. The order of coefficients in the intermediate second group of output coefficients is then reversed to obtain a second group of output coefficients. The first and second groups of output coefficients may be stored and/or transmitted as a frequency domain representation of the audio signal.
    Type: Grant
    Filed: April 13, 2010
    Date of Patent: March 5, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Ravi Kiran Chivukula, Yuriy Reznik
  • Patent number: 8380495
    Abstract: The embodiments of a transcoding method, a transcoding device, and a communication apparatus are provided. The embodiment of a method includes: receiving a bit stream input from a sending end; determining an attribute of discontinuous transmission (DTX) used by a receiving end and a frame type of the input bit stream; and transcoding the input bit stream in a corresponding processing manner according to a determination result. Thereby, a corresponding transcoding operation is performed on the input bit stream according to the attribute of DTX used by the receiving end and the frame type of the input bit stream. In such a manner, input bit streams of various types can be processed, and the input bit streams can be correspondingly transcoded according to the requirements of the receiving end. Therefore, the average computational complexity and peak computational complexity can be effectively decreased without decreasing the quality of the synthesized speech.
    Type: Grant
    Filed: January 21, 2010
    Date of Patent: February 19, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Changchun Bao, Hao Xu, Fanrong Tang, Xiangyu Hu
  • Publication number: 20130030795
    Abstract: An encoding method of an encoder is provided. The encoder generates first MDCT coefficients by transforming an input signal, and generates MDCT indices by quantizing the first MDCT coefficients. The encoder generates second MDCT coefficients by dequantizing the MDCT indices, and calculates MDCT residual coefficients using differences between the first MDCT coefficients and the second MDCT coefficients. The encoder generates a residual index by encoding the MDCT residual coefficients, and generates gain indices corresponding to gains from the first MDCT coefficients and the second MDCT coefficients.
    Type: Application
    Filed: March 31, 2011
    Publication date: January 31, 2013
    Inventors: Jongmo Sung, Hyun Woo Kim, Hyun Joo Bae
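The quantize, dequantize, and residual steps described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the encoder claimed in the application: the MDCT itself is skipped (the input list stands in for the first MDCT coefficients), the uniform scalar quantizer, the step sizes, and the energy-ratio gain are all hypothetical choices, and the function names are invented for this example.

```python
import math


def quantize(values, step):
    """Uniform scalar quantizer (an assumed choice): value -> integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of the uniform quantizer: integer index -> reconstructed value."""
    return [i * step for i in indices]


def encode_with_residual(first_coeffs, step=0.5, residual_step=0.05):
    """Hypothetical sketch: indices, residual indices, and a gain."""
    # Quantize the first coefficients to obtain the indices.
    indices = quantize(first_coeffs, step)
    # Dequantize the indices to obtain the second coefficients.
    second_coeffs = dequantize(indices, step)
    # Residual = difference between first and second coefficients.
    residual = [a - b for a, b in zip(first_coeffs, second_coeffs)]
    # Encode the residual with a finer quantizer (assumed scheme).
    residual_indices = quantize(residual, residual_step)
    # Assumed gain definition: square root of the energy ratio.
    e1 = sum(a * a for a in first_coeffs)
    e2 = sum(b * b for b in second_coeffs)
    gain = math.sqrt(e1 / e2) if e2 > 0 else 1.0
    return indices, residual_indices, gain


def decode_with_residual(indices, residual_indices, step=0.5, residual_step=0.05):
    """Rebuild coefficients from the coarse indices plus the coded residual."""
    coarse = dequantize(indices, step)
    fine = dequantize(residual_indices, residual_step)
    return [c + f for c, f in zip(coarse, fine)]
```

With the residual added back, the reconstruction error per coefficient is bounded by half the residual step rather than half the coarse step, which is the point of coding the residual separately.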
  • Patent number: 8364477
    Abstract: A method (400, 500) and apparatus (220) seek to improve the intelligibility of speech emitted into a noisy environment. Formants are identified (426) and a perceptual frequency scale band is selected (502) that includes at least one of the identified formants. The SNR in each band is compared (504) to a threshold and, if the SNR for that band is less than the threshold, the method increases a formant enhancement gain for that band. A set of high pass filter gains (338) is combined (516) with the formant enhancement gains yielding combined gains that are then clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532), and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: January 29, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J Song, John C Johnson
  • Patent number: 8363809
    Abstract: A teleconference terminal apparatus including: an input unit which receives a speech signal; an analyzing unit which calculates a target size on a predetermined segment basis of a speech signal; a coding unit which codes the speech signal to generate a data stream, so that the coded data size on a predetermined segment basis becomes the target size corresponding to each predetermined segment; a stream transmitting unit which transmits to a network the data stream; a receiving unit which receives the data stream transmitted from another terminal apparatus; a filtering unit which determines whether segment data is to be decoded based on data size for each predetermined segment in the received data stream, the segment data being included in the data stream; a decoding unit which decodes segment data determined to be decoded to generate a speech signal; and an output unit which outputs the generated speech signal.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: January 29, 2013
    Assignee: Panasonic Corporation
    Inventor: Kojiro Ono
  • Patent number: 8352260
    Abstract: A system for a multimodal unification of articulation includes a voice signal modality to receive a voice signal, and a control signal modality which receives an input from a user and generates a control signal from the input which is selected from predetermined inputs directly corresponding to the phonetic information. The interactive voice based phonetic input system also includes a multimodal integration system to receive and integrate the voice signal and the control signal. The multimodal integration system delimits a context of a spoken utterance of the voice signal by using the control signal to preprocess and discretize into phonetic frames. A voice recognizer analyzes the voice signal integrated with the control signal to output a voice recognition result. This new paradigm helps overcome constraints found in interfacing mobile devices. Context information facilitates the handling of the commands in the application environment.
    Type: Grant
    Filed: September 10, 2009
    Date of Patent: January 8, 2013
    Inventor: Jun Hyung Sung
  • Publication number: 20130006618
    Abstract: The present invention relates to a speech processing apparatus, a speech processing method and a program which, when multichannel audio signals are downmixed and coded, prevent delay and an increase in the computation amount upon decoding of the audio signals. An inverse multiplexing unit (101) acquires coded data on which a BC parameter is multiplexed. An uncorrelated frequency-time transform unit (102) performs IMDCT transform and IMDST transform of frequency spectrum coefficients of a monaural signal (XM) obtained from this coded data to generate the monaural signal (XM) which is a time domain signal and a signal (XD?) which is substantially uncorrelated with this monaural signal (XM). The stereo synthesis unit (103) generates a stereo signal by synthesizing the monaural signal (XM) and the signal (XD?) using the BC parameter. The present invention is applicable to, for example, a speech processing apparatus which decodes a downmixed and coded stereo signal.
    Type: Application
    Filed: March 8, 2011
    Publication date: January 3, 2013
    Inventors: Yasuhiro Toguri, Shiro Suzuki, Jun Matsumoto, Yuuji Maeda, Yuuki Matsumura
  • Patent number: 8326606
    Abstract: A sound encoding device enabling the amount of delay to be kept small and the distortion between frames to be mitigated. In the sound encoding device, a window multiplication part (211) of a long analysis section (21) multiplies a long analysis frame signal of analysis length M1 by an analysis window, the resultant signal multiplied by the analysis window is outputted to an MDCT section (212), and the MDCT section (212) performs MDCT of the input signal to obtain the transform coefficients of the long analysis frame and outputs them to a transform coefficient encoding section (30). The window multiplication part (221) of a short analysis section (22) multiplies a short analysis frame signal of analysis length M2 (M2<M1) by an analysis window and the resultant signal multiplied by the analysis window is outputted to the MDCT section (222).
    Type: Grant
    Filed: October 25, 2005
    Date of Patent: December 4, 2012
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
  • Patent number: 8321207
    Abstract: For postprocessing spectral values which are based on a first transformation algorithm for converting the audio signal into a spectral representation, first a sequence of blocks of the spectral values representing a sequence of blocks of samples of the audio signal are provided.
    Type: Grant
    Filed: September 28, 2007
    Date of Patent: November 27, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Bernd Edler, Ralf Geiger, Christian Ertel, Johannes Hilpert, Harald Popp
  • Patent number: 8315859
    Abstract: A filter apparatus for filtering a time domain input signal to obtain a time domain output signal, which is a representation of the time domain input signal filtered using a filter characteristic having a non-uniform amplitude/frequency characteristic, comprises a complex analysis filter bank for generating a plurality of complex subband signals from the time domain input signals, a plurality of intermediate filters, wherein at least one of the intermediate filters of the plurality of the intermediate filters has a non-uniform amplitude/frequency characteristic, wherein the plurality of intermediate filters have a shorter impulse response compared to an impulse response of a filter having the filter characteristic, and wherein the non-uniform amplitude/frequency characteristics of the plurality of intermediate filters together represent the non-uniform filter characteristic, and a complex synthesis filter bank for synthesizing the output of the intermediate filters to obtain the time domain output signal.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: November 20, 2012
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 8315862
    Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: November 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh
  • Patent number: 8315863
    Abstract: A post filter and a decoder enabling improvement of the sound quality of a decoded signal even when the sound quality of the decoded signal differs from band to band are disclosed. A frequency converting section determines a decoded spectrum. A power spectrum computing section computes the power spectrum from the decoded spectrum. A correction band determining section determines the band in which the power spectrum is corrected according to layer information. A power spectrum correcting section corrects the power spectrum in the corrected band in such a way that the variation along the frequency axis is suppressed. An inverse converting section subjects the corrected power spectrum to inverse conversion to determine an autocorrelation function. An LPC analyzing section determines an LPC coefficient of the determined autocorrelation function.
    Type: Grant
    Filed: June 15, 2006
    Date of Patent: November 20, 2012
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
  • Patent number: 8311810
    Abstract: The delay in a multi-channel audio coding apparatus and a multi-channel audio decoding apparatus is reduced. The audio coding apparatus includes: a downmix signal generating unit that generates, in a time domain, a first downmix signal that is one of a 1-channel audio signal and a 2-channel audio signal from an input multi-channel audio signal; a downmix signal coding unit that codes the first downmix signal; a first t-f converting unit that converts the input multi-channel audio signal into a multi-channel audio signal in a frequency domain; and a spatial information calculating unit that generates spatial information for generating a multi-channel audio signal from a downmix signal.
    Type: Grant
    Filed: July 28, 2009
    Date of Patent: November 13, 2012
    Assignee: Panasonic Corporation
    Inventors: Tomokazu Ishikawa, Takeshi Norimatsu, Kok Seng Chong, Huan Zhou
  • Patent number: 8311843
    Abstract: A method of encoding a time-domain audio signal is presented. In the method, an electronic device receives the time-domain audio signal. The time-domain audio signal is transformed into a frequency-domain signal including a coefficient for each of a plurality of frequencies, which are grouped into frequency bands. For each frequency band, the energy of the band is determined, a scale factor for the band is determined based on the energy of the band, and the coefficients of the band are quantized based on the associated scale factor. The encoded audio signal is generated based on the quantized coefficients and the scale factors.
    Type: Grant
    Filed: August 24, 2009
    Date of Patent: November 13, 2012
    Assignee: Sling Media Pvt. Ltd.
    Inventor: Laxminarayana M. Dalimba
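The per-band steps in the abstract above (band energy, energy-derived scale factor, scaled quantization) might look something like the sketch below. It is a hedged illustration rather than the patented method: the time-to-frequency transform is omitted (the input list is assumed to already hold frequency-domain coefficients), the scale factor is taken to be the square root of the band energy, and the band edges, the number of quantization levels, and the function names are hypothetical.

```python
import math


def encode_bands(coeffs, band_edges, levels=7):
    """Return a list of (scale_factor, quantized_coeffs), one per band.

    band_edges lists band boundaries, e.g. [0, 2, 4, 6] for three bands.
    """
    encoded = []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        band = coeffs[start:end]
        # Energy of the band: sum of squared coefficients.
        energy = sum(c * c for c in band)
        # Assumed rule: scale factor = sqrt(energy); fall back to 1.0 for silence.
        scale = math.sqrt(energy) if energy > 0 else 1.0
        # |c| / scale <= 1, so every index lies in [-levels, levels].
        quantized = [round(c / scale * levels) for c in band]
        encoded.append((scale, quantized))
    return encoded


def decode_bands(encoded, levels=7):
    """Rebuild approximate coefficients from scale factors and indices."""
    out = []
    for scale, quantized in encoded:
        out.extend(q / levels * scale for q in quantized)
    return out
```

Because each band is normalized by its own scale factor, quiet bands and loud bands use the same index range, and the scale factors carry the level information.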
  • Patent number: 8311809
    Abstract: Synthesizing an output audio signal is provided on the basis of an input audio signal, the input audio signal comprising a plurality of input sub-band signals, wherein at least one input sub-band signal is transformed (T) from the sub-band domain to the frequency domain to obtain at least one respective transformed signal, wherein the at least one input sub-band signal is delayed and transformed (D, T) to obtain at least one respective transformed delayed signal, wherein at least two processed signals are derived (P) from the at least one transformed signal and the at least one transformed delayed signal, wherein the processed signals are inverse transformed (T⁻¹) from the frequency domain to the sub-band domain to obtain respective processed sub-band signals, and wherein the output audio signal is synthesized from the processed sub-band signals.
    Type: Grant
    Filed: April 14, 2004
    Date of Patent: November 13, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Erik Gosuinus Petrus Schuijers, Marc Willem Theodorus Klein Middelink, Arnoldus Werner Johannes Oomen, Leon Maria Van De Kerkhof
  • Patent number: 8306811
    Abstract: A method of embedding data into an audio signal provides a data sequence for embedding in the audio signal and computes masking thresholds for the audio signal from a frequency domain transform of the audio signal. The masking thresholds correspond to subbands of the audio signal, which are obtained from a masking model used to compress the audio signal. The method applies the masking threshold to the data sequence to produce a masked data sequence and inserts the masked data sequence in the audio signal to produce an embedded audio signal. A method of detecting data embedded in an audio signal analyzes the audio signal to estimate the masking threshold used in embedding the data and applies the estimated masking threshold to the audio signal to extract the embedded data.
    Type: Grant
    Filed: October 24, 2007
    Date of Patent: November 6, 2012
    Assignee: Digimarc Corporation
    Inventors: Ahmed Tewfik, Bin Zhu, Mitch Swanson
  • Patent number: 8290770
    Abstract: Provided are a method and apparatus for sinusoidal audio coding, which employs a tracking method for further effective coding of sinusoids extracted in the process of a sinusoidal analysis of parametric coding. The sinusoidal audio coding method includes: extracting sinusoids of a current frame by performing a sinusoidal analysis on an input audio signal; with respect to each of the extracted sinusoids, setting a mode selected from a birth mode in which a sinusoid is newly generated irrespective of sinusoids of a previous frame, a continuation mode in which the sinusoid is only one sinusoid continued from one of the sinusoids of the previous frame, and a branch mode in which the sinusoid is one of a plurality of sinusoids continued from one of the sinusoids of the previous frame; and coding the extracted sinusoids according to the selected mode. Accordingly, a plurality of sinusoids that can be continued from one previous track component are set to the continuation mode or the branch mode.
    Type: Grant
    Filed: February 5, 2008
    Date of Patent: October 16, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-suk Lee, Geon-hyoung Lee, Jae-one Oh, Chul-woo Lee, Jong-hoon Jeong
  • Patent number: 8290782
    Abstract: Digital audio samples are represented as a product of scale factor codes and corresponding quantity codes, sometimes referred to as exponent/mantissa format. To compress audio data, scale factors are organized by sample time and frequency, either by filtering or by frequency transformation, into a two-dimensional frame. The frame may be decomposed into “tiles” by partition. One or more such scale factor tiles are compressed using a two-dimensional orthogonal transformation such as a two-dimensional discrete cosine transform. Optional further encoding is applied to reduce redundancy. A decoding method and an encoded machine readable medium complement the method of encoding.
    Type: Grant
    Filed: July 24, 2008
    Date of Patent: October 16, 2012
    Assignee: DTS, Inc.
    Inventor: Dmitry V. Shmunk
  • Patent number: 8280730
    Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants is modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: May 25, 2005
    Date of Patent: October 2, 2012
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J. Song, John C. Johnson
  • Publication number: 20120245927
    Abstract: A method, system and machine readable medium for noise reduction is provided. The method includes: (1) receiving a noise corrupted signal; (2) transforming the noise corrupted signal to a time-frequency domain representation; (3) determining probabilistic bases for operation, the probabilistic bases being priors in a multitude of frequency bands calculated online; (4) adapting longer term internal states of the method; (5) calculating present distributions that fit data; (6) generating non-linear filters that minimize entropy of speech and maximize entropy of noise, thereby reducing the impact of noise while enhancing speech; (7) applying the filters to create a primary output in a frequency domain; and (8) transforming the primary output to the time domain and outputting a noise suppressed signal.
    Type: Application
    Filed: March 20, 2012
    Publication date: September 27, 2012
    Applicant: ON SEMICONDUCTOR TRADING LTD.
    Inventor: Jeffrey Paul BONDY
  • Publication number: 20120239387
    Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
    Type: Application
    Filed: March 17, 2011
    Publication date: September 20, 2012
    Applicant: International Business Machines Corporation
    Inventors: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
  • Patent number: 8271272
    Abstract: There is disclosed a scalable encoding device capable of increasing the conversion performance from a narrow-band LSP to a wide-band LSP (prediction accuracy when predicting the wide-band LSP from the narrow-band LSP) and realizing a high-performance band scalable LSP encoding. The device includes a conversion coefficient calculation unit (109) for calculating a conversion coefficient by using a narrow-band quantization LSP which has been outputted from a narrow-band LSP encoding unit (103) and a wide-band quantization LSP which has been outputted from a wide-band LSP encoding unit (107). The wide-band LSP encoding unit (107) multiplies the narrow-band quantization LSP with the conversion coefficient inputted from the conversion coefficient calculation unit (109) so as to convert it into a wide-band LSP. The wide-band LSP is multiplied by a weight coefficient to calculate a prediction wide-band LSP.
    Type: Grant
    Filed: April 19, 2005
    Date of Patent: September 18, 2012
    Assignee: Panasonic Corporation
    Inventors: Hiroyuki Ehara, Koji Yoshida
  • Patent number: 8271266
    Abstract: Computer implemented methods and computing systems wherein relationships of words or phrases within a textual corpus are assessed via frequencies of occurrence of particular words or phrases and via frequencies of co-occurrence of particular pairs of words or phrases within defined tracts of text from within the textual corpus.
    Type: Grant
    Filed: August 29, 2007
    Date of Patent: September 18, 2012
    Assignee: Waggener Edstrom Worldwide, Inc.
    Inventors: Daniel Gerard Gallagher, Jia Lin, Marc Stoffregen
  • Patent number: 8270618
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder, which, in the case of a low level decoder, only decodes the first and second downmix channels or, in the case of a high level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: September 18, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hölzer, Claus Spenger
  • Patent number: 8271270
    Abstract: A method, apparatus, and system for encoding or decoding a broadband voice signal are provided. The method includes extracting a linear prediction coefficient (LPC) from the broadband voice signal; outputting a linear prediction (LP) residual signal; pitch-searching a spectrum of the LP residual signal; extracting spectral magnitudes and phases of the LP residual signal, which correspond to a damping factor; obtaining, from among the extracted spectral magnitudes and phases, a first spectral magnitude and a first phase at which a power value of the LP residual signal is minimized; quantizing the first spectral magnitude and the first phase; and decoding the broadband voice signal. The apparatus includes a linear prediction coefficient (LPC) analyzer; an LPC inverse filter; a pitch searching unit; a sinusoidal analyzer; and a phase and spectral magnitude quantizer. The system includes a broadband voice encoding apparatus and a broadband voice decoding apparatus.
    Type: Grant
    Filed: August 14, 2007
    Date of Patent: September 18, 2012
    Assignees: Samsung Electronics Co., Ltd., Chungbuk National University Industry-Academic Cooperation Foundation
    Inventors: In-sung Lee, Jong-hark Kim, Gyu-hyeok Jeong, Sang-won Seo
  • Patent number: 8260613
    Abstract: A double talk detector for controlling the echo path estimation in a telecommunication system by indicating when a received coded speech signal is dominated by a non-echo signal; i.e., that so-called double talk exists. This is determined by extracting LSPs from a coded speech frame of the received coded speech signal when the signal power exceeds a first threshold value, converting each of said extracted LSPs into LSFs, and calculating the distance between each two adjacent LSFs. For each distance that is smaller than a second threshold, a spectral peak is located between the two LSFs, and it is determined whether said spectral peak is an echo or not. When a predetermined number of non-echo spectral peaks are located in the received speech signal, double talk will be indicated, and the echo path estimation may be disabled.
    Type: Grant
    Filed: February 21, 2007
    Date of Patent: September 4, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventor: Tonu Trump
  • Publication number: 20120215524
    Abstract: A tone determination device, which determines the tonality of an input signal, is capable of reducing calculation complexity. Therein a frequency conversion unit (101) converts the frequency of an input signal; a downsampling unit (102) carries out shortening processing which shortens the vector series length of the frequency-converted signal; a constancy determination unit (107) determines the constancy of the input signal; depending on the constancy of the input signal, a vector selection unit (104) selects either the vector series of the post-frequency conversion signal or the vector series after the shortening of the vector series length; a correlation analysis unit (105) uses the vector series selected by the vector selection unit (104) to obtain correlations; and a tone determination unit (106) uses the correlations to determine the tonality of the input signal.
    Type: Application
    Filed: October 26, 2010
    Publication date: August 23, 2012
    Applicant: PANASONIC CORPORATION
    Inventor: Kaoru Satoh
  • Patent number: 8244547
    Abstract: A signal bandwidth extension apparatus includes a determination unit which determines whether or not a peak component of the input signal is lacking in the band to be extended, and a control unit which controls the apparatus to extend the bandwidth when the determination unit determines that the peak component of the input signal is lacking in the band to be extended, and not to extend the bandwidth when the determination unit determines that the peak component is not lacking.
    Type: Grant
    Filed: August 28, 2009
    Date of Patent: August 14, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takashi Sudo, Kimio Miseki
  • Patent number: 8244527
    Abstract: A signature is extracted from the audio of a program received by a tunable receiver such that the signature characterizes the program. In order to extract the signature, blocks of the audio are converted to corresponding spectral moments. At least one of the spectral moments is then converted to the signature. Also, a test audio signal from a receiver is correlated to a reference audio signal by converting the test audio signal and the reference audio signal to corresponding test and reference spectra, determining test slopes corresponding to coefficients of the test spectrum and reference slopes corresponding to coefficients of the reference spectrum, and comparing the test slopes to the reference slopes in order to determine a match between the test audio signal and the reference audio signal.
    Type: Grant
    Filed: January 4, 2010
    Date of Patent: August 14, 2012
    Assignee: The Nielsen Company (US), LLC
    Inventors: Venugopal Srinivasan, Keqiang Deng, Daozheng Lu
  • Patent number: 8239190
    Abstract: A method of communicating speech comprising time-warping a residual low band speech signal to an expanded or compressed version of the residual low band speech signal, time-warping a high band speech signal to an expanded or compressed version of the high band speech signal, and merging the time-warped low band and high band speech signals to give an entire time-warped speech signal. In the low band, the residual low band speech signal is synthesized after time-warping of the residual low band signal while in the high band, an unwarped high band signal is synthesized before time-warping of the high band speech signal. The method may further comprise classifying speech segments and encoding the speech segments. The encoding of the speech segments may be one of code-excited linear prediction, noise-excited linear prediction or 1/8 frame (silence) coding.
    Type: Grant
    Filed: August 22, 2006
    Date of Patent: August 7, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Rohit Kapoor, Serafin Diaz Spindola
  • Patent number: 8234109
    Abstract: A method and system for hiding lost packets are disclosed.
    Type: Grant
    Filed: May 17, 2010
    Date of Patent: July 31, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Wuzhou Zhan
  • Publication number: 20120185242
    Abstract: A noise estimating apparatus estimates two types of noise spectra for removing a noise component using the two types of noise spectra. The noise estimating apparatus includes an A/D converter that converts an input speech signal to a digital signal, and a Fourier transformer that performs a discrete Fourier transform on the digital signal having a predetermined time length to obtain an input spectrum and a complex spectrum. The noise estimating apparatus also includes a noise spectrum storage device that stores the two types of noise spectra, including a mean noise spectrum and a compensation noise spectrum, and a noise estimator that estimates a new compensation noise spectrum and a new mean noise spectrum as new two types of noise spectra.
    Type: Application
    Filed: November 22, 2011
    Publication date: July 19, 2012
    Applicant: PANASONIC CORPORATION
    Inventor: Toshiyuki MORII
  • Publication number: 20120179458
    Abstract: Provided are an apparatus and method for estimating noise that changes with time. The apparatus may calculate a speech absence probability that indicates the possibility of the absence of speech in each frequency component of an input acoustic signal, may discriminate between a speech-dominant region and a noise region from the acoustic signals based on the speech absence probability, and may estimate noise according to the discrimination result.
    Type: Application
    Filed: November 1, 2011
    Publication date: July 12, 2012
    Inventors: Kwang-Cheol Oh, Jeong-Su Kim, Jae-Hoon Jeong, So-Young Jeong
  • Publication number: 20120179459
    Abstract: A method of pre-processing an audio signal transmitted to a user terminal via a communication network and an apparatus using the method are provided. The method of pre-processing the audio signal may prevent deterioration of the sound quality of the audio signal transmitted to the user terminal by pre-processing the audio signal, and by enabling a codec module that encodes the audio signal to classify the audio signal as a speech signal. Also, the method of pre-processing the audio signal may improve the probability that the codec module classifies a corresponding audio signal as speech when the audio signal is transmitted via the communication network, by pre-processing the audio signal using a speech codec.
    Type: Application
    Filed: March 21, 2012
    Publication date: July 12, 2012
    Applicant: REALNETWORKS, INC.
    Inventors: Jae Woong Jeong, Seop Hyeong Park, Jong Kyu Ryu
  • Patent number: 8214200
    Abstract: Methods and apparatus are disclosed for approximating an MDCT coefficient of a block of windowed sinusoid having a defined frequency, the block being multiplied by a window sequence and having a block length and a block index. A finite trigonometric series is employed to approximate the window sequence. A window summation table is pre-computed using the finite trigonometric series and the defined frequency of the sinusoid. A block phase is computed for each block with the defined frequency, the block length and the block index. An MDCT coefficient is approximated by the dot product of a phase vector computed using the block phase with a corresponding row of the window summation table.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: July 3, 2012
    Assignee: XFRM, Inc.
    Inventors: Richard C. Cabot, Matthew S. Ashman
  • Patent number: 8209188
    Abstract: A down-sampler 101 down-samples the sampling rate of an input signal from sampling rate FH to sampling rate FL. A base layer coder 102 encodes the sampling rate FL acoustic signal. A local decoder 103 decodes coding information output from base layer coder 102. An up-sampler 104 raises the sampling rate of the decoded signal to FH. A subtracter 106 subtracts the decoded signal from the sampling rate FH acoustic signal. An enhancement layer coder 107 encodes the signal output from subtracter 106 using a decoding result parameter output from local decoder 103.
    Type: Grant
    Filed: May 6, 2010
    Date of Patent: June 26, 2012
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
  • Patent number: 8204121
    Abstract: A memory optimization method for an MP3 decoder. In a pipeline structure for speeding matrix calculation in MP3 decoding, an output sequence of IMDCT calculation is altered so that matrix calculation is activated before completing the IMDCT calculation. A decoding control method allows pipeline processing in MP3 decoding, with decoding procedures for subsequent granules activated while the current granule is still being processed in the matrix calculation.
    Type: Grant
    Filed: December 23, 2004
    Date of Patent: June 19, 2012
    Assignee: VIA Technologies, Inc.
    Inventors: Zhou Jin Feng, David Gao
  • Publication number: 20120136653
    Abstract: A transform coding apparatus includes an input scale factor calculating section that calculates an input scale factor having a predetermined number of scale factors associated with an input spectrum as an element, and a codebook that stores a plurality of scale factor candidates having a predetermined number of elements and outputs one scale factor candidate. The transform coding apparatus also includes an error calculating section that calculates an error on a per element basis, a weighted error calculating section that determines a weight on a per element basis and calculates a sum of products of the error and the weight to calculate a weighted error, and a searching section that searches for a scale factor candidate that minimizes the weighted error in the codebook.
    Type: Application
    Filed: February 7, 2012
    Publication date: May 31, 2012
    Applicant: PANASONIC CORPORATION
    Inventors: Masahiro OSHIKIRI, Tomofumi YAMANASHI
  • Patent number: 8190426
    Abstract: An audio enhancement refines a short-time spectrum. The refinement may reduce overlap between audio sub-bands. The sub-bands are transformed into sub-band short-time spectra. A portion of the spectra are time-delayed. The sub-band short-time spectrum and the time-delayed portion are filtered to obtain a refined sub-band short-time spectrum. The refined spectrum improves audio processing.
    Type: Grant
    Filed: November 30, 2007
    Date of Patent: May 29, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Mohamed Krini, Gerhard Uwe Schmidt
  • Patent number: 8190425
    Abstract: An audio encoder encodes a combined channel (e.g., a sum channel) for a group of plural physical audio channels. The encoder determines plural parameters for representing individual physical channels of the group as modified versions of the encoded combined channel. The plural parameters comprise ratios of power in each individual channel to power in the combined channel (e.g., a ratio of the power of a right channel to the power of the combined channel, and a ratio of the power of the left channel to the power of the combined channel). The plural parameters can include a complex parameter. The combined channel and the plural parameters facilitate reconstruction at the audio decoder of source channels. An audio decoder performs a forward complex transform on the multi-channel audio data and reconstructs plural channels from the multi-channel audio data. The decoder can maintain second-order statistics for the source channels.
    Type: Grant
    Filed: January 20, 2006
    Date of Patent: May 29, 2012
    Assignee: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Wei-Ge Chen
  • Patent number: 8185381
    Abstract: A unified filter bank for performing signal conversions may include an interface that receives signal conversion commands in relation to multiple types of compressed audio bitstreams. The unified filter bank may also include a reconfigurable transform component that performs a transform as part of signal conversion for the multiple types of compressed audio bitstreams. The unified filter bank may also include complementary modules that perform complementary processing as part of the signal conversion for the multiple types of compressed audio bitstreams. The unified filter bank may also include an interface command controller that controls the configuration of the reconfigurable transform component and the complementary modules.
    Type: Grant
    Filed: July 16, 2008
    Date of Patent: May 22, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Sang-Uk Ryu, Eddie L. T. Choy, Nidish Ramachandra Kamath, Samir Kumar Gupta, Suresh Devalapalli
  • Publication number: 20120116753
    Abstract: In order to reduce interference in an audio signal during a call on a mobile communication device, a plurality of transforms of the audio signal is performed, each transform containing phase information and amplitude information of corresponding samples of the audio signal. The results of the transforms are then averaged in order to generate a compensation signal that can be subtracted from the audio signal.
    Type: Application
    Filed: November 3, 2011
    Publication date: May 10, 2012
    Applicant: SONY ERICSSON MOBILE COMMUNICATIONS AB
    Inventors: Jonny STRANDH, Kaj ULLÉN
  • Publication number: 20120095754
    Abstract: Provided are a method and an apparatus for encoding and decoding an audio signal. A method for encoding an audio signal includes receiving a transformed audio signal, dividing the transformed audio signal into a plurality of subbands, performing a first sinusoidal pulse coding operation on the subbands, determining a performance region of a second sinusoidal pulse coding operation among the subbands on the basis of coding information of the first sinusoidal pulse coding operation, and performing the second sinusoidal pulse coding operation on the determined performance region, wherein the first sinusoidal pulse coding operation is performed variably according to the coding information. Accordingly, it is possible to further improve the quality of a synthesized signal by considering the sinusoidal pulse coding of a lower layer when encoding or decoding an audio signal in an upper layer by a layered sinusoidal pulse coding scheme.
    Type: Application
    Filed: May 19, 2010
    Publication date: April 19, 2012
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Mi-Suk Lee, Heesik Yang, Hyun-Woo Kim, Jongmo Sung, Hyun-Joo Bae, Byung-Sun Lee
  • Publication number: 20120089389
    Abstract: In a CELP coder, a combined innovation codebook coding device comprises a pre-quantizer of a first, adaptive-codebook excitation residual, and a CELP innovation-codebook search module responsive to a second excitation residual produced from the first, adaptive-codebook excitation residual. In a CELP decoder, a combined innovation codebook comprises a de-quantizer of pre-quantized coding parameters into a first excitation contribution, and a CELP innovation-codebook structure responsive to CELP innovation-codebook parameters to produce a second excitation contribution.
    Type: Application
    Filed: April 11, 2011
    Publication date: April 12, 2012
    Inventor: Bruno Bessette
  • Patent number: 8155954
    Abstract: A filter bank device for generating a complex spectral representation of a discrete-time signal includes a generator for generating a block-wise real spectral representation, which, for example, implements an MDCT, to obtain temporally successive blocks of real spectral coefficients. The output values of this spectral conversion device are fed to a post-processor for post-processing the block-wise real spectral representation to obtain an approximated complex spectral representation having successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and by a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is determined by combining at least two real spectral coefficients.
    Type: Grant
    Filed: March 4, 2010
    Date of Patent: April 10, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernd Edler, Stefan Geyersberger
  • Patent number: 8145492
    Abstract: A behavior control system of a robot for learning a phoneme sequence includes a sound inputting device inputting a phoneme sequence, a sound signal learning unit operable to convert the phoneme sequence into a sound synthesis parameter and to learn or evaluate a relationship between a sound synthesis parameter of a phoneme sequence that is generated by the robot and a sound synthesis parameter used for sound imitation, and a sound synthesizer operable to generate a phoneme sequence based on the sound synthesis parameter obtained by the sound signal learning unit.
    Type: Grant
    Filed: April 6, 2005
    Date of Patent: March 27, 2012
    Assignee: Sony Corporation
    Inventor: Masahiro Fujita
  • Patent number: RE44126
    Abstract: Estimates of spectral magnitude and phase are obtained by an estimation process using spectral information from analysis filter banks such as the Modified Discrete Cosine Transform. The estimation process may be implemented by convolution-like operations with impulse responses. Portions of the impulse responses may be selected for use in the convolution-like operations to trade off between computational complexity and estimation accuracy. Mathematical derivations of analytical expressions for filter structures and impulse responses are disclosed.
    Type: Grant
    Filed: November 15, 2011
    Date of Patent: April 2, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Corey I. Cheng, Michael J. Smithers, David N. Lathrop