Transformation Patents (Class 704/203)
  • Patent number: 7856353
    Abstract: Method for processing speech signal data. A speech signal is divided into frames. Each frame is characterized by a frame number T representing a unique interval of time. Each speech signal is characterized by a power spectrum with respect to frame T and frequency band ω. A speech segment and a reverberation segment of the speech signal are determined. L filter coefficients W(k) (k=1, 2, ..., L) respectively corresponding to L frames immediately preceding frame T are computed such that the L filter coefficients minimize a function Φ that is a linear combination of a sum of squares of a residual speech power in the reverberation segment and a sum of squares of a subtracted speech power in the speech segment. The computed L filter coefficients are stored within storage media of the computing apparatus.
    Type: Grant
    Filed: August 7, 2007
    Date of Patent: December 21, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
  • Publication number: 20100312551
    Abstract: Disclosed is a method of processing a signal, which includes receiving at least one of a first signal and a second signal, receiving mode information, and coding the at least one of the first signal and the second signal using at least one of a first coding scheme and a second coding scheme according to the mode information, wherein the mode information is information for indicating that a prescribed mode corresponds to which one of at least three modes.
    Type: Application
    Filed: October 15, 2008
    Publication date: December 9, 2010
    Applicants: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
    Inventors: Hyen-O Oh, Hong Goo Kang, Chang Heon Lee, Sang Wook Shin, Yang Won Jung
  • Patent number: 7831420
    Abstract: A speech converter in a speech processing system modifies various aspects of input speech. The speech converter receives a formants signal representing an input speech signal. The speech converter may also receive a formant scaling command or a user selection of one of multiple control signals, each specifying a manner of modifying one or more of the received signals (i.e., formants, voicing, pitch, gain). The speech converter modifies at least one of the formants, voicing, pitch, and/or gain signals as specified by the selected voice font.
    Type: Grant
    Filed: April 4, 2006
    Date of Patent: November 9, 2010
    Assignee: QUALCOMM Incorporated
    Inventors: Daniel J. Sinder, Ananthapadmanabhan Aasanipalai Kandhadai
  • Patent number: 7826870
    Abstract: Separating mixed signals includes receiving the mixed signals from signal sources transmitting from a number of cells. A signal source is operable to transmit a source signal, and a mixed signal comprises at least a subset of the source signals. A complex mixing matrix is established from the mixed signals. The complex mixing matrix describes mixing the source signals to yield the mixed signals. The number of cells is estimated from the mixed signals. The mixed signals are separated using the complex mixing matrix and the estimated number of cells.
    Type: Grant
    Filed: May 3, 2006
    Date of Patent: November 2, 2010
    Assignee: Raytheon Company
    Inventors: David B. Shu, Yuri Owechko
  • Patent number: 7818167
    Abstract: An audio information retrieval method, medium, and system that can rapidly retrieve audio information, even in noisy environments, by extracting a modulation spectrum that is robust against noise, converting features of the extracted modulation spectrum into hash bits, and using a hash table. The audio information retrieval method may include extracting a modulation spectrum from audio data of a compressed domain, converting the extracted modulation spectrum into fingerprint bits, arranging the fingerprint bits in a form of a hash table, converting a received query into an address by a hash function corresponding to the query, and retrieving the audio information by referring to the hash table.
    Type: Grant
    Filed: August 29, 2006
    Date of Patent: October 19, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyoung Gook Kim, Ki Wan Eom, Ji Yeun Kim, Yuan Yuan She, Xuan Zhu
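The last three steps of the abstract above (fingerprint bits, hash table, query lookup) can be sketched roughly as follows. This is only an illustration: the feature values stand in for modulation-spectrum features that are assumed to be extracted already, and the bit rule (sign of the difference between adjacent features), the function names, and the track labels are hypothetical choices, not taken from the patent.

```python
from collections import defaultdict

def to_fingerprint(features):
    """Pack one frame of features into an integer key.

    Assumed bit rule: bit i is 1 when features[i] > features[i + 1].
    """
    key = 0
    for left, right in zip(features, features[1:]):
        key = (key << 1) | (1 if left > right else 0)
    return key

def build_table(tracks):
    """Map each fingerprint key to a list of (track_id, frame_index)."""
    table = defaultdict(list)
    for track_id, frames in tracks.items():
        for index, features in enumerate(frames):
            table[to_fingerprint(features)].append((track_id, index))
    return table

def retrieve(table, query_features):
    """Hash the query with the same function and read the matching bucket."""
    return table.get(to_fingerprint(query_features), [])

# Hypothetical data: two tracks with a few frames of features each.
tracks = {
    "song_a": [[0.9, 0.2, 0.5, 0.1], [0.1, 0.4, 0.3, 0.8]],
    "song_b": [[0.2, 0.6, 0.7, 0.9]],
}
table = build_table(tracks)

# A noisy copy of song_a's first frame keeps the same ordering of
# neighbouring features, so it produces the same bits.
hits = retrieve(table, [0.85, 0.25, 0.45, 0.15])
```

Because only the ordering of neighbouring features decides each bit, small additive noise leaves the key unchanged in this toy example; a practical system would also need to tolerate occasional bit flips.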
  • Publication number: 20100262421
    Abstract: Provided is an encoding device which improves the sound quality of a stereo signal while maintaining a low bit rate. The encoding device includes: an LP inverse filter (121) which LP-inverse-filters a left signal L(n) by using an inverse quantization linear prediction coefficient AdM(z) of a monaural signal; a T/F conversion unit (122) which converts the left sound source signal Le(n) from a temporal region to a frequency region; an inverse quantizer (123) which inverse-quantizes encoded information Mqe; spectrum division units (124, 125) which divide a high-frequency component of the sound source signal Mde(f) and the left signal Le(f) into a plurality of bands; and scale factor calculation units (126, 127) which calculate scale factors ai and ssi by using a monaural sound source signal Mdeh,i(f), a left sound source signal Leh,i(f), Mdeh,i(f), and right sound source signal Reh,i(f) of each divided band.
    Type: Application
    Filed: November 4, 2008
    Publication date: October 14, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Kok Seng Chong, Koji Yoshida, Masahiro Oshikiri
  • Patent number: 7809561
    Abstract: The present invention provides a method and apparatus for verification of speaker authentication. A method for verification of speaker authentication, comprising: inputting an utterance containing a password that is spoken by a speaker; extracting an acoustic feature vector sequence from said inputted utterance; DTW-matching said extracted acoustic feature vector sequence and a speaker template enrolled by an enrolled speaker; calculating each of a plurality of local distances between said DTW-matched acoustic feature vector sequence and said speaker template; nonlinear-transforming said each local distance calculated to give more weights on small local distances; calculating a DTW-matching score based on said plurality of local distances nonlinear-transformed; and comparing said matching score with a predefined discriminating threshold to determine whether said inputted utterance is an utterance containing a password spoken by the enrolled speaker.
    Type: Grant
    Filed: March 28, 2007
    Date of Patent: October 5, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Jian Luan, Jie Hao
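The verification pipeline in the abstract above (DTW alignment, local distances, nonlinear transform, score, threshold) might look like the sketch below. The abstract does not state the transform, so `exp(-alpha * d)` is an assumed stand-in that maps small distances to values near 1 and is most sensitive in that range; the function names, `alpha`, and the threshold value are likewise hypothetical.

```python
import math

def euclidean(a, b):
    """Local distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_path(seq, template):
    """Return the optimal DTW alignment as a list of (i, j) index pairs."""
    n, m = len(seq), len(template)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path

def matching_score(seq, template, alpha=1.0):
    """Average of the transformed local distances along the DTW path.

    Assumed transform: exp(-alpha * d), a similarity in (0, 1].
    """
    path = dtw_path(seq, template)
    transformed = [math.exp(-alpha * euclidean(seq[i], template[j])) for i, j in path]
    return sum(transformed) / len(transformed)

def verify(seq, template, threshold=0.8):
    """Accept the utterance when the score reaches the (assumed) threshold."""
    return matching_score(seq, template) >= threshold
```

With this choice the score is a similarity rather than a distance, so the comparison accepts scores at or above the threshold; a distance-style transform would flip the inequality.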
  • Patent number: 7809146
    Abstract: Problems of permutation can be solved with high accuracy without utilizing knowledge about original signals or information concerning positions of microphones and the like when each one of plural signals mixed in an audio signal is separated using independent component analysis. A short-time Fourier transformation section generates spectrograms of observation signals from observation signals in time domain. A signal separation section separates the spectrograms of the observation signals into spectrograms of respective signals, to generate spectrograms of separate signals. A permutation problem solution section calculates a scale corresponding to the degree of permutation, e.g., a Kullback-Leibler information amount calculated by use of a multidimensional probability density function or multidimensional kurtosis, from substantially the whole of the spectrograms of the separate signals.
    Type: Grant
    Filed: June 1, 2006
    Date of Patent: October 5, 2010
    Assignee: Sony Corporation
    Inventors: Atsuo Hiroe, Keiichi Yamada
  • Publication number: 20100250244
    Abstract: There is provided an encoder capable of improving inter-channel prediction (ICP) performance in scalable stereo sound encoding using an ICP. In the encoder, ICP analysis units (113, 114, 115) use, as reference signal candidates, a frequency coefficient (sL?(f)) in the low-band portion of a side residual signal, a frequency coefficient (mM,i(f)) in each sub-band portion of a monaural residual signal, and a frequency coefficient (mL(f)) in the low-band portion of the monaural residual signal, respectively, and perform an ICP analysis between the respective these candidates and a frequency coefficient (sM,i(f)) in each sub-band portion of the side residual signal to generate first, second, and third ICP coefficients.
    Type: Application
    Filed: October 31, 2008
    Publication date: September 30, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Haishan Zhong, Zongxian Liu, Kok Seng Chong, Koji Yoshida
  • Patent number: 7805293
    Abstract: A voice band correcting apparatus in which the signal level of limit bands is amplified by a correction filter, the signal level of a correction signal supplied is compared by a level detector to a preset level, and the result of decision is sent as level information to a coefficient controller, where the signal level is adjusted in a controlled manner. The high-quality broadband signal may be obtained on correction without degrading the quality of a communication signal ascribable to excess amplification.
    Type: Grant
    Filed: February 26, 2004
    Date of Patent: September 28, 2010
    Assignee: Oki Electric Industry Co., Ltd.
    Inventors: Masashi Takada, Yoshinari Murakami
  • Patent number: 7802101
    Abstract: A system and method of retrieving a watermark in a watermarked signal are disclosed. The watermarked signal comprises odd and even overlapped blocks where the watermark is contained in the even blocks. The method comprises, for each k-th even block, subtracting the two adjacent odd numbered blocks from the k-th even block of the watermarked signal to retrieve s *k(n), transforming s *k(n) into the frequency domain to generate S k(f), calculating a phase of S k(f) as ? (f) and a phase of Sk(f) as ?(f), calculating the difference ? (f) between ? (f) and ?(f), unwrapping ? (f) to obtain the phase modulation {tilde over (?)} k(f), and using a Viterbi search to retrieve the watermark embedded in {tilde over (?)} k(f).
    Type: Grant
    Filed: March 30, 2009
    Date of Patent: September 21, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: James David Johnston, Shyh-Shiaw Kuo, Schuyler Reynier Quackenbush, William Turin
  • Publication number: 20100228542
    Abstract: A method and system for hiding lost packets are disclosed.
    Type: Application
    Filed: May 17, 2010
    Publication date: September 9, 2010
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Wuzhou ZHAN
  • Publication number: 20100228541
    Abstract: A subband coding apparatus carries out subband coding which prevents deterioration in coding performance and improves audio quality of decoded signals. The subband coding apparatus includes a low-band coding section (103) to code a low-band spectrum (S13). A low-band decoding section (106) decodes a low-band coded data (S14) and outputs a decoded low-band spectrum (S18) to a high-band coding section (107). A spectrum rearranging section (105) rearranges to make each frequency component of a high-band spectrum (S16) in reverse order on the frequency axis and outputs a modified high-band spectrum (S17) after rearranging to a high-band coding section (107). The high-band coding section (107) uses the decoded low-band spectrum (S18) output from the low-band decoding section (106) to code the modified high-band spectrum (S17) output from the spectrum rearranging section (105).
    Type: Application
    Filed: November 29, 2006
    Publication date: September 9, 2010
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventor: Masahiro Oshikiri
  • Patent number: 7787632
    Abstract: The invention relates to methods and units supporting a multichannel audio extension. In order to allow an efficient extension requiring a low computational complexity, it is proposed that at an encoding end, at least state information is provided as side information for a provided mono audio signal (M) generated out of a multichannel audio signal. The state information indicates for each of a plurality of frequency bands how a predetermined or equally provided gain value is to be applied in the frequency domain to the mono audio signal (M) for obtaining first and a second channel signals (L,R) of a reconstructed multichannel audio signal.
    Type: Grant
    Filed: March 21, 2003
    Date of Patent: August 31, 2010
    Assignee: Nokia Corporation
    Inventor: Juha Ojanpera
  • Patent number: 7783477
    Abstract: A method for modelling, i.a. analyzing and/or synthesizing, a windowed signal such as sound or speech signals, by computing the frequencies and complex amplitudes from the signal using a nonlinear least squares method is disclosed. The computational complexity is reduced by taking into account the bandlimited property of a window.
    Type: Grant
    Filed: December 1, 2004
    Date of Patent: August 24, 2010
    Assignee: Universiteit Antwerpen
    Inventor: Wim D'Haes
  • Patent number: 7774746
    Abstract: Generating code is disclosed. A specification of one or more translation patterns is received. The one or more translation patterns are used to generate at least a portion of code associated with a translator. Using the one or more translation patterns to generate at least a portion of code associated with the translator results in the translator being configured to create a target object model. Creating the target object model includes populating one or more elements of the target object model in a processing order at least in part associated with an order of elements in the one or more translation patterns.
    Type: Grant
    Filed: April 19, 2006
    Date of Patent: August 10, 2010
    Assignee: Apple, Inc.
    Inventors: Philip Andrew Mansfield, Michael Robert Levy
  • Publication number: 20100198586
    Abstract: A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representation or the second sampled representation.
    Type: Application
    Filed: March 23, 2009
    Publication date: August 5, 2010
    Applicant: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e. V.
    Inventors: Bernd Edler, Sascha Disch, Ralf Geiger, Stefan Bayer, Ulrich Kraemer, Guillaume Fuchs, Max Neuendorf, Markus Multrus, Gerald Schuller, Harald Popp
  • Publication number: 20100182510
    Abstract: A smoothing method for suppressing fluctuating artifacts in the reduction of interference noise includes the following steps: providing short-term spectra for a sequence of signal frames, transforming each short-term spectrum by way of a forward transformation which describes the short-term spectrum using transformation coefficients that represent the short-term spectrum subdivided into its coarse and fine structures; smoothing the transformation coefficients with the respective same coefficient indices by combining at least two successive transformed short-term spectra; and transforming the smoothed transformation coefficients into smoothed short-term spectra by way of a backward transformation.
    Type: Application
    Filed: June 25, 2008
    Publication date: July 22, 2010
    Applicants: RUHR-UNIVERSITÄT BOCHUM, SIEMENS AUDIOLOGISCHE TECHNIK GMBH
    Inventors: Timo Gerkmann, Colin Breithaupt, Rainer Martin
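The three steps in the abstract above (forward transform, smoothing of same-index coefficients across frames, backward transform) can be illustrated with the sketch below. It assumes a cepstrum-like transform (a DCT of the log spectrum) and first-order recursive averaging; the abstract names neither, so both, along with the `betas` parameter, are assumptions made for illustration.

```python
import math

def dct(x):
    """Orthonormal DCT-II, used here as the forward transform."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of dct() (orthonormal DCT-III), used as the backward transform."""
    n = len(c)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                       * math.cos(math.pi * (i + 0.5) * k / n)
                       for k in range(n)))
    return out

def smooth_spectra(spectra, betas):
    """Smooth a sequence of short-term spectra in the transform domain.

    spectra: list of frames, each a list of positive spectral values.
    betas[k]: smoothing constant in [0, 1) for coefficient index k;
              larger values mean heavier smoothing for that index.
    """
    smoothed = []
    state = None
    for spectrum in spectra:
        coeffs = dct([math.log(v) for v in spectrum])
        if state is None:
            state = coeffs  # first frame initializes the recursion
        else:
            # Combine the previous smoothed frame with the current frame,
            # coefficient index by coefficient index.
            state = [b * s + (1.0 - b) * c
                     for b, s, c in zip(betas, state, coeffs)]
        smoothed.append([math.exp(v) for v in idct(state)])
    return smoothed
```

Low coefficient indices describe the coarse spectral envelope and high indices the fine structure, so choosing a different `betas[k]` per index lets the two be smoothed by different amounts.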
  • Publication number: 20100185440
    Abstract: The embodiments of a transcoding method, a transcoding device, and a communication apparatus are provided. The embodiment of a method includes: receiving a bit stream input from a sending end; determining an attribute of discontinuous transmission (DTX) used by a receiving end and a frame type of the input bit stream; and transcoding the input bit stream in a corresponding processing manner according to a determination result. Thereby, a corresponding transcoding operation is performed on the input bit stream according to the attribute of DTX used by the receiving end and the frame type of the input bit stream. In such a manner, input bit streams of various types can be processed, and the input bit streams can be correspondingly transcoded according to the requirements of the receiving end. Therefore, the average computational complexity and peak computational complexity can be effectively decreased without decreasing the quality of the synthesized speech.
    Type: Application
    Filed: January 21, 2010
    Publication date: July 22, 2010
    Inventors: Changchun Bao, Hao Xu, Fanrong Tang, Xiangyu Hu
  • Patent number: 7756700
    Abstract: Pitch estimation and classification into voiced, unvoiced and transitional speech were performed by a spectro-temporal auto-correlation technique. A peak picking formula was then employed. A weighing function was then applied to the power spectrum. The harmonics weighted power spectrum underwent mel-scaled band-pass filtering, and the log-energy of the filter's output was discrete cosine transformed to produce cepstral coefficients. A within-filter cubic-root amplitude compression was applied to reduce amplitude variation without compromise of the gain invariance properties.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: July 13, 2010
    Assignee: The Regents of the University of California
    Inventors: Kenneth Rose, Liang Gu
  • Publication number: 20100169083
    Abstract: Apparatus and methods for creating a composite data source having a common data representation from disparate sources of voice data. Data transmission links are established to heterogeneous messaging data sources, requests for voice data are sent using data access protocols, the voice data is received, and a set of voice data transformation rules are selectively applied to the voice data to transform the data into a common data representation. The common data representation can also be used as a source for reporting and graphical displays to monitor the operational aspects of the sources of voice data.
    Type: Application
    Filed: March 9, 2010
    Publication date: July 1, 2010
    Applicant: Computer Associates Think, Inc.
    Inventors: Joseph A. Rossi, Gaetane A. Scouras
  • Publication number: 20100169082
    Abstract: The intelligibility of speech signals is improved in the many situations where a voice signal is communicated or stored. Means and methods are disclosed for developing a scheme with high voice signal intelligibility without sacrificing the voice quality. The disclosed method comprises certain steps, including, but not limited to: Learning the noise on near-end side and enhancing the far-end voice as a function of the noise type and noise level on the near-end side. The disclosed method and apparatus are especially useful to increase the intelligibility of the communication device's loudspeaker output. The invention includes processing of an input speech signal to generate an enhanced intelligent signal. The FFT spectrum of the speech received from the far-end is modified in accordance with the LPC spectrum of the local background noise to generate an enhanced intelligent signal.
    Type: Application
    Filed: February 12, 2010
    Publication date: July 1, 2010
    Inventors: Alon Konchitsky, Alberto D. Berstein, Sandeep Kulakcherla, William Ribble
  • Publication number: 20100169081
    Abstract: An encoding device includes: a frequency region converter which converts an inputted audio signal into a frequency region; a band selector which selects a quantization object band from a plurality of sub bands obtained by dividing the frequency region; and a shape quantizer which quantizes the shape of the frequency region parameter of the quantization object band. When a prediction encoding presence/absence determiner determines that the number of common sub bands between the quantization object band and the quantization object band selected in the past is not smaller than a predetermined value, a gain quantizer performs prediction encoding on the gain of the frequency region parameter of the quantization object band. When the number of common sub bands is smaller than the predetermined value, the gain quantizer non-predictively encodes the gain of the frequency region parameter of the quantization object band.
    Type: Application
    Filed: December 12, 2007
    Publication date: July 1, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Tomofumi Yamanashi, Masahiro Oshikiri
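The switching rule in the abstract above (predictive gain coding when enough sub-bands are shared with the previously selected band, direct coding otherwise) can be sketched as below. Quantization itself is omitted, and the function name, the residual-against-previous-gain predictor, and the `min_common` value are hypothetical choices rather than details from the application.

```python
def encode_gains(current_gains, prev_gains, min_common=3):
    """Choose predictive or direct coding for the gains of the selected sub-bands.

    current_gains / prev_gains: dict mapping sub-band index -> gain for the
    sub-bands selected in the current and the previous frame.
    Returns (mode, payload), where payload maps sub-band index -> value to code.
    """
    common = set(current_gains) & set(prev_gains)
    if len(common) >= min_common:
        # Predictive coding (assumed form): code the difference from the
        # previous gain; sub-bands without a previous gain are predicted as 0.
        payload = {band: gain - prev_gains.get(band, 0.0)
                   for band, gain in current_gains.items()}
        return "predictive", payload
    # Too little overlap with the past selection: code the gains directly.
    return "direct", dict(current_gains)

previous = {4: 1.5, 5: 2.0, 6: 1.0, 7: 0.5}
overlapping = {3: 1.0, 4: 2.0, 5: 2.5, 6: 1.0}   # shares sub-bands 4, 5, 6
disjoint = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}      # shares none

mode_a, payload_a = encode_gains(overlapping, previous)
mode_b, payload_b = encode_gains(disjoint, previous)
```

Prediction only pays off when the past frame carries information about the same sub-bands, which is why the overlap count gates the choice.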
  • Publication number: 20100161320
    Abstract: An apparatus and method for adaptive sub-band allocation of spectral coefficients are disclosed. The sizes of sub-bands are determined according to the distribution of spectral coefficients transformed from an input speech/audio signal to perform more elaborate quantization in units of sub-bands. Thus, quantization noise of the spectral coefficients is reduced, and sound quality in a frequency region is enhanced, thereby improving the quality of the signal.
    Type: Application
    Filed: September 9, 2009
    Publication date: June 24, 2010
    Inventors: Hyun Woo KIM, Hyun Joo BAE, Byung Sun LEE
  • Publication number: 20100161321
    Abstract: A coding apparatus reduces a circuit scale and the amount of coding processing calculation. A frequency domain conversion section performs a frequency analysis of the signal sampled at a sampling rate Fx with an analysis length of 2·Na and calculates first spectrum S1(k) (0≤k<Na). A band extension section extends the effective frequency band of first spectrum S1(k) to 0≤k<Nb so that a new spectrum can be assigned to the extended area following the frequency k=Na of first spectrum S1(k). An extended spectrum assignment section assigns extended spectrum S1′(k) (Na≤k<Nb) input to the extended frequency band from the outside. A spectral information specification section outputs information necessary to specify extended spectrum S1′(k) out of the spectrum given from the extended spectrum assignment section as a code.
    Type: Application
    Filed: February 18, 2010
    Publication date: June 24, 2010
    Applicant: PANASONIC CORPORATION
    Inventor: Masahiro OSHIKIRI
  • Patent number: 7742746
    Abstract: A mobile audio device (for example, a cellular telephone, personal digital audio player, or MP3 player) performs Audio Dynamic Range Control (ADRC) and Automatic Volume Control (AVC) to increase the volume of sound emitted from a speaker of the mobile audio device so that faint passages of the audio will be more audible. This amplification of faint passages occurs without overly amplifying other louder passages, and without substantial distortion due to clipping. Multi-Microphone Active Noise Cancellation (MMANC) functionality is, for example, used to remove background noise from audio information picked up on microphones of the mobile audio device. The noise-canceled audio may then be communicated from the device. The MMANC functionality generates a noise reference signal as an intermediate signal. The intermediate signal is conditioned and then used as a reference by the AVC process. The gain applied during the AVC process is a function of the noise reference signal.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: June 22, 2010
    Assignee: QUALCOMM Incorporated
    Inventors: Pei Xiang, Song Wang, Prajakt V. Kulkarni, Samir Kumar Gupta, Eddie L. T. Choy
  • Publication number: 20100145682
    Abstract: The present invention applies spectral flatness characteristic values to simplify psychoacoustic analysis of a sound signal. If the sound signal comprises a plurality of frames, the present invention calculates the energy of the sound signal in a frequency domain, calculates a plurality of spectral flatness, and decides to use a short-block or a long-block Modified Discrete Cosine Transform accordingly. If the sound signal comprises left and right channel signals, the present invention performs psychoacoustic analysis on the sound signal to count energy of the left and right channel signals in a frequency domain, counts spectral flatness of the left and right channel signals, and decides to use middle/side transform or left and right channel encoding to transform the left and right channel signals accordingly.
    Type: Application
    Filed: March 27, 2009
    Publication date: June 10, 2010
    Inventor: Yi-Lun Ho
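As a rough illustration of the block-switching decision in the abstract above, spectral flatness is commonly defined as the geometric mean of a power spectrum divided by its arithmetic mean. The sketch below uses that definition; the decision rule (a flat, noise-like spectrum selects the short block), the threshold, and the function names are assumptions, since the abstract does not give the actual criterion.

```python
import math

def spectral_flatness(power, eps=1e-12):
    """Geometric mean over arithmetic mean of a power spectrum.

    Close to 1 for a flat (noise-like) spectrum, close to 0 for a peaky one.
    eps guards against log(0) and division by zero.
    """
    n = len(power)
    geometric = math.exp(sum(math.log(p + eps) for p in power) / n)
    arithmetic = sum(power) / n
    return geometric / (arithmetic + eps)

def choose_block(power, threshold=0.5):
    """Assumed rule: high flatness -> short-block MDCT, otherwise long-block."""
    return "short" if spectral_flatness(power) > threshold else "long"

flat_frame = [1.0, 1.0, 1.0, 1.0]
tonal_frame = [100.0, 0.01, 0.01, 0.01]
decisions = (choose_block(flat_frame), choose_block(tonal_frame))
```

The same measure could be computed separately for the left and right channels to drive the middle/side decision that the abstract also mentions; that part is not sketched here.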
  • Patent number: 7711552
    Abstract: A filter apparatus for filtering a time domain input signal to obtain a time domain output signal, which is a representation of the time domain input signal filtered using a filter characteristic having a non-uniform amplitude/frequency characteristic, comprises a complex analysis filter bank for generating a plurality of complex subband signals from the time domain input signals, a plurality of intermediate filters, wherein at least one of the intermediate filters of the plurality of the intermediate filters has a non-uniform amplitude/frequency characteristic, wherein the plurality of intermediate filters have a shorter impulse response compared to an impulse response of a filter having the filter characteristic, and wherein the non-uniform amplitude/frequency characteristics of the plurality of intermediate filters together represent the non-uniform filter characteristic, and a complex synthesis filter bank for synthesizing the output of the intermediate filters to obtain the time domain output signal.
    Type: Grant
    Filed: September 1, 2006
    Date of Patent: May 4, 2010
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Publication number: 20100106269
    Abstract: A method and apparatus for audio signal processing by applying log companding on spectral domain or time domain representations of the audio signals to provide an encoded audio signal, which is decoded upon receipt. A frequency domain representation or time domain representation of the audio signal is computed by separating the audio signal into specific frequency bands, each having a coefficient. Log companding with different compression ratios is performed on each coefficient to provide an encoded signal. Upon receipt of the encoded signal, inverse log companding and time frequency or time scale reconstruction are performed to provide the audio signal.
    Type: Application
    Filed: April 22, 2009
    Publication date: April 29, 2010
    Applicant: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Yen-Liang Shue, Somdeb Majumdar
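One familiar form of log companding is the μ-law curve, and the sketch below applies it with a different μ per band coefficient to mimic the "different compression ratios" in the abstract above. The choice of μ-law, the μ values, and the assumption that coefficients are normalized to [-1, 1] are illustrative only; the application does not specify them.

```python
import math

def compress(x, mu):
    """Mu-law compression of a coefficient x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu):
    """Inverse of compress(): recovers x from the companded value y."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def encode(coefficients, mus):
    """Compand each band coefficient with its own compression parameter."""
    return [compress(c, mu) for c, mu in zip(coefficients, mus)]

def decode(companded, mus):
    """Inverse log companding, band by band."""
    return [expand(y, mu) for y, mu in zip(companded, mus)]

coefficients = [0.5, -0.25, 0.01]   # hypothetical band coefficients
mus = [255.0, 100.0, 10.0]          # hypothetical per-band parameters
restored = decode(encode(coefficients, mus), mus)
```

Without a quantizer between `encode` and `decode` the round trip is lossless up to floating-point error; in a real codec the benefit comes from quantizing the companded values, which spends finer resolution on small amplitudes.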
  • Patent number: 7707030
    Abstract: A filter bank device for generating a complex spectral representation of a discrete-time signal includes a generator for generating a block-wise real spectral representation, which, for example, implements an MDCT, to obtain temporally successive blocks of real spectral coefficients. The output values of this spectral conversion device are fed to a post-processor for post-processing the block-wise real spectral representation to obtain an approximated complex spectral representation having successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and by a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is determined by combining at least two real spectral coefficients.
    Type: Grant
    Filed: January 26, 2005
    Date of Patent: April 27, 2010
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Bernd Edler, Stefan Geyersberger
  • Patent number: 7693707
    Abstract: A voice and musical tone coding apparatus is provided that can perform high-quality coding by executing vector quantization taking the characteristics of human hearing into consideration. In this voice and musical tone coding apparatus, a quadrature transformation processing section (201) converts a voice and musical tone signal from time components to frequency components. An auditory masking characteristic value calculation section (203) finds an auditory masking characteristic value from a voice and musical tone signal. A vector quantization section (202) performs vector quantization changing a calculation method of a distance between a code vector found from a preset codebook and a frequency component based on an auditory masking characteristic value.
    Type: Grant
    Filed: December 20, 2004
    Date of Patent: April 6, 2010
    Assignee: Panasonic Corporation
    Inventors: Tomofumi Yamanashi, Kaoru Sato, Toshiyuki Morii
  • Publication number: 20100082335
    Abstract: The system for transmitting and receiving a wideband speech signal includes an A/D converter for receiving an analog speech signal to convert it into a digital speech signal, a transmitter analysis filter for receiving the digital speech signal and dividing it into a baseband signal and an enhancement residual band signal, a standard baseband encoder for accepting the baseband signal and coding it using an ITU-T encoder, an additional baseband encoder for reducing standard coding distortion in the baseband signal, an enhancement residual band encoder for coding a signal obtained by removing the coded baseband signal from the original digital speech signal, and an IP network interface for multiplexing the coded standard and additional baseband signals and enhancement residual band signal.
    Type: Application
    Filed: December 4, 2009
    Publication date: April 1, 2010
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Ho-Sang SUNG, Dae-Hwan HWANG, Dae-Hee YOUN, Hong-Goo KANG, Young-Cheol PARK, Ki-Seung LEE, Sung-Kyo JUNG, Kyung-Tae KIM
  • Publication number: 20100076754
    Abstract: The invention relates to transform coding/decoding of a digital audio signal represented by a succession of frames, using windows of different lengths. For the coding within the meaning of the invention, it is sought to detect (51) a particular event, such as an attack, in a current frame (Ti); and, at least if said particular event is detected at the start of the current frame (53), a short window (54) is directly applied in order to code (56) the current frame (Ti) without applying a transition window. Thus, the coding has a reduced delay in relation to the prior art. In addition, an ad hoc processing is applied during decoding in order to compensate for the direct passage from a long window to a short window during coding.
    Type: Application
    Filed: December 18, 2007
    Publication date: March 25, 2010
    Applicant: France Telecom
    Inventors: Balazs Kovesi, David Virette, Pierrick Philippe
  • Publication number: 20100070268
    Abstract: A system for a multimodal unification of articulation includes a voice signal modality to receive a voice signal, and a control signal modality which receives an input from a user and generates a control signal from the input which is selected from predetermined inputs directly corresponding to the phonetic information. The interactive voice based phonetic input system also includes a multimodal integration system to receive and integrate the voice signal and the control signal. The multimodal integration system delimits a context of a spoken utterance of the voice signal by using the control signal to preprocess and discretize into phonetic frames. A voice recognizer analyzes the voice signal integrated with the control signal to output a voice recognition result. This new paradigm helps overcome constraints found in interfacing mobile devices. Context information facilitates the handling of the commands in the application environment.
    Type: Application
    Filed: September 10, 2009
    Publication date: March 18, 2010
    Inventor: Jun Hyung Sung
  • Patent number: 7680671
    Abstract: AC-3 is a high quality audio compression format widely used in feature films and, more recently, on Digital Versatile Disks (DVD). For consumer applications the algorithm is usually coded into the firmware of a DSP Processor, which due to cost considerations may be capable of only fixed point arithmetic. It is generally assumed that 16-bit processing is incapable of delivering the high fidelity audio expected from the AC-3 technology. Double precision computation can be utilized on such processors to provide the high quality, but the computational burden of such an implementation will be beyond the capacity of the processor to enable real-time operation. Through extensive simulation study of a high quality AC-3 encoder implementation, a multi-precision technique for each processing block is presented whereby the quality of the encoder on a 16-bit processor matches the single precision 24-bit implementation very closely without excessive additional computational complexity.
    Type: Grant
    Filed: September 8, 2006
    Date of Patent: March 16, 2010
    Assignee: STMicroelectronics Asia Pacific Pte. Ltd.
    Inventors: Mohammed Javed Absar, Sapna George, Antonio Mario Alvarez-Tinoco
  • Patent number: 7672834
    Abstract: A method detects components of a non-stationary signal. The non-stationary signal is acquired and a non-negative matrix of the non-stationary signal is constructed. The matrix includes columns representing features of the non-stationary signal at different instances in time. The non-negative matrix is factored into characteristic profiles and temporal profiles.
    Type: Grant
    Filed: July 23, 2003
    Date of Patent: March 2, 2010
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventor: Paris Smaragdis
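    The factoring step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the non-negative matrix is factored; the multiplicative update rule used here, along with the names `nmf`, `matmul`, `transpose`, and `frobenius_error`, is an assumption chosen for illustration. Columns of `w` play the role of characteristic profiles and rows of `h` the role of temporal profiles.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate the non-negative matrix v (features x time) as w times h.

    Assumed update rule: multiplicative updates for the squared error.
    Both factors start positive and are only ever multiplied by
    non-negative ratios, so they stay non-negative.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # Update temporal profiles: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        # Update characteristic profiles: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


def frobenius_error(v, w, h):
    """Square root of the summed squared difference between v and w times h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for row_v, row_a in zip(v, approx)
               for a, b in zip(row_v, row_a)) ** 0.5


# Toy matrix: each column is a feature vector at one instant in time.
v = [[1.0, 2.0, 0.0],
     [2.0, 4.0, 0.0],
     [0.0, 1.0, 3.0]]
w, h = nmf(v, rank=2)
```

    In practice the columns of `v` would be something like magnitude spectra of successive frames; the toy matrix here only shows the shapes involved.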
  • Publication number: 20100010808
    Abstract: A noise suppressing method and apparatus are provided that are capable of achieving high-quality noise suppression using fewer operations. Noise contained in an input signal is suppressed by transforming the input signal into frequency-domain signals; integrating bands of the frequency-domain signals to determine integrated frequency-domain signals; determining estimated noise based on the integrated frequency-domain signals; determining spectral gains based on the estimated noise and the integrated frequency-domain signals; and weighting the frequency-domain signals by the spectral gains.
    Type: Application
    Filed: August 29, 2006
    Publication date: January 14, 2010
    Applicant: NEC Corporation
    Inventors: Akihiko Sugiyama, Masanori Kato
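    The sequence of steps in the abstract above (integrate bands, derive gains from the integrated signals and a noise estimate, weight the original bins) can be sketched as follows. This is not the patented method: the band layout, the Wiener-style gain rule, the gain floor, and all function names are assumptions, the input is taken to be already transformed to the frequency domain, and the per-band noise estimate is supplied by the caller instead of being estimated.

```python
def integrate_bands(power, band_edges):
    """Average the per-bin power over each band [start, end)."""
    return [sum(power[s:e]) / (e - s)
            for s, e in zip(band_edges, band_edges[1:])]


def spectral_gains(band_power, band_noise, floor=0.1):
    """One gain per band (assumed Wiener-style rule), never below `floor`."""
    gains = []
    for p, n in zip(band_power, band_noise):
        g = 1.0 - n / p if p > 0 else floor
        gains.append(max(g, floor))
    return gains


def suppress(spectrum, band_edges, band_noise, floor=0.1):
    """Weight every complex bin by the gain of the band that contains it."""
    power = [abs(x) ** 2 for x in spectrum]
    gains = spectral_gains(integrate_bands(power, band_edges), band_noise, floor)
    out = list(spectrum)
    for g, s, e in zip(gains, band_edges, band_edges[1:]):
        for k in range(s, e):
            out[k] = spectrum[k] * g
    return out


# Six bins grouped into two bands: bins 0-1 and bins 2-5.
spectrum = [4 + 0j, 4 + 0j, 1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]
cleaned = suppress(spectrum, band_edges=[0, 2, 6], band_noise=[1.0, 1.0])
```

    The saving comes from computing one gain per band instead of one per bin: here two gains cover six bins, and wider bands at high frequencies would reduce the count further.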
  • Patent number: 7640157
    Abstract: A technique to enhance audio quality of a quantized audio signal when a perceptual audio coder is operating at low bit rates. The perceptual audio coder uses a modified two-loop quantization technique that maintains audio quality at medium to high bit rates while eliminating artifacts at low bit rates. The perceptual audio coder saves vanishing bands by stealing bits from surviving bands to reduce artifacts at low bit rates.
    Type: Grant
    Filed: February 6, 2004
    Date of Patent: December 29, 2009
    Assignee: Ittiam Systems (P) Ltd.
    Inventors: Vinod Prakash, Sarat Chandra Vadapalli, Anil Kumar, Preethi Konda
  • Patent number: 7636659
    Abstract: In accordance with the present invention, computer implemented methods and systems are provided for representing and modeling the temporal structure of audio signals. In response to receiving a signal, a time-to-frequency domain transformation on at least a portion of the received signal to generate a frequency domain representation is performed. The time-to-frequency domain transformation converts the signal from a time domain representation to the frequency domain representation. A frequency domain linear prediction (FDLP) is performed on the frequency domain representation to estimate a temporal envelope of the frequency domain representation. Based on the temporal envelope, one or more speech features are generated.
    Type: Grant
    Filed: March 25, 2005
    Date of Patent: December 22, 2009
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Marios Athineos, Daniel P. W. Ellis
  • Publication number: 20090313009
    Abstract: The invention concerns a method for trained discrimination and attenuation of echoes of a digital audio signal generated from a transform coding, which consists, for each current frame of the signal, in comparing (A) in real time, in at least one frequency band, a variable derived from one characteristic of the echo generating signal with that of a non-echo generating signal at a threshold value, and deducing therefrom (B) the existence or non-existence (C) of an echo derived from the transform coding, discriminating the existence of the echo and defining (D) a false alarm zone in the high-energy parts of the digital audio signal, determining an initial processing and attenuating the echoes (E) in the parts complementary to the low-energy false alarm zone and inhibiting (F) the attenuation of echoes in the false alarm zone. The invention is applicable to the technology of coders/decoders, in particular hierarchical coders/decoders.
    Type: Application
    Filed: February 13, 2007
    Publication date: December 17, 2009
    Applicant: France Telecom
    Inventors: Balazs Kovesi, Alain Le Guyader
  • Publication number: 20090306972
    Abstract: A method conceals dropouts in one or more audio channels of a multi-channel arrangement. The method maps transmitted signals into a frequency domain during an error-free signal transmission of two or more channels. Magnitude spectra and spectral filter coefficients are derived. The spectral filter coefficients relate the magnitude spectrum of the audio channel to the magnitude spectrum of at least one other channel. When a dropout occurs, a replacement signal is generated through the filter coefficients and a substitution signal. The filter coefficients may be generated prior to the detection of the dropout.
    Type: Application
    Filed: June 5, 2009
    Publication date: December 10, 2009
    Inventors: Martin Opitz, Cornelia Falch, Robert Holdrich
  • Publication number: 20090306971
    Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.
    Type: Application
    Filed: June 5, 2009
    Publication date: December 10, 2009
    Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh
  • Patent number: 7630881
    Abstract: A system extends a bandwidth of bandlimited audio signals by analyzing bandlimited audio signals at a transmission cycle rate. The analyzer may obtain a bandlimited parameter at a transmission cycle rate. A mapping device or logic in the system obtains a wideband parameter based on the bandlimited parameter. An audio signal generator generates a highband and/or lowband audio signal based on the wideband parameter at the transmission cycle rate. In some systems, the bandlimited audio signal is analyzed at the transmission cycle rate. The highband and/or lowband audio signals and the combined wideband audio signal are generated at the transmission cycle rate.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: December 8, 2009
    Assignee: Nuance Communications, Inc.
    Inventors: Bernd Iser, Gerhard Uwe Schmidt
  • Patent number: 7627467
    Abstract: Real-time packet-based audio communications over packet-based networks frequently result in the loss of one or more packets during any given communication session. The real-time nature of such communications precludes retransmission of lost packets due to the unacceptable delays that would result. Consequently, packet loss concealment methods are employed to “hide” lost packets from the listener. Unfortunately, conventional loss concealment methods, such as packet repetition or stretch/overlap methods, do not fully exploit information available from partially received samples. Therefore, when a single frame of N coefficients is lost, 2N samples are only partially reconstructed, thereby degrading the reconstructed signal.
    Type: Grant
    Filed: June 30, 2005
    Date of Patent: December 1, 2009
    Assignee: Microsoft Corporation
    Inventors: Dinei A. Florencio, Philip A. Chou
  • Patent number: 7620554
    Abstract: A method is shown for supporting a multichannel audio extension at an encoding end of a multichannel audio coding system. In order to improve the audio quality over a large frequency range, the method comprises transforming each channel of a multichannel audio signal into the frequency domain and dividing a bandwidth of the frequency domain signals into a first region of lower frequencies and at least one further region of higher frequencies. Then, the frequency domain signals are encoded in each of the frequency regions with another type of coding to obtain parametric multichannel extension information for the respective frequency region. The invention relates equally to a method for supporting in a corresponding manner a multichannel audio extension at a decoding end. Also shown are a corresponding encoder, a corresponding decoder, and corresponding devices, systems and software program products.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: November 17, 2009
    Assignee: Nokia Corporation
    Inventor: Juha Ojanperä
  • Patent number: 7620263
    Abstract: An image processing system provides image enhancement and anti-clipping units. The anti-clipping unit for image sharpness enhancement operates such that any shoot artifacts in the enhanced image that go beyond pixel value lower/upper bounds are properly adjusted back within the lower and upper bounds, without causing prominent edge jaggedness artifacts in the final resulting output image.
    Type: Grant
    Filed: October 6, 2005
    Date of Patent: November 17, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Surapong Lertrattanapanich, Yeong-Taeg Kim, Zhi Zhou
  • Patent number: 7605722
    Abstract: Provided are a lossless audio coding/decoding apparatus and method. The lossless audio coding apparatus includes a first coder to directly code first symbols; a second coder module comprising a plurality of second coders to convert the first symbols into second symbols and to code the second symbols; a first selector to compare the performance of the first coder to the performance of the second coders and to output a coding mode in accordance with a comparison result; and a second selector to output a final bitstream by coding the first symbols in correspondence with the coding mode. According to the present invention, the performance of audio coding may be improved.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: October 20, 2009
    Assignees: Electronics and Telecommunications Research Institute, Yonsei University Office of Research Affairs, Hanyang University Industry-University Cooperation Foundation, Kwangwoon University Industry-Academic Collaboration Foundation
    Inventors: Seung Kwon Beack, Jeongil Seo, Inseon Jang, Dae Young Jae, Hochong Park, Sang-One Kang, Young-Cheol Park, Jin Woo Hong
  • Patent number: 7602922
    Abstract: There is described a multi-channel encoder (10; 600) for processing input signals conveyed in N input channels to generate corresponding output signals conveyed in M output channels together with complementary parametric data; M and N are integers wherein N>M. The encoder (10; 600) includes a down-mixer for down-mixing the input signals to generate the corresponding output signals, the encoder also comprising an analyser for processing the input signals to generate the parametric data, said parametric data describing mutual differences between the N channels of input signal to allow for regenerating during decoding one or more of the N channels of input signals from the M channels of output signal. Such an encoder (10; 600) is capable of providing highly efficient data encoding and also of being backwards compatible with relatively simpler decoders having fewer than N decoding output channels. The invention also concerns decoders (800) compatible with such a multi-channel encoder (10; 600).
    Type: Grant
    Filed: March 25, 2005
    Date of Patent: October 13, 2009
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Dirk J. Breebaart, Erik G. P. Schuijers, Gerard H. Hotho, Machiel W. Van Loon
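    As a rough illustration of the down-mix plus parametric data idea in the abstract above, the sketch below takes the smallest case, N = 2 input channels and M = 1 output channel. The single parameter used here (an inter-channel level difference in dB) and the function names `encode` and `decode` are assumptions; the abstract only says that the parametric data describes mutual differences between the channels.

```python
import math


def encode(left, right, eps=1e-12):
    """Down-mix two channels to one and measure their level difference.

    Returns the mono down-mix and the level difference in dB
    (assumed parameter; positive means the left channel is louder).
    """
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    energy_left = sum(x * x for x in left) + eps
    energy_right = sum(x * x for x in right) + eps
    level_diff_db = 10 * math.log10(energy_left / energy_right)
    return mono, level_diff_db


def decode(mono, level_diff_db):
    """Regenerate two channels from the down-mix and the level difference."""
    ratio = 10 ** (level_diff_db / 20)  # amplitude ratio left / right
    gain_right = 2 / (1 + ratio)
    gain_left = ratio * gain_right
    return [gain_left * m for m in mono], [gain_right * m for m in mono]


# Left is the same waveform as right at twice the amplitude.
right_in = [1.0, -2.0, 3.0]
left_in = [2.0, -4.0, 6.0]
mono, level_diff_db = encode(left_in, right_in)
left_out, right_out = decode(mono, level_diff_db)
```

    A decoder that ignores the parameter can still play `mono` directly, which is the backwards compatibility the abstract refers to. With only a level difference, reconstruction is exact only when the channels are in-phase scaled copies of each other; a practical scheme would need further parameters.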
  • Patent number: 7596491
    Abstract: Layered (embedded) code-excited linear prediction (CELP) speech encoders/decoders with adaptive plus algebraic codebooks applied in each layer with fixed codebook pulses of one layer used in higher layers. Pulse weightings emphasize lower layer pulses relative to the higher layer pulses.
    Type: Grant
    Filed: April 17, 2006
    Date of Patent: September 29, 2009
    Assignee: Texas Instruments Incorporated
    Inventor: Jacek Stachurski
  • Publication number: 20090234644
    Abstract: A scalable speech and audio codec is provided that implements combinatorial spectrum encoding. A residual signal is obtained from a Code Excited Linear Prediction (CELP)-based encoding layer, where the residual signal is a difference between an original audio signal and a reconstructed version of the original audio signal. The residual signal is transformed at a Discrete Cosine Transform (DCT)-type transform layer to obtain a corresponding transform spectrum having a plurality of spectral lines. The transform spectrum spectral lines are transformed using a combinatorial position coding technique. The combinatorial position coding technique includes generating a lexicographical index for a selected subset of spectral lines, where each lexicographic index represents one of a plurality of possible binary strings representing the positions of the selected subset of spectral lines. The lexicographical index represents non-zero spectral lines in a binary string in fewer bits than the length of the binary string.
    Type: Application
    Filed: October 21, 2008
    Publication date: September 17, 2009
    Applicant: QUALCOMM Incorporated
    Inventors: Yuriy Reznik, Pengjun Huang
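    The lexicographical index described in the abstract above can be illustrated with a short sketch. It ranks a binary string (1 marks a selected spectral line) among all strings of the same length with the same number of ones. The function names are hypothetical, and the sketch assumes the decoder already knows the string length `n` and the number of selected lines `k`; the abstract does not say how those are conveyed.

```python
from math import comb


def position_index(bits):
    """Lexicographic rank of `bits` among strings with equal length and weight."""
    n = len(bits)
    k = sum(bits)  # ones still to be placed, including the current one
    index = 0
    for i, b in enumerate(bits):
        if b:
            # Every string with the same prefix but a 0 here sorts earlier;
            # there are C(n - i - 1, k) of them.
            index += comb(n - i - 1, k)
            k -= 1
    return index


def positions_from_index(index, n, k):
    """Inverse of position_index for a string of length n with k ones."""
    bits = []
    for i in range(n):
        c = comb(n - i - 1, k)
        if k > 0 and index >= c:
            bits.append(1)
            index -= c
            k -= 1
        else:
            bits.append(0)
    return bits


def index_bits(n, k):
    """Number of bits needed to store any index for the given n and k."""
    return (comb(n, k) - 1).bit_length()


bits = [1, 0, 1, 0]
idx = position_index(bits)             # 4
restored = positions_from_index(idx, n=4, k=2)
```

    For 3 selected lines out of 16, there are C(16, 3) = 560 possible position patterns, so `index_bits(16, 3)` is 10, compared with 16 bits for the raw binary string.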