Spectral Prediction for Pre-echo Prevention; Temporal Noise Shaping (TNS), e.g., in MPEG2 or MPEG4, etc. (EPO) Patents (Class 704/E19.014)
-
Patent number: 11670010
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing and decompressing data. In one aspect, a method comprises: processing data using an encoder neural network to generate a latent representation of the data; processing the latent representation of the data using a hyper-encoder neural network to generate a latent representation of an entropy model; generating an entropy encoded representation of the latent representation of the entropy model; generating an entropy encoded representation of the latent representation of the data using the latent representation of the entropy model; and determining a compressed representation of the data from the entropy encoded representations of: (i) the latent representation of the data and (ii) the latent representation of the entropy model used to entropy encode the latent representation of the data.
Type: Grant
Filed: January 19, 2022
Date of Patent: June 6, 2023
Assignee: Google LLC
Inventors: David Charles Minnen, Saurabh Singh, Johannes Balle, Troy Chinen, Sung Jin Hwang, Nicholas Johnston, George Dan Toderici
-
Publication number: 20120323583
Abstract: A communication terminal includes: a decoder which decodes an input bitstream received from another communication terminal, to generate an output audio signal and outputs the generated output audio signal to a speaker; an echo canceller which obtains an input audio signal representing sound captured by a microphone placed in a space to which the speaker outputs the sound, and removes, for respective subbands, an echo component included in the obtained input audio signal and corresponding to the output audio signal, to generate an audio signal for transmission; an encoder which codes the audio signal for transmission to generate an output bitstream and transmits the generated output bitstream to another communication terminal; and a control unit which controls, for the respective subbands, echo cancellation processing according to a reproduction band of at least one of the output audio signal and the audio signal for transmission.
Type: Application
Filed: August 21, 2012
Publication date: December 20, 2012
Inventors: Shuji MIYASAKA, Kosuke NISHIO, Ichiro KAWASHIMA
-
Patent number: 8063809
Abstract: A transient signal encoding method and device, decoding method and device, and processing system, where the transient signal encoding method includes: obtaining a reference sub-frame where a maximal time envelope having a maximal amplitude value is located from time envelopes of all sub-frames of an input transient signal; adjusting an amplitude value of the time envelope of each sub-frame before the reference sub-frame in such a way that a first difference is greater than a preset first threshold, in which the first difference is a difference between the amplitude value of the time envelope of each sub-frame before the reference sub-frame and the amplitude value of the maximal time envelope; and writing the adjusted time envelope into bitstream.
Type: Grant
Filed: June 29, 2011
Date of Patent: November 22, 2011
Assignee: Huawei Technologies Co., Ltd.
Inventors: Zexin Liu, Longyin Chen, Lei Miao, Chen Hu, Wei Xiao, Herve Marcel Taddei, Qing Zhang
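The envelope adjustment described in the abstract of patent 8063809 can be illustrated with a minimal sketch. This is not the patented method: the function name `adjust_pre_transient_envelope`, the `margin` parameter, and the rule of lowering an offending envelope to `peak - threshold - margin` are all assumptions made for illustration, and the envelope values are assumed to be in a domain (for example, logarithmic) where lowering them this way is meaningful.

```python
def adjust_pre_transient_envelope(envelopes, threshold, margin=1e-3):
    """Illustrative sketch only (hypothetical helper, not the patent's algorithm).

    Finds the reference sub-frame (the one with the largest time envelope)
    and lowers any earlier envelope so that (peak - envelope) > threshold.
    """
    # Index of the sub-frame holding the maximal time envelope.
    ref = max(range(len(envelopes)), key=envelopes.__getitem__)
    peak = envelopes[ref]

    # Assumed rule: any value at or below this ceiling differs from the
    # peak by at least threshold + margin, i.e. strictly more than threshold.
    ceiling = peak - threshold - margin

    adjusted = list(envelopes)
    for i in range(ref):  # only sub-frames before the reference sub-frame
        if adjusted[i] > ceiling:
            adjusted[i] = ceiling
    return ref, adjusted


ref, adjusted = adjust_pre_transient_envelope(
    [2.0, 7.0, 3.0, 10.0, 6.0], threshold=4.0, margin=0.5
)
print(ref, adjusted)
```

In this example only the second sub-frame is changed, because it is the only one before the peak that lies within the threshold of the maximal envelope; sub-frames after the reference sub-frame are left untouched.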
-
Publication number: 20110125491
Abstract: The perceived quality of a speech signal is improved by estimating the average power of first and second signal components and applying a first gain factor to the second signal components to generate adjusted second signal components. The first gain factor is selected such that on application of the first gain factor to the second signal components, the ratio of the average power of the first signal components to the average power of the adjusted second signal components would be a first predetermined value, the first predetermined value being such as to inhibit perceptual distortion of the improved speech signal.
Type: Application
Filed: November 23, 2009
Publication date: May 26, 2011
Inventors: Rogerio Guedes Alves, Kuan-Chieh Yen, Michael Christopher Vartanian, Sameer Arun Gadre
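The gain selection in the abstract of publication 20110125491 follows from simple arithmetic: scaling a signal by a gain g scales its average power by g squared, so a target ratio R between the first components' power P1 and the adjusted second components' power is reached with g = sqrt(P1 / (R * P2)). The sketch below assumes an amplitude-domain gain and mean-square power; the helper names and sample values are hypothetical and are not taken from the application.

```python
import math


def average_power(samples):
    """Mean-square power of a block of samples."""
    return sum(s * s for s in samples) / len(samples)


def gain_for_target_ratio(first, second, target_ratio):
    """Amplitude gain g such that power(first) / power(g * second) == target_ratio.

    Hypothetical helper for illustration; assumes target_ratio > 0.
    """
    p1 = average_power(first)
    p2 = average_power(second)
    if p2 == 0.0:
        raise ValueError("second signal components have zero power")
    return math.sqrt(p1 / (target_ratio * p2))


first = [1.0, -1.0, 1.0, -1.0]    # average power 1.0
second = [0.1, -0.1, 0.1, -0.1]   # average power about 0.01
gain = gain_for_target_ratio(first, second, target_ratio=4.0)
adjusted = [gain * s for s in second]
achieved_ratio = average_power(first) / average_power(adjusted)
print(gain, achieved_ratio)
```

Here the gain comes out close to 5 and the achieved power ratio close to the requested value of 4, up to floating-point rounding.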
-
Patent number: 7941315
Abstract: Accepting the speech having the noise superimposed thereon and converting it into a signal on a time axis of the speech, an amplitude component of a speech for each predetermined frequency band of the converted signal on the frequency axis is calculated. Calculating a noise reduction coefficient, the noise component is reduced by multiplying the signal on the frequency axis of the original signal by the calculated noise reduction coefficient. By estimating the target value of the remaining noise for each frequency band, a signal on a frequency axis in which a signal corresponding to a frequency band of which target value estimated by the noise target value is larger than the value of the amplitude component of the signal on the frequency axis of which noise component is reduced is corrected to a signal corresponding to the target value is restored, into a signal on a time axis.
Type: Grant
Filed: March 22, 2006
Date of Patent: May 10, 2011
Assignee: Fujitsu Limited
Inventor: Naoshi Matsuo
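One plausible reading of the abstract of patent 7941315 is a per-band noise suppression with a residual-noise floor: each band amplitude is multiplied by a noise reduction coefficient, and any band that ends up below its estimated residual-noise target is set to that target. The sketch below illustrates only that reading. The function name `suppress_with_floor` and the example values are hypothetical, and how the coefficients and targets are estimated is outside the scope of the sketch.

```python
def suppress_with_floor(amplitudes, coefficients, targets):
    """Illustrative per-band suppression with a residual-noise floor.

    amplitudes   : per-band amplitude of the noisy signal
    coefficients : per-band noise reduction coefficients (assumed in [0, 1])
    targets      : per-band target amplitude for the remaining noise
    """
    result = []
    for amp, coeff, target in zip(amplitudes, coefficients, targets):
        reduced = amp * coeff          # apply the noise reduction coefficient
        # If suppression pushed the band below the residual-noise target,
        # use the target value instead.
        result.append(max(reduced, target))
    return result


floored = suppress_with_floor(
    amplitudes=[1.0, 0.5, 0.25],
    coefficients=[0.5, 0.5, 0.5],
    targets=[0.2, 0.2, 0.2],
)
print(floored)
```

Only the last band is replaced by its target in this example, since it is the only one whose reduced amplitude falls below 0.2.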
-
Publication number: 20100145688
Abstract: An apparatus and a method to encode and decode a speech signal using an encoding mode are provided. An encoding apparatus may select an encoding mode of a frame included in an input speech signal, and encode a frame having an unvoiced mode for an unvoiced speech as the selected encoding mode.
Type: Application
Filed: December 4, 2009
Publication date: June 10, 2010
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ho Sang Sung, Ki Hyun Choo, Jung Hoe Kim, Eun Mi Oh
-
Publication number: 20100100390
Abstract: To reduce the amount of transmitted information and further reduce the processing amount at a decoding apparatus.
Type: Application
Filed: June 21, 2006
Publication date: April 22, 2010
Inventor: Naoya Tanaka
-
Publication number: 20090063160
Abstract: A configurable common filterbank processor applicable for various audio standards and its processing method. Inverse modified discrete cosine transform (IMDCT) and window and overlap-add (WOA) decoding operations required by AC-3 and AAC, and IMDC, WOA, and matrixing decoding operations required by MP3 are divided into several different modes, and a quick algorithm is provided for expediting the operation of these modes, and a hardware architecture is designed universally for these modes, so that the hardware architecture can be applicable for the decoding operations of three different audio standards, respectively AC-3, AAC and MP3, to expand the scope of applicability of a decoder.
Type: Application
Filed: November 5, 2007
Publication date: March 5, 2009
Inventors: Tsung-Han Tsai, Chun-Nan Liu, Hsing-Chuang Liu
-
Publication number: 20080162119
Abstract: An acoustic signal is subjected to filtration whereby low frequency sounds such as respiration are removed. Intense acoustic sounds such as coughing are also removed, and ultrasonic carrier modulation and demodulation is also performed to increase the saliency of speech sounds. By removing non-speech sounds from an acoustic signal comprising speech, a method is disclosed for improving the functioning of devices such as speech recognition machinery. Devices for implementing such techniques are also disclosed.
Type: Application
Filed: January 3, 2008
Publication date: July 3, 2008
Inventor: Martin L. Lenhardt
-
Publication number: 20080140393
Abstract: Provided is a speech coding apparatus and method. A band divider divides an input signal into a high-band signal and a low-band signal, a narrowband encoder encodes the low-band signal using a Code Excited Linear Prediction (CELP)-based narrowband speech codec, a frequency characteristic collector converts the high-band signal to a signal in a frequency domain and obtains Modified Discrete Cosine Transform (MDCT) coefficients, a subband determiner determines subbands in a final stage based on the MDCT coefficients and determines subbands for quantization based on the subbands in a final stage, a gain quantizer performs gain quantization of the subbands, a bit assignment unit assigns bits to the subbands according to the magnitude of the gain quantization, and a shape quantizer performs shape quantization of the subbands in an algebraic method. Accordingly, algorithm consistency can be maintained and a complexity can be reduced by extending a bandwidth with a small number of bits in a speech codec.
Type: Application
Filed: October 30, 2007
Publication date: June 12, 2008
Applicant: ELECTRONICS & TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors: Hyun-Woo KIM, Do Young Kim, Hae Won Jung