Spectral Prediction for Pre-Echo Prevention; Temporal Noise Shaping (TNS), e.g., in MPEG2 or MPEG4, etc. (EPO) Patents (Class 704/E19.014)
  • Patent number: 11670010
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing and decompressing data. In one aspect, a method comprises: processing data using an encoder neural network to generate a latent representation of the data; processing the latent representation of the data using a hyper-encoder neural network to generate a latent representation of an entropy model; generating an entropy encoded representation of the latent representation of the entropy model; generating an entropy encoded representation of the latent representation of the data using the latent representation of the entropy model; and determining a compressed representation of the data from the entropy encoded representations of: (i) the latent representation of the data and (ii) the latent representation of the entropy model used to entropy encode the latent representation of the data.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: June 6, 2023
    Assignee: Google LLC
    Inventors: David Charles Minnen, Saurabh Singh, Johannes Balle, Troy Chinen, Sung Jin Hwang, Nicholas Johnston, George Dan Toderici
  • Publication number: 20120323583
    Abstract: A communication terminal includes: a decoder which decodes an input bitstream received from another communication terminal, to generate an output audio signal and outputs the generated output audio signal to a speaker; an echo canceller which obtains an input audio signal representing sound captured by a microphone placed in a space to which the speaker outputs the sound, and removes, for respective subbands, an echo component included in the obtained input audio signal and corresponding to the output audio signal, to generate an audio signal for transmission; an encoder which codes the audio signal for transmission to generate an output bitstream and transmits the generated output bitstream to another communication terminal; and a control unit which controls, for the respective subbands, echo cancellation processing according to a reproduction band of at least one of the output audio signal and the audio signal for transmission.
    Type: Application
    Filed: August 21, 2012
    Publication date: December 20, 2012
    Inventors: Shuji MIYASAKA, Kosuke NISHIO, Ichiro KAWASHIMA
  • Patent number: 8063809
    Abstract: A transient signal encoding method and device, decoding method and device, and processing system, where the transient signal encoding method includes: obtaining, from the time envelopes of all sub-frames of an input transient signal, a reference sub-frame in which the maximal time envelope (the one with the maximal amplitude value) is located; adjusting the amplitude value of the time envelope of each sub-frame before the reference sub-frame so that a first difference is greater than a preset first threshold, the first difference being the difference between the amplitude value of the time envelope of each sub-frame before the reference sub-frame and the amplitude value of the maximal time envelope; and writing the adjusted time envelope into a bitstream.
    Type: Grant
    Filed: June 29, 2011
    Date of Patent: November 22, 2011
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zexin Liu, Longyin Chen, Lei Miao, Chen Hu, Wei Xiao, Herve Marcel Taddei, Qing Zhang
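The envelope adjustment described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented method: the function name `adjust_pre_transient_envelope`, the `margin` parameter, and the choice to lower an offending envelope to just below `peak - threshold` are all hypothetical, and the envelope values are taken to be in whatever unit (linear or logarithmic) the codec uses.

```python
def adjust_pre_transient_envelope(envelopes, threshold, margin=1e-6):
    """Return a copy of `envelopes` in which every sub-frame before the peak
    sits more than `threshold` below the peak value.

    `margin` is an illustrative assumption: it keeps the difference strictly
    greater than the threshold, as the abstract requires.
    """
    if not envelopes:
        return []
    # Reference sub-frame: index of the (first) maximal time envelope.
    ref = max(range(len(envelopes)), key=envelopes.__getitem__)
    peak = envelopes[ref]
    ceiling = peak - threshold - margin
    adjusted = list(envelopes)
    for i in range(ref):
        # "First difference" = peak envelope minus this sub-frame's envelope.
        if peak - adjusted[i] <= threshold:
            adjusted[i] = ceiling
    return adjusted


if __name__ == "__main__":
    print(adjust_pre_transient_envelope([0.9, 0.2, 1.0, 0.5], threshold=0.3))
```

Only sub-frames before the reference sub-frame are touched; the peak and everything after it are left unchanged in this sketch.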
  • Publication number: 20110125491
    Abstract: The perceived quality of a speech signal is improved by estimating the average power of first and second signal components and applying a first gain factor to the second signal components to generate adjusted second signal components. The first gain factor is selected such that on application of the first gain factor to the second signal components, the ratio of the average power of the first signal components to the average power of the adjusted second signal components would be a first predetermined value, the first predetermined value being such as to inhibit perceptual distortion of the improved speech signal.
    Type: Application
    Filed: November 23, 2009
    Publication date: May 26, 2011
    Inventors: Rogerio Guedes Alves, Kuan-Chieh Yen, Michael Christopher Vartanian, Sameer Arun Gadre
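The gain selection in this abstract reduces to simple arithmetic: if P1 and P2 are the average powers of the first and second signal components and R is the predetermined ratio, then a gain g applied to the second components must satisfy P1 / (g² · P2) = R, so g = sqrt(P1 / (R · P2)). The sketch below assumes time-domain sample lists and hypothetical function names; how the application actually splits the signal into components and chooses R is not stated here.

```python
import math


def average_power(samples):
    """Mean of squared sample values."""
    return sum(s * s for s in samples) / len(samples)


def gain_for_power_ratio(first, second, target_ratio):
    """Gain g for `second` such that
    average_power(first) / average_power(g * second) == target_ratio."""
    p1 = average_power(first)
    p2 = average_power(second)
    if p2 == 0.0 or target_ratio <= 0.0:
        raise ValueError("second components must have power and ratio must be positive")
    return math.sqrt(p1 / (target_ratio * p2))


if __name__ == "__main__":
    first = [2.0, -2.0, 2.0, -2.0]
    second = [1.0, -1.0, 1.0, -1.0]
    g = gain_for_power_ratio(first, second, target_ratio=4.0)
    adjusted = [g * s for s in second]
    print(g, average_power(first) / average_power(adjusted))
```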
  • Patent number: 7941315
    Abstract: Speech with superimposed noise is accepted and converted into a signal on a frequency axis, and an amplitude component is calculated for each predetermined frequency band of the converted signal. A noise reduction coefficient is calculated, and the noise component is reduced by multiplying the frequency-axis signal by the calculated coefficient. A target value of the remaining noise is estimated for each frequency band; in any band where the estimated target value is larger than the amplitude component of the noise-reduced signal, the signal is corrected to a signal corresponding to the target value. The corrected frequency-axis signal is then restored into a signal on a time axis.
    Type: Grant
    Filed: March 22, 2006
    Date of Patent: May 10, 2011
    Assignee: Fujitsu Limited
    Inventor: Naoshi Matsuo
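The per-band correction in this abstract amounts to flooring the noise-reduced amplitude at an estimated residual-noise target. A minimal sketch follows, assuming the per-band amplitudes, reduction coefficients, and target values are already available as equal-length lists; the function name is hypothetical, and how the patent estimates the coefficients and targets is not reproduced here.

```python
def suppress_with_noise_floor(amplitudes, coefficients, floor_targets):
    """Apply a per-band noise reduction coefficient, then raise any band that
    fell below its residual-noise target back up to that target."""
    if not (len(amplitudes) == len(coefficients) == len(floor_targets)):
        raise ValueError("all inputs must have one entry per frequency band")
    return [
        max(amp * coef, target)
        for amp, coef, target in zip(amplitudes, coefficients, floor_targets)
    ]


if __name__ == "__main__":
    print(suppress_with_noise_floor(
        amplitudes=[1.0, 0.5, 0.2],
        coefficients=[0.9, 0.2, 0.1],
        floor_targets=[0.05, 0.15, 0.05],
    ))
```

Keeping a controlled amount of residual noise in each band, rather than suppressing it completely, is a common way to avoid the fluctuating artifacts that aggressive spectral suppression can introduce.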
  • Publication number: 20100145688
    Abstract: An apparatus and a method to encode and decode a speech signal using an encoding mode are provided. An encoding apparatus may select an encoding mode of a frame included in an input speech signal, and encode a frame having an unvoiced mode for an unvoiced speech as the selected encoding mode.
    Type: Application
    Filed: December 4, 2009
    Publication date: June 10, 2010
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho Sang Sung, Ki Hyun Choo, Jung Hoe Kim, Eun Mi Oh
  • Publication number: 20100100390
    Abstract: To reduce the amount of transmitted information and further reduce the processing amount at a decoding apparatus.
    Type: Application
    Filed: June 21, 2006
    Publication date: April 22, 2010
    Inventor: Naoya Tanaka
  • Publication number: 20090063160
    Abstract: A configurable common filterbank processor applicable to various audio standards, and its processing method. The inverse modified discrete cosine transform (IMDCT) and window and overlap-add (WOA) decoding operations required by AC-3 and AAC, and the IMDCT, WOA, and matrixing decoding operations required by MP3, are divided into several different modes. A fast algorithm expedites the operation of these modes, and a single hardware architecture is designed for all of them, so that it can serve the decoding operations of three different audio standards (AC-3, AAC, and MP3), expanding the scope of applicability of a decoder.
    Type: Application
    Filed: November 5, 2007
    Publication date: March 5, 2009
    Inventors: Tsung-Han Tsai, Chun-Nan Liu, Hsing-Chuang Liu
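For orientation, the IMDCT and window-and-overlap-add steps named in this abstract can be written directly from their textbook definitions. The sketch below is a plain O(m²) reference form, not the fast algorithm or hardware architecture the application describes. The sine window, the 2/m scaling, the even block size `m`, and the toy `encode` helper (included only to produce input frames) are all assumptions made for illustration.

```python
import math


def sine_window(m):
    """Length-2m sine window; satisfies w[n]^2 + w[n+m]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * m)) for n in range(2 * m)]


def mdct(block, m):
    """Direct MDCT: 2m time samples -> m coefficients."""
    n0 = 0.5 + m / 2
    return [sum(block[n] * math.cos(math.pi / m * (n + n0) * (k + 0.5))
                for n in range(2 * m))
            for k in range(m)]


def imdct(coeffs, m):
    """Direct IMDCT: m coefficients -> 2m time-aliased samples."""
    n0 = 0.5 + m / 2
    return [(2.0 / m) * sum(coeffs[k] * math.cos(math.pi / m * (n + n0) * (k + 0.5))
                            for k in range(m))
            for n in range(2 * m)]


def window_overlap_add(frames, m):
    """Window each IMDCT output and overlap-add with hop size m."""
    w = sine_window(m)
    out = [0.0] * (m * (len(frames) + 1))
    for i, coeffs in enumerate(frames):
        for n, y in enumerate(imdct(coeffs, m)):
            out[i * m + n] += w[n] * y
    return out


def encode(signal, m):
    """Toy encoder (assumption): zero-pad, window, and MDCT with hop size m."""
    w = sine_window(m)
    pad = (-len(signal)) % m
    padded = [0.0] * m + list(signal) + [0.0] * (pad + m)
    return [mdct([w[n] * padded[s + n] for n in range(2 * m)], m)
            for s in range(0, len(padded) - m, m)]


if __name__ == "__main__":
    m = 4
    signal = [0.1 * i * (-1) ** i for i in range(10)]
    decoded = window_overlap_add(encode(signal, m), m)[m:m + len(signal)]
    print(max(abs(a - b) for a, b in zip(signal, decoded)))
```

Each IMDCT output block contains time-domain aliasing on its own; the aliasing cancels only when adjacent windowed blocks are overlap-added, which is why the two operations appear together in decoders of this kind.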
  • Publication number: 20080162119
    Abstract: An acoustic signal is subjected to filtration whereby low frequency sounds such as respiration are removed. Intense acoustic sounds such as coughing are also removed, and ultrasonic carrier modulation and demodulation is also performed to increase the saliency of speech sounds. By removing non-speech sounds from an acoustic signal comprising speech, a method is disclosed for improving the functioning of devices such as speech recognition machinery. Devices for implementing such techniques are also disclosed.
    Type: Application
    Filed: January 3, 2008
    Publication date: July 3, 2008
    Inventor: Martin L. Lenhardt
  • Publication number: 20080140393
    Abstract: Provided is a speech coding apparatus and method. A band divider divides an input signal into a high-band signal and a low-band signal, a narrowband encoder encodes the low-band signal using a Code Excited Linear Prediction (CELP)-based narrowband speech codec, a frequency characteristic collector converts the high-band signal to a signal in a frequency domain and obtains Modified Discrete Cosine Transform (MDCT) coefficients, a subband determiner determines subbands in a final stage based on the MDCT coefficients and determines subbands for quantization based on the subbands in a final stage, a gain quantizer performs gain quantization of the subbands, a bit assignment unit assigns bits to the subbands according to the magnitude of the gain quantization, and a shape quantizer performs shape quantization of the subbands using an algebraic method. Accordingly, algorithm consistency can be maintained and complexity can be reduced by extending the bandwidth with a small number of bits in a speech codec.
    Type: Application
    Filed: October 30, 2007
    Publication date: June 12, 2008
    Applicant: ELECTRONICS & TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hyun-Woo KIM, Do Young Kim, Hae Won Jung
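The bit assignment step in this abstract (more bits for subbands with larger quantized gain) can be illustrated with a simple proportional split. The rule below, including the largest-remainder rounding and the even fallback when all gains are zero, is an assumption chosen for illustration; the application's actual assignment rule is not given in the abstract, and `allocate_bits` is a hypothetical name.

```python
def allocate_bits(gain_indices, total_bits):
    """Split `total_bits` across subbands in proportion to the magnitude of
    each subband's quantized gain, using largest-remainder rounding."""
    weights = [abs(g) for g in gain_indices]
    total = sum(weights)
    if total == 0:
        # Assumption: with no gain information, spread bits evenly.
        weights = [1] * len(gain_indices)
        total = len(gain_indices)
    shares = [total_bits * w / total for w in weights]
    bits = [int(s) for s in shares]
    leftover = total_bits - sum(bits)
    # Hand remaining bits to the subbands with the largest fractional share.
    order = sorted(range(len(shares)), key=lambda i: shares[i] - bits[i], reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits


if __name__ == "__main__":
    print(allocate_bits([4, 2, 1, 1], total_bits=16))
```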