Patents by Inventor Rongshan Yu

Rongshan Yu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20110125507
    Abstract: A decoder configured to generate decoded audio data (e.g., decoded speech data) and including a postfilter coupled and configured to filter encoded audio data in the frequency domain, methods for frequency domain postfiltering of encoded audio data in a decoder, and methods for decoding encoded audio data in a decoder including by postfiltering encoded audio data in the frequency domain in the decoder. In some embodiments, the decoder is configured to decode input encoded audio without performing any time-to-frequency domain transform on encoded audio data to prepare data for postfiltering. Typically, the postfiltering improves the quality of the decoded audio signal by attenuating spectral valley regions thereof to remove excess quantization noise present in the encoded input audio while preserving formants of the decoded audio signal to avoid introducing unnecessary distortion.
    Type: Application
    Filed: July 14, 2009
    Publication date: May 26, 2011
    Applicant: Dolby Laboratories Licensing Corporation
    Inventor: Rongshan Yu
  • Publication number: 20110106533
    Abstract: A dual microphone voice activity detector system is presented. The voice activity detector system estimates the signal level and noise level at each microphone. The level differential between the two microphones is greater for nearby sounds, such as the signal, than for more distant sounds, such as the noise. Thus, the voice activity detector detects the presence of nearby sounds.
    Type: Application
    Filed: June 25, 2009
    Publication date: May 5, 2011
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Rongshan Yu
  • Patent number: 7876821
    Abstract: A method for rate control for encoding a video sequence, wherein the video sequence includes a plurality of Groups of Pictures, wherein each Group of Pictures includes at least an I-frame and an Inter-frame, where the rate control method includes the following steps for the encoding of the Inter-frame in the Group of Pictures: determining a desired frame rate based on an available bandwidth of a channel for transmitting the video sequence and available computational resources for the encoding process; determining a target buffer level based on the desired frame rate and the position of the Inter-frame with respect to the I-frame; and determining a target bit rate based on the target buffer level and the available channel bandwidth, wherein the target bit rate is used for controlling the rate of encoding the video sequence.
    Type: Grant
    Filed: September 5, 2002
    Date of Patent: January 25, 2011
    Assignee: Agency for Science, Technology and Research
    Inventors: Zhengguo Li, Feng Pan, Keng Pang Lim, Dajun Wu, Rongshan Yu, Genan Feng, Dusheng Wang
  • Publication number: 20110010168
    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
    Type: Application
    Filed: March 12, 2009
    Publication date: January 13, 2011
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Rongshan Yu, Regunathan Radhakrishnan, Robert Andersen, Grant Davidson
  • Publication number: 20110001642
    Abstract: Systems and methods for scalably encoding and decoding coded data are presented. One exemplary method for scalably coding data includes classifying, based upon at least one predetermined criterion, each of a plurality of received data as either (i) perceptually relevant data or (ii) perceptually irrelevant data. The perceptually relevant data is scalably coded, and the perceptually irrelevant data is non-scalably coded. Subsequently, the scalably coded perceptually relevant data and the non-scalably coded perceptually irrelevant data are combined into a coded data stream for transmission.
    Type: Application
    Filed: June 7, 2004
    Publication date: January 6, 2011
    Inventors: Rongshan Yu, Xiao Lin, Susanto Rahardja
  • Publication number: 20100211388
    Abstract: A method for enhancing speech components of an audio signal composed of speech and noise components processes subbands of the audio signal, the processing including controlling the gain of the audio signal in ones of the subbands, wherein the gain in a subband is controlled at least by processes that convey either additive/subtractive differences in gain or multiplicative ratios of gain so as to reduce gain in a subband as the level of noise components increases with respect to the level of speech components in the subband and increase gain in a subband when speech components are present in subbands of the audio signal, the processes each responding to subbands of the audio signal and controlling gain independently of each other to provide a processed subband audio signal.
    Type: Application
    Filed: September 10, 2008
    Publication date: August 19, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Rongshan Yu, Charles Phillip Brown
  • Publication number: 20100198593
    Abstract: Enhancing speech components of an audio signal composed of speech and noise components includes controlling the gain of the audio signal in ones of its subbands, wherein the gain in a subband is reduced as the level of estimated noise components increases with respect to the level of speech components, wherein the level of estimated noise components is determined at least in part by (1) comparing an estimated noise components level with the level of the audio signal in the subband and increasing the estimated noise components level in the subband by a predetermined amount when the input signal level in the subband exceeds the estimated noise components level in the subband by a limit for more than a defined time, or (2) obtaining and monitoring the signal-to-noise ratio in the subband and increasing the estimated noise components level in the subband by a predetermined amount when the signal-to-noise ratio in the subband exceeds a limit for more than a defined time.
    Type: Application
    Filed: September 10, 2008
    Publication date: August 5, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Rongshan Yu
  • Publication number: 20100100386
    Abstract: A speech enhancement method operative for devices having limited available memory is described. The method is appropriate for very noisy environments and is capable of estimating the relative strengths of speech and noise components during both the presence and the absence of speech.
    Type: Application
    Filed: March 14, 2008
    Publication date: April 22, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Rongshan Yu
  • Publication number: 20100076769
    Abstract: Speech enhancement based on a psycho-acoustic model is disclosed that is capable of preserving the fidelity of speech while sufficiently suppressing noise including the processing artifact known as “musical noise”.
    Type: Application
    Filed: March 14, 2008
    Publication date: March 25, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Rongshan Yu
  • Patent number: 7656319
    Abstract: A system for the context-based encoding of an input signal includes a domain transform module and a context-based coding module. The domain transform module is operable to convert the input signal into a sequence of transform coefficients c[i]. The context-based coding module includes a bit-plane scanning module, a context modeling module, and a statistical encoding module. The bit-plane scanning module is operable to produce a bit-plane symbol bps[i,bp] for each transform coefficient c[i] and each bit-plane [bp]. The context modeling module is operable to assign one or more context values to each of the received bit-plane symbols bps[i,bp]. The statistical encoding module is operable to code each of the bit-plane symbols bps[i,bp] as a function of one or more of the corresponding context values to produce a context-based encoded symbol stream.
    Type: Grant
    Filed: July 14, 2004
    Date of Patent: February 2, 2010
    Assignee: Agency for Science, Technology and Research
    Inventors: Rongshan Yu, Xiao Lin, Susanto Rahardja
  • Patent number: 7532763
    Abstract: A method for processing bit symbols generated by a data source, in particular a video, still image or audio source, comprising the following steps: constructing a plurality of bit-planes from the data source, each bit-plane comprising a plurality of bit-plane symbols; scanning the bit-plane symbols of each bit-plane to generate a binary string of bit-plane symbols; and encoding the binary string of the bit-plane symbols using a statistical model, wherein the statistical model is based on statistical properties of a Laplacian probability distribution function which characterizes the data source.
    Type: Grant
    Filed: October 24, 2002
    Date of Patent: May 12, 2009
    Assignee: Agency for Science, Technology and Research
    Inventors: Rongshan Yu, Susanto Rahardja, Xiao Lin
  • Publication number: 20090028240
    Abstract: An encoder for encoding a first digital signal representative of a first channel and a second digital signal representative of a second channel is described. The encoder comprises cascaded intra-channel prediction elements for compressing the first digital signal and the second digital signal based on intra-channel correlation and an inter-channel prediction element for compressing the first digital signal and the second digital signal based on inter-channel correlation.
    Type: Application
    Filed: January 9, 2006
    Publication date: January 29, 2009
    Inventors: Haibin Huang, Wee Boon Choo, Rongshan Yu, Xiao Lin
  • Publication number: 20080094259
    Abstract: A system for the context-based encoding of an input signal includes a domain transform module and a context-based coding module. The domain transform module is operable to convert the input signal into a sequence of transform coefficients c[i]. The context-based coding module includes a bit-plane scanning module, a context modeling module, and a statistical encoding module. The bit-plane scanning module is operable to produce a bit-plane symbol bps[i,bp] for each transform coefficient c[i] and each bit-plane [bp]. The context modeling module is operable to assign one or more context values to each of the received bit-plane symbols bps[i,bp]. The statistical encoding module is operable to code each of the bit-plane symbols bps[i,bp] as a function of one or more of the corresponding context values to produce a context-based encoded symbol stream.
    Type: Application
    Filed: July 14, 2004
    Publication date: April 24, 2008
    Applicant: AGENCY FOR SCIENCE, TECHNOLOGY AND RESEARCH
    Inventors: Rongshan Yu, Xiao Lin, Susanto Rahardja
  • Publication number: 20080030385
    Abstract: A method for transforming a digital signal from the time domain into the frequency domain and vice versa using a transformation function comprising a transformation matrix, the digital signal comprising data symbols which are grouped into a plurality of blocks, each block comprising a predefined number of the data symbols. The method includes the process of transforming two blocks of the digital signal by one transforming element, wherein the transforming element corresponds to a block-diagonal matrix comprising two sub matrices, wherein each sub-matrix comprises the transformation matrix and the transforming element comprises a plurality of lifting stages and wherein each lifting stage comprises the processing of blocks of the digital signal by an auxiliary transformation and by a rounding unit.
    Type: Application
    Filed: May 6, 2004
    Publication date: February 7, 2008
    Inventors: Haibin Huang, Xiao Lin, Susanto Rahardja, Rongshan Yu
  • Publication number: 20080025519
    Abstract: Transfer functions like Head Related Transfer Functions (HRTF) needed for binaural rendering are implemented efficiently by a subband-domain filter structure. In one implementation, amplitude, fractional-sample delay and phase-correction filters are arranged in cascade with one another and applied to subband signals that represent spectral content of an audio signal in frequency subbands. Other filter structures are also disclosed. These filter structures may be used advantageously in a variety of signal processing applications. A few examples of audio applications include signal bandwidth compression, loudness equalization, room acoustics correction and assisted listening for individuals with hearing impairments.
    Type: Application
    Filed: July 27, 2007
    Publication date: January 31, 2008
    Inventors: Rongshan Yu, Charles Robinson, Mark Vinton
  • Publication number: 20070274383
    Abstract: A method for encoding a digital signal into a scalable bitstream comprising quantizing the digital signal, and encoding the quantized signal to form a core-layer bitstream, performing an error mapping based on the digital signal and the core-layer bitstream to remove information that has been encoded into the core-layer bitstream, resulting in an error signal, bit-plane coding the error signal based on perceptual information of the digital signal, resulting in an enhancement-layer bitstream, wherein the perceptual information of the digital signal is determined using a perceptual model, and multiplexing the core-layer bitstream and the enhancement-layer bitstream, thereby generating the scalable bitstream.
    Type: Application
    Filed: October 6, 2004
    Publication date: November 29, 2007
    Inventors: Rongshan Yu, Xiao Lin, Susanto Rahardja
  • Publication number: 20070276894
    Abstract: According to the process for determining a transforming element for a given transformation function, which transformation function comprises a transformation matrix and corresponds to a transformation of a digital signal from the time domain into the frequency domain or vice versa, the transformation matrix is decomposed into a rotation matrix (306) and an auxiliary matrix (307) which, when multiplied with itself, equals a permutation matrix multiplied with an integer diagonal matrix. Further, the rotation matrix (306) and the auxiliary matrix (307) are each decomposed into a plurality of lifting matrices (308). Further, the transforming element is determined to comprise a plurality of lifting stages (309) which correspond to the lifting matrices (308). The invention further provides a method for the transformation of a digital signal from the time domain into the frequency domain according to the transforming element determined by the process described above.
    Type: Application
    Filed: May 6, 2004
    Publication date: November 29, 2007
    Applicant: Agency for Science, Technology and Research
    Inventors: Haibin Huang, Xiao Lin, Susanto Rahardja, Rongshan Yu
  • Publication number: 20070276893
    Abstract: A method for performing a domain transformation of a digital signal from the time domain into the frequency domain and vice versa, the method including performing the transformation by a transforming element, the transformation element comprising a plurality of lifting stages, wherein the transformation corresponds to a transformation matrix and wherein at least one lifting stage of the plurality of lifting stages comprises at least one auxiliary transformation matrix and a rounding unit, the auxiliary transformation matrix comprising the transformation matrix itself or the corresponding transformation matrix of lower dimension. The method further comprising performing a rounding operation of the signal by the rounding unit after the transformation by the auxiliary transformation matrix.
    Type: Application
    Filed: May 6, 2004
    Publication date: November 29, 2007
    Inventors: Haibin Huang, Xiao Lin, Susanto Rahardja, Rongshan Yu
  • Publication number: 20060200709
    Abstract: A method for processing bit symbols generated by a data source, in particular a video, still image or audio source, comprising the following steps: constructing a plurality of bit-planes from the data source, each bit-plane comprising a plurality of bit-plane symbols; scanning the bit-plane symbols of each bit-plane to generate a binary string of bit-plane symbols; and encoding the binary string of the bit-plane symbols using a statistical model, wherein the statistical model is based on statistical properties of a Laplacian probability distribution function which characterizes the data source.
    Type: Application
    Filed: October 24, 2002
    Publication date: September 7, 2006
    Inventors: Rongshan Yu, Susanto Rahardja, Xiao Lin
  • Publication number: 20060140270
    Abstract: A method for rate control for encoding a video sequence, wherein the video sequence comprises a plurality of Groups of Pictures, wherein each Group of Pictures comprises at least an I-frame and an Inter-frame, the rate control method comprising the following steps for the encoding of the Inter-frame in the Group of Pictures: determining a desired frame rate based on an available bandwidth of a channel for transmitting the video sequence and available computational resources for the encoding process; determining a target buffer level based on the desired frame rate and the position of the Inter-frame with respect to the I-frame; and determining a target bit rate based on the target buffer level and the available channel bandwidth, wherein the target bit rate is used for controlling the rate of encoding the video sequence.
    Type: Application
    Filed: September 5, 2002
    Publication date: June 29, 2006
    Inventors: Zhengguo Li, Feng Pan, Keng Lim, Dajun Wu, Rongshan Yu, Genan Feng, Dusheng Wang
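
Several abstracts in this listing (patent 7532763 and publication 20060200709 in particular) describe constructing bit-planes from a data source and scanning them into a binary string before statistical coding. The first two of those steps might be sketched roughly as follows. This is an illustration only, not the patented method: the function names `bit_planes` and `scan`, the most-significant-plane-first scan order, and the choice to drop sign information are all assumptions made for the example, and the Laplacian-based statistical coding step is not shown.

```python
def bit_planes(values, num_planes):
    """Split integer magnitudes into bit-planes, most significant plane first.

    Assumption for this sketch: signs are ignored and every magnitude
    fits in `num_planes` bits.
    """
    magnitudes = [abs(v) for v in values]
    return [
        [(m >> bp) & 1 for m in magnitudes]   # one bit-plane symbol per value
        for bp in range(num_planes - 1, -1, -1)
    ]


def scan(planes):
    """Scan the bit-planes in order and concatenate their symbols into one binary string."""
    return "".join(str(bit) for plane in planes for bit in plane)


# Hypothetical input data: four small integer samples, three bit-planes.
samples = [5, -3, 0, 6]
planes = bit_planes(samples, 3)
binary_string = scan(planes)

print(planes)         # [[1, 0, 0, 1], [0, 1, 0, 1], [1, 1, 0, 0]]
print(binary_string)  # 100101011100
```

In a complete coder, the resulting binary string would be passed to an entropy coder driven by a probability model for each bit-plane; the abstracts state only that this model is based on a Laplacian distribution characterizing the source.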