Patents by Inventor Kazuhito Koishida

Kazuhito Koishida has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7953604
    Abstract: An audio encoder performs frequency extension coding that comprises determining one or more shape parameters using a displacement vector that corresponds to a displacement of an even number (e.g., an even number of sub-bands between a sub-band in a baseband frequency range and a sub-band in an extended-band frequency range). The shape parameters can be determined on a per-audio-block basis. Restricting a displacement to an even number (in frequency extension coding or in other signal modulation schemes) can improve the quality of reconstructed audio. An audio encoder also can perform frequency extension coding that comprises determining one or more scale parameters at one or more audio blocks, and determining one or more anchor points for interpolating the one or more scale parameters.
    Type: Grant
    Filed: January 20, 2006
    Date of Patent: May 31, 2011
    Assignee: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida, Chao He
  • Patent number: 7904293
    Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.
    Type: Grant
    Filed: October 9, 2007
    Date of Patent: March 8, 2011
    Assignee: Microsoft Corporation
    Inventors: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
  • Patent number: 7885819
    Abstract: An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: February 8, 2011
    Assignee: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen
  • Patent number: 7831421
    Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: November 9, 2010
    Assignee: Microsoft Corporation
    Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen
  • Publication number: 20100280827
    Abstract: Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan
  • Patent number: 7774205
    Abstract: An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: August 10, 2010
    Assignee: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
  • Patent number: 7761290
    Abstract: An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: July 20, 2010
    Assignee: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
  • Patent number: 7734465
    Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.
    Type: Grant
    Filed: October 9, 2007
    Date of Patent: June 8, 2010
    Assignee: Microsoft Corporation
    Inventors: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
  • Publication number: 20100125455
    Abstract: Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [“FEC”], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.
    Type: Application
    Filed: January 22, 2010
    Publication date: May 20, 2010
    Applicant: Microsoft Corporation
    Inventors: Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen, Mu Han
  • Patent number: 7707034
    Abstract: Techniques and tools are described for processing reconstructed audio signals. For example, a reconstructed audio signal is filtered in the time domain using filter coefficients that are calculated, at least in part, in the frequency domain. As another example, producing a set of filter coefficients for filtering a reconstructed audio signal includes clipping one or more peaks of a set of coefficient values. As yet another example, for a sub-band codec, in a frequency region near an intersection between two sub-bands, a reconstructed composite signal is enhanced.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: April 27, 2010
    Assignee: Microsoft Corporation
    Inventors: Xiaoqin Sun, Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen
  • Patent number: 7668712
    Abstract: Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [“FEC”], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.
    Type: Grant
    Filed: March 31, 2004
    Date of Patent: February 23, 2010
    Assignee: Microsoft Corporation
    Inventors: Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen, Mu Han
  • Publication number: 20090276212
    Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.
    Type: Application
    Filed: July 14, 2009
    Publication date: November 5, 2009
    Applicant: Microsoft Corporation
    Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen
  • Publication number: 20090248424
    Abstract: A scalable audio codec encodes an input audio signal as a base layer at a high compression ratio and one or more residual signals as an enhancement layer of a compressed bitstream, which permits a lossless or near lossless reconstruction of the input audio signal at decoding. The scalable audio codec uses perceptual transform coding to encode the base layer. The residual is calculated in a transform domain, which includes a frequency and possibly also multi-channel transform of the input audio. For lossless reconstruction, the frequency and multi-channel transforms are reversible.
    Type: Application
    Filed: March 25, 2008
    Publication date: October 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Radhika Jandhyala
  • Patent number: 7590531
    Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.
    Type: Grant
    Filed: August 4, 2005
    Date of Patent: September 15, 2009
    Assignee: Microsoft Corporation
    Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen
  • Patent number: 7562021
    Abstract: Coding of spectral data by representing certain portions of the spectral data as a scaled version of a code-vector, where the code-vector is chosen from either a fixed predetermined codebook or a codebook taken from a baseband. Various optional features are described for modifying the code-vectors in the codebook according to some rules which allow the code-vector to better represent the data they are modeling. The code-vector modification comprises a linear or non-linear transform of one or more code-vectors, such as, by exponentiation, negation, reversing, or combining elements from plural code-vectors.
    Type: Grant
    Filed: July 15, 2005
    Date of Patent: July 14, 2009
    Assignee: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida
  • Publication number: 20090125315
    Abstract: An audio encoder encodes side information into a compressed audio bitstream containing encoding parameters used by the encoder for one or more encoding techniques, such as a noise-mask-ratio curve used for rate control. A transcoder uses the encoder generated side information to transcode the audio from the original compressed bitstream having an initial bit-rate into a second bitstream having a new bit-rate. Because the side information is derived from the original audio, the transcoder is able to better maintain audio quality of the transcoding. The side information also allows the transcoder to re-encode from an intermediate decoding/encoding stage for faster and lower complexity transcoding.
    Type: Application
    Filed: November 9, 2007
    Publication date: May 14, 2009
    Applicant: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
  • Publication number: 20090006103
    Abstract: An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.
    Type: Application
    Filed: June 29, 2007
    Publication date: January 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen
  • Publication number: 20080312758
    Abstract: An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.
    Type: Application
    Filed: June 15, 2007
    Publication date: December 18, 2008
    Applicant: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
  • Publication number: 20080312759
    Abstract: An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.
    Type: Application
    Filed: June 15, 2007
    Publication date: December 18, 2008
    Applicant: Microsoft Corporation
    Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
  • Patent number: 7454332
    Abstract: A gain-constrained noise suppression for speech more precisely estimates noise, including during speech, to reduce musical noise artifacts introduced from noise suppression. The noise suppression operates by applying a spectral gain G(m, k) to each short-time spectrum value S(m, k) of a speech signal, where m is the frame number and k is the spectrum index. The spectrum values are grouped into frequency bins, and a noise characteristic estimated for each bin classified as a “noise bin.” An energy parameter is smoothed in both the time domain and the frequency domain to improve noise estimation per bin. The gain factors G(m, k) are calculated based on the current signal spectrum and the noise estimation, then smoothed before being applied to the signal spectral values S(m, k).
    Type: Grant
    Filed: June 15, 2004
    Date of Patent: November 18, 2008
    Assignee: Microsoft Corporation
    Inventors: Kazuhito Koishida, Feng Zhuge, Hosam A. Khalil, Tian Wang, Wei-ge Chen