Patents by Inventor Kazuhito Koishida

Kazuhito Koishida has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Shape and scale parameters for extended-band frequency coding

Patent number: 7953604

Abstract: An audio encoder performs frequency extension coding that comprises determining one or more shape parameters using a displacement vector that corresponds to a displacement of an even number (e.g., an even number of sub-bands between a sub-band in a baseband frequency range and a sub-band in an extended-band frequency range). The shape parameters can be determined on a per-audio-block basis. Restricting a displacement to an even number (in frequency extension coding or in other signal modulation schemes) can improve the quality of reconstructed audio. An audio encoder also can perform frequency extension coding that comprises determining one or more scale parameters at one or more audio blocks, and determining one or more anchor points for interpolating the one or more scale parameters.

Type: Grant

Filed: January 20, 2006

Date of Patent: May 31, 2011

Assignee: Microsoft Corporation

Inventors: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida, Chao He

Sub-band voice codec with multi-stage codebooks and redundant coding

Patent number: 7904293

Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Type: Grant

Filed: October 9, 2007

Date of Patent: March 8, 2011

Assignee: Microsoft Corporation

Inventors: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen

Bitstream syntax for multi-process audio decoding

Patent number: 7885819

Abstract: An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Type: Grant

Filed: June 29, 2007

Date of Patent: February 8, 2011

Assignee: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Robust decoder

Patent number: 7831421

Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Type: Grant

Filed: May 31, 2005

Date of Patent: November 9, 2010

Assignee: Microsoft Corporation

Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

NOISE ROBUST SPEECH CLASSIFIER ENSEMBLE

Publication number: 20100280827

Abstract: Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.

Type: Application

Filed: April 30, 2009

Publication date: November 4, 2010

Applicant: Microsoft Corporation

Inventors: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan

Coding of sparse digital media spectral data

Patent number: 7774205

Abstract: An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.

Type: Grant

Filed: June 15, 2007

Date of Patent: August 10, 2010

Assignee: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Flexible frequency and time partitioning in perceptual transform coding of audio

Patent number: 7761290

Abstract: An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.

Type: Grant

Filed: June 15, 2007

Date of Patent: July 20, 2010

Assignee: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
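
As a rough illustration of the hole-filling step described in this abstract, the following Python sketch splits each spectral hole wider than a minimum size into near-equal bands no wider than a maximum size, and stops once a preset number of bands is reached. The function names, the (start, width) representation of a hole, and the rule for distributing leftover coefficients are assumptions made for illustration; they are not taken from the patent.

```python
import math


def partition_hole(start, width, min_hole_size, max_band_size):
    """Split one hole into near-equal bands, each at most max_band_size wide.

    Assumption: a hole is (start, width) in coefficient indices, and any
    remainder is spread one coefficient at a time over the first bands.
    """
    if width <= min_hole_size:
        return []  # not wider than the threshold: no hole-filling bands
    num_bands = math.ceil(width / max_band_size)
    base, extra = divmod(width, num_bands)
    bands = []
    pos = start
    for i in range(num_bands):
        size = base + (1 if i < extra else 0)
        bands.append((pos, size))
        pos += size
    return bands


def configure_hole_bands(holes, min_hole_size, max_band_size, max_hole_bands):
    """Collect hole-filling bands across all holes, up to a preset count."""
    bands = []
    for start, width in holes:
        for band in partition_hole(start, width, min_hole_size, max_band_size):
            if len(bands) == max_hole_bands:
                return bands
            bands.append(band)
    return bands


holes = [(10, 20), (40, 3), (50, 16)]
print(configure_hole_bands(holes, min_hole_size=4, max_band_size=8, max_hole_bands=4))
```

In this sketch the 20-coefficient hole becomes three bands of 7, 7 and 6 coefficients, the 3-coefficient hole is skipped, and the band limit cuts the last hole short.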

Sub-band voice codec with multi-stage codebooks and redundant coding

Patent number: 7734465

Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Type: Grant

Filed: October 9, 2007

Date of Patent: June 8, 2010

Assignee: Microsoft Corporation

Inventors: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen

AUDIO ENCODING AND DECODING WITH INTRA FRAMES AND ADAPTIVE FORWARD ERROR CORRECTION

Publication number: 20100125455

Abstract: Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [“FEC”], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.

Type: Application

Filed: January 22, 2010

Publication date: May 20, 2010

Applicant: Microsoft Corporation

Inventors: Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen, Mu Han

Audio codec post-filter

Patent number: 7707034

Abstract: Techniques and tools are described for processing reconstructed audio signals. For example, a reconstructed audio signal is filtered in the time domain using filter coefficients that are calculated, at least in part, in the frequency domain. As another example, producing a set of filter coefficients for filtering a reconstructed audio signal includes clipping one or more peaks of a set of coefficient values. As yet another example, for a sub-band codec, in a frequency region near an intersection between two sub-bands, a reconstructed composite signal is enhanced.

Type: Grant

Filed: May 31, 2005

Date of Patent: April 27, 2010

Assignee: Microsoft Corporation

Inventors: Xiaoqin Sun, Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen

Audio encoding and decoding with intra frames and adaptive forward error correction

Patent number: 7668712

Abstract: Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [“FEC”], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.

Type: Grant

Filed: March 31, 2004

Date of Patent: February 23, 2010

Assignee: Microsoft Corporation

Inventors: Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen, Mu Han

ROBUST DECODER

Publication number: 20090276212

Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Type: Application

Filed: July 14, 2009

Publication date: November 5, 2009

Applicant: Microsoft Corporation

Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

LOSSLESS AND NEAR LOSSLESS SCALABLE AUDIO CODEC

Publication number: 20090248424

Abstract: A scalable audio codec encodes an input audio signal as a base layer at a high compression ratio and one or more residual signals as an enhancement layer of a compressed bitstream, which permits a lossless or near lossless reconstruction of the input audio signal at decoding. The scalable audio codec uses perceptual transform coding to encode the base layer. The residual is calculated in a transform domain, which includes a frequency and possibly also multi-channel transform of the input audio. For lossless reconstruction, the frequency and multi-channel transforms are reversible.

Type: Application

Filed: March 25, 2008

Publication date: October 1, 2009

Applicant: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Radhika Jandhyala

Robust decoder

Patent number: 7590531

Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Type: Grant

Filed: August 4, 2005

Date of Patent: September 15, 2009

Assignee: Microsoft Corporation

Inventors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

Modification of codewords in dictionary used for efficient coding of digital media spectral data

Patent number: 7562021

Abstract: Coding of spectral data by representing certain portions of the spectral data as a scaled version of a code-vector, where the code-vector is chosen from either a fixed predetermined codebook or a codebook taken from a baseband. Various optional features are described for modifying the code-vectors in the codebook according to some rules which allow the code-vector to better represent the data they are modeling. The code-vector modification comprises a linear or non-linear transform of one or more code-vectors, such as, by exponentiation, negation, reversing, or combining elements from plural code-vectors.

Type: Grant

Filed: July 15, 2005

Date of Patent: July 14, 2009

Assignee: Microsoft Corporation

Inventors: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida
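
To make the idea in this abstract concrete, the sketch below represents a target band as a scaled, optionally modified code-vector and searches for the combination with the lowest squared error. It is only an illustration under stated assumptions: the set of modifications, the sign-preserving form of exponentiation, the non-negative least-squares scale, and all function names are hypothetical choices, not the method claimed in the patent.

```python
import math


def signed_pow(vec, p):
    """Sign-preserving exponentiation (an assumed form of 'exponentiation')."""
    return [math.copysign(abs(v) ** p, v) for v in vec]


# Hypothetical set of code-vector modifications to try.
MODIFICATIONS = {
    "identity": lambda v: list(v),
    "negate": lambda v: [-x for x in v],
    "reverse": lambda v: list(reversed(v)),
    "exp_0.5": lambda v: signed_pow(v, 0.5),
}


def scale_for(target, vec):
    """Least-squares scale, clamped to be non-negative (assumption)."""
    energy = sum(v * v for v in vec)
    if energy == 0:
        return 0.0
    return max(0.0, sum(t * v for t, v in zip(target, vec)) / energy)


def squared_error(target, vec, scale):
    return sum((t - scale * v) ** 2 for t, v in zip(target, vec))


def best_match(target, codebook):
    """Return (error, codebook index, modification name, scale)."""
    best = None
    for index, vec in enumerate(codebook):
        for name, modify in MODIFICATIONS.items():
            candidate = modify(vec)
            scale = scale_for(target, candidate)
            err = squared_error(target, candidate, scale)
            if best is None or err < best[0]:
                best = (err, index, name, scale)
    return best


codebook = [[1.0, 2.0, 3.0]]
print(best_match([3.0, 2.0, 1.0], codebook))
```

Because the scale is kept non-negative here, negation is a distinct choice rather than a duplicate of the unmodified vector with a negative scale.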

TRANSCODER USING ENCODER GENERATED SIDE INFORMATION

Publication number: 20090125315

Abstract: An audio encoder encodes side information into a compressed audio bitstream containing encoding parameters used by the encoder for one or more encoding techniques, such as a noise-mask-ratio curve used for rate control. A transcoder uses the encoder generated side information to transcode the audio from the original compressed bitstream having an initial bit-rate into a second bitstream having a new bit-rate. Because the side information is derived from the original audio, the transcoder is able to better maintain audio quality of the transcoding. The side information also allows the transcoder to re-encode from an intermediate decoding/encoding stage for faster and lower complexity transcoding.

Type: Application

Filed: November 9, 2007

Publication date: May 14, 2009

Applicant: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

BITSTREAM SYNTAX FOR MULTI-PROCESS AUDIO DECODING

Publication number: 20090006103

Abstract: An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Type: Application

Filed: June 29, 2007

Publication date: January 1, 2009

Applicant: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

CODING OF SPARSE DIGITAL MEDIA SPECTRAL DATA

Publication number: 20080312758

Abstract: An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.

Type: Application

Filed: June 15, 2007

Publication date: December 18, 2008

Applicant: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

FLEXIBLE FREQUENCY AND TIME PARTITIONING IN PERCEPTUAL TRANSFORM CODING OF AUDIO

Publication number: 20080312759

Abstract: An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.

Type: Application

Filed: June 15, 2007

Publication date: December 18, 2008

Applicant: Microsoft Corporation

Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Gain constrained noise suppression

Patent number: 7454332

Abstract: A gain-constrained noise suppression for speech more precisely estimates noise, including during speech, to reduce musical noise artifacts introduced from noise suppression. The noise suppression operates by applying a spectral gain G(m, k) to each short-time spectrum value S(m, k) of a speech signal, where m is the frame number and k is the spectrum index. The spectrum values are grouped into frequency bins, and a noise characteristic estimated for each bin classified as a “noise bin.” An energy parameter is smoothed in both the time domain and the frequency domain to improve noise estimation per bin. The gain factors G(m, k) are calculated based on the current signal spectrum and the noise estimation, then smoothed before being applied to the signal spectral values S(m, k).

Type: Grant

Filed: June 15, 2004

Date of Patent: November 18, 2008

Assignee: Microsoft Corporation

Inventors: Kazuhito Koishida, Feng Zhuge, Hosam A. Khalil, Tian Wang, Wei-Ge Chen
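
The abstract describes applying a smoothed spectral gain G(m, k) to each spectrum value S(m, k). The Python sketch below shows one possible shape of that per-frame step. The gain rule (a spectral-subtraction-style ratio), the gain floor `g_min`, and the first-order smoothing factor `alpha` are assumptions chosen for illustration; the abstract does not specify them, and the patent's actual gain computation may differ.

```python
def suppress_frame(spectrum, noise_power, prev_gain, g_min=0.1, alpha=0.5):
    """Apply a constrained, time-smoothed gain to one frame of spectrum values.

    spectrum    -- complex values S(m, k) for the current frame m
    noise_power -- estimated noise power for each index k
    prev_gain   -- smoothed gains G(m-1, k) from the previous frame
    g_min       -- assumed lower bound on the gain (hypothetical parameter)
    alpha       -- assumed smoothing weight on the previous gain
    Returns (output spectrum, smoothed gains G(m, k)).
    """
    gains = []
    for k, s in enumerate(spectrum):
        power = abs(s) ** 2
        # Assumed gain rule: fraction of the power not attributed to noise.
        raw = 1.0 - noise_power[k] / power if power > 0 else g_min
        raw = min(1.0, max(g_min, raw))  # keep the gain within [g_min, 1]
        gains.append(alpha * prev_gain[k] + (1.0 - alpha) * raw)
    output = [g * s for g, s in zip(gains, spectrum)]
    return output, gains


frame = [2 + 0j, 1 + 0j]
out, gains = suppress_frame(frame, noise_power=[1.0, 1.0], prev_gain=[1.0, 1.0])
print(gains)
```

Smoothing the gain over time, rather than applying the raw per-frame value, is the part of this sketch that corresponds to the abstract's aim of reducing musical noise artifacts.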
