Patents by Inventor Yu-Hung Kao

Yu-Hung Kao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7103547
    Abstract: A small vocabulary speech recognizer suitable for implementation on a 16-bit fixed-point DSP is described. The input speech x_t is sampled at analog-to-digital (A/D) converter 11 and the digital samples are applied to MFCC (Mel-scaled cepstrum coefficients) front end processing 13. For robustness to background noise, PMC (parallel model combination) 15 is integrated. The MFCC and Gaussian mean vectors are applied to PMC 15. The MFCC and PMC provide speech features extracted in noise, and these are used to modify the HMMs. The noise-adapted HMMs excluding mean vectors are applied to the search procedure to recognize the grammar. A method of computing MFCC comprises the steps of: performing dynamic Q-point computation for the preemphasis, Hamming window, FFT, complex FFT to power spectrum, and Mel scale power spectrum into filter bank steps; a log filter bank step; and, after the log filter bank step, performing fixed Q-point computation. A polynomial fit is used to compute log2 in the log filter bank step.
    Type: Grant
    Filed: May 2, 2002
    Date of Patent: September 5, 2006
    Assignee: Texas Instruments Incorporated
    Inventors: Yu-Hung Kao, Yifan Gong
  • Patent number: 7080005
    Abstract: A typical English pronunciation dictionary takes up to 1,826,302 bytes in ASCII to store. Fivefold compression that maintains computability is achieved by prefix delta encoding of the word and error encoding of the pronunciation.
    Type: Grant
    Filed: June 8, 2000
    Date of Patent: July 18, 2006
    Assignee: Texas Instruments Incorporated
    Inventor: Yu-Hung Kao
  • Patent number: 6980950
    Abstract: An utterance detector for speech recognition is described. The detector consists of two components. The first component makes a speech/non-speech decision for each incoming speech frame. The decision is based on a frequency-selective autocorrelation function obtained by speech power spectrum estimation, frequency filtering, and an inverse Fourier transform. The second component makes the utterance detection decision, using a state machine that describes the detection process in terms of the speech/non-speech decisions made by the first component.
    Type: Grant
    Filed: September 21, 2000
    Date of Patent: December 27, 2005
    Assignee: Texas Instruments Incorporated
    Inventors: Yifan Gong, Yu-Hung Kao
  • Publication number: 20020198706
    Abstract: A small vocabulary speech recognizer suitable for implementation on a 16-bit fixed-point DSP is described. The input speech x_t is sampled at analog-to-digital (A/D) converter 11 and the digital samples are applied to MFCC (Mel-scaled cepstrum coefficients) front end processing 13. For robustness to background noise, PMC (parallel model combination) 15 is integrated. The MFCC and Gaussian mean vectors are applied to PMC 15. The MFCC and PMC provide speech features extracted in noise, and these are used to modify the HMMs. The noise-adapted HMMs excluding mean vectors are applied to the search procedure to recognize the grammar. A method of computing MFCC comprises the steps of: performing dynamic Q-point computation for the preemphasis, Hamming window, FFT, complex FFT to power spectrum, and Mel scale power spectrum into filter bank steps; a log filter bank step; and, after the log filter bank step, performing fixed Q-point computation. A polynomial fit is used to compute log2 in the log filter bank step.
    Type: Application
    Filed: May 2, 2002
    Publication date: December 26, 2002
    Inventors: Yu-Hung Kao, Yifan Gong
  • Patent number: 6456970
    Abstract: The search network in a speech recognition system is reduced by parsing the incoming speech and expanding all active paths (101), comparing them to speech models and scoring the paths, storing recognition level values at the slots (103), accumulating the scores, and discarding previous slots when a word end is detected, creating a word end slot (109).
    Type: Grant
    Filed: July 15, 1999
    Date of Patent: September 24, 2002
    Assignee: Texas Instruments Incorporated
    Inventor: Yu-Hung Kao
  • Patent number: 6374220
    Abstract: A method for N-best search for continuous speech recognition with limited storage space includes the steps of Viterbi pruning word level (same word, different time alignment, thus non-output differentiation) states and keeping the N-best sub-optimal paths for sentence level (output differentiation) states.
    Type: Grant
    Filed: July 15, 1999
    Date of Patent: April 16, 2002
    Assignee: Texas Instruments Incorporated
    Inventor: Yu-Hung Kao
  • Patent number: 6374222
    Abstract: A memory management method is described for reducing the size of memory required in speech recognition searching. The searching involves parsing the input speech and building a dynamically changing search tree. The basic unit of the search network is a slot. The present invention describes ways of reducing the size of the slot and therefore the size of the required memory. The slot size is reduced by removing the time index, by packing the model_index and state_index together, and by coding the last_time field so that one bit indicates that a slot is available for reuse and a second bit is used for backtrace update.
    Type: Grant
    Filed: July 16, 1999
    Date of Patent: April 16, 2002
    Assignee: Texas Instruments Incorporated
    Inventor: Yu-Hung Kao
  • Patent number: 6317712
    Abstract: Phonetic modeling includes the steps of forming triphone grammars (11) from phonetic data, training triphone models (13), clustering triphones (14) that are acoustically close together and mapping unclustered triphone grammars into a clustered model (16). The clustering process includes using a decision tree based on the acoustic likelihood and allows sub-model clusters in user-definable units.
    Type: Grant
    Filed: January 21, 1999
    Date of Patent: November 13, 2001
    Assignee: Texas Instruments Incorporated
    Inventors: Yu-Hung Kao, Kazuhiro Kondo
  • Patent number: 6285981
    Abstract: A speed up speech recognition search method is provided wherein the number of HMM states is determined and a microslot is allocated for Hidden Markov Models (HMMs) below a given threshold level of states. A macroslot treats a whole HMM as a basic unit. The lowest level of macroslot is a phone. If the number of states exceeds the threshold level a microslot is allocated for this HMM.
    Type: Grant
    Filed: June 7, 1999
    Date of Patent: September 4, 2001
    Assignee: Texas Instruments Incorporated
    Inventor: Yu-Hung Kao
  • Patent number: 5819221
    Abstract: Improved speech recognition is achieved according to the present invention by use of between-word and/or between-phrase coarticulation. The increase in the number of phonetic models required to model this additional vocabulary is reduced by clustering (19, 20) the inter-word/phrase models and grammar into only a few classes. By using one class for consonant inter-word context and two classes for vowel contexts, the accuracy for Japanese was almost as good as for unclustered models while the number of models was reduced by more than half.
    Type: Grant
    Filed: August 31, 1994
    Date of Patent: October 6, 1998
    Assignee: Texas Instruments Incorporated
    Inventors: Kazuhiro Kondo, Ikuo Kudo, Yu-Hung Kao, Barbara J. Wheatley
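
The abstract of patent 7080005 above mentions prefix delta encoding of the words in a pronunciation dictionary. The abstract does not give the storage format, so the following is only a minimal sketch of the general idea (often called front coding), assuming the word list is sorted so that neighbouring entries share long prefixes. The function names and the (shared length, suffix) tuple representation are hypothetical, chosen for illustration, and are not taken from the patent. The error encoding of pronunciations is not shown.

```python
from typing import List, Tuple


def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading characters that a and b share."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def prefix_delta_encode(words: List[str]) -> List[Tuple[int, str]]:
    """Encode each word as (chars shared with the previous word, remaining suffix).

    Illustrative representation only; a real dictionary would pack
    these fields into bytes.
    """
    encoded = []
    prev = ""
    for word in words:
        k = common_prefix_len(prev, word)
        encoded.append((k, word[k:]))
        prev = word
    return encoded


def prefix_delta_decode(encoded: List[Tuple[int, str]]) -> List[str]:
    """Rebuild the word list by reusing the prefix of the previous word."""
    words = []
    prev = ""
    for k, suffix in encoded:
        word = prev[:k] + suffix
        words.append(word)
        prev = word
    return words


words = ["speak", "speaker", "speech", "spell"]
encoded = prefix_delta_encode(words)
decoded = prefix_delta_decode(encoded)
```

In this sketch, "speaker" is stored as (5, "er") because it shares five characters with "speak". Decoding walks the list in order, so a lookup has to start from an entry whose shared length is zero; how the patented method keeps the dictionary searchable is not described in the abstract.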