Patents by Inventor Xuedong Huang

Xuedong Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 6990446
    Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.
    Type: Grant
    Filed: October 10, 2000
    Date of Patent: January 24, 2006
    Assignee: Microsoft Corporation
    Inventors: Xuedong Huang, Michael D. Plumpe
  • Publication number: 20050185813
    Abstract: A mobile device is provided that includes a digit input that can be manipulated by a user's fingers or thumb, an air conduction microphone and an alternative sensor that provides an alternative sensor signal indicative of speech. Under some embodiments, the mobile device also includes a proximity sensor that provides a proximity signal indicative of the distance from the mobile device to an object. Under some embodiments, the signal from the air conduction microphone, the alternative sensor signal, and the proximity signal are used to form an estimate of a clean speech value. In further embodiments, a sound is produced through a speaker in the mobile device based on the amount of noise in the clean speech value. In other embodiments, the sound produced through the speaker is based on the proximity sensor signal.
    Type: Application
    Filed: February 24, 2004
    Publication date: August 25, 2005
    Applicant: Microsoft Corporation
    Inventors: Michael Sinclair, Xuedong Huang, Zhengyou Zhang
  • Publication number: 20050149328
    Abstract: In a method of entering text into a device, a first character input is provided that is indicative of a first character of a text entry. Next, a vocalization of the text entry is captured. A probable word candidate is then identified for a first word of the vocalization based upon the first character input and an analysis of the vocalization. Finally, the probable word candidate is displayed for a user.
    Type: Application
    Filed: December 30, 2003
    Publication date: July 7, 2005
    Applicant: Microsoft Corporation
    Inventors: Xuedong Huang, Alejandro Acero, Kuansan Wang, Milind Mahajan
  • Publication number: 20050149325
    Abstract: A method and apparatus are provided for reducing noise in a training signal and/or test signal. The noise reduction technique uses a stereo signal formed of two channel signals, each channel containing the same pattern signal. One of the channel signals is “clean” and the other includes additive noise. Using feature vectors from these channel signals, a collection of noise correction and scaling vectors is determined. When a feature vector of a noisy pattern signal is later received, it is multiplied by the best scaling vector for that feature vector and the best correction vector is added to the product to produce a noise reduced feature vector. Under one embodiment, the best scaling and correction vectors are identified by choosing an optimal mixture component for the noisy feature vector. The optimal mixture component is selected based on a distribution of noisy channel feature vectors associated with each mixture component.
    Type: Application
    Filed: February 16, 2005
    Publication date: July 7, 2005
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Xuedong Huang, Alejandro Acero
  • Publication number: 20050143997
    Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.
    Type: Application
    Filed: February 24, 2005
    Publication date: June 30, 2005
    Applicant: Microsoft Corporation
    Inventors: Xuedong Huang, Michael Plumpe
  • Publication number: 20050114124
    Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conduction microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.
    Type: Application
    Filed: November 26, 2003
    Publication date: May 26, 2005
    Applicant: Microsoft Corporation
    Inventors: Zicheng Liu, Michael Sinclair, Alejandro Acero, Xuedong Huang, James Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
  • Publication number: 20050080615
    Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.
    Type: Application
    Filed: December 3, 2004
    Publication date: April 14, 2005
    Applicant: Microsoft Corporation
    Inventors: Xuedong Huang, Milind Mahajan, Ye-Yi Wang, Xiaolong Mou
  • Publication number: 20050080611
    Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.
    Type: Application
    Filed: December 3, 2004
    Publication date: April 14, 2005
    Applicant: Microsoft Corporation
    Inventors: Xuedong Huang, Milind Mahajan, Ye-Yi Wang, Xiaolong Mou
  • Patent number: 6876966
    Abstract: A method and apparatus for training and using a pattern recognition model are provided. Under the invention, additive noise that matches noise expected in a test signal is included in a training signal. The noisy training signal is passed through one or more noise reduction techniques to produce pseudo-clean training data. The pseudo-clean training data is used to train the pattern recognition model. When the test signal is received, it is passed through the same noise reduction techniques used on the noisy training signal. This produces pseudo-clean test data, which is applied to the pattern recognition model. Under one embodiment, sets of training data are produced with each set containing a different type of noise.
    Type: Grant
    Filed: October 16, 2000
    Date of Patent: April 5, 2005
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Xuedong Huang, Michael D. Plumpe
  • Publication number: 20050033571
    Abstract: The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.
    Type: Application
    Filed: August 7, 2003
    Publication date: February 10, 2005
    Applicant: Microsoft Corporation
    Inventors: Xuedong Huang, Zicheng Liu, Zhengyou Zhang, Michael Sinclair, Alejandro Acero
  • Publication number: 20050027515
    Abstract: The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.
    Type: Application
    Filed: July 29, 2003
    Publication date: February 3, 2005
    Applicant: Microsoft Corporation
    Inventors: Xuedong Huang, Zicheng Liu, Zhengyou Zhang, Michael Sinclair, Alejandro Acero
  • Publication number: 20040019483
    Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.
    Type: Application
    Filed: October 9, 2002
    Publication date: January 29, 2004
    Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J.R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
  • Patent number: 6675027
    Abstract: A mobile computing device, adapted to be held in the palm of a user's hand, includes an antenna for transmission of information from the mobile computing device. A first microphone, adapted to convert audible speech from the user into speech signals, is positioned at a distal end of the antenna. The antenna is rotatable, while the mobile computing device is held by the user, into a position which directs the first microphone toward the mouth of the user.
    Type: Grant
    Filed: November 22, 1999
    Date of Patent: January 6, 2004
    Inventor: Xuedong Huang
  • Patent number: 6654733
    Abstract: Fuzzy keyboards, to determine a most-likely-to-be-intended keystroke or keystrokes, are disclosed. In one embodiment, a method adds each of one or more keys to each of a current list of key sequence hypotheses, to create a new list of key sequence hypotheses. The method determines a likelihood probability for each hypothesis in the new list, and removes any hypothesis failing to satisfy any of one or more thresholds. The most likely key sequence of the new list may then be displayed. Some embodiments of the invention relate specifically to soft keyboards, while other embodiments relate specifically to real, physical and hard keyboards.
    Type: Grant
    Filed: January 18, 2000
    Date of Patent: November 25, 2003
    Assignee: Microsoft Corporation
    Inventors: Joshua Goodman, Daniel Venolia, Xuedong Huang
  • Publication number: 20030182113
    Abstract: A method of performing speech recognition, and a mobile computing device implementing the same, are disclosed. The method includes receiving audible speech at a microphone of the mobile computing device. The audible speech is converted into speech signals at the mobile computing device. Also at the mobile computing device, preliminary and secondary speech recognition functions are performed on the speech signals to obtain requests for results from modules. Then, the requests for results are transmitted from the mobile computing device to a second computing device located remotely from the mobile computing device to obtain the results which are then transmitted back to the mobile computing device for completion of the speech recognition process.
    Type: Application
    Filed: March 24, 2003
    Publication date: September 25, 2003
    Inventor: Xuedong Huang
  • Patent number: 6573844
    Abstract: Predictive keyboards, such as predictive soft keyboards, are disclosed. In one embodiment, a computer-implemented method predicts at least one key to be entered next within a sequence of keys. The method displays a soft keyboard where the predicted keys are displayed on the soft keyboard differently than the other keys on the keyboard. For example, the predicted keys may be larger in size on the soft keyboard as compared to the other keys. This makes the predicted keys more easily typed by a user as compared to the other keys.
    Type: Grant
    Filed: January 18, 2000
    Date of Patent: June 3, 2003
    Assignee: Microsoft Corporation
    Inventors: Daniel Venolia, Joshua Goodman, Xuedong Huang, Hsiao-Wuen Hon
  • Patent number: 6542866
    Abstract: A method and apparatus is provided for using multiple feature streams in speech recognition. In the method and apparatus, a feature extractor generates at least two feature vectors for a segment of an input signal. A decoder then generates a path score that is indicative of the probability that a word is represented by the input signal. The path score is generated by selecting the best feature vector to use for each segment. For each segment, the corresponding part in the path score for that segment is based in part on a chosen segment score that is selected from a group of at least two segment scores. The segment scores each represent a separate probability that a particular segment unit (e.g. senone, phoneme, diphone, triphone, or word) appears in that segment of the input signal. Although each segment score in the group relates to the same segment unit, the scores are based on different feature vectors for the segment.
    Type: Grant
    Filed: September 22, 1999
    Date of Patent: April 1, 2003
    Assignee: Microsoft Corporation
    Inventors: Li Jiang, Xuedong Huang
  • Patent number: 6539353
    Abstract: A method and apparatus is provided for speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract at least one feature from the digital signal. A hypothesis word string that consists of sub-word units is identified from the extracted feature. For each identified word, a word confidence measure is determined based on weighted confidence measure scores for each sub-word unit in the word. The weighted confidence measure scores are created by applying different weights to confidence scores associated with different sub-words of the hypothesis word.
    Type: Grant
    Filed: October 12, 1999
    Date of Patent: March 25, 2003
    Assignee: Microsoft Corporation
    Inventors: Li Jiang, Xuedong Huang
  • Patent number: 6502072
    Abstract: A method and apparatus is provided for two-tier noise rejection in speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract features from the digital signal. A hypothesis speech word and a hypothesis noise word are identified from respective extracted features. The features associated with the hypothesis speech word are examined in a second tier of noise rejection to determine if the features are more likely to represent noise than speech. The hypothesis speech word is replaced by a noise marker if the features are more likely to represent noise than speech.
    Type: Grant
    Filed: October 12, 1999
    Date of Patent: December 31, 2002
    Assignee: Microsoft Corporation
    Inventors: Li Jiang, Xuedong Huang
  • Patent number: 6490563
    Abstract: A computer implemented system and method of proofreading text in a computer system includes receiving text from a user into a text editing module. At least a portion of the text is converted to an audio signal upon the detection of an indicator, the indicator defining a boundary in the text by either being embodied therein or comprising delays in receiving text. The audio signal is played through a speaker to the user to provide feedback.
    Type: Grant
    Filed: August 17, 1998
    Date of Patent: December 3, 2002
    Assignee: Microsoft Corporation
    Inventors: Hsiao-Wuen Hon, Dong Li, Xuedong Huang, Yun-Chen Ju, Xianghui Sean Zhang
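
The fuzzy keyboard abstract (patent 6654733) describes extending each key sequence hypothesis with candidate keys, scoring the new hypotheses, and removing those that fail a threshold. The following is a minimal Python sketch of that loop, not the patented implementation. The function names, the per-tap key probabilities, and the two thresholds (`min_prob` and `max_hyps`) are assumptions chosen for illustration; the abstract does not say how likelihoods are computed or which thresholds are used.

```python
from typing import Dict, List, Tuple

# A hypothesis is a (key sequence, probability) pair.
Hypothesis = Tuple[str, float]


def extend_hypotheses(
    hypotheses: List[Hypothesis],
    key_probs: Dict[str, float],
    min_prob: float = 1e-3,  # assumed threshold: absolute probability floor
    max_hyps: int = 5,       # assumed threshold: cap on surviving hypotheses
) -> List[Hypothesis]:
    """Add each candidate key to each current hypothesis, then prune."""
    new_list = [
        (seq + key, prob * key_prob)
        for seq, prob in hypotheses
        for key, key_prob in key_probs.items()
    ]
    # Remove hypotheses that fall below the probability floor.
    new_list = [h for h in new_list if h[1] >= min_prob]
    # Keep only the most likely hypotheses, best first.
    new_list.sort(key=lambda h: h[1], reverse=True)
    return new_list[:max_hyps]


# Hypothetical taps: each maps nearby keys to an assumed probability
# that the user intended that key.
taps = [
    {"t": 0.7, "r": 0.3},
    {"h": 0.6, "g": 0.4},
    {"e": 0.9, "w": 0.1},
]

hyps: List[Hypothesis] = [("", 1.0)]
for tap in taps:
    hyps = extend_hypotheses(hyps, tap)

# The list is sorted, so the first entry is the most likely key sequence.
best = hyps[0][0]
print(best)
```

In this sketch the score is just the product of the per-tap key probabilities. A real system could also fold in a language model score at each step, which is a design choice the abstract leaves open.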