Patents by Inventor Xuedong Huang

Xuedong Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and apparatus using spectral addition for speaker recognition

Patent number: 6990446

Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.

Type: Grant

Filed: October 10, 2000

Date of Patent: January 24, 2006

Assignee: Microsoft Corporation

Inventors: Xuedong Huang, Michael D. Plumpe
Method and apparatus for multi-sensory speech enhancement on a mobile device

Publication number: 20050185813

Abstract: A mobile device is provided that includes a digit input that can be manipulated by a user's fingers or thumb, an air conduction microphone and an alternative sensor that provides an alternative sensor signal indicative of speech. Under some embodiments, the mobile device also includes a proximity sensor that provides a proximity signal indicative of the distance from the mobile device to an object. Under some embodiments, the signal from the air conduction microphone, the alternative sensor signal, and the proximity signal are used to form an estimate of a clean speech value. In further embodiments, a sound is produced through a speaker in the mobile device based on the amount of noise in the clean speech value. In other embodiments, the sound produced through the speaker is based on the proximity sensor signal.

Type: Application

Filed: February 24, 2004

Publication date: August 25, 2005

Applicant: Microsoft Corporation

Inventors: Michael Sinclair, Xuedong Huang, Zhengyou Zhang
Method for entering text

Publication number: 20050149328

Abstract: In a method of entering text into a device, a first character input is provided that is indicative of a first character of a text entry. Next, a vocalization of the text entry is captured. A probable word candidate is then identified for a first word of the vocalization based upon the first character input and an analysis of the vocalization. Finally, the probable word candidate is displayed for a user.

Type: Application

Filed: December 30, 2003

Publication date: July 7, 2005

Applicant: Microsoft Corporation

Inventors: Xuedong Huang, Alejandro Acero, Kuansan Wang, Milind Mahajan
Method of noise reduction using correction and scaling vectors with partitioning of the acoustic space in the domain of noisy speech

Publication number: 20050149325

Abstract: A method and apparatus are provided for reducing noise in a training signal and/or test signal. The noise reduction technique uses a stereo signal formed of two channel signals, each channel containing the same pattern signal. One of the channel signals is “clean” and the other includes additive noise. Using feature vectors from these channel signals, a collection of noise correction and scaling vectors is determined. When a feature vector of a noisy pattern signal is later received, it is multiplied by the best scaling vector for that feature vector and the best correction vector is added to the product to produce a noise reduced feature vector. Under one embodiment, the best scaling and correction vectors are identified by choosing an optimal mixture component for the noisy feature vector. The optimal mixture component is selected based on a distribution of noisy channel feature vectors associated with each mixture component.

Type: Application

Filed: February 16, 2005

Publication date: July 7, 2005

Applicant: Microsoft Corporation

Inventors: Li Deng, Xuedong Huang, Alejandro Acero
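
The scaling-and-correction step in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation. The diagonal-Gaussian model for each mixture component, the equal component priors, the element-wise multiplication, and the names `MixtureComponent`, `log_likelihood`, and `reduce_noise` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MixtureComponent:
    """One partition of the noisy acoustic space (hypothetical structure)."""
    mean: List[float]        # mean of noisy-channel feature vectors in this partition
    var: List[float]         # per-dimension variance (diagonal Gaussian, an assumption)
    scale: List[float]       # scaling vector learned for this partition
    correction: List[float]  # correction vector learned for this partition


def log_likelihood(x: Sequence[float], comp: MixtureComponent) -> float:
    """Log density of x under the component's diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, comp.mean, comp.var)
    )


def reduce_noise(x: Sequence[float], components: Sequence[MixtureComponent]) -> List[float]:
    """Pick the most likely component for the noisy vector, then scale and correct it."""
    # Equal priors are assumed, so the best component is the one with the highest likelihood.
    best = max(components, key=lambda c: log_likelihood(x, c))
    # Element-wise: scaled noisy feature plus correction.
    return [s * xi + r for xi, s, r in zip(x, best.scale, best.correction)]


components = [
    MixtureComponent(mean=[0.0, 0.0], var=[1.0, 1.0], scale=[1.0, 1.0], correction=[0.0, 0.0]),
    MixtureComponent(mean=[10.0, 10.0], var=[1.0, 1.0], scale=[0.5, 0.5], correction=[-1.0, -1.0]),
]
cleaned = reduce_noise([10.0, 12.0], components)
```

In this toy setup the vector `[10.0, 12.0]` falls in the second partition, so it is halved and shifted by -1 in each dimension.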
Method and apparatus using spectral addition for speaker recognition

Publication number: 20050143997

Abstract: A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data and test data instead of subtracting from the mean and variance.

Type: Application

Filed: February 24, 2005

Publication date: June 30, 2005

Applicant: Microsoft Corporation

Inventors: Xuedong Huang, Michael Plumpe
Method and apparatus for multi-sensory speech enhancement

Publication number: 20050114124

Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conduction microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.

Type: Application

Filed: November 26, 2003

Publication date: May 26, 2005

Applicant: Microsoft Corporation

Inventors: Zicheng Liu, Michael Sinclair, Alejandro Acero, Xuedong Huang, James Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
Use of a unified language model

Publication number: 20050080615

Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.

Type: Application

Filed: December 3, 2004

Publication date: April 14, 2005

Applicant: Microsoft Corporation

Inventors: Xuedong Huang, Milind Mahajan, Ye-Yi Wang, Xiaolong Mou
Use of a unified language model

Publication number: 20050080611

Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.

Type: Application

Filed: December 3, 2004

Publication date: April 14, 2005

Applicant: Microsoft Corporation

Inventors: Xuedong Huang, Milind Mahajan, Ye-Yi Wang, Xiaolong Mou
Pattern recognition training method and apparatus using inserted noise followed by noise reduction

Patent number: 6876966

Abstract: A method and apparatus for training and using a pattern recognition model are provided. Under the invention, additive noise that matches noise expected in a test signal is included in a training signal. The noisy training signal is passed through one or more noise reduction techniques to produce pseudo-clean training data. The pseudo-clean training data is used to train the pattern recognition model. When the test signal is received, it is passed through the same noise reduction techniques used on the noisy training signal. This produces pseudo-clean test data, which is applied to the pattern recognition model. Under one embodiment, sets of training data are produced with each set containing a different type of noise.

Type: Grant

Filed: October 16, 2000

Date of Patent: April 5, 2005

Assignee: Microsoft Corporation

Inventors: Li Deng, Xuedong Huang, Michael D. Plumpe
Head mounted multi-sensory audio input system

Publication number: 20050033571

Abstract: The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.

Type: Application

Filed: August 7, 2003

Publication date: February 10, 2005

Applicant: Microsoft Corporation

Inventors: Xuedong Huang, Zicheng Liu, Zhengyou Zhang, Michael Sinclair, Alejandro Acero
Multi-sensory speech detection system

Publication number: 20050027515

Abstract: The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.

Type: Application

Filed: July 29, 2003

Publication date: February 3, 2005

Applicant: Microsoft Corporation

Inventors: Xuedong Huang, Zicheng Liu, Zhengyou Zhang, Michael Sinclair, Alejandro Acero
Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Publication number: 20040019483

Abstract: A method of speech recognition is provided that identifies a production-related dynamics value by performing a linear interpolation between a production-related dynamics value at a previous time and a production-related target using a time-dependent interpolation weight. The hidden production-related dynamics value is used to compute a predicted value that is compared to an observed value of acoustics to determine the likelihood of the observed acoustics given a sequence of hidden phonological units. In some embodiments, the production-related dynamics value at the previous time is selected from a set of continuous values. In addition, the likelihood of the observed acoustics given a sequence of hidden phonological units is combined with a score associated with a discrete class of production-related dynamic values at the previous time to determine a score for a current phonological state.

Type: Application

Filed: October 9, 2002

Publication date: January 29, 2004

Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide, Asela J.R. Gunawardana, Hagai Attias, Alejandro Acero, Xuedong Huang
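
The linear interpolation named in the abstract above can be sketched as follows. The abstract does not say which term receives the weight, so this sketch assumes the weight applies to the previous value and its complement applies to the target. The function names `interpolate_dynamics` and `trajectory` are hypothetical.

```python
from typing import List, Sequence


def interpolate_dynamics(previous: float, target: float, weight: float) -> float:
    """Linear interpolation between the previous dynamics value and the target.

    Assumption: `weight` multiplies the previous value and (1 - weight)
    multiplies the target; the abstract does not fix this convention.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("interpolation weight must lie in [0, 1]")
    return weight * previous + (1.0 - weight) * target


def trajectory(start: float, target: float, weights: Sequence[float]) -> List[float]:
    """Apply one time-dependent weight per frame and collect the hidden values."""
    values = []
    current = start
    for w in weights:
        current = interpolate_dynamics(current, target, w)
        values.append(current)
    return values


path = trajectory(0.0, 2.0, [0.5, 0.5])
```

With a constant weight of 0.5, each frame moves the hidden value halfway toward the target; a time-dependent schedule would simply pass a different weight per frame.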
Personal mobile computing device having antenna microphone for improved speech recognition

Patent number: 6675027

Abstract: A mobile computing device, adapted to be held in the palm of a user's hand, includes an antenna for transmission of information from the mobile computing device. A first microphone, adapted to convert audible speech from the user into speech signals, is positioned at a distal end of the antenna. The antenna is rotatable, while the mobile computing device is held by the user, into a position which directs the first microphone toward the mouth of the user.

Type: Grant

Filed: November 22, 1999

Date of Patent: January 6, 2004

Inventor: Xuedong Huang
Fuzzy keyboard

Patent number: 6654733

Abstract: Fuzzy keyboards, to determine a most-likely-to-be-intended keystroke or keystrokes, are disclosed. In one embodiment, a method adds each of one or more keys to each of a current list of key sequence hypotheses, to create a new list of key sequence hypotheses. The method determines a likelihood probability for each hypothesis in the new list, and removes any hypothesis failing to satisfy any of one or more thresholds. The most likely key sequence of the new list may then be displayed. Some embodiments of the invention relate specifically to soft keyboards, while other embodiments relate specifically to real, physical and hard keyboards.

Type: Grant

Filed: January 18, 2000

Date of Patent: November 25, 2003

Assignee: Microsoft Corporation

Inventors: Joshua Goodman, Daniel Venolia, Xuedong Huang
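
The hypothesis-list update described in the abstract above can be sketched roughly as below. This is an illustration under stated assumptions, not the patented method: the likelihood is taken as a plain product of per-key probabilities, and the two thresholds (a ratio to the best hypothesis and a maximum list size) are hypothetical choices, as is the name `extend_hypotheses`.

```python
from typing import Dict, List, Tuple

Hypothesis = Tuple[str, float]  # (key sequence, likelihood)


def extend_hypotheses(
    hypotheses: List[Hypothesis],
    key_probs: Dict[str, float],
    min_ratio: float = 0.1,
    max_hypotheses: int = 5,
) -> List[Hypothesis]:
    """Add each candidate key to each current hypothesis, then prune.

    `key_probs` maps each candidate key for the latest keystroke to an
    assumed probability that the user intended it.
    """
    expanded = [
        (seq + key, p * kp)
        for seq, p in hypotheses
        for key, kp in key_probs.items()
    ]
    if not expanded:
        return []
    # Most likely hypothesis first; it is the one that would be displayed.
    expanded.sort(key=lambda h: h[1], reverse=True)
    best = expanded[0][1]
    # Threshold 1: drop hypotheses far less likely than the best one.
    kept = [h for h in expanded if h[1] >= best * min_ratio]
    # Threshold 2: cap the list length.
    return kept[:max_hypotheses]


step1 = extend_hypotheses([("", 1.0)], {"t": 0.6, "r": 0.4})
step2 = extend_hypotheses(step1, {"h": 0.9, "j": 0.05, "g": 0.05})
```

After the second keystroke only the sequences ending in the dominant key survive pruning, and the first entry of `step2` is the sequence to display.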
Distributed speech recognition for mobile communication devices

Publication number: 20030182113

Abstract: A method of performing speech recognition, and a mobile computing device implementing the same, are disclosed. The method includes receiving audible speech at a microphone of the mobile computing device. The audible speech is converted into speech signals at the mobile computing device. Also at the mobile computing device, preliminary and secondary speech recognition functions are performed on the speech signals to obtain requests for results from modules. Then, the requests for results are transmitted from the mobile computing device to a second computing device located remotely from the mobile computing device to obtain the results which are then transmitted back to the mobile computing device for completion of the speech recognition process.

Type: Application

Filed: March 24, 2003

Publication date: September 25, 2003

Inventor: Xuedong Huang
Predictive keyboard

Patent number: 6573844

Abstract: Predictive keyboards, such as predictive soft keyboards, are disclosed. In one embodiment, a computer-implemented method predicts at least one key to be entered next within a sequence of keys. The method displays a soft keyboard where the predicted keys are displayed on the soft keyboard differently than the other keys on the keyboard. For example, the predicted keys may be larger in size on the soft keyboard as compared to the other keys. This makes the predicted keys more easily typed by a user as compared to the other keys.

Type: Grant

Filed: January 18, 2000

Date of Patent: June 3, 2003

Assignee: Microsoft Corporation

Inventors: Daniel Venolia, Joshua Goodman, Xuedong Huang, Hsiao-Wuen Hon
Speech recognition method and apparatus utilizing multiple feature streams

Patent number: 6542866

Abstract: A method and apparatus is provided for using multiple feature streams in speech recognition. In the method and apparatus, a feature extractor generates at least two feature vectors for a segment of an input signal. A decoder then generates a path score that is indicative of the probability that a word is represented by the input signal. The path score is generated by selecting the best feature vector to use for each segment. For each segment, the corresponding part in the path score for that segment is based in part on a chosen segment score that is selected from a group of at least two segment scores. The segment scores each represent a separate probability that a particular segment unit (e.g. senone, phoneme, diphone, triphone, or word) appears in that segment of the input signal. Although each segment score in the group relates to the same segment unit, the scores are based on different feature vectors for the segment.

Type: Grant

Filed: September 22, 1999

Date of Patent: April 1, 2003

Assignee: Microsoft Corporation

Inventors: Li Jiang, Xuedong Huang
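
The per-segment score selection in the abstract above can be sketched as follows. The sketch assumes that segment scores are log probabilities, that the chosen score is the highest in each group, and that the path score is their sum; the abstract only states that the path score is based in part on a chosen score from each group. The name `path_score` is hypothetical.

```python
from typing import Sequence


def path_score(segment_scores: Sequence[Sequence[float]]) -> float:
    """Combine per-segment scores from multiple feature streams.

    segment_scores[i][k] is the assumed log probability that the same
    segment unit appears in segment i, computed from feature stream k.
    """
    total = 0.0
    for scores in segment_scores:
        if len(scores) < 2:
            raise ValueError("each segment needs scores from at least two feature streams")
        # Choose the best feature stream for this segment.
        total += max(scores)
    return total


# Two segments, two feature streams each (illustrative values only).
score = path_score([[-1.0, -2.0], [-3.0, -0.5]])
```

Here the first stream wins for the first segment and the second stream wins for the second.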
Confidence measures using sub-word-dependent weighting of sub-word confidence scores for robust speech recognition

Patent number: 6539353

Abstract: A method and apparatus is provided for speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract at least one feature from the digital signal. A hypothesis word string that consists of sub-word units is identified from the extracted feature. For each identified word, a word confidence measure is determined based on weighted confidence measure scores for each sub-word unit in the word. The weighted confidence measure scores are created by applying different weights to confidence scores associated with different sub-words of the hypothesis word.

Type: Grant

Filed: October 12, 1999

Date of Patent: March 25, 2003

Assignee: Microsoft Corporation

Inventors: Li Jiang, Xuedong Huang
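
The sub-word weighting in the abstract above can be illustrated with a short sketch. The abstract does not specify how the weighted scores are combined, so a normalized weighted average is assumed here. The weight table, the default weight, and the name `word_confidence` are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def word_confidence(
    subword_scores: Sequence[Tuple[str, float]],
    weights: Dict[str, float],
    default_weight: float = 1.0,
) -> float:
    """Word-level confidence from sub-word confidence scores.

    Each sub-word unit's score is weighted by a value that depends on the
    unit itself; the result is a weighted average (an assumption).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for unit, score in subword_scores:
        w = weights.get(unit, default_weight)
        weighted_sum += w * score
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("no sub-word scores with non-zero weight")
    return weighted_sum / weight_total


# The vowel is trusted twice as much as the consonants in this toy example.
confidence = word_confidence([("k", 0.5), ("ae", 1.0), ("t", 0.5)], {"ae": 2.0})
```

A recognizer could compare the resulting value against a rejection threshold, although the abstract does not describe that step.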
Two-tier noise rejection in speech recognition

Patent number: 6502072

Abstract: A method and apparatus is provided for two-tier noise rejection in speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract features from the digital signal. A hypothesis speech word and a hypothesis noise word are identified from respective extracted features. The features associated with the hypothesis speech word are examined in a second tier of noise rejection to determine if the features are more likely to represent noise than speech. The hypothesis speech word is replaced by a noise marker if the features are more likely to represent noise than speech.

Type: Grant

Filed: October 12, 1999

Date of Patent: December 31, 2002

Assignee: Microsoft Corporation

Inventors: Li Jiang, Xuedong Huang
Proofreading with text to speech feedback

Patent number: 6490563

Abstract: A computer implemented system and method of proofreading text in a computer system includes receiving text from a user into a text editing module. At least a portion of the text is converted to an audio signal upon the detection of an indicator, the indicator defining a boundary in the text by either being embodied therein or comprising delays in receiving text. The audio signal is played through a speaker to the user to provide feedback.

Type: Grant

Filed: August 17, 1998

Date of Patent: December 3, 2002

Assignee: Microsoft Corporation

Inventors: Hsiao-Wuen Hon, Dong Li, Xuedong Huang, Yun-Chen Ju, Xianghui Sean Zhang
