Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Adaptation of exponential models

Publication number: 20060018541

Abstract: A method and apparatus are provided for adapting an exponential probability model. In a first stage, a general-purpose background model is built from background data by determining a set of model parameters for the probability model based on a set of background data. The background model parameters are then used to define a prior model for the parameters of an adapted probability model that is adapted and more specific to an adaptation data set of interest. The adaptation data set is generally of much smaller size than the background data set. A second set of model parameters are then determined for the adapted probability model based on the set of adaptation data and the prior model.
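
The two-stage procedure in this abstract can be illustrated with a minimal sketch. It assumes the simplest exponential model (a single log-odds parameter) and a Gaussian prior centered on the background parameter; the abstract does not state the form of the prior, and the function name, step size, and data counts below are hypothetical.

```python
import math


def fit_log_odds(successes, trials, prior_mean=0.0, prior_var=None,
                 steps=500, lr=0.1):
    """Fit theta in p = 1 / (1 + exp(-theta)) by gradient ascent.

    With prior_var=None this maximizes the likelihood alone; otherwise it
    adds a Gaussian log-prior centered on prior_mean (an assumed form).
    """
    theta = prior_mean
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-theta))
        grad = successes - trials * p  # gradient of the log-likelihood
        if prior_var is not None:
            grad -= (theta - prior_mean) / prior_var  # gradient of the log-prior
        theta += lr * grad / trials
    return theta


# Stage 1: background model from a large data set (hypothetical counts).
theta_background = fit_log_odds(successes=3000, trials=10000)
# Stage 2: adapted model from a much smaller data set, anchored to the
# background parameter through the prior.
theta_adapted = fit_log_odds(successes=8, trials=10,
                             prior_mean=theta_background, prior_var=0.5)
```

In this sketch the adapted parameter lands between the background estimate and the estimate the small adaptation set would give on its own; a smaller `prior_var` pulls it closer to the background model.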

Type: Application

Filed: October 29, 2004

Publication date: January 26, 2006

Applicant: Microsoft Corporation

Inventors: Ciprian Chelba, Alejandro Acero

Method and apparatus for denoising and deverberation using variational inference and strong speech models

Patent number: 6990447

Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.

Type: Grant

Filed: November 15, 2001

Date of Patent: January 24, 2006

Assignee: Microsoft Corporation

Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero

Method and apparatus for removing noise from feature vectors

Patent number: 6985858

Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. The method is based on variational inference techniques. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Further aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Additional aspects of the invention include using a variance for the noisy signal feature vector conditioned on fixed values of noise, channel transfer function, and clean speech, when identifying the clean signal feature vector.

Type: Grant

Filed: March 20, 2001

Date of Patent: January 10, 2006

Assignee: Microsoft Corporation

Inventors: Brendan J. Frey, Alejandro Acero, Li Deng

Removing noise from feature vectors

Publication number: 20050273325

Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. Aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.

Type: Application

Filed: July 20, 2005

Publication date: December 8, 2005

Applicant: Microsoft Corporation

Inventors: Brendan Frey, Alejandro Acero, Li Deng

Noise reduction using correction vectors based on dynamic aspects of speech and noise normalization

Publication number: 20050259558

Abstract: A method and apparatus are provided for reducing noise in a signal. Under one aspect of the invention, a correction vector is selected based on a noisy feature vector that represents a noisy signal. The selected correction vector incorporates dynamic aspects of pattern signals. The selected correction vector is then added to the noisy feature vector to produce a cleaned feature vector. In other aspects of the invention, a noise value is produced from an estimate of the noise in a noisy signal. The noise value is subtracted from a value representing a portion of the noisy signal to produce a noise-normalized value. The noise-normalized value is used to select a correction value that is added to the noise-normalized value to produce a cleaned noise-normalized value. The noise value is then added to the cleaned noise-normalized value to produce a cleaned value representing a portion of a cleaned signal.
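
The noise-normalization steps in the second half of this abstract (subtract the noise value, select and add a correction, add the noise value back) can be sketched as follows. The correction table, its centroids, and the nearest-centroid selection rule are hypothetical placeholders; the abstract does not say how the correction value is selected.

```python
# Hypothetical table of (centroid, correction) pairs for noise-normalized values.
CORRECTIONS = [(-5.0, 0.0), (0.0, -0.5), (10.0, -2.0)]


def select_correction(normalized):
    """Pick the correction whose centroid is nearest (an assumed rule)."""
    _, correction = min(CORRECTIONS, key=lambda c: abs(c[0] - normalized))
    return correction


def clean_value(noisy, noise_estimate):
    normalized = noisy - noise_estimate                            # noise-normalize
    cleaned_normalized = normalized + select_correction(normalized)  # correct
    return cleaned_normalized + noise_estimate                     # restore noise level


result = clean_value(noisy=12.0, noise_estimate=11.0)
```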

Type: Application

Filed: July 26, 2005

Publication date: November 24, 2005

Applicant: Microsoft Corporation

Inventors: James Droppo, Li Deng, Alejandro Acero

Removing noise from feature vectors

Publication number: 20050256706

Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.

Type: Application

Filed: July 20, 2005

Publication date: November 17, 2005

Applicant: Microsoft Corporation

Inventors: Brendan Frey, Alejandro Acero, Li Deng

Including the category of environmental noise when processing speech signals

Patent number: 6959276

Abstract: A method and apparatus are provided for identifying a noise environment for a frame of an input signal based on at least one feature for that frame. Under one embodiment, the noise environment is identified by determining the probability of each of a set of possible noise environments. For some embodiments, the probabilities of the noise environments for past frames are included in the identification of an environment for a current frame. In one particular embodiment, a count is generated for each environment that indicates the number of past frames for which the environment was the most probable environment. The environment with the highest count is then selected as the environment for the current frame.
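
The counting embodiment described here can be sketched as a vote over a window of frames. The window length, the class name, and the choice to include the current frame in the count are assumptions made for illustration; the environment labels are hypothetical.

```python
from collections import Counter, deque


class EnvironmentTracker:
    """Select the environment that was most probable in the most recent frames."""

    def __init__(self, history=5):
        # Keeps the per-frame winners for the last `history` frames.
        self.recent = deque(maxlen=history)

    def update(self, probabilities):
        """probabilities: dict mapping environment name to its probability."""
        self.recent.append(max(probabilities, key=probabilities.get))
        # The environment with the highest count wins.
        return Counter(self.recent).most_common(1)[0][0]


tracker = EnvironmentTracker()
frames = [
    {"car": 0.7, "office": 0.3},
    {"car": 0.6, "office": 0.4},
    {"car": 0.4, "office": 0.6},  # a single outlier frame
]
selected = [tracker.update(f) for f in frames]
```

Counting winners over several frames keeps one outlier frame from switching the selected environment, which is the smoothing effect the abstract describes.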

Type: Grant

Filed: September 27, 2001

Date of Patent: October 25, 2005

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Li Deng

Method and apparatus for predicting word error rates from text

Publication number: 20050228670

Abstract: A method of modeling a speech recognition system includes decoding a speech signal produced from a training text to produce a sequence of predicted speech units. The training text comprises a sequence of actual speech units that is used with the sequence of predicted speech units to form a confusion model. In further embodiments, the confusion model is used to decode a text to identify an error rate that would be expected if the speech recognition system decoded speech based on the text.
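
A confusion model of this kind can be sketched as a table of conditional probabilities estimated from paired actual and predicted units. The sketch assumes the two sequences are already aligned one-to-one, so insertions and deletions are not modeled, and it replaces the decoding step in the abstract with a simple average of per-unit error probabilities; both function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_confusion_model(actual, predicted):
    """Estimate P(predicted unit | actual unit) from aligned sequences."""
    counts = defaultdict(Counter)
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    return {a: {p: n / sum(c.values()) for p, n in c.items()}
            for a, c in counts.items()}


def expected_error_rate(model, units):
    """Average probability that a unit of the text is misrecognized."""
    errors = [1.0 - model[u].get(u, 0.0) for u in units if u in model]
    return sum(errors) / len(errors) if errors else 0.0


model = build_confusion_model(list("aabb"), list("abbb"))
rate = expected_error_rate(model, ["a", "b"])
```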

Type: Application

Filed: June 6, 2005

Publication date: October 13, 2005

Applicant: Microsoft Corporation

Inventors: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela Gunawardana, Ciprian Chelba

Language model adaptation using semantic supervision

Publication number: 20050228641

Abstract: A method and apparatus are provided for adapting a language model. The method and apparatus provide supervised class-based adaptation of the language model utilizing in-domain semantic information.

Type: Application

Filed: March 31, 2004

Publication date: October 13, 2005

Applicant: Microsoft Corporation

Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero, Yik-Cheung Tam

Representation of a deleted interpolation N-gram language model in ARPA standard format

Publication number: 20050216265

Abstract: A method and apparatus are provided for storing parameters of a deleted interpolation language model as parameters of a backoff language model. In particular, the parameters of the deleted interpolation language model are stored in the standard ARPA format. Under one embodiment, the deleted interpolation language model parameters are formed using fractional counts.
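
One way to see why such a mapping is possible is the following bigram sketch. It assumes a deleted interpolation model of the form P(w|h) = (1 - lam) * f(w|h) + lam * P(w) with a single fixed weight `lam`, which may differ from the formulation in the application. For a bigram that was never observed, f(w|h) is zero, so the interpolated probability equals lam * P(w), which is exactly what a backoff model computes when lam is stored as the backoff weight. Values are stored as base-10 logarithms, as in the ARPA format; all names are hypothetical.

```python
import math


def to_backoff(bigram_counts, unigram_probs, lam):
    """Return ({(h, w): log10 prob}, {h: log10 backoff weight})."""
    context_totals = {}
    for (h, _), n in bigram_counts.items():
        context_totals[h] = context_totals.get(h, 0) + n
    log_probs = {}
    for (h, w), n in bigram_counts.items():  # counts may be fractional
        p = (1 - lam) * n / context_totals[h] + lam * unigram_probs[w]
        log_probs[(h, w)] = math.log10(p)
    backoff = {h: math.log10(lam) for h in context_totals}
    return log_probs, backoff


def backoff_prob(h, w, log_probs, backoff, unigram_probs):
    """Standard backoff lookup: listed bigram, else weight times unigram."""
    if (h, w) in log_probs:
        return 10 ** log_probs[(h, w)]
    return 10 ** backoff.get(h, 0.0) * unigram_probs[w]


unigrams = {"a": 0.5, "b": 0.3, "c": 0.2}
log_probs, backoff = to_backoff({("a", "b"): 3, ("a", "a"): 1}, unigrams, lam=0.4)
total = sum(backoff_prob("a", w, log_probs, backoff, unigrams) for w in unigrams)
```

In this example the probabilities for context "a" still sum to one, because the backoff lookup reproduces the interpolated model for seen and unseen bigrams alike.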

Type: Application

Filed: March 26, 2004

Publication date: September 29, 2005

Applicant: Microsoft Corporation

Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero

Method of iterative noise estimation in a recursive framework

Patent number: 6944590

Abstract: A method and apparatus estimate additive noise in a noisy signal using an iterative technique within a recursive framework. In particular, the noisy signal is divided into frames and the noise in each frame is determined based on the noise in another frame and the noise determined in a previous iteration for the current frame. In one particular embodiment, the noise found in a previous iteration for a frame is used to define an expansion point for a Taylor series approximation that is used to estimate the noise in the current frame.

Type: Grant

Filed: April 5, 2002

Date of Patent: September 13, 2005

Assignee: Microsoft Corporation

Inventors: Li Deng, James G. Droppo, Alejandro Acero

Method and apparatus for constructing a speech filter using estimates of clean speech and noise

Publication number: 20050182624

Abstract: A method and apparatus identify a clean speech signal from a noisy speech signal. To do this, a clean speech value and a noise value are estimated from the noisy speech signal. The clean speech value and the noise value are then used to define a gain on a filter. The noisy speech signal is applied to the filter to produce the clean speech signal. Under some embodiments, the noise value and the clean speech value are used in both the numerator and the denominator of the filter gain, with the numerator being guaranteed to be positive.

Type: Application

Filed: February 16, 2004

Publication date: August 18, 2005

Applicant: Microsoft Corporation

Inventors: Jian Wu, James Droppo, Li Deng, Alejandro Acero

Automatic speech recognition learning using user corrections

Publication number: 20050159949

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

Type: Application

Filed: January 20, 2004

Publication date: July 21, 2005

Applicant: Microsoft Corporation

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

Annotating programs for automatic summary generation

Publication number: 20050159956

Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.
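
The final step, turning per-segment probabilities into a summary, might look like the sketch below. The greedy selection rule, the duration budget, and the tuple layout of the meta data are assumptions; the abstract says only that the meta data can be used to generate a summary.

```python
def summarize(segments, max_duration):
    """segments: list of (start, end, p_exciting) tuples, times in seconds.

    Greedily keeps the most exciting segments that fit in max_duration,
    then returns them in playback order.
    """
    chosen, total = [], 0.0
    for start, end, p in sorted(segments, key=lambda s: s[2], reverse=True):
        if total + (end - start) <= max_duration:
            chosen.append((start, end, p))
            total += end - start
    return sorted(chosen)


meta_data = [(0, 10, 0.2), (10, 20, 0.9), (20, 30, 0.6), (30, 40, 0.8)]
summary = summarize(meta_data, max_duration=20)
```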

Type: Application

Filed: March 4, 2005

Publication date: July 21, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Anoop Gupta, Alejandro Acero

Annotating programs for automatic summary generations

Publication number: 20050160457

Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.

Type: Application

Filed: March 15, 2005

Publication date: July 21, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Anoop Gupta, Alejandro Acero

Method of noise reduction using correction and scaling vectors with partitioning of the acoustic space in the domain of noisy speech

Publication number: 20050149325

Abstract: A method and apparatus are provided for reducing noise in a training signal and/or test signal. The noise reduction technique uses a stereo signal formed of two channel signals, each channel containing the same pattern signal. One of the channel signals is “clean” and the other includes additive noise. Using feature vectors from these channel signals, a collection of noise correction and scaling vectors is determined. When a feature vector of a noisy pattern signal is later received, it is multiplied by the best scaling vector for that feature vector and the best correction vector is added to the product to produce a noise-reduced feature vector. Under one embodiment, the best scaling and correction vectors are identified by choosing an optimal mixture component for the noisy feature vector. The optimal mixture component is selected based on a distribution of noisy channel feature vectors associated with each mixture component.
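
The run-time step (pick a mixture component, scale the noisy vector, add the correction) can be sketched as below. The sketch assumes diagonal Gaussian components over noisy feature vectors and an element-wise scaling; the component parameters are made-up values, and training them from stereo data is not shown.

```python
import math


def log_gaussian(y, mean, var):
    """Log density of a diagonal Gaussian at vector y."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (yi - m) ** 2 / v)
               for yi, m, v in zip(y, mean, var))


def denoise(y, components):
    """Apply the scaling and correction vectors of the best component."""
    best = max(components,
               key=lambda c: math.log(c["weight"])
               + log_gaussian(y, c["mean"], c["var"]))
    return [s * yi + r for s, yi, r in zip(best["scale"], y, best["correction"])]


# Hypothetical two-component model of the noisy acoustic space.
components = [
    {"weight": 0.5, "mean": [1.0, 2.0], "var": [1.0, 1.0],
     "scale": [0.5, 0.5], "correction": [0.25, 0.5]},
    {"weight": 0.5, "mean": [10.0, 10.0], "var": [1.0, 1.0],
     "scale": [0.9, 0.9], "correction": [-1.0, -1.0]},
]
cleaned = denoise([1.0, 2.0], components)
```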

Type: Application

Filed: February 16, 2005

Publication date: July 7, 2005

Applicant: Microsoft Corporation

Inventors: Li Deng, Xuedong Huang, Alejandro Acero

Method for entering text

Publication number: 20050149328

Abstract: In a method of entering text into a device, a first character input is provided that is indicative of a first character of a text entry. Next, a vocalization of the text entry is captured. A probable word candidate is then identified for a first word of the vocalization based upon the first character input and an analysis of the vocalization. Finally, the probable word candidate is displayed for a user.
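
A simple way to combine the two inputs is to restrict the recognizer's scored hypotheses to words that begin with the typed character. The sketch assumes the analysis of the vocalization has already produced a list of (word, score) pairs; that list and the function name are hypothetical.

```python
def probable_word(first_char, scored_candidates):
    """Return the best-scoring candidate that starts with first_char, or None."""
    matching = [(word, score) for word, score in scored_candidates
                if word.lower().startswith(first_char.lower())]
    if not matching:
        return None
    return max(matching, key=lambda pair: pair[1])[0]


# Hypothetical recognizer output for a spoken word.
candidates = [("right", 0.5), ("write", 0.4), ("white", 0.1)]
shown = probable_word("w", candidates)
```

Here the typed character rules out the acoustically strongest hypothesis and the next-best match is displayed instead.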

Type: Application

Filed: December 30, 2003

Publication date: July 7, 2005

Applicant: Microsoft Corporation

Inventors: Xuedong Huang, Alejandro Acero, Kuansan Wang, Milind Mahajan

Method and apparatus for continuous valued vocal tract resonance tracking using piecewise linear approximations

Publication number: 20050114134

Abstract: A method and apparatus track vocal tract resonance components, including both frequencies and bandwidths, in a speech signal. The components are tracked by defining a state equation that is linear with respect to a past vocal tract resonance vector and that predicts a current vocal tract resonance vector. An observation equation is also defined that is linear with respect to a current vocal tract resonance vector and that predicts at least one component of an observation vector. The state equation, the observation equation, and a sequence of observation vectors are used to identify a sequence of vocal tract resonance vectors using a Kalman filter algorithm. Under one embodiment, the observation equation is defined based on a piecewise linear approximation to a non-linear function. The parameters of the linear approximation are selected based on pre-defined regions, which are determined from a crude estimate of a vocal tract resonance vector.
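
The role of the linear state and observation equations can be seen in a scalar Kalman filter sketch. It tracks a single value with fixed coefficients `a` and `h`, whereas the abstract describes vector-valued resonances and observation parameters chosen per region by a piecewise linear approximation; that region selection is not modeled here, and the noise variances are arbitrary.

```python
def kalman_track(observations, a=1.0, q=0.01, h=1.0, r=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter.

    State equation:       x_t = a * x_{t-1} + noise with variance q
    Observation equation: o_t = h * x_t + noise with variance r
    """
    x, p = x0, p0
    estimates = []
    for o in observations:
        # Predict the current state from the previous one.
        x_pred = a * x
        p_pred = a * p * a + q
        # Update the prediction with the new observation.
        gain = p_pred * h / (h * p_pred * h + r)
        x = x_pred + gain * (o - h * x_pred)
        p = (1 - gain * h) * p_pred
        estimates.append(x)
    return estimates


track = kalman_track([5.0] * 50)
```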

Type: Application

Filed: November 26, 2003

Publication date: May 26, 2005

Applicant: Microsoft Corporation

Inventors: Li Deng, Hagai Attias, Alejandro Acero, Leo Lee

Method and apparatus for multi-sensory speech enhancement

Publication number: 20050114124

Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conduction microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.

Type: Application

Filed: November 26, 2003

Publication date: May 26, 2005

Applicant: Microsoft Corporation

Inventors: Zicheng Liu, Michael Sinclair, Alejandro Acero, Xuedong Huang, James Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng

Sound source separation using convolutional mixing and a priori sound source knowledge

Publication number: 20050091042

Abstract: Sound source separation, without permutation, using convolutional mixing independent component analysis based on a priori knowledge of the target sound source is disclosed. The target sound source can be a human speaker. The reconstruction filters used in the sound source separation take into account the a priori knowledge of the target sound source, such as an estimate of the spectra of the target sound source. The filters may be generally constructed based on a speech recognition system. Matching the words of the dictionary of the speech recognition system to a reconstructed signal indicates whether proper separation has occurred. More specifically, the filters may be constructed based on a vector quantization codebook of vectors representing typical sound source patterns. Matching the vectors of the codebook to a reconstructed signal indicates whether proper separation has occurred. The vectors may be linear prediction vectors, among others.

Type: Application

Filed: November 18, 2004

Publication date: April 28, 2005

Applicant: Microsoft Corporation

Inventors: Alejandro Acero, Steven Altschuler, Lani Wu
