Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20060018541
    Abstract: A method and apparatus are provided for adapting an exponential probability model. In a first stage, a general-purpose background model is built from background data by determining a set of model parameters for the probability model based on a set of background data. The background model parameters are then used to define a prior model for the parameters of an adapted probability model that is more specific to an adaptation data set of interest. The adaptation data set is generally of much smaller size than the background data set. A second set of model parameters is then determined for the adapted probability model based on the set of adaptation data and the prior model.
    Type: Application
    Filed: October 29, 2004
    Publication date: January 26, 2006
    Applicant: Microsoft Corporation
    Inventors: Ciprian Chelba, Alejandro Acero
  • Patent number: 6990447
    Abstract: A probability distribution for speech model parameters, such as auto-regression parameters, is used to identify a distribution of denoised values from a noisy signal. Under one embodiment, the probability distributions of the speech model parameters and the denoised values are adjusted to improve a variational inference so that the variational inference better approximates the joint probability of the speech model parameters and the denoised values given a noisy signal. In some embodiments, this improvement is performed during an expectation step in an expectation-maximization algorithm. The statistical model can also be used to identify an average spectrum for the clean signal and this average spectrum may be provided to a speech recognizer instead of the estimate of the clean signal.
    Type: Grant
    Filed: November 15, 2001
    Date of Patent: January 24, 2006
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, John Carlton Platt, Li Deng, Alejandro Acero
  • Patent number: 6985858
    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. The method is based on variational inference techniques. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Further aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors. Additional aspects of the invention include using a variance for the noisy signal feature vector conditioned on fixed values of noise, channel transfer function, and clean speech, when identifying the clean signal feature vector.
    Type: Grant
    Filed: March 20, 2001
    Date of Patent: January 10, 2006
    Assignee: Microsoft Corporation
    Inventors: Brendan J. Frey, Alejandro Acero, Li Deng
  • Publication number: 20050273325
    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. Aspects of the invention use mixtures of distributions of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.
    Type: Application
    Filed: July 20, 2005
    Publication date: December 8, 2005
    Applicant: Microsoft Corporation
    Inventors: Brendan Frey, Alejandro Acero, Li Deng
  • Publication number: 20050259558
    Abstract: A method and apparatus are provided for reducing noise in a signal. Under one aspect of the invention, a correction vector is selected based on a noisy feature vector that represents a noisy signal. The selected correction vector incorporates dynamic aspects of pattern signals. The selected correction vector is then added to the noisy feature vector to produce a cleaned feature vector. In other aspects of the invention, a noise value is produced from an estimate of the noise in a noisy signal. The noise value is subtracted from a value representing a portion of the noisy signal to produce a noise-normalized value. The noise-normalized value is used to select a correction value that is added to the noise-normalized value to produce a cleaned noise-normalized value. The noise value is then added to the cleaned noise-normalized value to produce a cleaned value representing a portion of a cleaned signal.
    Type: Application
    Filed: July 26, 2005
    Publication date: November 24, 2005
    Applicant: Microsoft Corporation
    Inventors: James Droppo, Li Deng, Alejandro Acero
  • Publication number: 20050256706
    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.
    Type: Application
    Filed: July 20, 2005
    Publication date: November 17, 2005
    Applicant: Microsoft Corporation
    Inventors: Brendan Frey, Alejandro Acero, Li Deng
  • Patent number: 6959276
    Abstract: A method and apparatus are provided for identifying a noise environment for a frame of an input signal based on at least one feature for that frame. Under one embodiment, the noise environment is identified by determining the probability of each of a set of possible noise environments. For some embodiments, the probabilities of the noise environments for past frames are included in the identification of an environment for a current frame. In one particular embodiment, a count is generated for each environment that indicates the number of past frames for which the environment was the most probable environment. The environment with the highest count is then selected as the environment for the current frame.
    Type: Grant
    Filed: September 27, 2001
    Date of Patent: October 25, 2005
    Assignee: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Li Deng
  • Publication number: 20050228670
    Abstract: A method of modeling a speech recognition system includes decoding a speech signal produced from a training text to produce a sequence of predicted speech units. The training text comprises a sequence of actual speech units that is used with the sequence of predicted speech units to form a confusion model. In further embodiments, the confusion model is used to decode a text to identify an error rate that would be expected if the speech recognition system decoded speech based on the text.
    Type: Application
    Filed: June 6, 2005
    Publication date: October 13, 2005
    Applicant: Microsoft Corporation
    Inventors: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela Gunawardana, Ciprian Chelba
  • Publication number: 20050228641
    Abstract: A method and apparatus are provided for adapting a language model. The method and apparatus provide supervised class-based adaptation of the language model utilizing in-domain semantic information.
    Type: Application
    Filed: March 31, 2004
    Publication date: October 13, 2005
    Applicant: Microsoft Corporation
    Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero, Yik-Cheung Tam
  • Publication number: 20050216265
    Abstract: A method and apparatus are provided for storing parameters of a deleted interpolation language model as parameters of a backoff language model. In particular, the parameters of the deleted interpolation language model are stored in the standard ARPA format. Under one embodiment, the deleted interpolation language model parameters are formed using fractional counts.
    Type: Application
    Filed: March 26, 2004
    Publication date: September 29, 2005
    Applicant: Microsoft Corporation
    Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero
  • Patent number: 6944590
    Abstract: A method and apparatus estimate additive noise in a noisy signal using an iterative technique within a recursive framework. In particular, the noisy signal is divided into frames and the noise in each frame is determined based on the noise in another frame and the noise determined in a previous iteration for the current frame. In one particular embodiment, the noise found in a previous iteration for a frame is used to define an expansion point for a Taylor series approximation that is used to estimate the noise in the current frame.
    Type: Grant
    Filed: April 5, 2002
    Date of Patent: September 13, 2005
    Assignee: Microsoft Corporation
    Inventors: Li Deng, James G. Droppo, Alejandro Acero
  • Publication number: 20050182624
    Abstract: A method and apparatus identify a clean speech signal from a noisy speech signal. To do this, a clean speech value and a noise value are estimated from the noisy speech signal. The clean speech value and the noise value are then used to define a gain on a filter. The noisy speech signal is applied to the filter to produce the clean speech signal. Under some embodiments, the noise value and the clean speech value are used in both the numerator and the denominator of the filter gain, with the numerator being guaranteed to be positive.
    Type: Application
    Filed: February 16, 2004
    Publication date: August 18, 2005
    Applicant: Microsoft Corporation
    Inventors: Jian Wu, James Droppo, Li Deng, Alejandro Acero
  • Publication number: 20050159949
    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.
    Type: Application
    Filed: January 20, 2004
    Publication date: July 21, 2005
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
  • Publication number: 20050159956
    Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.
    Type: Application
    Filed: March 4, 2005
    Publication date: July 21, 2005
    Applicant: Microsoft Corporation
    Inventors: Yong Rui, Anoop Gupta, Alejandro Acero
  • Publication number: 20050160457
    Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.
    Type: Application
    Filed: March 15, 2005
    Publication date: July 21, 2005
    Applicant: Microsoft Corporation
    Inventors: Yong Rui, Anoop Gupta, Alejandro Acero
  • Publication number: 20050149325
    Abstract: A method and apparatus are provided for reducing noise in a training signal and/or test signal. The noise reduction technique uses a stereo signal formed of two channel signals, each channel containing the same pattern signal. One of the channel signals is “clean” and the other includes additive noise. Using feature vectors from these channel signals, a collection of noise correction and scaling vectors is determined. When a feature vector of a noisy pattern signal is later received, it is multiplied by the best scaling vector for that feature vector and the best correction vector is added to the product to produce a noise reduced feature vector. Under one embodiment, the best scaling and correction vectors are identified by choosing an optimal mixture component for the noisy feature vector. The optimal mixture component is selected based on a distribution of noisy channel feature vectors associated with each mixture component.
    Type: Application
    Filed: February 16, 2005
    Publication date: July 7, 2005
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Xuedong Huang, Alejandro Acero
  • Publication number: 20050149328
    Abstract: In a method of entering text into a device, a first character input is provided that is indicative of a first character of a text entry. Next, a vocalization of the text entry is captured. A probable word candidate is then identified for a first word of the vocalization based upon the first character input and an analysis of the vocalization. Finally, the probable word candidate is displayed for a user.
    Type: Application
    Filed: December 30, 2003
    Publication date: July 7, 2005
    Applicant: Microsoft Corporation
    Inventors: Xuedong Huang, Alejandro Acero, Kuansan Wang, Milind Mahajan
  • Publication number: 20050114134
    Abstract: A method and apparatus track vocal tract resonance components, including both frequencies and bandwidths, in a speech signal. The components are tracked by defining a state equation that is linear with respect to a past vocal tract resonance vector and that predicts a current vocal tract resonance vector. An observation equation is also defined that is linear with respect to a current vocal tract resonance vector and that predicts at least one component of an observation vector. The state equation, the observation equation, and a sequence of observation vectors are used to identify a sequence of vocal tract resonance vectors using a Kalman filter algorithm. Under one embodiment, the observation equation is defined based on a piecewise linear approximation to a non-linear function. The parameters of the linear approximation are selected based on pre-defined regions, which are determined from a crude estimate of a vocal tract resonance vector.
    Type: Application
    Filed: November 26, 2003
    Publication date: May 26, 2005
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Hagai Attias, Alejandro Acero, Leo Lee
  • Publication number: 20050114124
    Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses the alternative sensor signal either alone or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conduction microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.
    Type: Application
    Filed: November 26, 2003
    Publication date: May 26, 2005
    Applicant: Microsoft Corporation
    Inventors: Zicheng Liu, Michael Sinclair, Alejandro Acero, Xuedong Huang, James Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
  • Publication number: 20050091042
    Abstract: Sound source separation, without permutation, using convolutional mixing independent component analysis based on a priori knowledge of the target sound source is disclosed. The target sound source can be a human speaker. The reconstruction filters used in the sound source separation take into account the a priori knowledge of the target sound source, such as an estimate of the spectra of the target sound source. The filters may be generally constructed based on a speech recognition system. Matching the words of the dictionary of the speech recognition system to a reconstructed signal indicates whether proper separation has occurred. More specifically, the filters may be constructed based on a vector quantization codebook of vectors representing typical sound source patterns. Matching the vectors of the codebook to a reconstructed signal indicates whether proper separation has occurred. The vectors may be linear prediction vectors, among others.
    Type: Application
    Filed: November 18, 2004
    Publication date: April 28, 2005
    Applicant: Microsoft Corporation
    Inventors: Alejandro Acero, Steven Altschuler, Lani Wu
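
Illustrative sketch (not taken from any of the patents): the abstract of patent 6959276 above describes choosing a noise environment for the current frame by counting how often each environment was the most probable one over past frames. A minimal Python sketch of that voting idea might look as follows. The function name `select_environments`, the `history` window length, the choice to include the current frame in the count, and the dictionary input format are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter, deque


def select_environments(frame_probs, history=5):
    """Pick a noise environment per frame by majority vote over recent frames.

    frame_probs: list of dicts, one per frame, mapping a (hypothetical)
                 environment name to its probability for that frame.
    history:     assumed number of most recent frames that take part in the vote.
    """
    recent_winners = deque(maxlen=history)  # most probable environment per recent frame
    chosen = []
    for probs in frame_probs:
        # Environment that is most probable for this frame taken alone.
        recent_winners.append(max(probs, key=probs.get))
        # Count how many recent frames each environment "won".
        counts = Counter(recent_winners)
        # Select the environment with the highest count.
        chosen.append(counts.most_common(1)[0][0])
    return chosen


# Usage: a one-frame spike towards "office" does not change the decision.
frames = [
    {"car": 0.8, "office": 0.2},
    {"car": 0.7, "office": 0.3},
    {"car": 0.4, "office": 0.6},
    {"car": 0.9, "office": 0.1},
    {"car": 0.8, "office": 0.2},
]
result = select_environments(frames, history=3)
```

The vote smooths the per-frame decisions, so a single outlier frame is ignored while a sustained change of environment is picked up after a few frames.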