Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20080298562
    Abstract: A voice interaction system is configured to analyze an utterance and identify inherent attributes that are indicative of a demographic characteristic of the system user that spoke the utterance. The system then selects and presents a personalized response to the user, the response being selected based at least in part on the identified demographic characteristic. In one embodiment, the demographic characteristic is one or more of the caller's age, gender, ethnicity, education level, emotional state, health status and geographic group. In another embodiment, the selection of the response is further based on consideration of corroborative caller data.
    Type: Application
    Filed: June 4, 2007
    Publication date: December 4, 2008
    Applicant: Microsoft Corporation
    Inventors: Yun-Cheng Ju, Alejandro Acero, Neal Alan Berstein, Geoffrey Gerson Zweig
  • Patent number: 7460992
    Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. In the meantime, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by the amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.
    Type: Grant
    Filed: May 16, 2006
    Date of Patent: December 2, 2008
    Assignee: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Li Deng
  • Publication number: 20080288219
    Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability. The present beam forming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction and applies a time-varying, gain based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction resulting in minimal artifacts and musical noise.
    Type: Application
    Filed: May 17, 2007
    Publication date: November 20, 2008
    Applicant: Microsoft Corporation
    Inventors: Ivan Tashev, Alejandro Acero
  • Patent number: 7454338
    Abstract: A method and apparatus are provided that generate values for a first set of dimensions of a feature vector from a speech signal. The values of the first set of dimensions are used to estimate values for a second set of dimensions of the feature vector to form an extended feature vector. The extended feature vector is then used to train an acoustic model.
    Type: Grant
    Filed: February 8, 2005
    Date of Patent: November 18, 2008
    Assignee: Microsoft Corporation
    Inventors: Michael L. Seltzer, Alejandro Acero
  • Publication number: 20080281591
    Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. In the meantime, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by the amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.
    Type: Application
    Filed: July 25, 2008
    Publication date: November 13, 2008
    Applicant: MICROSOFT CORPORATION
    Inventors: James G. Droppo, Alejandro Acero, Li Deng
  • Publication number: 20080281827
    Abstract: A structured database is used for webpage information extraction, and in particular, to obtain training data from the webpage for training a statistical model. The structured database has a plurality of entries, wherein each entry comprises a plurality of fields. One of the fields comprises a URL (uniform resource locater), while another field comprises information at least similar to other information to be located in a webpage associated with the URL. For at least some of the entries in the structured database, a web page associated with the URL is retrieved. The webpage is analyzed and if information is found in the webpage similar to the information in the structured database, the webpage is identified as being suitable to be considered as a training sample.
    Type: Application
    Filed: May 10, 2007
    Publication date: November 13, 2008
    Applicant: Microsoft Corporation
    Inventors: Ye-Yi Wang, Alejandro Acero, Mandar A. Rahurkar
  • Publication number: 20080281806
    Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
    Type: Application
    Filed: May 10, 2007
    Publication date: November 13, 2008
    Applicant: Microsoft Corporation
    Inventors: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig
  • Patent number: 7451083
    Abstract: A method and computer-readable medium are provided for identifying clean signal feature vectors from noisy signal feature vectors. One aspect of the invention includes using an iterative approach to identify the clean signal feature vector. Another aspect of the invention includes using the variance of a set of noise feature vectors and/or channel distortion feature vectors when identifying the clean signal feature vectors.
    Type: Grant
    Filed: July 20, 2005
    Date of Patent: November 11, 2008
    Assignee: Microsoft Corporation
    Inventors: Brendan J. Frey, Alejandro Acero, Li Deng
  • Patent number: 7447630
    Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conductive microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.
    Type: Grant
    Filed: November 26, 2003
    Date of Patent: November 4, 2008
    Assignee: Microsoft Corporation
    Inventors: Zicheng Liu, Michael J. Sinclair, Alejandro Acero, Xuedong D. Huang, James G. Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
  • Publication number: 20080262995
    Abstract: A method of communicating information about a product evaluation between a system having a data store and a wireless client device is discussed. The method includes receiving a signal representative of an audible indication from the client device via a wireless communication link identifying the product about which evaluation information is to be communicated. The method further includes comparing an indication of the signal to data in the data store in response to match the indication with a portion of the data and communicating evaluation information between the wireless client device and the system.
    Type: Application
    Filed: April 19, 2007
    Publication date: October 23, 2008
    Applicant: Microsoft Corporation
    Inventors: Geoffrey Gerson Zweig, Yun-Cheng Ju, Patrick Nguyen, Alejandro Acero
  • Publication number: 20080232607
    Abstract: A novel adaptive beamforming technique with enhanced noise suppression capability. The technique incorporates the sound-source presence probability into an adaptive blocking matrix. In one embodiment the sound-source presence probability is estimated based on the instantaneous direction of arrival of the input signals and voice activity detection. The technique guarantees robustness to steering vector errors without imposing ad hoc constraints on the adaptive filter coefficients. It can provide good suppression performance for both directional interference signals as well as isotropic ambient noise.
    Type: Application
    Filed: March 22, 2007
    Publication date: September 25, 2008
    Applicant: Microsoft Corporation
    Inventors: Ivan Tashev, Alejandro Acero, Byung-Jun Yoon
  • Patent number: 7424423
    Abstract: A method of tracking formants defines a formant search space comprising sets of formants to be searched. Formants are identified for a first frame in the speech utterance by searching the entirety of the formant search space using the codebook, and for the remaining frames by searching the same space using both the codebook and the continuity constraint across adjacent frames. Under one embodiment, the formants are identified by mapping sets of formants into feature vectors and applying the feature vectors to a model. Formants are also identified by applying dynamic programming to search for the best sequence that optimally satisfies the continuity constraint required by the model.
    Type: Grant
    Filed: April 1, 2003
    Date of Patent: September 9, 2008
    Assignee: Microsoft Corporation
    Inventors: Issam Bazzi, Li Deng, Alejandro Acero
  • Publication number: 20080215311
    Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
    Type: Application
    Filed: April 15, 2008
    Publication date: September 4, 2008
    Applicant: Microsoft Corporation
    Inventors: Ciprian Chelba, Alejandro Acero, Milind Mahajan
  • Publication number: 20080215321
    Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying speech signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.
    Type: Application
    Filed: April 19, 2007
    Publication date: September 4, 2008
    Applicant: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Luis Buera
  • Patent number: 7418383
    Abstract: A unified, nonlinear, non-stationary, stochastic model is disclosed for estimating and removing effects of background noise on speech cepstra. Generally stated, the model is a union of dynamic system equations for speech and noise, and a model describing how speech and noise are mixed. Embodiments also pertain to related methods for enhancement.
    Type: Grant
    Filed: September 3, 2004
    Date of Patent: August 26, 2008
    Assignee: Microsoft Corporation
    Inventors: James Droppo, Alejandro Acero
  • Publication number: 20080201139
    Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Scores are calculated for a correct class and competitive classes, respectively, for each token given the initial acoustic model. Also, a sample-adaptive window bandwidth is calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves a decision boundary such that token-to-boundary distances for correct tokens that are near the decision boundary are maximized. The margin can either be a fixed margin or can vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until an empirical convergence is met.
    Type: Application
    Filed: February 20, 2007
    Publication date: August 21, 2008
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He
  • Patent number: 7409346
    Abstract: A structured generative model of a speech coarticulation and reduction is described with a novel two-stage implementation. At the first stage, the dynamics of formants or vocal tract resonance (VTR) are generated using prior information of resonance targets in the phone sequence. Bi-directional temporal filtering with finite impulse response (FIR) is applied to the segmental target sequence as the FIR filter's input. At the second stage the dynamics of speech cepstra are predicted analytically based on the FIR filtered VTR targets. The combined system of these two stages thus generates correlated and causally related VTR and cepstral dynamics where phonetic reduction is represented explicitly in the hidden resonance space and implicitly in the observed cepstral space. The combined system also gives the acoustic observation probability given a phone sequence. Using this probability, different phone sequences can be compared and ranked in terms of their respective probability values.
    Type: Grant
    Filed: March 1, 2005
    Date of Patent: August 5, 2008
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Dong Yu, Li Deng
  • Patent number: 7406416
    Abstract: A method and apparatus are provided for storing parameters of a deleted interpolation language model as parameters of a backoff language model. In particular, the parameters of the deleted interpolation language model are stored in the standard ARPA format. Under one embodiment, the deleted interpolation language model parameters are formed using fractional counts.
    Type: Grant
    Filed: March 26, 2004
    Date of Patent: July 29, 2008
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero
  • Publication number: 20080177547
    Abstract: A novel system integrates speech recognition and semantic classification, so that acoustic scores in a speech recognizer that accepts spoken utterances may be taken into account when training both language models and semantic classification models. For example, a joint association score may be defined that is indicative of a correspondence of a semantic class and a word sequence for an acoustic signal. The joint association score may incorporate parameters such as weighting parameters for signal-to-class modeling of the acoustic signal, language model parameters and scores, and acoustic model parameters and scores. The parameters may be revised to raise the joint association score of a target word sequence with a target semantic class relative to the joint association score of a competitor word sequence with the target semantic class. The parameters may be designed so that the semantic classification errors in the training data are minimized.
    Type: Application
    Filed: January 19, 2007
    Publication date: July 24, 2008
    Applicant: Microsoft Corporation
    Inventors: Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alejandro Acero
  • Publication number: 20080177545
    Abstract: A novel system for automatic reading tutoring provides effective error detection and reduced false alarms combined with low processing time burdens and response times short enough to maintain a natural, engaging flow of interaction. According to one illustrative embodiment, an automatic reading tutoring method includes displaying a text output and receiving an acoustic input. The acoustic input is modeled with a domain-specific target language model specific to the text output, and with a general-domain garbage language model, both of which may be efficiently constructed as context-free grammars. The domain-specific target language model may be built dynamically or “on-the-fly” based on the currently displayed text (e.g. the story to be read by the user), while the general-domain garbage language model is shared among all different text outputs. User-perceptible tutoring feedback is provided based on the target language model and the garbage language model.
    Type: Application
    Filed: January 19, 2007
    Publication date: July 24, 2008
    Applicant: Microsoft Corporation
    Inventors: Xiaolong Li, Yun-Cheng Ju, Li Deng, Alejandro Acero