Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8010356
    Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
    Type: Grant
    Filed: February 17, 2006
    Date of Patent: August 30, 2011
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
  • Patent number: 8005238
    Abstract: A novel adaptive beamforming technique with enhanced noise suppression capability. The technique incorporates the sound-source presence probability into an adaptive blocking matrix. In one embodiment the sound-source presence probability is estimated based on the instantaneous direction of arrival of the input signals and voice activity detection. The technique guarantees robustness to steering vector errors without imposing ad hoc constraints on the adaptive filter coefficients. It can provide good suppression performance for both directional interference signals and isotropic ambient noise.
    Type: Grant
    Filed: March 22, 2007
    Date of Patent: August 23, 2011
    Assignee: Microsoft Corporation
    Inventors: Ivan Tashev, Alejandro Acero, Byung-Jun Yoon
  • Patent number: 8005237
    Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability. The present beamforming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction and applies a time-varying, gain-based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction, resulting in minimal artifacts and musical noise.
    Type: Grant
    Filed: May 17, 2007
    Date of Patent: August 23, 2011
    Assignee: Microsoft Corp.
    Inventors: Ivan Tashev, Alejandro Acero
  • Patent number: 7991615
    Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.
    Type: Grant
    Filed: December 7, 2007
    Date of Patent: August 2, 2011
    Assignee: Microsoft Corporation
    Inventors: Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
  • Patent number: 7983901
    Abstract: The present invention uses a natural language understanding system that is currently being trained to assist in annotating training data for training that natural language understanding system. Unannotated training data is provided to the system and the system proposes annotations to the training data. The user is offered an opportunity to confirm or correct the proposed annotations, and the system is trained with the corrected or verified annotations.
    Type: Grant
    Filed: May 6, 2009
    Date of Patent: July 19, 2011
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Ye-Yi Wang, Leon Wong
  • Publication number: 20110161078
    Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying noise signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech-related systems.
    Type: Application
    Filed: March 7, 2011
    Publication date: June 30, 2011
    Applicant: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Luis Buera
  • Publication number: 20110137639
    Abstract: A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.
    Type: Application
    Filed: February 15, 2011
    Publication date: June 9, 2011
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju
  • Publication number: 20110131046
    Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.
    Type: Application
    Filed: November 30, 2009
    Publication date: June 2, 2011
    Applicant: Microsoft Corporation
    Inventors: Geoffrey Gerson Zweig, Patrick An-Phu Nguyen, James Garnet Droppo, III, Alejandro Acero
  • Patent number: 7949526
    Abstract: A voice interaction system is configured to analyze an utterance and identify inherent attributes that are indicative of a demographic characteristic of the system user that spoke the utterance. The system then selects and presents a personalized response to the user, the response being selected based at least in part on the identified demographic characteristic. In one embodiment, the demographic characteristic is one or more of the caller's age, gender, ethnicity, education level, emotional state, health status and geographic group. In another embodiment, the selection of the response is further based on consideration of corroborative caller data.
    Type: Grant
    Filed: June 4, 2007
    Date of Patent: May 24, 2011
    Assignee: Microsoft Corporation
    Inventors: Yun-Cheng Ju, Alejandro Acero, Neal Bernstein, Geoffrey Zweig
  • Patent number: 7941316
    Abstract: A method of entering information into a mobile device includes receiving a multi-word speech input from a user, performing speech recognition on the speech input to obtain a multi-word speech recognition result, and sequentially displaying, in a display, words in the speech recognition result for user confirmation or correction, by adding one word at a time to the display. A next word is only displayed after user confirmation or correction has been received for a previously displayed word that is immediately preceding the next word in the speech recognition result. The method also includes calculating a hypothesis lattice indicative of a plurality of speech recognition hypotheses based on the speech input and, prior to finishing calculating the hypothesis lattice and while continuing to calculate the hypothesis lattice, calculating a preliminary hypothesis lattice indicative of only partial speech recognition hypotheses based on the speech input and outputting the preliminary hypothesis lattice.
    Type: Grant
    Filed: October 28, 2005
    Date of Patent: May 10, 2011
    Assignee: Microsoft Corporation
    Inventors: Milind V. Mahajan, Alejandro Acero, Bo-June Hsu
  • Patent number: 7930178
    Abstract: A frame of a speech signal is converted into the spectral domain to identify a plurality of frequency components and an energy value for the frame is determined. The plurality of frequency components is divided by the energy value for the frame to form energy-normalized frequency components. A model is then constructed from the energy-normalized frequency components and can be used for speech recognition and speech enhancement.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: April 19, 2011
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Alejandro Acero, Amarnag Subramanya, Zicheng Liu
  • Patent number: 7925602
    Abstract: Described is a technology by which a maximum entropy model used for classification is trained with a significantly smaller amount of training data than is normally used in training other maximum entropy models, yet provides similar accuracy to the others. The maximum entropy model is initially parameterized with parameter values determined from weights obtained by training a vector space model or an n-gram model. The weights may be scaled into the initial parameter values by determining a scaling factor. Gaussian mean values may also be determined, and used for regularization in training the maximum entropy model. Scaling may also be applied to the Gaussian mean values. After initial parameterization, training comprises using training data to iteratively adjust the initial parameters into adjusted parameters until convergence is determined.
    Type: Grant
    Filed: December 7, 2007
    Date of Patent: April 12, 2011
    Assignee: Microsoft Corporation
    Inventors: Ye-Yi Wang, Alejandro Acero
  • Patent number: 7925502
    Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying noise signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech-related systems.
    Type: Grant
    Filed: April 19, 2007
    Date of Patent: April 12, 2011
    Assignee: Microsoft Corporation
    Inventors: James G. Droppo, Alejandro Acero, Luis Buera
  • Patent number: 7912707
    Abstract: A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.
    Type: Grant
    Filed: December 19, 2006
    Date of Patent: March 22, 2011
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju
  • Patent number: 7885812
    Abstract: Parameters for a feature extractor and acoustic model of a speech recognition module are trained. An objective function is utilized to determine values for the feature extractor parameters and the acoustic model parameters.
    Type: Grant
    Filed: November 15, 2006
    Date of Patent: February 8, 2011
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, James G. Droppo, Milind V. Mahajan
  • Patent number: 7877256
    Abstract: A time-synchronous lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, hypotheses are represented as traces that include an indication of a current frame, previous frames and future frames. Each frame can include an associated linguistic unit such as a phone or units that are derived from a phone. Additionally, pruning strategies can be applied to speed up the search. Further, word-ending recombination methods are developed to speed up the computation. These methods can effectively deal with an exponentially increased search space.
    Type: Grant
    Filed: February 17, 2006
    Date of Patent: January 25, 2011
    Assignee: Microsoft Corporation
    Inventors: Xiaolong Li, Li Deng, Dong Yu, Alejandro Acero
  • Publication number: 20110015927
    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.
    Type: Application
    Filed: September 17, 2010
    Publication date: January 20, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
  • Patent number: 7865357
    Abstract: A method of forming a shareable filler model (shareable model for garbage words) from a word n-gram model is provided. The word n-gram model is converted into a probabilistic context free grammar (PCFG). The PCFG is modified into a substantially application-independent PCFG, which constitutes the shareable filler model.
    Type: Grant
    Filed: March 14, 2006
    Date of Patent: January 4, 2011
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Dong Yu, Ye-Yi Wang, Yun-Cheng Ju
  • Patent number: 7860707
    Abstract: A computer-implemented method is disclosed for improving the accuracy of a directory assistance system. The method includes constructing a prefix tree based on a collection of alphabetically organized words. The prefix tree is utilized as a basis for generating splitting rules for a compound word included in an index associated with the directory assistance system. A language model check and a pronunciation check are conducted in order to determine which of the generated splitting rules are most likely correct. The compound word is split into word components based on the most likely correct rule or rules. The word components are incorporated into a data set associated with the directory assistance system, such as into a recognition grammar and/or the index.
    Type: Grant
    Filed: December 13, 2006
    Date of Patent: December 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju
  • Patent number: 7860314
    Abstract: A method and apparatus are provided for adapting an exponential probability model. In a first stage, a general-purpose background model is built from background data by determining a set of model parameters for the probability model based on a set of background data. The background model parameters are then used to define a prior model for the parameters of an adapted probability model that is adapted and more specific to an adaptation data set of interest. The adaptation data set is generally of much smaller size than the background data set. A second set of model parameters is then determined for the adapted probability model based on the set of adaptation data and the prior model.
    Type: Grant
    Filed: October 29, 2004
    Date of Patent: December 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Ciprian I. Chelba, Alejandro Acero
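
Illustrative note on patent 7930178 (energy-normalized frequency components): the abstract in the listing above describes converting a frame to the spectral domain, determining an energy value for the frame, and dividing the frequency components by that energy. The following is only a rough sketch of that one step, not the patented method. It assumes the frequency components are DFT magnitudes and that the energy value is the sum of squared samples; the abstract specifies neither. The function names are hypothetical.

```python
import cmath


def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def energy_normalized_components(frame):
    """Return (normalized magnitudes, frame energy).

    Assumptions (not stated in the abstract): the frequency components
    are DFT magnitudes, and the frame energy is the sum of squared samples.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        raise ValueError("frame has zero energy; cannot normalize")
    magnitudes = [abs(c) for c in dft(frame)]
    # Divide every frequency component by the single energy value of the frame.
    return [m / energy for m in magnitudes], energy


if __name__ == "__main__":
    normalized, energy = energy_normalized_components([1.0, 0.0, -1.0, 0.0])
    print(energy, [round(v, 6) for v in normalized])
```

A model for recognition or enhancement would then be built from the normalized values; that modeling step is outside the scope of this sketch.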