Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Parameter learning in a hidden trajectory model

Patent number: 8010356

Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.

Type: Grant

Filed: February 17, 2006

Date of Patent: August 30, 2011

Assignee: Microsoft Corporation

Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
Robust adaptive beamforming with enhanced noise suppression

Patent number: 8005238

Abstract: A novel adaptive beamforming technique with enhanced noise suppression capability. The technique incorporates the sound-source presence probability into an adaptive blocking matrix. In one embodiment the sound-source presence probability is estimated based on the instantaneous direction of arrival of the input signals and voice activity detection. The technique guarantees robustness to steering vector errors without imposing ad hoc constraints on the adaptive filter coefficients. It can provide good suppression performance for both directional interference signals and isotropic ambient noise.

Type: Grant

Filed: March 22, 2007

Date of Patent: August 23, 2011

Assignee: Microsoft Corporation

Inventors: Ivan Tashev, Alejandro Acero, Byung-Jun Yoon
Sensor array beamformer post-processor

Patent number: 8005237

Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability. The present beamforming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction, and applies a time-varying, gain-based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction, resulting in minimal artifacts and musical noise.

Type: Grant

Filed: May 17, 2007

Date of Patent: August 23, 2011

Assignee: Microsoft Corp.

Inventors: Ivan Tashev, Alejandro Acero
Grapheme-to-phoneme conversion using acoustic data

Patent number: 7991615

Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.

Type: Grant

Filed: December 7, 2007

Date of Patent: August 2, 2011

Assignee: Microsoft Corporation

Inventors: Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
Computer-aided natural language annotation

Patent number: 7983901

Abstract: The present invention uses a natural language understanding system that is currently being trained to assist in annotating training data for training that natural language understanding system. Unannotated training data is provided to the system and the system proposes annotations to the training data. The user is offered an opportunity to confirm or correct the proposed annotations, and the system is trained with the corrected or verified annotations.

Type: Grant

Filed: May 6, 2009

Date of Patent: July 19, 2011

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Ye-Yi Wang, Leon Wong
PITCH MODEL FOR NOISE ESTIMATION

Publication number: 20110161078

Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying noise signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.

Type: Application

Filed: March 7, 2011

Publication date: June 30, 2011

Applicant: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Luis Buera
ADAPTING A LANGUAGE MODEL TO ACCOMMODATE INPUTS NOT FOUND IN A DIRECTORY ASSISTANCE LISTING

Publication number: 20110137639

Abstract: A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.

Type: Application

Filed: February 15, 2011

Publication date: June 9, 2011

Applicant: Microsoft Corporation

Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju
FEATURES FOR UTILIZATION IN SPEECH RECOGNITION

Publication number: 20110131046

Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.

Type: Application

Filed: November 30, 2009

Publication date: June 2, 2011

Applicant: Microsoft Corporation

Inventors: Geoffrey Gerson Zweig, Patrick An-Phu Nguyen, James Garnet Droppo, III, Alejandro Acero
Voice aware demographic personalization

Patent number: 7949526

Abstract: A voice interaction system is configured to analyze an utterance and identify inherent attributes that are indicative of a demographic characteristic of the system user that spoke the utterance. The system then selects and presents a personalized response to the user, the response being selected based at least in part on the identified demographic characteristic. In one embodiment, the demographic characteristic is one or more of the caller's age, gender, ethnicity, education level, emotional state, health status and geographic group. In another embodiment, the selection of the response is further based on consideration of corroborative caller data.

Type: Grant

Filed: June 4, 2007

Date of Patent: May 24, 2011

Assignee: Microsoft Corporation

Inventors: Yun-Cheng Ju, Alejandro Acero, Neal Bernstein, Geoffrey Zweig
Combined speech and alternate input modality to a mobile device

Patent number: 7941316

Abstract: A method of entering information into a mobile device includes receiving a multi-word speech input from a user, performing speech recognition on the speech input to obtain a multi-word speech recognition result, and sequentially displaying, in a display, words in the speech recognition result for user confirmation or correction, by adding one word at a time to the display. A next word is only displayed after user confirmation or correction has been received for a previously displayed word that is immediately preceding the next word in the speech recognition result. The method also includes calculating a hypothesis lattice indicative of a plurality of speech recognition hypotheses based on the speech input and, prior to finishing calculating the hypothesis lattice and while continuing to calculate the hypothesis lattice, calculating a preliminary hypothesis lattice indicative of only partial speech recognition hypotheses based on the speech input and outputting the preliminary hypothesis lattice.

Type: Grant

Filed: October 28, 2005

Date of Patent: May 10, 2011

Assignee: Microsoft Corporation

Inventors: Milind V. Mahajan, Alejandro Acero, Bo-June Hsu
Speech modeling and enhancement based on magnitude-normalized spectra

Patent number: 7930178

Abstract: A frame of a speech signal is converted into the spectral domain to identify a plurality of frequency components and an energy value for the frame is determined. The plurality of frequency components is divided by the energy value for the frame to form energy-normalized frequency components. A model is then constructed from the energy-normalized frequency components and can be used for speech recognition and speech enhancement.

Type: Grant

Filed: December 23, 2005

Date of Patent: April 19, 2011

Assignee: Microsoft Corporation

Inventors: Zhengyou Zhang, Alejandro Acero, Amarnag Subramanya, Zicheng Liu
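
The normalization step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the function names `dft_magnitudes` and `energy_normalize` are hypothetical, the frame energy is assumed to be the sum of squared samples (the abstract does not say how the energy value is computed), and a naive DFT stands in for whatever spectral transform is actually used.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Return the magnitudes of DFT bins 0..N/2 for a real-valued frame."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def energy_normalize(frame):
    """Divide each frequency component of the frame by the frame energy.

    Assumption: energy is the sum of squared samples.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        raise ValueError("silent frame: energy is zero")
    return [m / energy for m in dft_magnitudes(frame)]
```

A model would then be fitted to the normalized components of many frames; that modeling step is not shown here.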
Maximum entropy model classifier that uses Gaussian mean values

Patent number: 7925602

Abstract: Described is a technology by which a maximum entropy model used for classification is trained with a significantly lesser amount of training data than is normally used in training other maximum entropy models, yet provides similar accuracy to the others. The maximum entropy model is initially parameterized with parameter values determined from weights obtained by training a vector space model or an n-gram model. The weights may be scaled into the initial parameter values by determining a scaling factor. Gaussian mean values may also be determined, and used for regularization in training the maximum entropy model. Scaling may also be applied to the Gaussian mean values. After initial parameterization, training comprises using training data to iteratively adjust the initial parameters into adjusted parameters until convergence is determined.

Type: Grant

Filed: December 7, 2007

Date of Patent: April 12, 2011

Assignee: Microsoft Corporation

Inventors: Ye-Yi Wang, Alejandro Acero
Pitch model for noise estimation

Patent number: 7925502

Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying noise signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.

Type: Grant

Filed: April 19, 2007

Date of Patent: April 12, 2011

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Luis Buera
Adapting a language model to accommodate inputs not found in a directory assistance listing

Patent number: 7912707

Abstract: A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.

Type: Grant

Filed: December 19, 2006

Date of Patent: March 22, 2011

Assignee: Microsoft Corporation

Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju
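
One simple way to score how well a word distinguishes a listing from other listings is inverse document frequency. The sketch below uses that measure purely as a stand-in: the abstract does not name the calculation it uses, and the function name `word_importance` and the whitespace tokenization are assumptions made for illustration.

```python
import math


def word_importance(listings):
    """Map each word to log(N / number of listings that contain it).

    Words that appear in few listings score high; words shared by
    many listings score low. This is plain IDF, used as an
    illustrative assumption only.
    """
    n = len(listings)
    doc_freq = {}
    for listing in listings:
        # Count each word once per listing.
        for word in set(listing.lower().split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n / df) for word, df in doc_freq.items()}
```

Such scores could feed the training of a language model, for example by making rare, distinguishing words less likely to be treated as optional; how the patent combines them is not described in the abstract.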
Joint training of feature extraction and acoustic model parameters for speech recognition

Patent number: 7885812

Abstract: Parameters for a feature extractor and acoustic model of a speech recognition module are trained. An objective function is utilized to determine values for the feature extractor parameters and the acoustic model parameters.

Type: Grant

Filed: November 15, 2006

Date of Patent: February 8, 2011

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, James G. Droppo, Milind V. Mahajan
Time synchronous decoding for long-span hidden trajectory model

Patent number: 7877256

Abstract: A time-synchronous lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, hypotheses are represented as traces that include an indication of a current frame, previous frames and future frames. Each frame can include an associated linguistic unit such as a phone or units that are derived from a phone. Additionally, pruning strategies can be applied to speed up the search. Further, word-ending recombination methods are developed to speed up the computation. These methods can effectively deal with an exponentially increased search space.

Type: Grant

Filed: February 17, 2006

Date of Patent: January 25, 2011

Assignee: Microsoft Corporation

Inventors: Xiaolong Li, Li Deng, Dong Yu, Alejandro Acero
SYSTEM AND METHOD FOR EFFICIENT LASER PROCESSING OF A MOVING WEB-BASED MATERIAL

Publication number: 20110015927

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

Type: Application

Filed: September 17, 2010

Publication date: January 20, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero
Shareable filler model for grammar authoring

Patent number: 7865357

Abstract: A method of forming a shareable filler model (shareable model for garbage words) from a word n-gram model is provided. The word n-gram model is converted into a probabilistic context free grammar (PCFG). The PCFG is modified into a substantially application-independent PCFG, which constitutes the shareable filler model.

Type: Grant

Filed: March 14, 2006

Date of Patent: January 4, 2011

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Dong Yu, Ye-Yi Wang, Yun-Cheng Ju
Compound word splitting for directory assistance services

Patent number: 7860707

Abstract: A computer-implemented method is disclosed for improving the accuracy of a directory assistance system. The method includes constructing a prefix tree based on a collection of alphabetically organized words. The prefix tree is utilized as a basis for generating splitting rules for a compound word included in an index associated with the directory assistance system. A language model check and a pronunciation check are conducted in order to determine which of the generated splitting rules are most likely correct. The compound word is split into word components based on the most likely correct rule or rules. The word components are incorporated into a data set associated with the directory assistance system, such as into a recognition grammar and/or the index.

Type: Grant

Filed: December 13, 2006

Date of Patent: December 28, 2010

Assignee: Microsoft Corporation

Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju
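
The prefix-tree step can be sketched as follows. This is a minimal illustration under assumptions, not the patented method: the names `build_trie`, `candidate_splits` and `END` are hypothetical, the tree is a nested dictionary, and the language model check and pronunciation check that rank the candidates are left out.

```python
END = "<end>"  # multi-character marker, so it cannot collide with a letter key


def build_trie(words):
    """Build a prefix tree; the END key marks a complete word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def candidate_splits(compound, trie):
    """Return every way to cut `compound` into words present in the trie."""
    results = []

    def walk(start, parts):
        if start == len(compound):
            results.append(list(parts))
            return
        node = trie
        for end in range(start, len(compound)):
            node = node.get(compound[end])
            if node is None:
                return  # no known word continues with this letter
            if END in node:
                parts.append(compound[start:end + 1])
                walk(end + 1, parts)
                parts.pop()

    walk(0, [])
    return results
```

With a trie built from "air", "port" and "airport", `candidate_splits("airport", trie)` yields both the two-word split and the unsplit word; a later scoring step would choose between them.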
Adaptation of exponential models

Patent number: 7860314

Abstract: A method and apparatus are provided for adapting an exponential probability model. In a first stage, a general-purpose background model is built from background data by determining a set of model parameters for the probability model based on a set of background data. The background model parameters are then used to define a prior model for the parameters of an adapted probability model that is more specific to an adaptation data set of interest. The adaptation data set is generally of much smaller size than the background data set. A second set of model parameters is then determined for the adapted probability model based on the set of adaptation data and the prior model.

Type: Grant

Filed: October 29, 2004

Date of Patent: December 28, 2010

Assignee: Microsoft Corporation

Inventors: Ciprian I. Chelba, Alejandro Acero
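
The two-stage procedure can be sketched with the simplest exponential model, a binary logistic model. Everything specific here is an assumption for illustration and is not taken from the patent: the prior is taken to be Gaussian and centered on the background parameters, training is plain gradient ascent, and the function name `train` and the toy data are hypothetical.

```python
import math


def train(data, prior_mean, prior_var, steps=200, lr=0.1):
    """Gradient ascent on log-likelihood plus a Gaussian log-prior.

    data: list of (features, label) pairs with label in {0, 1}
    prior_mean: weights the prior is centered on
    prior_var: prior variance, shared by all weights (assumption)
    """
    w = list(prior_mean)
    for _ in range(steps):
        # Gradient of the log-prior pulls weights toward prior_mean.
        grad = [-(wi - mi) / prior_var for wi, mi in zip(w, prior_mean)]
        # Gradient of the log-likelihood of the data.
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                grad[i] += (y - p) * xi
        w = [wi + lr * gi for wi, gi in zip(w, grad)]
    return w


# Hypothetical toy data: (features, label) pairs.
background_data = [([1.0, 0.0], 1), ([0.0, 1.0], 0),
                   ([1.0, 1.0], 1), ([0.0, 1.0], 0)]
adaptation_data = [([0.0, 1.0], 1)]

# Stage 1: background model, prior centered at zero.
background = train(background_data, [0.0, 0.0], prior_var=10.0)
# Stage 2: adapted model, prior centered on the background parameters.
adapted = train(adaptation_data, background, prior_var=1.0)
```

Weights for features that never occur in the adaptation data stay at their background values, while the others move only as far as the small adaptation set and the prior variance allow.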
