Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Rules-based grammar for slots and statistical model for preterminals in natural language understanding system

Patent number: 7603267

Abstract: An NLU system includes a rules-based grammar for slots in a schema and a statistical model for preterminals. A training system is also provided.

Type: Grant

Filed: May 1, 2003

Date of Patent: October 13, 2009

Assignee: Microsoft Corporation

Inventors: Ye-Yi Wang, Alejandro Acero
Creating a speech recognition grammar for alphanumeric concepts

Patent number: 7599837

Abstract: A method and system to generate a grammar adapted for use by a speech recognizer includes receiving a representation of an alphanumeric expression. For instance, the representation can take the form of a regular expression or a mask. The grammar is generated based on the representation.

Type: Grant

Filed: September 15, 2004

Date of Patent: October 6, 2009

Assignee: Microsoft Corporation

Inventors: Ye-Yi Wang, Yun-Cheng Ju, Leonard Alan Collins, Mark Cecys, Alejandro Acero
SPATIAL NOISE SUPPRESSION FOR A MICROPHONE ARRAY

Publication number: 20090226005

Abstract: A noise reduction system and a method of noise reduction include a microphone array comprising a first microphone, a second microphone, and a third microphone. Each microphone has a known position and a known directivity pattern. An instantaneous direction-of-arrival (IDOA) module determines a first phase difference quantity and a second phase difference quantity. The first phase difference quantity is based on phase differences between non-repetitive pairs of input signals received by the first microphone and the second microphone, while the second phase difference quantity is based on phase differences between non-repetitive pairs of input signals received by the first microphone and the third microphone. A spatial noise reduction module computes an estimate of a desired signal based on an a priori spatial signal-to-noise ratio and an a posteriori spatial signal-to-noise ratio based on the first and second phase difference quantities.

Type: Application

Filed: May 12, 2009

Publication date: September 10, 2009

Applicant: Microsoft Corporation

Inventors: Alejandro Acero, Ivan J. Tashev, Michael L. Seltzer
Indexing and ranking processes for directory assistance services

Patent number: 7580942

Abstract: A computer-implemented method is disclosed for providing a directory assistance service. The method includes generating an indexing file that is a representation of information associated with a collection of listings stored in an index. The indexing file is utilized as a basis for ranking listings in an index based on the strength of association with a query. Based at least in part on the ranking, an output is provided and is indicative of listings in the index that are likely to correspond to the query. At least one particular listing in the index is excluded from the output without there ever being a comparison of features in the query with features in the one particular listing.

Type: Grant

Filed: January 12, 2007

Date of Patent: August 25, 2009

Assignee: Microsoft Corporation

Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju, Ye-Yi Wang
Method and apparatus for multi-sensory speech enhancement

Patent number: 7574008

Abstract: A method and apparatus determine a channel response for an alternative sensor using an alternative sensor signal and an air conduction microphone signal. The channel response is then used to estimate a clean speech value using at least a portion of the alternative sensor signal.

Type: Grant

Filed: September 17, 2004

Date of Patent: August 11, 2009

Assignee: Microsoft Corporation

Inventors: Zhengyou Zhang, Alejandro Acero, James G. Droppo, Xuedong David Huang, Zicheng Liu
Method for training of subspace coded gaussian models

Patent number: 7571097

Abstract: A method for compressing multiple dimensional Gaussian distributions with diagonal covariance matrices includes clustering a plurality of Gaussian distributions in a multiplicity of clusters for each dimension. Each cluster can be represented by a centroid having a mean and a variance. A total decrease in likelihood of a training dataset is minimized for the representation of the plurality of Gaussian distributions.

Type: Grant

Filed: March 13, 2003

Date of Patent: August 4, 2009

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Michael D. Plumpe
Spatial noise suppression for a microphone array

Patent number: 7565288

Abstract: A microphone array having at least three microphones provides a captured signal. Spatial noise suppression estimates a desired signal from a captured signal using spatio-temporal distribution of the speech and the noise. In particular, spatial information indicative of at least two quantities of direction is used. A first quantity is based on a first combination of the signals from the at least three microphones, and a second quantity is based on a second combination of the signals of the at least three microphones.

Type: Grant

Filed: December 22, 2005

Date of Patent: July 21, 2009

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Ivan J. Tashev, Michael L. Seltzer
Quantitative model for formant dynamics and contextually assimilated reduction in fluent speech

Patent number: 7565292

Abstract: A method of identifying a sequence of formant trajectory values is provided in which a sequence of target values is identified for a formant as step functions. The target values and the duration for each segment target for the formant are applied to a finite impulse response filter to form a sequence of formant trajectory values. The parameters of this filter, as well as the duration of the targets for each phone, can be modified to produce many kinds of target undershooting effects in a contextually assimilated manner. The procedure for producing the formant trajectory values does not require any acoustic data from speech.

Type: Grant

Filed: September 17, 2004

Date of Patent: July 21, 2009

Assignee: Microsoft Corporation

Inventors: Li Deng, Alejandro Acero, Dong Yu
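
Illustrative sketch (not part of the patent record): the abstract above describes step-function targets being passed through a finite impulse response filter, with short segments undershooting their targets. The Python below is a minimal sketch of that idea under stated assumptions. The symmetric exponentially decaying taps, the `gamma` and `half_width` parameters, the edge normalization, and all function names are hypothetical choices made for illustration; none of them is taken from the patent.

```python
from typing import List, Sequence


def step_targets(targets: Sequence[float], durations: Sequence[int]) -> List[float]:
    """Expand per-segment targets into a frame-level step function."""
    frames: List[float] = []
    for target, duration in zip(targets, durations):
        frames.extend([target] * duration)
    return frames


def fir_smooth(signal: Sequence[float], gamma: float = 0.6, half_width: int = 3) -> List[float]:
    """Smooth a step sequence with a symmetric FIR kernel.

    The taps gamma**|k| are an assumed shape. At the sequence edges the
    kernel is truncated and renormalized so the output stays a weighted
    average of the input frames.
    """
    offsets = range(-half_width, half_width + 1)
    taps = [gamma ** abs(k) for k in offsets]
    n = len(signal)
    out: List[float] = []
    for t in range(n):
        acc = 0.0
        norm = 0.0
        for k, w in zip(offsets, taps):
            j = t + k
            if 0 <= j < n:
                acc += w * signal[j]
                norm += w
        out.append(acc / norm)
    return out


# A short middle segment (2 frames) between two longer ones (5 frames each).
steps = step_targets([500.0, 1500.0, 500.0], [5, 2, 5])
trajectory = fir_smooth(steps)
```

In this toy setup the two-frame middle segment is shorter than the filter window, so the smoothed trajectory never reaches the 1500 target. That is one simple way to picture the undershooting effect the abstract mentions.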
Acoustic models with structured hidden dynamics with integration over many possible hidden trajectories

Patent number: 7565284

Abstract: A method of producing at least one possible sequence of vocal tract resonance (VTR) for a fixed sequence of phonetic units, and producing the acoustic observation probability by integrating over such distributions is provided. The method includes identifying a sequence of target distributions for a VTR sequence corresponding to a phone sequence with a given segmentation. The sequence of target distributions is applied to a finite impulse response filter to produce distributions for possible VTR trajectories. Then these distributions are applied to a linearized nonlinear function to produce the acoustic observation probability for the given sequence of phonetic units. This acoustic observation probability is used for phonetic recognition.

Type: Grant

Filed: March 1, 2005

Date of Patent: July 21, 2009

Assignee: Microsoft Corporation

Inventors: Li Deng, Alejandro Acero, Dong Yu, Xiang Li
SPEECH RECOGNITION WITH NON-LINEAR NOISE REDUCTION ON MEL-FREQUENCY CEPSTRA

Publication number: 20090177468

Abstract: In an automatic speech recognition system, a feature extractor extracts features from a speech signal, and speech is recognized by the automatic speech recognition system based on the extracted features. Noise reduction as part of the feature extractor is provided by feature enhancement in which feature-domain noise reduction in the form of Mel-frequency cepstra is provided based on the minimum mean square error criterion. Specifically, the devised method takes into account the random phase between the clean speech and the mixing noise. The feature-domain noise reduction is performed in a dimension-wise fashion to the individual dimensions of the feature vectors input to the automatic speech recognition system, in order to perform environment-robust speech recognition.

Type: Application

Filed: January 8, 2008

Publication date: July 9, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Dong Yu, Alejandro Acero, James G. Droppo, Li Deng
System for automatically annotating training data for a natural language understanding system

Patent number: 7548847

Abstract: The present invention uses a natural language understanding system that is currently being trained to assist in annotating training data for training that natural language understanding system. Unannotated training data is provided to the system and the system proposes annotations to the training data. The user is offered an opportunity to confirm or correct the proposed annotations, and the system is trained with the corrected or verified annotations.

Type: Grant

Filed: May 10, 2002

Date of Patent: June 16, 2009

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Ye-Yi Wang, Leon Wong
MAXIMUM ENTROPY MODEL PARAMETERIZATION

Publication number: 20090150308

Abstract: Described is a technology by which a maximum entropy model used for classification is trained with a significantly lesser amount of training data than is normally used in training other maximum entropy models, yet provides similar accuracy to the others. The maximum entropy model is initially parameterized with parameter values determined from weights obtained by training a vector space model or an n-gram model. The weights may be scaled into the initial parameter values by determining a scaling factor. Gaussian mean values may also be determined, and used for regularization in training the maximum entropy model. Scaling may also be applied to the Gaussian mean values. After initial parameterization, training comprises using training data to iteratively adjust the initial parameters into adjusted parameters until convergence is determined.

Type: Application

Filed: December 7, 2007

Publication date: June 11, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Ye-Yi Wang, Alejandro Acero
GRAPHEME-TO-PHONEME CONVERSION USING ACOUSTIC DATA

Publication number: 20090150153

Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.

Type: Application

Filed: December 7, 2007

Publication date: June 11, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
HIGH PERFORMANCE HMM ADAPTATION WITH JOINT COMPENSATION OF ADDITIVE AND CONVOLUTIVE DISTORTIONS

Publication number: 20090144059

Abstract: A method of compensating for additive and convolutive distortions applied to a signal indicative of an utterance is discussed. The method includes receiving a signal and initializing noise mean and channel mean vectors. Gaussian dependent matrix and Hidden Markov Model (HMM) parameters are calculated or updated to account for additive noise from the noise mean vector or convolutive distortion from the channel mean vector. The HMM parameters are adapted by decoding the utterance using the previously calculated HMM parameters and adjusting the Gaussian dependent matrix and the HMM parameters based upon data received during the decoding. The adapted HMM parameters are applied to decode the input utterance and provide a transcription of the utterance.

Type: Application

Filed: December 3, 2007

Publication date: June 4, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Dong Yu, Li Deng, Alejandro Acero, Yifan Gong, Jinyu Li
Noise reduction using correction vectors based on dynamic aspects of speech and noise normalization

Patent number: 7542900

Abstract: A method and apparatus are provided for reducing noise in a signal. Under one aspect of the invention, a correction vector is selected based on a noisy feature vector that represents a noisy signal. The selected correction vector incorporates dynamic aspects of pattern signals. The selected correction vector is then added to the noisy feature vector to produce a cleaned feature vector. In other aspects of the invention, a noise value is produced from an estimate of the noise in a noisy signal. The noise value is subtracted from a value representing a portion of the noisy signal to produce a noise-normalized value. The noise-normalized value is used to select a correction value that is added to the noise-normalized value to produce a cleaned noise-normalized value. The noise value is then added to the cleaned noise-normalized value to produce a cleaned value representing a portion of a cleaned signal.

Type: Grant

Filed: May 5, 2006

Date of Patent: June 2, 2009

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Li Deng, Alejandro Acero
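
Illustrative sketch (not part of the patent record): the second half of the abstract above describes a subtract, correct, add-back sequence around a noise estimate. The Python below walks through that arithmetic on a single scalar value. The codebook of (centroid, correction) pairs, the nearest-centroid selection rule, and the function name `clean_value` are hypothetical stand-ins chosen for illustration; the abstract does not say how the correction value is stored or selected.

```python
from typing import Sequence, Tuple


def clean_value(
    noisy: float,
    noise_estimate: float,
    codebook: Sequence[Tuple[float, float]],
) -> float:
    """Apply noise normalization, a selected correction, then de-normalize.

    codebook holds (centroid, correction) pairs; picking the pair whose
    centroid is nearest to the normalized value is an assumption.
    """
    # Subtract the noise value to get a noise-normalized value.
    normalized = noisy - noise_estimate
    # Use the noise-normalized value to select a correction value.
    _, correction = min(codebook, key=lambda entry: abs(entry[0] - normalized))
    # Add the correction to get a cleaned noise-normalized value.
    cleaned_normalized = normalized + correction
    # Add the noise value back to get the cleaned value.
    return cleaned_normalized + noise_estimate


# Made-up numbers, for illustration only.
toy_codebook = [(0.0, -0.5), (5.0, -0.1), (10.0, 0.0)]
cleaned = clean_value(noisy=12.0, noise_estimate=8.0, codebook=toy_codebook)
```

With these made-up numbers the normalized value is 4.0, the nearest centroid is 5.0, and the selected correction of -0.1 gives a cleaned value of roughly 11.9.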
Configurable parameters for grammar authoring for speech recognition and natural language understanding

Patent number: 7529657

Abstract: A method for authoring a grammar for use in a language processing application is provided. The method includes receiving at least one grammar configuration parameter relating to how to configure a grammar and creating the grammar based on the at least one grammar configuration parameter.

Type: Grant

Filed: September 24, 2004

Date of Patent: May 5, 2009

Assignee: Microsoft Corporation

Inventors: Ye-Yi Wang, Alejandro Acero
Speaker adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation

Patent number: 7519531

Abstract: A computer-implemented method is provided for training a hidden trajectory model, of a speech recognition system, which generates Vocal Tract Resonance (VTR) targets. The method includes obtaining generic VTR target parameters corresponding to a generic speaker used by a target selector to generate VTR target sequences. The generic VTR target parameters are scaled for a particular speaker using a speaker-dependent scaling factor for the particular speaker to generate speaker-adaptive VTR target parameters. This scaling is performed for both the training data and the test data, and for the training data, the scaling is performed iteratively with the process of obtaining the generic targets. The computation of the scaling factor makes use of the results of a VTR tracker. The speaker-adaptive VTR target parameters for the particular speaker are then stored in order to configure the hidden trajectory model to perform speech recognition for the particular speaker using the speaker-adaptive VTR target parameters.

Type: Grant

Filed: March 30, 2005

Date of Patent: April 14, 2009

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Dong Yu, Li Deng
Method and apparatus using harmonic-model-based front end for robust speech recognition

Patent number: 7516067

Abstract: A system and method are provided that reduce noise in speech signals. The system and method decompose a noisy speech signal into a harmonic component and a residual component. The harmonic component and residual component are then combined as a sum to form a noise-reduced value. In some embodiments, the sum is a weighted sum where the harmonic component is multiplied by a scaling factor. In some embodiments, the noise-reduced value is used in speech recognition.

Type: Grant

Filed: August 25, 2003

Date of Patent: April 7, 2009

Assignee: Microsoft Corporation

Inventors: Michael Seltzer, James Droppo, Alejandro Acero
Language model adaptation using semantic supervision

Patent number: 7478038

Abstract: A method and apparatus are provided for adapting a language model. The method and apparatus provide supervised class-based adaptation of the language model utilizing in-domain semantic information.

Type: Grant

Filed: March 31, 2004

Date of Patent: January 13, 2009

Assignee: Microsoft Corporation

Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero, Yik-Cheung Tam
Greedy algorithm for identifying values for vocal tract resonance vectors

Patent number: 7475011

Abstract: A method and apparatus identify values for components of a vocal tract resonance vector by sequentially determining values for each component of the vocal tract resonance vector. To determine a value for a component, the other components are set to static values. A plurality of values for a function are then determined using a plurality of values for the component that is being determined while using the static values for all of the other components. One of the plurality of values for the component is then selected based on the plurality of values for the function.

Type: Grant

Filed: August 25, 2004

Date of Patent: January 6, 2009

Assignee: Microsoft Corporation

Inventors: Li Deng, Alejandro Acero, Issam H. Bazzi
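
Illustrative sketch (not part of the patent record): the abstract above describes choosing each component of a vector in turn while the other components are held at static values. The Python below is a minimal sketch of that component-by-component greedy search. The scoring function, the candidate grids, the choice to maximize the score, and the name `greedy_component_search` are all hypothetical; the abstract does not specify the function being evaluated or how candidate values are chosen.

```python
from typing import Callable, List, Sequence


def greedy_component_search(
    score: Callable[[Sequence[float]], float],
    initial: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> List[float]:
    """Choose each component in turn while holding the others static."""
    vector = list(initial)
    for i, options in enumerate(candidates):
        best_value, best_score = vector[i], float("-inf")
        for value in options:
            # Vary only component i; all other components keep their current values.
            trial = vector[:i] + [value] + vector[i + 1:]
            trial_score = score(trial)
            if trial_score > best_score:
                best_value, best_score = value, trial_score
        vector[i] = best_value
    return vector


def toy_score(v: Sequence[float]) -> float:
    """Hypothetical score: negative squared distance to a made-up target."""
    target = (500.0, 1500.0, 2500.0)
    return -sum((a - b) ** 2 for a, b in zip(v, target))


grids = [
    [300.0, 500.0, 700.0],
    [1000.0, 1500.0, 2000.0],
    [2000.0, 2500.0, 3000.0],
]
result = greedy_component_search(toy_score, [0.0, 0.0, 0.0], grids)
```

The search evaluates the score once per candidate value per component, so its cost grows with the sum of the grid sizes rather than their product. The trade-off is that a single greedy pass may miss the best joint combination when the components interact.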
