Patents by Inventor Fileno Alleva

Fileno Alleva has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Adaptive online feature normalization for speech recognition

Patent number: 9263030

Abstract: A speech recognition system adaptively estimates a warping factor used to reduce speaker variability. The warping factor is estimated using a small window (e.g., 100 ms) of speech. The warping factor is adaptively adjusted as more speech is obtained until the warping factor converges or a pre-defined maximum number of adaptations is reached. The speaker may be placed into a group selected from two or more groups based on characteristics that are associated with the speaker's window of speech. Different step sizes may be used within the different groups when estimating the warping factor. Vocal tract length normalization (VTLN) is applied to the speech input using the estimated warping factor. A linear transformation, including a bias term, may also be computed to assist in normalizing the speech along with the application of VTLN.
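The adaptation loop described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the hill-climbing update, the `score` callback, the group names, and the step-size values are all assumptions made for illustration. The abstract states only that the factor is adjusted window by window until it converges or a maximum number of adaptations is reached, and that step sizes may differ by speaker group.

```python
from typing import Callable, Iterable, Sequence

# Hypothetical per-group step sizes (assumption; the abstract gives no values).
GROUP_STEP_SIZES = {"group_a": 0.02, "group_b": 0.01}


def adapt_warping_factor(
    windows: Iterable[Sequence[float]],
    score: Callable[[Sequence[float], float], float],
    group: str = "group_a",
    alpha: float = 1.0,
    max_adaptations: int = 20,
) -> float:
    """Adjust the warping factor one speech window at a time.

    `score(window, alpha)` is an assumed callback that returns how well the
    window fits the acoustic model when warped by `alpha` (higher is better).
    """
    step = GROUP_STEP_SIZES[group]
    adaptations = 0
    for window in windows:
        if adaptations >= max_adaptations:
            break  # pre-defined maximum number of adaptations reached
        # Try one step down, no change, and one step up; keep the best.
        candidates = (alpha - step, alpha, alpha + step)
        best = max(candidates, key=lambda a: score(window, a))
        adaptations += 1
        if best == alpha:
            break  # no neighbor scores better: treat as converged
        alpha = best
    return alpha


# Toy usage: a made-up score that peaks at a warping factor of 1.1.
toy_windows = [[0.0]] * 50
toy_score = lambda window, a: -(a - 1.1) ** 2
estimated = adapt_warping_factor(toy_windows, toy_score)
```

With the toy score, the estimate climbs from 1.0 toward 1.1 in steps of 0.02 and stops once neither neighbor scores better. A real system would score each window against its acoustic model instead.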

Type: Grant

Filed: January 23, 2013

Date of Patent: February 16, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Shizhen Wang, Yifan Gong, Fileno Alleva

ADAPTIVE ONLINE FEATURE NORMALIZATION FOR SPEECH RECOGNITION

Publication number: 20140207448

Abstract: A speech recognition system adaptively estimates a warping factor used to reduce speaker variability. The warping factor is estimated using a small window (e.g., 100 ms) of speech. The warping factor is adaptively adjusted as more speech is obtained until the warping factor converges or a pre-defined maximum number of adaptations is reached. The speaker may be placed into a group selected from two or more groups based on characteristics that are associated with the speaker's window of speech. Different step sizes may be used within the different groups when estimating the warping factor. Vocal tract length normalization (VTLN) is applied to the speech input using the estimated warping factor. A linear transformation, including a bias term, may also be computed to assist in normalizing the speech along with the application of VTLN.

Type: Application

Filed: January 23, 2013

Publication date: July 24, 2014

Applicant: Microsoft Corporation

Inventors: Shizhen Wang, Yifan Gong, Fileno Alleva

Disambiguation language model

Patent number: 7251600

Abstract: A language model for a language processing system, such as a speech recognition system, is constructed from a training corpus formed from associated characters, word phrases and context cues. A method and apparatus for generating the training corpus used to train the language model, and a system or module using such a language model, are disclosed.

Type: Grant

Filed: March 29, 2005

Date of Patent: July 31, 2007

Assignee: Microsoft Corporation

Inventors: Yun-cheng Ju, Fileno Alleva

Method and apparatus for constructing and using syllable-like unit language models

Publication number: 20050187769

Abstract: A method and computer-readable medium use syllable-like units (SLUs) to decode a pronunciation into a phonetic description. The syllable-like units are generally larger than a single phoneme but smaller than a word. The present invention provides a means for defining these syllable-like units and for generating a language model based on these syllable-like units that can be used in the decoding process. As SLUs are longer than phonemes, they contain more acoustic contextual clues and better lexical constraints for speech recognition. Thus, the phoneme accuracy produced from SLU recognition is much better than all-phone sequence recognition.

Type: Application

Filed: April 20, 2005

Publication date: August 25, 2005

Applicant: Microsoft Corporation

Inventors: Mei-Yuh Hwang, Fileno Alleva, Rebecca Weiss

Disambiguation language model

Publication number: 20050171761

Abstract: A language model for a language processing system, such as a speech recognition system, is constructed from a training corpus formed from associated characters, word phrases and context cues. A method and apparatus for generating the training corpus used to train the language model, and a system or module using such a language model, are disclosed.

Type: Application

Filed: March 29, 2005

Publication date: August 4, 2005

Applicant: Microsoft Corporation

Inventors: Yun-cheng Ju, Fileno Alleva

Method and apparatus for generating and displaying N-Best alternatives in a speech recognition system

Publication number: 20050091054

Abstract: The present invention is directed to a method and apparatus for generating alternatives to words indicative of recognized speech. A reference path of recognized words is generated, based upon input speech data. An operator selection input is received and is indicative of a selected portion of the recognized speech, for which alternatives are to be generated. Boundary conditions for alternatives to be generated are calculated based upon bounds of a reference subpath corresponding to the selected portion of the recognized speech. Alternate subpaths satisfying the boundary conditions are constructed from a hypothesis store which corresponds to the input speech data.
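As a rough illustration of building alternate subpaths from a hypothesis store, the sketch below enumerates word sequences that start and end on the same frame boundaries as the selected portion, then ranks them by total score. The `WordHyp` record, the frame-based boundaries, the additive scores, and the exhaustive search are assumptions for illustration. The abstract does not specify the store's layout or the search strategy.

```python
from typing import List, NamedTuple, Sequence


class WordHyp(NamedTuple):
    """One word hypothesis kept by the decoder (hypothetical layout)."""
    word: str
    start: int    # first frame covered by the word
    end: int      # frame at which the next word may begin
    score: float  # log-probability-like score, higher is better


def alternatives(store: Sequence[WordHyp], start: int, end: int,
                 reference: Sequence[str], n: int = 3) -> List[List[str]]:
    """Return up to n alternate word sequences spanning frames start..end."""
    by_start = {}
    for hyp in store:
        by_start.setdefault(hyp.start, []).append(hyp)

    found = []

    def extend(frame, words, total):
        if frame == end:
            # Boundary conditions met; skip the reference subpath itself.
            if list(words) != list(reference):
                found.append((total, words))
            return
        for hyp in by_start.get(frame, []):
            if frame < hyp.end <= end:
                extend(hyp.end, words + [hyp.word], total + hyp.score)

    extend(start, [], 0.0)
    found.sort(key=lambda item: -item[0])  # best total score first
    return [words for _, words in found[:n]]


# Toy hypothesis store with made-up frames and scores.
store = [
    WordHyp("recognize", 0, 10, -2.0), WordHyp("speech", 10, 20, -1.0),
    WordHyp("wreck", 0, 4, -1.5), WordHyp("a", 4, 6, -0.5),
    WordHyp("nice", 6, 10, -1.0), WordHyp("beach", 10, 20, -2.5),
]
first_word_alts = alternatives(store, 0, 10, ["recognize"])
```

Here the user has selected the first recognized word, so only subpaths covering frames 0 to 10 are considered. A production decoder would prune the search rather than enumerate every path.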

Type: Application

Filed: November 23, 2004

Publication date: April 28, 2005

Applicant: Microsoft Corporation

Inventors: Chris Thrasher, Fileno Alleva

Method and system for frame alignment and unsupervised adaptation of acoustic models

Publication number: 20050071162

Abstract: An unsupervised adaptation method and apparatus are provided that reduce the storage and time requirements associated with adaptation. Under the invention, utterances are converted into feature vectors, which are decoded to produce a transcript and alignment unit boundaries for the utterance. Individual alignment units and the feature vectors associated with those alignment units are then provided to an alignment function, which aligns the feature vectors with the states of each alignment unit. Because the alignment is performed within alignment unit boundaries, fewer feature vectors are used and the time for alignment is reduced. After alignment, the feature vector dimensions aligned to a state are added to dimension sums that are kept for that state. After all the states in an utterance have had their sums updated, the speech signal and the alignment units are deleted. Once sufficient frames of data have been received to perform adaptive training, the acoustic model is adapted.
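The bookkeeping this abstract describes, per-state dimension sums that are kept after the speech signal is discarded, can be sketched as follows. The class name, the `min_frames` threshold, and the use of per-state means as the final statistic are assumptions for illustration. The abstract does not say which statistics the adaptation step consumes.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple


class AdaptationStats:
    """Running per-state sums of aligned feature vectors (hypothetical helper)."""

    def __init__(self, dim: int, min_frames: int) -> None:
        self.min_frames = min_frames  # assumed threshold for "sufficient frames"
        self.sums: Dict[Hashable, List[float]] = defaultdict(lambda: [0.0] * dim)
        self.counts: Dict[Hashable, int] = defaultdict(int)
        self.total_frames = 0

    def accumulate(self, aligned: Iterable[Tuple[Hashable, Sequence[float]]]) -> None:
        """Add each (state, feature_vector) pair to that state's dimension sums.

        Only the sums and counts are stored, so the caller can delete the
        utterance and its alignment once this returns.
        """
        for state, vector in aligned:
            sums = self.sums[state]
            for d, value in enumerate(vector):
                sums[d] += value
            self.counts[state] += 1
            self.total_frames += 1

    def ready(self) -> bool:
        """True once enough frames have been seen to adapt the model."""
        return self.total_frames >= self.min_frames

    def state_means(self) -> Dict[Hashable, List[float]]:
        """Per-state mean vectors computed from the stored sums."""
        return {s: [x / self.counts[s] for x in sums]
                for s, sums in self.sums.items()}


# Toy usage with two-dimensional features and made-up state names.
stats = AdaptationStats(dim=2, min_frames=3)
stats.accumulate([("s1", [1.0, 2.0]), ("s1", [3.0, 4.0])])
ready_after_first = stats.ready()
stats.accumulate([("s2", [5.0, 6.0])])
```

Keeping only sums and counts is what lets storage stay constant per state regardless of how many utterances are processed.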

Type: Application

Filed: November 12, 2004

Publication date: March 31, 2005

Applicant: Microsoft Corporation

Inventors: William Rockenbeck, Milind Mahajan, Fileno Alleva
