Patents by Inventor Fileno Alleva

Fileno Alleva has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9263030
    Abstract: A speech recognition system adaptively estimates a warping factor used to reduce speaker variability. The warping factor is estimated using a small window (e.g. 100 ms) of speech. The warping factor is adaptively adjusted as more speech is obtained until the warping factor converges or a pre-defined maximum number of adaptation is reached. The speaker may be placed into a group selected from two or more groups based on characteristics that are associated with the speaker's window of speech. Different step sizes may be used within the different groups when estimating the warping factor. VTLN is applied to the speech input using the estimated warping factor. A linear transformation, including a bias term, may also be computed to assist in normalizing the speech along with the application of the VTLN.
    Type: Grant
    Filed: January 23, 2013
    Date of Patent: February 16, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shizhen Wang, Yifan Gong, Fileno Alleva
  • Publication number: 20140207448
    Abstract: A speech recognition system adaptively estimates a warping factor used to reduce speaker variability. The warping factor is estimated using a small window (e.g. 100 ms) of speech. The warping factor is adaptively adjusted as more speech is obtained until the warping factor converges or a pre-defined maximum number of adaptation is reached. The speaker may be placed into a group selected from two or more groups based on characteristics that are associated with the speaker's window of speech. Different step sizes may be used within the different groups when estimating the warping factor. VTLN is applied to the speech input using the estimated warping factor. A linear transformation, including a bias term, may also be computed to assist in normalizing the speech along with the application of the VTLN.
    Type: Application
    Filed: January 23, 2013
    Publication date: July 24, 2014
    Applicant: Microsoft Corporation
    Inventors: Shizhen Wang, Yifan Gong, Fileno Alleva
  • Patent number: 7251600
    Abstract: A language model for a language processing system such as a speech recognition system is constructed from training corpus formed from associated characters, word phrases and context cues. A method and apparatus for generating the training corpus used to train the language model and a system or module using such a language model is disclosed.
    Type: Grant
    Filed: March 29, 2005
    Date of Patent: July 31, 2007
    Assignee: Microsoft Corporation
    Inventors: Yun-cheng Ju, Fileno Alleva
  • Publication number: 20050187769
    Abstract: A method and computer-readable medium use syllable-like units (SLUs) to decode a pronunciation into a phonetic description. The syllable-like units are generally larger than a single phoneme but smaller than a word. The present invention provides a means for defining these syllable-like units and for generating a language model based on these syllable-like units that can be used in the decoding process. As SLUs are longer than phonemes, they contain more acoustic contextual clues and better lexical constraints for speech recognition. Thus, the phoneme accuracy produced from SLU recognition is much better than all-phone sequence recognition.
    Type: Application
    Filed: April 20, 2005
    Publication date: August 25, 2005
    Applicant: Microsoft Corporation
    Inventors: Mei-Yuh Hwang, Fileno Alleva, Rebecca Weiss
  • Publication number: 20050171761
    Abstract: A language model for a language processing system such as a speech recognition system is constructed from training corpus formed from associated characters, word phrases and context cues. A method and apparatus for generating the training corpus used to train the language model and a system or module using such a language model is disclosed.
    Type: Application
    Filed: March 29, 2005
    Publication date: August 4, 2005
    Applicant: Microsoft Corporation
    Inventors: Yun-cheng Ju, Fileno Alleva
  • Publication number: 20050091054
    Abstract: The present invention is directed to a method and apparatus for generating alternatives to words indicative of recognized speech. A reference path of recognized words is generated, based upon input speech data. An operator selection input is received and is indicative of a selected portion of the recognized speech, for which alternatives are to be generated. Boundary conditions for alternatives to be generated are calculated based upon bounds of a reference subpath corresponding to the selected portion of the recognized speech. Alternate subpaths satisfying the boundary conditions are constructed from a hypothesis store which corresponds to the input speech data.
    Type: Application
    Filed: November 23, 2004
    Publication date: April 28, 2005
    Applicant: Microsoft Corporation
    Inventors: Chris Thrasher, Fileno Alleva
  • Publication number: 20050071162
    Abstract: An unsupervised adaptation method and apparatus are provided that reduce the storage and time requirements associated with adaptation. Under the invention, utterances are converted into feature vectors, which are decoded to produce a transcript and alignment unit boundaries for the utterance. Individual alignment units and the feature vectors associated with those alignment units are then provided to an alignment function, which aligns the feature vectors with the states of each alignment unit. Because the alignment is performed within alignment unit boundaries, fewer feature vectors are used and the time for alignment is reduced. After alignment, the feature vector dimensions aligned to a state are added to dimension sums that are kept for that state. After all the states in an utterance have had their sums updated, the speech signal and the alignment units are deleted. Once sufficient frames of data have been received to perform adaptive training, the acoustic model is adapted.
    Type: Application
    Filed: November 12, 2004
    Publication date: March 31, 2005
    Applicant: Microsoft Corporation
    Inventors: William Rockenbeck, Milind Mahajan, Fileno Alleva