Patents by Inventor James Garnet Droppo, III

James Garnet Droppo, III has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200335119
    Abstract: Embodiments are associated with determination of a first plurality of multi-dimensional vectors, each representing speech of a target speaker, and of a multi-dimensional vector representing a speech signal of two or more speakers. A weighted vector representing speech of the target speaker is determined based on the first plurality of multi-dimensional vectors and on similarities between the speech-signal vector and each of the first plurality of multi-dimensional vectors, and speech of the target speaker is extracted from the speech signal based on the weighted vector and the speech signal.
    Type: Application
    Filed: June 7, 2019
    Publication date: October 22, 2020
    Inventors: Xiong XIAO, Zhuo CHEN, Takuya YOSHIOKA, Changliang LIU, Hakan ERDOGAN, Dimitrios Basile DIMITRIADIS, Yifan GONG, James Garnet Droppo, III
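    The abstract above describes combining a target speaker's enrollment vectors into one profile vector, weighted by each vector's similarity to an embedding of the mixed-speaker signal. A minimal sketch of that weighting step (the function and variable names are illustrative, not from the patent, and the actual embedding and extraction networks are omitted):

    ```python
    import math

    def cosine_similarity(a, b):
        """Cosine similarity between two equal-length vectors."""
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    def weighted_speaker_vector(enrollment_vecs, mixture_vec):
        """Combine a target speaker's enrollment vectors into a single
        profile vector, weighting each by its softmax-normalized
        similarity to the embedding of the mixed-speaker signal."""
        sims = [cosine_similarity(v, mixture_vec) for v in enrollment_vecs]
        denom = sum(math.exp(s) for s in sims)
        weights = [math.exp(s) / denom for s in sims]  # softmax over similarities
        dim = len(mixture_vec)
        return [sum(w * v[i] for w, v in zip(weights, enrollment_vecs))
                for i in range(dim)]
    ```

    Enrollment vectors that better match the current mixture thus dominate the profile, which the extraction model would then condition on.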
  • Patent number: 9082403
    Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels, and updates a classification model associated with the classifier using the pseudo-semantic label.
    Type: Grant
    Filed: December 15, 2011
    Date of Patent: July 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Cheng Ju, James Garnet Droppo, III
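    The core idea above is that a denied confirmation still carries information: the denied label can be excluded, and the most likely remaining label used as a pseudo-semantic training label. A minimal sketch of that selection step (names are hypothetical; the patent's classifier and update procedure are not shown):

    ```python
    def pseudo_label(scores, denied_label):
        """Given classifier scores {label: probability} for an utterance
        and a label the caller explicitly denied, return the most likely
        remaining label as a pseudo-semantic label for retraining."""
        candidates = {lab: p for lab, p in scores.items() if lab != denied_label}
        return max(candidates, key=candidates.get)
    ```

    For example, if the classifier scores an utterance as billing 0.5, support 0.3, sales 0.2 and the caller denies "billing", the pseudo-label is "support"; the classification model is then updated with that label instead of discarding the utterance.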
  • Publication number: 20130159000
    Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels, and updates a classification model associated with the classifier using the pseudo-semantic label.
    Type: Application
    Filed: December 15, 2011
    Publication date: June 20, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Yun-Cheng Ju, James Garnet Droppo, III
  • Patent number: 8401852
    Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.
    Type: Grant
    Filed: November 30, 2009
    Date of Patent: March 19, 2013
    Assignee: Microsoft Corporation
    Inventors: Geoffrey Gerson Zweig, Patrick An-Phu Nguyen, James Garnet Droppo, III, Alejandro Acero
  • Publication number: 20110307252
    Abstract: Described is the use of utterance classification based methods and other machine learning techniques to provide a telephony application or other voice menu application (e.g., an automotive application) that need not use Context-Free-Grammars to determine a user's spoken intent. A classifier receives text from an information retrieval-based speech recognizer and outputs a semantic label corresponding to the likely intent of a user's speech. The semantic label is then output, such as for use by a voice menu program in branching between menus. Also described is training, including training the language model from acoustic data without transcriptions, and training the classifier from speech-recognized acoustic data having associated semantic labels.
    Type: Application
    Filed: June 15, 2010
    Publication date: December 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Yun-Cheng Ju, James Garnet Droppo, III
  • Publication number: 20110224982
    Abstract: Described is a technology in which information retrieval (IR) techniques are used in a speech recognition (ASR) system. Acoustic units (e.g., phones, syllables, multi-phone units, words and/or phrases) are decoded, and features found from those acoustic units. The features are then used with IR techniques (e.g., TF-IDF based retrieval) to obtain a target output (a word or words).
    Type: Application
    Filed: March 12, 2010
    Publication date: September 15, 2011
    Applicant: Microsoft Corporation
    Inventors: Alejandro Acero, James Garnet Droppo, III, Xiaoqiang Xiao, Geoffrey G. Zweig
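    The abstract above treats recognition as retrieval: each word's acoustic-unit sequence is indexed with TF-IDF weights, and a decoded unit sequence is matched against the index. A minimal TF-IDF sketch under those assumptions (the lexicon, unit inventory, and function names are illustrative, not from the patent):

    ```python
    import math
    from collections import Counter

    def tfidf_index(lexicon):
        """Build TF-IDF weight vectors for each word, treating the word's
        acoustic-unit sequence (e.g. phones) as a 'document'.
        `lexicon` maps word -> list of acoustic units."""
        n_docs = len(lexicon)
        df = Counter()                      # document frequency per unit
        for units in lexicon.values():
            df.update(set(units))
        index = {}
        for word, units in lexicon.items():
            tf = Counter(units)             # term frequency per unit
            index[word] = {u: tf[u] * math.log(n_docs / df[u]) for u in tf}
        return index

    def retrieve(index, decoded_units):
        """Score each word against the decoded unit sequence by the dot
        product of TF-IDF vectors; return the best-matching word."""
        q = Counter(decoded_units)
        def score(word):
            return sum(q[u] * w for u, w in index[word].items())
        return max(index, key=score)
    ```

    Units shared by many words (low IDF) contribute little, so retrieval is driven by the discriminative units, which is what makes the IR formulation robust to noisy unit decoding.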
  • Publication number: 20110131046
    Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.
    Type: Application
    Filed: November 30, 2009
    Publication date: June 2, 2011
    Applicant: Microsoft Corporation
    Inventors: Geoffrey Gerson Zweig, Patrick An-Phu Nguyen, James Garnet Droppo, III, Alejandro Acero