Patents by Inventor James Garnet Droppo, III

James Garnet Droppo, III has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200335119
    Abstract: Embodiments are associated with determination of a first plurality of multi-dimensional vectors, each representing speech of a target speaker, and of a multi-dimensional vector representing a speech signal of two or more speakers. A weighted vector representing speech of the target speaker is determined based on the first plurality of multi-dimensional vectors and on similarities between the speech-signal vector and each of the first plurality of multi-dimensional vectors, and speech of the target speaker is extracted from the speech signal based on the weighted vector and the speech signal.
    Type: Application
    Filed: June 7, 2019
    Publication date: October 22, 2020
    Inventors: Xiong XIAO, Zhuo CHEN, Takuya YOSHIOKA, Changliang LIU, Hakan ERDOGAN, Dimitrios Basile DIMITRIADIS, Yifan GONG, James Garnet Droppo, III
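    The abstract above describes combining a target speaker's enrollment vectors into one profile vector, weighted by each vector's similarity to an embedding of the mixed-speaker signal. A minimal sketch of that weighting step (the function and variable names are illustrative, not from the patent, and the actual embedding and extraction networks are omitted):

    ```python
    import math

    def cosine_similarity(a, b):
        """Cosine similarity between two equal-length vectors."""
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    def weighted_speaker_vector(enrollment_vecs, mixture_vec):
        """Combine a target speaker's enrollment vectors into a single
        profile vector, weighting each by its softmax-normalized
        similarity to the embedding of the mixed-speaker signal."""
        sims = [cosine_similarity(v, mixture_vec) for v in enrollment_vecs]
        denom = sum(math.exp(s) for s in sims)
        weights = [math.exp(s) / denom for s in sims]  # softmax over similarities
        dim = len(mixture_vec)
        return [sum(w * v[i] for w, v in zip(weights, enrollment_vecs))
                for i in range(dim)]
    ```

    Enrollment vectors that better match the current mixture thus dominate the profile, which the extraction model would then condition on.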
  • Patent number: 9082403
    Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels, and updates a classification model associated with the classifier using the pseudo-semantic label.
    Type: Grant
    Filed: December 15, 2011
    Date of Patent: July 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Cheng Ju, James Garnet Droppo, III
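    The core idea above is that a denied confirmation still carries information: the denied label can be excluded, and the most likely remaining label used as a pseudo-semantic training label. A minimal sketch of that selection step (names are hypothetical; the patent's classifier and update procedure are not shown):

    ```python
    def pseudo_label(scores, denied_label):
        """Given classifier scores {label: probability} for an utterance
        and a label the caller explicitly denied, return the most likely
        remaining label as a pseudo-semantic label for retraining."""
        candidates = {lab: p for lab, p in scores.items() if lab != denied_label}
        return max(candidates, key=candidates.get)
    ```

    For example, if the classifier scores an utterance as billing 0.5, support 0.3, sales 0.2 and the caller denies "billing", the pseudo-label is "support"; the classification model is then updated with that label instead of discarding the utterance.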
  • Publication number: 20130159000
    Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels, and updates a classification model associated with the classifier using the pseudo-semantic label.
    Type: Application
    Filed: December 15, 2011
    Publication date: June 20, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Yun-Cheng Ju, James Garnet Droppo, III
  • Patent number: 8401852
    Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.
    Type: Grant
    Filed: November 30, 2009
    Date of Patent: March 19, 2013
    Assignee: Microsoft Corporation
    Inventors: Geoffrey Gerson Zweig, Patrick An-Phu Nguyen, James Garnet Droppo, III, Alejandro Acero
  • Publication number: 20110307252
    Abstract: Described is the use of utterance classification based methods and other machine learning techniques to provide a telephony application or other voice menu application (e.g., an automotive application) that need not use Context-Free-Grammars to determine a user's spoken intent. A classifier receives text from an information retrieval-based speech recognizer and outputs a semantic label corresponding to the likely intent of a user's speech. The semantic label is then output, such as for use by a voice menu program in branching between menus. Also described is training, including training the language model from acoustic data without transcriptions, and training the classifier from speech-recognized acoustic data having associated semantic labels.
    Type: Application
    Filed: June 15, 2010
    Publication date: December 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Yun-Cheng Ju, James Garnet Droppo, III
  • Publication number: 20110224982
    Abstract: Described is a technology in which information retrieval (IR) techniques are used in a speech recognition (ASR) system. Acoustic units (e.g., phones, syllables, multi-phone units, words and/or phrases) are decoded, and features found from those acoustic units. The features are then used with IR techniques (e.g., TF-IDF based retrieval) to obtain a target output (a word or words).
    Type: Application
    Filed: March 12, 2010
    Publication date: September 15, 2011
    Applicant: Microsoft Corporation
    Inventors: Alejandro Acero, James Garnet Droppo, III, Xiaoqiang Xiao, Geoffrey G. Zweig
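    The abstract above treats recognition as retrieval: each word's acoustic-unit sequence is indexed with TF-IDF weights, and a decoded unit sequence is matched against the index. A minimal TF-IDF sketch under those assumptions (the lexicon, unit inventory, and function names are illustrative, not from the patent):

    ```python
    import math
    from collections import Counter

    def tfidf_index(lexicon):
        """Build TF-IDF weight vectors for each word, treating the word's
        acoustic-unit sequence (e.g. phones) as a 'document'.
        `lexicon` maps word -> list of acoustic units."""
        n_docs = len(lexicon)
        df = Counter()                      # document frequency per unit
        for units in lexicon.values():
            df.update(set(units))
        index = {}
        for word, units in lexicon.items():
            tf = Counter(units)             # term frequency per unit
            index[word] = {u: tf[u] * math.log(n_docs / df[u]) for u in tf}
        return index

    def retrieve(index, decoded_units):
        """Score each word against the decoded unit sequence by the dot
        product of TF-IDF vectors; return the best-matching word."""
        q = Counter(decoded_units)
        def score(word):
            return sum(q[u] * w for u, w in index[word].items())
        return max(index, key=score)
    ```

    Units shared by many words (low IDF) contribute little, so retrieval is driven by the discriminative units, which is what makes the IR formulation robust to noisy unit decoding.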
  • Publication number: 20110131046
    Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.
    Type: Application
    Filed: November 30, 2009
    Publication date: June 2, 2011
    Applicant: Microsoft Corporation
    Inventors: Geoffrey Gerson Zweig, Patrick An-Phu Nguyen, James Garnet Droppo, III, Alejandro Acero