Patents by Inventor James G. Droppo

James G. Droppo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Self-stabilized deep neural network

Patent number: 10885438

Abstract: A neural network is structured with a plurality of levels of nodes. Each level has a level-specific stabilization parameter that adjusts a learning rate, at a corresponding level, during training. The stabilization parameter has a value that varies inversely relative to a change in an objective training function during back-propagation of the error through the level.

Type: Grant

Filed: December 28, 2015

Date of Patent: January 5, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: James G. Droppo, Pegah Ghahremani, Avner May
Linearly augmented neural network

Patent number: 10558909

Abstract: A neural network is structured to connect the input values of an input set, at each level, to that level's output using a linear bypass connection. The linear bypass connection passes the input values, to the output, without applying a non-linear function to them.

Type: Grant

Filed: December 28, 2015

Date of Patent: February 11, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: James G. Droppo, Pegah Ghahremani, Michael L. Seltzer
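The linear bypass described in the abstract above can be illustrated with a minimal sketch. It is not taken from the patent itself: the function name `linearly_augmented_layer`, the choice of `tanh` as the non-linear function, and the option of an identity bypass are all assumptions made for illustration. The layer output is the usual non-linear path plus the input passed through a purely linear path.

```python
import math

def linearly_augmented_layer(x, weights, bias, bypass=None):
    """Hypothetical layer: y = tanh(W x + b) + V x.

    `bypass` is the linear matrix V; None means an identity bypass,
    which requires the layer's output size to equal its input size.
    """
    # Non-linear path: affine transform followed by tanh.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]
    # Linear bypass path: no non-linear function is applied to the input.
    if bypass is None:
        linear = list(x)
    else:
        linear = [sum(v * xi for v, xi in zip(row, x)) for row in bypass]
    if len(linear) != len(hidden):
        raise ValueError("bypass output size must match layer output size")
    return [h + l for h, l in zip(hidden, linear)]
```

With all weights and biases set to zero, the non-linear path contributes nothing and the layer simply passes its input through the bypass, which is one way to see why such a connection keeps the input available to later levels.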
Prediction-based sequence recognition

Patent number: 9824684

Abstract: A sequence recognition system comprises a prediction component configured to receive a set of observed features from a signal to be recognized and to output a prediction output indicative of a predicted recognition based on the set of observed features. The sequence recognition system also comprises a classification component configured to receive the prediction output and to output a label indicative of recognition of the signal based on the prediction output.

Type: Grant

Filed: December 22, 2014

Date of Patent: November 21, 2017

Assignee: Microsoft Technology Licensing, LLC

Inventors: Dong Yu, Yu Zhang, Michael L. Seltzer, James G. Droppo
Dual-band speech encoding and estimating a narrowband speech feature from a wideband speech feature

Patent number: 9786284

Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.

Type: Grant

Filed: August 14, 2014

Date of Patent: October 10, 2017

Assignee: Microsoft Technology Licensing, LLC

Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
LINEARLY AUGMENTED NEURAL NETWORK

Publication number: 20170185887

Abstract: A neural network is structured to connect the input values of an input set, at each level, to that level's output using a linear bypass connection. The linear bypass connection passes the input values, to the output, without applying a non-linear function to them.

Type: Application

Filed: December 28, 2015

Publication date: June 29, 2017

Inventors: James G. Droppo, Pegah Ghahremani, Michael L. Seltzer
SELF-STABILIZED DEEP NEURAL NETWORK

Publication number: 20170185897

Abstract: A neural network is structured with a plurality of levels of nodes. Each level has a level-specific stabilization parameter that adjusts a learning rate, at a corresponding level, during training.

Type: Application

Filed: December 28, 2015

Publication date: June 29, 2017

Inventors: James G. Droppo, Pegah Ghahremani, Avner May
PREDICTION-BASED SEQUENCE RECOGNITION

Publication number: 20160140956

Abstract: A sequence recognition system comprises a prediction component configured to receive a set of observed features from a signal to be recognized and to output a prediction output indicative of a predicted recognition based on the set of observed features. The sequence recognition system also comprises a classification component configured to receive the prediction output and to output a label indicative of recognition of the signal based on the prediction output.

Type: Application

Filed: December 22, 2014

Publication date: May 19, 2016

Inventors: Dong Yu, Yu Zhang, Michael L. Seltzer, James G. Droppo
Dual-Band Speech Encoding

Publication number: 20140358525

Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.

Type: Application

Filed: August 14, 2014

Publication date: December 4, 2014

Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
Dual-band speech encoding

Patent number: 8818797

Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.

Type: Grant

Filed: December 23, 2010

Date of Patent: August 26, 2014

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
Noise suppressor for speech recognition

Patent number: 8615393

Abstract: A noise suppressor for altering a speech signal is trained based on a speech recognition system. An objective function can be utilized to adjust parameters of the noise suppressor. The noise suppressor can be used to alter speech signals for the speech recognition system.

Type: Grant

Filed: November 15, 2006

Date of Patent: December 24, 2013

Assignee: Microsoft Corporation

Inventors: Ivan J. Tashev, Alejandro Acero, James G. Droppo
Warped spectral and fine estimate audio encoding

Patent number: 8532985

Abstract: A warped spectral estimate of an original audio signal can be used to encode a representation of a fine estimate of the original signal. The representation of the warped spectral estimate and the representation of the fine estimate can be sent to a speech recognition system. The representation of the warped spectral estimate can be passed to a speech recognition engine, where it may be used for speech recognition. The representation of the warped spectral estimate can also be used along with the representation of the fine estimate to reconstruct a representation of the original audio signal.

Type: Grant

Filed: December 3, 2010

Date of Patent: September 10, 2013

Assignee: Microsoft Corporation

Inventors: Michael L. Seltzer, James G. Droppo, Henrique S. Malvar, Alejandro Acero, Xing Fan
Loudspeaker array design

Patent number: 8379891

Abstract: Sound signals to be output from a loudspeaker array are modified by a plurality of filters designed according to an unconstrained optimization procedure to improve overall performance (e.g., power, directivity) of the loudspeaker array. More particularly, respective filters are configured to receive a signal to be output to a plurality of loudspeakers. Upon receiving the signal, the respective filters individually modify the received signal according to the results of the unconstrained optimization procedure and then output the individually modified signals to respective loudspeakers. The unconstrained optimization procedure takes into account manufacturing tolerances and individually enhances the signal output to each of a plurality of individual loudspeakers within an array to achieve an overall improvement in performance.

Type: Grant

Filed: June 4, 2008

Date of Patent: February 19, 2013

Assignee: Microsoft Corporation

Inventors: Ivan J. Tashev, James G. Droppo, Michael L. Seltzer, Alejandro Acero
Speech recognition with non-linear noise reduction on Mel-frequency cepstra

Patent number: 8306817

Abstract: In an automatic speech recognition system, a feature extractor extracts features from a speech signal, and speech is recognized by the automatic speech recognition system based on the extracted features. Noise reduction as part of the feature extractor is provided by feature enhancement in which feature-domain noise reduction in the form of Mel-frequency cepstra is provided based on the minimum mean square error criterion. Specifically, the devised method takes into account the random phase between the clean speech and the mixing noise. The feature-domain noise reduction is performed in a dimension-wise fashion to the individual dimensions of the feature vectors input to the automatic speech recognition system, in order to perform environment-robust speech recognition.

Type: Grant

Filed: January 8, 2008

Date of Patent: November 6, 2012

Assignee: Microsoft Corporation

Inventors: Dong Yu, Alejandro Acero, James G. Droppo, Li Deng
Dual-Band Speech Encoding

Publication number: 20120166186

Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.

Type: Application

Filed: December 23, 2010

Publication date: June 28, 2012

Applicant: Microsoft Corporation

Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
WARPED SPECTRAL AND FINE ESTIMATE AUDIO ENCODING

Publication number: 20120143599

Abstract: A warped spectral estimate of an original audio signal can be used to encode a representation of a fine estimate of the original signal. The representation of the warped spectral estimate and the representation of the fine estimate can be sent to a speech recognition system. The representation of the warped spectral estimate can be passed to a speech recognition engine, where it may be used for speech recognition. The representation of the warped spectral estimate can also be used along with the representation of the fine estimate to reconstruct a representation of the original audio signal.

Type: Application

Filed: December 3, 2010

Publication date: June 7, 2012

Applicant: Microsoft Corporation

Inventors: Michael L. Seltzer, James G. Droppo, Henrique S. Malvar, Alejandro Acero, Xing Fan
Pitch model for noise estimation

Patent number: 8180636

Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying noise signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.

Type: Grant

Filed: March 7, 2011

Date of Patent: May 15, 2012

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Luis Buera
PITCH MODEL FOR NOISE ESTIMATION

Publication number: 20110161078

Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying noise signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.

Type: Application

Filed: March 7, 2011

Publication date: June 30, 2011

Applicant: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Luis Buera
Pitch model for noise estimation

Patent number: 7925502

Abstract: Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying noise signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.

Type: Grant

Filed: April 19, 2007

Date of Patent: April 12, 2011

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Luis Buera
Joint training of feature extraction and acoustic model parameters for speech recognition

Patent number: 7885812

Abstract: Parameters for a feature extractor and acoustic model of a speech recognition module are trained. An objective function is utilized to determine values for the feature extractor parameters and the acoustic model parameters.

Type: Grant

Filed: November 15, 2006

Date of Patent: February 8, 2011

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, James G. Droppo, Milind V. Mahajan
Method of pattern recognition using noise reduction uncertainty

Patent number: 7769582

Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. In the meantime, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by the amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.

Type: Grant

Filed: July 25, 2008

Date of Patent: August 3, 2010

Assignee: Microsoft Corporation

Inventors: James G. Droppo, Alejandro Acero, Li Deng
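The variance adjustment mentioned in the abstract above can be sketched for a single feature dimension. This is an illustrative assumption, not the patented method: the function names `log_gaussian` and `uncertain_log_likelihood` are hypothetical, and a one-dimensional Gaussian stands in for the acoustic model of a phonetic state. The only idea carried over from the abstract is that the Gaussian's variance is increased by the estimated variance of the cleaned signal before scoring.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def uncertain_log_likelihood(cleaned, uncertainty_var, mean, var):
    """Score a cleaned feature against a state's Gaussian.

    The model variance is widened by the noise-removal uncertainty
    (the estimated variance of the cleaned value), per the abstract.
    """
    return log_gaussian(cleaned, mean, var + uncertainty_var)
```

When the noise-removal step reports zero uncertainty, the score equals the ordinary Gaussian log likelihood. When the uncertainty is large, a cleaned value far from the state mean is penalized less, so unreliable frames have less influence on which phonetic state is chosen.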
