Patents by Inventor Michael L. Seltzer
Michael L. Seltzer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10558909
Abstract: A neural network is structured to connect the input values of an input set, at each level, to that level's output using a linear bypass connection. The linear bypass connection passes the input values to the output without applying a non-linear function to them.
Type: Grant
Filed: December 28, 2015
Date of Patent: February 11, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: James G. Droppo, Pegah Ghahremani, Michael L. Seltzer
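The bypass idea in this abstract can be sketched in a few lines. The ReLU nonlinearity and the matrix names `W`, `b`, `B` below are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def bypass_layer(x, W, b, B):
    """One layer with a linear bypass: the nonlinear path f(Wx + b)
    is summed with a purely linear path Bx that skips the nonlinearity."""
    nonlinear = np.maximum(0.0, W @ x + b)  # ReLU path (illustrative choice)
    linear = B @ x                          # bypass: no non-linear function applied
    return nonlinear + linear

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
W, b, B = rng.standard_normal((3, 4)), rng.standard_normal(3), rng.standard_normal((3, 4))
y = bypass_layer(x, W, b, B)
print(y.shape)  # (3,)
```

Because the bypass term is purely linear, gradients can flow through it unattenuated, which is the usual motivation for this kind of connection.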
-
Patent number: 9824684
Abstract: A sequence recognition system comprises a prediction component configured to receive a set of observed features from a signal to be recognized and to output a prediction output indicative of a predicted recognition based on the set of observed features. The sequence recognition system also comprises a classification component configured to receive the prediction output and to output a label indicative of recognition of the signal based on the prediction output.
Type: Grant
Filed: December 22, 2014
Date of Patent: November 21, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Dong Yu, Yu Zhang, Michael L. Seltzer, James G. Droppo
-
Patent number: 9786284
Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
Type: Grant
Filed: August 14, 2014
Date of Patent: October 10, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
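One way to picture the feature-estimation step is a mapping learned from paired examples of the two feature types. The patent does not specify the estimator, so the linear least-squares map and the synthetic data below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Paired training data: first-type features (e.g. narrowband) and
# second-type features (e.g. wideband). Shapes: (frames, dims). Synthetic here.
narrow = rng.standard_normal((200, 13))
true_map = rng.standard_normal((13, 20))
wide = narrow @ true_map + 0.01 * rng.standard_normal((200, 20))

# Learn a linear estimator of the second feature type from the first
# (one simple choice; the patent leaves the estimator unspecified).
M, *_ = np.linalg.lstsq(narrow, wide, rcond=None)

def estimate_second_type(first_type_features):
    """Estimate second-type speech features from first-type features."""
    return first_type_features @ M

est = estimate_second_type(narrow)
print(np.mean((est - wide) ** 2))  # small reconstruction error
```

The estimated second-type features would then be handed to the speech recognizer in place of features the remote entity never transmitted.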
-
Patent number: 9779727
Abstract: The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals considering the probability that a specific frame is a switching point of the speech characteristic.
Type: Grant
Filed: December 30, 2016
Date of Patent: October 3, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Dong Yu, Chao Weng, Michael L. Seltzer, James Droppo
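The decoding step can be illustrated with a toy Viterbi search: each frame carries a log-likelihood under the two possible network-to-talker assignments, and changing assignment mid-utterance pays a switching-probability penalty. The setup below is a simplified sketch with synthetic scores, not the patent's decoder:

```python
import numpy as np

def joint_decode(frame_loglik, p_switch=0.1):
    """frame_loglik: (T, 2) log-likelihoods of each mixed frame under the
    two possible network-to-talker assignments. Viterbi chooses a per-frame
    assignment, paying log(p_switch) whenever the assignment changes."""
    T = frame_loglik.shape[0]
    log_sw, log_st = np.log(p_switch), np.log(1.0 - p_switch)
    score = frame_loglik[0].copy()
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        prev, new = score, np.empty(2)
        for s in (0, 1):
            stay = prev[s] + log_st          # keep current assignment
            switch = prev[1 - s] + log_sw    # this frame is a switching point
            back[t, s] = s if stay >= switch else 1 - s
            new[s] = max(stay, switch) + frame_loglik[t, s]
        score = new
    path = [int(np.argmax(score))]           # backtrace the best assignment path
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Synthetic scores: assignment 0 fits the first half, assignment 1 the second.
ll = np.array([[0.0, -5.0]] * 3 + [[-5.0, 0.0]] * 3)
print(joint_decode(ll))  # [0, 0, 0, 1, 1, 1]
```

A lower `p_switch` makes the decoder more reluctant to flip assignments, trading responsiveness for stability.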
-
Publication number: 20170185887
Abstract: A neural network is structured to connect the input values of an input set, at each level, to that level's output using a linear bypass connection. The linear bypass connection passes the input values to the output without applying a non-linear function to them.
Type: Application
Filed: December 28, 2015
Publication date: June 29, 2017
Inventors: James G. Droppo, Pegah Ghahremani, Michael L. Seltzer
-
Publication number: 20170110120
Abstract: The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals considering the probability that a specific frame is a switching point of the speech characteristic.
Type: Application
Filed: December 30, 2016
Publication date: April 20, 2017
Applicant: Microsoft Technology Licensing, LLC
Inventors: Dong Yu, Chao Weng, Michael L. Seltzer, James Droppo
-
Patent number: 9558742
Abstract: The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals considering the probability that a specific frame is a switching point of the speech characteristic.
Type: Grant
Filed: June 8, 2016
Date of Patent: January 31, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Dong Yu, Chao Weng, Michael L. Seltzer, James Droppo
-
Publication number: 20160284348
Abstract: The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals considering the probability that a specific frame is a switching point of the speech characteristic.
Type: Application
Filed: June 8, 2016
Publication date: September 29, 2016
Applicant: Microsoft Technology Licensing, LLC
Inventors: Dong Yu, Chao Weng, Michael L. Seltzer, James Droppo
-
Patent number: 9390712
Abstract: The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals considering the probability that a specific frame is a switching point of the speech characteristic.
Type: Grant
Filed: March 24, 2014
Date of Patent: July 12, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Dong Yu, Chao Weng, Michael L. Seltzer, James Droppo
-
Publication number: 20160140956
Abstract: A sequence recognition system comprises a prediction component configured to receive a set of observed features from a signal to be recognized and to output a prediction output indicative of a predicted recognition based on the set of observed features. The sequence recognition system also comprises a classification component configured to receive the prediction output and to output a label indicative of recognition of the signal based on the prediction output.
Type: Application
Filed: December 22, 2014
Publication date: May 19, 2016
Inventors: Dong Yu, Yu Zhang, Michael L. Seltzer, James G. Droppo
-
Patent number: 9324321
Abstract: The adaptation and personalization of a deep neural network (DNN) model for automatic speech recognition is provided. An utterance which includes speech features for one or more speakers may be received in ASR tasks such as voice search or short message dictation. A decomposition approach may then be applied to an original matrix in the DNN model. In response to applying the decomposition approach, the original matrix may be converted into multiple new matrices which are smaller than the original matrix. A square matrix may then be added to the new matrices. Speaker-specific parameters may then be stored in the square matrix. The DNN model may then be adapted by updating the square matrix. This process may be applied to all of a number of original matrices in the DNN model. The adapted DNN model may include fewer parameters than the original DNN model.
Type: Grant
Filed: March 7, 2014
Date of Patent: April 26, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jian Xue, Jinyu Li, Dong Yu, Michael L. Seltzer, Yifan Gong
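A rough sketch of the decomposition idea, assuming a truncated SVD as the decomposition (the abstract does not name one) and illustrative shapes and names (`k`, `A`, `B`, `S`): the original matrix is split into two smaller factors, a small square matrix initialized to identity is inserted between them, and only that square matrix would be updated per speaker:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((512, 512))          # an original DNN weight matrix

# Decompose W into smaller matrices via truncated SVD.
k = 64                                       # kept rank (illustrative)
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * s[:k]                         # (512, k)
B = Vt[:k, :]                                # (k, 512)

# Insert a k-by-k square matrix, initialized to identity, between the factors.
S = np.eye(k)                                # speaker-specific parameters live here

def adapted_layer(x):
    """Forward pass through the decomposed layer: x -> A @ S @ B @ x."""
    return A @ (S @ (B @ x))

# Adapting to a speaker updates only S: k*k parameters instead of 512*512.
print(S.size, "adapted parameters vs", W.size, "original")
```

With identity `S` the layer reproduces the rank-`k` approximation of the original matrix, so adaptation starts from the speaker-independent behavior.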
-
Publication number: 20150269933
Abstract: The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals considering the probability that a specific frame is a switching point of the speech characteristic.
Type: Application
Filed: March 24, 2014
Publication date: September 24, 2015
Inventors: Dong Yu, Chao Weng, Michael L. Seltzer, James Droppo
-
Publication number: 20150255061
Abstract: The adaptation and personalization of a deep neural network (DNN) model for automatic speech recognition is provided. An utterance which includes speech features for one or more speakers may be received in ASR tasks such as voice search or short message dictation. A decomposition approach may then be applied to an original matrix in the DNN model. In response to applying the decomposition approach, the original matrix may be converted into multiple new matrices which are smaller than the original matrix. A square matrix may then be added to the new matrices. Speaker-specific parameters may then be stored in the square matrix. The DNN model may then be adapted by updating the square matrix. This process may be applied to all of a number of original matrices in the DNN model. The adapted DNN model may include fewer parameters than the original DNN model.
Type: Application
Filed: March 7, 2014
Publication date: September 10, 2015
Applicant: Microsoft Corporation
Inventors: Jian Xue, Jinyu Li, Dong Yu, Michael L. Seltzer, Yifan Gong
-
Publication number: 20140358525
Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
Type: Application
Filed: August 14, 2014
Publication date: December 4, 2014
Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
-
Patent number: 8818797
Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
Type: Grant
Filed: December 23, 2010
Date of Patent: August 26, 2014
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
-
Patent number: 8532985
Abstract: A warped spectral estimate of an original audio signal can be used to encode a representation of a fine estimate of the original signal. The representation of the warped spectral estimate and the representation of the fine estimate can be sent to a speech recognition system. The representation of the warped spectral estimate can be passed to a speech recognition engine, where it may be used for speech recognition. The representation of the warped spectral estimate can also be used along with the representation of the fine estimate to reconstruct a representation of the original audio signal.
Type: Grant
Filed: December 3, 2010
Date of Patent: September 10, 2013
Assignee: Microsoft Corporation
Inventors: Michael L. Seltzer, James G. Droppo, Henrique S. Malvar, Alejandro Acero, Xing Fan
-
Patent number: 8515096
Abstract: The quality of sound recorded from a plurality of people speaking at the same time is improved by incorporating prior knowledge into an independent component analysis (ICA) separating algorithm. More particularly, prior knowledge is defined as a probability distribution according to some prior situation (e.g., prior distribution of people in a room). A mixture of sounds (e.g., mixture of voices) from a plurality of sources (e.g., people) captured by one or more recording devices (e.g., microphones) is separated into individual components (e.g., individual voices from respective people) by applying a maximum a posteriori (MAP) ICA algorithm which incorporates prior knowledge of the respective sources (e.g., location of sources) directly into the MAP ICA algorithm, thereby allowing recovery of independent underlying sounds associated with individual sources from the mixture.
Type: Grant
Filed: June 18, 2008
Date of Patent: August 20, 2013
Assignee: Microsoft Corporation
Inventors: Michael L. Seltzer, Graham Taylor, Alejandro Acero
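As a loose illustration of folding a prior into ICA: the sketch below adds a gradient-of-log-prior term to a standard Infomax-style natural-gradient ICA update. The Gaussian pull toward `W_prior` is a simplified stand-in for the patent's source-location prior, and the mixed signals are synthetic:

```python
import numpy as np

def map_ica(X, W_prior, prior_weight=0.01, lr=0.01, iters=500):
    """Infomax-style ICA with a MAP term: natural-gradient updates of the
    unmixing matrix W, pulled toward W_prior (a stand-in for prior
    knowledge about the sources, e.g. their expected positions)."""
    n, T = X.shape
    W = np.eye(n)
    for _ in range(iters):
        Y = W @ X
        g = np.tanh(Y)                        # score function for super-Gaussian sources
        grad = (np.eye(n) - g @ Y.T / T) @ W  # natural gradient of the log-likelihood
        grad -= prior_weight * (W - W_prior)  # gradient of the log-prior (MAP term)
        W += lr * grad
    return W

# Two synthetic super-Gaussian (Laplacian) sources, linearly mixed.
rng = np.random.default_rng(3)
S = rng.laplace(size=(2, 2000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])
X = A @ S
W = map_ica(X, W_prior=np.eye(2))
recovered = W @ X  # components are far less correlated than the mixtures
```

Without the prior term this reduces to ordinary Infomax ICA; the `prior_weight` knob controls how strongly the prior situation constrains the solution.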
-
Patent number: 8379891
Abstract: Sound signals to be output from a loudspeaker array are modified by a plurality of filters designed according to an unconstrained optimization procedure to improve overall performance (e.g., power, directivity) of the loudspeaker array. More particularly, respective filters are configured to receive a signal to be output to a plurality of loudspeakers. Upon receiving the signal, the respective filters individually modify the received signal according to the results of the unconstrained optimization procedure and then output the individually modified signals to respective loudspeakers. The unconstrained optimization procedure takes into account manufacturing tolerances and individually enhances the signal output to each of a plurality of individual loudspeakers within an array to achieve an overall improvement in performance.
Type: Grant
Filed: June 4, 2008
Date of Patent: February 19, 2013
Assignee: Microsoft Corporation
Inventors: Ivan J. Tashev, James G. Droppo, Michael L. Seltzer, Alejandro Acero
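The per-loudspeaker filter design can be illustrated with an unconstrained least-squares solve at a single frequency: choose loudspeaker weights that minimize the squared error to a desired beam at a set of control points. The free-field geometry, regularizer, and all names below are assumptions for the sketch, not the patent's procedure:

```python
import numpy as np

def design_filters(G, d, reg=1e-3):
    """Unconstrained least-squares design of per-loudspeaker weights w:
    minimize ||G w - d||^2 + reg * ||w||^2, where G[m, s] is the transfer
    function from loudspeaker s to control point m and d is the desired
    response at the control points. Closed-form ridge solution."""
    S = G.shape[1]
    return np.linalg.solve(G.conj().T @ G + reg * np.eye(S), G.conj().T @ d)

# Toy free-field example at one frequency: 8 speakers on a line,
# control points over a semicircle, beam aimed broadside (90 degrees).
speakers = np.linspace(-0.5, 0.5, 8)             # speaker x positions (m)
angles = np.linspace(0, np.pi, 19)               # control-point directions
k = 2 * np.pi * 1000 / 343                       # wavenumber at 1 kHz
# Far-field steering: phase depends on speaker position and direction.
G = np.exp(1j * k * np.outer(np.cos(angles), speakers))
d = np.exp(-((angles - np.pi / 2) ** 2) / 0.1)   # desired beam at 90 degrees
w = design_filters(G, d)
response = np.abs(G @ w)                         # strongest near 90 degrees
```

Because the problem is unconstrained, perturbing `G` to model manufacturing tolerances changes only the data of the same closed-form solve, which is part of the appeal of this formulation.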
-
Publication number: 20120166186
Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
Type: Application
Filed: December 23, 2010
Publication date: June 28, 2012
Applicant: Microsoft Corporation
Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
-
Publication number: 20120143599
Abstract: A warped spectral estimate of an original audio signal can be used to encode a representation of a fine estimate of the original signal. The representation of the warped spectral estimate and the representation of the fine estimate can be sent to a speech recognition system. The representation of the warped spectral estimate can be passed to a speech recognition engine, where it may be used for speech recognition. The representation of the warped spectral estimate can also be used along with the representation of the fine estimate to reconstruct a representation of the original audio signal.
Type: Application
Filed: December 3, 2010
Publication date: June 7, 2012
Applicant: Microsoft Corporation
Inventors: Michael L. Seltzer, James G. Droppo, Henrique S. Malvar, Alejandro Acero, Xing Fan