Patents by Inventor Brian E. D. Kingsbury

Brian E. D. Kingsbury has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10360901
    Abstract: Techniques for learning front-end speech recognition parameters as part of training a neural network classifier include obtaining an input speech signal, and applying front-end speech recognition parameters to extract features from the input speech signal. The extracted features may be fed through a neural network to obtain an output classification for the input speech signal, and an error measure may be computed for the output classification through comparison of the output classification with a known target classification. Back propagation may be applied to adjust one or more of the front-end parameters as one or more layers of the neural network, based on the error measure.
    Type: Grant
    Filed: December 5, 2014
    Date of Patent: July 23, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Tara N. Sainath, Brian E. D. Kingsbury, Abdel-rahman Mohamed, Bhuvana Ramabhadran
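The abstract above treats the acoustic front-end as extra trainable layers of the classifier network, so backpropagation updates the front-end parameters alongside the classifier. A minimal numpy sketch of that idea, using toy data and a linear filterbank plus log compression as the "front-end" (all names, dimensions, and learning rates here are illustrative assumptions, not taken from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "power spectra" and class targets standing in for speech frames.
X = rng.random((20, 16)) + 0.1           # 20 frames, 16 spectral bins
y = rng.integers(0, 3, size=20)          # 3 classes

# Front-end parameters (a filterbank) treated as the first network layer,
# followed by log compression and a softmax classifier layer W.
F = 0.5 + 0.1 * rng.normal(size=(16, 8))  # 16 bins -> 8 filter outputs
W = 0.1 * rng.normal(size=(8, 3))

def forward(X, F, W):
    pre = np.maximum(X @ F, 1e-8)        # filterbank outputs, floored
    feats = np.log(pre)                  # log compression
    logits = feats @ W
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return pre, feats, e / e.sum(axis=1, keepdims=True)

def xent(p, y):
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

lr = 0.05
_, _, p = forward(X, F, W)
loss_before = xent(p, y)
for _ in range(100):
    pre, feats, p = forward(X, F, W)
    g = p.copy()
    g[np.arange(len(y)), y] -= 1.0       # softmax + cross-entropy gradient
    g /= len(y)
    dW = feats.T @ g                     # gradient for the classifier layer
    dfeats = g @ W.T
    mask = (X @ F) > 1e-8                # no gradient through the floor
    dF = X.T @ (dfeats * mask / pre)     # backprop through log into front-end
    W -= lr * dW
    F -= lr * dF                         # front-end updated like any layer
_, _, p = forward(X, F, W)
loss_after = xent(p, y)
```

The point of the sketch is only that `F`, the front-end, receives gradients through the same backward pass as `W`, so the error measure drives both.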
  • Patent number: 10056075
    Abstract: A method for training a deep neural network, comprises receiving and formatting speech data for the training, preconditioning a system of equations to be used for analyzing the speech data in connection with the training by using a non-fixed point quasi-Newton preconditioning scheme, and employing flexible Krylov subspace solvers in response to variations in the preconditioning scheme for different iterations of the training.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: August 21, 2018
    Assignee: International Business Machines Corporation
    Inventors: Lior Horesh, Brian E. D. Kingsbury, Tara N. Sainath
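A toy illustration of the flexible-Krylov half of this method: flexible GMRES (FGMRES) tolerates a preconditioner that changes from one iteration to the next, which is exactly what an iteration-dependent (non-fixed) quasi-Newton preconditioning scheme requires. This sketch is generic numerical linear algebra on a small random system, not the patented training procedure itself:

```python
import numpy as np

def fgmres(A, b, precond, m=30):
    """Flexible GMRES: stores the preconditioned directions Z so the
    preconditioner is free to differ at every iteration."""
    n = len(b)
    V = np.zeros((n, m + 1))
    Z = np.zeros((n, m))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    k = m
    for j in range(m):
        Z[:, j] = precond(V[:, j], j)    # may change with iteration j
        w = A @ Z[:, j]
        for i in range(j + 1):           # modified Gram-Schmidt
            H[i, j] = w @ V[:, i]
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:          # happy breakdown
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(k + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
    return Z[:, :k] @ y

rng = np.random.default_rng(0)
n = 12
Q = rng.normal(size=(n, n))
A = Q @ Q.T + n * np.eye(n)              # symmetric positive definite
b = rng.normal(size=n)
d = np.diag(A)

# The preconditioner varies with the iteration (Jacobi on even steps,
# identity on odd), which plain GMRES does not permit.
x = fgmres(A, b, lambda v, j: v / d if j % 2 == 0 else v, m=n)
```

In the patent's setting the preconditioner would come from a quasi-Newton approximation that is updated as training proceeds; the flexible solver is what keeps that legal.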
  • Patent number: 9824683
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: November 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
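The core conversion step can be illustrated with a deliberately simplified stand-in: here the mapping between source-speaker and target-speaker features is a single affine transform estimated by least squares over parallel frames of the same transcript, whereas the patent estimates the mapping against a trained speaker-dependent acoustic model. All data below is synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)

# Parallel feature sequences for the same transcript: source and target
# speaker, 200 frames of 13-dim features (e.g. cepstra) each.
src = rng.normal(size=(200, 13))
A_true = 0.3 * rng.normal(size=(13, 13)) + np.eye(13)
b_true = rng.normal(size=13)
tgt = src @ A_true + b_true + 0.01 * rng.normal(size=(200, 13))

# Estimate an affine mapping src -> tgt by least squares.
X = np.hstack([src, np.ones((len(src), 1))])
M, *_ = np.linalg.lstsq(X, tgt, rcond=None)

def map_to_target(frames):
    """Apply the estimated mapping to augment training data: the source
    utterance is 'respoken' with the target speaker's characteristics."""
    return np.hstack([frames, np.ones((len(frames), 1))]) @ M

aug = map_to_target(src)
```

Repeating this for multiple selected target speakers multiplies the effective training set, which is the augmentation the abstract describes.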
  • Patent number: 9734823
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: August 15, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
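A much-simplified sketch of the OOV handling: a phone-level query is mapped to its nearest in-vocabulary words by phone edit distance, then looked up in a toy confusion-network index (one posterior-weighted word bin per slot), so the same index serves both IV and OOV queries. The lexicon, index contents, and scoring are invented for illustration:

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    d = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, pb in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                   prev + (pa != pb))
    return d[len(b)]

# Pronunciation lexicon: in-vocabulary words and their phone strings.
lexicon = {"cat": "k ae t", "cap": "k ae p", "card": "k aa r d"}

def oov_to_words(phone_query, n_best=1):
    """Convert a phone-level OOV query to its closest in-vocabulary words."""
    q = phone_query.split()
    ranked = sorted(lexicon,
                    key=lambda w: edit_distance(q, lexicon[w].split()))
    return ranked[:n_best]

# Toy confusion-network KWS index: one posterior-weighted bin per slot.
cn_index = [{"the": 0.9, "a": 0.1},
            {"cat": 0.7, "cap": 0.3},
            {"sat": 1.0}]

def search(word):
    """Return (slot, posterior) hits for a word query against the index."""
    return [(slot, bins[word]) for slot, bins in enumerate(cn_index)
            if word in bins]

# An OOV query given as phones is converted to words, then searched.
hits = [h for w in oov_to_words("k ae t") for h in search(w)]
```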
  • Patent number: 9721559
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Grant
    Filed: April 17, 2015
    Date of Patent: August 1, 2017
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
  • Publication number: 20170200446
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Application
    Filed: December 22, 2015
    Publication date: July 13, 2017
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
  • Patent number: 9704482
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: July 11, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
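A stripped-down stand-in for the indexing idea: instead of a word-loop weighted finite state transducer per utterance, this sketch builds a plain inverted index over the time-marked word list and answers multi-word queries by chaining hits that are adjacent in time. The utterance data and the gap threshold are invented for illustration:

```python
from collections import defaultdict

# Time-marked word list, as produced by an ASR system:
# (utterance_id, word, start_sec, end_sec)
tmwl = [
    ("utt1", "hello", 0.0, 0.4), ("utt1", "world", 0.45, 0.9),
    ("utt2", "world", 0.1, 0.5), ("utt2", "peace", 0.55, 1.0),
]

# Inverted index: word -> list of (utterance, start, end) occurrences.
index = defaultdict(list)
for utt, word, s, e in tmwl:
    index[word].append((utt, s, e))

def search(query, max_gap=0.5):
    """Find a multi-word keyword by chaining same-utterance hits whose
    start times follow the previous word's end time closely enough."""
    words = query.split()
    hits = list(index[words[0]])
    for w in words[1:]:
        nxt = []
        for (u, s, e) in hits:
            for (u2, s2, e2) in index[w]:
                if u2 == u and 0 <= s2 - e <= max_gap:
                    nxt.append((u, s, e2))
        hits = nxt
    return hits
```

The word-loop WFST of the patent serves the same purpose more generally: it admits any word sequence over an utterance, so a keyword query composed against it yields all time-localized hits.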
  • Patent number: 9697830
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: July 4, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
  • Patent number: 9640186
    Abstract: Deep scattering spectral features are extracted from an acoustic input signal to generate a deep scattering spectral feature representation of the acoustic input signal. The deep scattering spectral feature representation is input to a speech recognition engine. The acoustic input signal is decoded based on at least a portion of the deep scattering spectral feature representation input to a speech recognition engine.
    Type: Grant
    Filed: May 2, 2014
    Date of Patent: May 2, 2017
    Assignee: International Business Machines Corporation
    Inventors: Petr Fousek, Vaibhava Goel, Brian E. D. Kingsbury, Etienne Marcheret, Shay Maymon, David Nahamoo, Vijayaditya Peddinti, Bhuvana Ramabhadran, Tara N. Sainath
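The scattering transform underlying these features can be sketched in a few lines: bandpass filtering with complex (Gabor) filters, taking the modulus, low-pass averaging, then one more round of the same on the modulus signals for the second order. This is a generic toy scattering transform with assumed filter parameters, not the patented pipeline:

```python
import numpy as np

def gabor(n, freq, sigma):
    """Complex Gabor filter: Gaussian envelope times a complex tone."""
    t = np.arange(n) - n // 2
    return np.exp(-t**2 / (2 * sigma**2)) * np.exp(2j * np.pi * freq * t)

def cconv(a, h):
    """Circular convolution via FFT, with h centered at index n//2."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(np.fft.ifftshift(h)))

def scatter(x, freqs=(0.05, 0.1, 0.2), sigma=16.0):
    """First- and second-order scattering coefficients of a 1-D signal:
    bandpass, modulus, low-pass; then repeat once on the modulus."""
    n = len(x)
    t = np.arange(n) - n // 2
    phi = np.exp(-t**2 / (2 * (4 * sigma) ** 2))   # low-pass window
    phi /= phi.sum()
    psis = [gabor(n, f, sigma) for f in freqs]
    U1 = [np.abs(cconv(x, psi)) for psi in psis]   # first-order modulus
    S1 = [cconv(u, phi).real for u in U1]          # first-order scattering
    S2 = [cconv(np.abs(cconv(U1[i], psis[j])), phi).real
          for i in range(len(psis)) for j in range(i + 1, len(psis))]
    return np.array(S1), np.array(S2)

x = np.cos(2 * np.pi * 0.1 * np.arange(512))
S1, S2 = scatter(x)
```

The second-order coefficients recover modulation detail that the first-order averaging discards, which is what makes the representation richer than a plain filterbank.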
  • Publication number: 20170092263
    Abstract: A method for training a deep neural network, comprises receiving and formatting speech data for the training, preconditioning a system of equations to be used for analyzing the speech data in connection with the training by using a non-fixed point quasi-Newton preconditioning scheme, and employing flexible Krylov subspace solvers in response to variations in the preconditioning scheme for different iterations of the training.
    Type: Application
    Filed: December 9, 2016
    Publication date: March 30, 2017
    Inventors: Lior Horesh, Brian E. D. Kingsbury, Tara N. Sainath
  • Patent number: 9601109
    Abstract: A method for training a deep neural network, comprises receiving and formatting speech data for the training, preconditioning a system of equations to be used for analyzing the speech data in connection with the training by using a non-fixed point quasi-Newton preconditioning scheme, and employing flexible Krylov subspace solvers in response to variations in the preconditioning scheme for different iterations of the training.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Lior Horesh, Brian E. D. Kingsbury, Tara N. Sainath
  • Publication number: 20170040016
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Application
    Filed: April 17, 2015
    Publication date: February 9, 2017
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
  • Patent number: 9477753
    Abstract: Systems and methods for processing a query include determining a plurality of sets of match candidates for a query using a processor, each of the plurality of sets of match candidates being independently determined from a plurality of diverse word lattice generation components of different type. The plurality of sets of match candidates is merged by generating a first score for each match candidate to provide a merged set of match candidates. A second score is computed for each match candidate of the merged set based upon features of that match candidate. The first score and the second score are combined to provide a final set of match candidates as matches to the query.
    Type: Grant
    Filed: March 12, 2013
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Jeff Kuo, Lidia Luminita Mangu, Hagen Soltau
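A toy rendering of the two-stage scoring: candidates from two hypothetical "diverse" systems are merged with a normalized-score sum (the first score), rescored from simple features of each merged candidate through a sigmoid (the second score), and combined linearly. The candidate strings, features, and weights are arbitrary illustrations:

```python
import math

def merge(cand_lists):
    """First score: sum each system's normalized score per candidate, so
    candidates proposed by several diverse systems are rewarded."""
    merged = {}
    for cands in cand_lists:
        total = sum(s for _, s in cands) or 1.0
        for cand, s in cands:
            merged[cand] = merged.get(cand, 0.0) + s / total
    return merged

def second_score(cand, merged):
    """Second score from features of the merged candidate itself (here:
    word count and merged support, through a sigmoid); illustrative only."""
    z = 0.5 * len(cand.split()) + 1.0 * merged[cand] - 1.5
    return 1.0 / (1.0 + math.exp(-z))

# Match candidates from two diverse lattice-generation components.
sys_a = [("call home", 0.6), ("call hope", 0.4)]
sys_b = [("call home", 0.5), ("cold home", 0.5)]

merged = merge([sys_a, sys_b])
alpha = 0.7
final = {c: alpha * merged[c] + (1 - alpha) * second_score(c, merged)
         for c in merged}
best = max(final, key=final.get)
```

The candidate backed by both systems wins, which is the intended effect of combining the merge-based first score with the feature-based second score.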
  • Publication number: 20160267906
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Application
    Filed: March 11, 2015
    Publication date: September 15, 2016
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
  • Publication number: 20160267907
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Application
    Filed: June 25, 2015
    Publication date: September 15, 2016
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
  • Patent number: 9390370
    Abstract: A method for training a neural network includes receiving labeled training data at a master node, generating, by the master node, partitioned training data from the labeled training data and a held-out set of the labeled training data, determining a plurality of gradients for the partitioned training data, wherein the determination of the gradients is distributed across a plurality of worker nodes, determining a plurality of curvature matrix-vector products over the plurality of samples of the partitioned training data, wherein the determination of the plurality of curvature matrix-vector products is distributed across the plurality of worker nodes, and determining, by the master node, a second-order optimization of the plurality of gradients and the plurality of curvature matrix-vector products, producing a trained neural network configured to perform a structured classification task using a sequence-discriminative criterion.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: July 12, 2016
    Assignee: International Business Machines Corporation
    Inventor: Brian E. D. Kingsbury
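The master/worker division of labor can be sketched with the workers simulated serially: each data shard contributes a partial gradient and a partial Gauss-Newton (curvature) matrix-vector product, and the master combines them inside a conjugate-gradient solve, a standard Hessian-free second-order step. Linear least squares stands in for the neural network, and the sequence-discriminative criterion is omitted; everything here is a toy:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=120)

# Master partitions the training data; each shard plays one worker.
shards = np.array_split(np.arange(120), 4)

def grad(w):
    """Gradient summed over workers (simulated serially here)."""
    return sum(X[s].T @ (X[s] @ w - y[s]) for s in shards) / len(y)

def curv_vec(v, lam=1e-3):
    """Damped Gauss-Newton matrix-vector product, summed over workers.
    Only matrix-vector products are ever formed, never the full matrix."""
    return sum(X[s].T @ (X[s] @ v) for s in shards) / len(y) + lam * v

def cg(matvec, b, iters=25):
    """Conjugate gradient: the master's second-order solve."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        a = rs / (p @ Ap)
        x += a * p
        r -= a * Ap
        rs_new = r @ r
        if rs_new < 1e-16:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

w = np.zeros(10)
loss_before = np.mean((X @ w - y) ** 2)
for _ in range(3):
    w += cg(curv_vec, -grad(w))          # Newton-like step from CG
loss_after = np.mean((X @ w - y) ** 2)
```

For a real network the curvature product would be computed by an extra forward/backward pass per shard, but the aggregation pattern is the same.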
  • Patent number: 9262724
    Abstract: Systems and methods for reducing a number of training parameters in a deep belief network (DBN) are provided. A method for reducing a number of training parameters in a deep belief network (DBN) comprises determining a network architecture including a plurality of layers, using matrix factorization to represent a weight matrix of a final layer of the plurality of layers as a plurality of matrices, and training the DBN having the plurality of matrices.
    Type: Grant
    Filed: June 26, 2013
    Date of Patent: February 16, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Tara N. Sainath, Vikas Sindhwani
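The parameter saving from factorizing the final layer is easy to quantify. This sketch approximates a "trained" weight matrix by a rank-r product via truncated SVD purely to illustrate the size/accuracy trade-off; the patent instead trains the two factor matrices directly as part of the DBN. Dimensions are toy values:

```python
import numpy as np

rng = np.random.default_rng(3)
h, c, r = 64, 100, 8                     # hidden units, outputs, rank

# Stand-in for trained final-layer weights: low intrinsic rank plus noise.
W = rng.normal(size=(h, r)) @ rng.normal(size=(r, c)) \
    + 0.01 * rng.normal(size=(h, c))

# Represent W as a product of two thin matrices A (h x r) and B (r x c).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]
B = Vt[:r]

params_full = h * c                      # one dense final layer
params_fact = h * r + r * c              # the factored replacement
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

When the final layer is large (e.g. thousands of context-dependent output states), `h*r + r*c` is far smaller than `h*c`, which is the reduction in training parameters the abstract claims.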
  • Publication number: 20160005398
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Application
    Filed: August 27, 2015
    Publication date: January 7, 2016
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Patent number: 9196243
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: November 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Publication number: 20150317990
    Abstract: Deep scattering spectral features are extracted from an acoustic input signal to generate a deep scattering spectral feature representation of the acoustic input signal. The deep scattering spectral feature representation is input to a speech recognition engine. The acoustic input signal is decoded based on at least a portion of the deep scattering spectral feature representation input to a speech recognition engine.
    Type: Application
    Filed: May 2, 2014
    Publication date: November 5, 2015
    Inventors: Petr Fousek, Vaibhava Goel, Brian E. D. Kingsbury, Etienne Marcheret, Shay Maymon, David Nahamoo, Tara N. Sainath, Bhuvana Ramabhadran