Patents by Inventor Brian E. D. Kingsbury

Brian E. D. Kingsbury has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10360901
    Abstract: Techniques for learning front-end speech recognition parameters as part of training a neural network classifier include obtaining an input speech signal, and applying front-end speech recognition parameters to extract features from the input speech signal. The extracted features may be fed through a neural network to obtain an output classification for the input speech signal, and an error measure may be computed for the output classification through comparison of the output classification with a known target classification. Back propagation may be applied to adjust one or more of the front-end parameters as one or more layers of the neural network, based on the error measure.
    Type: Grant
    Filed: December 5, 2014
    Date of Patent: July 23, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Tara N. Sainath, Brian E. D. Kingsbury, Abdel-rahman Mohamed, Bhuvana Ramabhadran
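The abstract above treats the acoustic front-end as extra trainable layers of the classifier network, so backpropagation updates the front-end parameters alongside the classifier. A minimal numpy sketch of that idea, using toy data and a linear filterbank plus log compression as the "front-end" (all names, dimensions, and learning rates here are illustrative assumptions, not taken from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "power spectra" and class targets standing in for speech frames.
X = rng.random((20, 16)) + 0.1           # 20 frames, 16 spectral bins
y = rng.integers(0, 3, size=20)          # 3 classes

# Front-end parameters (a filterbank) treated as the first network layer,
# followed by log compression and a softmax classifier layer W.
F = 0.5 + 0.1 * rng.normal(size=(16, 8))  # 16 bins -> 8 filter outputs
W = 0.1 * rng.normal(size=(8, 3))

def forward(X, F, W):
    pre = np.maximum(X @ F, 1e-8)        # filterbank outputs, floored
    feats = np.log(pre)                  # log compression
    logits = feats @ W
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return pre, feats, e / e.sum(axis=1, keepdims=True)

def xent(p, y):
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

lr = 0.05
_, _, p = forward(X, F, W)
loss_before = xent(p, y)
for _ in range(100):
    pre, feats, p = forward(X, F, W)
    g = p.copy()
    g[np.arange(len(y)), y] -= 1.0       # softmax + cross-entropy gradient
    g /= len(y)
    dW = feats.T @ g                     # gradient for the classifier layer
    dfeats = g @ W.T
    mask = (X @ F) > 1e-8                # no gradient through the floor
    dF = X.T @ (dfeats * mask / pre)     # backprop through log into front-end
    W -= lr * dW
    F -= lr * dF                         # front-end updated like any layer
_, _, p = forward(X, F, W)
loss_after = xent(p, y)
```

The point of the sketch is only that `F`, the front-end, receives gradients through the same backward pass as `W`, so the error measure drives both.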
  • Patent number: 10056075
    Abstract: A method for training a deep neural network, comprises receiving and formatting speech data for the training, preconditioning a system of equations to be used for analyzing the speech data in connection with the training by using a non-fixed point quasi-Newton preconditioning scheme, and employing flexible Krylov subspace solvers in response to variations in the preconditioning scheme for different iterations of the training.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: August 21, 2018
    Assignee: International Business Machines Corporation
    Inventors: Lior Horesh, Brian E. D. Kingsbury, Tara N. Sainath
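A toy illustration of the flexible-Krylov half of this method: flexible GMRES (FGMRES) tolerates a preconditioner that changes from one iteration to the next, which is exactly what an iteration-dependent (non-fixed) quasi-Newton preconditioning scheme requires. This sketch is generic numerical linear algebra on a small random system, not the patented training procedure itself:

```python
import numpy as np

def fgmres(A, b, precond, m=30):
    """Flexible GMRES: stores the preconditioned directions Z so the
    preconditioner is free to differ at every iteration."""
    n = len(b)
    V = np.zeros((n, m + 1))
    Z = np.zeros((n, m))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    k = m
    for j in range(m):
        Z[:, j] = precond(V[:, j], j)    # may change with iteration j
        w = A @ Z[:, j]
        for i in range(j + 1):           # modified Gram-Schmidt
            H[i, j] = w @ V[:, i]
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:          # happy breakdown
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(k + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
    return Z[:, :k] @ y

rng = np.random.default_rng(0)
n = 12
Q = rng.normal(size=(n, n))
A = Q @ Q.T + n * np.eye(n)              # symmetric positive definite
b = rng.normal(size=n)
d = np.diag(A)

# The preconditioner varies with the iteration (Jacobi on even steps,
# identity on odd), which plain GMRES does not permit.
x = fgmres(A, b, lambda v, j: v / d if j % 2 == 0 else v, m=n)
```

In the patent's setting the preconditioner would come from a quasi-Newton approximation that is updated as training proceeds; the flexible solver is what keeps that legal.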
  • Patent number: 9824683
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: November 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
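The core conversion step can be illustrated with a deliberately simplified stand-in: here the mapping between source-speaker and target-speaker features is a single affine transform estimated by least squares over parallel frames of the same transcript, whereas the patent estimates the mapping against a trained speaker-dependent acoustic model. All data below is synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)

# Parallel feature sequences for the same transcript: source and target
# speaker, 200 frames of 13-dim features (e.g. cepstra) each.
src = rng.normal(size=(200, 13))
A_true = 0.3 * rng.normal(size=(13, 13)) + np.eye(13)
b_true = rng.normal(size=13)
tgt = src @ A_true + b_true + 0.01 * rng.normal(size=(200, 13))

# Estimate an affine mapping src -> tgt by least squares.
X = np.hstack([src, np.ones((len(src), 1))])
M, *_ = np.linalg.lstsq(X, tgt, rcond=None)

def map_to_target(frames):
    """Apply the estimated mapping to augment training data: the source
    utterance is 'respoken' with the target speaker's characteristics."""
    return np.hstack([frames, np.ones((len(frames), 1))]) @ M

aug = map_to_target(src)
```

Repeating this for multiple selected target speakers multiplies the effective training set, which is the augmentation the abstract describes.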
  • Patent number: 9734823
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: August 15, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
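A much-simplified sketch of the OOV handling: a phone-level query is mapped to its nearest in-vocabulary words by phone edit distance, then looked up in a toy confusion-network index (one posterior-weighted word bin per slot), so the same index serves both IV and OOV queries. The lexicon, index contents, and scoring are invented for illustration:

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    d = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, pb in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                   prev + (pa != pb))
    return d[len(b)]

# Pronunciation lexicon: in-vocabulary words and their phone strings.
lexicon = {"cat": "k ae t", "cap": "k ae p", "card": "k aa r d"}

def oov_to_words(phone_query, n_best=1):
    """Convert a phone-level OOV query to its closest in-vocabulary words."""
    q = phone_query.split()
    ranked = sorted(lexicon,
                    key=lambda w: edit_distance(q, lexicon[w].split()))
    return ranked[:n_best]

# Toy confusion-network KWS index: one posterior-weighted bin per slot.
cn_index = [{"the": 0.9, "a": 0.1},
            {"cat": 0.7, "cap": 0.3},
            {"sat": 1.0}]

def search(word):
    """Return (slot, posterior) hits for a word query against the index."""
    return [(slot, bins[word]) for slot, bins in enumerate(cn_index)
            if word in bins]

# An OOV query given as phones is converted to words, then searched.
hits = [h for w in oov_to_words("k ae t") for h in search(w)]
```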
  • Patent number: 9721559
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Grant
    Filed: April 17, 2015
    Date of Patent: August 1, 2017
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
  • Publication number: 20170200446
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Application
    Filed: December 22, 2015
    Publication date: July 13, 2017
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
  • Patent number: 9704482
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: July 11, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
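A stripped-down stand-in for the indexing idea: instead of a word-loop weighted finite state transducer per utterance, this sketch builds a plain inverted index over the time-marked word list and answers multi-word queries by chaining hits that are adjacent in time. The utterance data and the gap threshold are invented for illustration:

```python
from collections import defaultdict

# Time-marked word list, as produced by an ASR system:
# (utterance_id, word, start_sec, end_sec)
tmwl = [
    ("utt1", "hello", 0.0, 0.4), ("utt1", "world", 0.45, 0.9),
    ("utt2", "world", 0.1, 0.5), ("utt2", "peace", 0.55, 1.0),
]

# Inverted index: word -> list of (utterance, start, end) occurrences.
index = defaultdict(list)
for utt, word, s, e in tmwl:
    index[word].append((utt, s, e))

def search(query, max_gap=0.5):
    """Find a multi-word keyword by chaining same-utterance hits whose
    start times follow the previous word's end time closely enough."""
    words = query.split()
    hits = list(index[words[0]])
    for w in words[1:]:
        nxt = []
        for (u, s, e) in hits:
            for (u2, s2, e2) in index[w]:
                if u2 == u and 0 <= s2 - e <= max_gap:
                    nxt.append((u, s, e2))
        hits = nxt
    return hits
```

The word-loop WFST of the patent serves the same purpose more generally: it admits any word sequence over an utterance, so a keyword query composed against it yields all time-localized hits.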
  • Patent number: 9697830
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: July 4, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
  • Patent number: 9640186
    Abstract: Deep scattering spectral features are extracted from an acoustic input signal to generate a deep scattering spectral feature representation of the acoustic input signal. The deep scattering spectral feature representation is input to a speech recognition engine. The acoustic input signal is decoded based on at least a portion of the deep scattering spectral feature representation input to a speech recognition engine.
    Type: Grant
    Filed: May 2, 2014
    Date of Patent: May 2, 2017
    Assignee: International Business Machines Corporation
    Inventors: Petr Fousek, Vaibhava Goel, Brian E. D. Kingsbury, Etienne Marcheret, Shay Maymon, David Nahamoo, Vijayaditya Peddinti, Bhuvana Ramabhadran, Tara N. Sainath
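The scattering transform underlying these features can be sketched in a few lines: bandpass filtering with complex (Gabor) filters, taking the modulus, low-pass averaging, then one more round of the same on the modulus signals for the second order. This is a generic toy scattering transform with assumed filter parameters, not the patented pipeline:

```python
import numpy as np

def gabor(n, freq, sigma):
    """Complex Gabor filter: Gaussian envelope times a complex tone."""
    t = np.arange(n) - n // 2
    return np.exp(-t**2 / (2 * sigma**2)) * np.exp(2j * np.pi * freq * t)

def cconv(a, h):
    """Circular convolution via FFT, with h centered at index n//2."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(np.fft.ifftshift(h)))

def scatter(x, freqs=(0.05, 0.1, 0.2), sigma=16.0):
    """First- and second-order scattering coefficients of a 1-D signal:
    bandpass, modulus, low-pass; then repeat once on the modulus."""
    n = len(x)
    t = np.arange(n) - n // 2
    phi = np.exp(-t**2 / (2 * (4 * sigma) ** 2))   # low-pass window
    phi /= phi.sum()
    psis = [gabor(n, f, sigma) for f in freqs]
    U1 = [np.abs(cconv(x, psi)) for psi in psis]   # first-order modulus
    S1 = [cconv(u, phi).real for u in U1]          # first-order scattering
    S2 = [cconv(np.abs(cconv(U1[i], psis[j])), phi).real
          for i in range(len(psis)) for j in range(i + 1, len(psis))]
    return np.array(S1), np.array(S2)

x = np.cos(2 * np.pi * 0.1 * np.arange(512))
S1, S2 = scatter(x)
```

The second-order coefficients recover modulation detail that the first-order averaging discards, which is what makes the representation richer than a plain filterbank.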
  • Publication number: 20170092263
    Abstract: A method for training a deep neural network, comprises receiving and formatting speech data for the training, preconditioning a system of equations to be used for analyzing the speech data in connection with the training by using a non-fixed point quasi-Newton preconditioning scheme, and employing flexible Krylov subspace solvers in response to variations in the preconditioning scheme for different iterations of the training.
    Type: Application
    Filed: December 9, 2016
    Publication date: March 30, 2017
    Inventors: Lior Horesh, Brian E. D. Kingsbury, Tara N. Sainath
  • Patent number: 9601109
    Abstract: A method for training a deep neural network, comprises receiving and formatting speech data for the training, preconditioning a system of equations to be used for analyzing the speech data in connection with the training by using a non-fixed point quasi-Newton preconditioning scheme, and employing flexible Krylov subspace solvers in response to variations in the preconditioning scheme for different iterations of the training.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Lior Horesh, Brian E. D. Kingsbury, Tara N. Sainath
  • Publication number: 20170040016
    Abstract: A method of augmenting training data includes converting a feature sequence of a source speaker determined from a plurality of utterances within a transcript to a feature sequence of a target speaker under the same transcript, training a speaker-dependent acoustic model for the target speaker for corresponding speaker-specific acoustic characteristics, estimating a mapping function between the feature sequence of the source speaker and the speaker-dependent acoustic model of the target speaker, and mapping each utterance from each speaker in a training set using the mapping function to multiple selected target speakers in the training set.
    Type: Application
    Filed: April 17, 2015
    Publication date: February 9, 2017
    Inventors: Xiaodong Cui, Vaibhava Goel, Brian E. D. Kingsbury
  • Patent number: 9477753
    Abstract: Systems and methods for processing a query include determining a plurality of sets of match candidates for a query using a processor, each of the plurality of sets of match candidates being independently determined from a plurality of diverse word lattice generation components of different type. The plurality of sets of match candidates is merged by generating a first score for each match candidate to provide a merged set of match candidates. A second score is computed for each match candidate of the merged set based upon features of that match candidate. The first score and the second score are combined to provide a final set of match candidates as matches to the query.
    Type: Grant
    Filed: March 12, 2013
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Jeff Kuo, Lidia Luminita Mangu, Hagen Soltau
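A toy rendering of the two-stage scoring: candidates from two hypothetical "diverse" systems are merged with a normalized-score sum (the first score), rescored from simple features of each merged candidate through a sigmoid (the second score), and combined linearly. The candidate strings, features, and weights are arbitrary illustrations:

```python
import math

def merge(cand_lists):
    """First score: sum each system's normalized score per candidate, so
    candidates proposed by several diverse systems are rewarded."""
    merged = {}
    for cands in cand_lists:
        total = sum(s for _, s in cands) or 1.0
        for cand, s in cands:
            merged[cand] = merged.get(cand, 0.0) + s / total
    return merged

def second_score(cand, merged):
    """Second score from features of the merged candidate itself (here:
    word count and merged support, through a sigmoid); illustrative only."""
    z = 0.5 * len(cand.split()) + 1.0 * merged[cand] - 1.5
    return 1.0 / (1.0 + math.exp(-z))

# Match candidates from two diverse lattice-generation components.
sys_a = [("call home", 0.6), ("call hope", 0.4)]
sys_b = [("call home", 0.5), ("cold home", 0.5)]

merged = merge([sys_a, sys_b])
alpha = 0.7
final = {c: alpha * merged[c] + (1 - alpha) * second_score(c, merged)
         for c in merged}
best = max(final, key=final.get)
```

The candidate backed by both systems wins, which is the intended effect of combining the merge-based first score with the feature-based second score.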
  • Publication number: 20160267906
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Application
    Filed: March 11, 2015
    Publication date: September 15, 2016
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
  • Publication number: 20160267907
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Application
    Filed: June 25, 2015
    Publication date: September 15, 2016
    Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
  • Patent number: 9390370
    Abstract: A method for training a neural network includes receiving labeled training data at a master node, generating, by the master node, partitioned training data from the labeled training data and a held-out set of the labeled training data, determining a plurality of gradients for the partitioned training data, wherein the determination of the gradients is distributed across a plurality of worker nodes, determining a plurality of curvature matrix-vector products over the plurality of samples of the partitioned training data, wherein the determination of the plurality of curvature matrix-vector products is distributed across the plurality of worker nodes, and determining, by the master node, a second-order optimization of the plurality of gradients and the plurality of curvature matrix-vector products, producing a trained neural network configured to perform a structured classification task using a sequence-discriminative criterion.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: July 12, 2016
    Assignee: International Business Machines Corporation
    Inventor: Brian E. D. Kingsbury
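The master/worker division of labor can be sketched with the workers simulated serially: each data shard contributes a partial gradient and a partial Gauss-Newton (curvature) matrix-vector product, and the master combines them inside a conjugate-gradient solve, a standard Hessian-free second-order step. Linear least squares stands in for the neural network, and the sequence-discriminative criterion is omitted; everything here is a toy:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=120)

# Master partitions the training data; each shard plays one worker.
shards = np.array_split(np.arange(120), 4)

def grad(w):
    """Gradient summed over workers (simulated serially here)."""
    return sum(X[s].T @ (X[s] @ w - y[s]) for s in shards) / len(y)

def curv_vec(v, lam=1e-3):
    """Damped Gauss-Newton matrix-vector product, summed over workers.
    Only matrix-vector products are ever formed, never the full matrix."""
    return sum(X[s].T @ (X[s] @ v) for s in shards) / len(y) + lam * v

def cg(matvec, b, iters=25):
    """Conjugate gradient: the master's second-order solve."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        a = rs / (p @ Ap)
        x += a * p
        r -= a * Ap
        rs_new = r @ r
        if rs_new < 1e-16:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

w = np.zeros(10)
loss_before = np.mean((X @ w - y) ** 2)
for _ in range(3):
    w += cg(curv_vec, -grad(w))          # Newton-like step from CG
loss_after = np.mean((X @ w - y) ** 2)
```

For a real network the curvature product would be computed by an extra forward/backward pass per shard, but the aggregation pattern is the same.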
  • Patent number: 9262724
    Abstract: Systems and methods for reducing a number of training parameters in a deep belief network (DBN) are provided. A method for reducing a number of training parameters in a deep belief network (DBN) comprises determining a network architecture including a plurality of layers, using matrix factorization to represent a weight matrix of a final layer of the plurality of layers as a plurality of matrices, and training the DBN having the plurality of matrices.
    Type: Grant
    Filed: June 26, 2013
    Date of Patent: February 16, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Tara N. Sainath, Vikas Sindhwani
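The parameter saving from factorizing the final layer is easy to quantify. This sketch approximates a "trained" weight matrix by a rank-r product via truncated SVD purely to illustrate the size/accuracy trade-off; the patent instead trains the two factor matrices directly as part of the DBN. Dimensions are toy values:

```python
import numpy as np

rng = np.random.default_rng(3)
h, c, r = 64, 100, 8                     # hidden units, outputs, rank

# Stand-in for trained final-layer weights: low intrinsic rank plus noise.
W = rng.normal(size=(h, r)) @ rng.normal(size=(r, c)) \
    + 0.01 * rng.normal(size=(h, c))

# Represent W as a product of two thin matrices A (h x r) and B (r x c).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]
B = Vt[:r]

params_full = h * c                      # one dense final layer
params_fact = h * r + r * c              # the factored replacement
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

When the final layer is large (e.g. thousands of context-dependent output states), `h*r + r*c` is far smaller than `h*c`, which is the reduction in training parameters the abstract claims.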
  • Publication number: 20160005398
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Application
    Filed: August 27, 2015
    Publication date: January 7, 2016
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Patent number: 9196243
    Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection, comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: November 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau
  • Publication number: 20150317990
    Abstract: Deep scattering spectral features are extracted from an acoustic input signal to generate a deep scattering spectral feature representation of the acoustic input signal. The deep scattering spectral feature representation is input to a speech recognition engine. The acoustic input signal is decoded based on at least a portion of the deep scattering spectral feature representation input to a speech recognition engine.
    Type: Application
    Filed: May 2, 2014
    Publication date: November 5, 2015
    Inventors: Petr Fousek, Vaibhava Goel, Brian E. D. Kingsbury, Etienne Marcheret, Shay Maymon, David Nahamoo, Tara N. Sainath, Bhuvana Ramabhadran