Patents by Inventor Vaibhava Goel

Vaibhava Goel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ANNEALED DROPOUT TRAINING OF NEURAL NETWORKS

Publication number: 20160307098

Abstract: Systems and methods for training a neural network to optimize network performance, including sampling an applied dropout rate for one or more nodes of the network to evaluate a current generalization performance of one or more training models. An optimized annealing schedule may be generated based on the sampling, wherein the optimized annealing schedule includes an altered dropout rate configured to improve a generalization performance of the network. A number of nodes of the network may be adjusted in accordance with a dropout rate specified in the optimized annealing schedule. The steps may then be iterated until the generalization performance of the network is maximized.

Type: Application

Filed: October 22, 2015

Publication date: October 20, 2016

Inventors: Vaibhava Goel, Steven John Rennie, Samuel Thomas, Ewout van den Berg
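
Illustrative sketch for the abstract above: the loop of evaluating a dropout rate, proposing an altered rate, and iterating until generalization stops improving can be pictured roughly as follows. This is not the patented method. The function name `anneal_dropout`, the fixed decrement `step`, and the caller-supplied `evaluate` callback (which stands in for training plus held-out scoring) are all assumptions made for illustration.

```python
def anneal_dropout(evaluate, initial_rate=0.5, step=0.1, max_iters=20):
    """Greedy sketch: lower the dropout rate while the held-out score improves.

    evaluate(rate) is a hypothetical callback that returns a generalization
    score (higher is better) for a model trained with that dropout rate.
    Returns the list of accepted dropout rates, i.e. the annealing schedule.
    """
    rate = initial_rate
    best_score = evaluate(rate)
    schedule = [rate]
    for _ in range(max_iters):
        # Propose an altered (lower) rate; rounding avoids float drift.
        candidate = round(max(0.0, rate - step), 10)
        if candidate == rate:
            break  # already at zero dropout
        score = evaluate(candidate)
        if score <= best_score:
            break  # generalization no longer improves
        rate, best_score = candidate, score
        schedule.append(rate)
    return schedule


# Toy stand-in for a held-out score that peaks at a dropout rate of 0.2.
print(anneal_dropout(lambda r: -(r - 0.2) ** 2))
```

A real system would replace the toy lambda with actual training and validation, and could sample several candidate rates per iteration rather than a single fixed decrement.
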
ANNEALED DROPOUT TRAINING OF NEURAL NETWORKS

Publication number: 20160307096

Abstract: Systems and methods for training a neural network to optimize network performance, including sampling an applied dropout rate for one or more nodes of the network to evaluate a current generalization performance of one or more training models. An optimized annealing schedule may be generated based on the sampling, wherein the optimized annealing schedule includes an altered dropout rate configured to improve a generalization performance of the network. A number of nodes of the network may be adjusted in accordance with a dropout rate specified in the optimized annealing schedule. The steps may then be iterated until the generalization performance of the network is maximized.

Type: Application

Filed: September 1, 2015

Publication date: October 20, 2016

Inventors: Vaibhava Goel, Steven John Rennie, Samuel Thomas, Ewout van den Berg
Privacy-sensitive speech model creation via aggregation of multiple user models

Patent number: 9424836

Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.

Type: Grant

Filed: June 22, 2015

Date of Patent: August 23, 2016

Assignee: Nuance Communications, Inc.

Inventors: Antonio R. Lee, Petr Novak, Peder Andreas Olsen, Vaibhava Goel
Regularized feature space discrimination adaptation

Patent number: 9251784

Abstract: A method and apparatus are provided for training a transformation matrix of a feature vector for an acoustic model. The method includes training the transformation matrix of the feature vector. The transformation matrix maximizes an objective function having a regularization term. The method further includes transforming the feature vector using the transformation matrix of the feature vector, and updating the acoustic model stored in a memory device using the transformed feature vector.

Type: Grant

Filed: October 23, 2013

Date of Patent: February 2, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Vaibhava Goel, Steven J. Rennie
DEEP SCATTERING SPECTRUM IN ACOUSTIC MODELING FOR SPEECH RECOGNITION

Publication number: 20150317990

Abstract: Deep scattering spectral features are extracted from an acoustic input signal to generate a deep scattering spectral feature representation of the acoustic input signal. The deep scattering spectral feature representation is input to a speech recognition engine. The acoustic input signal is decoded based on at least a portion of the deep scattering spectral feature representation input to a speech recognition engine.

Type: Application

Filed: May 2, 2014

Publication date: November 5, 2015

Inventors: Petr Fousek, Vaibhava Goel, Brian E.D. Kingsbury, Etienne Marcheret, Shay Maymon, David Nahamoo, Tara N. Sainath, Bhuvana Ramabhadran
SYSTEMS AND METHODS FOR COMBINING STOCHASTIC AVERAGE GRADIENT AND HESSIAN-FREE OPTIMIZATION FOR SEQUENCE TRAINING OF DEEP NEURAL NETWORKS

Publication number: 20150310329

Abstract: A method for training a deep neural network (DNN) comprises receiving and formatting speech data for the training, performing Hessian-free sequence training (HFST) on a first subset of a plurality of subsets of the speech data, and iteratively performing the HFST on successive subsets of the plurality of subsets of the speech data, wherein iteratively performing the HFST comprises reusing information from at least one previous iteration.

Type: Application

Filed: July 7, 2015

Publication date: October 29, 2015

Inventors: Pierre Dognin, Vaibhava Goel
PRIVACY-SENSITIVE SPEECH MODEL CREATION VIA AGGREGATION OF MULTIPLE USER MODELS

Publication number: 20150287401

Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.

Type: Application

Filed: June 22, 2015

Publication date: October 8, 2015

Applicant: Nuance Communications, Inc.

Inventors: Antonio R. Lee, Petr Novak, Peder Andreas Olsen, Vaibhava Goel
Privacy-sensitive speech model creation via aggregation of multiple user models

Patent number: 9093069

Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.

Type: Grant

Filed: November 5, 2012

Date of Patent: July 28, 2015

Assignee: Nuance Communications, Inc.

Inventors: Antonio R. Lee, Petr Novak, Peder Andreas Olsen, Vaibhava Goel
SYSTEMS AND METHODS FOR COMBINING STOCHASTIC AVERAGE GRADIENT AND HESSIAN-FREE OPTIMIZATION FOR SEQUENCE TRAINING OF DEEP NEURAL NETWORKS

Publication number: 20150161988

Abstract: A method for training a deep neural network (DNN) comprises receiving and formatting speech data for the training, performing Hessian-free sequence training (HFST) on a first subset of a plurality of subsets of the speech data, and iteratively performing the HFST on successive subsets of the plurality of subsets of the speech data, wherein iteratively performing the HFST comprises reusing information from at least one previous iteration.

Type: Application

Filed: October 30, 2014

Publication date: June 11, 2015

Inventors: Pierre Dognin, Vaibhava Goel
REGULARIZED FEATURE SPACE DISCRIMINATION ADAPTATION

Publication number: 20150112669

Abstract: A method and apparatus are provided for training a transformation matrix of a feature vector for an acoustic model. The method includes training the transformation matrix of the feature vector. The transformation matrix maximizes an objective function having a regularization term. The method further includes transforming the feature vector using the transformation matrix of the feature vector, and updating the acoustic model stored in a memory device using the transformed feature vector.

Type: Application

Filed: October 23, 2013

Publication date: April 23, 2015

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Vaibhava Goel, Steven J. Rennie
Natural language system and method based on unisolated performance metric

Patent number: 8977549

Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.

Type: Grant

Filed: September 26, 2013

Date of Patent: March 10, 2015

Assignee: Nuance Communications, Inc.

Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
Sparse maximum a posteriori (MAP) adaption

Patent number: 8972258

Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.

Type: Grant

Filed: May 22, 2014

Date of Patent: March 3, 2015

Assignee: Nuance Communications, Inc.

Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
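
Illustrative sketch for the abstract above: one common way to impose a sparseness constraint is soft thresholding (the update associated with an L1 penalty), which zeroes small parameter changes so that only a few deltas from the baseline need to be stored per user. The abstract does not state which constraint the patent uses, so the thresholding rule and the names `sparse_adaptation` and `apply_adaptation` are assumptions for illustration only.

```python
def sparse_adaptation(baseline, adapted, threshold):
    """Return {index: delta} for parameters whose change survives shrinkage.

    Each change (adapted - baseline) is shrunk toward zero by `threshold`;
    changes that shrink to zero are dropped, so only a sparse set of deltas
    is kept instead of a full copy of the model.
    """
    deltas = {}
    for i, (b, a) in enumerate(zip(baseline, adapted)):
        change = a - b
        shrunk = max(abs(change) - threshold, 0.0)
        if shrunk > 0.0:
            deltas[i] = shrunk if change > 0 else -shrunk
    return deltas


def apply_adaptation(baseline, deltas):
    """Rebuild user-specific parameters from the baseline plus sparse deltas."""
    return [b + deltas.get(i, 0.0) for i, b in enumerate(baseline)]


baseline = [0.0, 1.0, 2.0, 3.0]
adapted = [0.05, 1.0, 2.5, 2.0]   # hypothetical fully adapted parameters
deltas = sparse_adaptation(baseline, adapted, threshold=0.25)
print(deltas)                      # only two of four parameters are stored
print(apply_adaptation(baseline, deltas))
```

Storing only the index-to-delta map is what keeps the per-user footprint small relative to a complete acoustic model.
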
Method and system for prompt construction for selection from a list of acoustically confusable items in spoken dialog systems

Patent number: 8909528

Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.

Type: Grant

Filed: May 9, 2007

Date of Patent: December 9, 2014

Assignee: Nuance Communications, Inc.

Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
Forced/predictable adaptation for speech recognition

Patent number: 8838448

Abstract: A method is described for use with automatic speech recognition using discriminative criteria for speaker adaptation. An adaptation evaluation is performed of speech recognition performance data for speech recognition system users. Adaptation candidate users are identified based on the adaptation evaluation for whom an adaptation process is likely to improve system performance.

Type: Grant

Filed: April 5, 2012

Date of Patent: September 16, 2014

Assignee: Nuance Communications, Inc.

Inventors: Dan Ning Jiang, Vaibhava Goel, Dimitri Kanevsky, Yong Qin
SPARSE MAXIMUM A POSTERIORI (MAP) ADAPTION

Publication number: 20140257809

Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.

Type: Application

Filed: May 22, 2014

Publication date: September 11, 2014

Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
Sparse maximum a posteriori (MAP) adaptation

Patent number: 8738376

Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.

Type: Grant

Filed: October 28, 2011

Date of Patent: May 27, 2014

Assignee: Nuance Communications, Inc.

Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
Method and system having hypothesis type variable thresholds

Patent number: 8725512

Abstract: A method (and system) for spoken dialog confirmation classifies a plurality of spoken dialog hypotheses, and assigns a threshold to each class of spoken dialog hypotheses.

Type: Grant

Filed: March 13, 2007

Date of Patent: May 13, 2014

Assignee: Nuance Communications, Inc.

Inventors: David Claiborn, Vaibhava Goel, Ramesh Gopinath, Pichappan Pethachi
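
Illustrative sketch for the abstract above: classifying each hypothesis and looking up a class-specific confidence threshold might look like the following. The class names, threshold values, and the rule-based `classify_hypothesis` function are hypothetical and are not taken from the patent.

```python
# Hypothetical per-class confirmation thresholds (values are illustrative).
THRESHOLDS = {"yes_no": 0.5, "digits": 0.8}
DEFAULT_THRESHOLD = 0.7


def classify_hypothesis(text):
    """Assign a recognition hypothesis to a coarse class (toy rules)."""
    if text.strip().lower() in {"yes", "no"}:
        return "yes_no"
    if text.replace(" ", "").isdigit():
        return "digits"
    return "other"


def needs_confirmation(text, confidence):
    """Ask the user to confirm when confidence is below the class threshold."""
    threshold = THRESHOLDS.get(classify_hypothesis(text), DEFAULT_THRESHOLD)
    return confidence < threshold


print(needs_confirmation("yes", 0.6))     # short yes/no answers use a lower bar
print(needs_confirmation("4 1 5", 0.6))   # digit strings use a higher bar
```

The point of the per-class lookup is that the same confidence score can lead to different dialog behavior depending on the kind of hypothesis.
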
PRIVACY-SENSITIVE SPEECH MODEL CREATION VIA AGGREGATION OF MULTIPLE USER MODELS

Publication number: 20140129226

Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.

Type: Application

Filed: November 5, 2012

Publication date: May 8, 2014

Inventors: Antonio R. Lee, Petr Novak, Peder A. Olsen, Vaibhava Goel
NATURAL LANGUAGE SYSTEM AND METHOD BASED ON UNISOLATED PERFORMANCE METRIC

Publication number: 20140032217

Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module.

Type: Application

Filed: September 26, 2013

Publication date: January 30, 2014

Applicant: Nuance Communications, Inc.

Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
Model restructuring for client and server based automatic speech recognition

Patent number: 8635067

Abstract: Access is obtained to a large reference acoustic model for automatic speech recognition. The large reference acoustic model has L states modeled by L mixture models, and the large reference acoustic model has N components. A desired number of components Nc, less than N, to be used in a restructured acoustic model derived from the reference acoustic model, is identified. The desired number of components Nc is selected based on a computing environment in which the restructured acoustic model is to be deployed. The restructured acoustic model also has L states. For each given one of the L mixture models in the reference acoustic model, a merge sequence is built which records, for a given cost function, sequential mergers of pairs of the components associated with the given one of the mixture models. A portion of the Nc components is assigned to each of the L states in the restructured acoustic model.

Type: Grant

Filed: December 9, 2010

Date of Patent: January 21, 2014

Assignee: International Business Machines Corporation

Inventors: Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
