Patents by Inventor Vaibhava Goel

Vaibhava Goel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20160307098
    Abstract: Systems and methods for training a neural network to optimize network performance, including sampling an applied dropout rate for one or more nodes of the network to evaluate a current generalization performance of one or more training models. An optimized annealing schedule may be generated based on the sampling, wherein the optimized annealing schedule includes an altered dropout rate configured to improve a generalization performance of the network. A number of nodes of the network may be adjusted in accordance with a dropout rate specified in the optimized annealing schedule. The steps may then be iterated until the generalization performance of the network is maximized.
    Type: Application
    Filed: October 22, 2015
    Publication date: October 20, 2016
    Inventors: Vaibhava Goel, Steven John Rennie, Samuel Thomas, Ewout van den Berg
  • Publication number: 20160307096
    Abstract: Systems and methods for training a neural network to optimize network performance, including sampling an applied dropout rate for one or more nodes of the network to evaluate a current generalization performance of one or more training models. An optimized annealing schedule may be generated based on the sampling, wherein the optimized annealing schedule includes an altered dropout rate configured to improve a generalization performance of the network. A number of nodes of the network may be adjusted in accordance with a dropout rate specified in the optimized annealing schedule. The steps may then be iterated until the generalization performance of the network is maximized.
    Type: Application
    Filed: September 1, 2015
    Publication date: October 20, 2016
    Inventors: Vaibhava Goel, Steven John Rennie, Samuel Thomas, Ewout van den Berg
  • Patent number: 9424836
    Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.
    Type: Grant
    Filed: June 22, 2015
    Date of Patent: August 23, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Antonio R. Lee, Petr Novak, Peder Andreas Olsen, Vaibhava Goel
  • Patent number: 9251784
    Abstract: A method and apparatus are provided for training a transformation matrix of a feature vector for an acoustic model. The method includes training the transformation matrix of the feature vector. The transformation matrix maximizes an objective function having a regularization term. The method further includes transforming the feature vector using the transformation matrix of the feature vector, and updating the acoustic model stored in a memory device using the transformed feature vector.
    Type: Grant
    Filed: October 23, 2013
    Date of Patent: February 2, 2016
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Vaibhava Goel, Steven J. Rennie
  • Publication number: 20150317990
    Abstract: Deep scattering spectral features are extracted from an acoustic input signal to generate a deep scattering spectral feature representation of the acoustic input signal. The deep scattering spectral feature representation is input to a speech recognition engine. The acoustic input signal is decoded based on at least a portion of the deep scattering spectral feature representation input to a speech recognition engine.
    Type: Application
    Filed: May 2, 2014
    Publication date: November 5, 2015
    Inventors: Petr Fousek, Vaibhava Goel, Brian E.D. Kingsbury, Etienne Marcheret, Shay Maymon, David Nahamoo, Tara N. Sainath, Bhuvana Ramabhadran
  • Publication number: 20150310329
    Abstract: A method for training a deep neural network (DNN) comprises receiving and formatting speech data for the training, performing Hessian-free sequence training (HFST) on a first subset of a plurality of subsets of the speech data, and iteratively performing the HFST on successive subsets of the plurality of subsets of the speech data, wherein iteratively performing the HFST comprises reusing information from at least one previous iteration.
    Type: Application
    Filed: July 7, 2015
    Publication date: October 29, 2015
    Inventors: Pierre Dognin, Vaibhava Goel
  • Publication number: 20150287401
    Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.
    Type: Application
    Filed: June 22, 2015
    Publication date: October 8, 2015
    Applicant: Nuance Communications, Inc.
    Inventors: Antonio R. Lee, Petr Novak, Peder Andreas Olsen, Vaibhava Goel
  • Patent number: 9093069
    Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.
    Type: Grant
    Filed: November 5, 2012
    Date of Patent: July 28, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Antonio R. Lee, Petr Novak, Peder Andreas Olsen, Vaibhava Goel
  • Publication number: 20150161988
    Abstract: A method for training a deep neural network (DNN) comprises receiving and formatting speech data for the training, performing Hessian-free sequence training (HFST) on a first subset of a plurality of subsets of the speech data, and iteratively performing the HFST on successive subsets of the plurality of subsets of the speech data, wherein iteratively performing the HFST comprises reusing information from at least one previous iteration.
    Type: Application
    Filed: October 30, 2014
    Publication date: June 11, 2015
    Inventors: Pierre Dognin, Vaibhava Goel
  • Publication number: 20150112669
    Abstract: A method and apparatus are provided for training a transformation matrix of a feature vector for an acoustic model. The method includes training the transformation matrix of the feature vector. The transformation matrix maximizes an objective function having a regularization term. The method further includes transforming the feature vector using the transformation matrix of the feature vector, and updating the acoustic model stored in a memory device using the transformed feature vector.
    Type: Application
    Filed: October 23, 2013
    Publication date: April 23, 2015
    Applicant: International Business Machines Corporation
    Inventors: Takashi Fukuda, Vaibhava Goel, Steven J. Rennie
  • Patent number: 8977549
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: March 10, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 8972258
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Grant
    Filed: May 22, 2014
    Date of Patent: March 3, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
  • Patent number: 8909528
    Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.
    Type: Grant
    Filed: May 9, 2007
    Date of Patent: December 9, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
  • Patent number: 8838448
    Abstract: A method is described for use with automatic speech recognition using discriminative criteria for speaker adaptation. An adaptation evaluation is performed of speech recognition performance data for speech recognition system users. Adaptation candidate users are identified based on the adaptation evaluation for whom an adaptation process is likely to improve system performance.
    Type: Grant
    Filed: April 5, 2012
    Date of Patent: September 16, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Dan Ning Jiang, Vaibhava Goel, Dimitri Kanevsky, Yong Qin
  • Publication number: 20140257809
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Application
    Filed: May 22, 2014
    Publication date: September 11, 2014
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
  • Patent number: 8738376
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: May 27, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
  • Patent number: 8725512
    Abstract: A method (and system) for spoken dialog confirmation classifies a plurality of spoken dialog hypotheses, and assigns a threshold to each class of spoken dialog hypotheses.
    Type: Grant
    Filed: March 13, 2007
    Date of Patent: May 13, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: David Claiborn, Vaibhava Goel, Ramesh Gopinath, Pichappan Pethachi
  • Publication number: 20140129226
    Abstract: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine-readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.
    Type: Application
    Filed: November 5, 2012
    Publication date: May 8, 2014
    Inventors: Antonio R. Lee, Petr Novak, Peder A. Olsen, Vaibhava Goel
  • Publication number: 20140032217
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module.
    Type: Application
    Filed: September 26, 2013
    Publication date: January 30, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 8635067
    Abstract: Access is obtained to a large reference acoustic model for automatic speech recognition. The large reference acoustic model has L states modeled by L mixture models, and the large reference acoustic model has N components. A desired number of components Nc, less than N, to be used in a restructured acoustic model derived from the reference acoustic model, is identified. The desired number of components Nc is selected based on a computing environment in which the restructured acoustic model is to be deployed. The restructured acoustic model also has L states. For each given one of the L mixture models in the reference acoustic model, a merge sequence is built which records, for a given cost function, sequential mergers of pairs of the components associated with the given one of the mixture models. A portion of the Nc components is assigned to each of the L states in the restructured acoustic model.
    Type: Grant
    Filed: December 9, 2010
    Date of Patent: January 21, 2014
    Assignee: International Business Machines Corporation
    Inventors: Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
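
As a rough illustration of the merge-sequence idea in the abstract of patent 8635067 (the last entry above), the sketch below greedily merges pairs of components in a single one-dimensional Gaussian mixture and records each intermediate mixture, so that a smaller model of any size can be read off later. It is not the patented method: the tuple layout `(weight, mean, variance)`, the function names, and the cost function (a weighted squared distance between means) are all assumptions made for this example, and the abstract does not say which cost function is used or how components are allocated across the L states.

```python
from itertools import combinations


def merge_pair(a, b):
    """Moment-matched merge of two weighted 1-D Gaussians (weight, mean, var)."""
    w1, m1, v1 = a
    w2, m2, v2 = b
    w = w1 + w2
    mean = (w1 * m1 + w2 * m2) / w
    # Second moment of the merged component, then subtract mean^2 for variance.
    second = (w1 * (v1 + m1 * m1) + w2 * (v2 + m2 * m2)) / w
    return (w, mean, second - mean * mean)


def merge_cost(a, b):
    """Hypothetical cost (an assumption): weighted squared distance of means."""
    w1, m1, _ = a
    w2, m2, _ = b
    return w1 * w2 / (w1 + w2) * (m1 - m2) ** 2


def build_merge_sequence(components):
    """Merge the cheapest pair repeatedly; return every intermediate mixture.

    sequence[k] is the mixture after k merges, so it has
    len(components) - k components.
    """
    current = list(components)
    sequence = [list(current)]
    while len(current) > 1:
        i, j = min(
            combinations(range(len(current)), 2),
            key=lambda p: merge_cost(current[p[0]], current[p[1]]),
        )
        merged = merge_pair(current[i], current[j])
        current = [c for k, c in enumerate(current) if k not in (i, j)]
        current.append(merged)
        sequence.append(list(current))
    return sequence


def restructure(components, n_keep):
    """Return the mixture from the merge sequence that has n_keep components."""
    if not 1 <= n_keep <= len(components):
        raise ValueError("n_keep must be between 1 and len(components)")
    return build_merge_sequence(components)[len(components) - n_keep]


# Two nearby components and one distant component; keep two.
mixture = [(0.25, 0.0, 1.0), (0.25, 0.1, 1.0), (0.5, 5.0, 1.0)]
smaller = restructure(mixture, 2)
```

Because the whole sequence is recorded once, choosing a different target size for a different deployment environment only requires indexing into the stored sequence rather than repeating the merges.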