Patents by Inventor Bhuvana Ramabhadran

Bhuvana Ramabhadran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11019306
    Abstract: A method of combining data streams from fixed audio-visual sensors with data streams from personal mobile devices, including: forming a communication link with at least one of one or more personal mobile devices; receiving at least one of an audio data stream and/or a video data stream from the at least one of the one or more personal mobile devices; determining the quality of the at least one of the audio data stream and/or the video data stream, wherein the audio data stream and/or the video data stream having a quality above a threshold quality is retained; and combining the retained audio data stream and/or the video data stream with the data streams from the fixed audio-visual sensors.
    Type: Grant
    Filed: January 9, 2019
    Date of Patent: May 25, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Stanley Chen, Kenneth W. Church, Vaibhava Goel, Lidia L. Mangu, Etienne Marcheret, Bhuvana Ramabhadran, Laurence P. Sansone, Abhinav Sethy, Samuel Thomas
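The quality-gating step this abstract describes can be sketched as a short filter; the function name and the signal-to-noise quality metric below are illustrative assumptions, not taken from the patent.

```python
def combine_streams(fixed_streams, mobile_streams, quality_of, threshold):
    """Keep mobile-device streams whose quality is above `threshold`
    and combine them with the fixed audio-visual sensor streams."""
    retained = [s for s in mobile_streams if quality_of(s) > threshold]
    return list(fixed_streams) + retained

# Toy usage with SNR standing in for the quality measure:
merged = combine_streams(
    [{"id": "room-mic", "snr": 25.0}],
    [{"id": "phone-a", "snr": 18.0}, {"id": "phone-b", "snr": 4.5}],
    quality_of=lambda s: s["snr"], threshold=10.0)
```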
  • Patent number: 10991363
    Abstract: An apparatus, method, and computer program product for adapting an acoustic model to a specific environment are defined. An adapted model obtained by adapting an original model to the specific environment using adaptation data, the original model being trained using training data and being used to calculate probabilities of context-dependent phones given an acoustic feature. Adapted probabilities obtained by adapting original probabilities using the training data and the adaptation data, the original probabilities being trained using the training data and being prior probabilities of context-dependent phones. An adapted acoustic model obtained from the adapted model and the adapted probabilities.
    Type: Grant
    Filed: November 6, 2017
    Date of Patent: April 27, 2021
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Bhuvana Ramabhadran, Masayuki Suzuki
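One simple way to realize the prior-adaptation step above is to interpolate prior estimates computed from the training data and from the adaptation data; the interpolation weight and the function below are illustrative assumptions, not the patented procedure.

```python
from collections import Counter

def adapt_priors(train_labels, adapt_labels, weight=0.5):
    """Interpolate prior probabilities of context-dependent phones
    estimated from training data with those estimated from adaptation
    data; `weight` is the share given to the adaptation estimate."""
    def priors(labels):
        counts = Counter(labels)
        total = sum(counts.values())
        return {k: v / total for k, v in counts.items()}
    p_train, p_adapt = priors(train_labels), priors(adapt_labels)
    return {ph: (1 - weight) * p_train.get(ph, 0.0)
                + weight * p_adapt.get(ph, 0.0)
            for ph in set(p_train) | set(p_adapt)}
```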
  • Patent number: 10923110
    Abstract: An apparatus, method, and computer program product for adapting an acoustic model to a specific environment are defined. An adapted model obtained by adapting an original model to the specific environment using adaptation data, the original model being trained using training data and being used to calculate probabilities of context-dependent phones given an acoustic feature. Adapted probabilities obtained by adapting original probabilities using the training data and the adaptation data, the original probabilities being trained using the training data and being prior probabilities of context-dependent phones. An adapted acoustic model obtained from the adapted model and the adapted probabilities.
    Type: Grant
    Filed: August 25, 2017
    Date of Patent: February 16, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Bhuvana Ramabhadran, Masayuki Suzuki
  • Publication number: 20200380952
    Abstract: A method includes receiving an input text sequence to be synthesized into speech in a first language and obtaining a speaker embedding, the speaker embedding specifying specific voice characteristics of a target speaker for synthesizing the input text sequence into speech that clones a voice of the target speaker. The target speaker includes a native speaker of a second language different than the first language. The method also includes generating, using a text-to-speech (TTS) model, an output audio feature representation of the input text by processing the input text sequence and the speaker embedding. The output audio feature representation includes the voice characteristics of the target speaker specified by the speaker embedding.
    Type: Application
    Filed: April 22, 2020
    Publication date: December 3, 2020
    Applicant: Google LLC
    Inventors: Yu Zhang, Ron J. Weiss, Byungha Chun, Yonghui Wu, Zhifeng Chen, Russell John Wyatt Skerry-Ryan, Ye Jia, Andrew M. Rosenberg, Bhuvana Ramabhadran
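Conditioning a TTS model on a speaker embedding, as the abstract describes, is commonly done by attaching the embedding to every time step of the encoded text so the decoder produces audio in the target voice; the sketch below shows that concatenation and is an illustrative assumption, not the application's model.

```python
def condition_on_speaker(text_encoding, speaker_embedding):
    """Append the speaker embedding to each time step of the encoded
    input text, so downstream layers see the target-voice identity."""
    return [list(frame) + list(speaker_embedding) for frame in text_encoding]
```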
  • Patent number: 10832661
    Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information from a neural network having periodic indications and components of a frequency spectrum of the audio signal data inputted thereto. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: November 10, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Takashi Fukuda, Osamu Ichikawa, Bhuvana Ramabhadran
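The weight-isolation idea in this abstract, zeroing the initial weights between the "first" hidden nodes and the periodic-indication inputs, can be sketched as follows; the dimensions, initializer range, and function name are illustrative.

```python
import random

def init_isolated_first_layer(n_spec, n_per, n_first, n_second, seed=0):
    """Initialize a fully-connected first-layer weight matrix so the
    'first' hidden nodes are isolated from the periodic-indication
    inputs: those cross-group weights start at 0, the rest random."""
    rng = random.Random(seed)
    n_in, n_out = n_spec + n_per, n_first + n_second
    # w[i][j] is the weight from input node i to hidden node j
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    for i in range(n_spec, n_in):      # periodic-indication input nodes
        for j in range(n_first):       # 'first' hidden nodes
            w[i][j] = 0.0              # isolate at initialization
    return w
```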
  • Patent number: 10692488
    Abstract: A computer selects a test set of sentences from among sentences applied to train a whole sentence recurrent neural network language model to estimate the likelihood that each whole sentence processed by natural language processing is correct. The computer generates imposter sentences from among the test set of sentences by substituting one word in each sentence of the test set of sentences. The computer generates, through the whole sentence recurrent neural network language model, a first score for each sentence of the test set of sentences and at least one additional score for each of the imposter sentences. The computer evaluates an accuracy of the natural language processing system in performing sequential classification tasks based on an accuracy value of the first score in reflecting a correct sentence and the at least one additional score in reflecting an incorrect sentence.
    Type: Grant
    Filed: August 23, 2019
    Date of Patent: June 23, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yinghui Huang, Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran
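The imposter-generation and scoring steps can be sketched as below; `make_imposter` and `pairwise_accuracy` are assumed names for illustration, not from the patent.

```python
import random

def make_imposter(sentence, vocab, seed=0):
    """Substitute exactly one word of `sentence` with a different word
    drawn from `vocab`, producing an imposter sentence."""
    rng = random.Random(seed)
    words = sentence.split()
    i = rng.randrange(len(words))
    words[i] = rng.choice([w for w in vocab if w != words[i]])
    return " ".join(words)

def pairwise_accuracy(score, originals, imposters):
    """Fraction of pairs in which the language model scores the true
    sentence above its imposter."""
    wins = sum(score(o) > score(i) for o, i in zip(originals, imposters))
    return wins / len(originals)
```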
  • Publication number: 20200193977
    Abstract: Methods, systems, and apparatus, including computer programs stored on a computer-readable storage medium, for transliteration for speech recognition training and scoring. In some implementations, language examples are accessed, some of which include words in a first script and words in one or more other scripts. At least portions of some of the language examples are transliterated to the first script to generate a training data set. A language model is generated based on occurrences of the different sequences of words in the training data set in the first script. The language model is used to perform speech recognition for an utterance.
    Type: Application
    Filed: December 12, 2019
    Publication date: June 18, 2020
    Inventors: Bhuvana Ramabhadran, Min Ma, Pedro J. Moreno Mengibar, Jesse Emond, Brian E. Roark
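The transliteration step, mapping mixed-script examples into the first script before counting word sequences for the language model, can be sketched with a simple character map; the Greek-to-Latin map below is a toy assumption.

```python
def transliterate(text, to_first_script):
    """Map characters from other scripts into the first script;
    characters already in the first script pass through unchanged."""
    return "".join(to_first_script.get(ch, ch) for ch in text)

# After transliteration, examples that differed only in script become
# identical sequences, so their counts pool in the language model:
greek_to_latin = {"α": "a", "β": "b"}
training_set = [transliterate(s, greek_to_latin) for s in ["α cat", "a cat"]]
```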
  • Patent number: 10586529
    Abstract: A computer-implemented method for processing a speech signal, includes: identifying speech segments in an input speech signal; calculating an upper variance and a lower variance, the upper variance being a variance of upper spectra larger than a criterion among speech spectra corresponding to frames in the speech segments, the lower variance being a variance of lower spectra smaller than a criterion among the speech spectra corresponding to the frames in the speech segments; determining whether the input speech signal is a special input speech signal using a difference between the upper variance and the lower variance; and performing speech recognition of the input speech signal which has been determined to be the special input speech signal, using a special acoustic model for the special input speech signal.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: March 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Osamu Ichikawa, Takashi Fukuda, Gakuto Kurata, Bhuvana Ramabhadran
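The upper/lower-variance test can be sketched directly from the abstract; the `gap` threshold and function names are assumed parameters for illustration.

```python
def upper_lower_variances(spectra, criterion):
    """Variance of the spectral values above the criterion and of
    those below it, computed over the speech frames."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    return (var([x for x in spectra if x > criterion]),
            var([x for x in spectra if x < criterion]))

def is_special(spectra, criterion, gap):
    """Flag the signal as 'special' (route it to the special acoustic
    model) when the two variances differ by more than `gap`."""
    upper, lower = upper_lower_variances(spectra, criterion)
    return abs(upper - lower) > gap
```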
  • Publication number: 20200058297
    Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information from a neural network having periodic indications and components of a frequency spectrum of the audio signal data inputted thereto. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
    Type: Application
    Filed: October 28, 2019
    Publication date: February 20, 2020
    Inventors: Takashi Fukuda, Osamu Ichikawa, Bhuvana Ramabhadran
  • Publication number: 20200058296
    Abstract: Techniques for learning front-end speech recognition parameters as part of training a neural network classifier include obtaining an input speech signal, and applying front-end speech recognition parameters to extract features from the input speech signal. The extracted features may be fed through a neural network to obtain an output classification for the input speech signal, and an error measure may be computed for the output classification through comparison of the output classification with a known target classification. Back propagation may be applied to adjust one or more of the front-end parameters as one or more layers of the neural network, based on the error measure.
    Type: Application
    Filed: July 23, 2019
    Publication date: February 20, 2020
    Applicant: Nuance Communications, Inc.
    Inventors: Tara N. Sainath, Brian E. D. Kingsbury, Abdel-rahman Mohamed, Bhuvana Ramabhadran
  • Publication number: 20200034703
    Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
    Type: Application
    Filed: July 27, 2018
    Publication date: January 30, 2020
    Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
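The multi-teacher distillation target, averaging soft labels across the teachers and training the student against the average, can be sketched as below; the helper names are illustrative, not from the application.

```python
import math

def soft_label(teachers, x):
    """Average the probability outputs of all teacher networks on the
    same input; the average becomes the student's training target."""
    outs = [t(x) for t in teachers]
    return [sum(o[k] for o in outs) / len(outs) for k in range(len(outs[0]))]

def distillation_loss(student_probs, target):
    """Cross-entropy of the student's output against the soft label;
    it is smallest when the student matches the averaged teachers."""
    return -sum(t * math.log(s) for t, s in zip(target, student_probs))
```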
  • Publication number: 20200034702
    Abstract: A student neural network may be trained by a computer-implemented method, including: selecting a teacher neural network among a plurality of teacher neural networks, inputting an input data to the selected teacher neural network to obtain a soft label output generated by the selected teacher neural network, and training a student neural network with at least the input data and the soft label output from the selected teacher neural network.
    Type: Application
    Filed: July 27, 2018
    Publication date: January 30, 2020
    Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
  • Publication number: 20200013393
    Abstract: A computer selects a test set of sentences from among sentences applied to train a whole sentence recurrent neural network language model to estimate the likelihood that each whole sentence processed by natural language processing is correct. The computer generates imposter sentences from among the test set of sentences by substituting one word in each sentence of the test set of sentences. The computer generates, through the whole sentence recurrent neural network language model, a first score for each sentence of the test set of sentences and at least one additional score for each of the imposter sentences. The computer evaluates an accuracy of the natural language processing system in performing sequential classification tasks based on an accuracy value of the first score in reflecting a correct sentence and the at least one additional score in reflecting an incorrect sentence.
    Type: Application
    Filed: August 23, 2019
    Publication date: January 9, 2020
    Inventors: Yinghui Huang, Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran
  • Publication number: 20200013408
    Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence, using the corresponding related portion including the repetition indication of each of the candidates.
    Type: Application
    Filed: September 20, 2019
    Publication date: January 9, 2020
    Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
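The repetition-labeling step can be sketched as below; the `<rep>` marker and the helper name are illustrative assumptions, not the patented labeling scheme.

```python
def label_repetitions(tokens, candidate):
    """Tag tokens that repeat part of a candidate symbol sequence
    (e.g. a number read back twice) with a <rep> marker."""
    candidate_set, seen, out = set(candidate), set(), []
    for tok in tokens:
        if tok in candidate_set and tok in seen:
            out.append("<rep>" + tok)   # repetition of the candidate
        else:
            out.append(tok)
        if tok in candidate_set:
            seen.add(tok)
    return out
```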
  • Patent number: 10529337
    Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence, using the corresponding related portion including the repetition indication of each of the candidates.
    Type: Grant
    Filed: January 7, 2019
    Date of Patent: January 7, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
  • Publication number: 20190378006
    Abstract: A technique for constructing a model supporting a plurality of domains is disclosed. In the technique, a plurality of teacher models, each of which is specialized for a different one of the plurality of the domains, is prepared. A plurality of training data collections, each of which is collected for a different one of the plurality of the domains, is obtained. A plurality of soft label sets is generated by inputting each training data in the plurality of the training data collections into corresponding one of the plurality of the teacher models. A student model is trained using the plurality of the soft label sets.
    Type: Application
    Filed: June 8, 2018
    Publication date: December 12, 2019
    Inventors: Takashi Fukuda, Osamu Ichikawa, Samuel Thomas, Bhuvana Ramabhadran
  • Patent number: 10460723
    Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method includes estimating sound identification information from a neural network having periodic indications and components of a frequency spectrum of audio signal data inputted thereto. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
    Type: Grant
    Filed: May 30, 2018
    Date of Patent: October 29, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Takashi Fukuda, Osamu Ichikawa, Bhuvana Ramabhadran
  • Publication number: 20190318732
    Abstract: A whole sentence recurrent neural network (RNN) language model (LM) is provided for estimating the likelihood that each whole sentence processed by natural language processing is correct. A noise contrastive estimation sampler is applied against at least one entire sentence from a corpus of multiple sentences to generate at least one incorrect sentence. The whole sentence RNN LM is trained, using the at least one entire sentence from the corpus and the at least one incorrect sentence, to distinguish the at least one entire sentence as correct. The whole sentence recurrent neural network language model is applied to estimate the likelihood that each whole sentence processed by natural language processing is correct.
    Type: Application
    Filed: April 16, 2018
    Publication date: October 17, 2019
    Inventors: Yinghui Huang, Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran
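Noise contrastive estimation here amounts to training a binary classifier that separates real corpus sentences from sampled incorrect ones; a minimal sketch with assumed names follows.

```python
import math
import random

def nce_examples(corpus, sample_noise, seed=0):
    """Pair every real sentence (label 1) with a sampled incorrect
    sentence (label 0) drawn from the noise distribution."""
    rng = random.Random(seed)
    return [(s, 1) for s in corpus] + [(sample_noise(rng), 0) for _ in corpus]

def nce_loss(score, examples):
    """Logistic loss for classifying real sentences against noise;
    minimizing it drives `score` higher on correct sentences."""
    total = 0.0
    for sentence, label in examples:
        p = 1.0 / (1.0 + math.exp(-score(sentence)))
        total -= math.log(p) if label == 1 else math.log(1.0 - p)
    return total / len(examples)
```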
  • Patent number: 10431210
    Abstract: A whole sentence recurrent neural network (RNN) language model (LM) is provided for estimating the likelihood that each whole sentence processed by natural language processing is correct. A noise contrastive estimation sampler is applied against at least one entire sentence from a corpus of multiple sentences to generate at least one incorrect sentence. The whole sentence RNN LM is trained, using the at least one entire sentence from the corpus and the at least one incorrect sentence, to distinguish the at least one entire sentence as correct. The whole sentence recurrent neural network language model is applied to estimate the likelihood that each whole sentence processed by natural language processing is correct.
    Type: Grant
    Filed: April 16, 2018
    Date of Patent: October 1, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yinghui Huang, Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran
  • Patent number: 10360901
    Abstract: Techniques for learning front-end speech recognition parameters as part of training a neural network classifier include obtaining an input speech signal, and applying front-end speech recognition parameters to extract features from the input speech signal. The extracted features may be fed through a neural network to obtain an output classification for the input speech signal, and an error measure may be computed for the output classification through comparison of the output classification with a known target classification. Back propagation may be applied to adjust one or more of the front-end parameters as one or more layers of the neural network, based on the error measure.
    Type: Grant
    Filed: December 5, 2014
    Date of Patent: July 23, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Tara N. Sainath, Brian E. D. Kingsbury, Abdel-rahman Mohamed, Bhuvana Ramabhadran
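Treating front-end speech recognition parameters as additional network layers means the classification loss's gradient flows back into feature extraction; the sketch below uses finite differences as a stand-in for back propagation, and every name in it is an illustrative assumption.

```python
def frontend_step(param, extract, classify, loss, x, target, lr=0.1, eps=1e-5):
    """Take one gradient step on a front-end feature-extraction
    parameter, treating it like any other trainable layer: estimate
    d(loss)/d(param) numerically and move downhill."""
    def objective(p):
        return loss(classify(extract(x, p)), target)
    grad = (objective(param + eps) - objective(param - eps)) / (2 * eps)
    return param - lr * grad

# Toy pipeline: a scalar front-end gain, an identity classifier, and
# squared error; one step moves the gain toward the value that fits.
extract = lambda x, p: x * p
classify = lambda features: features
sq_loss = lambda y, t: (y - t) ** 2
new_param = frontend_step(1.0, extract, classify, sq_loss, x=2.0, target=4.0)
```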