Patents by Inventor Gakuto Kurata

Gakuto Kurata has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9870765
    Abstract: Methods and a system are provided for estimating automatic speech recognition (ASR) accuracy. A method includes obtaining transcriptions of utterances in a conversation over two channels. The method further includes sorting the transcriptions along a time axis using a forced alignment. The method also includes training a language model with the sorted transcriptions. The method additionally includes performing ASR for utterances in a conversation between a first user and a second user. The second user is a target of ASR accuracy estimation. The method further includes determining whether an ASR result of the second user is consistent or inconsistent with an ASR result of the first user using the trained language model. The method also includes estimating the ASR result of the second user as poor responsive to the ASR result of the second user being inconsistent with the ASR result of the first user.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masayuki A. Suzuki
  • Patent number: 9870767
    Abstract: Embodiments include methods and systems for improving an acoustic model. Aspects include acquiring a first standard deviation value by calculating standard deviation of a feature from first training data and acquiring a second standard deviation value by calculating standard deviation of a feature from second training data acquired in a different environment from an environment of the first training data. Aspects also include creating a feature adapted to an environment where the first training data is recorded, by multiplying the feature acquired from the second training data by a ratio obtained by dividing the first standard deviation value by the second standard deviation value. Aspects further include reconstructing an acoustic model constructed using training data acquired in the same environment as the environment of the first training data using the feature adapted to the environment where the first training data is recorded.
    Type: Grant
    Filed: December 15, 2015
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki
  • Patent number: 9870766
    Abstract: Embodiments include methods and systems for improving an acoustic model. Aspects include acquiring a first standard deviation value by calculating standard deviation of a feature from first training data and acquiring a second standard deviation value by calculating standard deviation of a feature from second training data acquired in a different environment from an environment of the first training data. Aspects also include creating a feature adapted to an environment where the first training data is recorded, by multiplying the feature acquired from the second training data by a ratio obtained by dividing the first standard deviation value by the second standard deviation value. Aspects further include reconstructing an acoustic model constructed using training data acquired in the same environment as the environment of the first training data using the feature adapted to the environment where the first training data is recorded.
    Type: Grant
    Filed: October 28, 2015
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki
  • Publication number: 20180012124
    Abstract: A computer implemented method for training a neural network to capture a structural feature specific to a set of chemical compounds is disclosed. In the method, the computer system reads an expression describing a structure of the chemical compound for each chemical compound in the set and enumerates one or more combinations of a position and a type of a structural element appearing in the expression for each chemical compound in the set. The computer system also generates training data based on the one or more enumerated combinations for each chemical compound in the set. The training data includes one or more values with a length, each of which indicates whether or not a corresponding type of the structural element appears at a corresponding position for each combination. Furthermore, the computer system trains the neural network based on the training data for the set of the chemical compounds.
    Type: Application
    Filed: July 5, 2016
    Publication date: January 11, 2018
    Inventors: Satoshi Hara, Gakuto Kurata, Shigeru Nakagawa, Seiji Takeda
  • Patent number: 9842610
    Abstract: A method is provided for training a Deep Neural Network (DNN) for acoustic modeling in speech recognition. The method includes reading central frames and side frames as input frames from a memory. The side frames are preceding side frames preceding the central frames and/or succeeding side frames succeeding the central frames. The method further includes executing pre-training for only the central frames or both the central frames and the side frames and fine-tuning for the central frames and the side frames so as to emphasize connections between acoustic features in the central frames and units of the bottom layer in hidden layer of the DNN.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: December 12, 2017
    Assignee: International Business Machines Corporation
    Inventor: Gakuto Kurata
  • Publication number: 20170352345
    Abstract: Methods and a system are provided for estimating automatic speech recognition (ASR) accuracy. A method includes obtaining transcriptions of utterances in a conversation over two channels. The method further includes sorting the transcriptions along a time axis using a forced alignment. The method also includes training a language model with the sorted transcriptions. The method additionally includes performing ASR for utterances in a conversation between a first user and a second user. The second user is a target of ASR accuracy estimation. The method further includes determining whether an ASR result of the second user is consistent or inconsistent with an ASR result of the first user using the trained language model. The method also includes estimating the ASR result of the second user as poor responsive to the ASR result of the second user being inconsistent with the ASR result of the first user.
    Type: Application
    Filed: June 3, 2016
    Publication date: December 7, 2017
    Inventors: Gakuto Kurata, Masayuki A. Suzuki
  • Publication number: 20170345414
    Abstract: Embodiments include methods and systems for improving an acoustic model. Aspects include acquiring a first standard deviation value by calculating standard deviation of a feature from first training data and acquiring a second standard deviation value by calculating standard deviation of a feature from second training data acquired in a different environment from an environment of the first training data. Aspects also include creating a feature adapted to an environment where the first training data is recorded, by multiplying the feature acquired from the second training data by a ratio obtained by dividing the first standard deviation value by the second standard deviation value. Aspects further include reconstructing an acoustic model constructed using training data acquired in the same environment as the environment of the first training data using the feature adapted to the environment where the first training data is recorded.
    Type: Application
    Filed: August 16, 2017
    Publication date: November 30, 2017
    Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki
  • Publication number: 20170345415
    Abstract: Embodiments include methods and systems for improving an acoustic model. Aspects include acquiring a first standard deviation value by calculating standard deviation of a feature from first training data and acquiring a second standard deviation value by calculating standard deviation of a feature from second training data acquired in a different environment from an environment of the first training data. Aspects also include creating a feature adapted to an environment where the first training data is recorded, by multiplying the feature acquired from the second training data by a ratio obtained by dividing the first standard deviation value by the second standard deviation value. Aspects further include reconstructing an acoustic model constructed using training data acquired in the same environment as the environment of the first training data using the feature adapted to the environment where the first training data is recorded.
    Type: Application
    Filed: August 16, 2017
    Publication date: November 30, 2017
    Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki
  • Patent number: 9812122
    Abstract: A construction method for a speech recognition model, in which a computer system includes: a step of acquiring alignment between speech of each of a plurality of speakers and a transcript of the speaker; a step of joining transcripts of the respective ones of the plurality of speakers along a time axis, creating a transcript of speech of mixed speakers obtained from synthesized speech of the speakers, and replacing predetermined transcribed portions of the plurality of speakers overlapping on the time axis with a unit which represents a simultaneous speech segment; and a step of constructing at least one of an acoustic model and a language model which make up a speech recognition model, based on the transcript of the speech of the mixed speakers.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: November 7, 2017
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki, Ryuki Tachibana
  • Publication number: 20170278508
    Abstract: Methods and systems are provided for finding a target document in spoken language processing. One of the methods includes calculating a score of each document in a document set, in response to a receipt of first n words of output of an automatic speech recognition (ASR) system, n being equal to or greater than zero. The method further includes reading a prior distribution of each document in the document set from a memory device, and updating, for each document in the document set, the score, using the prior distribution, and a weight for interpolation, the weight for interpolation being set based on a confidence score of output of the ASR system. The method additionally includes finding a target document among the document set, based on the updated score of each document.
    Type: Application
    Filed: March 22, 2016
    Publication date: September 28, 2017
    Inventors: Gakuto Kurata, Masayuki A. Suzuki, Ryuki Tachibana
  • Patent number: 9747893
    Abstract: A computer-based, unsupervised training method for an N-gram language model includes reading, by a computer, recognition results obtained as a result of speech recognition of speech data; acquiring, by the computer, a reliability for each of the read recognition results; referring, by the computer, to the recognition result and the acquired reliability to select an N-gram entry; and training, by the computer, the N-gram language model about selected one or more of the N-gram entries using all recognition results.
    Type: Grant
    Filed: October 6, 2016
    Date of Patent: August 29, 2017
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
  • Publication number: 20170243114
    Abstract: A computer implemented method for adapting a model for recognition processing to a target-domain is disclosed. The method includes preparing a first distribution in relation to a part of the model, in which the first distribution is derived from data of a training-domain for the model. The method also includes obtaining a second distribution in relation to the part of the model by using data of the target-domain. The method further includes tuning one or more parameters of the part of the model so that difference between the first and the second distributions becomes small.
    Type: Application
    Filed: February 19, 2016
    Publication date: August 24, 2017
    Inventor: Gakuto Kurata
  • Publication number: 20170235547
    Abstract: A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars.
    Type: Application
    Filed: January 12, 2017
    Publication date: August 17, 2017
    Inventors: Gakuto Kurata, Tohru Nagano, Michiaki Tatsubori
  • Publication number: 20170221486
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: January 29, 2016
    Publication date: August 3, 2017
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20170193200
    Abstract: A method and system are provided for predicting chemical structures. The method includes receiving, at a user interface, intended structural feature values and intended chemical property values, as vectors. The method further includes constructing, by a hardware processor, a prediction model, wherein the prediction model predicts other structural feature values from the intended structural feature values and the intended chemical property values, and automatically configuring, by the hardware processor, at least one chemical structure candidate from the other structural feature vectors.
    Type: Application
    Filed: December 30, 2015
    Publication date: July 6, 2017
    Inventors: Hsiang H. Hsu, Gakuto Kurata, Koji Masuda, Shigeru Nakagawa, Hajime Nakamura, Seiji Takeda
  • Publication number: 20170169813
    Abstract: Methods and systems for language processing include training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
    Type: Application
    Filed: December 14, 2015
    Publication date: June 15, 2017
    Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Tohru Nagano
  • Patent number: 9626958
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Grant
    Filed: May 27, 2016
    Date of Patent: April 18, 2017
    Assignee: SINOEAST CONCEPT LIMITED
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
  • Patent number: 9626957
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Grant
    Filed: May 27, 2016
    Date of Patent: April 18, 2017
    Assignee: SINOEAST CONCEPT LIMITED
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
  • Patent number: 9601110
    Abstract: A computer-based, unsupervised training method for an N-gram language model includes reading, by a computer, recognition results obtained as a result of speech recognition of speech data; acquiring, by the computer, a reliability for each of the read recognition results; referring, by the computer, to the recognition result and the acquired reliability to select an N-gram entry; and training, by the computer, the N-gram language model about selected one or more of the N-gram entries using all recognition results.
    Type: Grant
    Filed: June 24, 2015
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
  • Publication number: 20170061960
    Abstract: A method, a system, and a computer program product for building an n-gram language model for an automatic speech recognition. The method includes reading training text data and additional text data both for the n-gram language model from a storage, and building the n-gram language model by a smoothing algorithm having discount parameters for n-gram counts. The additional text data includes plural sentences having at least one target keyword. Each discount parameter for each target keyword is tuned using development data which are different from the additional text data so that a predetermined balance between precision and recall is achieved.
    Type: Application
    Filed: August 28, 2015
    Publication date: March 2, 2017
    Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki, Ryuki Tachibana
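
Illustrative sketch for patent numbers 9870767 and 9870766 (acoustic model improvement): the abstracts describe multiplying a feature from the second training data by the ratio of the first standard deviation to the second. The Python below is only a minimal illustration of that arithmetic. The function name, the use of the population standard deviation, and the one-dimensional sample values are assumptions made here; the abstracts do not specify them.

```python
from statistics import pstdev


def adapt_features(first_features, second_features):
    """Scale second-environment features by sd(first) / sd(second).

    Hypothetical helper: both arguments are plain lists of floats for one
    feature dimension. Population standard deviation is an assumption.
    """
    sd_first = pstdev(first_features)
    sd_second = pstdev(second_features)
    if sd_second == 0:
        raise ValueError("second training data has zero standard deviation")
    ratio = sd_first / sd_second
    # Multiply each second-environment feature by the ratio.
    return [value * ratio for value in second_features]


# Made-up sample values, for illustration only.
first = [1.0, 2.0, 3.0, 4.0]
second = [2.0, 4.0, 6.0, 8.0]
adapted = adapt_features(first, second)
```

After scaling, the adapted features have the same standard deviation as the first training data, which is the sense in which they are "adapted to the environment" in this sketch.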
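
Illustrative sketch for patent numbers 9747893 and 9601110 (unsupervised N-gram training): the abstracts describe selecting N-gram entries by referring to recognition results and their reliabilities, then training on the selected entries using all recognition results. One possible reading is sketched below in Python. The reliability threshold, the bigram order, the helper names, and the use of raw counts in place of a full language model are all assumptions, not details taken from the patents.

```python
from collections import Counter


def ngrams(words, n):
    """Return all n-grams of a word list as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def train_counts(results, n=2, threshold=0.8):
    """Count selected n-grams over all recognition results.

    results: list of (words, reliability) pairs (hypothetical format).
    Step 1 selects entries seen in results whose reliability meets the
    threshold. Step 2 counts those entries over every result, including
    the low-reliability ones.
    """
    selected = set()
    for words, reliability in results:
        if reliability >= threshold:
            selected.update(ngrams(words, n))

    counts = Counter()
    for words, _ in results:
        for gram in ngrams(words, n):
            if gram in selected:
                counts[gram] += 1
    return counts


# Made-up recognition results, for illustration only.
results = [
    (["good", "morning", "everyone"], 0.95),
    (["good", "morning", "sir"], 0.40),
    (["bad", "morning", "sir"], 0.30),
]
counts = train_counts(results)
```

In this toy data, the bigram ("good", "morning") is selected from the reliable first result and then counted in the second result as well, while ("morning", "sir") is never selected and so never counted.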
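
Illustrative sketch for publication number 20170278508 (finding a target document): the abstract describes updating each document's score using a prior distribution and an interpolation weight set from the ASR confidence score. The abstract does not give the formula, so the Python below assumes a simple linear interpolation with the clamped confidence used directly as the weight. The function name and all sample numbers are hypothetical.

```python
def find_target_document(scores, priors, asr_confidence):
    """Pick the document with the highest interpolated score.

    Assumed update rule (not from the abstract):
        updated = w * score + (1 - w) * prior,  w = clamp(confidence, 0, 1)
    """
    weight = min(max(asr_confidence, 0.0), 1.0)
    updated = {
        doc: weight * scores[doc] + (1.0 - weight) * priors[doc]
        for doc in scores
    }
    best = max(updated, key=updated.get)
    return best, updated


# Made-up scores from partial ASR output and made-up priors.
scores = {"billing": 0.2, "shipping": 0.7, "returns": 0.1}
priors = {"billing": 0.6, "shipping": 0.2, "returns": 0.2}

best_high, _ = find_target_document(scores, priors, asr_confidence=0.9)
best_low, _ = find_target_document(scores, priors, asr_confidence=0.1)
```

With high confidence the ASR-based score dominates and "shipping" is chosen; with low confidence the prior dominates and "billing" is chosen.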