Patents by Inventor Gakuto Kurata

Gakuto Kurata has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10621509
    Abstract: Method, system, and computer program product for learning a classification model. The present invention provides a computer-implemented method for learning a classification model using one or more training data items, each having a training input and one or more correct labels assigned to the training input, the classification model having a plurality of hidden units and a plurality of output units. The method includes: obtaining a combination of co-occurring labels expected to appear together for an input to the classification model; initializing the classification model by preparing a dedicated unit for the combination from among the plurality of hidden units so as to activate together the related output units connected to the dedicated unit among the plurality of output units, each related output unit corresponding to a co-occurring label in the combination; and training the classification model using the one or more training data items.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: April 14, 2020
    Assignee: International Business Machines Corporation
    Inventor: Gakuto Kurata
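    A minimal sketch of the dedicated-unit initialization described in the abstract above, in NumPy. The layer sizes, the choice of which hidden unit to dedicate, and the weight scale are illustrative assumptions, not details taken from the patent.
    ```python
    import numpy as np

    def init_with_dedicated_unit(n_hidden, n_labels, co_occurring, scale=1.0, seed=0):
        """Initialize hidden-to-output weights, reserving one hidden unit whose
        outgoing weights activate the co-occurring labels together."""
        rng = np.random.default_rng(seed)
        W = rng.normal(0.0, 0.01, size=(n_hidden, n_labels))  # random baseline
        dedicated = 0  # reserve hidden unit 0 for the label combination
        for label in co_occurring:
            W[dedicated, label] = scale  # strong positive weight to each related output
        return W

    # Example: output units 2 and 5 correspond to labels expected to co-occur.
    W = init_with_dedicated_unit(n_hidden=16, n_labels=8, co_occurring=[2, 5])
    ```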
  • Patent number: 10607604
    Abstract: Vocabulary consistency for a language model may be improved by splitting a target token in an initial vocabulary into a plurality of split tokens, calculating an entropy of the target token and an entropy of the plurality of split tokens in a bootstrap language model, and determining whether to delete the target token from the initial vocabulary based on at least the entropy of the target token and the entropy of the plurality of split tokens.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: March 31, 2020
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata
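    A toy sketch of the entropy comparison in the abstract above. The unigram probabilities stand in for a bootstrap language model, and the direction of the deletion rule (delete when the split tokens do not lose entropy) is an assumption for illustration.
    ```python
    import math

    def entropy_contribution(prob):
        """Per-token contribution -p * log2(p) to the unigram entropy."""
        return -prob * math.log2(prob)

    def should_delete(target, splits, unigram_probs):
        """Compare the entropy of the target token against that of its splits."""
        h_target = entropy_contribution(unigram_probs[target])
        h_splits = sum(entropy_contribution(unigram_probs[s]) for s in splits)
        return h_splits <= h_target

    # Toy probabilities standing in for the bootstrap language model.
    probs = {"homework": 0.001, "home": 0.02, "work": 0.03}
    print(should_delete("homework", ["home", "work"], probs))
    ```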
  • Patent number: 10600411
    Abstract: A method for reducing response time in a speech interface, including constructing a partially completed word sequence from a partially received utterance from a speaker captured by an audio sensor, predicting a remainder portion of the utterance using a processor running a rich predictive model, and responding to the partially completed word sequence and the predicted remainder portion with a vocalization produced by a natural language vocalization generator, wherein the vocalization is prepared before the complete utterance is received from the speaker and is conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: March 24, 2020
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Tohru Nagano
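    A minimal sketch of preparing a response before the utterance completes, per the abstract above. A bigram model stands in for the "rich predictive model"; the corpus and end-of-utterance marker are illustrative.
    ```python
    from collections import defaultdict

    class RemainderPredictor:
        """Toy stand-in for the rich predictive model: a bigram model that
        extends a partial word sequence until an end-of-utterance marker."""
        def __init__(self, corpus):
            self.next_words = defaultdict(lambda: defaultdict(int))
            for sent in corpus:
                words = sent.split() + ["</s>"]
                for a, b in zip(words, words[1:]):
                    self.next_words[a][b] += 1

        def predict_remainder(self, partial, max_len=10):
            current, remainder = partial.split()[-1], []
            for _ in range(max_len):
                candidates = self.next_words.get(current)
                if not candidates:
                    break
                current = max(candidates, key=candidates.get)
                if current == "</s>":
                    break
                remainder.append(current)
            return remainder

    model = RemainderPredictor(["please book a flight to tokyo"])
    print(model.predict_remainder("please book a"))  # ['flight', 'to', 'tokyo']
    ```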
  • Patent number: 10586529
    Abstract: A computer-implemented method for processing a speech signal includes: identifying speech segments in an input speech signal; calculating an upper variance and a lower variance, the upper variance being a variance of upper spectra larger than a criterion among speech spectra corresponding to frames in the speech segments, and the lower variance being a variance of lower spectra smaller than a criterion among the speech spectra corresponding to the frames in the speech segments; determining whether the input speech signal is a special input speech signal using a difference between the upper variance and the lower variance; and performing speech recognition of the input speech signal that has been determined to be the special input speech signal, using a special acoustic model for the special input speech signal.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: March 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Osamu Ichikawa, Takashi Fukuda, Gakuto Kurata, Bhuvana Ramabhadran
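    A rough sketch of the upper/lower variance test from the abstract above, in NumPy. Using the global mean as the criterion and a fixed threshold are illustrative choices, not values from the patent.
    ```python
    import numpy as np

    def upper_lower_variance(spectra):
        """Split spectral values around a criterion (here the mean) and return
        the variance of the upper and lower parts across speech frames."""
        criterion = spectra.mean()
        return spectra[spectra > criterion].var(), spectra[spectra < criterion].var()

    def is_special_signal(spectra, threshold=1.0):
        upper_var, lower_var = upper_lower_variance(spectra)
        return (upper_var - lower_var) > threshold  # route to the special acoustic model

    # spectra: frames x frequency bins, e.g. log power from an STFT.
    spectra = np.abs(np.random.randn(100, 257))
    print(is_special_signal(spectra))
    ```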
  • Publication number: 20200074292
    Abstract: Knowledge transfer between recurrent neural networks is performed by obtaining a first output sequence from a bidirectional Recurrent Neural Network (RNN) model for an input sequence, obtaining a second output sequence from a unidirectional RNN model for the input sequence, selecting at least one first output from the first output sequence based on a similarity between the at least one first output and a second output from the second output sequence; and training the unidirectional RNN model to increase the similarity between the at least one first output and the second output.
    Type: Application
    Filed: August 29, 2018
    Publication date: March 5, 2020
    Inventors: Gakuto Kurata, Kartik Audhkhasi
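    A hedged PyTorch sketch of the selection step in the abstract above: for each unidirectional (student) output, the most cosine-similar bidirectional (teacher) output within a small window is selected, and the student is trained to increase that similarity. The window size and the cosine criterion are assumptions.
    ```python
    import torch
    import torch.nn.functional as F

    def transfer_loss(teacher_seq, student_seq, window=2):
        """teacher_seq, student_seq: (time, dim) output sequences."""
        losses = []
        T = student_seq.size(0)
        for t in range(T):
            lo, hi = max(0, t - window), min(T, t + window + 1)
            sims = F.cosine_similarity(teacher_seq[lo:hi], student_seq[t:t + 1], dim=-1)
            best = lo + int(sims.argmax())
            losses.append(1.0 - F.cosine_similarity(
                teacher_seq[best:best + 1], student_seq[t:t + 1], dim=-1))
        return torch.cat(losses).mean()

    teacher = torch.randn(50, 128)                      # bidirectional RNN outputs
    student = torch.randn(50, 128, requires_grad=True)  # unidirectional RNN outputs
    transfer_loss(teacher.detach(), student).backward()
    ```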
  • Publication number: 20200065378
    Abstract: A computer-implemented method, computer program product, and system are provided for separating a word in a dictionary. The method includes reading a word from the dictionary as a source word. The method also includes searching the dictionary for another word having a substring with the same surface string and the same reading as the source word. The method additionally includes splitting that other word by the source word to obtain one or more remaining substrings of the other word. The method further includes registering each of the one or more remaining substrings as a new word in the dictionary.
    Type: Application
    Filed: October 31, 2019
    Publication date: February 27, 2020
    Inventors: Toru Nagano, Nobuyasu Itoh, Gakuto Kurata
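    A simplified sketch of the dictionary-splitting procedure shared by this publication and granted patent 10572586 below. Words carry a (surface, reading) pair; aligning the readings is reduced to a substring check, which is an assumption for illustration.
    ```python
    def split_dictionary_words(dictionary):
        """dictionary: list of (surface, reading) pairs. When another word
        contains a source word with the same surface string and reading,
        collect the remaining substrings as new words."""
        new_words = []
        for surface, reading in dictionary:
            for other_surface, other_reading in dictionary:
                if other_surface == surface or surface not in other_surface:
                    continue
                if reading not in other_reading:
                    continue
                for rest in other_surface.split(surface):
                    if rest:
                        new_words.append(rest)
        return new_words

    # Toy example with romanized readings.
    dic = [("tokyo", "toukyou"), ("tokyotower", "toukyoutawaa")]
    print(split_dictionary_words(dic))  # ['tower']
    ```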
  • Patent number: 10572586
    Abstract: A computer-implemented method, computer program product, and system are provided for separating a word in a dictionary. The method includes reading a word from the dictionary as a source word. The method also includes searching the dictionary for another word having a substring with the same surface string and the same reading as the source word. The method additionally includes splitting that other word by the source word to obtain one or more remaining substrings of the other word. The method further includes registering each of the one or more remaining substrings as a new word in the dictionary.
    Type: Grant
    Filed: February 27, 2018
    Date of Patent: February 25, 2020
    Assignee: International Business Machines Corporation
    Inventors: Toru Nagano, Nobuyasu Itoh, Gakuto Kurata
  • Publication number: 20200051569
    Abstract: A method for reducing response time in a speech interface, including constructing a partially completed word sequence from a partially received utterance from a speaker captured by an audio sensor, predicting a remainder portion of the utterance using a processor running a rich predictive model, and responding to the partially completed word sequence and the predicted remainder portion with a vocalization produced by a natural language vocalization generator, wherein the vocalization is prepared before the complete utterance is received from the speaker and is conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: October 17, 2019
    Publication date: February 13, 2020
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20200034702
    Abstract: A student neural network may be trained by a computer-implemented method, including: selecting a teacher neural network among a plurality of teacher neural networks, inputting input data to the selected teacher neural network to obtain a soft label output generated by the selected teacher neural network, and training a student neural network with at least the input data and the soft label output from the selected teacher neural network.
    Type: Application
    Filed: July 27, 2018
    Publication date: January 30, 2020
    Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
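    A brief PyTorch sketch of the single-teacher distillation step in the abstract above. The selection policy, the temperature, and the linear "networks" are placeholders; the patent does not prescribe them.
    ```python
    import torch
    import torch.nn.functional as F

    def distill_step(student, teachers, x, select, T=2.0):
        """Pick one teacher for this batch, obtain its softened output, and
        train the student toward it with a KL-divergence loss."""
        teacher = teachers[select(x)]
        with torch.no_grad():
            soft_labels = F.softmax(teacher(x) / T, dim=-1)
        log_probs = F.log_softmax(student(x) / T, dim=-1)
        return F.kl_div(log_probs, soft_labels, reduction="batchmean")

    teachers = [torch.nn.Linear(20, 5) for _ in range(3)]
    student = torch.nn.Linear(20, 5)
    x = torch.randn(8, 20)
    distill_step(student, teachers, x, select=lambda x: int(x.norm()) % 3).backward()
    ```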
  • Publication number: 20200034703
    Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output from each of the teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
    Type: Application
    Filed: July 27, 2018
    Publication date: January 30, 2020
    Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
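    The companion publication above uses every teacher for a common input. A short sketch, with simple averaging of the soft labels as one (assumed) way to combine the plurality of outputs:
    ```python
    import torch
    import torch.nn.functional as F

    def multi_teacher_soft_labels(teachers, x, T=2.0):
        """Soft label outputs from every teacher for common input x, averaged."""
        with torch.no_grad():
            outs = [F.softmax(t(x) / T, dim=-1) for t in teachers]
        return torch.stack(outs).mean(dim=0)

    teachers = [torch.nn.Linear(20, 5) for _ in range(3)]
    soft = multi_teacher_soft_labels(teachers, torch.randn(8, 20))  # (8, 5)
    ```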
  • Patent number: 10546230
    Abstract: Methods and a system are provided for generating labeled data. A method includes encoding, by a processor-based encoder, a first labeled data into an encoded representation of the first labeled data. The method further includes modifying the encoded representation into a modified representation by adding a perturbation to the encoded representation. The method additionally includes decoding, by a processor-based decoder, the modified representation into a second labeled data.
    Type: Grant
    Filed: August 12, 2016
    Date of Patent: January 28, 2020
    Assignee: International Business Machines Corporation
    Inventor: Gakuto Kurata
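    A minimal autoencoder-style sketch of the perturbation idea in the abstract above: encode a labeled example, perturb the code with Gaussian noise, and decode a second example that keeps the original label. Dimensions and noise scale are illustrative.
    ```python
    import torch

    class PerturbAugmenter(torch.nn.Module):
        def __init__(self, dim, code_dim, noise_std=0.1):
            super().__init__()
            self.encoder = torch.nn.Linear(dim, code_dim)
            self.decoder = torch.nn.Linear(code_dim, dim)
            self.noise_std = noise_std

        def forward(self, x):
            code = self.encoder(x)                              # encoded representation
            perturbed = code + self.noise_std * torch.randn_like(code)
            return self.decoder(perturbed)                      # second labeled example

    aug = PerturbAugmenter(dim=32, code_dim=8)
    x = torch.randn(4, 32)    # first labeled data; the label is carried over
    x_new = aug(x)            # perturbed variant sharing the same label
    ```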
  • Patent number: 10540963
    Abstract: A computer-implemented method for generating an input for a classifier. The method includes obtaining n-best hypotheses, which are the output of an automatic speech recognition (ASR) system for an utterance, combining the n-best hypotheses horizontally in a predetermined order with a separator between each pair of hypotheses, and outputting the combined n-best hypotheses as a single text input to a classifier.
    Type: Grant
    Filed: February 2, 2017
    Date of Patent: January 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Ryuki Tachibana
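    The combination step itself is a one-liner; a sketch with an assumed separator token:
    ```python
    def combine_nbest(hypotheses, separator=" <sep> "):
        """Join n-best ASR hypotheses, best first, into one classifier input."""
        return separator.join(hypotheses)

    nbest = ["play some jazz", "play sam jazz", "plays some jazz"]
    print(combine_nbest(nbest))
    # play some jazz <sep> play sam jazz <sep> plays some jazz
    ```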
  • Publication number: 20200020324
    Abstract: A computer-implemented method includes generating a single text data structure for a classifier of a speech recognition system, and sending the single text data structure to the classifier. Generating the single text data structure includes obtaining n-best hypotheses as an output of an automatic speech recognition (ASR) task for an utterance received by the speech recognition system, and combining the n-best hypotheses in a predetermined order with a separator between each pair of hypotheses to generate the single text data structure. The classifier is trained based on a single training text data structure by obtaining training source data, including selecting a first text sample and at least one similar text sample belonging to the same class as the first text sample based on a maximum number of hypotheses, and arranging the text samples based on a degree of similarity.
    Type: Application
    Filed: September 23, 2019
    Publication date: January 16, 2020
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Ryuki Tachibana
  • Publication number: 20200020323
    Abstract: A method, system, and computer program product for learning a recognition model for recognition processing. The method includes preparing one or more examples for learning, each of which includes an input segment, an additional segment adjacent to the input segment, and an assigned label. The input segment and the additional segment are extracted from original training data. A classification model is trained, using the input segment and the additional segment in the examples, to initialize parameters of the classification model so that extended segments including the input segment and the additional segment are reconstructed from the input segment. Then, the classification model is tuned to predict a target label, using the input segment and the assigned label in the examples, based on the initialized parameters. At least a portion of the obtained classification model is included in the recognition model.
    Type: Application
    Filed: September 25, 2019
    Publication date: January 16, 2020
    Inventor: Gakuto Kurata
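    A compact PyTorch sketch of the two phases in the abstract above (shared with granted patent 10529318 below): first initialize an encoder by reconstructing extended segments from input segments, then tune a classifier on the assigned labels. Linear layers, dimensions, and single-step updates are placeholders.
    ```python
    import torch
    import torch.nn.functional as F

    enc = torch.nn.Linear(10, 16)     # shared encoder
    recon = torch.nn.Linear(16, 30)   # reconstructs the extended segment
    clf = torch.nn.Linear(16, 4)      # label predictor, tuned in phase 2

    x = torch.randn(8, 10)            # input segments
    extended = torch.randn(8, 30)     # input + adjacent segments
    y = torch.randint(0, 4, (8,))     # assigned labels

    # Phase 1: initialize parameters by reconstructing extended segments.
    opt = torch.optim.SGD(list(enc.parameters()) + list(recon.parameters()), lr=0.1)
    loss = F.mse_loss(recon(enc(x)), extended)
    opt.zero_grad(); loss.backward(); opt.step()

    # Phase 2: tune for the target label on top of the initialized encoder.
    opt = torch.optim.SGD(list(enc.parameters()) + list(clf.parameters()), lr=0.1)
    loss = F.cross_entropy(clf(enc(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()
    ```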
  • Publication number: 20200013408
    Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence, using the corresponding related portion including the repetition indication of each of the candidates.
    Type: Application
    Filed: September 20, 2019
    Publication date: January 9, 2020
    Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
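    A toy sketch of the repetition labeling in the abstract above (also granted as patent 10529337 below). It marks only whole-candidate repeats with a `<rep>` indication; handling partial sequences, as the abstract describes, would need a fuzzier matcher.
    ```python
    import re

    def label_repetitions(text, candidate):
        """Keep the first occurrence of the candidate; replace later repeats
        inside the related portion with a repetition indication."""
        first = text.find(candidate)
        if first == -1:
            return text
        head = text[:first + len(candidate)]
        tail = re.sub(re.escape(candidate), "<rep>", text[first + len(candidate):])
        return head + tail

    ctx = "my number is 4 2 5 yes 4 2 5 that is right"
    print(label_repetitions(ctx, "4 2 5"))
    # my number is 4 2 5 yes <rep> that is right
    ```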
  • Patent number: 10529337
    Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence, using the corresponding related portion including the repetition indication of each of the candidates.
    Type: Grant
    Filed: January 7, 2019
    Date of Patent: January 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
  • Patent number: 10529318
    Abstract: A method, system, and computer program product for learning a recognition model for recognition processing. The method includes preparing one or more examples for learning, each of which includes an input segment, an additional segment adjacent to the input segment, and an assigned label. The input segment and the additional segment are extracted from original training data. A classification model is trained, using the input segment and the additional segment in the examples, to initialize parameters of the classification model so that extended segments including the input segment and the additional segment are reconstructed from the input segment. Then, the classification model is tuned to predict a target label, using the input segment and the assigned label in the examples, based on the initialized parameters. At least a portion of the obtained classification model is included in the recognition model.
    Type: Grant
    Filed: July 31, 2015
    Date of Patent: January 7, 2020
    Assignee: International Business Machines Corporation
    Inventor: Gakuto Kurata
  • Patent number: 10503827
    Abstract: A method and system are provided for training word embeddings of domain-specific words. The method includes training, by a processor, a first word embedding, using a general domain corpus, on one or more terms inputted by a user. The method further includes retraining, by the processor, the first word embedding, using a specific domain corpus, for a Natural Language Processing task, to create a tuned word embedding. The method also includes training, by the processor, a Neural Network for the Natural Language Processing task, using the specific domain corpus. The method additionally includes incorporating, by the processor, the trained Neural Network and the tuned word embedding into a Neural Network-based Natural Language Processing task. The retraining of the first word embedding and the training of the Neural Network are performed together, and the tuning of the word embedding is accelerated for domain-specific words by a change in a hyperparameter.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: December 10, 2019
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masayuki Suzuki, Ryuki Tachibana
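    A rough sketch of the accelerated retraining in the abstract above: during retraining on the specific-domain corpus, domain-specific embedding rows get a boosted learning rate. The boost factor, the placeholder loss, and the manual update are illustrative assumptions.
    ```python
    import torch

    vocab = ["the", "of", "ecg", "stent"]     # last two: domain-specific terms
    domain_specific = {2, 3}
    emb = torch.nn.Embedding(len(vocab), 50)  # first trained on a general corpus

    base_lr, boost = 0.01, 10.0
    lr_scale = torch.ones(len(vocab), 1)
    for i in domain_specific:
        lr_scale[i] = boost                   # boosted hyperparameter per row

    ids = torch.tensor([0, 2, 3, 1])          # batch from the specific-domain corpus
    loss = emb(ids).pow(2).mean()             # placeholder for the real task loss
    loss.backward()
    with torch.no_grad():
        emb.weight -= base_lr * lr_scale * emb.weight.grad
    ```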
  • Patent number: 10418029
    Abstract: Method of selecting training text for a language model, method of training the language model using the training text, and computer and computer program for executing the methods. The present invention provides a method for selecting training text for a language model that includes: generating a template for selecting training text from a corpus in a first domain according to the generation techniques of: (i) replacing one or more words in a word string selected from the corpus in the first domain with a special symbol representing any word or word string, and adopting the word string after replacement as a template for selecting the training text; and/or (ii) adopting the word string selected from the corpus in the first domain as the template for selecting the training text; and selecting, as the training text, text covered by the template from a corpus in a second domain different from the first domain.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: September 17, 2019
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
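    A small sketch of both template generation techniques from the abstract above and the covering test against the second-domain corpus. The wildcard pattern and toy sentences are illustrative.
    ```python
    import re

    def make_templates(sentences, replace_words):
        """Technique (i): replace selected words with a wildcard.
        Technique (ii): use the word string itself as a template."""
        templates = []
        for s in sentences:
            t = s
            for w in replace_words:
                t = t.replace(w, r"(\S+)")
            templates.append("^" + t + "$")             # (i)
            templates.append("^" + re.escape(s) + "$")  # (ii)
        return templates

    def select_covered(templates, corpus):
        """Keep second-domain sentences covered by any template."""
        return [s for s in corpus if any(re.match(t, s) for t in templates)]

    tpls = make_templates(["i want to fly to boston"], ["boston"])
    print(select_covered(tpls, ["i want to fly to denver", "cancel my booking"]))
    # ['i want to fly to denver']
    ```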
  • Publication number: 20190272318
    Abstract: A computer-implemented method, computer program product, and apparatus are provided. The method includes generating a plurality of sequences of small unit tokens from a first language model that is trained with a small unit corpus including the small unit tokens, the small unit corpus having been derived by tokenization with a small unit. The method further includes tokenizing the plurality of sequences of small unit tokens by a large unit that is larger than the small unit, to create a derived large unit corpus including derived large unit tokens.
    Type: Application
    Filed: March 1, 2018
    Publication date: September 5, 2019
    Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata
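    A toy sketch of the derivation in the abstract above: sample sequences from a character-level (small-unit) model, then retokenize them with a larger unit. Whitespace word tokenization stands in for the large-unit tokenizer, and the hand-built bigram table stands in for the trained language model.
    ```python
    import random

    def sample_char_sequences(bigram, n, max_len=20, seed=0):
        """Sample character sequences from a toy character-level model."""
        rng = random.Random(seed)
        out = []
        for _ in range(n):
            c, seq = "<s>", []
            for _ in range(max_len):
                nxt = rng.choice(bigram[c])
                if nxt == "</s>":
                    break
                seq.append(nxt)
                c = nxt
            out.append("".join(seq))
        return out

    bigram = {"<s>": ["a", "i"], "a": [" ", "</s>"], "i": [" ", "t"],
              " ": ["a", "i"], "t": ["</s>", " "]}
    chars = sample_char_sequences(bigram, n=5)
    large_unit_corpus = [s.split() for s in chars]  # retokenized by the large unit
    print(large_unit_corpus)
    ```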