Patents by Inventor Tohru Nagano

Tohru Nagano has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240127801
    Abstract: Methods, systems, and computer program products for domain adaptive speech recognition using artificial intelligence are provided herein. A computer-implemented method includes generating a set of language data candidates, each language data candidate comprising one or more graphemes, by processing a sequence of phonemes related to input speech data using an artificial intelligence-based data conversion model; determining, for a target pair of phonemes and graphemes, a subset of graphemes from the set of language data candidates; generating a first speech recognition output by processing the subset of graphemes using at least one biasing language model and an artificial intelligence-based speech recognition model; generating a second speech recognition output by replacing at least a portion of the subset of graphemes in the first speech recognition output with at least one of the graphemes from the target pair; and performing automated actions based on the second speech recognition output.
    Type: Application
    Filed: October 13, 2022
    Publication date: April 18, 2024
    Inventors: Tohru Nagano, Gakuto Kurata
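
The filing above describes a two-pass, bias-then-substitute decoding flow: grapheme candidates are generated from phonemes, a subset related to a target phoneme/grapheme pair drives a biasing language model in a first recognition pass, and the biased spellings are then replaced by the target grapheme. The following is a minimal Python sketch of that flow under stated assumptions; the phoneme-to-grapheme table, the biasing heuristic, and both recognition passes are toy stand-ins for the AI-based models named in the abstract, not the patented implementation.

    # Hypothetical sketch of the bias-then-substitute decoding flow described above.
    # All tables and functions are toy stand-ins for the AI models in the filing.

    P2G = {                                   # toy phoneme-to-grapheme conversion table
        ("JH", "AA", "N"): ["jon", "john", "joan"],
    }

    def candidate_graphemes(phonemes):
        """Generate grapheme candidates for a phoneme sequence."""
        return P2G.get(tuple(phonemes), [])

    def biasing_subset(candidates, target_grapheme):
        """Keep candidates related to the target grapheme (here: same first letter)."""
        return [g for g in candidates if g[0] == target_grapheme[0] and g != target_grapheme]

    def first_pass(words, bias_terms):
        """Stand-in for recognition with a biasing language model: biased terms survive verbatim."""
        return list(words)

    def second_pass(first_output, bias_terms, target_grapheme):
        """Replace biased variants with the target grapheme to form the second output."""
        return [target_grapheme if w in bias_terms else w for w in first_output]

    target_phonemes, target_grapheme = ("JH", "AA", "N"), "john"
    candidates = candidate_graphemes(target_phonemes)      # ['jon', 'john', 'joan']
    subset = biasing_subset(candidates, target_grapheme)   # ['jon', 'joan']
    hyp1 = first_pass(["call", "jon", "now"], subset)
    hyp2 = second_pass(hyp1, subset, target_grapheme)
    print(hyp2)                                             # ['call', 'john', 'now']
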
  • Publication number: 20230237987
    Abstract: A computer-implemented method for preparing training data for a speech recognition model is provided, including obtaining a plurality of sentences from a corpus, dividing each phoneme in each sentence of the plurality of sentences into three hidden states, calculating, for each sentence of the plurality of sentences, a score based on a variation in duration of the three hidden states of each phoneme in the sentence, and sorting the plurality of sentences by using the calculated scores.
    Type: Application
    Filed: January 21, 2022
    Publication date: July 27, 2023
    Inventors: Takashi Fukuda, Tohru Nagano
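
A minimal sketch of the sentence-scoring idea above, assuming the score is the mean variance of the three hidden-state durations per phoneme; the durations and the exact scoring formula are illustrative assumptions rather than details taken from the filing.

    # Minimal sketch: score each sentence by the variation of its per-phoneme
    # hidden-state durations, then sort sentences by that score (assumed formula).

    import statistics

    # For each sentence: one [begin, middle, end] state-duration triple per phoneme (frames).
    sentences = {
        "hello world":  [[3, 8, 4], [2, 9, 3], [4, 7, 5]],
        "good morning": [[2, 2, 2], [1, 2, 1], [2, 3, 2]],
    }

    def sentence_score(state_durations):
        """Average, over phonemes, of the variance of the three state durations."""
        return statistics.mean(statistics.pvariance(d) for d in state_durations)

    ranked = sorted(sentences, key=lambda s: sentence_score(sentences[s]))
    print(ranked)   # sentences ordered from most uniform to most varied durations
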
  • Publication number: 20230069628
    Abstract: A computer-implemented method for fusing an end-to-end speech recognition model with an external language model (ExternalLM) is provided. The method includes obtaining an output of the end-to-end speech recognition model. The output is a probability distribution. The method further includes transforming, by a hardware processor, the probability distribution into a transformed probability distribution to relax a sharpness of the probability distribution. The method also includes fusing the transformed probability distribution and a probability distribution of the ExternalLM for decoding speech.
    Type: Application
    Filed: August 24, 2021
    Publication date: March 2, 2023
    Inventors: Tohru Nagano, Masayuki Suzuki, Gakuto Kurata
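
The abstract above describes relaxing the sharpness of the end-to-end model's output distribution before fusing it with an external language model. A small sketch follows, assuming temperature scaling as the relaxing transform and log-linear (shallow) fusion as the combination step; both choices are assumptions for illustration, not confirmed details of the filing.

    # Sketch of the relax-then-fuse step: flatten the sharp end-to-end distribution
    # (assumed: temperature scaling), then combine it with the external LM
    # (assumed: log-linear fusion).

    import numpy as np

    def relax(probs, temperature=2.0):
        """Flatten a sharp distribution by exponentiating with 1/T and renormalizing."""
        p = np.asarray(probs, dtype=float) ** (1.0 / temperature)
        return p / p.sum()

    def fuse(asr_probs, lm_probs, lm_weight=0.3):
        """Combine relaxed end-to-end ASR scores with external LM scores for decoding."""
        score = np.log(relax(asr_probs)) + lm_weight * np.log(np.asarray(lm_probs, dtype=float))
        return int(np.argmax(score))

    asr = [0.60, 0.35, 0.05]   # fairly sharp distribution from the end-to-end model
    lm  = [0.05, 0.90, 0.05]   # external language model over the same tokens
    print(fuse(asr, lm))        # -> 1: after relaxation, the external LM flips the decision

In this toy case the relaxation gives the external language model enough influence to change the decoded token, which is the point of softening the end-to-end model's overconfident output.
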
  • Patent number: 11107473
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: August 31, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Tohru Nagano
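
The entry above (and the related filings later in this list) describes preparing a vocalization before the speaker finishes the utterance. A toy sketch of that anticipatory flow follows; the completion and response tables are hypothetical stand-ins for the rich predictive model and the natural language vocalization generator.

    # Illustrative sketch: predict the remainder of a partial utterance and pre-build
    # the spoken response while the speaker is still talking.

    COMPLETIONS = {                                  # toy remainder predictor
        ("what", "is", "the"): "weather today",
    }

    RESPONSES = {                                    # toy vocalization generator
        "what is the weather today": "It is sunny.",
    }

    def prepare_response(partial_words):
        """Predict the remainder of a partial utterance and pre-build the vocalization."""
        remainder = COMPLETIONS.get(tuple(partial_words[-3:]), "")
        predicted = " ".join(partial_words + ([remainder] if remainder else []))
        return RESPONSES.get(predicted, "Could you repeat that?")

    # The vocalization is ready before the utterance is complete.
    print(prepare_response(["what", "is", "the"]))   # -> "It is sunny."
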
  • Patent number: 10600411
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: March 24, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20200051569
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: October 17, 2019
    Publication date: February 13, 2020
    Inventors: Gakuto Kurata, Tohru Nagano
  • Patent number: 10140976
    Abstract: Methods and systems for language processing include training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: November 27, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Tohru Nagano
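
The abstract above outlines N-best rescoring with a discriminative language model and a relevant-word list. Below is a toy reranking sketch under assumptions: the base scores, the bonus weight, and the word list are invented for illustration, not taken from the patent.

    # Toy reranking of N-best ASR hypotheses: combine a base score with a bonus for
    # words from a relevant-word list, then hand the winner to downstream NLP.

    RELEVANT_WORDS = {"invoice", "refund"}

    def rerank(nbest, relevant_bonus=1.0):
        """Pick the best hypothesis from (text, score) pairs; relevant words add a bonus."""
        def combined(item):
            text, score = item
            return score + relevant_bonus * sum(w in RELEVANT_WORDS for w in text.split())
        return max(nbest, key=combined)[0]

    nbest = [("send the in voice", 2.1), ("send the invoice", 2.0)]
    print(rerank(nbest))   # -> "send the invoice": the relevant-word bonus outweighs the score gap
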
  • Patent number: 10001970
    Abstract: A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars.
    Type: Grant
    Filed: January 12, 2017
    Date of Patent: June 19, 2018
    Assignee: Activision Publishing, Inc.
    Inventors: Gakuto Kurata, Tohru Nagano, Michiaki Tatsubori
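
The dialog-server abstract above gates message delivery on interest levels computed from avatar positions and on utterance strength. A small sketch follows, assuming interest decays with distance and that messages below a threshold are dropped; the actual formulas are not specified in the abstract.

    # Sketch of interest-level gating: interest falls off with distance between avatars,
    # and a message is delivered only if interest x utterance strength clears a threshold.

    import math

    def interest(pos_a, pos_b):
        """Interest between two avatars, decaying with distance in the virtual space."""
        return 1.0 / (1.0 + math.dist(pos_a, pos_b))

    def deliver(utterance, strength, speaker_pos, listener_pos, threshold=0.3):
        """Forward the message only if interest x utterance strength is high enough."""
        score = interest(speaker_pos, listener_pos) * strength
        return utterance if score >= threshold else None

    print(deliver("Hello!", strength=0.9, speaker_pos=(0, 0), listener_pos=(1, 1)))   # "Hello!"
    print(deliver("Hello!", strength=0.2, speaker_pos=(0, 0), listener_pos=(9, 9)))   # None
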
  • Patent number: 9972308
    Abstract: Methods, a system, and a classifier are provided. A method includes preparing, by a processor, pairs for an information retrieval task. Each pair includes (i) a training-stage speech recognition result for a respective sequence of training words and (ii) an answer label corresponding to the training-stage speech recognition result. The method further includes obtaining, by the processor, a respective rank for the answer label included in each pair to obtain a set of ranks. The method also includes determining, by the processor, for each pair, an end of question part in the training-stage speech recognition result based on the set of ranks. The method additionally includes building, by the processor, the classifier such that the classifier receives a recognition-stage speech recognition result and returns a corresponding end of question part for the recognition-stage speech recognition result, based on the end of question part determined for the pairs.
    Type: Grant
    Filed: November 8, 2016
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Tohru Nagano, Ryuki Tachibana
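
One plausible reading of the end-of-question detection above: treat the end of the question part as the shortest recognized prefix whose retrieval rank for the labeled answer is already good. The overlap-based retriever, the answer set, and the thresholds in this sketch are hypothetical.

    # Sketch: walk through prefixes of the recognized text and stop at the first one
    # that retrieves the labeled answer at a good rank (assumed interpretation).

    ANSWERS = [
        "you can reset your password from the login page",
        "store hours are nine to five",
    ]

    def rank_of(query, answer):
        """1-based rank of `answer` when answers are ordered by word overlap with `query`."""
        overlap = lambda a: len(set(query.split()) & set(a.split()))
        ordered = sorted(ANSWERS, key=overlap, reverse=True)
        return ordered.index(answer) + 1

    def end_of_question(words, answer, max_rank=1, min_overlap=2):
        """Shortest prefix length at which the labeled answer is retrieved at a good rank."""
        for i in range(1, len(words) + 1):
            prefix = " ".join(words[:i])
            overlap = len(set(prefix.split()) & set(answer.split()))
            if overlap >= min_overlap and rank_of(prefix, answer) <= max_rank:
                return i
        return len(words)

    asr_result = "how do i reset my password thanks a lot".split()
    print(end_of_question(asr_result, ANSWERS[0]))   # -> 6, i.e. "... reset my password"
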
  • Publication number: 20180130460
    Abstract: Methods, a system, and a classifier are provided. A method includes preparing, by a processor, pairs for an information retrieval task. Each pair includes (i) a training-stage speech recognition result for a respective sequence of training words and (ii) an answer label corresponding to the training-stage speech recognition result. The method further includes obtaining, by the processor, a respective rank for the answer label included in each pair to obtain a set of ranks. The method also includes determining, by the processor, for each pair, an end of question part in the training-stage speech recognition result based on the set of ranks. The method additionally includes building, by the processor, the classifier such that the classifier receives a recognition-stage speech recognition result and returns a corresponding end of question part for the recognition-stage speech recognition result, based on the end of question part determined for the pairs.
    Type: Application
    Filed: November 8, 2016
    Publication date: May 10, 2018
    Inventors: Tohru Nagano, Ryuki Tachibana
  • Patent number: 9922647
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: March 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20180033433
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: October 6, 2017
    Publication date: February 1, 2018
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20170235547
    Abstract: A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars.
    Type: Application
    Filed: January 12, 2017
    Publication date: August 17, 2017
    Inventors: Gakuto Kurata, Tohru Nagano, Michiaki Tatsubori
  • Publication number: 20170221486
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: January 29, 2016
    Publication date: August 3, 2017
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20170169813
    Abstract: Methods and systems for language processing include training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
    Type: Application
    Filed: December 14, 2015
    Publication date: June 15, 2017
    Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Tohru Nagano
  • Patent number: 9626957
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Grant
    Filed: May 27, 2016
    Date of Patent: April 18, 2017
    Assignee: SINOEAST CONCEPT LIMITED
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
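
The speech-retrieval abstracts in this family describe a two-stage keyword search: candidate segments are found by comparing the keyword's character string with word-level recognition output, then each segment is scored against the keyword's phoneme (or syllable) string using phoneme-level recognition output. The sketch below assumes a simple exact word match for stage one and a sequence-similarity ratio for stage two; neither choice is taken from the patents.

    # Sketch of two-stage keyword search: stage 1 matches the keyword string in the
    # word-level transcript; stage 2 scores each hit against the keyword's phoneme
    # string and keeps only segments above a threshold.

    from difflib import SequenceMatcher

    def find_segments(word_transcript, keyword):
        """Stage 1: positions whose word-recognition output matches the keyword string."""
        return [i for i, w in enumerate(word_transcript.split()) if w == keyword]

    def phoneme_score(recognized_phonemes, keyword_phonemes):
        """Stage 2: similarity between the recognized and expected phoneme strings."""
        return SequenceMatcher(None, recognized_phonemes, keyword_phonemes).ratio()

    word_asr = "please call tokyo office tomorrow"
    phoneme_asr = {2: "t ou k y ou"}                 # phoneme recognition per word position
    keyword, keyword_phonemes = "tokyo", "t ou k y ou"

    hits = [i for i in find_segments(word_asr, keyword)
            if phoneme_score(phoneme_asr.get(i, ""), keyword_phonemes) > 0.8]
    print(hits)   # -> [2]: segment confirmed by both word- and phoneme-level evidence
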
  • Patent number: 9626958
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Grant
    Filed: May 27, 2016
    Date of Patent: April 18, 2017
    Assignee: SINOEAST CONCEPT LIMITED
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
  • Patent number: 9583109
    Abstract: A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars.
    Type: Grant
    Filed: October 7, 2013
    Date of Patent: February 28, 2017
    Assignee: Activision Publishing, Inc.
    Inventors: Gakuto Kurata, Tohru Nagano, Michiaki Tatsubori
  • Publication number: 20160275940
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Application
    Filed: May 27, 2016
    Publication date: September 22, 2016
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
  • Publication number: 20160275939
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Application
    Filed: May 27, 2016
    Publication date: September 22, 2016
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura