Patents by Inventor Tohru Nagano

Tohru Nagano has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240127801
    Abstract: Methods, systems, and computer program products for domain adaptive speech recognition using artificial intelligence are provided herein. A computer-implemented method includes generating a set of language data candidates, each language data candidate comprising one or more graphemes, by processing a sequence of phonemes related to input speech data using an artificial intelligence-based data conversion model; determining, for a target pair of phonemes and graphemes, a subset of graphemes from the set of language data candidates; generating a first speech recognition output by processing the subset of graphemes using at least one biasing language model and an artificial intelligence-based speech recognition model; generating a second speech recognition output by replacing at least a portion of the subset of graphemes in the first speech recognition output with at least one of the graphemes from the target pair; and performing automated actions based on the second speech recognition output.
    Type: Application
    Filed: October 13, 2022
    Publication date: April 18, 2024
    Inventors: Tohru Nagano, Gakuto Kurata
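
The filing above describes a two-pass, bias-then-substitute decoding flow: grapheme candidates are generated from phonemes, a subset related to a target phoneme/grapheme pair drives a biasing language model in a first recognition pass, and the biased spellings are then replaced by the target grapheme. The following is a minimal Python sketch of that flow under stated assumptions; the phoneme-to-grapheme table, the biasing heuristic, and both recognition passes are toy stand-ins for the AI-based models named in the abstract, not the patented implementation.

    # Hypothetical sketch of the bias-then-substitute decoding flow described above.
    # All tables and functions are toy stand-ins for the AI models in the filing.

    P2G = {                                   # toy phoneme-to-grapheme conversion table
        ("JH", "AA", "N"): ["jon", "john", "joan"],
    }

    def candidate_graphemes(phonemes):
        """Generate grapheme candidates for a phoneme sequence."""
        return P2G.get(tuple(phonemes), [])

    def biasing_subset(candidates, target_grapheme):
        """Keep candidates related to the target grapheme (here: same first letter)."""
        return [g for g in candidates if g[0] == target_grapheme[0] and g != target_grapheme]

    def first_pass(words, bias_terms):
        """Stand-in for recognition with a biasing language model: biased terms survive verbatim."""
        return list(words)

    def second_pass(first_output, bias_terms, target_grapheme):
        """Replace biased variants with the target grapheme to form the second output."""
        return [target_grapheme if w in bias_terms else w for w in first_output]

    target_phonemes, target_grapheme = ("JH", "AA", "N"), "john"
    candidates = candidate_graphemes(target_phonemes)      # ['jon', 'john', 'joan']
    subset = biasing_subset(candidates, target_grapheme)   # ['jon', 'joan']
    hyp1 = first_pass(["call", "jon", "now"], subset)
    hyp2 = second_pass(hyp1, subset, target_grapheme)
    print(hyp2)                                             # ['call', 'john', 'now']
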
  • Publication number: 20230237987
    Abstract: A computer-implemented method for preparing training data for a speech recognition model is provided, including obtaining a plurality of sentences from a corpus, dividing each phoneme in each sentence of the plurality of sentences into three hidden states, calculating, for each sentence of the plurality of sentences, a score based on a variation in duration of the three hidden states of each phoneme in the sentence, and sorting the plurality of sentences by using the calculated scores.
    Type: Application
    Filed: January 21, 2022
    Publication date: July 27, 2023
    Inventors: Takashi Fukuda, Tohru Nagano
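
A minimal sketch of the sentence-scoring idea above, assuming the score is the mean variance of the three hidden-state durations per phoneme; the durations and the exact scoring formula are illustrative assumptions rather than details taken from the filing.

    # Minimal sketch: score each sentence by the variation of its per-phoneme
    # hidden-state durations, then sort sentences by that score (assumed formula).

    import statistics

    # For each sentence: one [begin, middle, end] state-duration triple per phoneme (frames).
    sentences = {
        "hello world":  [[3, 8, 4], [2, 9, 3], [4, 7, 5]],
        "good morning": [[2, 2, 2], [1, 2, 1], [2, 3, 2]],
    }

    def sentence_score(state_durations):
        """Average, over phonemes, of the variance of the three state durations."""
        return statistics.mean(statistics.pvariance(d) for d in state_durations)

    ranked = sorted(sentences, key=lambda s: sentence_score(sentences[s]))
    print(ranked)   # sentences ordered from most uniform to most varied durations
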
  • Publication number: 20230069628
    Abstract: A computer-implemented method for fusing an end-to-end speech recognition model with an external language model (ExternalLM) is provided. The method includes obtaining an output of the end-to-end speech recognition model. The output is a probability distribution. The method further includes transforming, by a hardware processor, the probability distribution into a transformed probability distribution to relax a sharpness of the probability distribution. The method also includes fusing the transformed probability distribution and a probability distribution of the ExternalLM for decoding speech.
    Type: Application
    Filed: August 24, 2021
    Publication date: March 2, 2023
    Inventors: Tohru Nagano, Masayuki Suzuki, Gakuto Kurata
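
The abstract above describes relaxing the sharpness of the end-to-end model's output distribution before fusing it with an external language model. A small sketch follows, assuming temperature scaling as the relaxing transform and log-linear (shallow) fusion as the combination step; both choices are assumptions for illustration, not confirmed details of the filing.

    # Sketch of the relax-then-fuse step: flatten the sharp end-to-end distribution
    # (assumed: temperature scaling), then combine it with the external LM
    # (assumed: log-linear fusion).

    import numpy as np

    def relax(probs, temperature=2.0):
        """Flatten a sharp distribution by exponentiating with 1/T and renormalizing."""
        p = np.asarray(probs, dtype=float) ** (1.0 / temperature)
        return p / p.sum()

    def fuse(asr_probs, lm_probs, lm_weight=0.3):
        """Combine relaxed end-to-end ASR scores with external LM scores for decoding."""
        score = np.log(relax(asr_probs)) + lm_weight * np.log(np.asarray(lm_probs, dtype=float))
        return int(np.argmax(score))

    asr = [0.60, 0.35, 0.05]   # fairly sharp distribution from the end-to-end model
    lm  = [0.05, 0.90, 0.05]   # external language model over the same tokens
    print(fuse(asr, lm))        # -> 1: after relaxation, the external LM flips the decision

In this toy case the relaxation gives the external language model enough influence to change the decoded token, which is the point of softening the end-to-end model's overconfident output.
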
  • Patent number: 11107473
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: August 31, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Tohru Nagano
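
The entry above (and the related filings later in this list) describes preparing a vocalization before the speaker finishes the utterance. A toy sketch of that anticipatory flow follows; the completion and response tables are hypothetical stand-ins for the rich predictive model and the natural language vocalization generator.

    # Illustrative sketch: predict the remainder of a partial utterance and pre-build
    # the spoken response while the speaker is still talking.

    COMPLETIONS = {                                  # toy remainder predictor
        ("what", "is", "the"): "weather today",
    }

    RESPONSES = {                                    # toy vocalization generator
        "what is the weather today": "It is sunny.",
    }

    def prepare_response(partial_words):
        """Predict the remainder of a partial utterance and pre-build the vocalization."""
        remainder = COMPLETIONS.get(tuple(partial_words[-3:]), "")
        predicted = " ".join(partial_words + ([remainder] if remainder else []))
        return RESPONSES.get(predicted, "Could you repeat that?")

    # The vocalization is ready before the utterance is complete.
    print(prepare_response(["what", "is", "the"]))   # -> "It is sunny."
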
  • Patent number: 10600411
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: March 24, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20200051569
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: October 17, 2019
    Publication date: February 13, 2020
    Inventors: Gakuto Kurata, Tohru Nagano
  • Patent number: 10140976
    Abstract: Methods and systems for language processing include training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: November 27, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Tohru Nagano
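
The abstract above outlines N-best rescoring with a discriminative language model and a relevant-word list. Below is a toy reranking sketch under assumptions: the base scores, the bonus weight, and the word list are invented for illustration, not taken from the patent.

    # Toy reranking of N-best ASR hypotheses: combine a base score with a bonus for
    # words from a relevant-word list, then hand the winner to downstream NLP.

    RELEVANT_WORDS = {"invoice", "refund"}

    def rerank(nbest, relevant_bonus=1.0):
        """Pick the best hypothesis from (text, score) pairs; relevant words add a bonus."""
        def combined(item):
            text, score = item
            return score + relevant_bonus * sum(w in RELEVANT_WORDS for w in text.split())
        return max(nbest, key=combined)[0]

    nbest = [("send the in voice", 2.1), ("send the invoice", 2.0)]
    print(rerank(nbest))   # -> "send the invoice": the relevant-word bonus outweighs the score gap
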
  • Patent number: 10001970
    Abstract: A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars.
    Type: Grant
    Filed: January 12, 2017
    Date of Patent: June 19, 2018
    Assignee: Activision Publishing, Inc.
    Inventors: Gakuto Kurata, Tohru Nagano, Michiaki Tatsubori
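
The dialog-server abstract above gates message delivery on interest levels computed from avatar positions and on utterance strength. A small sketch follows, assuming interest decays with distance and that messages below a threshold are dropped; the actual formulas are not specified in the abstract.

    # Sketch of interest-level gating: interest falls off with distance between avatars,
    # and a message is delivered only if interest x utterance strength clears a threshold.

    import math

    def interest(pos_a, pos_b):
        """Interest between two avatars, decaying with distance in the virtual space."""
        return 1.0 / (1.0 + math.dist(pos_a, pos_b))

    def deliver(utterance, strength, speaker_pos, listener_pos, threshold=0.3):
        """Forward the message only if interest x utterance strength is high enough."""
        score = interest(speaker_pos, listener_pos) * strength
        return utterance if score >= threshold else None

    print(deliver("Hello!", strength=0.9, speaker_pos=(0, 0), listener_pos=(1, 1)))   # "Hello!"
    print(deliver("Hello!", strength=0.2, speaker_pos=(0, 0), listener_pos=(9, 9)))   # None
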
  • Patent number: 9972308
    Abstract: Methods, a system, and a classifier are provided. A method includes preparing, by a processor, pairs for an information retrieval task. Each pair includes (i) a training-stage speech recognition result for a respective sequence of training words and (ii) an answer label corresponding to the training-stage speech recognition result. The method further includes obtaining, by the processor, a respective rank for the answer label included in each pair to obtain a set of ranks. The method also includes determining, by the processor, for each pair, an end of question part in the training-stage speech recognition result based on the set of ranks. The method additionally includes building, by the processor, the classifier such that the classifier receives a recognition-stage speech recognition result and returns a corresponding end of question part for the recognition-stage speech recognition result, based on the end of question part determined for the pairs.
    Type: Grant
    Filed: November 8, 2016
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Tohru Nagano, Ryuki Tachibana
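
One plausible reading of the end-of-question detection above: treat the end of the question part as the shortest recognized prefix whose retrieval rank for the labeled answer is already good. The overlap-based retriever, the answer set, and the thresholds in this sketch are hypothetical.

    # Sketch: walk through prefixes of the recognized text and stop at the first one
    # that retrieves the labeled answer at a good rank (assumed interpretation).

    ANSWERS = [
        "you can reset your password from the login page",
        "store hours are nine to five",
    ]

    def rank_of(query, answer):
        """1-based rank of `answer` when answers are ordered by word overlap with `query`."""
        overlap = lambda a: len(set(query.split()) & set(a.split()))
        ordered = sorted(ANSWERS, key=overlap, reverse=True)
        return ordered.index(answer) + 1

    def end_of_question(words, answer, max_rank=1, min_overlap=2):
        """Shortest prefix length at which the labeled answer is retrieved at a good rank."""
        for i in range(1, len(words) + 1):
            prefix = " ".join(words[:i])
            overlap = len(set(prefix.split()) & set(answer.split()))
            if overlap >= min_overlap and rank_of(prefix, answer) <= max_rank:
                return i
        return len(words)

    asr_result = "how do i reset my password thanks a lot".split()
    print(end_of_question(asr_result, ANSWERS[0]))   # -> 6, i.e. "... reset my password"
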
  • Publication number: 20180130460
    Abstract: Methods, a system, and a classifier are provided. A method includes preparing, by a processor, pairs for an information retrieval task. Each pair includes (i) a training-stage speech recognition result for a respective sequence of training words and (ii) an answer label corresponding to the training-stage speech recognition result. The method further includes obtaining, by the processor, a respective rank for the answer label included in each pair to obtain a set of ranks. The method also includes determining, by the processor, for each pair, an end of question part in the training-stage speech recognition result based on the set of ranks. The method additionally includes building, by the processor, the classifier such that the classifier receives a recognition-stage speech recognition result and returns a corresponding end of question part for the recognition-stage speech recognition result, based on the end of question part determined for the pairs.
    Type: Application
    Filed: November 8, 2016
    Publication date: May 10, 2018
    Inventors: Tohru Nagano, Ryuki Tachibana
  • Patent number: 9922647
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: March 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20180033433
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: October 6, 2017
    Publication date: February 1, 2018
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20170235547
    Abstract: A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars.
    Type: Application
    Filed: January 12, 2017
    Publication date: August 17, 2017
    Inventors: Gakuto Kurata, Tohru Nagano, Michiaki Tatsubori
  • Publication number: 20170221486
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Application
    Filed: January 29, 2016
    Publication date: August 3, 2017
    Inventors: Gakuto Kurata, Tohru Nagano
  • Publication number: 20170169813
    Abstract: Methods and systems for language processing include training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
    Type: Application
    Filed: December 14, 2015
    Publication date: June 15, 2017
    Inventors: Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Tohru Nagano
  • Patent number: 9626957
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Grant
    Filed: May 27, 2016
    Date of Patent: April 18, 2017
    Assignee: SINOEAST CONCEPT LIMITED
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
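
The speech-retrieval abstracts in this family describe a two-stage keyword search: candidate segments are found by comparing the keyword's character string with word-level recognition output, then each segment is scored against the keyword's phoneme (or syllable) string using phoneme-level recognition output. The sketch below assumes a simple exact word match for stage one and a sequence-similarity ratio for stage two; neither choice is taken from the patents.

    # Sketch of two-stage keyword search: stage 1 matches the keyword string in the
    # word-level transcript; stage 2 scores each hit against the keyword's phoneme
    # string and keeps only segments above a threshold.

    from difflib import SequenceMatcher

    def find_segments(word_transcript, keyword):
        """Stage 1: positions whose word-recognition output matches the keyword string."""
        return [i for i, w in enumerate(word_transcript.split()) if w == keyword]

    def phoneme_score(recognized_phonemes, keyword_phonemes):
        """Stage 2: similarity between the recognized and expected phoneme strings."""
        return SequenceMatcher(None, recognized_phonemes, keyword_phonemes).ratio()

    word_asr = "please call tokyo office tomorrow"
    phoneme_asr = {2: "t ou k y ou"}                 # phoneme recognition per word position
    keyword, keyword_phonemes = "tokyo", "t ou k y ou"

    hits = [i for i in find_segments(word_asr, keyword)
            if phoneme_score(phoneme_asr.get(i, ""), keyword_phonemes) > 0.8]
    print(hits)   # -> [2]: segment confirmed by both word- and phoneme-level evidence
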
  • Patent number: 9626958
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Grant
    Filed: May 27, 2016
    Date of Patent: April 18, 2017
    Assignee: SINOEAST CONCEPT LIMITED
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
  • Patent number: 9583109
    Abstract: A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars.
    Type: Grant
    Filed: October 7, 2013
    Date of Patent: February 28, 2017
    Assignee: Activision Publishing, Inc.
    Inventors: Gakuto Kurata, Tohru Nagano, Michiaki Tatsubori
  • Publication number: 20160275940
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Application
    Filed: May 27, 2016
    Publication date: September 22, 2016
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura
  • Publication number: 20160275939
    Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
    Type: Application
    Filed: May 27, 2016
    Publication date: September 22, 2016
    Inventors: Gakuto Kurata, Tohru Nagano, Masafumi Nishimura