Patents by Inventor Ryo MASUMURA

Ryo MASUMURA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220036912
    Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t-1)-th utterance sequence information vector u_{t-1} that includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector to generate a t-th utterance sequence information vector u_t, where t is a natural number, and a tagging unit that determines a tag l_t that represents a result of analyzing a t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector u_t.
    Type: Application
    Filed: September 13, 2019
    Publication date: February 3, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA
  • Publication number: 20220013136
    Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.
    Type: Application
    Filed: January 27, 2020
    Publication date: January 13, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA
  • Publication number: 20210382467
    Abstract: An inspection system includes machine learning circuitry configured to determine whether each of objects belongs to a predetermined attribute based on feature data of each of the objects, feature data acquisition circuitry configured to acquire feature data of reevaluated objects which are determined to belong to the predetermined attribute without using the machine learning circuitry among excluded objects which are determined not to belong to the predetermined attribute by the machine learning circuitry, and parameter update circuitry configured to update a learning parameter of the machine learning circuitry based on teaching data including the acquired feature data acquired by the feature data acquisition circuitry.
    Type: Application
    Filed: August 23, 2021
    Publication date: December 9, 2021
    Applicant: KABUSHIKI KAISHA YASKAWA DENKI
    Inventors: Ryo MASUMURA, Masaru ADACHI
  • Publication number: 20210319783
    Abstract: A voice recognition device 10 includes: a phonological awareness feature amount extraction unit 11 that transforms an acoustic feature amount sequence of input voice into a phonological awareness feature amount sequence for the language 1 using a first model parameter group; a phonological awareness feature amount extraction unit 12 that transforms the acoustic feature amount sequence of the input voice into a phonological awareness feature amount sequence for the language 2 using a second model parameter group; a phonological recognition unit 13 that generates a posterior probability sequence from the acoustic feature amount sequence of the input voice, the phonological awareness feature amount sequence for the language 1, and the phonological awareness feature amount sequence for the language 2 using a third model parameter group; and a voice text transformation unit 14 that performs voice recognition based on the posterior probability sequence to output text of a voice recognition result.
    Type: Application
    Filed: June 21, 2019
    Publication date: October 14, 2021
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA
  • Patent number: 11081105
    Abstract: A model learning device comprises: an initial value setting part that uses a parameter of a learned first model including a neural network to set a parameter of a second model including a neural network having a same network structure as the first model; a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using learning features and the first model; a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using learning features and the second model; and a modified model update part that obtains a weighted sum of a second loss function calculated from correct information and from the second output probability distribution, and a cross entropy between the first output probability distribution and the second output probability dis
    Type: Grant
    Filed: September 5, 2017
    Date of Patent: August 3, 2021
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hirokazu Masataki, Taichi Asami, Takashi Nakamura, Ryo Masumura
  • Publication number: 20210183368
    Abstract: Learning data is generated automatically without manually applying rules. An acoustic model learning data generation device 20 includes a stochastic attribute label generation model 21 that generates attribute labels from a first model parameter group according to a first probability distribution; a stochastic phoneme sequence generation model 22 that generates a phoneme sequence from a second model parameter group and the attribute labels according to a second probability distribution; and a stochastic acoustic feature quantity sequence generation model 23 that generates an acoustic feature quantity sequence from a third model parameter group, the attribute labels, and the phoneme sequence according to a third probability distribution.
    Type: Application
    Filed: June 21, 2019
    Publication date: June 17, 2021
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA
  • Publication number: 20210174788
    Abstract: A language model score calculation apparatus calculates a prediction probability of a word w_i as a language model score of a language model based on a recurrent neural network. The language model score calculation apparatus includes a memory; and a processor configured to execute converting a word w_{i-1} that is observed immediately before the word w_i into a word vector Φ(w_{i-1}); converting a speaker label r_{i-1} corresponding to the word w_{i-1} and a speaker label r_i corresponding to the word w_i into a speaker vector Ψ(r_{i-1}) and a speaker vector Ψ(r_i), respectively; calculating a word history vector s_i by using the word vector Φ(w_{i-1}), the speaker vector Ψ(r_{i-1}), and a word history vector s_{i-1} that is obtained when a prediction probability of the word w_{i-1} is calculated; and calculating a prediction probability of the word w_i by using the word history vector s_{i-1} and the speaker vector Ψ(r_i).
    Type: Application
    Filed: June 21, 2019
    Publication date: June 10, 2021
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA
  • Publication number: 20210090552
    Abstract: A learning apparatus comprises a learning part that learns an error correction model by a set of a speech recognition result candidate and a correct text of speech recognition for given audio data, wherein the speech recognition result candidate includes a speech recognition result candidate which is different from the correct text, and the error correction model is a model that receives a word sequence of the speech recognition result candidate as input and outputs an error correction score indicating likelihood of the word sequence of the speech recognition result candidate in consideration of a speech recognition error.
    Type: Application
    Filed: February 18, 2019
    Publication date: March 25, 2021
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro TANAKA, Ryo MASUMURA
  • Publication number: 20210082415
    Abstract: A document identification device that improves class identification precision of multi-stream documents is provided. The document identification device includes: a primary stream expression generation unit that generates a primary stream expression, which is a fixed-length vector of a word sequence corresponding to each speaker's speech recorded in a setting including a plurality of speakers, for each speaker; a primary multi-stream expression generation unit that generates a primary multi-stream expression obtained by integrating the primary stream expression; a secondary stream expression generation unit that generates a secondary stream expression, which is a fixed-length vector generated based on the word sequence of each speaker and the primary multi-stream expression, for each speaker; and a secondary multi-stream expression generation unit that generates a secondary multi-stream expression obtained by integrating the secondary stream expression.
    Type: Application
    Filed: May 10, 2018
    Publication date: March 18, 2021
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Hirokazu MASATAKI
  • Publication number: 20210012158
    Abstract: By using training data containing tuples of texts for M types of tasks in N types of languages and correct labels of the texts as input, an optimized parameter group that defines N inter-task shared transformation functions α(n) corresponding to the N types of languages n and M inter-language shared transformation functions β(m) corresponding to the M types of tasks m is obtained. At least one of N and M is an integer greater than or equal to 2, each α(n) outputs a latent vector, which corresponds to the contents of an input text in a certain language n but does not depend on the language n, to β(1), ..., β(M), and each β(m) uses, as input, the latent vector output from any one of α(1), ..., α(N) and outputs an output label corresponding to the latent vector for a certain task m.
    Type: Application
    Filed: February 14, 2019
    Publication date: January 14, 2021
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA
  • Publication number: 20200218975
    Abstract: There is provided a technique for transforming a confusion network to a representation that can be used as an input for machine learning. A confusion network distributed representation sequence generating part that generates a confusion network distributed representation sequence, which is a vector sequence, from an arc word set sequence and an arc weight set sequence constituting the confusion network is included.
    Type: Application
    Filed: August 21, 2018
    Publication date: July 9, 2020
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Hirokazu MASATAKI
  • Publication number: 20200219413
    Abstract: The present invention provides a pronunciation error detection apparatus capable of following a text without the need for a correct sentence even when erroneous recognition such as a reading error occurs.
    Type: Application
    Filed: September 13, 2018
    Publication date: July 9, 2020
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Satoshi KOBASHIKAWA, Ryo MASUMURA, Hosana KAMIYAMA, Yusuke IJIMA, Yushi AONO
  • Publication number: 20190244604
    Abstract: A model learning device comprises: an initial value setting part that uses a parameter of a learned first model including a neural network to set a parameter of a second model including a neural network having a same network structure as the first model; a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using learning features and the first model; a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using learning features and the second model; and a modified model update part that obtains a weighted sum of a second loss function calculated from correct information and from the second output probability distribution, and a cross entropy between the first output probability distribution and the second output probability dis
    Type: Application
    Filed: September 5, 2017
    Publication date: August 8, 2019
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hirokazu MASATAKI, Taichi ASAMI, Takashi NAKAMURA, Ryo MASUMURA
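
The model learning device abstract above (publication 20190244604, granted as 11081105) describes a weighted sum of two terms: a loss computed from the correct information and the second model's output distribution, and a cross entropy between the first and second output distributions. The listing truncates the abstract, so the following is only a minimal sketch of such a weighted sum and not the patented method. It assumes the "second loss function" is itself a cross entropy against a one-hot label and that the two terms are mixed as (1 - weight) and weight. The function names and the `weight` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Turn output-layer scores into an output probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def combined_loss(first_probs, second_probs, correct_index, weight=0.5):
    """Weighted sum of the two terms named in the abstract.

    Assumption: the loss from the correct information is a cross entropy
    against a one-hot label, and the mixing is (1 - weight) / weight.
    """
    one_hot = [1.0 if i == correct_index else 0.0
               for i in range(len(second_probs))]
    loss_from_label = cross_entropy(one_hot, second_probs)
    loss_from_first_model = cross_entropy(first_probs, second_probs)
    return (1.0 - weight) * loss_from_label + weight * loss_from_first_model


# Illustrative values only: output-layer scores for one learning feature.
first_probs = softmax([2.0, 1.0, 0.1])    # learned first model
second_probs = softmax([1.5, 1.2, 0.3])   # second model being updated
loss = combined_loss(first_probs, second_probs, correct_index=0, weight=0.5)
```

With `weight=0.0` the sketch reduces to ordinary training on the correct labels. With `weight=1.0` the second model is pulled only toward the first model's output distribution.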