Patents by Inventor Ryo MASUMURA
Ryo MASUMURA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220108217
Abstract: A model capable of estimating a label with high accuracy is learned even when the training data involves only a small number of raters per data item. Learning processing is performed in which pairs of data items and label expectation values, indicators representing the degree of correctness of each label for a data item, are used as training data, and a model that estimates a label for an input data item is obtained.
Type: Application
Filed: January 29, 2020
Publication date: April 7, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana KAMIYAMA, Satoshi KOBASHIKAWA, Atsushi ANDO, Ryo MASUMURA
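The core idea above, training against per-item label expectation values rather than a single hard label, can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function names and the simple vote-counting estimate of the expectation values are assumptions.

```python
import math

def label_expectations(ratings, num_labels):
    """Turn per-item rater votes into label expectation values:
    the empirical probability that each label is correct."""
    counts = [0] * num_labels
    for r in ratings:
        counts[r] += 1
    total = len(ratings)
    return [c / total for c in counts]

def soft_cross_entropy(predicted_probs, expectations):
    """Cross-entropy loss between the label expectation values (soft
    targets) and the model's predicted label distribution."""
    return -sum(e * math.log(p)
                for e, p in zip(expectations, predicted_probs) if e > 0)

# Only two raters labelled this item: one chose label 0, one chose label 1.
targets = label_expectations([0, 1], num_labels=3)   # [0.5, 0.5, 0.0]
loss = soft_cross_entropy([0.5, 0.4, 0.1], targets)
```

Minimizing this loss pushes the model toward the raters' agreement distribution instead of forcing a single, possibly noisy, majority label.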
-
Publication number: 20220093079
Abstract: Text corresponding to speech is labeled without dividing the speech into units such as words or characters. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates the label corresponding to each symbol from a fixed-length vector for the symbol, generated using the speech distributed representation, the symbol distributed representation, and the fixed-length vectors of the previous and next symbols.
Type: Application
Filed: January 10, 2020
Publication date: March 24, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Tomohiro TANAKA, Ryo MASUMURA, Takanobu OBA
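The per-symbol vector built from the speech representation plus the previous, current, and next symbol representations can be sketched roughly as below. This is a toy sketch under assumed names; the patent does not specify concatenation or dot-product scoring, which are chosen here only for illustration.

```python
def fixed_length_vector(speech_vec, symbol_vecs, i):
    """Fixed-length vector for symbol i: the speech distributed
    representation concatenated with the previous, current, and next
    symbol representations (zero-padded at the sequence edges)."""
    dim = len(symbol_vecs[0])
    zero = [0.0] * dim
    prev = symbol_vecs[i - 1] if i > 0 else zero
    nxt = symbol_vecs[i + 1] if i + 1 < len(symbol_vecs) else zero
    return speech_vec + prev + symbol_vecs[i] + nxt

def estimate_label(vec, label_weights):
    """Toy label estimation: pick the label whose weight vector has the
    highest dot product with the symbol's fixed-length vector."""
    return max(label_weights,
               key=lambda l: sum(a * b for a, b in zip(label_weights[l], vec)))

vec = fixed_length_vector([1.0], [[0.0], [1.0], [0.0]], i=1)
label = estimate_label(vec, {"accent": [0.0, 0.0, 1.0, 0.0],
                             "none":   [0.0, 1.0, 0.0, 0.0]})
```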
-
Publication number: 20220036912
Abstract: A tag estimation device is provided that, for an utterance made among several persons, estimates a tag representing the result of analyzing the utterance. The device includes an utterance sequence information vector generation unit that adds the t-th utterance word feature vector and the t-th speaker vector to the (t-1)-th utterance sequence information vector u_{t-1}, which incorporates the preceding utterance word feature vectors and speaker vectors, to generate the t-th utterance sequence information vector u_t, where t is a natural number; and a tagging unit that determines a tag l_t representing the result of analyzing the t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector u_t.
Type: Application
Filed: September 13, 2019
Publication date: February 3, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Tomohiro TANAKA
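The recurrence u_t = f(u_{t-1}, word features, speaker vector) followed by tagging can be sketched as follows. This is a deliberately simplified illustration: the patent does not say the update is elementwise addition, nor that tagging is a dot-product argmax; both are assumptions here.

```python
def update_sequence_vector(u_prev, word_vec, speaker_vec):
    """Fold the t-th utterance word feature vector and speaker vector into
    the running utterance sequence information vector (toy: addition)."""
    return [u + w + s for u, w, s in zip(u_prev, word_vec, speaker_vec)]

def tag_utterance(u_t, tag_weights):
    """Pick the tag whose weight vector best matches u_t (dot product)."""
    scores = {tag: sum(a * b for a, b in zip(w, u_t))
              for tag, w in tag_weights.items()}
    return max(scores, key=scores.get)

u = [0.0, 0.0]   # u_0: empty history
dialogue = [([1.0, 0.0], [0.1, 0.0]),   # (word features, speaker vector) per utterance
            ([0.5, 0.2], [0.1, 0.0])]
for word_vec, speaker_vec in dialogue:
    u = update_sequence_vector(u, word_vec, speaker_vec)
tag = tag_utterance(u, {"question": [1.0, 0.0], "statement": [0.0, 1.0]})
```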
-
Publication number: 20220013136
Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from the word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts the speech unit combination vectors obtained for past speeches into a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score for the current speech from its word string, a speaker label, and the speech sequence embedding vector.
Type: Application
Filed: January 27, 2020
Publication date: January 13, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA
-
Publication number: 20210382467
Abstract: An inspection system includes machine learning circuitry configured to determine, from feature data, whether each object belongs to a predetermined attribute; feature data acquisition circuitry configured to acquire feature data of re-evaluated objects, i.e., objects that the machine learning circuitry determined not to belong to the attribute but that were later determined, without using the machine learning circuitry, to belong to it; and parameter update circuitry configured to update a learning parameter of the machine learning circuitry based on teaching data that includes the acquired feature data.
Type: Application
Filed: August 23, 2021
Publication date: December 9, 2021
Applicant: KABUSHIKI KAISHA YASKAWA DENKI
Inventors: Ryo MASUMURA, Masaru ADACHI
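The feedback loop described above, retraining on rejects that a human re-evaluation overturned, can be sketched with a toy threshold classifier. All names are hypothetical and the threshold update rule is an assumption; the patent covers the loop, not this particular learner.

```python
def split_by_model(objects, score, threshold):
    """Machine-learning stage: objects whose score clears the threshold
    are accepted; the rest are excluded."""
    accepted = [o for o in objects if score(o) >= threshold]
    excluded = [o for o in objects if score(o) < threshold]
    return accepted, excluded

def collect_reevaluated(excluded, human_accepts):
    """Feature-data acquisition: keep the excluded objects that a human
    re-evaluation (not the model) judged to belong to the attribute."""
    return [o for o in excluded if human_accepts(o)]

def update_threshold(threshold, reevaluated, score, rate=0.5):
    """Parameter update: move the threshold toward the scores of wrongly
    excluded objects so that similar objects pass next time."""
    if not reevaluated:
        return threshold
    avg = sum(score(o) for o in reevaluated) / len(reevaluated)
    return threshold + rate * (avg - threshold)

score = lambda o: o                       # feature value used directly as score
accepted, excluded = split_by_model([0.9, 0.45, 0.2], score, threshold=0.5)
reevaluated = collect_reevaluated(excluded, human_accepts=lambda o: o >= 0.4)
new_threshold = update_threshold(0.5, reevaluated, score)
```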
-
Publication number: 20210319783
Abstract: A voice recognition device 10 includes: a phonological awareness feature amount extraction unit 11 that transforms an acoustic feature amount sequence of input voice into a phonological awareness feature amount sequence for language 1 using a first model parameter group; a phonological awareness feature amount extraction unit 12 that transforms the same acoustic feature amount sequence into a phonological awareness feature amount sequence for language 2 using a second model parameter group; a phonological recognition unit 13 that generates a posterior probability sequence from the acoustic feature amount sequence and the two phonological awareness feature amount sequences using a third model parameter group; and a voice text transformation unit 14 that performs voice recognition based on the posterior probability sequence and outputs the text of the recognition result.
Type: Application
Filed: June 21, 2019
Publication date: October 14, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Tomohiro TANAKA
-
Patent number: 11081105
Abstract: A model learning device comprises: an initial value setting part that uses the parameters of a learned first model including a neural network to set the parameters of a second model whose neural network has the same structure; a first output probability distribution calculating part that calculates a first output probability distribution, the distribution of output probabilities of each unit on the output layer, using learning features and the first model; a second output probability distribution calculating part that calculates a second output probability distribution in the same way using the second model; and a modified model update part that obtains a weighted sum of a second loss function, calculated from correct information and the second output probability distribution, and the cross entropy between the first and second output probability distributions.
Type: Grant
Filed: September 5, 2017
Date of Patent: August 3, 2021
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hirokazu Masataki, Taichi Asami, Takashi Nakamura, Ryo Masumura
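The modified loss described above, a weighted sum of the ordinary loss on the correct label and the cross entropy between the two models' output distributions, is the familiar knowledge-distillation shape and can be sketched as follows. The function names and the example probabilities are illustrative assumptions, not values from the patent.

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) between two output probability distributions."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def modified_loss(correct_idx, student_probs, teacher_probs, weight):
    """Weighted sum of (a) the second model's loss against the correct
    label and (b) the cross entropy between the first model's (teacher's)
    and second model's (student's) output distributions."""
    hard = -math.log(student_probs[correct_idx])        # second loss function
    soft = cross_entropy(teacher_probs, student_probs)  # teacher-student term
    return (1 - weight) * hard + weight * soft

# Correct label is unit 0; the teacher is slightly more confident than the student.
loss = modified_loss(0, [0.7, 0.2, 0.1], [0.8, 0.15, 0.05], weight=0.5)
```

Setting `weight=0` recovers plain supervised training of the second model; `weight=1` trains it purely to match the first model's outputs.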
-
Publication number: 20210183368
Abstract: Learning data is generated automatically, without manually written rules. An acoustic model learning data generation device 20 includes a stochastic attribute label generation model 21 that generates attribute labels from a first model parameter group according to a first probability distribution; a stochastic phoneme sequence generation model 22 that generates a phoneme sequence from a second model parameter group and the attribute labels according to a second probability distribution; and a stochastic acoustic feature quantity sequence generation model 23 that generates an acoustic feature quantity sequence from a third model parameter group, the attribute labels, and the phoneme sequence according to a third probability distribution.
Type: Application
Filed: June 21, 2019
Publication date: June 17, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Tomohiro TANAKA
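The three-stage stochastic pipeline (attribute labels, then a phoneme sequence conditioned on them, then acoustic features conditioned on both) can be sketched as below. The distributions, labels, and numeric features here are toy assumptions purely to show the conditioning chain.

```python
import random

def sample(dist, rng):
    """Draw one value from a {value: probability} distribution."""
    r, acc = rng.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r < acc:
            return value
    return value  # guard against floating-point rounding at the tail

def generate_training_pair(rng):
    # 1) attribute label from the first probability distribution
    attr = sample({"male": 0.5, "female": 0.5}, rng)
    # 2) phoneme sequence from the second distribution, conditioned on attr
    #    (toy: same distribution regardless of attr, length fixed at 3)
    phonemes = [sample({"a": 0.6, "o": 0.4}, rng) for _ in range(3)]
    # 3) one acoustic feature per phoneme, conditioned on attr and phoneme
    base = {"male": 100.0, "female": 200.0}[attr]
    features = [base + {"a": 1.0, "o": 2.0}[p] for p in phonemes]
    return attr, phonemes, features

rng = random.Random(0)
attr, phonemes, features = generate_training_pair(rng)
```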
-
Publication number: 20210174788
Abstract: A language model score calculation apparatus calculates the prediction probability of a word w_i as the language model score of a language model based on a recurrent neural network. The apparatus includes a memory and a processor configured to execute: converting the word w_{i-1}, observed immediately before the word w_i, into a word vector; converting the speaker labels r_{i-1} and r_i, corresponding to the words w_{i-1} and w_i, into speaker vectors; calculating a word history vector s_i from the word vector, the speaker vector for r_{i-1}, and the word history vector s_{i-1} obtained when the prediction probability of w_{i-1} was calculated; and calculating the prediction probability of w_i from the word history vector s_{i-1} and the speaker vector for r_i.
Type: Application
Filed: June 21, 2019
Publication date: June 10, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Tomohiro TANAKA
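A speaker-aware recurrent history update plus a softmax prediction step, in the spirit of the abstract, can be sketched as follows. This toy recurrence (elementwise tanh of a sum) and dot-product softmax are stand-ins; the patent's actual network is not specified here and all names are assumptions.

```python
import math

def update_history(s_prev, word_vec, speaker_vec):
    """s_i = tanh(s_{i-1} + word vector + speaker vector), elementwise
    (a toy stand-in for the recurrent update)."""
    return [math.tanh(a + b + c) for a, b, c in zip(s_prev, word_vec, speaker_vec)]

def predict_probs(s, speaker_vec, vocab_vecs):
    """Score each vocabulary word against the history vector combined
    with the current speaker vector, then softmax into probabilities."""
    ctx = [a + b for a, b in zip(s, speaker_vec)]
    scores = [sum(c * v for c, v in zip(ctx, vec)) for vec in vocab_vecs]
    m = max(scores)                       # stabilize the exponentials
    exps = [math.exp(x - m) for x in scores]
    z = sum(exps)
    return [e / z for e in exps]

s = update_history([0.0, 0.0], [0.5, -0.2], [0.1, 0.1])
probs = predict_probs(s, [0.1, 0.1], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```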
-
Publication number: 20210090552
Abstract: A learning apparatus comprises a learning part that learns an error correction model from sets of speech recognition result candidates and the correct text of speech recognition for given audio data, wherein the candidates include candidates that differ from the correct text, and the error correction model receives the word sequence of a candidate as input and outputs an error correction score indicating the likelihood of that word sequence in consideration of speech recognition errors.
Type: Application
Filed: February 18, 2019
Publication date: March 25, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Tomohiro TANAKA, Ryo MASUMURA
-
Publication number: 20210082415
Abstract: A document identification device that improves the class identification precision of multi-stream documents is provided. The device includes: a primary stream expression generation unit that generates, for each speaker, a primary stream expression, a fixed-length vector of the word sequence of that speaker's speech recorded in a setting with a plurality of speakers; a primary multi-stream expression generation unit that integrates the primary stream expressions into a primary multi-stream expression; a secondary stream expression generation unit that generates, for each speaker, a secondary stream expression, a fixed-length vector derived from the speaker's word sequence and the primary multi-stream expression; and a secondary multi-stream expression generation unit that integrates the secondary stream expressions into a secondary multi-stream expression.
Type: Application
Filed: May 10, 2018
Publication date: March 18, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Hirokazu MASATAKI
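The two-stage flow, per-speaker vectors, an integrated vector, then per-speaker vectors refined with that integrated context, can be sketched as follows. The bag-of-words features, mean integration, and concatenation refinement are toy assumptions chosen only to make the data flow concrete.

```python
def stream_expression(words):
    """Primary stream expression: a fixed-length vector for one speaker's
    word sequence (toy: word count and total character count)."""
    return [float(len(words)), float(sum(len(w) for w in words))]

def integrate(expressions):
    """Multi-stream expression: elementwise mean over the speakers."""
    n = len(expressions)
    return [sum(e[i] for e in expressions) / n
            for i in range(len(expressions[0]))]

def secondary_expression(words, multi):
    """Secondary stream expression: the speaker's own features refined
    with the integrated primary multi-stream expression (toy: concatenation)."""
    return stream_expression(words) + multi

streams = {"agent":  ["hello", "how", "can", "i", "help"],
           "caller": ["my", "phone", "is", "broken"]}
primary = {s: stream_expression(w) for s, w in streams.items()}
multi1 = integrate(list(primary.values()))                     # primary multi-stream
secondary = {s: secondary_expression(w, multi1) for s, w in streams.items()}
multi2 = integrate(list(secondary.values()))                   # secondary multi-stream
```

The secondary multi-stream expression `multi2` is what a downstream classifier would consume for class identification.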
-
Publication number: 20210012158
Abstract: Using training data containing tuples of texts for M types of tasks in N types of languages, together with the correct labels of the texts, an optimized parameter group is obtained that defines N inter-task shared transformation functions, one per language n, and M inter-language shared transformation functions, one per task m. At least one of N and M is an integer greater than or equal to 2. Each language-side function outputs a latent vector that corresponds to the content of an input text in a certain language n but does not depend on the language, and each task-side function takes as input the latent vector output by any one of the language-side functions and outputs the label corresponding to that latent vector for a certain task m.
Type: Application
Filed: February 14, 2019
Publication date: January 14, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Tomohiro TANAKA
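The composition structure, N language-side encoders feeding M task-side decoders through a shared latent space, so N + M functions cover all N x M (language, task) pairs, can be sketched as below. The toy length-based "encoders" and threshold "decoders" are assumptions; only the composition pattern reflects the abstract.

```python
# Language-side shared functions: text -> latent vector (toy features).
encoders = {"en": lambda t: [len(t.split()), len(t)],
            "ja": lambda t: [len(t), len(t)]}

# Task-side shared functions: latent vector -> output label (toy rules).
decoders = {"topic":     lambda z: "long" if z[1] > 20 else "short",
            "wordiness": lambda z: z[0]}

def predict(language, task, text):
    """Compose the language-side encoder with the task-side decoder:
    2 encoders + 2 decoders serve all 4 (language, task) combinations."""
    return decoders[task](encoders[language](text))

label = predict("en", "topic", "a short text")
```

Because the task-side functions see only the latent vector, a decoder trained mostly on one language can, in principle, serve texts from another.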
-
Publication number: 20200218975
Abstract: There is provided a technique for transforming a confusion network into a representation usable as input for machine learning. It includes a confusion network distributed representation sequence generating part that generates a confusion network distributed representation sequence, a vector sequence, from the arc word set sequence and the arc weight set sequence constituting the confusion network.
Type: Application
Filed: August 21, 2018
Publication date: July 9, 2020
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Hirokazu MASATAKI
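One natural way to turn an arc word set sequence plus arc weight set sequence into a vector sequence is a per-slot weighted sum of arc word embeddings. The sketch below assumes that construction and a tiny hand-made embedding table; the patent's actual transformation may differ.

```python
def slot_vector(arc_words, arc_weights, embed):
    """One confusion-network slot -> the weighted sum of the embeddings
    of its competing arc words."""
    dim = len(embed(arc_words[0]))
    vec = [0.0] * dim
    for word, weight in zip(arc_words, arc_weights):
        e = embed(word)
        vec = [v + weight * x for v, x in zip(vec, e)]
    return vec

def confusion_network_to_sequence(word_sets, weight_sets, embed):
    """Arc word set sequence + arc weight set sequence -> vector sequence
    (one distributed representation per slot)."""
    return [slot_vector(ws, ps, embed)
            for ws, ps in zip(word_sets, weight_sets)]

# Toy embedding table; real systems would use learned embeddings.
embed = {"the": [1.0, 1.0], "cat": [1.0, 0.0], "hat": [0.0, 1.0]}.get
seq = confusion_network_to_sequence([["the"], ["cat", "hat"]],
                                    [[1.0], [0.7, 0.3]], embed)
```

The resulting fixed-dimension vector per slot is what makes the confusion network consumable by a standard sequence model.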
-
Publication number: 20200219413
Abstract: The present invention provides a pronunciation error detection apparatus capable of following a text, without the need for a correct sentence, even when erroneous recognition such as a reading error occurs.
Type: Application
Filed: September 13, 2018
Publication date: July 9, 2020
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Satoshi KOBASHIKAWA, Ryo MASUMURA, Hosana KAMIYAMA, Yusuke IJIMA, Yushi AONO
-
Publication number: 20190244604
Abstract: A model learning device comprises: an initial value setting part that uses the parameters of a learned first model including a neural network to set the parameters of a second model whose neural network has the same structure; a first output probability distribution calculating part that calculates a first output probability distribution, the distribution of output probabilities of each unit on the output layer, using learning features and the first model; a second output probability distribution calculating part that calculates a second output probability distribution in the same way using the second model; and a modified model update part that obtains a weighted sum of a second loss function, calculated from correct information and the second output probability distribution, and the cross entropy between the first and second output probability distributions.
Type: Application
Filed: September 5, 2017
Publication date: August 8, 2019
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hirokazu MASATAKI, Taichi ASAMI, Takashi NAKAMURA, Ryo MASUMURA