Patents by Inventor Hosana KAMIYAMA
Hosana KAMIYAMA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240078999
Abstract: A learning method includes the following processes. A shuffling process acquires learning data arranged in a time series and rearranges the learning data in an order different from the order of the time series. A learning process trains an acoustic model using the learning data rearranged through the shuffling process.
Type: Application
Filed: January 15, 2021
Publication date: March 7, 2024
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana KAMIYAMA, Yoshikazu YAMAGUCHI
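The shuffling process this abstract describes can be sketched in a few lines of Python. The representation of the learning data as (feature, label) pairs and the fixed seed are illustrative assumptions, not part of the publication.

```python
import random

def shuffle_time_series(learning_data, seed=0):
    """Rearrange time-series learning data into an order different
    from the original time series (the 'shuffling process')."""
    shuffled = list(learning_data)
    random.Random(seed).shuffle(shuffled)
    return shuffled

# hypothetical learning data: (feature, label) pairs in time order
data = [([0.1], "a"), ([0.2], "b"), ([0.3], "c"), ([0.4], "d")]
shuffled = shuffle_time_series(data)
```

The learning process would then iterate over `shuffled` rather than `data` when training the acoustic model.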
-
Publication number: 20230410834
Abstract: A pre-adaptation model storage unit (14) stores a satisfaction estimation model obtained by connecting a speech satisfaction estimation model part that estimates a speech satisfaction for each speech using a feature amount of each speech as an input and a conversation satisfaction estimation model part that estimates a conversation satisfaction using at least the speech satisfaction for each speech as an input. An adaptation data storage unit (15) stores adaptation data including a conversation voice in which a conversation including a plurality of speeches is recorded and a correct value of a conversation satisfaction for the conversation. A model adaptation unit (18) fixes, by using a feature amount of each speech extracted from the conversation voice and a correct value of the conversation satisfaction, a parameter of the conversation satisfaction estimation model part to update a parameter of the speech satisfaction estimation model part.
Type: Application
Filed: November 4, 2020
Publication date: December 21, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi ANDO, Hosana KAMIYAMA, Takeshi MORI, Satoshi KOBASHIKAWA
-
Patent number: 11798578
Abstract: To increase the accuracy of paralinguistic information estimation. A paralinguistic information estimation model storage unit 20 stores a paralinguistic information estimation model outputting, with a plurality of independent features as inputs, paralinguistic information estimation results. A feature extraction unit 11 extracts the features from an input utterance. A paralinguistic information estimation unit 20 estimates paralinguistic information of the input utterance from the features extracted from the input utterance, by using the paralinguistic information estimation model.
Type: Grant
Filed: October 8, 2019
Date of Patent: October 24, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi Ando, Hosana Kamiyama, Satoshi Kobashikawa
-
Patent number: 11756554
Abstract: An attribute identification technology that can reject an attribute identification result if the reliability thereof is low is provided. An attribute identification device includes: a posteriori probability calculation unit 110 that calculates, from input speech, a posteriori probability sequence {q(c, i)} which is a sequence of the posteriori probabilities q(c, i) that a frame i of the input speech is a class c; a reliability calculation unit 120 that calculates, from the posteriori probability sequence {q(c, i)}, reliability r(c) indicating the extent to which the class c is a correct attribute identification result; and an attribute identification result generating unit 130 that generates an attribute identification result L of the input speech from the posteriori probability sequence {q(c, i)} and the reliability r(c).
Type: Grant
Filed: August 23, 2021
Date of Patent: September 12, 2023
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Hosana Kamiyama, Satoshi Kobashikawa, Atsushi Ando
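The reject-if-unreliable pipeline in this abstract can be sketched as follows. Taking the reliability r(c) to be the mean of the frame posteriors q(c, i) is one plausible choice made for illustration; the patent leaves the exact definition to the claims, and the threshold value here is likewise an assumption.

```python
def identify_attribute(posteriors, threshold=0.6):
    """posteriors[c][i] holds q(c, i), the posterior probability that
    frame i of the input speech belongs to class c.  Reliability r(c)
    is computed here as the mean frame posterior for class c.
    Returns the winning class index, or None (reject) when that
    class's reliability falls below the threshold."""
    r = [sum(frames) / len(frames) for frames in posteriors]
    best = max(range(len(r)), key=r.__getitem__)
    return best if r[best] >= threshold else None

# confident input: class 0 dominates every frame -> accepted
q_high = [[0.8, 0.7, 0.9],
          [0.2, 0.3, 0.1]]
# ambiguous input: no class is reliable -> rejected (None)
q_low = [[0.5, 0.5],
         [0.5, 0.5]]
```

Returning `None` instead of a low-confidence class is what lets a downstream system fall back to, say, a human operator rather than act on a doubtful attribute.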
-
Publication number: 20230206118
Abstract: Provided is a model learning technology to learn a model in consideration of a difference in label assignment accuracy between experts and non-experts.
Type: Application
Filed: March 19, 2020
Publication date: June 29, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana KAMIYAMA, Yuki KITAGISHI, Atsushi ANDO, Ryo MASUMURA, Takeshi MORI, Satoshi KOBASHIKAWA
-
Publication number: 20230095088
Abstract: The present invention provides emotion recognition technology that achieves high emotion recognition accuracy for all speakers.
Type: Application
Filed: February 28, 2020
Publication date: March 30, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi ANDO, Yuki KITAGISHI, Hosana KAMIYAMA, Takeshi MORI
-
Publication number: 20230069908
Abstract: A recognition apparatus includes a classification unit that estimates the non-linguistic and para-linguistic information label that would be imparted by an n-th listener from an acoustic feature amount of speech data to be recognized, using an n-th classification model, and an integration unit that integrates the estimation results of the non-linguistic and para-linguistic information labels for N listeners and obtains a non-linguistic and para-linguistic information estimation result for the speech data to be recognized. The n-th classification model is a classification model trained using training speech data and the non-linguistic and para-linguistic information label imparted to the training speech data by the n-th listener as training data.
Type: Application
Filed: February 21, 2020
Publication date: March 9, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi ANDO, Yuki KITAGISHI, Hosana KAMIYAMA, Takeshi MORI
-
Patent number: 11568761
Abstract: The present invention provides a pronunciation error detection apparatus capable of following a text without the need for a correct sentence even when erroneous recognition such as a reading error occurs.
Type: Grant
Filed: September 13, 2018
Date of Patent: January 31, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Satoshi Kobashikawa, Ryo Masumura, Hosana Kamiyama, Yusuke Ijima, Yushi Aono
-
Publication number: 20230013385
Abstract: A learning apparatus includes: a speaker vector learning unit configured to learn a speaker vector extraction parameter ? based on one or more items of learning speech voice data in a speaker vector voice database; a non-speaker-individuality sound model learning unit configured to create a probability distribution model using a frequency component of one or more items of non-speaker-individuality sound data in a non-speaker-individuality sound database and calculate an internal parameter of the probability distribution model; and an age level estimation model learning unit configured to extract a speaker vector from voice data in an age level estimation model-learning voice database using the speaker vector extraction parameter ?, calculate a non-speaker-individuality sound likelihood vector from voice data in the age level estimation model-learning voice database using the internal parameters ? and ?, and learn, with input of the speaker vector and the non-speaker-individuality sound likelihood vector, a pa
Type: Application
Filed: December 9, 2019
Publication date: January 19, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Yuki KITAGISHI, Takeshi MORI, Hosana KAMIYAMA, Atsushi ANDO, Satoshi KOBASHIKAWA
-
Patent number: 11557311
Abstract: Estimation accuracies of a conversation satisfaction and a speech satisfaction are improved. A learning data storage unit (10) stores learning data including a conversation voice containing a conversation including a plurality of speeches, a correct answer value of a conversation satisfaction for the conversation, and a correct answer value of a speech satisfaction for each speech included in the conversation.
Type: Grant
Filed: July 20, 2018
Date of Patent: January 17, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi Ando, Hosana Kamiyama, Satoshi Kobashikawa
-
Patent number: 11551708
Abstract: With correct emotion classes selected as correct values of an emotion of an utterer of a first utterance from among a plurality of emotion classes C1, …, CK by listeners who have listened to the first utterance, as an input, the numbers of times ni that emotion classes Ci have been selected as the correct emotion classes are obtained, and rates of the numbers of times nk to a sum total of the numbers of times n1, …, nK, or smoothed values of the rates, are obtained as correct emotion soft labels tk(s) corresponding to the first utterance.
Type: Grant
Filed: November 12, 2018
Date of Patent: January 10, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi Ando, Hosana Kamiyama, Satoshi Kobashikawa
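The soft-label computation in this abstract is a simple vote-counting step, sketched below. The listener votes, class names, and additive smoothing constant are illustrative; the abstract only specifies rates nk / (n1 + … + nK) or smoothed values thereof.

```python
from collections import Counter

def emotion_soft_labels(selected, classes, smoothing=0.0):
    """Given the emotion classes selected by listeners for one
    utterance, return the soft labels tk: the (optionally smoothed)
    rate at which each class in `classes` was chosen."""
    counts = Counter(selected)
    n = [counts.get(c, 0) + smoothing for c in classes]
    total = sum(n)
    return [x / total for x in n]

# hypothetical example: 5 listeners label one utterance
votes = ["happy", "happy", "neutral", "happy", "sad"]
labels = emotion_soft_labels(votes, ["happy", "neutral", "sad"])
# -> [0.6, 0.2, 0.2]
```

With `smoothing=1.0` the same votes yield [0.5, 0.25, 0.25], pulling the distribution toward uniform so that no class gets a hard zero from a small listener pool.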
-
Patent number: 11521641
Abstract: State-of-satisfaction change pattern models, each including a set of transition weights in state sequences of the states of satisfaction, are obtained for predetermined change patterns of the states of satisfaction, and a state-of-satisfaction estimation model for obtaining the posteriori probability of the utterance feature amount given the state of satisfaction of an utterer is obtained by using the utterance-for-learning feature amount and a correct value of the state of satisfaction of an utterer who gave an utterance for learning corresponding to the utterance-for-learning feature amount. By using the input utterance feature amount, the state-of-satisfaction change pattern models, and the state-of-satisfaction estimation model, an estimated value of the state of satisfaction of an utterer who gave an utterance corresponding to the input utterance feature amount is obtained.
Type: Grant
Filed: February 2, 2018
Date of Patent: December 6, 2022
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi Ando, Hosana Kamiyama, Satoshi Kobashikawa
-
Patent number: 11495245
Abstract: An urgency level estimation technique of estimating an urgency level of a speaker for free uttered speech, which does not require a specific word, is provided. An urgency level estimation apparatus includes a feature amount extracting part configured to extract a feature amount of an utterance from uttered speech, and an urgency level estimating part configured to estimate an urgency level of a speaker of the uttered speech from the feature amount based on a relationship, determined in advance, between a feature amount extracted from uttered speech and an urgency level of a speaker of the uttered speech. The feature amount includes at least one of a feature indicating speaking speed of the uttered speech, a feature indicating voice pitch of the uttered speech, and a feature indicating a power level of the uttered speech.
Type: Grant
Filed: November 15, 2018
Date of Patent: November 8, 2022
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana Kamiyama, Satoshi Kobashikawa, Atsushi Ando
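A minimal sketch of the estimating part, assuming the "relationship determined in advance" is a linear combination of the three feature types named in the abstract. The weights, and the use of a weighted sum at all, are assumptions for illustration; the patent covers any predetermined relationship.

```python
def urgency_level(speaking_speed, pitch, power,
                  weights=(0.5, 0.3, 0.2)):
    """Combine normalized speaking-speed, pitch, and power features
    into an urgency score.  The linear form and weights are
    hypothetical stand-ins for the predetermined relationship."""
    w_speed, w_pitch, w_power = weights
    return w_speed * speaking_speed + w_pitch * pitch + w_power * power
```

Under this sketch, faster speech, higher pitch, or greater power each push the estimated urgency upward, matching the intuition behind the claimed features.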
-
Publication number: 20220335928
Abstract: An estimation apparatus clusters a group of voice signals including a voice signal having a speaker attribute to be estimated into a plurality of clusters. Subsequently, the estimation apparatus identifies, from the plurality of clusters, a cluster to which the voice signal to be estimated belongs. Next, the estimation apparatus uses a speaker attribute estimation model to estimate speaker attributes of respective voice signals in the identified cluster. After that, the estimation apparatus estimates an attribute of the entire cluster, by using an estimation result of the speaker attributes of the voice signals in the identified cluster, and outputs an estimation result of the speaker attribute of the entire cluster, as an estimation result of the speaker attribute of the voice signal to be estimated.
Type: Application
Filed: August 19, 2019
Publication date: October 20, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Naohiro TAWARA, Hosana KAMIYAMA, Satoshi KOBASHIKAWA, Atsunori OGAWA
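The final step of this pipeline, estimating one attribute for the whole cluster from the per-signal estimates, could be realized by a simple majority vote, as sketched below. Majority voting is one plausible aggregation, not necessarily the one the publication claims.

```python
from collections import Counter

def cluster_attribute(cluster_estimates):
    """Estimate the attribute of an entire cluster as the most
    frequent per-signal attribute estimate within it."""
    return Counter(cluster_estimates).most_common(1)[0][0]

# hypothetical per-signal estimates for one identified cluster
estimates = ["60s", "60s", "40s"]
attribute = cluster_attribute(estimates)  # -> "60s"
```

The aggregated result is then returned for every signal in the cluster, including the one whose attribute was originally queried.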
-
Publication number: 20220277761
Abstract: An impression estimation technique without the need of voice recognition is provided. An impression estimation device includes an estimation unit configured to estimate an impression of a voice signal s by defining p1<p2 and using a first feature amount obtained based on a first analysis time length p1 for the voice signal s and a second feature amount obtained based on a second analysis time length p2 for the voice signal s. A learning device includes a learning unit configured to learn an estimation model which estimates the impression of the voice signal by defining p1<p2 and using a first feature amount for learning obtained based on the first analysis time length p1 for a voice signal for learning sL, a second feature amount for learning obtained based on the second analysis time length p2 for the voice signal for learning sL, and an impression label imparted to the voice signal for learning sL.
Type: Application
Filed: July 29, 2019
Publication date: September 1, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana KAMIYAMA, Atsushi ANDO, Satoshi KOBASHIKAWA
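The core idea of pairing a short analysis length p1 with a long one p2 can be sketched as below. Mean absolute amplitude per window is an illustrative stand-in for whatever feature amounts the claims actually cover.

```python
def windowed_means(signal, window):
    """Mean absolute amplitude per non-overlapping analysis window."""
    return [sum(abs(x) for x in signal[i:i + window]) / window
            for i in range(0, len(signal) - window + 1, window)]

def impression_features(signal, p1, p2):
    """First feature amount from the short analysis length p1,
    second from the long analysis length p2, with p1 < p2 as the
    abstract requires.  The concrete feature is hypothetical."""
    assert p1 < p2
    return windowed_means(signal, p1), windowed_means(signal, p2)
```

The short windows capture fast local variation while the long windows capture slower trends; an estimation model would consume both feature sequences together.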
-
Publication number: 20220180188
Abstract: A model is learned that is capable of accurate label estimation even if learning data is used for which the number of evaluators per piece of data is small.
Type: Application
Filed: February 25, 2020
Publication date: June 9, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana KAMIYAMA, Atsushi ANDO, Satoshi KOBASHIKAWA
-
Publication number: 20220122584
Abstract: Paralinguistic information is estimated with high accuracy even when an utterance for which it is difficult to identify paralinguistic information is used for model learning. An acoustic feature extraction unit 11 extracts an acoustic feature from an utterance. An anti-teacher decision unit 12 decides, based on a paralinguistic information label indicating a determination result of paralinguistic information given by a plurality of listeners for each utterance, an anti-teacher label indicating an anti-teacher serving as incorrect paralinguistic information for the utterance. An anti-teacher estimation model learning unit 13 learns, based on an acoustic feature extracted from the utterance and the anti-teacher label, an anti-teacher estimation model for outputting a posterior probability of anti-teacher for an input acoustic feature.
Type: Application
Filed: January 27, 2020
Publication date: April 21, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi ANDO, Hosana KAMIYAMA, Satoshi KOBASHIKAWA
-
Publication number: 20220108217
Abstract: A model capable of estimating a label with high accuracy is learned even when training data involving a small number of raters per data item is used. Learning processing is performed in which a plurality of data items and label expectation values that are indicators representing degrees of correctness of individual labels on the data items are used in pairs as training data, and a model that estimates a label on an input data item is obtained.
Type: Application
Filed: January 29, 2020
Publication date: April 7, 2022
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana KAMIYAMA, Satoshi KOBASHIKAWA, Atsushi ANDO, Ryo MASUMURA
-
Publication number: 20210398552
Abstract: To increase the accuracy of paralinguistic information estimation. A paralinguistic information estimation model storage unit 20 stores a paralinguistic information estimation model outputting, with a plurality of independent features as inputs, paralinguistic information estimation results. A feature extraction unit 11 extracts the features from an input utterance. A paralinguistic information estimation unit 20 estimates paralinguistic information of the input utterance from the features extracted from the input utterance, by using the paralinguistic information estimation model.
Type: Application
Filed: October 8, 2019
Publication date: December 23, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsushi ANDO, Hosana KAMIYAMA, Satoshi KOBASHIKAWA
-
Publication number: 20210383812
Abstract: An attribute identification technology that can reject an attribute identification result if the reliability thereof is low is provided. An attribute identification device includes: a posteriori probability calculation unit 110 that calculates, from input speech, a posteriori probability sequence {q(c, i)} which is a sequence of the posteriori probabilities q(c, i) that a frame i of the input speech is a class c; a reliability calculation unit 120 that calculates, from the posteriori probability sequence {q(c, i)}, reliability r(c) indicating the extent to which the class c is a correct attribute identification result; and an attribute identification result generating unit 130 that generates an attribute identification result L of the input speech from the posteriori probability sequence {q(c, i)} and the reliability r(c).
Type: Application
Filed: August 23, 2021
Publication date: December 9, 2021
Applicant: Nippon Telegraph and Telephone Corporation
Inventors: Hosana KAMIYAMA, Satoshi Kobashikawa, Atsushi Ando