Patents by Inventor Yusuke IJIMA

Yusuke IJIMA has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240347039
    Abstract: A speech synthesis apparatus according to the present disclosure includes a memory and a processor coupled to the memory. The processor is configured to: obtain utterance information on subjects to be uttered, wherein the subjects to be uttered are texts contained in data on a book; obtain image information on images that are contained in the data on the book; obtain speech data corresponding to the subjects to be uttered; and generate, based on the obtained utterance information, the obtained image information, and the obtained speech data, a speech synthesis model for reading out a text associated with an image.
    Type: Application
    Filed: August 18, 2022
    Publication date: October 17, 2024
    Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, The University of Tokyo
    Inventors: Yusuke IJIMA, Tomoki KORIYAMA, Shinnosuke TAKAMICHI
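As a rough illustration of the architecture sketched in the abstract above, the snippet below shows one way a speech synthesis (acoustic) model could be conditioned on both a text encoding and an image embedding. The module, tensor shapes, and fusion-by-addition scheme are assumptions for illustration, not details taken from the application.

```python
# Minimal sketch of a text+image conditioned acoustic model, assuming the
# claimed "speech synthesis model" can be approximated by a network that maps
# a text encoding plus an image embedding to per-frame acoustic features.
import torch
import torch.nn as nn

class TextImageAcousticModel(nn.Module):
    def __init__(self, text_dim=256, image_dim=512, hidden_dim=256, n_mels=80):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)    # utterance information
        self.image_proj = nn.Linear(image_dim, hidden_dim)  # image information
        self.decoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_mels)            # per-frame mel spectrogram

    def forward(self, text_enc, image_emb):
        # text_enc: (batch, frames, text_dim), image_emb: (batch, image_dim)
        cond = self.text_proj(text_enc) + self.image_proj(image_emb).unsqueeze(1)
        h, _ = self.decoder(cond)
        return self.out(h)

model = TextImageAcousticModel()
mel = model(torch.randn(2, 100, 256), torch.randn(2, 512))  # -> (2, 100, 80)
```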
  • Publication number: 20240312465
    Abstract: A speaker embedding apparatus includes processing circuitry configured to accept input of voice data, generate utterance unit segmentation information indicating a duration length for each utterance of a speaker in the input voice data, and use a duration length for each utterance indicated in the generated utterance unit segmentation information as training data and train a speaker identification model for outputting an identification result of a speaker when a duration length for each utterance of the speaker is input.
    Type: Application
    Filed: February 2, 2021
    Publication date: September 19, 2024
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yusuke IJIMA, Kenichi FUJITA, Atsushi ANDO
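The abstract above trains a speaker identification model on per-utterance duration lengths. The toy sketch below assumes the segmentation output can be summarized as a fixed-length vector of durations per training sample; the classifier size and training loop are illustrative only.

```python
# Minimal sketch, assuming "a duration length for each utterance" can be
# packed into a fixed-length vector and the speaker identification model is
# a small classifier over it. All dimensions are placeholders.
import torch
import torch.nn as nn

N_UTTERANCES, N_SPEAKERS = 10, 4
model = nn.Sequential(nn.Linear(N_UTTERANCES, 64), nn.ReLU(), nn.Linear(64, N_SPEAKERS))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

durations = torch.rand(32, N_UTTERANCES)           # stand-in for segmentation output
speaker_ids = torch.randint(0, N_SPEAKERS, (32,))  # speaker labels for training

for _ in range(5):                                 # toy training loop
    optimizer.zero_grad()
    loss = loss_fn(model(durations), speaker_ids)
    loss.backward()
    optimizer.step()
```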
  • Patent number: 11915688
    Abstract: An estimation device (100), which is an estimation device that estimates a duration of a speech section, includes: a representation conversion unit (11) that performs representation conversion of a plurality of words included in learning utterance information to a plurality of pieces of numeric representation data; an estimation data generation unit (12) that generates estimation data by using a plurality of pieces of the learning utterance information and the plurality of pieces of numeric representation data; an estimation model learning unit (13) that learns an estimation model by using the estimation data and the durations of the plurality of words; and an estimation unit (20) that estimates the duration of a predetermined speech section based on utterance information of a user by using the estimation model.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: February 27, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventor: Yusuke Ijima
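A minimal sketch of the estimation idea described above, assuming the "representation conversion" can be approximated by a learned word embedding and the estimation model by a small regressor over it; the vocabulary size, dimensions, and data below are hypothetical.

```python
# Sketch: words -> numeric representations (embedding) -> per-word duration
# regression, then a speech-section duration as the sum of word durations.
import torch
import torch.nn as nn

VOCAB = 1000
embed = nn.Embedding(VOCAB, 64)                 # "representation conversion"
regressor = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
params = list(embed.parameters()) + list(regressor.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

word_ids = torch.randint(0, VOCAB, (256,))      # words from learning utterances
durations = torch.rand(256, 1)                  # observed word durations (seconds)

for _ in range(5):                              # "estimation model learning"
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(regressor(embed(word_ids)), durations)
    loss.backward()
    optimizer.step()

# Estimation: sum predicted word durations over a speech section.
section_duration = regressor(embed(word_ids[:10])).sum()
```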
  • Publication number: 20240013239
    Abstract: An acquisition unit acquires a voice feature quantity vector representing a feature of input voice data, an emotion expression vector representing a customer's emotion corresponding to the voice data, and a purchase intention vector representing a purchase intention of the customer corresponding to the voice data. A learning unit generates, by learning, a purchase intention estimation model for estimating a purchase intention of a customer corresponding to the voice data by using the voice feature quantity vector, the emotion expression vector, and the purchase intention vector.
    Type: Application
    Filed: November 26, 2020
    Publication date: January 11, 2024
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Mizuki NAGANO, Yusuke IJIMA
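One plausible reading of the learning step above is a regressor from the concatenated voice feature quantity and emotion expression vectors to the purchase intention vector. The sketch below follows that reading; all dimensions and the network shape are assumptions.

```python
# Minimal sketch of a purchase intention estimation model trained on
# (voice feature, emotion expression) inputs and purchase intention targets.
import torch
import torch.nn as nn

VOICE_DIM, EMOTION_DIM, INTENT_DIM = 128, 8, 3
model = nn.Sequential(
    nn.Linear(VOICE_DIM + EMOTION_DIM, 64), nn.ReLU(), nn.Linear(64, INTENT_DIM)
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

voice = torch.randn(16, VOICE_DIM)        # voice feature quantity vectors
emotion = torch.randn(16, EMOTION_DIM)    # emotion expression vectors
intent = torch.randn(16, INTENT_DIM)      # purchase intention vectors (targets)

for _ in range(5):
    optimizer.zero_grad()
    pred = model(torch.cat([voice, emotion], dim=1))
    loss = nn.functional.mse_loss(pred, intent)
    loss.backward()
    optimizer.step()
```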
  • Publication number: 20230252983
    Abstract: An input unit receives a morpheme array and parts-of-speech of morphemes of the morpheme array. An ambiguous word candidate acquisition unit (26) acquires, for each morpheme of the morpheme array, based on a notation and a part-of-speech of the morpheme, reading candidates of the morpheme from reading candidates of the morpheme defined in advance for each combination of a notation and a part-of-speech of the morpheme. A disambiguation unit (30) determines a reading of the morpheme from the acquired reading candidates of the morpheme by using a disambiguation rule in which a reading of the morpheme is defined in advance correspondingly to appearance positions of other morphemes and notations, parts-of-speech, or character types of the other morphemes.
    Type: Application
    Filed: May 8, 2019
    Publication date: August 10, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Nozomi KOBAYASHI, Yusuke IJIMA, Junji TOMITA
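The dictionary-plus-rule flow described above can be illustrated with plain data structures. In the sketch below, the candidate table and the single context rule are invented examples; the patent's actual candidates and disambiguation rules are defined per (notation, part-of-speech) combination and over the surrounding morphemes.

```python
# Minimal sketch: look up reading candidates by (notation, POS), then apply a
# context rule over neighboring morphemes to pick one reading.
READING_CANDIDATES = {("方", "noun"): ["ホウ", "カタ"]}   # (notation, POS) -> readings

def rule_based_reading(candidates, index, morphemes):
    # Example rule: if the previous morpheme is a verb, prefer "カタ".
    if index > 0 and morphemes[index - 1][1] == "verb" and "カタ" in candidates:
        return "カタ"
    return candidates[0]                                   # fall back to first candidate

def disambiguate(morphemes):
    readings = []
    for i, (notation, pos) in enumerate(morphemes):
        candidates = READING_CANDIDATES.get((notation, pos), [notation])
        readings.append(rule_based_reading(candidates, i, morphemes))
    return readings

print(disambiguate([("読む", "verb"), ("方", "noun")]))     # ['読む', 'カタ']
```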
  • Patent number: 11568761
    Abstract: The present invention provides a pronunciation error detection apparatus capable of following a text without the need for a correct sentence even when erroneous recognition such as a reading error occurs.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: January 31, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Satoshi Kobashikawa, Ryo Masumura, Hosana Kamiyama, Yusuke Ijima, Yushi Aono
  • Publication number: 20230005468
    Abstract: A pause estimation model learning apparatus includes: a morphological analysis unit configured to perform morphological analysis on training text data to provide M types of information, M being an integer that is equal to or larger than 2; a feature selection unit configured to combine N pieces of information, among the M pieces of information, to be an input feature when a predetermined certain condition is satisfied, and select predetermined one of the N pieces of information to be the input feature when the certain condition is not satisfied, N being an integer that is equal to or larger than 2 and equal to or smaller than M; and a learning unit configured to learn a pause estimation model by using the input feature selected by the feature selection unit and a pause correct label.
    Type: Application
    Filed: November 26, 2019
    Publication date: January 5, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Mizuki NAGANO, Yusuke IJIMA, Nozomi KOBAYASHI
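The feature selection step above (combine N of the M morphological features when a condition holds, otherwise use one predetermined feature) might look like the following. The condition flag, feature names, and the logistic-regression pause model are placeholders, not the claimed implementation.

```python
# Minimal sketch of conditional feature selection followed by training a
# pause estimation model on the selected input features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_input_feature(features, combine, keys=("pos", "accent")):
    if combine:                                   # "certain condition is satisfied"
        return np.concatenate([features[k] for k in keys])
    return features["pos"]                        # predetermined single feature

X = np.stack([
    select_input_feature({"pos": np.random.rand(8), "accent": np.random.rand(4)}, True)
    for _ in range(100)
])
y = np.arange(100) % 2                            # pause correct labels (pause / no pause)

pause_model = LogisticRegression().fit(X, y)      # "pause estimation model"
```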
  • Patent number: 11545135
    Abstract: An acoustic model learning device is provided for obtaining an acoustic model used to synthesize voice signals with intonation. The device includes a first learning unit that learns the acoustic model to estimate synthetic acoustic feature values using voice and speaker determination models based on acoustic feature values of speakers, language feature values corresponding to the acoustic feature values and speaker data items, a second learning unit that learns the voice determination model to determine whether the synthetic acoustic feature value is a predetermined acoustic feature value or not based on the acoustic feature values and the synthetic acoustic feature values, and a third learning unit that learns the speaker determination model to determine whether the speaker of the synthetic acoustic feature value is a predetermined speaker or not based on the acoustic feature values and the synthetic acoustic feature values.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: January 3, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hiroki Kanagawa, Yusuke Ijima
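The three learning units above read like an adversarial training setup: an acoustic model trained against a voice (natural vs. synthetic) discriminator and a speaker discriminator. The compact sketch below assumes that reading; network sizes, the losses, and their weighting are illustrative.

```python
# Minimal adversarial-training sketch: one acoustic model (generator) and two
# determination models (discriminators), one per-sample real/synthetic, one per speaker.
import torch
import torch.nn as nn

LING_DIM, SPK_DIM, AC_DIM, N_SPK = 64, 8, 80, 4
acoustic_model = nn.Linear(LING_DIM + SPK_DIM, AC_DIM)  # first learning unit's target
voice_disc = nn.Sequential(nn.Linear(AC_DIM, 32), nn.ReLU(), nn.Linear(32, 1))
speaker_disc = nn.Sequential(nn.Linear(AC_DIM, 32), nn.ReLU(), nn.Linear(32, N_SPK))

opt_g = torch.optim.Adam(acoustic_model.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(list(voice_disc.parameters()) + list(speaker_disc.parameters()), lr=1e-4)
bce, ce, mse = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss(), nn.MSELoss()

ling = torch.randn(16, LING_DIM)                   # language feature values
spk = torch.randn(16, SPK_DIM)                     # speaker data items
real = torch.randn(16, AC_DIM)                     # natural acoustic feature values
spk_id = torch.randint(0, N_SPK, (16,))

# Discriminator step (second and third learning units).
fake = acoustic_model(torch.cat([ling, spk], 1)).detach()
d_loss = (bce(voice_disc(real), torch.ones(16, 1))
          + bce(voice_disc(fake), torch.zeros(16, 1))
          + ce(speaker_disc(real), spk_id))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step (first learning unit): match targets and fool both discriminators.
fake = acoustic_model(torch.cat([ling, spk], 1))
g_loss = (mse(fake, real)
          + bce(voice_disc(fake), torch.ones(16, 1))
          + ce(speaker_disc(fake), spk_id))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```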
  • Publication number: 20220406289
    Abstract: A detection device includes a labeling acoustic feature calculation unit configured to calculate a labeling acoustic feature from voice data, a time information acquisition unit configured to acquire a label with time information corresponding to the voice data from a label with no time information corresponding to the voice data and the labeling acoustic feature through a use of a labeling acoustic model configured to receive, as inputs, a label with no time information and a labeling acoustic feature and output a label with time information, an acoustic feature prediction unit configured to predict an acoustic feature corresponding to the label with time information and acquire a predicted value through a use of an acoustic model configured to receive, as an input, a label with time information and output an acoustic feature, an acoustic feature calculation unit configured to calculate an acoustic feature from the voice data, a difference calculation unit configured to determine an acoustic difference between …
    Type: Application
    Filed: November 25, 2019
    Publication date: December 22, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hiroki KANAGAWA, Yusuke IJIMA
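Since the abstract above is cut off, the sketch below only illustrates the data flow among the named units, assuming the difference is taken between the predicted and the directly calculated acoustic features; the alignment and prediction models are stubbed out with placeholder functions.

```python
# Pipeline-only sketch: each function stands in for one unit named in the
# abstract; the real models behind them are not reproduced here.
import numpy as np

def labeling_acoustic_feature(voice):            # labeling acoustic feature calculation unit
    return voice.reshape(-1, 10).mean(axis=1)

def align(labels, labeling_feature):             # time information acquisition unit (stub)
    n = len(labeling_feature)
    return [(lab, i * n // len(labels), (i + 1) * n // len(labels))
            for i, lab in enumerate(labels)]

def predict_acoustic(labels_with_time, n_frames):  # acoustic feature prediction unit (stub)
    return np.zeros((n_frames, 1))

def acoustic_feature(voice):                     # acoustic feature calculation unit
    return voice.reshape(-1, 10).max(axis=1, keepdims=True)

voice = np.random.randn(1000)
aligned = align(["a", "i"], labeling_acoustic_feature(voice))
predicted = predict_acoustic(aligned, n_frames=100)
observed = acoustic_feature(voice)
difference = np.abs(predicted - observed)        # difference calculation unit
```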
  • Publication number: 20220139381
    Abstract: An estimation device (100), which is an estimation device that estimates a duration of a speech section, includes: a representation conversion unit (11) that performs representation conversion of a plurality of words included in learning utterance information to a plurality of pieces of numeric representation data; an estimation data generation unit (12) that generates estimation data by using a plurality of pieces of the learning utterance information and the plurality of pieces of numeric representation data; an estimation model learning unit (13) that learns an estimation model by using the estimation data and the durations of the plurality of words; and an estimation unit (20) that estimates the duration of a predetermined speech section based on utterance information of a user by using the estimation model.
    Type: Application
    Filed: January 30, 2020
    Publication date: May 5, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventor: Yusuke IJIMA
  • Publication number: 20220051655
    Abstract: An acoustic model learning device is provided for obtaining an acoustic model used to synthesize voice signals with intonation. The device includes a first learning unit that learns the acoustic model to estimate synthetic acoustic feature values using voice and speaker determination models based on acoustic feature values of speakers, language feature values corresponding to the acoustic feature values and speaker data items, a second learning unit that learns the voice determination model to determine whether the synthetic acoustic feature value is a predetermined acoustic feature value or not based on the acoustic feature values and the synthetic acoustic feature values, and a third learning unit that learns the speaker determination model to determine whether the speaker of the synthetic acoustic feature value is a predetermined speaker or not based on the acoustic feature values and the synthetic acoustic feature values.
    Type: Application
    Filed: September 25, 2019
    Publication date: February 17, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hiroki KANAGAWA, Yusuke IJIMA
  • Publication number: 20200219413
    Abstract: The present invention provides a pronunciation error detection apparatus capable of following a text without the need for a correct sentence even when erroneous recognition such as a reading error occurs.
    Type: Application
    Filed: September 13, 2018
    Publication date: July 9, 2020
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Satoshi KOBASHIKAWA, Ryo MASUMURA, Hosana KAMIYAMA, Yusuke IJIMA, Yushi AONO
  • Publication number: 20190362703
    Abstract: Provided is a word vectorization device that converts a word to a word vector considering the acoustic feature of the word. A word vectorization model learning device comprises a learning part for learning a word vectorization model by using a vector w_{L,s}(t) indicating a word y_{L,s}(t) included in learning text data, and an acoustic feature amount af_{L,s}(t) that is an acoustic feature amount of speech data corresponding to the learning text data and that corresponds to the word y_{L,s}(t). The word vectorization model includes a neural network that receives a vector indicating a word as an input and outputs the acoustic feature amount of speech data corresponding to the word, and the word vectorization model is a model that uses an output value from any intermediate layer as a word vector.
    Type: Application
    Filed: February 14, 2018
    Publication date: November 28, 2019
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yusuke IJIMA, Nobukatsu HOJO, Taichi ASAMI
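A minimal sketch of the word vectorization idea described above: a network is trained to map a word to the acoustic feature amount of its speech, and an intermediate layer's output is then reused as the word vector. Layer sizes, the acoustic feature dimension, and the training data are assumptions.

```python
# Sketch: train word -> acoustic-feature prediction, then take the
# intermediate (embedding) layer's output as the word vector.
import torch
import torch.nn as nn

VOCAB, HIDDEN, ACOUSTIC_DIM = 1000, 64, 80
embedding = nn.Embedding(VOCAB, HIDDEN)           # intermediate layer
to_acoustic = nn.Linear(HIDDEN, ACOUSTIC_DIM)     # predicts the acoustic feature amount
params = list(embedding.parameters()) + list(to_acoustic.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

word_ids = torch.randint(0, VOCAB, (128,))        # words y_{L,s}(t) from learning text
acoustic = torch.randn(128, ACOUSTIC_DIM)         # corresponding af_{L,s}(t)

for _ in range(5):                                # learn to predict acoustic features
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(to_acoustic(embedding(word_ids)), acoustic)
    loss.backward()
    optimizer.step()

word_vector = embedding(word_ids[:1])             # intermediate-layer output as the word vector
```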