Patents by Inventor Takaaki FUKUTOMI

Takaaki FUKUTOMI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LEARNING APPARATUS, ESTIMATION APPARATUS, METHODS AND PROGRAMS FOR THE SAME

Publication number: 20240127796

Abstract: The present invention estimates the intention of an utterance more accurately than the related art. A learning device learns an estimation model on the basis of learning data including an acoustic signal for learning and a label indicating whether or not the acoustic signal has been uttered to a predetermined target. The learning device includes: a feature synchronization unit configured to obtain a post-synchronization feature by synchronizing an acoustic feature obtained from the acoustic signal for learning with a text feature corresponding to the acoustic signal; an utterance intention estimation unit configured to estimate whether or not the acoustic signal has been uttered to the predetermined target by using the post-synchronization feature; and a parameter update unit configured to update a parameter of the estimation model on the basis of the label included in the learning data and an estimation result by the utterance intention estimation unit.

Type: Application

Filed: February 18, 2021

Publication date: April 18, 2024

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Hiroshi SATO, Takaaki FUKUTOMI, Yusuke SHINOHARA

Learning data acquisition apparatus, model learning apparatus, methods and programs for the same

Patent number: 11942074

Abstract: A learning data acquisition device or the like, capable of acquiring learning data by superimposing noise data on clean voice data at an appropriate SN ratio, is provided. The learning data acquisition device includes a voice recognition influence degree calculation unit and a learning data acquisition unit. The voice recognition influence degree calculation unit calculates an influence degree on voice recognition accuracy caused by a change of a signal-to-noise ratio, based on a result of voice recognition on the kth noise superimposed voice data and a result of voice recognition on the (k-1)th noise superimposed voice data, where K is an integer of 2 or larger, k=2, 3, . . .
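The superimposition step mentioned in this abstract can be illustrated with a minimal sketch. It only shows how noise might be scaled and added to clean speech at a chosen signal-to-noise ratio; it does not reproduce the influence degree calculation, which the truncated abstract does not fully describe. The function names `mix_at_snr` and `measured_snr_db` are hypothetical and do not come from the patent.

```python
import math


def power(signal):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that the clean-to-noise power ratio equals
    `snr_db` (in dB), then add it to `clean` sample by sample.

    Assumption: both inputs are equal-length lists of float samples.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # SNR(dB) = 10 * log10(P_clean / P_noise), solved for the noise power.
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, mixed):
    """Recover the SNR of a mixture when the clean signal is known."""
    residual = [m - c for c, m in zip(clean, mixed)]
    return 10 * math.log10(power(clean) / power(residual))
```

Generating a series of mixtures at decreasing `snr_db` values would give the kind of k-indexed noise superimposed data the abstract refers to, although how the patent chooses those values is not stated in the text above.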

Type: Grant

Filed: January 29, 2020

Date of Patent: March 26, 2024

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takaaki Fukutomi, Takashi Nakamura, Kiyoaki Matsui

Non-verbal utterance detection apparatus, non-verbal utterance detection method, and program

Patent number: 11741989

Abstract: Detection precision of a non-verbal sound is improved. An acoustic model storage unit 10A stores an acoustic model that is configured by a deep neural network with a bottleneck structure, and estimates a phoneme state from a sound feature value. A non-verbal sound model storage unit 10B stores a non-verbal sound model that estimates a posterior probability of a non-verbal sound likeliness from the sound feature value and a bottleneck feature value. A sound feature value extraction unit 11 extracts a sound feature value from an input sound signal. A bottleneck feature value estimation unit 12 inputs the sound feature value to the acoustic model and obtains an output of a bottleneck layer of the acoustic model as a bottleneck feature value. A non-verbal sound detection unit 13 inputs the sound feature value and the bottleneck feature value to the non-verbal sound model and obtains the posterior probability of the non-verbal sound likeliness output by the non-verbal sound model.

Type: Grant

Filed: October 31, 2019

Date of Patent: August 29, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takashi Nakamura, Takaaki Fukutomi, Kiyoaki Matsui

Learning speech data generating apparatus, learning speech data generating method, and program

Patent number: 11621015

Abstract: A training speech data generating apparatus includes: a voice conversion unit that converts, using fourth noise data, which is noise data based on third noise data, and speech data, the speech data so as to make the speech data clearly audible under a noise environment corresponding to the fourth noise data; and a noise superimposition unit that obtains training speech data by superimposing the third noise data and the converted speech data.

Type: Grant

Filed: March 11, 2019

Date of Patent: April 4, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takaaki Fukutomi, Manabu Okamoto, Takashi Nakamura, Kiyoaki Matsui

Appropriate utterance estimate model learning apparatus, appropriate utterance judgement apparatus, appropriate utterance estimate model learning method, appropriate utterance judgement method, and program

Patent number: 11587553

Abstract: Provided is technology for assessing whether uttered speech detected from input speech is speech suited to a prescribed purpose. A method comprises detecting, from input speech including speech uttered by a speaker and noise, the uttered speech corresponding to the speech uttered by the speaker, extracting an acoustic feature of the uttered speech, generating, from the uttered speech, a speech recognition result set with a recognition score, generating, from the speech recognition result set with the recognition score, a speech recognition result word vector expression set and a speech recognition result part-of-speech vector expression set, generating a target utterance estimation model, providing, using the target utterance estimation model, a probability of the uttered speech being suited to the prescribed purpose, and outputting the uttered speech and the speech recognition result set with the recognition score, the uttered speech suitable to the prescribed purpose.

Type: Grant

Filed: February 7, 2019

Date of Patent: February 21, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takashi Nakamura, Takaaki Fukutomi

SPEECH RECOGNITION CONTROL APPARATUS, SPEECH RECOGNITION CONTROL METHOD, AND PROGRAM

Publication number: 20220328047

Abstract: Recognition results are acquired with high responsiveness without being affected by a network communication state. A speech recognition control device (1) acquires recognition results from a speech recognition unit (13) and from a speech recognition device (2) with which it communicates through a network (3). A communication state measuring unit (11) measures a communication state of the network (3). A speech recognition requesting unit (12) transmits a request for a speech recognition process to each of the speech recognition device (2) and the speech recognition unit (13) with a timeout time set in accordance with an immediately prior communication state of the network (3). A recognition result output unit (14) outputs a recognition result based on the recognition result received from one of, or the recognition results received from both of, the speech recognition device (2) and the speech recognition unit (13).
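The request-with-timeout idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the formula in `timeout_from_latency`, the rule of preferring the networked result when it arrives in time, and all function names are hypothetical. The two recognizers are passed in as plain callables.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def timeout_from_latency(last_latency_s, base_s=0.2, factor=2.0):
    """Assumed rule: derive the timeout from the most recently
    measured network latency (in seconds)."""
    return base_s + factor * last_latency_s


def recognize(audio, remote, local, timeout_s):
    """Send `audio` to both recognizers at once and return a
    (source, result) pair.

    Assumption: the remote result is preferred when it arrives within
    `timeout_s`; otherwise the local result is used.
    """
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        remote_future = pool.submit(remote, audio)
        local_future = pool.submit(local, audio)
        try:
            return "remote", remote_future.result(timeout=timeout_s)
        except FutureTimeout:
            # The network was too slow; fall back to the local result.
            return "local", local_future.result()
    finally:
        # Do not block on a slow remote call still running in the pool.
        pool.shutdown(wait=False)
```

A caller would measure the latency of the previous exchange and pass `timeout_from_latency(latency)` as `timeout_s`, so that a degraded network shifts the decision toward the local recognizer sooner or later depending on the chosen constants.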

Type: Application

Filed: June 4, 2019

Publication date: October 13, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takaaki FUKUTOMI, Yoshikazu YAMAGUCHI, Yusuke SHINOHARA, Kiyoaki MATSUI, Takafumi MORIYA

LEARNING APPARATUS, SPEECH RECOGNITION APPARATUS, METHODS AND PROGRAMS FOR THE SAME

Publication number: 20220246138

Abstract: A learning device includes: a speech recognition portion configured to perform speech recognition processing on an acoustic feature value sequence O of an utterance unit using a recognition parameter ?ini, and obtain a recognition hypothesis Hm and an overall score xm; a hypothesis evaluation portion configured to evaluate the recognition hypothesis Hm and obtain an evaluation value Em using a correct answer text that is a correct speech recognition result for the acoustic feature value sequence O; a reranking portion configured to obtain an overall score xm,k for the recognition hypothesis Hm and give a rank rankm,k thereto using a recognition parameter ?k; an optimal parameter calculation portion configured to obtain, as a calculation result, an optimal value of a recognition parameter or a value expressing inappropriateness of the recognition parameter ?k based on the evaluation value Em and the rank rankm,k; and a model learning portion configured to learn a regression model for estimating an optimal reco

Type: Application

Filed: June 7, 2019

Publication date: August 4, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Hiroshi SATO, Takaaki FUKUTOMI

ACCOUSTIC MODEL LEARNING APPARATUS, ACCOUSTIC MODEL LEARNING METHOD, AND PROGRAM

Publication number: 20220122626

Abstract: Provided is a technology of learning an acoustic model with a certain degree of accuracy of sound recognition within a short calculation period.

Type: Application

Filed: January 23, 2020

Publication date: April 21, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Kiyoaki MATSUI, Takafumi MORIYA, Takaaki FUKUTOMI, Yusuke SHINOHARA, Yoshikazu YAMAGUCHI, Manabu OKAMOTO

LEARNING DATA ACQUISITION APPARATUS, MODEL LEARNING APPARATUS, METHODS AND PROGRAMS FOR THE SAME

Publication number: 20220101828

Abstract: A learning data acquisition device or the like, capable of acquiring learning data by superimposing noise data on clean voice data at an appropriate SN ratio, is provided. The learning data acquisition device includes a voice recognition influence degree calculation unit and a learning data acquisition unit. The voice recognition influence degree calculation unit calculates an influence degree on voice recognition accuracy caused by a change of a signal-to-noise ratio, based on a result of voice recognition on the kth noise superimposed voice data and a result of voice recognition on the (k-1)th noise superimposed voice data, where K is an integer of 2 or larger, k=2, 3, . . .

Type: Application

Filed: January 29, 2020

Publication date: March 31, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takaaki FUKUTOMI, Takashi NAKAMURA, Kiyoaki MATSUI

Speech recognition accuracy deterioration factor estimation device, speech recognition accuracy deterioration factor estimation method, and program

Patent number: 11227580

Abstract: The present invention provides a device for estimating the deterioration factor of speech recognition accuracy by estimating an acoustic factor that leads to a speech recognition error. The device extracts an acoustic feature amount for each frame from an input speech, calculates a posterior probability for each acoustic event for the acoustic feature amount for each frame, corrects the posterior probability by filtering the posterior probability for each acoustic event using a time-series filter with weighting coefficients developed in the time axis, outputs a set of speech recognition results with a recognition score, outputs a feature amount for the speech recognition results for each frame, calculates and outputs a principal deterioration factor class for the speech recognition accuracy for each frame on the basis of the corrected posterior probability, the feature amount for speech recognition results for each frame, and the acoustic feature amount for each frame.
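The posterior correction step in this abstract, filtering per-frame posteriors with weights laid out along the time axis, might look like the sketch below. The weight values, the edge handling (repeating the first and last frame) and the per-frame renormalization are assumptions made for illustration, and `smooth_posteriors` is a hypothetical name.

```python
def smooth_posteriors(posteriors, weights):
    """Apply a centered weighted filter along the time axis.

    posteriors: list of frames, each a list of per-event probabilities.
    weights: odd-length list of filter coefficients, centered on the
             current frame (assumed, not taken from the patent).
    Returns a new list of frames, each renormalized to sum to 1.
    """
    if len(weights) % 2 == 0:
        raise ValueError("weights must have odd length")
    half = len(weights) // 2
    n_frames = len(posteriors)
    n_events = len(posteriors[0])
    smoothed = []
    for t in range(n_frames):
        acc = [0.0] * n_events
        for offset, w in enumerate(weights, start=-half):
            # Clamp indices so edge frames reuse the nearest valid frame.
            src = min(max(t + offset, 0), n_frames - 1)
            for e, p in enumerate(posteriors[src]):
                acc[e] += w * p
        total = sum(acc)
        smoothed.append([a / total for a in acc])
    return smoothed
```

With weights such as `[0.25, 0.5, 0.25]`, a single frame that disagrees with its neighbors is pulled toward them, which is the usual motivation for smoothing frame-level posteriors before a per-frame decision.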

Type: Grant

Filed: February 6, 2019

Date of Patent: January 18, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takashi Nakamura, Takaaki Fukutomi

NON-VERBAL UTTERANCE DETECTION APPARATUS, NON-VERBAL UTTERANCE DETECTION METHOD, AND PROGRAM

Publication number: 20210272587

Abstract: Detection precision of a non-verbal sound is improved. An acoustic model storage unit 10A stores an acoustic model that is configured by a deep neural network with a bottleneck structure, and estimates a phoneme state from a sound feature value. A non-verbal sound model storage unit 10B stores a non-verbal sound model that estimates a posterior probability of a non-verbal sound likeliness from the sound feature value and a bottleneck feature value. A sound feature value extraction unit 11 extracts a sound feature value from an input sound signal. A bottleneck feature value estimation unit 12 inputs the sound feature value to the acoustic model and obtains an output of a bottleneck layer of the acoustic model as a bottleneck feature value. A non-verbal sound detection unit 13 inputs the sound feature value and the bottleneck feature value to the non-verbal sound model and obtains the posterior probability of the non-verbal sound likeliness output by the non-verbal sound model.

Type: Application

Filed: October 31, 2019

Publication date: September 2, 2021

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takashi NAKAMURA, Takaaki FUKUTOMI, Kiyoaki MATSUI

APPROPRIATE UTTERANCE ESTIMATE MODEL LEARNING APPARATUS, APPROPRIATE UTTERANCE JUDGEMENT APPARATUS, APPROPRIATE UTTERANCE ESTIMATE MODEL LEARNING METHOD, APPROPRIATE UTTERANCE JUDGEMENT METHOD, AND PROGRAM

Publication number: 20210035558

Abstract: Provided is technology for assessing whether uttered speech detected from input speech is speech suited to a prescribed purpose. A method comprises detecting, from input speech including speech uttered by a speaker and noise, the uttered speech corresponding to the speech uttered by the speaker, extracting an acoustic feature of the uttered speech, generating, from the uttered speech, a speech recognition result set with a recognition score, generating, from the speech recognition result set with the recognition score, a speech recognition result word vector expression set and a speech recognition result part-of-speech vector expression set, generating a target utterance estimation model, providing, using the target utterance estimation model, a probability of the uttered speech being suited to the prescribed purpose, and outputting the uttered speech and the speech recognition result set with the recognition score, the uttered speech suitable to the prescribed purpose.

Type: Application

Filed: February 7, 2019

Publication date: February 4, 2021

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takashi NAKAMURA, Takaaki FUKUTOMI

SPEECH RECOGNITION ACCURACY DETERIORATION FACTOR ESTIMATION DEVICE, SPEECH RECOGNITION ACCURACY DETERIORATION FACTOR ESTIMATION METHOD, AND PROGRAM

Publication number: 20210035553

Abstract: The present invention provides a device for estimating the deterioration factor of speech recognition accuracy by estimating an acoustic factor that leads to a speech recognition error. The device extracts an acoustic feature amount for each frame from an input speech, calculates a posterior probability for each acoustic event for the acoustic feature amount for each frame, corrects the posterior probability by filtering the posterior probability for each acoustic event using a time-series filter with weighting coefficients developed in the time axis, outputs a set of speech recognition results with a recognition score, outputs a feature amount for the speech recognition results for each frame, calculates and outputs a principal deterioration factor class for the speech recognition accuracy for each frame on the basis of the corrected posterior probability, the feature amount for speech recognition results for each frame, and the acoustic feature amount for each frame.

Type: Application

Filed: February 6, 2019

Publication date: February 4, 2021

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takashi NAKAMURA, Takaaki FUKUTOMI

LEARNING SPEECH DATA GENERATING APPARATUS, LEARNING SPEECH DATA GENERATING METHOD, AND PROGRAM

Publication number: 20210005215

Abstract: A training speech data generating apparatus includes: a voice conversion unit that converts, using fourth noise data, which is noise data based on third noise data, and speech data, the speech data so as to make the speech data clearly audible under a noise environment corresponding to the fourth noise data; and a noise superimposition unit that obtains training speech data by superimposing the third noise data and the converted speech data.

Type: Application

Filed: March 11, 2019

Publication date: January 7, 2021

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takaaki FUKUTOMI, Manabu OKAMOTO, Takashi NAKAMURA, Kiyoaki MATSUI