Patents by Inventor Takanobu OBA

Takanobu OBA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Sequence labeling apparatus, sequence labeling method, and program

Patent number: 12142258

Abstract: Without dividing speech into a unit such as a word or a character, text corresponding to the speech is labeled. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol included in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates a label corresponding to the symbol from the fixed-length vector of the symbol generated using the speech distributed representation, the symbol distributed representation, and fixed-length vectors of previous and next symbols.

Type: Grant

Filed: January 10, 2020

Date of Patent: November 12, 2024

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Tomohiro Tanaka, Ryo Masumura, Takanobu Oba
Utterance section detection device, utterance section detection method, and program

Patent number: 12136435

Abstract: An utterance section detection device which is capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance.

Type: Grant

Filed: July 24, 2019

Date of Patent: November 5, 2024

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
Speech recognition device, speech recognition method, and program

Patent number: 12057105

Abstract: Provided is a speech recognition device capable of implementing end-to-end speech recognition considering a context.

Type: Grant

Filed: January 27, 2020

Date of Patent: August 6, 2024

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
Voice/non-voice determination device, voice/non-voice determination model parameter learning device, voice/non-voice determination method, voice/non-voice determination model parameter learning method, and program

Patent number: 11894017

Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.

Type: Grant

Filed: July 25, 2019

Date of Patent: February 6, 2024

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
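
The data flow in the abstract above (three models, with the third consuming the raw signal plus the outputs of the first two) can be sketched as follows. This is a toy illustration only, not the patented method: the function names, the list-of-floats frame layout, and the threshold rules standing in for the three learned models are all assumptions made for the example.

```python
from typing import List, Sequence

Frame = Sequence[float]  # hypothetical: one short frame of signal samples


def scene_info(frames: List[Frame]) -> float:
    """Stand-in for the first model (acoustic scene classification).

    Assumption: the "scene information" is a single noise-floor estimate,
    the mean absolute amplitude over all frames (frames must be non-empty).
    """
    values = [abs(x) for frame in frames for x in frame]
    return sum(values) / len(values)


def enhance(frames: List[Frame]) -> List[List[float]]:
    """Stand-in for the second model (speech enhancement).

    Assumption: samples at or below the mean absolute amplitude are zeroed.
    """
    floor = scene_info(frames)
    return [[x if abs(x) > floor else 0.0 for x in frame] for frame in frames]


def label_frames(frames: List[Frame], scene: float,
                 enhanced: List[List[float]]) -> List[str]:
    """Stand-in for the third model: uses all three inputs per frame."""
    labels = []
    for raw, clean in zip(frames, enhanced):
        clean_energy = sum(abs(x) for x in clean) / len(clean)
        raw_peak = max(abs(x) for x in raw)
        is_speech = clean_energy > scene and raw_peak > scene
        labels.append("speech" if is_speech else "non-speech")
    return labels


def detect(frames: List[Frame]) -> List[str]:
    """Wire the three stand-in models together as the abstract describes."""
    scene = scene_info(frames)
    enhanced = enhance(frames)
    return label_frames(frames, scene, enhanced)


frames = [[0.01, -0.02, 0.01], [0.9, -0.8, 0.7], [0.02, 0.0, -0.01]]
print(detect(frames))
```

In the device described by the abstract, each of the three stages is a model; the sketch only shows which inputs each stage receives and that the final stage emits one speech or non-speech label per section.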
Language model score calculation apparatus, language model generation apparatus, methods therefor, program, and recording medium

Patent number: 11887620

Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.

Type: Grant

Filed: January 27, 2020

Date of Patent: January 30, 2024

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
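
The chain of units 11 to 15 in the abstract above can be read as a simple composition of functions. The sketch below is a hedged illustration of that wiring only: every name in it is hypothetical, "combining" is assumed to mean vector concatenation, and the toy callables stand in for the learned components, which the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Utterance = Tuple[Sequence[str], str]  # (word string, speaker label)


def language_model_score(
    history: List[Utterance],
    current: Utterance,
    understand: Callable[[Sequence[str]], Vector],        # unit 11
    embed: Callable[[Sequence[str], str], Vector],        # unit 12
    aggregate: Callable[[List[Vector]], Vector],          # unit 14
    score: Callable[[Sequence[str], str, Vector], float], # unit 15
) -> float:
    combined: List[Vector] = []
    for words, speaker in history:
        meta_vec = understand(words)         # meta-information vector
        emb_vec = embed(words, speaker)      # history speech embedding vector
        combined.append(meta_vec + emb_vec)  # unit 13: concatenation (assumption)
    context = aggregate(combined)            # speech sequence embedding vector
    words, speaker = current
    return score(words, speaker, context)


# Toy stand-ins, for illustration only.
def toy_understand(words: Sequence[str]) -> Vector:
    # 1.0 if the utterance looks like a question, else 0.0
    return [1.0 if words and words[-1] == "?" else 0.0]


def toy_embed(words: Sequence[str], speaker: str) -> Vector:
    return [float(len(words)), 1.0 if speaker == "A" else 0.0]


def toy_aggregate(vectors: List[Vector]) -> Vector:
    # element-wise mean over the history
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def toy_score(words: Sequence[str], speaker: str, context: Vector) -> float:
    # shorter utterances score higher; a question in the history adds a bonus
    return -float(len(words)) + context[0]


history = [(["how", "are", "you", "?"], "A"), (["fine", "thanks"], "B")]
result = language_model_score(
    history, (["good"], "A"),
    toy_understand, toy_embed, toy_aggregate, toy_score,
)
print(result)
```

Passing the components in as callables keeps the sketch neutral about how each unit is realized; only the order of operations and the inputs to each unit are taken from the abstract.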
VOICE/NON-VOICE DETERMINATION DEVICE, VOICE/NON-VOICE DETERMINATION MODEL PARAMETER LEARNING DEVICE, VOICE/NON-VOICE DETERMINATION METHOD, VOICE/NON-VOICE DETERMINATION MODEL PARAMETER LEARNING METHOD, AND PROGRAM

Publication number: 20220277767

Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.

Type: Application

Filed: July 25, 2019

Publication date: September 1, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
UTTERANCE SECTION DETECTION DEVICE, UTTERANCE SECTION DETECTION METHOD, AND PROGRAM

Publication number: 20220270637

Abstract: An utterance section detection device which is capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance.

Type: Application

Filed: July 24, 2019

Publication date: August 25, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND PROGRAM

Publication number: 20220139374

Abstract: Provided is a speech recognition device capable of implementing end-to-end speech recognition considering a context.

Type: Application

Filed: January 27, 2020

Publication date: May 5, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA
SEQUENCE LABELING APPARATUS, SEQUENCE LABELING METHOD, AND PROGRAM

Publication number: 20220093079

Abstract: Without dividing speech into a unit such as a word or a character, text corresponding to the speech is labeled. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol included in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates a label corresponding to the symbol from the fixed-length vector of the symbol generated using the speech distributed representation, the symbol distributed representation, and fixed-length vectors of previous and next symbols.

Type: Application

Filed: January 10, 2020

Publication date: March 24, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Tomohiro TANAKA, Ryo MASUMURA, Takanobu OBA
LANGUAGE MODEL SCORE CALCULATION APPARATUS, LANGUAGE MODEL GENERATION APPARATUS, METHODS THEREFOR, PROGRAM, AND RECORDING MEDIUM

Publication number: 20220013136

Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.

Type: Application

Filed: January 27, 2020

Publication date: January 13, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA