Patents by Inventor Takanobu OBA

Takanobu OBA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12142258
    Abstract: Without dividing speech into a unit such as a word or a character, text corresponding to the speech is labeled. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol included in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates a label corresponding to the symbol from the fixed-length vector of the symbol generated using the speech distributed representation, the symbol distributed representation, and fixed-length vectors of previous and next symbols.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: November 12, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro Tanaka, Ryo Masumura, Takanobu Oba
  • Patent number: 12136435
    Abstract: Provided is an utterance section detection device capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance.
    Type: Grant
    Filed: July 24, 2019
    Date of Patent: November 5, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
  • Patent number: 12057105
    Abstract: Provided is a speech recognition device capable of implementing end-to-end speech recognition considering a context.
    Type: Grant
    Filed: January 27, 2020
    Date of Patent: August 6, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
  • Patent number: 11894017
    Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: February 6, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
  • Patent number: 11887620
    Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.
    Type: Grant
    Filed: January 27, 2020
    Date of Patent: January 30, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
  • Publication number: 20220277767
    Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.
    Type: Application
    Filed: July 25, 2019
    Publication date: September 1, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
  • Publication number: 20220270637
    Abstract: Provided is an utterance section detection device capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance.
    Type: Application
    Filed: July 24, 2019
    Publication date: August 25, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
  • Publication number: 20220139374
    Abstract: Provided is a speech recognition device capable of implementing end-to-end speech recognition considering a context.
    Type: Application
    Filed: January 27, 2020
    Publication date: May 5, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA
  • Publication number: 20220093079
    Abstract: Without dividing speech into a unit such as a word or a character, text corresponding to the speech is labeled. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol included in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates a label corresponding to the symbol from the fixed-length vector of the symbol generated using the speech distributed representation, the symbol distributed representation, and fixed-length vectors of previous and next symbols.
    Type: Application
    Filed: January 10, 2020
    Publication date: March 24, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro TANAKA, Ryo MASUMURA, Takanobu OBA
  • Publication number: 20220013136
    Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.
    Type: Application
    Filed: January 27, 2020
    Publication date: January 13, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA
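
The data flow in the abstract directly above (units 11 to 15) can be pictured with a small sketch. This is only an illustration of how the five units pass vectors to one another, not the patented method: every function name, the hand-written toy features, the use of concatenation for "combining", the use of averaging for the sequence embedding, and the scoring formula are assumptions made for this example. The filing itself describes learned models, which are not reproduced here.

```python
from typing import List, Tuple

Vector = List[float]
Utterance = Tuple[List[str], str]  # (word string, speaker label)


def meta_information_vector(words: List[str]) -> Vector:
    """Stand-in for unit 11: toy meta-information from a preceding word string.

    Assumption: two hand-made features (ends with "?", number of words).
    """
    is_question = 1.0 if words and words[-1] == "?" else 0.0
    return [is_question, float(len(words))]


def history_embedding(words: List[str], speaker: str) -> Vector:
    """Stand-in for unit 12: toy embedding of a word string and speaker label.

    Assumption: mean word length plus a flag for a hypothetical "agent" label.
    """
    mean_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return [mean_len, 1.0 if speaker == "agent" else 0.0]


def combine(meta: Vector, emb: Vector) -> Vector:
    """Stand-in for unit 13: combine the two vectors (assumed: concatenation)."""
    return meta + emb


def sequence_embedding(units: List[Vector]) -> Vector:
    """Stand-in for unit 14: reduce all past combined vectors to one vector.

    Assumption: element-wise mean over the past utterances.
    """
    if not units:
        return []
    return [sum(column) / len(units) for column in zip(*units)]


def language_model_score(words: List[str], speaker: str, context: Vector) -> float:
    """Stand-in for unit 15: toy score from current words, speaker, and context.

    Assumption: an arbitrary formula chosen only so that all three inputs
    influence the result; it is not a real language model.
    """
    speaker_bias = 0.5 if speaker == "agent" else 0.0
    return -float(len(words)) + 0.1 * sum(context) + speaker_bias


# Hypothetical two-utterance history followed by a current utterance.
history: List[Utterance] = [
    (["how", "are", "you", "?"], "customer"),
    (["fine", "thanks"], "agent"),
]
combined = [
    combine(meta_information_vector(words), history_embedding(words, speaker))
    for words, speaker in history
]
context = sequence_embedding(combined)
score = language_model_score(["good"], "customer", context)
print(context, score)
```

The point of the sketch is the shape of the pipeline: each past utterance yields one combined vector, all of those are reduced to a single context vector, and only then is the current utterance scored against that context.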