Patents by Inventor Shota HORIGUCHI

Shota HORIGUCHI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220254352
    Abstract: An audio analysis platform may receive a portion of an audio input, wherein the audio input corresponds to audio associated with a plurality of speakers. The audio analysis platform may process, using a neural network, the portion of the audio input to determine voice activity of the plurality of speakers during the portion of the audio input, wherein the neural network is trained using reference audio data and reference diarization data corresponding to the reference audio data. The audio analysis platform may determine, based on the neural network being used to process the portion of the audio input, a diarization output associated with the portion of the audio input, wherein the diarization output indicates individual voice activity of the plurality of speakers. The audio analysis platform may provide the diarization output to indicate the individual voice activity of the plurality of speakers during the portion of the audio input.
    Type: Application
    Filed: August 31, 2020
    Publication date: August 11, 2022
    Applicants: The Johns Hopkins University, Hitachi, Ltd.
    Inventors: Yusuke FUJITA, Shinji WATANABE, Naoyuki KANDA, Shota HORIGUCHI
  • Patent number: 11107476
    Abstract: A speaker estimation method that estimate the speaker from audio and image includes: inputting audio; extracting a feature quantity representing a voice characteristic from the input audio; inputting an image; detecting person regions of respective persons from the input image; estimating feature quantities representing voice characteristics from the respective detected person regions; Performing a change such that an image taken from another position and with another angle is input when any person is not detected; calculating a similarity between the feature quantity representing the voice characteristic extracted from the audio and the feature quantity representing the voice characteristic estimated from the person region in the image; and estimating a speaker from the calculated similarity.
    Type: Grant
    Filed: February 26, 2019
    Date of Patent: August 31, 2021
    Assignee: HITACHI, LTD.
    Inventors: Shota Horiguchi, Naoyuki Kanda
  • Publication number: 20190272828
    Abstract: A speaker estimation method that estimate the speaker from audio and image includes: inputting audio; extracting a feature quantity representing a voice characteristic from the input audio; inputting an image; detecting person regions of respective persons from the input image; estimating feature quantities representing voice characteristics from the respective detected person regions; Performing a change such that an image taken from another position and with another angle is input when any person is not detected; calculating a similarity between the feature quantity representing the voice characteristic extracted from the audio and the feature quantity representing the voice characteristic estimated from the person region in the image; and estimating a speaker from the calculated similarity.
    Type: Application
    Filed: February 26, 2019
    Publication date: September 5, 2019
    Applicant: HITACHI, LTD.
    Inventors: Shota HORIGUCHI, Naoyuki KANDA