Patents by Inventor Yasunori OHISHI

Yasunori OHISHI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240087594
    Abstract: Provided is a technique for accurately estimating a scene even when the number of input signals increases. A scene estimation method includes: when S is the number of scenes and M is the number of input acoustic signals, an acoustic signal encoding step of generating, by a scene estimation device, an integrated acoustic feature amount from an m-th input acoustic signal (m = 1, ..., M) and a position where the m-th input acoustic signal is acquired (hereinafter referred to as an m-th input acoustic signal acquisition position) (m = 1, ..., M); and a scene selection step of selecting, by the scene estimation device, a scene from which M input acoustic signals are acquired from among S scenes, using the integrated acoustic feature amount.
    Type: Application
    Filed: February 10, 2021
    Publication date: March 14, 2024
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Masahiro YASUDA, Yasunori OHISHI, Shoichiro SAITO
  • Patent number: 11830478
    Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.
    Type: Grant
    Filed: April 1, 2021
    Date of Patent: November 28, 2023
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
  • Patent number: 11817081
    Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: November 14, 2023
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
  • Publication number: 20220319493
    Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Massachusetts Institute of Technology
    Inventors: Yasunori OHISHI, Akisato KIMURA, Takahito KAWANISHI, Kunio KASHINO, James R. GLASS, David HARWATH
  • Publication number: 20220319495
    Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.
    Type: Application
    Filed: April 1, 2021
    Publication date: October 6, 2022
    Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Massachusetts Institute of Technology
    Inventors: Yasunori OHISHI, Akisato KIMURA, Takahito KAWANISHI, Kunio KASHINO, James R. GLASS, David HARWATH
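
Illustrative sketch (not taken from the patent text): the abstracts for patent 11817081 and publication 20220319493 describe updating an image encoder and an audio encoder so that the feature of an image is similar to the feature of the speech that corresponds to it. The abstracts do not state which loss function is used. The Python snippet below assumes a symmetric softmax contrastive loss over cosine similarities, which is one common way to express such an objective; the function names, the temperature value, and the random toy features are all hypothetical stand-ins for real encoder outputs.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def contrastive_loss(image_feats, audio_feats, temperature=0.1):
    """Assumed objective: row i of image_feats is paired with row i of audio_feats.

    The loss is low when each image feature is most similar to its own
    paired audio feature, and high otherwise.
    """
    img = l2_normalize(image_feats)
    aud = l2_normalize(audio_feats)
    sim = img @ aud.T / temperature  # (N, N) similarity matrix

    def cross_entropy_on_diagonal(logits):
        # Numerically stable log-softmax over each row, then pick the paired entry.
        logits = logits - logits.max(axis=1, keepdims=True)
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the image-to-audio and audio-to-image directions.
    return 0.5 * (cross_entropy_on_diagonal(sim) + cross_entropy_on_diagonal(sim.T))


# Toy features standing in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(4, 8))
aligned_audio = image_feats + 0.01 * rng.normal(size=(4, 8))  # correct pairing
shuffled_audio = aligned_audio[[1, 2, 3, 0]]                   # wrong pairing

loss_aligned = contrastive_loss(image_feats, aligned_audio)
loss_shuffled = contrastive_loss(image_feats, shuffled_audio)
print(loss_aligned < loss_shuffled)
```

In a training loop, the encoder parameters would be adjusted to reduce this value. The snippet only evaluates the loss and shows that correctly paired features score lower than mismatched ones.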