Patents by Inventor Yasunori OHISHI

Yasunori OHISHI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SCENE ESTIMATE METHOD, SCENE ESTIMATE APPARATUS, AND PROGRAM

Publication number: 20240087594

Abstract: Provided is a technique for accurately estimating a scene even when the number of input signals increases. A scene estimation method includes: when S is the number of scenes and M is the number of input acoustic signals, an acoustic signal encoding step of generating, by a scene estimation device, an integrated acoustic feature amount from an m-th input acoustic signal (m=1, ..., M) and a position where the m-th input acoustic signal is acquired (hereinafter referred to as an m-th input acoustic signal acquisition position) (m=1, ..., M); and a scene selection step of selecting, by the scene estimation device, a scene from which M input acoustic signals are acquired from among S scenes, using the integrated acoustic feature amount.

Type: Application

Filed: February 10, 2021

Publication date: March 14, 2024

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Masahiro YASUDA, Yasunori OHISHI, Shoichiro SAITO
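
Illustrative sketch (not taken from the filing): the abstract above describes two steps, encoding M signals and their acquisition positions into one integrated feature, then selecting one of S scenes from that feature. The Python below is a minimal sketch of that flow under loud assumptions: the per-signal encoder (RMS energy and zero-crossing rate), the mean pooling across signals, and the nearest-prototype selection are all hypothetical stand-ins, since the abstract does not say how either step is implemented.

```python
import math

def encode(signal, position):
    """Hypothetical per-signal encoder: RMS energy, zero-crossing rate, then the position."""
    n = len(signal)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    zcr = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0) / max(n - 1, 1)
    return [rms, zcr, *position]

def integrate(signals, positions):
    """Mean-pool the M per-signal features (an assumption) so the size does not depend on M."""
    feats = [encode(s, p) for s, p in zip(signals, positions)]
    return [sum(col) / len(feats) for col in zip(*feats)]

def select_scene(integrated, prototypes):
    """Return the index of the scene prototype nearest to the integrated feature (an assumption)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(prototypes)), key=lambda s: dist2(integrated, prototypes[s]))

# M = 2 signals recorded at two positions, S = 2 hypothetical scene prototypes.
signals = [[1, -1, 1, -1], [1, -1, 1, -1]]
positions = [(0, 0), (2, 0)]
prototypes = [[0, 0, 0, 0], [1, 1, 1, 0]]
feature = integrate(signals, positions)
scene = select_scene(feature, prototypes)
```

Because the pooled feature has a fixed length, adding more microphones changes only the averaging, which is one simple way to keep the selection step unchanged as M grows.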

Learning device, learning method, and learning program for images and sound which uses a similarity matrix

Patent number: 11830478

Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.

Type: Grant

Filed: April 1, 2021

Date of Patent: November 28, 2023

Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
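
Illustrative sketch (not taken from the patent): the selection step in the abstract above, choosing for each target item the most similar item from another dataset, can be pictured as a row-wise argmax over a similarity matrix of embedded features. The Python below assumes cosine similarity and already-computed feature vectors; the function names are hypothetical, and the parameter update that the abstract also describes is not shown.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def similarity_matrix(targets, candidates):
    """Entry [i][j] is the similarity between target i and candidate j."""
    return [[cosine(t, c) for c in candidates] for t in targets]

def select_similar(targets, candidates):
    """For each target, return the index of its most similar candidate."""
    matrix = similarity_matrix(targets, candidates)
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]

# Hypothetical embedded features: targets from one dataset, candidates from another.
targets = [[1.0, 0.0], [0.0, 1.0]]
candidates = [[0.0, 2.0], [3.0, 0.1]]
matches = select_similar(targets, candidates)
```

In this toy case the first target matches the second candidate and the second target matches the first, so `matches` holds the cross-dataset pairing a training step could then consume.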

Learning device, learning method, learning program, retrieval device, retrieval method, and retrieval program

Patent number: 11817081

Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.

Type: Grant

Filed: March 31, 2021

Date of Patent: November 14, 2023

Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
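
Illustrative sketch (not taken from the patent): the abstract above says the encoders are updated so that an image feature is similar to the audio feature of its corresponding speech, but it does not name a loss function. One common way to express such an objective is a contrastive (InfoNCE-style) loss over a batch of matched pairs. The Python below assumes that loss and dot-product similarity, takes the features as already computed, and omits the encoders and the gradient step.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(image_feats, audio_feats):
    """InfoNCE-style loss (an assumption): image i should score highest against audio i."""
    n = len(image_feats)
    total = 0.0
    for i in range(n):
        logits = [dot(image_feats[i], audio_feats[j]) for j in range(n)]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / n

# Hypothetical latent-space features for two image/speech pairs.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is lower when each image feature lines up with its own speech feature (`aligned`) than when the pairs are mismatched (`swapped`), which is the direction a parameter update would push.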

LEARNING DEVICE, LEARNING METHOD, LEARNING PROGRAM, RETRIEVAL DEVICE, RETRIEVAL METHOD, AND RETRIEVAL PROGRAM

Publication number: 20220319493

Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.

Type: Application

Filed: March 31, 2021

Publication date: October 6, 2022

Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Massachusetts Institute of Technology

Inventors: Yasunori OHISHI, Akisato KIMURA, Takahito KAWANISHI, Kunio KASHINO, James R. GLASS, David HARWATH

LEARNING DEVICE, LEARNING METHOD, AND LEARNING PROGRAM

Publication number: 20220319495

Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.

Type: Application

Filed: April 1, 2021

Publication date: October 6, 2022

Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Massachusetts Institute of Technology

Inventors: Yasunori OHISHI, Akisato KIMURA, Takahito KAWANISHI, Kunio KASHINO, James R. GLASS, David HARWATH
