Patents by Inventor David HARWATH

David HARWATH has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for generating a sequence of actions for controlling a robot

Patent number: 12613523

Abstract: A method, a system and a computer program product are provided for applying a neural network including an action sequence decoder for generating an action sequence for a robot to perform a task. The neural network is applied to generate the action sequence based on recordings demonstrating humans performing tasks. In an example, the method comprises collecting a recording and a sequence of captions describing scenes in the recording; extracting feature data from the recording; encoding the extracted feature data to produce a sequence of encoded features; and applying the action sequence decoder to produce a sequence of actions for the robot based on the sequence of encoded features having a semantic meaning corresponding to a semantic meaning of the sequence of captions. The feature data includes features of a video signal, an audio signal, and/or text transcription capturing a performance of the task.

Type: Grant

Filed: September 27, 2023

Date of Patent: April 28, 2026

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Chiori Hori, Jonathan Le Roux, Devesh Jha, Siddarth Jain, Radu Ioan Corcodel, Diego Romeres, Puyuang Peng, Xinyu Liu, David Harwath

Method and System for Generating a Sequence of Actions for Controlling a Robot

Publication number: 20240288870

Abstract: A method, a system and a computer program product are provided for applying a neural network including an action sequence decoder for generating an action sequence for a robot to perform a task. The neural network is applied to generate the action sequence based on recordings demonstrating humans performing tasks. In an example, the method comprises collecting a recording and a sequence of captions describing scenes in the recording; extracting feature data from the recording; encoding the extracted feature data to produce a sequence of encoded features; and applying the action sequence decoder to produce a sequence of actions for the robot based on the sequence of encoded features having a semantic meaning corresponding to a semantic meaning of the sequence of captions. The feature data includes features of a video signal, an audio signal, and/or text transcription capturing a performance of the task.

Type: Application

Filed: September 27, 2023

Publication date: August 29, 2024

Applicant: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Chiori Hori, Jonathan Le Roux, Devesh Jha, Siddarth Jain, Radu Ioan Corcodel, Diego Romeres, Puyuang Peng, Xinyu Liu, David Harwath

Learning device, learning method, and learning program for images and sound which uses a similarity matrix

Patent number: 11830478

Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.
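Illustrative sketch (not taken from the patent): the selection step described in this abstract can be pictured as a nearest-neighbor search in the shared embedding space. The Python below assumes each dataset is a list of (first-modality feature, second-modality feature) pairs and that similarity is measured by cosine similarity; the function names `cosine`, `select_similar`, and `cross_dataset_pairs` are hypothetical, and the patent may define similarity and selection differently.

```python
import math

def cosine(u, v):
    # Cosine similarity between two non-zero feature vectors (assumed metric).
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def select_similar(targets, candidates):
    # For each target feature, return the index of the most similar candidate.
    return [
        max(range(len(candidates)), key=lambda j: cosine(t, candidates[j]))
        for t in targets
    ]

def cross_dataset_pairs(first, second):
    # first, second: lists of (modality_1_feature, modality_2_feature) pairs.
    # Targets are the modality-1 features of the first dataset; candidates are
    # the modality-2 features of the second dataset.
    targets = [pair[0] for pair in first]
    candidates = [pair[1] for pair in second]
    chosen = select_similar(targets, candidates)
    # Each (k, j) names two items whose partners would be pulled together in
    # training: the partner of target k and the partner of similar item j.
    return list(enumerate(chosen))

first = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])]
second = [([0.0, 1.0], [0.1, 1.0]), ([1.0, 0.0], [1.0, 0.1])]
pairs = cross_dataset_pairs(first, second)
```

In this toy input, the first target lines up with the second item of the other dataset and vice versa, so a training step under these assumptions would treat those cross-dataset partners as additional positives.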

Type: Grant

Filed: April 1, 2021

Date of Patent: November 28, 2023

Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath

Learning device, learning method, learning program, retrieval device, retrieval method, and retrieval program

Patent number: 11817081

Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.
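Illustrative sketch (not taken from the patent): one common way to make an image feature similar to the audio feature of its matching speech is a contrastive objective over a similarity matrix. The Python below assumes the encoder outputs are already available as plain vectors, uses cosine similarity, and uses a softmax-style (InfoNCE-like) loss with a hypothetical `temperature` parameter; the abstract does not state which loss the learning device actually uses.

```python
import math

def cosine(u, v):
    # Cosine similarity between two non-zero feature vectors (assumed metric).
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def similarity_matrix(image_feats, audio_feats):
    # Entry [i][j] compares image i with speech j in the shared latent space.
    return [[cosine(i, a) for a in audio_feats] for i in image_feats]

def contrastive_loss(sim, temperature=0.1):
    # Row-wise softmax cross-entropy: the matched pair (k, k) should score
    # higher than every mismatched pair (k, j).
    total = 0.0
    for k, row in enumerate(sim):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[k]
    return total / len(sim)

image_feats = [[1.0, 0.0], [0.0, 1.0]]
aligned_audio = [[0.9, 0.1], [0.1, 0.9]]
shuffled_audio = [[0.1, 0.9], [0.9, 0.1]]
aligned_loss = contrastive_loss(similarity_matrix(image_feats, aligned_audio))
shuffled_loss = contrastive_loss(similarity_matrix(image_feats, shuffled_audio))
```

Under these assumptions the loss is lower when each image sits next to its own speech feature than when the pairing is shuffled, which is the direction a parameter update would push the two encoders.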

Type: Grant

Filed: March 31, 2021

Date of Patent: November 14, 2023

Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath

LEARNING DEVICE, LEARNING METHOD, LEARNING PROGRAM, RETRIEVAL DEVICE, RETRIEVAL METHOD, AND RETRIEVAL PROGRAM

Publication number: 20220319493

Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.

Type: Application

Filed: March 31, 2021

Publication date: October 6, 2022

Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Massachusetts Institute of Technology

Inventors: Yasunori OHISHI, Akisato KIMURA, Takahito KAWANISHI, Kunio KASHINO, James R. GLASS, David HARWATH

LEARNING DEVICE, LEARNING METHOD, AND LEARNING PROGRAM

Publication number: 20220319495

Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.

Type: Application

Filed: April 1, 2021

Publication date: October 6, 2022

Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Massachusetts Institute of Technology

Inventors: Yasunori OHISHI, Akisato KIMURA, Takahito KAWANISHI, Kunio KASHINO, James R. GLASS, David HARWATH
