Patents by Inventor Joseph Edward Roth

Joseph Edward Roth has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and Methods Using Person Recognizability Across a Network of Devices

Publication number: 20220254190

Abstract: The present disclosure is directed to computer-implemented systems and methods for performing recognition over a network of devices. In general, the systems and methods implement a machine-learned recognizability model that can process information such as a person's voice, facial characteristics, or similar information to determine a recognizability score without necessarily generating or storing biometric information that could be used to identify the person. The recognizability score can act as a proxy for the quality of the information as a reference for biometric recognition that can be performed on other devices in the network of devices. Thus a single device can be used to enroll a person in the network (e.g., by capturing a number of photographs of the person). Thereafter, connection to the other devices can utilize a sensor (e.g., a camera) on the other devices to compare features of the reference information to the input received by the sensor.

Type: Application

Filed: August 14, 2019

Publication date: August 11, 2022

Inventors: Andrew Gallagher, Joseph Edward Roth, Michael Christian Nechyba
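
As a rough illustration of the enrollment idea in the abstract above, the sketch below scores candidate captures and keeps only those that clear a threshold as reference material. Everything here is an assumption made for illustration: the `Capture` fields, the weights, and the threshold are hypothetical, and the fixed logistic function merely stands in for the machine-learned recognizability model, which the abstract does not specify.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    """Hypothetical per-photo quality features, each scaled to [0, 1]."""
    name: str
    sharpness: float
    face_fraction: float
    frontalness: float


def recognizability_score(capture: Capture) -> float:
    """Stand-in for the learned model: a fixed logistic over quality features.

    The weights are invented for illustration; a real model would be trained.
    No identity embedding is computed or stored, only a scalar quality score.
    """
    z = (4.0 * capture.sharpness
         + 3.0 * capture.face_fraction
         + 3.0 * capture.frontalness
         - 5.0)
    return 1.0 / (1.0 + math.exp(-z))


def select_references(captures, threshold=0.5):
    """Return names of captures whose score clears the threshold, best first."""
    scored = [(recognizability_score(c), c) for c in captures]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c.name for _, c in kept]


captures = [
    Capture("blurry_profile", sharpness=0.2, face_fraction=0.3, frontalness=0.1),
    Capture("sharp_frontal", sharpness=0.9, face_fraction=0.6, frontalness=0.9),
    Capture("small_face", sharpness=0.8, face_fraction=0.1, frontalness=0.7),
]
selected = select_references(captures)
print(selected)
```

In this toy setup the blurry profile shot is rejected, while the other two captures are kept as reference candidates in descending score order.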

Speaking classification using audio-visual data

Patent number: 10846522

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating predictions for whether a target person is speaking during a portion of a video. In one aspect, a method includes obtaining one or more images which each depict a mouth of a given person at a respective time point. The images are processed using an image embedding neural network to generate a latent representation of the images. Audio data corresponding to the images is processed using an audio embedding neural network to generate a latent representation of the audio data. The latent representation of the images and the latent representation of the audio data is processed using a recurrent neural network to generate a prediction for whether the given person is speaking.

Type: Grant

Filed: October 16, 2018

Date of Patent: November 24, 2020

Assignee: Google LLC

Inventors: Sourish Chaudhuri, Ondrej Klejch, Joseph Edward Roth
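
The data flow described in the abstract above (an image embedding, an audio embedding, and a recurrent network that consumes both) can be sketched in a few lines. This is a toy under stated assumptions, not the patented method: the two `embed_*` functions are hand-written stand-ins for the embedding neural networks, the per-frame inputs (`mouth_openness`, `energy`) are hypothetical scalar features, and all weights are fixed by hand rather than learned.

```python
import math


def embed_image(mouth_openness):
    """Stand-in image embedding: maps one hypothetical feature to a 2-vector."""
    return [mouth_openness, 1.0 - mouth_openness]


def embed_audio(energy):
    """Stand-in audio embedding: maps one hypothetical feature to a 2-vector."""
    return [energy, 1.0 - energy]


def rnn_step(hidden, inputs, w_in, w_rec):
    """One Elman-style update: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    return [
        math.tanh(
            sum(w * x for w, x in zip(w_in[i], inputs))
            + sum(w * h for w, h in zip(w_rec[i], hidden))
        )
        for i in range(len(hidden))
    ]


def speaking_probability(frames, w_in, w_rec, w_out):
    """Run the recurrent unit over (mouth, audio) frames and squash to [0, 1]."""
    hidden = [0.0] * len(w_rec)
    for mouth_openness, energy in frames:
        # Concatenate the two latent representations before the recurrent step.
        fused = embed_image(mouth_openness) + embed_audio(energy)
        hidden = rnn_step(hidden, fused, w_in, w_rec)
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


# Hand-picked weights (assumption): 2 hidden units, 4 fused input features.
W_IN = [[1.0, -1.0, 1.0, -1.0], [0.5, -0.5, 0.5, -0.5]]
W_REC = [[0.5, 0.0], [0.0, 0.5]]
W_OUT = [2.0, 2.0]

p_speaking = speaking_probability([(0.9, 0.8), (0.8, 0.9)], W_IN, W_REC, W_OUT)
p_silent = speaking_probability([(0.1, 0.1), (0.2, 0.1)], W_IN, W_REC, W_OUT)
print(p_speaking, p_silent)
```

With these weights, frames showing an open mouth together with high audio energy push the prediction above 0.5, and frames showing a closed mouth with low energy push it below 0.5. A trained system would learn both the embeddings and the recurrent weights from labeled video.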

SPEAKING CLASSIFICATION USING AUDIO-VISUAL DATA

Publication number: 20200117887

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating predictions for whether a target person is speaking during a portion of a video. In one aspect, a method includes obtaining one or more images which each depict a mouth of a given person at a respective time point. The images are processed using an image embedding neural network to generate a latent representation of the images. Audio data corresponding to the images is processed using an audio embedding neural network to generate a latent representation of the audio data. The latent representation of the images and the latent representation of the audio data is processed using a recurrent neural network to generate a prediction for whether the given person is speaking.

Type: Application

Filed: October 16, 2018

Publication date: April 16, 2020

Inventors: Sourish Chaudhuri, Ondrej Klejch, Joseph Edward Roth
