Patents by Inventor Manoj Plakal

Manoj Plakal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and Methods for Upmixing Audiovisual Data

Publication number: 20250234150

Abstract: A computer-implemented method for upmixing audiovisual data can include obtaining audiovisual data including input audio data and video data accompanying the input audio data. Each frame of the video data can depict only a portion of a larger scene. The input audio data can have a first number of audio channels. The computer-implemented method can include providing the audiovisual data as input to a machine-learned audiovisual upmixing model. The audiovisual upmixing model can include a sequence-to-sequence model configured to model a respective location of one or more audio sources within the larger scene over multiple frames of the video data. The computer-implemented method can include receiving upmixed audio data from the audiovisual upmixing model.

Type: Application

Filed: April 7, 2025

Publication date: July 17, 2025

Inventors: Aren Jansen, Manoj Plakal, Dan Ellis, Shawn Hershey, Richard Channing Moore, III

Systems and methods for upmixing audiovisual data

Patent number: 12273697

Abstract: A computer-implemented method for upmixing audiovisual data can include obtaining audiovisual data including input audio data and video data accompanying the input audio data. Each frame of the video data can depict only a portion of a larger scene. The input audio data can have a first number of audio channels. The computer-implemented method can include providing the audiovisual data as input to a machine-learned audiovisual upmixing model. The audiovisual upmixing model can include a sequence-to-sequence model configured to model a respective location of one or more audio sources within the larger scene over multiple frames of the video data. The computer-implemented method can include receiving upmixed audio data from the audiovisual upmixing model. The upmixed audio data can have a second number of audio channels. The second number of audio channels can be greater than the first number of audio channels.

Type: Grant

Filed: August 26, 2020

Date of Patent: April 8, 2025

Assignee: GOOGLE LLC

Inventors: Aren Jansen, Manoj Plakal, Dan Ellis, Shawn Hershey, Richard Channing Moore, III

Systems and Methods for Upmixing Audiovisual Data

Publication number: 20230308823

Abstract: A computer-implemented method for upmixing audiovisual data can include obtaining audiovisual data including input audio data and video data accompanying the input audio data. Each frame of the video data can depict only a portion of a larger scene. The input audio data can have a first number of audio channels. The computer-implemented method can include providing the audiovisual data as input to a machine-learned audiovisual upmixing model. The audiovisual upmixing model can include a sequence-to-sequence model configured to model a respective location of one or more audio sources within the larger scene over multiple frames of the video data. The computer-implemented method can include receiving upmixed audio data from the audiovisual upmixing model. The upmixed audio data can have a second number of audio channels. The second number of audio channels can be greater than the first number of audio channels.

Type: Application

Filed: August 26, 2020

Publication date: September 28, 2023

Inventors: Aren Jansen, Manoj Plakal, Dan Ellis, Shawn Hershey, Richard Channing Moore, III

Unsupervised learning of semantic audio representations

Patent number: 11335328

Abstract: Methods are provided for generating training triplets that can be used to train multidimensional embeddings to represent the semantic content of non-speech sounds present in a corpus of audio recordings. These training triplets can be used with a triplet loss function to train the multidimensional embeddings such that the embeddings can be used to cluster the contents of a corpus of audio recordings, to facilitate a query-by-example lookup from the corpus, to allow a small number of manually-labeled audio recordings to be generalized, or to facilitate some other audio classification task. The triplet sampling methods may be used individually or collectively, and each represent a respective heuristic about the semantic structure of audio recordings.
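As a rough illustration of the triplet loss mentioned in the abstract above, the following minimal sketch uses the common hinge form over squared Euclidean distances. The abstract does not specify the loss formula, the distance measure, or a margin value, so the function names, the `margin` default, and the toy embeddings here are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss (assumed form, not taken from the patent).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values).
anchor = [0.0, 0.0]
well_separated = triplet_loss(anchor, [0.1, 0.0], [1.0, 0.0])  # constraint satisfied
violating = triplet_loss(anchor, [1.0, 0.0], [0.1, 0.0])       # constraint violated
```

In this sketch a triplet that already respects the margin contributes no loss, so training effort concentrates on triplets whose sampled "positive" is not yet closer to the anchor than the "negative".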

Type: Grant

Filed: October 26, 2018

Date of Patent: May 17, 2022

Assignee: Google LLC

Inventors: Aren Jansen, Manoj Plakal, Richard Channing Moore, Shawn Hershey, Ratheet Pandya, Ryan Rifkin, Jiayang Liu, Daniel Ellis

Unsupervised Learning of Semantic Audio Representations

Publication number: 20200349921

Abstract: Methods are provided for generating training triplets that can be used to train multidimensional embeddings to represent the semantic content of non-speech sounds present in a corpus of audio recordings. These training triplets can be used with a triplet loss function to train the multidimensional embeddings such that the embeddings can be used to cluster the contents of a corpus of audio recordings, to facilitate a query-by-example lookup from the corpus, to allow a small number of manually-labeled audio recordings to be generalized, or to facilitate some other audio classification task. The triplet sampling methods may be used individually or collectively, and each represent a respective heuristic about the semantic structure of audio recordings.

Type: Application

Filed: October 26, 2018

Publication date: November 5, 2020

Inventors: Aren Jansen, Manoj Plakal, Richard Channing Moore, Shawn Hershey, Ratheet Pandya, Ryan Rifkin, Jiayang Liu, Daniel Ellis

Systems and methods that leverage deep learning to selectively store audiovisual content

Patent number: 10372991

Abstract: Systems, methods, and devices for curating audiovisual content are provided. A mobile image capture device can be operable to capture one or more images; receive an audio signal; analyze at least a portion of the audio signal with a first machine-learned model to determine a first audio classifier label descriptive of an audio event; identify a first image associated with the first audio classifier label; analyze the first image with a second machine-learned model to determine a desirability of a scene depicted by the first image; and determine, based at least in part on the desirability of the scene depicted by the first image, whether to store a copy of the first image associated with the first audio classifier label in the non-volatile memory of the mobile image capture device or to discard the first image without storing a copy of the first image.

Type: Grant

Filed: April 3, 2018

Date of Patent: August 6, 2019

Assignee: Google LLC

Inventors: James Niemasik, Manoj Plakal

Querying multidimensional data with independent fact and dimension pipelines combined at query time

Patent number: 8620857

Abstract: Separate subsystems are dedicated to handle fact and dimension data storage and retrieval in an optimized manner. Each subsystem acquires, processes, and stores its data separately in a manner appropriate to the characteristics of that data. A query engine combines the data from each subsystem at query time. When a user queries the system, the query engine interacts with each of the subsystems to fetch the data needed to generate a single result set.

Type: Grant

Filed: August 13, 2012

Date of Patent: December 31, 2013

Assignee: Google Inc.

Inventors: Benjamin Weinberger, Manoj Plakal, Will Robinson

Querying multidimensional data with independent fact and dimension pipelines combined at query time

Patent number: 8244667

Abstract: Separate subsystems are dedicated to handle fact and dimension data storage and retrieval in an optimized manner. Each subsystem acquires, processes, and stores its data separately in a manner appropriate to the characteristics of that data. A query engine combines the data from each subsystem at query time. When a user queries the system, the query engine interacts with each of the subsystems to fetch the data needed to generate a single result set.

Type: Grant

Filed: October 18, 2007

Date of Patent: August 14, 2012

Assignee: Google Inc.

Inventors: Benjamin Weinberger, Manoj Plakal, Will Robinson
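
The query-time combination described in the abstract above can be pictured with a small sketch. It assumes two independently maintained in-memory stores and a single aggregate query; the store contents, the `campaign_id` key, and the function name are hypothetical and are not drawn from the patent.

```python
from collections import defaultdict

# Hypothetical fact store: numeric measurements keyed by a dimension id.
FACTS = [
    {"campaign_id": 1, "clicks": 10},
    {"campaign_id": 2, "clicks": 5},
    {"campaign_id": 1, "clicks": 7},
]

# Hypothetical dimension store: descriptive attributes, loaded and
# updated separately from the facts.
DIMENSIONS = {
    1: {"name": "spring_sale"},
    2: {"name": "newsletter"},
}


def query_clicks_by_name(facts, dimensions):
    """Combine the two stores at query time into a single result set."""
    totals = defaultdict(int)
    for row in facts:
        # The dimension lookup happens only when the query runs, so a
        # renamed campaign is reflected without rewriting fact rows.
        name = dimensions[row["campaign_id"]]["name"]
        totals[name] += row["clicks"]
    return dict(totals)


result = query_clicks_by_name(FACTS, DIMENSIONS)
```

Because the join is deferred to query time in this sketch, each store can be written in whatever form suits its data, and neither needs to be rebuilt when the other changes.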