Patents by Inventor Sandra SAJEEV

Sandra SAJEEV has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240404279
    Abstract: A classifier model is trained for temporal action localization of video clips. A training video clip that includes actions of interest for identification is ingested into the classifier model. Action characteristics within frames of the video clip are identified. The actions correspond to known action classes. An actionness score is determined for each of the frames based upon the action characteristics identified within each of the frames. Class activation sequence (CAS) scores are determined for sequences of the frames based upon a presence or an absence of the action characteristics identified within each of the frames. Base confidence predictions of temporal locations of actions of interest within the video clip are produced by correlating each of the actionness scores with the corresponding CAS scores for each of the frames in the sequences of frames.
    Type: Application
    Filed: May 30, 2023
    Publication date: December 5, 2024
    Inventors: Gaurav MITTAL, Ye YU, Matthew Brigham HALL, Sandra SAJEEV, Mei CHEN, Mamshad Nayeem RIZVE
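The abstract above can be illustrated with a minimal sketch of how per-frame actionness scores and class activation sequence (CAS) scores combine into base confidence predictions. This is an illustrative reconstruction, not the patented implementation: the random frame features, the linear score heads (`w_action`, `w_cas`), and the threshold are all hypothetical stand-ins for learned components.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical setup: T frames, each with a D-dim feature, and C known action classes.
T, D, C = 16, 8, 4
frame_features = rng.normal(size=(T, D))

# Stand-in linear heads; in the described system these would be learned by the classifier.
w_action = rng.normal(size=(D,))   # actionness head
w_cas = rng.normal(size=(D, C))    # class activation head

# Per-frame actionness score: how likely each frame is to contain any action at all.
actionness = sigmoid(frame_features @ w_action)   # shape (T,)

# Class activation sequence (CAS): per-frame, per-class activation scores.
cas = softmax(frame_features @ w_cas, axis=-1)    # shape (T, C)

# Base confidence predictions: correlate each frame's actionness score
# with its corresponding CAS scores, as the abstract describes.
base_confidence = actionness[:, None] * cas       # shape (T, C)

# Temporal localization sketch: threshold per-class confidence to propose
# which frames belong to which action class.
threshold = 0.2
proposals = base_confidence > threshold           # boolean mask, shape (T, C)
```

Because the actionness term is shared across classes, a frame with low actionness suppresses all of its class activations, which is the intuition behind correlating the two scores.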
  • Publication number: 20240290081
    Abstract: A computerized method trains and uses a multimodal fusion transformer (MFT) model for content moderation. Language modality data and vision modality data associated with a multimodal media source are received. Language embeddings are generated from the language modality data and vision embeddings are generated from the vision modality data, each using modality-specific operations and/or processes. The language embeddings and vision embeddings are combined into combined embeddings, and the MFT model uses those combined embeddings to generate a language semantic output token, a vision semantic output token, and a combined semantic output token. Contrastive loss data is generated from the three semantic output tokens, and the MFT model is adjusted using that contrastive loss data. Once the MFT model is sufficiently trained, it is configured to perform content moderation operations using semantic output tokens.
    Type: Application
    Filed: February 28, 2023
    Publication date: August 29, 2024
    Inventors: Ye YU, Gaurav MITTAL, Matthew Brigham HALL, Sandra SAJEEV, Mei CHEN, Jialin YUAN
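The training loop described in the abstract above can be sketched at a high level: modality-specific embeddings are fused, semantic output tokens are produced, and a contrastive loss pulls matched tokens together. This is a hypothetical illustration, not the patented MFT architecture: the mean-pooling plus shared projection stands in for the fusion transformer's attention layers, and the single-pair cosine loss stands in for a full contrastive objective with negatives.

```python
import numpy as np

rng = np.random.default_rng(1)

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Hypothetical dimensions: text token count, vision patch count, shared embedding size.
n_text, n_vision, d = 6, 4, 8
text_tokens = rng.normal(size=(n_text, d))      # language modality embeddings
vision_tokens = rng.normal(size=(n_vision, d))  # vision modality embeddings

# Combine both modalities into one sequence for fusion. The shared projection
# w_fuse is a stand-in for the fusion transformer's learned layers.
combined = np.concatenate([text_tokens, vision_tokens], axis=0)
w_fuse = rng.normal(size=(d, d))

# The three semantic output tokens named in the abstract.
lang_token = l2_normalize(text_tokens.mean(axis=0) @ w_fuse)
vision_token = l2_normalize(vision_tokens.mean(axis=0) @ w_fuse)
combined_token = l2_normalize(combined.mean(axis=0) @ w_fuse)

def contrastive_loss(a, b, temperature=0.07):
    """Contrastive-loss sketch for a single positive pair: negative scaled
    cosine similarity. A real objective (e.g. InfoNCE) would also include
    in-batch negatives."""
    return -float(a @ b) / temperature

# Loss data from the semantic output tokens; the model would be adjusted
# (gradient step) to reduce this value during training.
loss = contrastive_loss(lang_token, vision_token)
```

After training, the same semantic output tokens can be scored against moderation categories, which is how the trained model would perform the content moderation operations the abstract mentions.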