Patents Assigned to Vionlabs AB
  • Publication number: 20240071053
    Abstract: Systems and methods for video representation learning using triplet training are provided. The system receives a video file and extracts features associated with the video file such as video features, audio features, and valence-arousal-dominance (VAD) features. The system processes the video features, audio features, and VAD features using a hierarchical attention network to generate a video embedding, an audio embedding, and a VAD embedding, respectively. The system concatenates the video embedding, the audio embedding, and the VAD embedding to create a concatenated embedding. The system processes the concatenated embedding using a non-local attention network to generate a fingerprint associated with the video file. The system then processes the fingerprint to generate one or more of a mood prediction, a genre prediction, and a keyword prediction.
    Type: Application
    Filed: August 23, 2023
    Publication date: February 29, 2024
    Applicant: Vionlabs AB
    Inventors: Alden Coots, Rithika Harish Kumar, Paula Diaz Benet, Marcus Bergström
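
The data flow described in the abstract above can be illustrated with a small sketch. This is not the patented implementation: every function name, the fixed query vectors standing in for learned parameters, the segment length, and the choice to treat each modality embedding as one position for the non-local step are assumptions made for illustration only. Triplet training and the mood, genre, and keyword prediction heads are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, query):
    """Attention-weighted average of vectors, scored against a query vector."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def hierarchical_embed(frames, query, segment_len=4):
    """Two-level pooling (assumed): frames -> segment vectors -> one embedding."""
    segments = [frames[i:i + segment_len]
                for i in range(0, len(frames), segment_len)]
    segment_vectors = [attention_pool(seg, query) for seg in segments]
    return attention_pool(segment_vectors, query)


def non_local_mix(chunks):
    """Self-attention in which every chunk attends to all chunks (assumed form)."""
    scale = math.sqrt(len(chunks[0]))
    mixed = []
    for q in chunks:
        weights = softmax([dot(q, k) / scale for k in chunks])
        context = [sum(w * k[i] for w, k in zip(weights, chunks))
                   for i in range(len(q))]
        mixed.append([a + b for a, b in zip(q, context)])  # residual connection
    # Flatten back into a single fingerprint vector.
    return [x for chunk in mixed for x in chunk]


def fingerprint(video, audio, vad, queries, segment_len=4):
    """Embed each modality, then mix the three embeddings into a fingerprint.

    The output has the same length as the concatenation of the three
    embeddings; each embedding is treated as one attention position.
    """
    embeddings = [hierarchical_embed(frames, query, segment_len)
                  for frames, query in zip((video, audio, vad), queries)]
    return non_local_mix(embeddings)
```

With three feature streams of dimension 8, for example, `fingerprint` returns a vector of length 24. In a trained system the query vectors and attention projections would be learned rather than fixed, and the fingerprint would then be passed to the prediction heads.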