Patents by Inventor Josef Sivic

Josef Sivic has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

CUSTOMIZING MOTION AND APPEARANCE IN VIDEO GENERATION

Publication number: 20250142182

Abstract: Systems and methods include generating synthetic videos based on a custom motion. A video generation system obtains a text prompt including an object and a custom motion token. The custom motion token represents a custom motion. The system encodes the text prompt to obtain a text embedding. Subsequently, a video generation model generates, based on the text embedding, a synthetic video depicting the object performing the custom motion.

Type: Application

Filed: February 22, 2024

Publication date: May 1, 2025

Inventors: Joanna Irena Materzynska, Richard Zhang, Elya Shechtman, Josef Sivic, Bryan Christopher Russell

Learning to Personalize Vision-Language Models through Meta-Personalization

Publication number: 20240419726

Abstract: Techniques for learning to personalize vision-language models through meta-personalization are described. In one embodiment, one or more processing devices lock a pre-trained vision-language model (VLM) during a training phase. The processing devices train the pre-trained VLM to augment a text encoder of the pre-trained VLM with a set of general named video instances to form a meta-personalized VLM, the meta-personalized VLM to include global category features. The processing devices test the meta-personalized VLM to adapt the text encoder with a set of personal named video instances to form a personal VLM, the personal VLM comprising the global category features personalized with a set of personal instance weights to form a personal instance token associated with the user. Other embodiments are described and claimed.

Type: Application

Filed: June 15, 2023

Publication date: December 19, 2024

Applicant: Adobe Inc.

Inventors: Simon Jenni, Fabian David Caba Heilbron, Chun-Hsiao Yeh, Bryan Russell, Josef Sivic

NATURAL LANGUAGE-GUIDED MUSIC AUDIO RECOMMENDATION FOR VIDEO USING MACHINE LEARNING

Publication number: 20240386048

Abstract: Embodiments are disclosed for an audio recommendation system trained to recommend music audio sequences for pairing with query video sequences using neural networks. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including a query video sequence and natural language text. The disclosed systems and methods further comprise generating a fused visual-text embedding based on a visual embedding and a text embedding corresponding to the input. The disclosed systems and methods further comprise comparing audio embeddings for music audio sequences of a music audio sequences database with the fused visual-text embedding. The disclosed systems and methods further comprise determining a music audio sequence from the music audio sequences database as the recommended music audio sequence for pairing with the query video sequence based on a similarity metric calculated between an audio embedding for the music audio sequence and the fused visual-text embedding.

Type: Application

Filed: May 17, 2023

Publication date: November 21, 2024

Applicant: Adobe Inc.

Inventors: Bryan Russell, Justin Salamon, Daniel McKee, Josef Sivic
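
The retrieval step described in the abstract above (fuse a visual and a text embedding, then pick the music track whose audio embedding is most similar) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the weighted-average fusion, the `text_weight` parameter, cosine similarity as the similarity metric, and the toy three-dimensional embeddings. The application describes neural networks producing these embeddings, which this sketch does not attempt to reproduce.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def fuse(visual, text, text_weight=0.5):
    """Hypothetical fusion: weighted average of the two normalized embeddings.

    Assumes both embeddings have the same dimensionality.
    """
    v = l2_normalize(visual)
    t = l2_normalize(text)
    mixed = [(1 - text_weight) * a + text_weight * b for a, b in zip(v, t)]
    return l2_normalize(mixed)


def cosine(a, b):
    """Cosine similarity, used here as an assumed similarity metric."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def recommend(visual, text, audio_db, text_weight=0.5):
    """Return the name of the track whose audio embedding best matches the query."""
    query = fuse(visual, text, text_weight)
    return max(audio_db, key=lambda name: cosine(query, audio_db[name]))


# Toy database of audio embeddings (made-up values, not real model outputs).
audio_db = {
    "calm_piano": [1.0, 0.0, 0.0],
    "upbeat_rock": [0.0, 1.0, 0.0],
    "ambient": [0.0, 0.0, 1.0],
}

best = recommend(visual=[0.2, 0.9, 0.1], text=[0.0, 1.0, 0.0], audio_db=audio_db)
print(best)  # upbeat_rock
```

Raising `text_weight` in this sketch lets the natural language input steer the result further away from what the video alone would select.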

Object retrieval

Publication number: 20050225678

Abstract: A method of identifying a user-specified object contained in one or more images of a plurality of images that comprises the steps of defining regions of objects in the images, and computing a vector in respect of each of the regions based on the appearance of the respective region. The vector comprises a descriptor. The method further comprises vector quantizing the descriptors into clusters, storing the clusters as an index with the images in which they occur, defining regions of the user-specified object, computing a vector in respect of each of said regions based on the appearance of the regions, and vector quantizing the descriptors into one or more clusters. The index is searched and the clusters are compared with the contents of the index to identify which of the images contain the clusters so as to return the images containing the user-defined object.

Type: Application

Filed: April 8, 2004

Publication date: October 13, 2005

Inventors: Andrew Zisserman, Frederik Schaffalitzky, Josef Sivic
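
The indexing and search steps in the abstract above (quantize region descriptors into clusters, store the clusters in an index alongside the images in which they occur, then look up the clusters of the query object) can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the cluster centroids are assumed to be precomputed (for example by k-means), descriptors are toy two-dimensional vectors, and ranking images by the number of shared clusters is a simplification chosen for this example.

```python
from collections import defaultdict


def nearest_cluster(descriptor, centroids):
    """Vector-quantize a descriptor: return the index of the closest centroid."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(descriptor, centroid))
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))


def build_index(image_descriptors, centroids):
    """Map each cluster id to the set of images in which it occurs."""
    index = defaultdict(set)
    for image_id, descriptors in image_descriptors.items():
        for descriptor in descriptors:
            index[nearest_cluster(descriptor, centroids)].add(image_id)
    return index


def search(query_descriptors, index, centroids):
    """Return images sharing clusters with the query, most shared clusters first."""
    query_clusters = {nearest_cluster(d, centroids) for d in query_descriptors}
    votes = defaultdict(int)
    for cluster_id in query_clusters:
        for image_id in index.get(cluster_id, ()):
            votes[image_id] += 1
    # Sort by descending vote count, then by image id for a stable order.
    return sorted(votes, key=lambda image_id: (-votes[image_id], image_id))


# Hypothetical centroids and region descriptors (made-up values).
centroids = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
image_descriptors = {
    "frame_a": [[0.5, 0.2], [9.8, 10.1]],
    "frame_b": [[0.1, 9.7]],
    "frame_c": [[10.2, 9.9], [0.3, 9.5]],
}

index = build_index(image_descriptors, centroids)
results = search([[9.9, 9.9], [0.2, 0.1]], index, centroids)
print(results)  # ['frame_a', 'frame_c']
```

Because the lookup touches only the index entries for the query's clusters, the cost of a search in this sketch depends on how many images share those clusters rather than on the size of the whole collection.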
