Patents by Inventor Deepali Aneja
Deepali Aneja has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12347135
Abstract: Embodiments are disclosed for generating a gesture reenactment video sequence corresponding to a target audio sequence using a trained network based on a video motion graph generated from a reference speech video. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a first input including a reference speech video and generating a video motion graph representing the reference speech video, where each node is associated with a frame of the reference video sequence and reference audio features of the reference audio sequence. The disclosed systems and methods further comprise receiving a second input including a target audio sequence, generating target audio features, identifying a node path through the video motion graph based on the target audio features and the reference audio features, and generating an output media sequence based on the identified node path through the video motion graph paired with the target audio sequence.
Type: Grant
Filed: November 14, 2022
Date of Patent: July 1, 2025
Assignee: Adobe Inc.
Inventors: Yang Zhou, Jimei Yang, Jun Saito, Dingzeyu Li, Deepali Aneja
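The core of this method is the node-path search over the motion graph. Below is a minimal Viterbi-style sketch of that idea, assuming nodes carry per-frame reference audio features and a precomputed set of jump edges between visually compatible frames; the feature distance, jump penalty, and graph construction are illustrative assumptions, not the patented procedure.

```python
import numpy as np

def find_node_path(ref_feats, target_feats, jumps, jump_cost=0.5):
    """ref_feats: (N, D) per-node reference audio features.
    target_feats: (T, D) target audio features, one per output frame.
    jumps: dict mapping node index -> list of node indices reachable by a
    graph jump (in addition to the implicit next-frame edge)."""
    n, t = len(ref_feats), len(target_feats)
    # Precompute incoming jump edges for each node.
    incoming = {j: [p for p, outs in jumps.items() if j in outs] for j in range(n)}
    cost = np.full((t, n), np.inf)   # cost[i, j]: best cost ending at node j
    back = np.zeros((t, n), dtype=int)
    cost[0] = np.linalg.norm(ref_feats - target_feats[0], axis=1)
    for i in range(1, t):
        match = np.linalg.norm(ref_feats - target_feats[i], axis=1)
        for j in range(n):
            preds = ([j - 1] if j > 0 else []) + incoming[j]
            for p in preds:
                extra = 0.0 if p == j - 1 else jump_cost  # penalize jumps
                c = cost[i - 1, p] + extra + match[j]
                if c < cost[i, j]:
                    cost[i, j], back[i, j] = c, p
    path = [int(np.argmin(cost[-1]))]  # backtrack the cheapest full path
    for i in range(t - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]
```

Each returned index picks a reference frame for one step of the target audio; how the trained network mentioned in the abstract synthesizes output around the chosen path is beyond this sketch.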
-
Patent number: 12333634
Abstract: Generating a vector representation of a hand-drawn sketch is described. To do so, the sketch is segmented into different superpixel regions. Superpixels are grown by distributing superpixel seeds throughout an image of the sketch and assigning unassigned pixels to a neighboring superpixel based on pixel value differences. The border between each pair of adjacent superpixels is then classified as either an active or an inactive boundary, with active boundaries indicating that the border corresponds to a salient sketch stroke. Vector paths are generated by traversing edges between pixel vertices along the active boundaries. To minimize the number of vector paths included in the vector representation, vector paths are greedily generated first for longer curves along active boundaries until each edge is assigned to a vector path. Regions encompassed by vector paths corresponding to a foreground superpixel are filled to produce a high-fidelity vector representation of the sketch.
Type: Grant
Filed: November 4, 2021
Date of Patent: June 17, 2025
Assignee: Adobe Inc.
Inventors: Ashwani Chandil, Vineet Batra, Matthew David Fisher, Deepali Aneja, Ankit Phogat
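The region-growing step lends itself to a compact illustration. The sketch below does seeded region growing on a grayscale image, assigning each pixel to the neighboring superpixel with the smallest intensity difference, so growth stalls at strong edges (the candidate "active" boundaries). Seed spacing, the 4-neighborhood, and grayscale input are simplifying assumptions.

```python
import heapq
import numpy as np

def grow_superpixels(img, spacing=8):
    """img: 2-D grayscale float array. Returns an integer superpixel label map."""
    h, w = img.shape
    labels = -np.ones((h, w), dtype=int)
    heap, k = [], 0  # heap entries: (intensity difference, y, x, superpixel id)
    # Distribute superpixel seeds on a regular grid across the image.
    for y in range(spacing // 2, h, spacing):
        for x in range(spacing // 2, w, spacing):
            heapq.heappush(heap, (0.0, y, x, k))
            k += 1
    # Grow regions cheapest-difference-first: each unassigned pixel joins the
    # neighboring superpixel whose adjacent pixel value is closest to its own.
    while heap:
        _, y, x, k = heapq.heappop(heap)
        if labels[y, x] >= 0:
            continue  # already claimed via a smaller difference
        labels[y, x] = k
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] < 0:
                diff = abs(float(img[ny, nx]) - float(img[y, x]))
                heapq.heappush(heap, (diff, ny, nx, k))
    return labels
```

Borders between adjacent labels in the returned map are what the abstract's later steps classify as active or inactive and trace into vector paths.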
-
Publication number: 20250168442
Abstract: Embodiments of the present disclosure provide a method, a system, and a computer storage medium that provide mechanisms for multimedia effect addition and editing support for text-based video editing tools. The method includes generating a user interface (UI) displaying a transcript of an audio track of a video and receiving, via the UI, input identifying selection of a text segment from the transcript. The method also includes receiving, via the UI, input identifying selection of a particular type of text stylization or layout for application to the text segment. In response, the method further includes identifying a video effect corresponding to the particular type of text stylization or layout, applying the video effect to a video segment corresponding to the text segment, and applying the particular type of text stylization or layout to the text segment to visually represent the video effect in the transcript.
Type: Application
Filed: January 21, 2025
Publication date: May 22, 2025
Inventors: Kim Pascal PIMMEL, Stephen Joseph DIVERDI, Jiaju MA, Rubaiat HABIB, Li-Yi WEI, Hijung SHIN, Deepali ANEJA, John G. NELSON, Wilmot LI, Dingzeyu LI, Lubomira Assenova DONTCHEVA, Joel Richard BRANDT
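Because every transcript word carries timestamps, mapping a text selection to the video span it narrates is straightforward bookkeeping. A minimal sketch, assuming a hypothetical Word record and word-index selections (neither taken from the application):

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # seconds into the video
    end: float

def selection_to_time_range(words, sel_start, sel_end):
    """Map a [sel_start, sel_end) word-index selection to (t0, t1) in seconds;
    an effect tied to the selected text is applied over that video span."""
    chosen = words[sel_start:sel_end]
    return chosen[0].start, chosen[-1].end

words = [Word("welcome", 0.0, 0.4), Word("to", 0.4, 0.5),
         Word("the", 0.5, 0.6), Word("show", 0.6, 1.1)]
t0, t1 = selection_to_time_range(words, 0, 2)  # user selects "welcome to"
print(f"apply the chosen stylization's video effect over {t0:.1f}s-{t1:.1f}s")
```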
-
Publication number: 20250140292
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for cutting down a user's longer input video into an edited video comprising the most important video segments and applying corresponding video effects. Some embodiments of the present invention are directed to adding face-aware scale magnification to the trimmed video (e.g., applying scale magnification to simulate a camera zoom effect that hides shot cuts with respect to the subject's face). For example, as the trimmed video transitions from one video segment to the next, a scale magnification may be applied that zooms in on a detected face at a boundary between the video segments to smooth the transition.
Type: Application
Filed: February 2, 2024
Publication date: May 1, 2025
Inventors: Anh Lan TRUONG, Deepali ANEJA, Hijung SHIN, Rubaiat HABIB, Jakub FISER, Kishore RADHAKRISHNA, Joel Richard BRANDT, Matthew David FISHER, Zeyu JIN, Kim Pascal PIMMEL, Wilmot LI, Lubomira Assenova DONTCHEVA
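The zoom-to-hide-a-cut idea can be sketched directly: ramp a scale factor up toward the cut and back down after it, always scaling about the detected face center. The zoom amount, ramp length, and nearest-neighbor resampling here are illustrative choices, not values from the application.

```python
import numpy as np

def zoom_about_face(frame, face_center, scale):
    """Scale a frame about (cy, cx) by cropping and nearest-neighbor resizing."""
    h, w = frame.shape[:2]
    cy, cx = face_center
    ch, cw = int(h / scale), int(w / scale)
    y0 = int(np.clip(cy - ch // 2, 0, h - ch))
    x0 = int(np.clip(cx - cw // 2, 0, w - cw))
    crop = frame[y0:y0 + ch, x0:x0 + cw]
    ys = np.arange(h) * ch // h  # nearest-neighbor index maps back to full size
    xs = np.arange(w) * cw // w
    return crop[ys][:, xs]

def apply_cut_zoom(frames, cut_index, face_center, max_scale=1.15, window=12):
    """Ramp the magnification up approaching the cut and back down after it."""
    out = []
    for i, frame in enumerate(frames):
        t = max(0.0, 1.0 - abs(i - cut_index) / window)
        out.append(zoom_about_face(frame, face_center, 1.0 + (max_scale - 1.0) * t))
    return out
```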
-
Publication number: 20250139161
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for cutting down a user's longer input video into an edited video comprising the most important video segments and applying corresponding video effects. Some embodiments of the present invention are directed to adding captioning video effects to the trimmed video (e.g., applying face-aware and non-face-aware captioning to emphasize extracted video segment headings, important sentences, quotes, words of interest, extracted lists, etc.). For example, a prompt is provided to a generative language model to identify portions of a transcript (e.g., extracted scene summaries, important sentences, lists of items discussed in the video, etc.) to apply to corresponding video segments as captions, depending on the type of caption (e.g., an extracted heading may be captioned at the start of a corresponding video segment, while important sentences and/or extracted list items may be captioned when they are spoken).
Type: Application
Filed: February 2, 2024
Publication date: May 1, 2025
Inventors: Deepali ANEJA, Zeyu JIN, Hijung SHIN, Anh Lan TRUONG, Dingzeyu LI, Hanieh DEILAMSALEHY, Rubaiat HABIB, Matthew David FISHER, Kim Pascal PIMMEL, Wilmot LI, Lubomira Assenova DONTCHEVA
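The prompt-then-schedule flow can be outlined in a few lines. In this sketch, `complete` stands in for any text-completion endpoint, and the prompt wording, JSON schema, and scheduling rules are invented for illustration; none are taken from the application.

```python
import json

PROMPT = (
    "From the transcript below, return JSON with keys 'headings', "
    "'important_sentences', and 'list_items', each a list of exact quotes.\n\n"
    "Transcript:\n{transcript}"
)

def extract_captions(transcript, complete):
    """complete: callable str -> str, wrapping a generative language model."""
    return json.loads(complete(PROMPT.format(transcript=transcript)))

def schedule_captions(captions, segments):
    """segments: list of (segment_text, t_start, t_end) for the trimmed video.
    Headings are captioned at the start of their segment; important sentences
    and list items when the segment containing them plays."""
    events = []
    for text, t0, t1 in segments:
        for h in captions.get("headings", []):
            if h in text:
                events.append(("heading", h, t0))
        for s in captions.get("important_sentences", []) + captions.get("list_items", []):
            if s in text:
                events.append(("emphasis", s, t0))  # word timings would refine this
    return sorted(events, key=lambda e: e[2])
```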
-
Publication number: 20250111695
Abstract: In implementations of techniques for template-based behaviors in machine learning, a computing device implements a template system to receive a digital video and data executable to generate animated content. The template system determines a location within a frame of the digital video to place the animated content using a machine learning model. The template system then renders the animated content within the frame of the digital video at the location determined by the machine learning model and displays the rendered animated content within the frame of the digital video in a user interface.
Type: Application
Filed: December 18, 2023
Publication date: April 3, 2025
Applicant: Adobe Inc.
Inventors: Wilmot Wei-Mau Li, Li-Yi Wei, Cuong D. Nguyen, Jakub Fiser, Hijung Shin, Stephen Joseph DiVerdi, Seth John Walker, Kazi Rubaiat Habib, Deepali Aneja, David Gilliaert Werner, Erica K. Schisler
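As a stand-in for the learned placement step, the sketch below scores a grid of candidate anchor boxes and picks the visually quietest one, the kind of signal a placement model could learn; the actual model and features are not described in the abstract, so treat this purely as an assumption-laden illustration.

```python
import numpy as np

def choose_placement(frame_gray, box_hw, stride=32):
    """frame_gray: 2-D grayscale frame. box_hw: (height, width) of the
    animated content. Returns the top-left (y, x) of the candidate region
    with the lowest intensity variance, i.e. the least busy spot."""
    h, w = frame_gray.shape
    bh, bw = box_hw
    best, best_score = (0, 0), np.inf
    for y in range(0, h - bh + 1, stride):
        for x in range(0, w - bw + 1, stride):
            score = float(frame_gray[y:y + bh, x:x + bw].var())
            if score < best_score:
                best, best_score = (y, x), score
    return best

# The template system would then render the animated content with its
# top-left corner at the returned location and composite it into the frame.
```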
-
Publication number: 20250078380
Abstract: In various examples, a video effect is displayed in a live video stream in response to determining that a portion of an audio stream of the live video stream corresponds to a text segment of a script associated with the video effect and detecting performance of a gesture. For example, during presentation of the script, the audio stream is obtained to determine if a portion of the audio stream corresponds to the text segment. In response to the portion of the audio stream corresponding to the text segment, performance of a gesture is detected and the video effect is caused to be displayed.
Type: Application
Filed: August 31, 2023
Publication date: March 6, 2025
Inventors: Yining CAO, Stefano PETRANGELI, Li-Yi WEI, Rubaiat HABIB, Deepali ANEJA, Balaji Vasan SRINIVASAN, Haijun XIA
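The trigger logic combines two conditions: the script segment has been spoken, and the associated gesture has been performed. A bare-bones sketch, where the fuzzy string matcher, sliding word window, and boolean gesture signal are all simplifications invented for illustration:

```python
from difflib import SequenceMatcher

class EffectTrigger:
    def __init__(self, segment_text, threshold=0.8):
        self.segment = segment_text.lower()
        self.threshold = threshold
        self.spoken = False
        self.window = []  # recent transcribed words from the live audio

    def on_words(self, words):
        """Feed newly transcribed words; mark the segment once it is heard."""
        self.window.extend(w.lower() for w in words)
        self.window = self.window[-2 * len(self.segment.split()):]
        ratio = SequenceMatcher(None, " ".join(self.window), self.segment).ratio()
        if ratio >= self.threshold:
            self.spoken = True

    def should_fire(self, gesture_detected):
        """Display the effect only after the text segment was spoken AND the
        associated gesture is detected."""
        return self.spoken and gesture_detected

trigger = EffectTrigger("let me show you the results")
trigger.on_words(["let", "me", "show", "you", "the", "results"])
print(trigger.should_fire(gesture_detected=True))  # True
```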
-
Patent number: 12206930
Abstract: Embodiments of the present disclosure provide a method, a system, and a computer storage medium that provide mechanisms for multimedia effect addition and editing support for text-based video editing tools. The method includes generating a user interface (UI) displaying a transcript of an audio track of a video and receiving, via the UI, input identifying selection of a text segment from the transcript. The method also includes receiving, via the UI, input identifying selection of a particular type of text stylization or layout for application to the text segment. In response, the method further includes identifying a video effect corresponding to the particular type of text stylization or layout, applying the video effect to a video segment corresponding to the text segment, and applying the particular type of text stylization or layout to the text segment to visually represent the video effect in the transcript.
Type: Grant
Filed: January 13, 2023
Date of Patent: January 21, 2025
Assignee: Adobe Inc.
Inventors: Kim Pascal Pimmel, Stephen Joseph Diverdi, Jiaju MA, Rubaiat Habib, Li-Yi Wei, Hijung Shin, Deepali Aneja, John G. Nelson, Wilmot Li, Dingzeyu Li, Lubomira Assenova Dontcheva, Joel Richard Brandt
-
Publication number: 20250006226
Abstract: In various examples, a video effect is displayed in a live video stream in response to determining that a portion of an audio stream of the live video stream corresponds to a text segment of a script associated with the video effect. For example, during presentation of the script, the audio stream is obtained to determine if a portion of the audio stream corresponds to the text segment.
Type: Application
Filed: June 30, 2023
Publication date: January 2, 2025
Inventors: Deepali ANEJA, Rubaiat HABIB, Li-Yi WEI, Wilmot Wei-Mau LI, Stephen Joseph DIVERDI
-
Publication number: 20240244287
Abstract: Embodiments of the present disclosure provide a method, a system, and a computer storage medium that provide mechanisms for multimedia effect addition and editing support for text-based video editing tools. The method includes generating a user interface (UI) displaying a transcript of an audio track of a video and receiving, via the UI, input identifying selection of a text segment from the transcript. The method also includes receiving, via the UI, input identifying selection of a particular type of text stylization or layout for application to the text segment. In response, the method further includes identifying a video effect corresponding to the particular type of text stylization or layout, applying the video effect to a video segment corresponding to the text segment, and applying the particular type of text stylization or layout to the text segment to visually represent the video effect in the transcript.
Type: Application
Filed: January 13, 2023
Publication date: July 18, 2024
Inventors: Kim Pascal PIMMEL, Stephen Joseph DIVERDI, Jiaju MA, Rubaiat HABIB, Li-Yi WEI, Hijung SHIN, Deepali ANEJA, John G. NELSON, Wilmot LI, Dingzeyu LI, Lubomira Assenova DONTCHEVA, Joel Richard BRANDT
-
Publication number: 20240161335
Abstract: Embodiments are disclosed for generating a gesture reenactment video sequence corresponding to a target audio sequence using a trained network based on a video motion graph generated from a reference speech video. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a first input including a reference speech video and generating a video motion graph representing the reference speech video, where each node is associated with a frame of the reference video sequence and reference audio features of the reference audio sequence. The disclosed systems and methods further comprise receiving a second input including a target audio sequence, generating target audio features, identifying a node path through the video motion graph based on the target audio features and the reference audio features, and generating an output media sequence based on the identified node path through the video motion graph paired with the target audio sequence.
Type: Application
Filed: November 14, 2022
Publication date: May 16, 2024
Applicant: Adobe Inc.
Inventors: Yang ZHOU, Jimei YANG, Jun SAITO, Dingzeyu LI, Deepali ANEJA
-
Patent number: 11875442
Abstract: Embodiments are disclosed for articulated part extraction using images of animated characters from sprite sheets by a digital design system. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including a plurality of images depicting an animated character in different poses. The disclosed systems and methods further comprise, for each pair of images in the plurality of images, determining, by a first machine learning model, pixel correspondences between pixels of the pair of images, and determining, by a second machine learning model, pixel clusters representing the animated character, each pixel cluster corresponding to a different structural segment of the animated character. The disclosed systems and methods further comprise selecting a subset of clusters that reconstructs the different poses of the animated character. The disclosed systems and methods further comprise creating a rigged animated character based on the selected subset of clusters.
Type: Grant
Filed: May 31, 2022
Date of Patent: January 16, 2024
Assignee: Adobe Inc.
Inventors: Matthew David Fisher, Zhan Xu, Yang Zhou, Deepali Aneja, Evangelos Kalogerakis
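The cluster-selection step is a set-cover-style problem: keep the fewest clusters that still reconstruct every pose. A greedy sketch, scoring clusters by how many still-uncovered character pixels they explain across poses; the pixel-coverage objective and the 95% stopping heuristic are simplifications, not the patented criterion.

```python
import numpy as np

def select_parts(cluster_masks, character_masks, coverage=0.95):
    """cluster_masks: (K, P, H, W) bools, cluster k's pixels in pose p.
    character_masks: (P, H, W) bools, the full character in each pose.
    Returns indices of a small cluster subset covering the poses."""
    covered = np.zeros_like(character_masks, dtype=bool)
    chosen, remaining = [], set(range(len(cluster_masks)))
    target = int(character_masks.sum())
    while covered.sum() < coverage * target and remaining:
        # Gain: newly covered character pixels if we add this cluster.
        gains = {k: int((cluster_masks[k] & character_masks & ~covered).sum())
                 for k in remaining}
        best = max(gains, key=gains.get)
        if gains[best] == 0:
            break  # nothing left explains new pixels
        chosen.append(best)
        covered |= cluster_masks[best] & character_masks
        remaining.remove(best)
    return chosen
```

The selected clusters then become the structural segments from which the rigged character is assembled.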
-
Publication number: 20240005585
Abstract: Embodiments are disclosed for articulated part extraction using images of animated characters from sprite sheets by a digital design system. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including a plurality of images depicting an animated character in different poses. The disclosed systems and methods further comprise, for each pair of images in the plurality of images, determining, by a first machine learning model, pixel correspondences between pixels of the pair of images, and determining, by a second machine learning model, pixel clusters representing the animated character, each pixel cluster corresponding to a different structural segment of the animated character. The disclosed systems and methods further comprise selecting a subset of clusters that reconstructs the different poses of the animated character. The disclosed systems and methods further comprise creating a rigged animated character based on the selected subset of clusters.
Type: Application
Filed: May 31, 2022
Publication date: January 4, 2024
Inventors: Matthew David FISHER, Zhan XU, Yang ZHOU, Deepali ANEJA, Evangelos KALOGERAKIS
-
Patent number: 11682238
Abstract: Embodiments are disclosed for re-timing a video sequence to an audio sequence based on the detection of motion beats in the video sequence and audio beats in the audio sequence. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a first input, the first input including a video sequence, detecting motion beats in the video sequence, receiving a second input, the second input including an audio sequence, detecting audio beats in the audio sequence, modifying the video sequence by matching the detected motion beats in the video sequence to the detected audio beats in the audio sequence, and outputting the modified video sequence.
Type: Grant
Filed: February 12, 2021
Date of Patent: June 20, 2023
Assignee: Adobe Inc.
Inventors: Jimei Yang, Deepali Aneja, Dingzeyu Li, Jun Saito, Yang Zhou
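The matching step amounts to a piecewise-linear time warp: pair each motion beat with an audio beat, then resample the video's frame times through the warp. The sketch below assumes beat times are already detected and that the clip and audio share a duration; both are simplifications.

```python
import numpy as np

def retime_frames(num_frames, fps, motion_beats, audio_beats):
    """motion_beats, audio_beats: equal-length sorted lists of times (s).
    Returns, for each output frame, the index of the source frame to show."""
    duration = num_frames / fps
    # Anchor the warp at the clip boundaries plus each matched beat pair.
    src = np.array([0.0] + list(motion_beats) + [duration])
    dst = np.array([0.0] + list(audio_beats) + [duration])
    out_times = np.arange(num_frames) / fps
    # For each output time, interpolate the source time to sample, so motion
    # beats land exactly on their matched audio beats.
    src_times = np.interp(out_times, dst, src)
    return np.clip(np.round(src_times * fps).astype(int), 0, num_frames - 1)

# E.g. motion beats at 2.0s and 5.0s retimed onto audio beats at 2.5s and 4.5s:
frame_indices = retime_frames(240, 24.0, [2.0, 5.0], [2.5, 4.5])
```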
-
Patent number: 11663763
Abstract: A computer-implemented method includes receiving an input image at a first image stage and receiving a request to generate a plurality of variations of the input image at a second image stage. The method includes generating, using an auto-regressive generative deep learning model, the plurality of variations of the input image at the second image stage and outputting the plurality of variations of the input image at the second image stage.
Type: Grant
Filed: October 25, 2021
Date of Patent: May 30, 2023
Assignee: Adobe Inc.
Inventors: Matthew David Fisher, Vineet Batra, Sumit Dhingra, Praveen Kumar Dhanuka, Deepali Aneja, Ankit Phogat
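Sampling several variations from one auto-regressive model is mostly a matter of re-running the sampler with the same prefix. A generic sketch, where the tokenization of image stages and the `next_token_logits` interface are assumptions for illustration rather than details from the patent:

```python
import numpy as np

def sample_variations(next_token_logits, prefix, n_variations=4,
                      steps=16, temperature=0.9, seed=0):
    """next_token_logits: callable(list[int]) -> (V,) array of logits over a
    vocabulary of V tokens. prefix: tokens encoding the first-stage image."""
    rng = np.random.default_rng(seed)
    variations = []
    for _ in range(n_variations):
        tokens = list(prefix)  # every variation continues the same input
        for _ in range(steps):
            logits = np.asarray(next_token_logits(tokens)) / temperature
            probs = np.exp(logits - logits.max())  # stable softmax
            probs /= probs.sum()
            tokens.append(int(rng.choice(len(probs), p=probs)))
        variations.append(tokens)
    return variations  # each token list decodes to a second-stage variation
```

Temperature sampling is what makes the variations differ; greedy decoding would return the same continuation every time.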
-
Publication number: 20230137233
Abstract: Generating a vector representation of a hand-drawn sketch is described. To do so, the sketch is segmented into different superpixel regions. Superpixels are grown by distributing superpixel seeds throughout an image of the sketch and assigning unassigned pixels to a neighboring superpixel based on pixel value differences. The border between each pair of adjacent superpixels is then classified as either an active or an inactive boundary, with active boundaries indicating that the border corresponds to a salient sketch stroke. Vector paths are generated by traversing edges between pixel vertices along the active boundaries. To minimize the number of vector paths included in the vector representation, vector paths are greedily generated first for longer curves along active boundaries until each edge is assigned to a vector path. Regions encompassed by vector paths corresponding to a foreground superpixel are filled to produce a high-fidelity vector representation of the sketch.
Type: Application
Filed: November 4, 2021
Publication date: May 4, 2023
Applicant: Adobe Inc.
Inventors: Ashwani Chandil, Vineet Batra, Matthew David Fisher, Deepali Aneja, Ankit Phogat
-
Publication number: 20230131321
Abstract: A computer-implemented method includes receiving an input image at a first image stage and receiving a request to generate a plurality of variations of the input image at a second image stage. The method includes generating, using an auto-regressive generative deep learning model, the plurality of variations of the input image at the second image stage and outputting the plurality of variations of the input image at the second image stage.
Type: Application
Filed: October 25, 2021
Publication date: April 27, 2023
Applicant: Adobe Inc.
Inventors: Matthew David FISHER, Vineet BATRA, Sumit DHINGRA, Praveen Kumar DHANUKA, Deepali ANEJA, Ankit PHOGAT
-
Publication number: 20220261573
Abstract: Embodiments are disclosed for re-timing a video sequence to an audio sequence based on the detection of motion beats in the video sequence and audio beats in the audio sequence. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a first input, the first input including a video sequence, detecting motion beats in the video sequence, receiving a second input, the second input including an audio sequence, detecting audio beats in the audio sequence, modifying the video sequence by matching the detected motion beats in the video sequence to the detected audio beats in the audio sequence, and outputting the modified video sequence.
Type: Application
Filed: February 12, 2021
Publication date: August 18, 2022
Inventors: Jimei YANG, Deepali ANEJA, Dingzeyu LI, Jun SAITO, Yang ZHOU
-
Patent number: 11211060
Abstract: Disclosed systems and methods predict visemes from an audio sequence. In an example, a viseme-generation application accesses a first audio sequence that is mapped to a sequence of visemes. The first audio sequence has a first length and represents phonemes. The application adjusts a second length of a second audio sequence such that the second length equals the first length and represents the phonemes. The application adjusts the sequence of visemes to the second audio sequence such that phonemes in the second audio sequence correspond to the phonemes in the first audio sequence. The application trains a machine-learning model with the second audio sequence and the sequence of visemes. The machine-learning model predicts an additional sequence of visemes based on an additional sequence of audio.
Type: Grant
Filed: May 29, 2020
Date of Patent: December 28, 2021
Assignee: Adobe Inc.
Inventors: Wilmot Li, Jovan Popovic, Deepali Aneja, David Simons
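The length-adjustment trick in this abstract is easy to picture: a second recording of the same phonemes is stretched so its feature sequence lines up frame-for-frame with the first, letting the first sequence's viseme labels supervise both. Uniform linear resampling below is a simplification, and the 13-dimensional MFCC features are an illustrative assumption.

```python
import numpy as np

def stretch_features(feats, target_len):
    """feats: (T, D) audio features. Returns (target_len, D) by linear
    interpolation along time, preserving the phoneme order."""
    t = feats.shape[0]
    pos = np.linspace(0, t - 1, target_len)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, t - 1)
    frac = (pos - lo)[:, None]
    return feats[lo] * (1 - frac) + feats[hi] * frac

first = np.random.rand(120, 13)   # e.g. 13-dim MFCCs, 120 frames
second = np.random.rand(95, 13)   # same phonemes, different timing
second_aligned = stretch_features(second, len(first))
# second_aligned now pairs frame-for-frame with the first sequence's viseme
# labels and can be added to the training set.
```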
-
Publication number: 20200294495
Abstract: Disclosed systems and methods predict visemes from an audio sequence. In an example, a viseme-generation application accesses a first audio sequence that is mapped to a sequence of visemes. The first audio sequence has a first length and represents phonemes. The application adjusts a second length of a second audio sequence such that the second length equals the first length and represents the phonemes. The application adjusts the sequence of visemes to the second audio sequence such that phonemes in the second audio sequence correspond to the phonemes in the first audio sequence. The application trains a machine-learning model with the second audio sequence and the sequence of visemes. The machine-learning model predicts an additional sequence of visemes based on an additional sequence of audio.
Type: Application
Filed: May 29, 2020
Publication date: September 17, 2020
Inventors: Wilmot Li, Jovan Popovic, Deepali Aneja, David Simons