Patents by Inventor Kshitiz Garg

Kshitiz Garg has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20260105670
    Abstract: Systems, methods, and non-transitory computer-readable media generate custom animations comprising a structure of a coarse animation prompt. For example, the disclosed systems receive a style prompt and receive a coarse animation prompt. The disclosed systems generate, utilizing a media generation model, a custom animation having a structure and timing of the coarse animation prompt and a style informed by the style prompt. The disclosed systems also provide the custom animation for display via a graphical user interface.
    Type: Application
    Filed: September 30, 2025
    Publication date: April 16, 2026
    Inventors: Yangtuanfeng Wang, Li-Yi Wei, Wilmot Wei-Mau Li, Valerie Head, Seth Walker, Lakshya Lnu, Kshitiz Garg, Kazi Rubaiat Habib, Jun Saito, James Ratliff, Duygu Ceylan Aksit, Dafei Qin, Cameron Smith
  • Patent number: 12579608
    Abstract: Systems and methods for generating tile-able patterns from text include obtaining a text prompt and generating, by a generation prior model, a latent vector based on the text prompt, where the generation prior model is trained to output vectors within a distribution of tile-able patterns. An image generation model then generates an output image based on the latent vector. The output image comprises a tile-able pattern including an element from the text prompt.
    Type: Grant
    Filed: December 1, 2023
    Date of Patent: March 17, 2026
    Assignee: Adobe Inc.
    Inventors: Vineet Batra, Sumit Chaturvedi, Abhishek Rai, Pranav Vineet Aggarwal, Ajinkya Gorakhnath Kale, Aman Jeph, Ankit Phogat, Sumit Dhingra, Fengbin Chen, Kshitiz Garg, Milos Hasan, Midhun Harikumar, Gaurav Suresh Pathak, Souymodip Chakraborty
  • Publication number: 20260044563
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for extracting moments of interest (e.g., video frames, video segments) from a video. In an example embodiment, independent and/or orthogonal machine learning models are used to extract different types of features considering different modalities, and each frame in the video is assigned an importance score for each model. The importance scores for each model are combined into an aggregated importance score for each frame in the video. Depending on the embodiment, the aggregated importance scores are used to visualize the score per frame, identify moments of interest, automatically crop down the video into a highlight reel, browse or visualize the moments of interest within the video, and/or search across multiple videos.
    Type: Application
    Filed: October 16, 2025
    Publication date: February 12, 2026
    Inventors: Ali Aminian, William Lawrence Marino, Kshitiz Garg, Aseem Omprakash Agarwala
  • Patent number: 12513369
    Abstract: Embodiments are disclosed for generating a temporally coherent video extension. The method includes displaying, on a graphical user interface, a user interface element representing a video to be extended, where the video includes a number of frames. The method further includes receiving an input via the graphical user interface associated with the user interface element. The input causes a visual change to the user interface element which represents a duration of an extension to be made to the video. The method further includes generating frames based on the duration of the extension. The generated frames use motion information determined from frames of the video. The motion information represents a per-pixel motion between at least a pair of frames of the video. The method further includes providing, for display on the graphical user interface, an extended video including the frames of the video and the generated frames.
    Type: Grant
    Filed: September 10, 2024
    Date of Patent: December 30, 2025
    Assignee: Adobe Inc.
    Inventors: Gabriela Duncombe, Xue Bai, Lakshya Lnu, Kshitiz Garg, Gunjan Aggarwal, Feng Liu, Aseem Agarwala, Ali Aminian, Jui-Hsien Wang, Zhe Wang
  • Patent number: 12468760
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for extracting moments of interest (e.g., video frames, video segments) from a video. In an example embodiment, independent and/or orthogonal machine learning models are used to extract different types of features considering different modalities, and each frame in the video is assigned an importance score for each model. The importance scores for each model are combined into an aggregated importance score for each frame in the video. Depending on the embodiment, the aggregated importance scores are used to visualize the score per frame, identify moments of interest, automatically crop down the video into a highlight reel, browse or visualize the moments of interest within the video, and/or search across multiple videos.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: November 11, 2025
    Assignee: Adobe Inc.
    Inventors: Ali Aminian, William Lawrence Marino, Kshitiz Garg, Aseem Agarwala
  • Patent number: 12450504
    Abstract: This disclosure describes one or more implementations of a video inference system that utilizes machine-learning models to efficiently and flexibly process digital videos utilizing various improved video inference architectures. For example, the video inference system provides a framework for improving digital video processing by increasing the efficiency of both central processing units (CPUs) and graphics processing units (GPUs). In one example, the video inference system utilizes a first video inference architecture to reduce the number of computing resources needed to inference digital videos by analyzing multiple digital videos utilizing sets of CPU/GPU containers along with parallel pipeline processing. In a further example, the video inference system utilizes a second video inference architecture that facilitates multiple CPUs to preprocess multiple digital videos in parallel as well as a GPU to continuously, sequentially, and efficiently inference each of the digital videos.
    Type: Grant
    Filed: July 12, 2024
    Date of Patent: October 21, 2025
    Assignee: Adobe Inc.
    Inventors: Akhilesh Kumar, Xiaozhen Xue, Daniel Miranda, Nicolas Huynh Thien, Kshitiz Garg
  • Patent number: 12412419
    Abstract: Some aspects of the technology described herein perform identity identification on faces in a video. Object tracking is performed on detected faces in frames of a video to generate tracklets. Each tracklet comprises a sequence of consecutive frames in which each frame includes a detected face for a person. The tracklets are clustered using face feature vectors for detected faces of each tracklet to generate a plurality of clusters. Information is stored in an identity datastore, including a first identifier for a first identity in association with an indication of frames from tracklets in a first cluster from the plurality of clusters.
    Type: Grant
    Filed: November 7, 2022
    Date of Patent: September 9, 2025
    Assignee: Adobe Inc.
    Inventors: Ali Aminian, Aashish Kumar Misraa, Kshitiz Garg, Aseem Agarwala
  • Publication number: 20250252741
    Abstract: Embodiments are disclosed for receiving a user input and an input video comprising multiple frames. The method may include extracting a text feature from the user input. The method may further include extracting a plurality of image features from the frames. The method may further include identifying one or more keyframes from the frames that include the object. The method may further include clustering one or more groups of the one or more keyframes. The method may further include generating a plurality of segmentation masks for each group. The method may further include determining a set of reference masks corresponding to the user input and the object. The method may further include generating a set of fusion masks by combining the plurality of segmentation masks and the set of reference masks. The method may further include propagating the set of fusion masks and outputting a final set of masks.
    Type: Application
    Filed: March 31, 2025
    Publication date: August 7, 2025
    Applicant: Adobe Inc.
    Inventors: Shivam Nalin Patel, Kshitiz Garg, Han Guo, Ali Aminian, Aashish Misraa
  • Patent number: 12266181
    Abstract: Embodiments are disclosed for receiving a user input and an input video comprising multiple frames. The method may include extracting a text feature from the user input. The method may further include extracting a plurality of image features from the frames. The method may further include identifying one or more keyframes from the frames that include the object. The method may further include clustering one or more groups of the one or more keyframes. The method may further include generating a plurality of segmentation masks for each group. The method may further include determining a set of reference masks corresponding to the user input and the object. The method may further include generating a set of fusion masks by combining the plurality of segmentation masks and the set of reference masks. The method may further include propagating the set of fusion masks and outputting a final set of masks.
    Type: Grant
    Filed: November 19, 2021
    Date of Patent: April 1, 2025
    Assignee: Adobe Inc.
    Inventors: Shivam Nalin Patel, Kshitiz Garg, Han Guo, Ali Aminian, Aashish Misraa
  • Publication number: 20240420389
    Abstract: Systems and methods for generating tile-able patterns from text include obtaining a text prompt and generating, by a generation prior model, a latent vector based on the text prompt, where the generation prior model is trained to output vectors within a distribution of tile-able patterns. An image generation model then generates an output image based on the latent vector. The output image comprises a tile-able pattern including an element from the text prompt.
    Type: Application
    Filed: December 1, 2023
    Publication date: December 19, 2024
    Inventors: Vineet Batra, Sumit Chaturvedi, Abhishek Rai, Pranav Vineet Aggarwal, Ajinkya Gorakhnath Kale, Aman Jeph, Ankit Phogat, Sumit Dhingra, Fengbin Chen, Kshitiz Garg, Milos Hasan, Midhun Harikumar, Gaurav Suresh Pathak, Souymodip Chakraborty
  • Publication number: 20240362506
    Abstract: This disclosure describes one or more implementations of a video inference system that utilizes machine-learning models to efficiently and flexibly process digital videos utilizing various improved video inference architectures. For example, the video inference system provides a framework for improving digital video processing by increasing the efficiency of both central processing units (CPUs) and graphics processing units (GPUs). In one example, the video inference system utilizes a first video inference architecture to reduce the number of computing resources needed to inference digital videos by analyzing multiple digital videos utilizing sets of CPU/GPU containers along with parallel pipeline processing. In a further example, the video inference system utilizes a second video inference architecture that facilitates multiple CPUs to preprocess multiple digital videos in parallel as well as a GPU to continuously, sequentially, and efficiently inference each of the digital videos.
    Type: Application
    Filed: July 12, 2024
    Publication date: October 31, 2024
    Inventors: Akhilesh Kumar, Xiaozhen Xue, Daniel Miranda, Nicolas Huynh Thien, Kshitiz Garg
  • Patent number: 12067499
    Abstract: This disclosure describes one or more implementations of a video inference system that utilizes machine-learning models to efficiently and flexibly process digital videos utilizing various improved video inference architectures. For example, the video inference system provides a framework for improving digital video processing by increasing the efficiency of both central processing units (CPUs) and graphics processing units (GPUs). In one example, the video inference system utilizes a first video inference architecture to reduce the number of computing resources needed to inference digital videos by analyzing multiple digital videos utilizing sets of CPU/GPU containers along with parallel pipeline processing. In a further example, the video inference system utilizes a second video inference architecture that facilitates multiple CPUs to preprocess multiple digital videos in parallel as well as a GPU to continuously, sequentially, and efficiently inference each of the digital videos.
    Type: Grant
    Filed: November 2, 2020
    Date of Patent: August 20, 2024
    Assignee: Adobe Inc.
    Inventors: Akhilesh Kumar, Xiaozhen Xue, Daniel Miranda, Nicolas Huynh Thien, Kshitiz Garg
  • Patent number: 12050647
    Abstract: Techniques for recommending hashtags, including trending hashtags, are disclosed. An example method includes accessing a graph. The graph includes video nodes representing videos, historical hashtag nodes representing historical hashtags, and edges indicating associations among the video nodes and the historical hashtag nodes. A trending hashtag is identified. An edge is added to the graph between a historical hashtag node representing a historical hashtag and a trending hashtag node representing the trending hashtag, based on a semantic similarity between the historical hashtag and the trending hashtag. A new video node representing a new video is added to the video nodes of the graph. A graph neural network (GNN) is applied to the graph, and the GNN predicts a new edge between the trending hashtag node and the new video node. The trending hashtag is recommended for the new video based on prediction of the new edge.
    Type: Grant
    Filed: July 29, 2022
    Date of Patent: July 30, 2024
    Assignee: Adobe Inc.
    Inventors: Somdeb Sarkhel, Xiang Chen, Viswanathan Swaminathan, Swapneel Mehta, Saayan Mitra, Ryan Rossi, Han Guo, Ali Aminian, Kshitiz Garg
  • Publication number: 20240153303
    Abstract: Some aspects of the technology described herein perform identity identification on faces in a video. Object tracking is performed on detected faces in frames of a video to generate tracklets. Each tracklet comprises a sequence of consecutive frames in which each frame includes a detected face for a person. The tracklets are clustered using face feature vectors for detected faces of each tracklet to generate a plurality of clusters. Information is stored in an identity datastore, including a first identifier for a first identity in association with an indication of frames from tracklets in a first cluster from the plurality of clusters.
    Type: Application
    Filed: November 7, 2022
    Publication date: May 9, 2024
    Inventors: Ali Aminian, Aashish Kumar Misraa, Kshitiz Garg, Aseem Agarwala
  • Publication number: 20240037149
    Abstract: Techniques for recommending hashtags, including trending hashtags, are disclosed. An example method includes accessing a graph. The graph includes video nodes representing videos, historical hashtag nodes representing historical hashtags, and edges indicating associations among the video nodes and the historical hashtag nodes. A trending hashtag is identified. An edge is added to the graph between a historical hashtag node representing a historical hashtag and a trending hashtag node representing the trending hashtag, based on a semantic similarity between the historical hashtag and the trending hashtag. A new video node representing a new video is added to the video nodes of the graph. A graph neural network (GNN) is applied to the graph, and the GNN predicts a new edge between the trending hashtag node and the new video node. The trending hashtag is recommended for the new video based on prediction of the new edge.
    Type: Application
    Filed: July 29, 2022
    Publication date: February 1, 2024
    Inventors: Somdeb Sarkhel, Xiang Chen, Viswanathan Swaminathan, Swapneel Mehta, Saayan Mitra, Ryan Rossi, Han Guo, Ali Aminian, Kshitiz Garg
  • Publication number: 20230377339
    Abstract: Embodiments are disclosed for generating temporally consistent manipulated videos. A method of generating temporally consistent manipulated videos comprises receiving a target appearance and an input digital video including a plurality of frames, generating a plurality of target appearance frames from the plurality of frames, training a video prediction network to generate a digital video wherein a subject of the digital video has its appearance modified to match the target appearance, providing the input digital video to the video prediction network, and generating, by the video prediction network, an output digital video wherein the subject of the output digital video has its appearance modified to match the target appearance.
    Type: Application
    Filed: May 23, 2022
    Publication date: November 23, 2023
    Applicant: Adobe Inc.
    Inventors: Han Guo, Kshitiz Garg, Ali Aminian, Aashish Misraa, William Marino, Nicolas Huynh Thien
  • Publication number: 20230140369
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for extracting moments of interest (e.g., video frames, video segments) from a video. In an example embodiment, independent and/or orthogonal machine learning models are used to extract different types of features considering different modalities, and each frame in the video is assigned an importance score for each model. The importance scores for each model are combined into an aggregated importance score for each frame in the video. Depending on the embodiment, the aggregated importance scores are used to visualize the score per frame, identify moments of interest, automatically crop down the video into a highlight reel, browse or visualize the moments of interest within the video, and/or search across multiple videos.
    Type: Application
    Filed: October 28, 2021
    Publication date: May 4, 2023
    Inventors: Ali Aminian, William Lawrence Marino, Kshitiz Garg, Aseem Agarwala
  • Publication number: 20220138596
    Abstract: This disclosure describes one or more implementations of a video inference system that utilizes machine-learning models to efficiently and flexibly process digital videos utilizing various improved video inference architectures. For example, the video inference system provides a framework for improving digital video processing by increasing the efficiency of both central processing units (CPUs) and graphics processing units (GPUs). In one example, the video inference system utilizes a first video inference architecture to reduce the number of computing resources needed to inference digital videos by analyzing multiple digital videos utilizing sets of CPU/GPU containers along with parallel pipeline processing. In a further example, the video inference system utilizes a second video inference architecture that facilitates multiple CPUs to preprocess multiple digital videos in parallel as well as a GPU to continuously, sequentially, and efficiently inference each of the digital videos.
    Type: Application
    Filed: November 2, 2020
    Publication date: May 5, 2022
    Inventors: Akhilesh Kumar, Xiaozhen Xue, Daniel Miranda, Nicolas Huynh Thien, Kshitiz Garg
  • Patent number: 10762440
    Abstract: Some embodiments provide a sensor data-processing system which detects and classifies objects detected in an environment via fusion of sensor data representations generated by multiple separate sensors. The sensor data-processing system can fuse sensor data representations generated by multiple sensor devices into a fused sensor data representation and can further detect and classify features in the fused sensor data representation. Feature detection can be implemented based at least in part upon utilizing a feature-detection model generated via one or more of deep learning and traditional machine learning. The sensor data-processing system can adjust sensor data processing of representations generated by sensor devices based on external factors including indications of sensor health and environmental conditions.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: September 1, 2020
    Assignee: Apple Inc.
    Inventors: Kshitiz Garg, Ahmad Al-Dahle
  • Patent number: 10671068
    Abstract: Sensor data captured by different sensors may be shared across different sensor processing pipelines. Sensor processing pipelines may process captured sensor data from respective sensors. Some of the sensor data that is received or processed at one sensor data processing pipeline may be provided to another sensor data processing pipeline so that subsequent processing stages at the recipient sensor processing pipeline may process the combined sensor data in order to determine a perception decision. Different types of sensor data may be shared, including raw sensor data, processed sensor data, or data derived from sensor data. A control system may perform control actions based on the perception decisions determined by the sensor processing pipelines that share sensor data.
    Type: Grant
    Filed: September 19, 2017
    Date of Patent: June 2, 2020
    Assignee: Apple Inc.
    Inventors: Xinyu Xu, Ahmad Al-Dahle, Kshitiz Garg
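
Illustrative sketch: several abstracts above (for example, patent 12468760) describe assigning each video frame an importance score per model, combining those into an aggregated per-frame score, and using the result to identify moments of interest. The Python below is one minimal reading of that idea, not the patented method. The weighted average, the fixed threshold, and the names aggregate_scores, moments_of_interest, "visual", and "audio" are all assumptions made for illustration; the abstracts do not say how scores are combined or how moments are selected.

```python
from typing import Dict, List, Tuple


def aggregate_scores(
    per_model_scores: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Combine per-model frame scores into one score per frame.

    Assumption: a weighted average. The abstracts only say the scores
    "are combined", so this is one plausible choice among many.
    """
    lengths = {len(scores) for scores in per_model_scores.values()}
    if len(lengths) != 1:
        raise ValueError("every model must score the same number of frames")
    num_frames = lengths.pop()
    total_weight = sum(weights[name] for name in per_model_scores)
    return [
        sum(weights[name] * scores[i] for name, scores in per_model_scores.items())
        / total_weight
        for i in range(num_frames)
    ]


def moments_of_interest(
    scores: List[float], threshold: float
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame ranges scoring at or above threshold.

    Assumption: a fixed threshold marks a frame as interesting.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new run of interesting frames begins
        elif score < threshold and start is not None:
            segments.append((start, i - 1))  # the run ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(scores) - 1))  # run extends to the last frame
    return segments


# Hypothetical scores from two independent models over four frames.
example_scores = {
    "visual": [0.0, 1.0, 1.0, 0.0],
    "audio": [0.0, 0.0, 1.0, 1.0],
}
example_weights = {"visual": 1.0, "audio": 1.0}
aggregated = aggregate_scores(example_scores, example_weights)
highlights = moments_of_interest(aggregated, threshold=0.5)
```

The returned frame ranges could then drive the uses the abstract lists, such as cropping a video into a highlight reel or plotting the score per frame.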