Patents by Inventor Jason Wen

Jason Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12373920
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize artificial intelligence to learn to recommend foreground object images for use in generating composite images based on geometry and/or lighting features. For instance, in one or more embodiments, the disclosed systems transform a foreground object image corresponding to a background image using at least one of a geometry transformation or a lighting transformation. The disclosed systems further generate predicted embeddings for the background image, the foreground object image, and the transformed foreground object image within a geometry-lighting-sensitive embedding space utilizing a geometry-lighting-aware neural network. Using a loss determined from the predicted embeddings, the disclosed systems update parameters of the geometry-lighting-aware neural network. The disclosed systems further provide a variety of efficient user interfaces for generating composite digital images.
    Type: Grant
    Filed: April 11, 2022
    Date of Patent: July 29, 2025
    Assignee: Adobe Inc.
    Inventors: Zhe Lin, Sijie Zhu, Jason Wen Yong Kuen, Scott Cohen, Zhifei Zhang
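    The geometry-lighting-sensitive embedding objective described in the abstract above can be illustrated with a minimal sketch: treat (background, original foreground) as a positive pair and (background, geometry- or lighting-transformed foreground) as a negative pair under a triplet margin loss. This is an assumption-laden illustration, not the patented method; all names are hypothetical.

    ```python
    import numpy as np

    def triplet_margin_loss(bg_emb, fg_emb, transformed_fg_emb, margin=1.0):
        """Hypothetical sketch of a geometry-lighting-sensitive objective.
        Pulls the background embedding toward the compatible foreground
        embedding and pushes it away from the transformed (geometry- or
        lighting-perturbed) foreground embedding."""
        pos_dist = np.linalg.norm(bg_emb - fg_emb)           # should be small
        neg_dist = np.linalg.norm(bg_emb - transformed_fg_emb)  # should be large
        return max(0.0, pos_dist - neg_dist + margin)
    ```

    In this sketch, minimizing the loss over many triplets would make the embedding space discriminate geometry and lighting mismatches, which is the property the abstract attributes to the learned space.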
  • Publication number: 20250232575
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate object masks for digital objects portrayed in digital images utilizing a detection-masking neural network pipeline. In particular, in one or more embodiments, the disclosed systems utilize detection heads of a neural network to detect digital objects portrayed within a digital image. In some cases, each detection head is associated with one or more digital object classes that are not associated with the other detection heads. Further, in some cases, the detection heads implement multi-scale synchronized batch normalization to normalize feature maps across various feature levels. The disclosed systems further utilize a masking head of the neural network to generate one or more object masks for the detected digital objects. In some cases, the disclosed systems utilize post-processing techniques to filter out low-quality masks.
    Type: Application
    Filed: April 7, 2025
    Publication date: July 17, 2025
    Inventors: Jason Wen Yong Kuen, Su Chen, Scott Cohen, Zhe Lin, Zijun Wei, Jianming Zhang
  • Patent number: 12333845
    Abstract: The technology described includes methods for pretraining a document encoder model based on multimodal self cross-attention. One method includes receiving image data that encodes a set of pretraining documents. A set of sentences is extracted from the image data. A bounding box for each sentence is generated. For each sentence, a set of predicted features is generated by using an encoder machine-learning model. The encoder model performs cross-attention between a set of masked-textual features for the sentence and a set of masked-visual features for the sentence. The set of masked-textual features is based on a masking function and the sentence. The set of masked-visual features is based on the masking function and the corresponding bounding box. A document-encoder model is pretrained based on the set of predicted features for each sentence and pretraining tasks. The pretraining tasks include masked sentence modeling, visual contrastive learning, or visual-language alignment.
    Type: Grant
    Filed: November 16, 2021
    Date of Patent: June 17, 2025
    Assignee: Adobe Inc.
    Inventors: Jiuxiang Gu, Ani Nenkova, Nikolaos Barmpalios, Vlad Ion Morariu, Tong Sun, Rajiv Bhawanji Jain, Jason Wen Yong Kuen, Handong Zhao
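    The cross-attention between textual and visual features mentioned in the abstract above can be sketched as standard scaled dot-product attention where queries come from one modality and keys/values from the other. A minimal numpy illustration follows, assuming dense feature matrices; it is not the patented pretraining procedure.

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def cross_attention(text_feats, visual_feats):
        """Sketch of cross-modal attention: text features attend over
        visual features (queries from text, keys/values from vision)."""
        d = text_feats.shape[-1]
        scores = text_feats @ visual_feats.T / np.sqrt(d)  # (n_text, n_vis)
        weights = softmax(scores, axis=-1)                 # rows sum to 1
        return weights @ visual_feats                      # (n_text, d)
    ```

    In the abstract's setting, both inputs would first pass through a masking function, so the model must reconstruct information about one modality from the other.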
  • Patent number: 12272127
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate object masks for digital objects portrayed in digital images utilizing a detection-masking neural network pipeline. In particular, in one or more embodiments, the disclosed systems utilize detection heads of a neural network to detect digital objects portrayed within a digital image. In some cases, each detection head is associated with one or more digital object classes that are not associated with the other detection heads. Further, in some cases, the detection heads implement multi-scale synchronized batch normalization to normalize feature maps across various feature levels. The disclosed systems further utilize a masking head of the neural network to generate one or more object masks for the detected digital objects. In some cases, the disclosed systems utilize post-processing techniques to filter out low-quality masks.
    Type: Grant
    Filed: January 31, 2022
    Date of Patent: April 8, 2025
    Assignee: Adobe Inc.
    Inventors: Jason Wen Yong Kuen, Su Chen, Scott Cohen, Zhe Lin, Zijun Wei, Jianming Zhang
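    The "post-processing techniques to filter out low-quality masks" mentioned in the abstract above could take many forms; one simple illustrative possibility is thresholding on confidence score and mask area. The function and thresholds below are hypothetical, not taken from the patent.

    ```python
    def filter_masks(masks, score_threshold=0.5, min_area=2):
        """Sketch of mask post-processing: drop predicted object masks
        whose confidence is low or whose area is too small to be useful.
        Each mask is a dict with a binary "mask" (nested lists) and a
        float "score"."""
        kept = []
        for m in masks:
            area = sum(sum(row) for row in m["mask"])
            if m["score"] >= score_threshold and area >= min_area:
                kept.append(m)
        return kept
    ```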
  • Publication number: 20250086849
    Abstract: Embodiments of the present disclosure include obtaining a text prompt describing an element, layout information indicating a target region for the element, and a precision level corresponding to the element. Some embodiments generate a text feature pyramid based on the text prompt, the layout information, and the precision level, wherein the text feature pyramid comprises a plurality of text feature maps at a plurality of scales, respectively. Then, an image is generated based on the text feature pyramid. In some cases, the image includes an object corresponding to the element of the text prompt at the target region. Additionally, a shape of the object corresponds to a shape of the target region based on the precision level.
    Type: Application
    Filed: September 8, 2023
    Publication date: March 13, 2025
    Inventors: Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, Jason Wen Yong Kuen, John Philip Collomosse
  • Patent number: 12223439
    Abstract: Systems and methods for multi-modal representation learning are described. One or more embodiments provide a visual representation learning system trained using machine learning techniques. For example, some embodiments of the visual representation learning system are trained using cross-modal training tasks including a combination of intra-modal and inter-modal similarity preservation objectives. In some examples, the training tasks are based on contrastive learning techniques.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: February 11, 2025
    Assignee: Adobe Inc.
    Inventors: Xin Yuan, Zhe Lin, Jason Wen Yong Kuen, Jianming Zhang, Yilin Wang, Ajinkya Kale, Baldo Faieta
  • Patent number: 12198224
    Abstract: Systems and methods for image generation are described. Embodiments of the present disclosure receive a text phrase that describes a target image to be generated; generate text features based on the text phrase; retrieve a search image based on the text phrase; and generate the target image using an image generation network based on the text features and the search image.
    Type: Grant
    Filed: February 15, 2022
    Date of Patent: January 14, 2025
    Assignee: Adobe Inc.
    Inventors: Xin Yuan, Zhe Lin, Jason Wen Yong Kuen, Jianming Zhang, John Philip Collomosse
  • Publication number: 20240371007
    Abstract: Various disclosed embodiments are directed to refining or correcting individual semantic segmentation/instance segmentation masks that have already been produced by baseline models in order to generate a final coherent panoptic segmentation map. Specifically, a refinement model, such as an encoder-decoder-based neural network, generates or predicts various data objects, such as foreground masks, bounding box offset maps, center maps, center offset maps, and coordinate convolution. This, among other functionality described herein, reduces the inaccuracies and computing resource consumption of existing technologies.
    Type: Application
    Filed: July 11, 2024
    Publication date: November 7, 2024
    Inventors: Zhe Lin, Simon Su Chen, Jason Wen Yong Kuen, Bo Sun
  • Patent number: 12067730
    Abstract: Various disclosed embodiments are directed to refining or correcting individual semantic segmentation/instance segmentation masks that have already been produced by baseline models in order to generate a final coherent panoptic segmentation map. Specifically, a refinement model, such as an encoder-decoder-based neural network, generates or predicts various data objects, such as foreground masks, bounding box offset maps, center maps, center offset maps, and coordinate convolution. This, among other functionality described herein, reduces the inaccuracies and computing resource consumption of existing technologies.
    Type: Grant
    Filed: October 6, 2021
    Date of Patent: August 20, 2024
    Assignee: Adobe Inc.
    Inventors: Zhe Lin, Simon Su Chen, Jason Wen Yong Kuen, Bo Sun
  • Publication number: 20240249413
    Abstract: In implementations of systems for performing multiple segmentation tasks, a computing device implements a segment system to receive input data describing a digital image depicting an object. The segment system computes per-pixel embeddings for the digital image using a pixel decoder of a machine learning model. Output embeddings are generated using a transformer decoder of the machine learning model based on the per-pixel embeddings for the digital image, input embeddings for a first segmentation task and input embeddings for a second segmentation task. The segment system outputs a first digital image and a second digital image. The first digital image depicts the object segmented based on the first segmentation task and the second digital image depicts the object segmented based on the second segmentation task.
    Type: Application
    Filed: January 23, 2023
    Publication date: July 25, 2024
    Applicant: Adobe Inc.
    Inventors: Jason Wen Yong Kuen, Zhe Lin, Sukjun Hwang, Jianming Zhang, Brian Lynn Price
  • Publication number: 20240169623
    Abstract: Systems and methods for multi-modal image generation are provided. One or more aspects of the systems and methods includes obtaining a text prompt and layout information indicating a target location for an element of the text prompt within an image to be generated and computing a text feature map including a plurality of values corresponding to the element of the text prompt at pixel locations corresponding to the target location. Then the image is generated based on the text feature map using a diffusion model. The generated image includes the element of the text prompt at the target location.
    Type: Application
    Filed: November 22, 2022
    Publication date: May 23, 2024
    Inventors: Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, Jason Wen Yong Kuen, John Philip Collomosse
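    The text feature map described in the abstract above — values corresponding to a prompt element written at the pixel locations of its target region — can be sketched directly. The helper below is a hypothetical illustration with rectangular regions, not the disclosed system.

    ```python
    import numpy as np

    def text_feature_map(height, width, regions):
        """Sketch of a spatial text feature map for layout-guided generation.
        regions: list of (row0, row1, col0, col1, embedding) tuples, where
        embedding is the text feature vector for one prompt element. The
        embedding is broadcast into that element's target region; pixels
        outside any region stay zero."""
        dim = len(regions[0][4])
        fmap = np.zeros((height, width, dim))
        for r0, r1, c0, c1, emb in regions:
            fmap[r0:r1, c0:c1] = emb
        return fmap
    ```

    In the abstract's pipeline, a map like this would condition a diffusion model so that the generated object appears at the target location.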
  • Publication number: 20240104951
    Abstract: In various examples, a table recognition model receives an image of a table and generates, using a first encoder of the table recognition machine learning model, an image feature vector including features extracted from the image of the table; generates, using a first decoder of the table recognition machine learning model and the image feature vector, a set of coordinates within the image representing rows and columns associated with the table; and generates, using a second decoder of the table recognition machine learning model and the image feature vector, a set of bounding boxes and semantic features associated with cells of the table; then determines, using a third decoder of the table recognition machine learning model, a table structure associated with the table using the image feature vector, the set of coordinates, the set of bounding boxes, and the semantic features.
    Type: Application
    Filed: September 19, 2022
    Publication date: March 28, 2024
    Inventors: Jiuxiang Gu, Vlad Morariu, Tong Sun, Jason Wen Yong Kuen, Ani Nenkova
  • Patent number: 11941884
    Abstract: Systems and methods for image processing are described. Embodiments of the present disclosure receive an image having a plurality of object instances; encode the image to obtain image features; decode the image features to obtain object features; generate object detection information based on the object features using an object detection branch, wherein the object detection branch is trained based on a first training set using a detection loss; generate semantic segmentation information based on the object features using a semantic segmentation branch, wherein the semantic segmentation branch is trained based on a second training set different from the first training set using a semantic segmentation loss; and combine the object detection information and the semantic segmentation information to obtain panoptic segmentation information that indicates which pixels of the image correspond to each of the plurality of object instances.
    Type: Grant
    Filed: November 12, 2021
    Date of Patent: March 26, 2024
    Assignee: Adobe Inc.
    Inventors: Jason Wen Yong Kuen, Bo Sun, Zhe Lin, Simon Su Chen
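    The final step of the abstract above — combining object detection information with semantic segmentation information into a panoptic result — can be sketched with a simple label-merging rule. This is a generic illustration of panoptic fusion under stated assumptions (instance masks overwrite "stuff" labels), not the patented combination logic.

    ```python
    import numpy as np

    def combine_panoptic(semantic, instance_masks):
        """Sketch of panoptic fusion.
        semantic: (H, W) int array of semantic ("stuff") class ids.
        instance_masks: list of (mask, instance_id) pairs, mask being a
        boolean (H, W) array for one detected object instance.
        Instance ids are offset past the semantic id range so the two
        label spaces never collide; later instances overwrite earlier
        ones where masks overlap."""
        panoptic = semantic.copy()
        offset = int(semantic.max()) + 1
        for mask, inst_id in instance_masks:
            panoptic[mask] = offset + inst_id
        return panoptic
    ```

    The result is a single map in which every pixel carries either a semantic class id or a unique per-instance id, matching the abstract's description of panoptic segmentation information.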
  • Patent number: 11915359
    Abstract: Systems, apparatuses, and methods for implementing kernel software driven color remapping of rendered primary surfaces are disclosed. A system includes at least a general processor, a graphics processor, and a memory. The general processor executes a user-mode application, a user-mode driver, and a kernel-mode driver. A primary surface is rendered on the graphics processor on behalf of the user-mode application. The primary surface is stored in memory locations allocated for the primary surface by the user-mode driver, and the kernel-mode driver is notified when the primary surface is ready to be displayed. Rather than displaying the primary surface, the kernel-mode driver causes the pixels of the primary surface to be remapped on the graphics processor using a selected lookup table (LUT) so as to generate a remapped surface, which is stored in memory locations allocated for the remapped surface by the user-mode driver. Then, the remapped surface is displayed.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: February 27, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Jason Wen-Tse Wu, Parimalkumar Patel, Jia Hui Li, Chao Zhan
  • Patent number: 11868889
    Abstract: In implementations of object detection in images, object detectors are trained using heterogeneous training datasets. A first training dataset is used to train an image tagging network to determine an attention map of an input image for a target concept. A second training dataset is used to train a conditional detection network that accepts as conditional inputs the attention map and a word embedding of the target concept. Despite the conditional detection network being trained with a training dataset having a small number of seen classes (e.g., classes in a training dataset), it generalizes to novel, unseen classes by concept conditioning, since the target concept propagates through the conditional detection network via the conditional inputs, thus influencing classification and region proposal. Hence, classes of objects that can be detected are expanded, without the need to scale training databases to include additional classes.
    Type: Grant
    Filed: January 31, 2022
    Date of Patent: January 9, 2024
    Assignee: Adobe Inc.
    Inventors: Zhe Lin, Xiaohui Shen, Mingyang Ling, Jianming Zhang, Jason Wen Yong Kuen
  • Publication number: 20230401827
    Abstract: Systems and methods for image segmentation are described. Embodiments of the present disclosure receive a training image and a caption for the training image, wherein the caption includes text describing an object in the training image; generate a pseudo mask for the object using a teacher network based on the text describing the object; generate a mask for the object using a student network; compute noise information for the training image using a noise estimation network; and update parameters of the student network based on the mask, the pseudo mask, and the noise information.
    Type: Application
    Filed: June 9, 2022
    Publication date: December 14, 2023
    Inventors: Jason Wen Yong Kuen, Dat Ba Huynh, Zhe Lin, Jiuxiang Gu
  • Publication number: 20230368003
    Abstract: The technology described herein is directed to an adaptive sparse attention pattern that is learned during fine-tuning and deployed in a machine-learning model. In aspects, a row or a column in an attention matrix with an importance score for a task that is above a threshold importance score is identified. The important row or the column is included in an adaptive attention pattern used with a machine-learning model having a self-attention operation. In response to an input, a task-specific inference is generated for the input using the machine-learning model with the adaptive attention pattern.
    Type: Application
    Filed: May 10, 2022
    Publication date: November 16, 2023
    Inventors: Jiuxiang Gu, Zihan Wang, Jason Wen Yong Kuen, Handong Zhao, Vlad Ion Morariu, Ruiyi Zhang, Ani Nenkova, Tong Sun
  • Publication number: 20230325991
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize artificial intelligence to learn to recommend foreground object images for use in generating composite images based on geometry and/or lighting features. For instance, in one or more embodiments, the disclosed systems transform a foreground object image corresponding to a background image using at least one of a geometry transformation or a lighting transformation. The disclosed systems further generate predicted embeddings for the background image, the foreground object image, and the transformed foreground object image within a geometry-lighting-sensitive embedding space utilizing a geometry-lighting-aware neural network. Using a loss determined from the predicted embeddings, the disclosed systems update parameters of the geometry-lighting-aware neural network. The disclosed systems further provide a variety of efficient user interfaces for generating composite digital images.
    Type: Application
    Filed: April 11, 2022
    Publication date: October 12, 2023
    Inventors: Zhe Lin, Sijie Zhu, Jason Wen Yong Kuen, Scott Cohen, Zhifei Zhang
  • Publication number: 20230325992
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize artificial intelligence to learn to recommend foreground object images for use in generating composite images based on geometry and/or lighting features. For instance, in one or more embodiments, the disclosed systems transform a foreground object image corresponding to a background image using at least one of a geometry transformation or a lighting transformation. The disclosed systems further generate predicted embeddings for the background image, the foreground object image, and the transformed foreground object image within a geometry-lighting-sensitive embedding space utilizing a geometry-lighting-aware neural network. Using a loss determined from the predicted embeddings, the disclosed systems update parameters of the geometry-lighting-aware neural network. The disclosed systems further provide a variety of efficient user interfaces for generating composite digital images.
    Type: Application
    Filed: April 11, 2022
    Publication date: October 12, 2023
    Inventors: Zhe Lin, Sijie Zhu, Jason Wen Yong Kuen, Scott Cohen, Zhifei Zhang
  • Publication number: 20230260164
    Abstract: Systems and methods for image generation are described. Embodiments of the present disclosure receive a text phrase that describes a target image to be generated; generate text features based on the text phrase; retrieve a search image based on the text phrase; and generate the target image using an image generation network based on the text features and the search image.
    Type: Application
    Filed: February 15, 2022
    Publication date: August 17, 2023
    Inventors: Xin Yuan, Zhe Lin, Jason Wen Yong Kuen, Jianming Zhang, John Philip Collomosse