Patents by Inventor Mausoom Sarkar
Mausoom Sarkar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240119122
Abstract: Systems and methods for data augmentation are provided. One aspect of the systems and methods include receiving an image that is misclassified by a classification network; computing an augmentation image based on the image using an augmentation network; and generating an augmented image by combining the image and the augmentation image, wherein the augmented image is correctly classified by the classification network.
Type: Application
Filed: October 11, 2022
Publication date: April 11, 2024
Inventors: Shripad Vilasrao Deshmukh, Surgan Jandial, Abhinav Java, Milan Aggarwal, Mausoom Sarkar, Arneh Jain, Balaji Krishnamurthy
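The pipeline in this abstract can be illustrated with a toy sketch. Everything below (the threshold classifier, the constant-shift "augmentation network", the pixel values) is an invented stand-in, not the patented implementation; it only shows the shape of the idea: compute an augmentation image for a misclassified input, combine the two, and check that the classifier's decision flips.

```python
def classify(x, threshold=0.5):
    # Stand-in for the classification network: class 1 if the mean
    # pixel value exceeds the threshold, else class 0.
    return int(sum(x) / len(x) > threshold)

def augmentation_network(image, step=0.2):
    # Stand-in for the augmentation network: a constant per-pixel
    # shift nudging the image toward the correct class (class 1 here).
    return [step for _ in image]

def combine(image, aug):
    # Element-wise addition, clipped to the valid [0, 1] pixel range.
    return [min(1.0, max(0.0, p + d)) for p, d in zip(image, aug)]

image = [0.4, 0.45, 0.5]                 # mean 0.45 -> misclassified as 0
assert classify(image) == 0
augmented = combine(image, augmentation_network(image))
assert classify(augmented) == 1          # augmented image now classified correctly
```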
-
Patent number: 11948358
Abstract: Systems and methods for video processing are described. Embodiments of the present disclosure generate a plurality of image feature vectors corresponding to a plurality of frames of a video; generate a plurality of low-level event representation vectors based on the plurality of image feature vectors, wherein a number of the low-level event representation vectors is less than a number of the image feature vectors; generate a plurality of high-level event representation vectors based on the plurality of low-level event representation vectors, wherein a number of the high-level event representation vectors is less than the number of the low-level event representation vectors; and identify a plurality of high-level events occurring in the video based on the plurality of high-level event representation vectors.
Type: Grant
Filed: November 16, 2021
Date of Patent: April 2, 2024
Assignee: ADOBE INC.
Inventors: Sumegh Roychowdhury, Sumedh A. Sontakke, Mausoom Sarkar, Nikaash Puri, Pinkesh Badjatiya, Milan Aggarwal
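The key property claimed here is a shrinking hierarchy: many frame vectors, fewer low-level event vectors, fewer still high-level event vectors. A minimal sketch of that shape, using plain window-averaging as an invented stand-in for the learned aggregation:

```python
def pool(vectors, window):
    # Average consecutive vectors in non-overlapping windows,
    # reducing the sequence length by a factor of `window`.
    out = []
    for i in range(0, len(vectors), window):
        chunk = vectors[i:i + window]
        out.append([sum(dim) / len(chunk) for dim in zip(*chunk)])
    return out

frames = [[float(i), float(i) * 2] for i in range(16)]  # 16 frame feature vectors
low = pool(frames, window=4)    # low-level event representations
high = pool(low, window=2)      # high-level event representations

# the counts shrink at each level, as the abstract requires
assert len(low) < len(frames) and len(high) < len(low)
```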
-
Patent number: 11874902
Abstract: Techniques are disclosed for text conditioned image searching. A methodology implementing the techniques according to an embodiment includes receiving a source image and a text query defining a target image attribute. The method also includes decomposing the source image into image content and style feature vectors and decomposing the text query into text content and style feature vectors, wherein image style is descriptive of image content and text style is descriptive of text content. The method further includes composing a global content feature vector based on the text content feature vector and the image content feature vector and composing a global style feature vector based on the text style feature vector and the image style feature vector. The method further includes identifying a target image that relates to the global content feature vector and the global style feature vector so that the target image relates to the target image attribute.
Type: Grant
Filed: January 28, 2021
Date of Patent: January 16, 2024
Assignee: Adobe Inc.
Inventors: Pinkesh Badjatiya, Surgan Jandial, Pranit Chawla, Mausoom Sarkar, Ayush Chopra
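A toy sketch of the compose-then-retrieve flow described above. The averaging composition, dot-product scoring, and candidate vectors are all invented stand-ins for the learned components; the sketch only shows content and style being composed separately and the joint vector driving retrieval:

```python
def compose(a, b):
    # Element-wise average as a stand-in for the learned composition.
    return [(x + y) / 2 for x, y in zip(a, b)]

def score(query, candidate):
    # Dot product as a stand-in similarity between the composed global
    # vectors and a candidate target image's embedding.
    return sum(q * c for q, c in zip(query, candidate))

img_content, img_style = [1.0, 0.0], [0.0, 1.0]
txt_content, txt_style = [0.0, 1.0], [1.0, 0.0]

# compose global content and global style, then concatenate for retrieval
global_vec = compose(img_content, txt_content) + compose(img_style, txt_style)

candidates = {"a": [0.5, 0.5, 0.5, 0.5], "b": [0.0, 0.0, 0.0, 1.0]}
best = max(candidates, key=lambda k: score(global_vec, candidates[k]))
assert best == "a"
```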
-
Patent number: 11797823
Abstract: Generating a machine learning model that is trained using retrospective loss is described. A retrospective loss system receives an untrained machine learning model and a task for training the model. The retrospective loss system initially trains the model over warm-up iterations using task-specific loss that is determined based on a difference between predictions output by the model during training on input data and a ground truth dataset for the input data. Following the warm-up training iterations, the retrospective loss system continues to train the model using retrospective loss, which is model-agnostic and constrains the model such that a subsequently output prediction is more similar to the ground truth dataset than the previously output prediction. After determining that the model's outputs are within a threshold similarity to the ground truth dataset, the model is output with its current parameters as a trained model.
Type: Grant
Filed: February 18, 2020
Date of Patent: October 24, 2023
Assignee: Adobe Inc.
Inventors: Ayush Chopra, Balaji Krishnamurthy, Mausoom Sarkar, Surgan Jandial
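The warm-up-then-retrospective schedule can be sketched on a single scalar parameter. The loss form below, (1 + κ)·|pred − truth| − κ·|pred − prev|, pulls the prediction toward the ground truth while pushing it away from the previous prediction; the one-parameter model, learning rate, and iteration counts are invented for illustration:

```python
def retrospective_loss(pred, prev_pred, truth, kappa=1.0):
    # Pull toward the ground truth, push away from the previous prediction.
    return (1 + kappa) * abs(pred - truth) - kappa * abs(pred - prev_pred)

w, truth, lr = 0.0, 2.0, 0.1
prev = w                                  # previously output prediction
for step in range(50):
    pred = w
    if step < 10:
        # warm-up iterations: task-specific squared-error loss gradient
        grad = 2 * (pred - truth)
    else:
        # retrospective phase: gradient of the loss above with kappa = 1,
        # computed from the signs of its two absolute-value terms
        g_truth = 1.0 if pred > truth else -1.0
        g_prev = 1.0 if pred > prev else -1.0
        grad = 2 * g_truth - g_prev
    prev = pred
    w -= lr * grad

# the final prediction ends up within a threshold of the ground truth
assert abs(w - truth) < 0.2
```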
-
Publication number: 20230267345
Abstract: Techniques described herein extract form structures from a static form to facilitate making that static form reflowable. A method described herein includes accessing low-level form elements extracted from a static form. The method includes determining, using a first set of prediction models, second-level form elements based on the low-level form elements. Each second-level form element includes a respective one or more low-level form elements. The method further includes determining, using a second set of prediction models, high-level form elements based on the second-level form elements and the low-level form elements. Each high-level form element includes a respective one or more second-level form elements or low-level form elements. The method further includes generating a reflowable form based on the static form by, for each high-level form element, linking together the respective one or more second-level form elements or low-level form elements.Type: Application
Filed: April 18, 2023
Publication date: August 24, 2023
Inventors: Milan Aggarwal, Mausoom Sarkar, Balaji Krishnamurthy
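The three-tier grouping (low-level elements, second-level elements, high-level elements) can be shown with a toy sketch. The element ids and the rule-based grouping below are invented stand-ins for the two sets of trained prediction models:

```python
# Toy low-level elements (invented ids) extracted from a static form.
low = ["word_1", "word_2", "widget_1", "word_3"]

def group_second_level(elems):
    # Stand-in for the first set of prediction models:
    # consecutive words merge into a text run; widgets stand alone.
    groups, run = [], []
    for e in elems:
        if e.startswith("word"):
            run.append(e)
        else:
            if run:
                groups.append({"type": "text_run", "children": run})
                run = []
            groups.append({"type": "widget", "children": [e]})
    if run:
        groups.append({"type": "text_run", "children": run})
    return groups

second = group_second_level(low)
# Stand-in for the second set of prediction models: link every
# second-level element into one high-level form field.
high = [{"type": "field", "children": second}]
assert len(second) == 3 and len(high) == 1
```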
-
Patent number: 11734337
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating tags for an object portrayed in a digital image based on predicted attributes of the object. For example, the disclosed systems can utilize interleaved neural network layers of alternating inception layers and dilated convolution layers to generate a localization feature vector. Based on the localization feature vector, the disclosed systems can generate attribute localization feature embeddings, for example, using some pooling layer such as a global average pooling layer. The disclosed systems can then apply the attribute localization feature embeddings to corresponding attribute group classifiers to generate tags based on predicted attributes. In particular, attribute group classifiers can predict attributes as associated with a query image (e.g., based on a scoring comparison with other potential attributes of an attribute group).
Type: Grant
Filed: June 14, 2022
Date of Patent: August 22, 2023
Assignee: Adobe Inc.
Inventors: Ayush Chopra, Mausoom Sarkar, Jonas Dahl, Hiresh Gupta, Balaji Krishnamurthy, Abhishek Sinha
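Two pieces of this abstract are easy to sketch concretely: global average pooling collapsing a spatial feature map into an embedding, and an attribute group classifier picking the best-scoring attribute within its group. The tiny feature map, the linear scorer, and the "color" group below are invented stand-ins:

```python
def global_average_pool(feature_map):
    # Collapse an H x W x C feature map into a C-dim embedding by
    # averaging every spatial position per channel.
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    emb = [0.0] * c
    for row in feature_map:
        for cell in row:
            for i, v in enumerate(cell):
                emb[i] += v
    return [v / (h * w) for v in emb]

def attribute_group_classifier(embedding, attributes):
    # Score every attribute in the group against the embedding and
    # keep the highest-scoring one as the predicted tag.
    scores = {name: sum(e * wgt for e, wgt in zip(embedding, weights))
              for name, weights in attributes.items()}
    return max(scores, key=scores.get)

fmap = [[[1.0, 0.0], [1.0, 0.0]],
        [[0.8, 0.2], [1.0, 0.0]]]        # toy 2x2 map with 2 channels
emb = global_average_pool(fmap)
color_group = {"red": [1.0, 0.0], "blue": [0.0, 1.0]}
assert attribute_group_classifier(emb, color_group) == "red"
```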
-
Patent number: 11720651
Abstract: Techniques are disclosed for text-conditioned image searching. A methodology implementing the techniques includes decomposing a source image into visual feature vectors associated with different levels of granularity. The method also includes decomposing a text query (defining a target image attribute) into feature vectors associated with different levels of granularity including a global text feature vector. The method further includes generating image-text embeddings based on the visual feature vectors and the text feature vectors to encode information from visual and textual features. The method further includes composing a visio-linguistic representation based on a hierarchical aggregation of the image-text embeddings to encode visual and textual information at multiple levels of granularity.
Type: Grant
Filed: January 28, 2021
Date of Patent: August 8, 2023
Assignee: Adobe Inc.
Inventors: Pinkesh Badjatiya, Surgan Jandial, Pranit Chawla, Mausoom Sarkar, Ayush Chopra
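Unlike the content/style patent above, this one fuses image and text features at several granularity levels and then aggregates across levels. A toy sketch of that shape, with element-wise products and a sum standing in for the learned fusion and hierarchical aggregation (the level names and vectors are invented):

```python
def fuse(img_vec, txt_vec):
    # Stand-in image-text embedding at one granularity level.
    return [i * t for i, t in zip(img_vec, txt_vec)]

levels = {                         # (image vector, text vector) per level
    "coarse": ([1.0, 0.0], [1.0, 1.0]),
    "mid":    ([0.5, 0.5], [1.0, 0.0]),
    "fine":   ([0.2, 0.8], [0.0, 1.0]),
}
embeddings = [fuse(i, t) for i, t in levels.values()]

# hierarchical aggregation: combine the per-level embeddings into a
# single visio-linguistic representation
visio_linguistic = [sum(dim) for dim in zip(*embeddings)]
assert visio_linguistic == [1.5, 0.8]
```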
-
Patent number: 11657306
Abstract: Techniques described herein extract form structures from a static form to facilitate making that static form reflowable. A method described herein includes accessing low-level form elements extracted from a static form. The method includes determining, using a first set of prediction models, second-level form elements based on the low-level form elements. Each second-level form element includes a respective one or more low-level form elements. The method further includes determining, using a second set of prediction models, high-level form elements based on the second-level form elements and the low-level form elements. Each high-level form element includes a respective one or more second-level form elements or low-level form elements. The method further includes generating a reflowable form based on the static form by, for each high-level form element, linking together the respective one or more second-level form elements or low-level form elements.
Type: Grant
Filed: June 17, 2020
Date of Patent: May 23, 2023
Assignee: Adobe Inc.
Inventors: Milan Aggarwal, Mausoom Sarkar, Balaji Krishnamurthy
-
Publication number: 20230154186
Abstract: Systems and methods for video processing are described. Embodiments of the present disclosure generate a plurality of image feature vectors corresponding to a plurality of frames of a video; generate a plurality of low-level event representation vectors based on the plurality of image feature vectors, wherein a number of the low-level event representation vectors is less than a number of the image feature vectors; generate a plurality of high-level event representation vectors based on the plurality of low-level event representation vectors, wherein a number of the high-level event representation vectors is less than the number of the low-level event representation vectors; and identify a plurality of high-level events occurring in the video based on the plurality of high-level event representation vectors.
Type: Application
Filed: November 16, 2021
Publication date: May 18, 2023
Inventors: Sumegh Roychowdhury, Sumedh A. Sontakke, Mausoom Sarkar, Nikaash Puri, Pinkesh Badjatiya, Milan Aggarwal
-
Publication number: 20230134460
Abstract: In implementations of refining element associations for form structure extraction, a computing device implements a structure system to receive estimate data describing estimated associations of elements included in a form and a digital image depicting the form. An image patch is extracted from the digital image, and the image patch depicts a pair of elements of the elements included in the form. The structure system encodes an indication of whether the pair of elements have an association of the estimated associations. An indication is generated that the pair of elements have a particular association based at least partially on the encoded indication, bounding boxes of the pair of elements, and text depicted in the image patch.
Type: Application
Filed: November 2, 2021
Publication date: May 4, 2023
Applicant: Adobe Inc.
Inventors: Shripad Deshmukh, Milan Aggarwal, Mausoom Sarkar, Hiresh Gupta
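The refinement step combines three signals: the encoded prior estimate, the pair's bounding boxes, and the text visible in the shared image patch. A toy sketch of that feature packing and a hand-written decision rule standing in for the trained refinement model (boxes, threshold, and text are all invented):

```python
def encode_pair(estimate, box_a, box_b, patch_text):
    # Pack the estimated association, the two bounding boxes
    # (x, y, w, h), and the patch text length into one feature vector.
    return [1.0 if estimate else 0.0, *box_a, *box_b, float(len(patch_text))]

def predict_association(features, max_gap=0.5):
    # Stand-in for the refinement model: keep the estimated association
    # unless the two boxes are vertically far apart.
    estimated = features[0] == 1.0
    gap = abs(features[2] - features[6])   # y of box_a vs. y of box_b
    return estimated and gap < max_gap

feats = encode_pair(True,
                    (0.1, 0.20, 0.3, 0.05),   # label box
                    (0.5, 0.22, 0.3, 0.05),   # widget box, nearly same row
                    "Name:")
assert predict_association(feats) is True
```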
-
Patent number: 11600091
Abstract: Techniques for document segmentation. In an example, a document processing application segments an electronic document image into strips. A first strip overlaps a second strip. The application generates a first mask indicating one or more elements and element types in the first strip by applying a predictive model network to image content in the first strip and a prior mask generated from image content of the first strip. The application generates a second mask indicating one or more elements and element types in the second strip by applying the predictive model network to image content in the second strip and the first mask. The application computes, from a combined mask derived from the first mask and the second mask, an output electronic document that identifies elements in the electronic document and the respective element types.
Type: Grant
Filed: May 21, 2021
Date of Patent: March 7, 2023
Assignee: Adobe Inc.
Inventors: Mausoom Sarkar, Arneh Jain
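The distinctive step here is conditioning each strip's mask on the mask of the previous strip, then merging the per-strip masks. A toy sketch with a threshold rule standing in for the predictive model network (pixel values and the one-row "strips" are invented):

```python
def segment_strip(strip, prior_mask):
    # Stand-in predictive model: mark a position as an element if its
    # pixel is dark or the prior mask already marked that position.
    return [1 if px < 0.5 or pr == 1 else 0 for px, pr in zip(strip, prior_mask)]

strip_1 = [0.1, 0.9, 0.9]                    # toy overlapping strips
strip_2 = [0.2, 0.9, 0.9]

mask_1 = segment_strip(strip_1, [0, 0, 0])   # first strip, empty prior mask
mask_2 = segment_strip(strip_2, mask_1)      # second strip conditioned on mask_1

# combined mask derived from both per-strip masks
combined = [max(a, b) for a, b in zip(mask_1, mask_2)]
assert combined == [1, 0, 0]
```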
-
Patent number: 11593552
Abstract: The present disclosure relates to generating fillable digital forms corresponding to paper forms using a form conversion neural network to determine low-level and high-level semantic characteristics of the paper forms. For example, one or more embodiments applies a digitized paper form to an encoder that outputs feature maps to a reconstruction decoder, a low-level semantic decoder, and one or more high-level semantic decoders. The reconstruction decoder generates a reconstructed layout of the digitized paper form. The low-level and high-level semantic decoders determine low-level and high-level semantic characteristics of each pixel of the digitized paper form, which provide a probability of the element type to which the pixel belongs. The semantic decoders then classify each pixel and generate corresponding semantic segmentation maps based on those probabilities. The system then generates a fillable digital form using the reconstructed layout and the semantic segmentation maps.
Type: Grant
Filed: March 21, 2018
Date of Patent: February 28, 2023
Assignee: Adobe Inc.
Inventor: Mausoom Sarkar
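The per-pixel classification step described above amounts to an argmax over element-type probabilities at each pixel. A toy sketch of that step (the class names and probability maps are invented; the decoders producing them are not modeled):

```python
def classify_pixels(prob_maps):
    # prob_maps: element type -> per-pixel probability map (flattened).
    # Assign each pixel the element type with the highest probability,
    # producing a semantic segmentation map.
    n_pixels = len(next(iter(prob_maps.values())))
    return [max(prob_maps, key=lambda cls: prob_maps[cls][i])
            for i in range(n_pixels)]

prob_maps = {
    "text_field": [0.7, 0.2, 0.1],
    "checkbox":   [0.2, 0.7, 0.3],
    "background": [0.1, 0.1, 0.6],
}
segmentation = classify_pixels(prob_maps)
assert segmentation == ["text_field", "checkbox", "background"]
```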
-
Publication number: 20220309093
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating tags for an object portrayed in a digital image based on predicted attributes of the object. For example, the disclosed systems can utilize interleaved neural network layers of alternating inception layers and dilated convolution layers to generate a localization feature vector. Based on the localization feature vector, the disclosed systems can generate attribute localization feature embeddings, for example, using some pooling layer such as a global average pooling layer. The disclosed systems can then apply the attribute localization feature embeddings to corresponding attribute group classifiers to generate tags based on predicted attributes. In particular, attribute group classifiers can predict attributes as associated with a query image (e.g., based on a scoring comparison with other potential attributes of an attribute group).
Type: Application
Filed: June 14, 2022
Publication date: September 29, 2022
Inventors: Ayush Chopra, Mausoom Sarkar, Jonas Dahl, Hiresh Gupta, Balaji Krishnamurthy, Abhishek Sinha
-
Publication number: 20220245391
Abstract: Techniques are disclosed for text-conditioned image searching. A methodology implementing the techniques includes decomposing a source image into visual feature vectors associated with different levels of granularity. The method also includes decomposing a text query (defining a target image attribute) into feature vectors associated with different levels of granularity including a global text feature vector. The method further includes generating image-text embeddings based on the visual feature vectors and the text feature vectors to encode information from visual and textual features. The method further includes composing a visio-linguistic representation based on a hierarchical aggregation of the image-text embeddings to encode visual and textual information at multiple levels of granularity.
Type: Application
Filed: January 28, 2021
Publication date: August 4, 2022
Applicant: Adobe Inc.
Inventors: Pinkesh Badjatiya, Surgan Jandial, Pranit Chawla, Mausoom Sarkar, Ayush Chopra
-
Publication number: 20220237406
Abstract: Techniques are disclosed for text conditioned image searching. A methodology implementing the techniques according to an embodiment includes receiving a source image and a text query defining a target image attribute. The method also includes decomposing the source image into image content and style feature vectors and decomposing the text query into text content and style feature vectors, wherein image style is descriptive of image content and text style is descriptive of text content. The method further includes composing a global content feature vector based on the text content feature vector and the image content feature vector and composing a global style feature vector based on the text style feature vector and the image style feature vector. The method further includes identifying a target image that relates to the global content feature vector and the global style feature vector so that the target image relates to the target image attribute.
Type: Application
Filed: January 28, 2021
Publication date: July 28, 2022
Applicant: Adobe Inc.
Inventors: Pinkesh Badjatiya, Surgan Jandial, Pranit Chawla, Mausoom Sarkar, Ayush Chopra
-
Patent number: 11386144
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating tags for an object portrayed in a digital image based on predicted attributes of the object. For example, the disclosed systems can utilize interleaved neural network layers of alternating inception layers and dilated convolution layers to generate a localization feature vector. Based on the localization feature vector, the disclosed systems can generate attribute localization feature embeddings, for example, using some pooling layer such as a global average pooling layer. The disclosed systems can then apply the attribute localization feature embeddings to corresponding attribute group classifiers to generate tags based on predicted attributes. In particular, attribute group classifiers can predict attributes as associated with a query image (e.g., based on a scoring comparison with other potential attributes of an attribute group).
Type: Grant
Filed: September 9, 2019
Date of Patent: July 12, 2022
Assignee: Adobe Inc.
Inventors: Ayush Chopra, Mausoom Sarkar, Jonas Dahl, Hiresh Gupta, Balaji Krishnamurthy, Abhishek Sinha
-
Publication number: 20210397986
Abstract: Techniques described herein extract form structures from a static form to facilitate making that static form reflowable. A method described herein includes accessing low-level form elements extracted from a static form. The method includes determining, using a first set of prediction models, second-level form elements based on the low-level form elements. Each second-level form element includes a respective one or more low-level form elements. The method further includes determining, using a second set of prediction models, high-level form elements based on the second-level form elements and the low-level form elements. Each high-level form element includes a respective one or more second-level form elements or low-level form elements. The method further includes generating a reflowable form based on the static form by, for each high-level form element, linking together the respective one or more second-level form elements or low-level form elements.
Type: Application
Filed: June 17, 2020
Publication date: December 23, 2021
Inventors: Milan Aggarwal, Mausoom Sarkar, Balaji Krishnamurthy
-
Publication number: 20210279461
Abstract: Techniques for document segmentation. In an example, a document processing application segments an electronic document image into strips. A first strip overlaps a second strip. The application generates a first mask indicating one or more elements and element types in the first strip by applying a predictive model network to image content in the first strip and a prior mask generated from image content of the first strip. The application generates a second mask indicating one or more elements and element types in the second strip by applying the predictive model network to image content in the second strip and the first mask. The application computes, from a combined mask derived from the first mask and the second mask, an output electronic document that identifies elements in the electronic document and the respective element types.
Type: Application
Filed: May 21, 2021
Publication date: September 9, 2021
Inventors: Mausoom Sarkar, Arneh Jain
-
Publication number: 20210256387
Abstract: Generating a machine learning model that is trained using retrospective loss is described. A retrospective loss system receives an untrained machine learning model and a task for training the model. The retrospective loss system initially trains the model over warm-up iterations using task-specific loss that is determined based on a difference between predictions output by the model during training on input data and a ground truth dataset for the input data. Following the warm-up training iterations, the retrospective loss system continues to train the model using retrospective loss, which is model-agnostic and constrains the model such that a subsequently output prediction is more similar to the ground truth dataset than the previously output prediction. After determining that the model's outputs are within a threshold similarity to the ground truth dataset, the model is output with its current parameters as a trained model.
Type: Application
Filed: February 18, 2020
Publication date: August 19, 2021
Applicant: Adobe Inc.
Inventors: Ayush Chopra, Balaji Krishnamurthy, Mausoom Sarkar, Surgan Jandial
-
Patent number: 11042734
Abstract: Techniques for document segmentation. In an example, a document processing application segments an electronic document image into strips. A first strip overlaps a second strip. The application generates a first mask indicating one or more elements and element types in the first strip by applying a predictive model network to image content in the first strip and a prior mask generated from image content of the first strip. The application generates a second mask indicating one or more elements and element types in the second strip by applying the predictive model network to image content in the second strip and the first mask. The application computes, from a combined mask derived from the first mask and the second mask, an output electronic document that identifies elements in the electronic document and the respective element types.
Type: Grant
Filed: August 13, 2019
Date of Patent: June 22, 2021
Assignee: ADOBE INC.
Inventors: Mausoom Sarkar, Arneh Jain