Patents by Inventor Mausoom Sarkar

Mausoom Sarkar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11042734
    Abstract: Techniques for document segmentation. In an example, a document processing application segments an electronic document image into strips. A first strip overlaps a second strip. The application generates a first mask indicating one or more elements and element types in the first strip by applying a predictive model network to image content in the first strip and a prior mask generated from image content of the first strip. The application generates a second mask indicating one or more elements and element types in the second strip by applying the predictive model network to image content in the second strip and the first mask. The application computes, from a combined mask derived from the first mask and the second mask, an output electronic document that identifies elements in the electronic document and the respective element types.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: June 22, 2021
    Assignee: Adobe Inc.
    Inventors: Mausoom Sarkar, Arneh Jain
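    Sketch: a minimal, illustrative version of the strip-wise loop from the abstract above, assuming a generic `model(strip, prior_mask)` callable in place of the predictive model network; the vote-averaging over overlaps is a simplification, not the patented method:
```python
import numpy as np

def segment_document(page, model, strip_h=256, overlap=64):
    """Slide overlapping horizontal strips down the page; each model call
    sees the strip's pixels plus the mask predicted for the prior strip."""
    h, w = page.shape
    step = strip_h - overlap
    prior = np.zeros((strip_h, w), dtype=np.float32)   # empty prior mask for the first strip
    votes = np.zeros((h, w), dtype=np.float32)
    counts = np.zeros((h, w), dtype=np.float32)
    for top in range(0, h, step):
        strip = page[top:top + strip_h]
        pad = strip_h - strip.shape[0]                 # the last strip may be short
        if pad > 0:
            strip = np.pad(strip, ((0, pad), (0, 0)))
        mask = model(strip, prior)                     # per-pixel element-type scores
        valid = strip_h - pad
        votes[top:top + valid] += mask[:valid]
        counts[top:top + valid] += 1.0
        prior = mask                                   # carried into the next, overlapping strip
        if top + strip_h >= h:
            break
    return votes / np.maximum(counts, 1.0)             # combined mask, averaged over overlaps

# Toy usage with a thresholding stand-in for the predictive model network:
page = np.random.rand(1000, 800).astype(np.float32)
combined = segment_document(page, lambda s, p: (s > 0.5).astype(np.float32))
```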
  • Patent number: 11017016
    Abstract: A method for clustering product media files is provided. The method includes dividing each media file corresponding to one or more products into a plurality of tiles. Each media file includes an image or a video. Feature vectors are computed for each tile of each media file. One or more patch clusters are generated from the plurality of tiles; each patch cluster contains tiles whose feature vectors are similar to one another. The feature vectors of each media file are compared with the feature vectors of each patch cluster, and product groups are generated based on the comparison. All media files with similar comparison outputs are grouped into one product group, and each product group includes one or more media files for one product. An apparatus for substantially performing the method described herein is also provided.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: May 25, 2021
    Assignee: Adobe Inc.
    Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
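    Sketch: a toy rendering of the tiling/patch-cluster/grouping pipeline above, using plain k-means over raw pixel tiles as a stand-in for the feature-vector computation; tile size, cluster count, and threshold are illustrative:
```python
import numpy as np

def tiles_of(image, size=32):
    """Split a grayscale image into non-overlapping size x size tiles."""
    h, w = image.shape
    return np.stack([image[y:y + size, x:x + size].ravel()
                     for y in range(0, h - size + 1, size)
                     for x in range(0, w - size + 1, size)])

def group_products(media, n_clusters=8, sim_thresh=0.9, iters=10, seed=0):
    """Cluster all tiles into patch clusters, describe each media file by
    how its tiles distribute over the clusters, and merge files with
    similar distributions into one product group."""
    rng = np.random.default_rng(seed)
    all_tiles = np.concatenate([tiles_of(m) for m in media]).astype(np.float32)
    centers = all_tiles[rng.choice(len(all_tiles), n_clusters, replace=False)]
    for _ in range(iters):                                   # build the patch clusters
        assign = ((all_tiles[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for k in range(n_clusters):
            if (assign == k).any():
                centers[k] = all_tiles[assign == k].mean(0)
    labels, reps = [-1] * len(media), []                     # one profile per group
    for i, m in enumerate(media):
        t = tiles_of(m).astype(np.float32)
        near = ((t[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        prof = np.bincount(near, minlength=n_clusters) / len(t)  # cluster-usage profile
        for g, rep in enumerate(reps):
            cos = prof @ rep / (np.linalg.norm(prof) * np.linalg.norm(rep) + 1e-8)
            if cos > sim_thresh:                             # similar profile -> same product
                labels[i] = g
                break
        else:
            reps.append(prof)
            labels[i] = len(reps) - 1
    return labels

media = [np.random.rand(64, 64) for _ in range(4)]
print(group_products(media))   # one product-group label per media file
```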
  • Publication number: 20210073267
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating tags for an object portrayed in a digital image based on predicted attributes of the object. For example, the disclosed systems can utilize interleaved neural network layers of alternating inception layers and dilated convolution layers to generate a localization feature vector. Based on the localization feature vector, the disclosed systems can generate attribute localization feature embeddings, for example, using a pooling layer such as a global average pooling layer. The disclosed systems can then apply the attribute localization feature embeddings to corresponding attribute group classifiers to generate tags based on predicted attributes. In particular, the attribute group classifiers can predict which attributes are associated with a query image (e.g., based on a scoring comparison with other potential attributes in an attribute group).
    Type: Application
    Filed: September 9, 2019
    Publication date: March 11, 2021
    Applicant: Adobe Inc.
    Inventors: Ayush Chopra, Mausoom Sarkar, Jonas Dahl, Hiresh Gupta, Balaji Krishnamurthy, Abhishek Sinha
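    Sketch: a hedged PyTorch reading of the architecture above; the attribute groups, channel widths, and exact interleaving are assumptions, not the disclosed network:
```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Small inception-style block: parallel 1x1 and 3x3 branches, concatenated."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, c_out // 2, kernel_size=1)
        self.b3 = nn.Conv2d(c_in, c_out // 2, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(torch.cat([self.b1(x), self.b3(x)], dim=1))

class AttributeTagger(nn.Module):
    """Interleaved inception and dilated-convolution layers produce a
    localization feature map; global average pooling over a per-group head
    yields an attribute-localization embedding scored by that group's
    classifier."""
    def __init__(self, groups=None, c=64):
        super().__init__()
        groups = groups or {"color": 12, "pattern": 8, "sleeve": 4}  # hypothetical groups
        self.backbone = nn.Sequential(
            InceptionBlock(3, c),
            nn.Conv2d(c, c, 3, padding=2, dilation=2), nn.ReLU(),   # dilated conv layer
            InceptionBlock(c, c),
            nn.Conv2d(c, c, 3, padding=4, dilation=4), nn.ReLU(),   # wider dilation
        )
        self.heads = nn.ModuleDict({g: nn.Conv2d(c, c, 1) for g in groups})
        self.cls = nn.ModuleDict({g: nn.Linear(c, n) for g, n in groups.items()})
        self.pool = nn.AdaptiveAvgPool2d(1)                         # global average pooling

    def forward(self, x):
        feats = self.backbone(x)
        scores = {}
        for g in self.heads:
            emb = self.pool(self.heads[g](feats)).flatten(1)  # attribute-localization embedding
            scores[g] = self.cls[g](emb)                      # scores within the attribute group
        return scores

tagger = AttributeTagger()
scores = tagger(torch.randn(1, 3, 224, 224))
tags = {g: s.argmax(1) for g, s in scores.items()}  # highest-scoring attribute per group
```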
  • Publication number: 20210049357
    Abstract: Techniques for document segmentation. In an example, a document processing application segments an electronic document image into strips. A first strip overlaps a second strip. The application generates a first mask indicating one or more elements and element types in the first strip by applying a predictive model network to image content in the first strip and a prior mask generated from image content of the first strip. The application generates a second mask indicating one or more elements and element types in the second strip by applying the predictive model network to image content in the second strip and the first mask. The application computes, from a combined mask derived from the first mask and the second mask, an output electronic document that identifies elements in the electronic document and the respective element types.
    Type: Application
    Filed: August 13, 2019
    Publication date: February 18, 2021
    Inventors: Mausoom Sarkar, Arneh Jain
  • Publication number: 20210042625
    Abstract: Methods and systems are provided for facilitating the creation and utilization of a transformation function system capable of providing network-agnostic performance improvements. The transformation function system receives a representation from a task neural network. The representation can be input into a composite function neural network of the transformation function system. A learned composite function can be generated using the composite function neural network. The composite function can be constructed specifically for the task neural network based on the input representation. The learned composite function can be applied to a feature embedding of the task neural network to transform the feature embedding. Transforming the feature embedding can optimize the output of the task neural network.
    Type: Application
    Filed: August 7, 2019
    Publication date: February 11, 2021
    Inventors: Ayush Chopra, Abhishek Sinha, Hiresh Gupta, Mausoom Sarkar, Kumar Ayush, Balaji Krishnamurthy
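    Sketch: one plausible PyTorch reading of a composite function neural network; the basis activations and the parameterization g(z) = Σ_k a_k·f_k(w_k·z + b_k) + c_k are illustrative choices, not the disclosed construction:
```python
import torch
import torch.nn as nn

class CompositeFunctionNet(nn.Module):
    """Given a representation taken from the task network, predict the
    parameters of an elementwise composite function and apply it to the
    task network's feature embedding."""
    def __init__(self, rep_dim, n_basis=4):
        super().__init__()
        self.acts = [torch.tanh, torch.relu, torch.sin, torch.sigmoid][:n_basis]
        self.param_net = nn.Sequential(
            nn.Linear(rep_dim, 64), nn.ReLU(),
            nn.Linear(64, 4 * len(self.acts)))        # (w, b, a, c) per basis function

    def forward(self, rep, embedding):
        p = self.param_net(rep).view(rep.shape[0], len(self.acts), 4)
        out = torch.zeros_like(embedding)
        for k, f in enumerate(self.acts):
            w, b, a, c = (p[:, k, j:j + 1] for j in range(4))  # [B, 1], broadcast over dims
            out = out + a * f(w * embedding + b) + c           # learned composite term
        return out

net = CompositeFunctionNet(rep_dim=16)
rep = torch.randn(8, 16)          # representation from the task neural network
emb = torch.randn(8, 128)         # the task network's feature embedding
transformed = net(rep, emb)       # transformed embedding, same shape as emb
```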
  • Publication number: 20200372560
    Abstract: A search system provides search results with images of products based on associations of primary products and secondary products from product image sets. The search system analyzes a product image set containing multiple images to determine a primary product and secondary products. Information associating the primary and secondary products is stored in a search index. When the search system receives a query image containing a search product, the search index is queried using the search product to identify search result images based on the associations of products in the search index, and the result images are provided in response to the query image.
    Type: Application
    Filed: May 20, 2019
    Publication date: November 26, 2020
    Inventors: Jonas Dahl, Mausoom Sarkar, Hiresh Gupta, Balaji Krishnamurthy, Ayush Chopra, Abhishek Sinha
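    Sketch: a toy in-memory stand-in for the index and query flow above; recognizing the search product inside a query image is assumed to happen upstream, and all names are illustrative:
```python
from collections import defaultdict

class ProductSearchIndex:
    """Each analyzed image set contributes a primary product, its secondary
    products, and the association between them; querying with a product
    returns result images via those stored associations."""
    def __init__(self):
        self.assoc = defaultdict(set)     # product id -> associated product ids
        self.images = defaultdict(list)   # product id -> image ids featuring it

    def add_image_set(self, image_id, primary, secondary):
        self.images[primary].append(image_id)
        for s in secondary:
            self.images[s].append(image_id)
            self.assoc[primary].add(s)    # store the primary/secondary association
            self.assoc[s].add(primary)

    def query(self, product_id):
        hits = list(self.images[product_id])
        for other in sorted(self.assoc[product_id]):
            hits.extend(self.images[other])
        return list(dict.fromkeys(hits))  # de-duplicate, preserving order

index = ProductSearchIndex()
index.add_image_set("img-1", primary="jacket-42", secondary=["scarf-7", "boots-3"])
index.add_image_set("img-2", primary="scarf-7", secondary=["gloves-9"])
print(index.query("jacket-42"))   # images of the jacket plus its associated products
```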
  • Patent number: 10831818
    Abstract: Digital image search training techniques and machine-learning architectures are described. In one example, a query digital image is received by a service provider system and used to select at least one positive sample digital image, e.g., having the same product ID. A plurality of negative sample digital images is also selected by the service provider system based on the query digital image, e.g., having different product IDs. The at least one positive sample digital image and the plurality of negative samples are then aggregated by the service provider system into a single aggregated digital image. At least one neural network is then trained by the service provider system using a loss function based on a feature comparison between the query digital image and samples from the aggregated digital image in a single pass.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: November 10, 2020
    Assignee: Adobe Inc.
    Inventors: Mausoom Sarkar, Hiresh Gupta, Abhishek Sinha
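    Sketch: a minimal single-pass training step in PyTorch, aggregating samples along the image width and using a triplet-style margin loss as one plausible choice for the feature-comparison loss; the encoder is a stand-in:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Fully convolutional encoder; because the aggregated image keeps
    samples side by side, per-sample features can be pooled back out of
    one shared feature map after a single forward pass."""
    def __init__(self, dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, x, n_tiles=1):
        fmap = self.conv(x)
        pooled = F.adaptive_avg_pool2d(fmap, (1, n_tiles))    # one cell per sample tile
        return pooled.squeeze(2).transpose(1, 2)              # [B, n_tiles, D]

def single_pass_loss(enc, query, positive, negatives, margin=0.2):
    """Aggregate positive and negatives into one image, embed everything
    in a single pass, and compare features against the query embedding."""
    agg = torch.cat([positive] + negatives, dim=-1)           # the aggregated digital image
    q = F.normalize(enc(query.unsqueeze(0))[0, 0], dim=0)     # query embedding
    feats = F.normalize(enc(agg.unsqueeze(0), n_tiles=1 + len(negatives))[0], dim=1)
    d_pos = 1 - feats[0] @ q                                  # distance to the positive
    d_neg = 1 - feats[1:] @ q                                 # distances to the negatives
    return F.relu(d_pos - d_neg + margin).mean()

enc = Encoder()
loss = single_pass_loss(enc, torch.randn(3, 64, 64), torch.randn(3, 64, 64),
                        [torch.randn(3, 64, 64) for _ in range(4)])
loss.backward()   # trains the encoder from the single aggregated pass
```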
  • Patent number: 10810633
    Abstract: Embodiments of the present invention provide systems and methods for automatically generating a shoppable video. A video is parsed into one or more scenes. Products and their corresponding product information are automatically associated with the one or more scenes. The shoppable video is then generated using the associated products and corresponding product information, such that each product is visible in the shoppable video during the scene in which it is found.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: October 20, 2020
    Assignee: Adobe Inc.
    Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
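    Sketch: a toy pipeline for the abstract above, with naive frame-difference shot detection and a caller-supplied product detector; both are stand-ins for the patented components:
```python
import numpy as np

def detect_scenes(frames, thresh=30.0):
    """Naive shot detection: start a new scene when the mean absolute
    difference between consecutive frames exceeds a threshold."""
    scenes, start = [], 0
    for i in range(1, len(frames)):
        if np.abs(frames[i].astype(float) - frames[i - 1].astype(float)).mean() > thresh:
            scenes.append((start, i))
            start = i
    scenes.append((start, len(frames)))
    return scenes

def build_shoppable_video(frames, detect_products):
    """Associate detected products (and their metadata) with each scene so
    a player can surface them while that scene is on screen."""
    overlay = []
    for start, end in detect_scenes(frames):
        mid = frames[(start + end) // 2]          # sample a representative frame
        products = detect_products(mid)           # e.g. [{"id": "...", "price": ...}]
        overlay.append({"scene": (start, end), "products": products})
    return overlay

frames = [np.full((48, 64), v, np.uint8) for v in (10, 12, 200, 205)]  # two "scenes"
print(build_shoppable_video(frames, lambda f: [{"id": "demo-product"}]))
```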
  • Patent number: 10755199
    Abstract: An introspection network is a machine-learned neural network that accelerates training of other neural networks. The introspection network receives a weight history for each of a plurality of weights from a current training step for a target neural network. A weight history includes at least four values for the weight that are obtained during training of the target neural network up to the current step. The introspection network then provides, for each of the plurality of weights, a respective predicted value, based on the weight history. The predicted value for a weight represents a value for the weight in a future training step for the target neural network. Thus, the predicted value represents a jump in the training steps of the target neural network, which reduces the training time of the target neural network. The introspection network then sets each of the plurality of weights to its respective predicted value.
    Type: Grant
    Filed: May 30, 2017
    Date of Patent: August 25, 2020
    Assignee: Adobe Inc.
    Inventors: Mausoom Sarkar, Balaji Krishnamurthy, Abhishek Sinha, Aahitagni Mukherjee
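    Sketch: a minimal PyTorch introspection network and weight "jump"; the four-value history and per-weight predictor follow the abstract, while the MLP and sizes are illustrative:
```python
import torch
import torch.nn as nn

class IntrospectionNet(nn.Module):
    """Maps a short history of a weight's values (four samples from the
    target network's training so far) to a predicted future value."""
    def __init__(self, history=4):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(history, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, histories):            # [num_weights, history]
        return self.mlp(histories).squeeze(-1)

@torch.no_grad()
def jump_ahead(target_net, histories, introspection):
    """Set every weight of the target network to its predicted future
    value -- the 'jump' over training steps described above."""
    predicted = introspection(histories)     # one predicted value per weight
    offset = 0
    for p in target_net.parameters():
        n = p.numel()
        p.copy_(predicted[offset:offset + n].view_as(p))
        offset += n

target = nn.Linear(8, 2)                     # stand-in target network
n_weights = sum(p.numel() for p in target.parameters())
histories = torch.randn(n_weights, 4)        # 4 checkpointed values per weight
jump_ahead(target, histories, IntrospectionNet())
```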
  • Publication number: 20200134056
    Abstract: Digital image search training techniques and machine-learning architectures are described. In one example, a query digital image is received by a service provider system and used to select at least one positive sample digital image, e.g., having the same product ID. A plurality of negative sample digital images is also selected by the service provider system based on the query digital image, e.g., having different product IDs. The at least one positive sample digital image and the plurality of negative samples are then aggregated by the service provider system into a single aggregated digital image. At least one neural network is then trained by the service provider system using a loss function based on a feature comparison between the query digital image and samples from the aggregated digital image in a single pass.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Adobe Inc.
    Inventors: Mausoom Sarkar, Hiresh Gupta, Abhishek Sinha
  • Publication number: 20190294661
    Abstract: The present disclosure relates to generating fillable digital forms corresponding to paper forms using a form conversion neural network to determine low-level and high-level semantic characteristics of the paper forms. For example, one or more embodiments apply a digitized paper form to an encoder that outputs feature maps to a reconstruction decoder, a low-level semantic decoder, and one or more high-level semantic decoders. The reconstruction decoder generates a reconstructed layout of the digitized paper form. The low-level and high-level semantic decoders determine low-level and high-level semantic characteristics of each pixel of the digitized paper form, which provide a probability of the element type to which the pixel belongs. The semantic decoders then classify each pixel and generate corresponding semantic segmentation maps based on those probabilities. The system then generates a fillable digital form using the reconstructed layout and the semantic segmentation maps.
    Type: Application
    Filed: March 21, 2018
    Publication date: September 26, 2019
    Inventor: Mausoom Sarkar
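    Sketch: a small PyTorch encoder/multi-decoder layout matching the abstract's structure; the class counts and layer sizes are assumptions:
```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU())

class FormConversionNet(nn.Module):
    """Shared encoder feeding a reconstruction decoder plus per-pixel
    semantic decoders (low-level, e.g. text/widget/line; high-level,
    e.g. field groupings), as in the abstract above."""
    def __init__(self, low_classes=4, high_classes=6):
        super().__init__()
        self.encoder = nn.Sequential(conv_block(1, 32), nn.MaxPool2d(2), conv_block(32, 64))
        up = lambda c_out: nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, c_out, 3, padding=1))
        self.reconstruct = up(1)              # reconstructed form layout
        self.low_sem = up(low_classes)        # low-level per-pixel class logits
        self.high_sem = up(high_classes)      # high-level per-pixel class logits

    def forward(self, x):
        f = self.encoder(x)
        return self.reconstruct(f), self.low_sem(f), self.high_sem(f)

net = FormConversionNet()
recon, low, high = net(torch.randn(1, 1, 128, 128))
labels = low.argmax(dim=1)   # per-pixel element type from the class probabilities
```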
  • Publication number: 20190287139
    Abstract: Embodiments of the present invention provide systems and methods for automatically generating a shoppable video. A video is parsed into one or more scenes. Products and their corresponding product information are automatically associated with the one or more scenes. The shoppable video is then generated using the associated products and corresponding product information, such that each product is visible in the shoppable video during the scene in which it is found.
    Type: Application
    Filed: June 3, 2019
    Publication date: September 19, 2019
    Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
  • Patent number: 10354290
    Abstract: Embodiments of the present invention provide systems and methods for automatically generating a shoppable video. A video is parsed into one or more scenes. Products and their corresponding product information are automatically associated with the one or more scenes. The shoppable video is then generated using the associated products and corresponding product information, such that each product is visible in the shoppable video during the scene in which it is found.
    Type: Grant
    Filed: June 16, 2015
    Date of Patent: July 16, 2019
    Assignee: Adobe Inc.
    Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
  • Patent number: 10268883
    Abstract: A method and system for detecting and extracting accurate and precise structure in documents. A high-resolution image of a document is segmented into a set of tiles. Each tile is processed by a convolutional network and subsequently by a set of recurrent networks for each row and column. A global-lookup process is disclosed that allows the recurrent neural networks to consider "future" information required for accurate assessment. Use of the high-resolution image allows for precise and accurate feature extraction, while segmentation into tiles keeps processing of the high-resolution image tractable within reasonable computational resource bounds.
    Type: Grant
    Filed: August 10, 2017
    Date of Patent: April 23, 2019
    Assignee: Adobe Inc.
    Inventors: Mausoom Sarkar, Balaji Krishnamurthy
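    Sketch: a compact PyTorch version of the tile-convolution-plus-row/column-recurrence design; bidirectional LSTMs stand in for the patent's global-lookup of "future" context, and all sizes are illustrative:
```python
import torch
import torch.nn as nn

class TileStructureNet(nn.Module):
    """Per-tile conv features followed by recurrent passes over each row
    and each column of the tile grid; bidirectionality is a simple
    stand-in for looking up 'future' tiles in both scan directions."""
    def __init__(self, tile=32, dim=64):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
                                  nn.AdaptiveAvgPool2d(1))
        self.row_rnn = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        self.col_rnn = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        self.tile = tile

    def forward(self, page):                    # page: [1, H, W], H and W multiples of tile
        t = self.tile
        rows, cols = page.shape[1] // t, page.shape[2] // t
        tiles = page.unfold(1, t, t).unfold(2, t, t)         # [1, rows, cols, t, t]
        tiles = tiles.reshape(rows * cols, 1, t, t)
        feats = self.conv(tiles).flatten(1).view(rows, cols, -1)
        row_out, _ = self.row_rnn(feats)                     # scan each row of tiles
        col_out, _ = self.col_rnn(feats.transpose(0, 1))     # scan each column of tiles
        return row_out + col_out.transpose(0, 1)             # fused per-tile structure features

net = TileStructureNet()
out = net(torch.rand(1, 128, 96))   # 4x3 grid of tiles -> features of shape [4, 3, 64]
```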
  • Publication number: 20190050640
    Abstract: A method and system for detecting and extracting accurate and precise structure in documents. A high-resolution image of a document is segmented into a set of tiles. Each tile is processed by a convolutional network and subsequently by a set of recurrent networks for each row and column. A global-lookup process is disclosed that allows the recurrent neural networks to consider "future" information required for accurate assessment. Use of the high-resolution image allows for precise and accurate feature extraction, while segmentation into tiles keeps processing of the high-resolution image tractable within reasonable computational resource bounds.
    Type: Application
    Filed: August 10, 2017
    Publication date: February 14, 2019
    Applicant: Adobe Systems Incorporated
    Inventors: Mausoom Sarkar, Balaji Krishnamurthy
  • Patent number: 10152655
    Abstract: Systems and methods are disclosed herein for automatically identifying a query object within a visual medium. The technique generally involves receiving as input to a neural network a query object and a visual medium including the query object. The technique also involves generating, by the neural network, representations of the query object and the visual medium defining features of the query object and the visual medium. The technique also involves generating, by the neural network, a heat map using the representations. The heat map identifies a location of pixels corresponding to the query object within the visual medium and is usable to generate an updated visual medium highlighting the query object.
    Type: Grant
    Filed: May 15, 2018
    Date of Patent: December 11, 2018
    Assignee: Adobe Systems Incorporated
    Inventors: Balaji Krishnamurthy, Mausoom Sarkar
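    Sketch: a minimal PyTorch localizer producing a query-conditioned heat map by correlating a pooled query embedding against the scene's feature map; sharing one encoder between query and medium is an assumption:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HeatmapLocalizer(nn.Module):
    """Embed the query object and the visual medium with a shared conv
    encoder, then correlate the pooled query vector against every spatial
    position of the medium's feature map to produce a heat map."""
    def __init__(self, dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU())

    def forward(self, query, scene):
        q = self.enc(query).mean(dim=(2, 3))        # [B, D] pooled query representation
        s = self.enc(scene)                         # [B, D, H, W] medium representation
        heat = (s * q[:, :, None, None]).sum(1)     # dot product at each location
        return torch.sigmoid(heat)                  # [B, H, W] heat map over the medium

loc = HeatmapLocalizer()
heat = loc(torch.rand(1, 3, 32, 32), torch.rand(1, 3, 128, 128))
# Upsample to image size to highlight the query object in the visual medium:
full = F.interpolate(heat.unsqueeze(1), size=(128, 128), mode="bilinear",
                     align_corners=False)
```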
  • Publication number: 20180349788
    Abstract: An introspection network is a machine-learned neural network that accelerates training of other neural networks. The introspection network receives a weight history for each of a plurality of weights from a current training step for a target neural network. A weight history includes at least four values for the weight that are obtained during training of the target neural network up to the current step. The introspection network then provides, for each of the plurality of weights, a respective predicted value, based on the weight history. The predicted value for a weight represents a value for the weight in a future training step for the target neural network. Thus, the predicted value represents a jump in the training steps of the target neural network, which reduces the training time of the target neural network. The introspection network then sets each of the plurality of weights to its respective predicted value.
    Type: Application
    Filed: May 30, 2017
    Publication date: December 6, 2018
    Inventors: Mausoom Sarkar, Balaji Krishnamurthy, Abhishek Sinha, Aahitagni Mukherjee
  • Publication number: 20180260664
    Abstract: Systems and methods are disclosed herein for automatically identifying a query object within a visual medium. The technique generally involves receiving as input to a neural network a query object and a visual medium including the query object. The technique also involves generating, by the neural network, representations of the query object and the visual medium defining features of the query object and the visual medium. The technique also involves generating, by the neural network, a heat map using the representations. The heat map identifies a location of pixels corresponding to the query object within the visual medium and is usable to generate an updated visual medium highlighting the query object.
    Type: Application
    Filed: May 15, 2018
    Publication date: September 13, 2018
    Inventors: Balaji Krishnamurthy, Mausoom Sarkar
  • Publication number: 20180218009
    Abstract: A method for clustering product media files is provided. The method includes dividing each media file corresponding to one or more products into a plurality of tiles. Each media file includes an image or a video. Feature vectors are computed for each tile of each media file. One or more patch clusters are generated from the plurality of tiles; each patch cluster contains tiles whose feature vectors are similar to one another. The feature vectors of each media file are compared with the feature vectors of each patch cluster, and product groups are generated based on the comparison. All media files with similar comparison outputs are grouped into one product group, and each product group includes one or more media files for one product. An apparatus for substantially performing the method described herein is also provided.
    Type: Application
    Filed: March 29, 2018
    Publication date: August 2, 2018
    Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
  • Patent number: 10019655
    Abstract: Systems and methods are disclosed herein for automatically identifying a query object within a visual medium. The technique generally involves receiving as input to a neural network a query object and a visual medium including the query object. The technique also involves generating, by the neural network, representations of the query object and the visual medium defining features of the query object and the visual medium. The technique also involves generating, by the neural network, a heat map using the representations. The heat map identifies a location of pixels corresponding to the query object within the visual medium and is usable to generate an updated visual medium highlighting the query object.
    Type: Grant
    Filed: August 31, 2016
    Date of Patent: July 10, 2018
    Assignee: Adobe Systems Incorporated
    Inventors: Balaji Krishnamurthy, Mausoom Sarkar