Patents by Inventor Mausoom Sarkar
Mausoom Sarkar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11042734
Abstract: Techniques for document segmentation. In an example, a document processing application segments an electronic document image into strips. A first strip overlaps a second strip. The application generates a first mask indicating one or more elements and element types in the first strip by applying a predictive model network to image content in the first strip and a prior mask generated from image content of the first strip. The application generates a second mask indicating one or more elements and element types in the second strip by applying the predictive model network to image content in the second strip and the first mask. The application computes, from a combined mask derived from the first mask and the second mask, an output electronic document that identifies elements in the electronic document and the respective element types.
Type: Grant
Filed: August 13, 2019
Date of Patent: June 22, 2021
Assignee: Adobe Inc.
Inventors: Mausoom Sarkar, Arneh Jain
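The strip-wise flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `predict_mask` is a hypothetical stand-in for the predictive model network (here it simply labels non-zero pixels as element type 1), and strip sizes are invented.

```python
# Illustrative sketch of strip-wise segmentation with mask carry-over:
# each strip's predicted mask feeds the prediction for the next strip,
# and the per-strip masks are merged into one combined mask.

def split_into_strips(image, strip_height, overlap):
    """Split a 2-D image (list of rows) into vertically overlapping strips."""
    strips, step = [], strip_height - overlap
    for top in range(0, len(image), step):
        strip = image[top:top + strip_height]
        if strip:
            strips.append((top, strip))
        if top + strip_height >= len(image):
            break
    return strips

def predict_mask(strip, prior_mask):
    # Placeholder for the neural model: label any non-zero pixel as
    # element type 1; a real model would also use prior_mask as context.
    return [[1 if px else 0 for px in row] for row in strip]

def segment(image, strip_height=4, overlap=2):
    height, width = len(image), len(image[0])
    combined = [[0] * width for _ in range(height)]
    prior = None
    for top, strip in split_into_strips(image, strip_height, overlap):
        mask = predict_mask(strip, prior)   # previous strip's mask feeds the next
        for r, row in enumerate(mask):      # merge strip mask into combined mask
            for c, v in enumerate(row):
                combined[top + r][c] = max(combined[top + r][c], v)
        prior = mask
    return combined
```

The overlap gives each strip shared context with its neighbor, which is what lets the first mask inform the second.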
-
Patent number: 11017016
Abstract: A method for clustering product media files is provided. The method includes dividing each media file corresponding to one or more products into a plurality of tiles. Each media file includes either an image or a video. Feature vectors are computed for each tile of each media file. One or more patch clusters are generated using the plurality of tiles. Each patch cluster includes tiles whose feature vectors are similar to each other. The feature vectors of each media file are compared with the feature vectors of each patch cluster. Based on the comparison, product groups are then generated. All media files with similar comparison outputs are grouped into one product group. Each product group includes one or more media files for one product. Apparatus for substantially performing the method as described herein is also provided.
Type: Grant
Filed: March 29, 2018
Date of Patent: May 25, 2021
Assignee: Adobe Inc.
Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
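A toy version of the tiling-and-grouping idea might look like the sketch below. It is not the patented method: the "feature vector" here is just a tile's mean intensity, and the similarity test is a plain tolerance check rather than learned clustering.

```python
# Minimal sketch: tile each media file, compute a toy feature per tile,
# then group media files whose tile-feature signatures are close.

def tiles(image, size):
    """Divide a 2-D image into size x size tiles (dims assumed divisible)."""
    out = []
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            out.append([row[c:c + size] for row in image[r:r + size]])
    return out

def feature_vector(tile):
    flat = [px for row in tile for px in row]
    return [sum(flat) / len(flat)]   # toy 1-D feature: mean intensity

def group_media(media_files, size=2, tol=0.1):
    """Group indices of media files with similar sorted tile signatures."""
    signatures = [sorted(feature_vector(t)[0] for t in tiles(m, size))
                  for m in media_files]
    groups = []
    for i, sig in enumerate(signatures):
        for g in groups:
            ref = signatures[g[0]]
            if len(ref) == len(sig) and all(abs(a - b) <= tol
                                            for a, b in zip(ref, sig)):
                g.append(i)
                break
        else:
            groups.append([i])
    return groups
```

Sorting the per-tile features before comparison makes the signature insensitive to tile order, a crude analogue of comparing against patch clusters.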
-
Publication number: 20210073267
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating tags for an object portrayed in a digital image based on predicted attributes of the object. For example, the disclosed systems can utilize interleaved neural network layers of alternating inception layers and dilated convolution layers to generate a localization feature vector. Based on the localization feature vector, the disclosed systems can generate attribute localization feature embeddings, for example, using a pooling layer such as a global average pooling layer. The disclosed systems can then apply the attribute localization feature embeddings to corresponding attribute group classifiers to generate tags based on predicted attributes. In particular, attribute group classifiers can predict attributes as associated with a query image (e.g., based on a scoring comparison with other potential attributes of an attribute group).
Type: Application
Filed: September 9, 2019
Publication date: March 11, 2021
Applicant: Adobe, Inc.
Inventors: Ayush Chopra, Mausoom Sarkar, Jonas Dahl, Hiresh Gupta, Balaji Krishnamurthy, Abhishek Sinha
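The pooling-and-scoring step can be illustrated in a few lines. This is a hedged sketch under invented shapes and weights: the inception/dilated-convolution layers are not modeled, and each "attribute group classifier" is reduced to a dictionary of linear scorers.

```python
# Sketch: global average pooling over an H x W x C feature map, then a
# per-group scoring comparison that picks the best attribute per group.

def global_average_pool(feature_map):
    """Collapse an H x W x C feature map to a C-dimensional embedding."""
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    pooled = [0.0] * c
    for row in feature_map:
        for px in row:
            for k in range(c):
                pooled[k] += px[k]
    return [v / (h * w) for v in pooled]

def predict_attributes(feature_map, group_weights):
    """For each attribute group, tag the highest-scoring attribute."""
    emb = global_average_pool(feature_map)
    tags = {}
    for group, attrs in group_weights.items():
        scores = {name: sum(e * w for e, w in zip(emb, weights))
                  for name, weights in attrs.items()}
        tags[group] = max(scores, key=scores.get)
    return tags
```

The "scoring comparison with other potential attributes of an attribute group" is the `max` over each group's scores.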
-
Publication number: 20210049357
Abstract: Techniques for document segmentation. In an example, a document processing application segments an electronic document image into strips. A first strip overlaps a second strip. The application generates a first mask indicating one or more elements and element types in the first strip by applying a predictive model network to image content in the first strip and a prior mask generated from image content of the first strip. The application generates a second mask indicating one or more elements and element types in the second strip by applying the predictive model network to image content in the second strip and the first mask. The application computes, from a combined mask derived from the first mask and the second mask, an output electronic document that identifies elements in the electronic document and the respective element types.
Type: Application
Filed: August 13, 2019
Publication date: February 18, 2021
Inventors: Mausoom Sarkar, Arneh Jain
-
Publication number: 20210042625
Abstract: Methods and systems are provided for facilitating the creation and utilization of a transformation function system capable of providing network-agnostic performance improvement. The transformation function system receives a representation from a task neural network. The representation can be input into a composite function neural network of the transformation function system. A learned composite function can be generated using the composite function neural network. The composite function can be specifically constructed for the task neural network based on the input representation. The learned composite function can be applied to a feature embedding of the task neural network to transform the feature embedding. Transforming the feature embedding can optimize the output of the task neural network.
Type: Application
Filed: August 7, 2019
Publication date: February 11, 2021
Inventors: Ayush Chopra, Abhishek Sinha, Hiresh Gupta, Mausoom Sarkar, Kumar Ayush, Balaji Krishnamurthy
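The shape of "apply a composite function to a feature embedding" can be shown with plain function composition. This is only a structural sketch: in the patent the composite function is generated by a neural network, whereas here it is a fixed chain of invented transforms.

```python
# Toy sketch of a composite function applied to a feature embedding.

def compose(*fns):
    """Chain functions left to right into one composite function."""
    def composite(x):
        for f in fns:
            x = f(x)
        return x
    return composite

scale = lambda v: [2 * e for e in v]   # stand-in learned transform
shift = lambda v: [e + 1 for e in v]   # stand-in learned transform

transform = compose(scale, shift)      # the "composite function"
embedding = [0.5, 1.0]
transformed = transform(embedding)     # [2.0, 3.0]
```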
-
Publication number: 20200372560
Abstract: A search system provides search results with images of products based on associations of primary products and secondary products from product image sets. The search system analyzes a product image set containing multiple images to determine a primary product and secondary products. Information associating the primary and secondary products is stored in a search index. When the search system receives a query image containing a search product, the search index is queried using the search product to identify search result images based on associations of products in the search index, and the result images are provided as a response to the query image.
Type: Application
Filed: May 20, 2019
Publication date: November 26, 2020
Inventors: Jonas Dahl, Mausoom Sarkar, Hiresh Gupta, Balaji Krishnamurthy, Ayush Chopra, Abhishek Sinha
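The index-then-query flow can be sketched simply. The product IDs and image-set structure below are invented example data; a real system would extract them from images rather than take them as input.

```python
# Simplified sketch of the association index: record which products
# co-occur in each product image set, then answer queries from the index.
from collections import defaultdict

def build_index(image_sets):
    """image_sets: list of (primary_product, [secondary_products])."""
    index = defaultdict(set)
    for primary, secondaries in image_sets:
        for s in secondaries:
            index[primary].add(s)
            index[s].add(primary)   # associations work in both directions
    return index

def search(index, product):
    """Return products associated with the queried product, sorted."""
    return sorted(index.get(product, set()))
```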
-
Patent number: 10831818
Abstract: Digital image search training techniques and machine-learning architectures are described. In one example, a query digital image is received by a service provider system, which is then used to select at least one positive sample digital image, e.g., having the same product ID. A plurality of negative sample digital images is also selected by the service provider system based on the query digital image, e.g., having different product IDs. The at least one positive sample digital image and the plurality of negative samples are then aggregated by the service provider system into a single aggregated digital image. At least one neural network is then trained by the service provider system using a loss function based on a feature comparison between the query digital image and samples from the aggregated digital image in a single pass.
Type: Grant
Filed: October 31, 2018
Date of Patent: November 10, 2020
Assignee: Adobe Inc.
Inventors: Mausoom Sarkar, Hiresh Gupta, Abhishek Sinha
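The single-pass feature comparison can be illustrated with a contrastive-style loss. This is a hedged sketch: the embeddings, margin, and hinge form are illustrative stand-ins, not the loss function claimed in the patent.

```python
# Sketch of comparing a query embedding against one positive and several
# negatives in one pass, with a hinge-style margin loss.
import math

def similarity(a, b):
    """Cosine similarity between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def aggregated_loss(query, positive, negatives, margin=0.2):
    """Positive should beat every negative by at least `margin`."""
    pos = similarity(query, positive)
    return sum(max(0.0, margin - (pos - similarity(query, n)))
               for n in negatives)
```

Because all samples are compared against the query together, gradients for the positive and every negative come from one evaluation, which mirrors the "single pass" framing.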
-
Patent number: 10810633
Abstract: Embodiments of the present invention provide systems and methods for automatically generating a shoppable video. A video is parsed into one or more scenes. Products and their corresponding product information are automatically associated with the one or more scenes. The shoppable video is then generated using the associated products and corresponding product information such that the products are visible in the shoppable video based on a scene in which the products are found.
Type: Grant
Filed: June 3, 2019
Date of Patent: October 20, 2020
Assignee: Adobe, Inc.
Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
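The scene-to-product association can be sketched as a timeline lookup. Scene boundaries and the product lists below are invented example data; the patent's automatic parsing and association steps are not modeled.

```python
# Minimal sketch: attach product lists to parsed scenes, then look up
# which products should be surfaced at a given playback time.

def build_shoppable(scenes, scene_products):
    """scenes: list of (start, end) seconds; scene_products: per-scene lists."""
    return list(zip(scenes, scene_products))

def products_at(shoppable, t):
    """Products visible at time t, based on the scene containing t."""
    for (start, end), products in shoppable:
        if start <= t < end:
            return products
    return []
```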
-
Patent number: 10755199
Abstract: An introspection network is a machine-learned neural network that accelerates training of other neural networks. The introspection network receives a weight history for each of a plurality of weights from a current training step for a target neural network. A weight history includes at least four values for the weight that are obtained during training of the target neural network up to the current step. The introspection network then provides, for each of the plurality of weights, a respective predicted value, based on the weight history. The predicted value for a weight represents a value for the weight in a future training step for the target neural network. Thus, the predicted value represents a jump in the training steps of the target neural network, which reduces the training time of the target neural network. The introspection network then sets each of the plurality of weights to its respective predicted value.
Type: Grant
Filed: May 30, 2017
Date of Patent: August 25, 2020
Assignee: Adobe Inc.
Inventors: Mausoom Sarkar, Balaji Krishnamurthy, Abhishek Sinha, Aahitagni Mukherjee
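The "predict a future weight value from its history, then jump" shape of the technique can be sketched with a trivial predictor. Note the stand-in: the learned introspection network is replaced here by linear extrapolation, so only the interface is faithful, not the prediction quality.

```python
# Hedged sketch: extrapolate each weight's trajectory from its history
# (at least four values, per the abstract) and set the weight to the
# predicted future value, skipping ahead in training.

def predict_future(history, steps_ahead=4):
    """Stand-in predictor: linear extrapolation from the last two values."""
    assert len(history) >= 4, "history must hold at least four values"
    slope = history[-1] - history[-2]
    return history[-1] + slope * steps_ahead

def jump_weights(weight_histories, steps_ahead=4):
    """Set every weight to its respective predicted future value."""
    return [predict_future(h, steps_ahead) for h in weight_histories]
```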
-
Publication number: 20200134056
Abstract: Digital image search training techniques and machine-learning architectures are described. In one example, a query digital image is received by a service provider system, which is then used to select at least one positive sample digital image, e.g., having the same product ID. A plurality of negative sample digital images is also selected by the service provider system based on the query digital image, e.g., having different product IDs. The at least one positive sample digital image and the plurality of negative samples are then aggregated by the service provider system into a single aggregated digital image. At least one neural network is then trained by the service provider system using a loss function based on a feature comparison between the query digital image and samples from the aggregated digital image in a single pass.
Type: Application
Filed: October 31, 2018
Publication date: April 30, 2020
Applicant: Adobe Inc.
Inventors: Mausoom Sarkar, Hiresh Gupta, Abhishek Sinha
-
Publication number: 20190294661
Abstract: The present disclosure relates to generating fillable digital forms corresponding to paper forms using a form conversion neural network to determine low-level and high-level semantic characteristics of the paper forms. For example, one or more embodiments apply a digitized paper form to an encoder that outputs feature maps to a reconstruction decoder, a low-level semantic decoder, and one or more high-level semantic decoders. The reconstruction decoder generates a reconstructed layout of the digitized paper form. The low-level and high-level semantic decoders determine low-level and high-level semantic characteristics of each pixel of the digitized paper form, which provide a probability of the element type to which the pixel belongs. The semantic decoders then classify each pixel and generate corresponding semantic segmentation maps based on those probabilities. The system then generates a fillable digital form using the reconstructed layout and the semantic segmentation maps.
Type: Application
Filed: March 21, 2018
Publication date: September 26, 2019
Inventor: Mausoom Sarkar
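The per-pixel classification step reduces to an argmax over element-type probabilities. The element types and the probability grid below are invented for illustration; the encoder/decoder networks that would produce these probabilities are not modeled.

```python
# Sketch: each pixel carries a probability per element type; the
# segmentation map assigns each pixel the most probable type.

ELEMENT_TYPES = ["text", "widget", "background"]   # hypothetical classes

def segmentation_map(pixel_probs):
    """pixel_probs: H x W grid of per-class probability lists."""
    return [[ELEMENT_TYPES[p.index(max(p))] for p in row]
            for row in pixel_probs]
```

A fillable form generator could then place form fields wherever the map labels a region as a widget.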
-
Publication number: 20190287139
Abstract: Embodiments of the present invention provide systems and methods for automatically generating a shoppable video. A video is parsed into one or more scenes. Products and their corresponding product information are automatically associated with the one or more scenes. The shoppable video is then generated using the associated products and corresponding product information such that the products are visible in the shoppable video based on a scene in which the products are found.
Type: Application
Filed: June 3, 2019
Publication date: September 19, 2019
Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
-
Patent number: 10354290
Abstract: Embodiments of the present invention provide systems and methods for automatically generating a shoppable video. A video is parsed into one or more scenes. Products and their corresponding product information are automatically associated with the one or more scenes. The shoppable video is then generated using the associated products and corresponding product information such that the products are visible in the shoppable video based on a scene in which the products are found.
Type: Grant
Filed: June 16, 2015
Date of Patent: July 16, 2019
Assignee: Adobe, Inc.
Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
-
Patent number: 10268883
Abstract: A method and system for detecting and extracting accurate and precise structure in documents. A high-resolution image of a document is segmented into a set of tiles. Each tile is processed by a convolutional network and subsequently by a set of recurrent networks for each row and column. A global-lookup process is disclosed that allows "future" information required for accurate assessment by the recurrent neural networks to be considered. Utilization of the high-resolution image allows for precise and accurate feature extraction, while segmentation into tiles keeps the processing of the high-resolution image tractable within reasonable computational resource bounds.
Type: Grant
Filed: August 10, 2017
Date of Patent: April 23, 2019
Assignee: Adobe Inc.
Inventors: Mausoom Sarkar, Balaji Krishnamurthy
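The idea of giving each tile access to "future" row-wide and column-wide context can be sketched without the neural machinery. This is only a structural analogue: tile features here are plain numbers and the recurrent passes are replaced by simple row and column sums.

```python
# Toy sketch of a global lookup: augment each tile's feature with
# aggregates over its entire row and column, so information from tiles
# not yet "seen" by a sequential scan is still available.

def with_row_col_context(tile_features):
    """tile_features: R x C grid; returns (feature, row_sum, col_sum) per tile."""
    rows = len(tile_features)
    cols = len(tile_features[0])
    row_sums = [sum(r) for r in tile_features]
    col_sums = [sum(tile_features[r][c] for r in range(rows))
                for c in range(cols)]
    return [[(tile_features[r][c], row_sums[r], col_sums[c])
             for c in range(cols)]
            for r in range(rows)]
```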
-
Publication number: 20190050640
Abstract: A method and system for detecting and extracting accurate and precise structure in documents. A high-resolution image of a document is segmented into a set of tiles. Each tile is processed by a convolutional network and subsequently by a set of recurrent networks for each row and column. A global-lookup process is disclosed that allows "future" information required for accurate assessment by the recurrent neural networks to be considered. Utilization of the high-resolution image allows for precise and accurate feature extraction, while segmentation into tiles keeps the processing of the high-resolution image tractable within reasonable computational resource bounds.
Type: Application
Filed: August 10, 2017
Publication date: February 14, 2019
Applicant: Adobe Systems Incorporated
Inventors: Mausoom Sarkar, Balaji Krishnamurthy
-
Patent number: 10152655
Abstract: Systems and methods are disclosed herein for automatically identifying a query object within a visual medium. The technique generally involves receiving as input to a neural network a query object and a visual medium including the query object. The technique also involves generating, by the neural network, representations of the query object and the visual medium defining features of the query object and the visual medium. The technique also involves generating, by the neural network, a heat map using the representations. The heat map identifies a location of pixels corresponding to the query object within the visual medium and is usable to generate an updated visual medium highlighting the query object.
Type: Grant
Filed: May 15, 2018
Date of Patent: December 11, 2018
Assignee: Adobe Systems Incorporated
Inventors: Balaji Krishnamurthy, Mausoom Sarkar
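The final step, turning a heat map into a highlighted location, can be sketched as thresholding plus a bounding box. The heat-map values below are invented; a real system would produce them with the neural network described in the abstract.

```python
# Sketch: threshold the heat map and take the bounding box of the "hot"
# pixels to locate the query object for highlighting.

def locate(heat_map, threshold=0.5):
    """Bounding box (top, left, bottom, right) of pixels above threshold."""
    hot = [(r, c) for r, row in enumerate(heat_map)
           for c, v in enumerate(row) if v > threshold]
    if not hot:
        return None   # query object not found in this medium
    rows = [r for r, _ in hot]
    cols = [c for _, c in hot]
    return (min(rows), min(cols), max(rows), max(cols))
```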
-
Publication number: 20180349788
Abstract: An introspection network is a machine-learned neural network that accelerates training of other neural networks. The introspection network receives a weight history for each of a plurality of weights from a current training step for a target neural network. A weight history includes at least four values for the weight that are obtained during training of the target neural network up to the current step. The introspection network then provides, for each of the plurality of weights, a respective predicted value, based on the weight history. The predicted value for a weight represents a value for the weight in a future training step for the target neural network. Thus, the predicted value represents a jump in the training steps of the target neural network, which reduces the training time of the target neural network. The introspection network then sets each of the plurality of weights to its respective predicted value.
Type: Application
Filed: May 30, 2017
Publication date: December 6, 2018
Inventors: Mausoom Sarkar, Balaji Krishnamurthy, Abhishek Sinha, Aahitagni Mukherjee
-
Publication number: 20180260664
Abstract: Systems and methods are disclosed herein for automatically identifying a query object within a visual medium. The technique generally involves receiving as input to a neural network a query object and a visual medium including the query object. The technique also involves generating, by the neural network, representations of the query object and the visual medium defining features of the query object and the visual medium. The technique also involves generating, by the neural network, a heat map using the representations. The heat map identifies a location of pixels corresponding to the query object within the visual medium and is usable to generate an updated visual medium highlighting the query object.
Type: Application
Filed: May 15, 2018
Publication date: September 13, 2018
Inventors: Balaji Krishnamurthy, Mausoom Sarkar
-
Publication number: 20180218009
Abstract: A method for clustering product media files is provided. The method includes dividing each media file corresponding to one or more products into a plurality of tiles. Each media file includes either an image or a video. Feature vectors are computed for each tile of each media file. One or more patch clusters are generated using the plurality of tiles. Each patch cluster includes tiles whose feature vectors are similar to each other. The feature vectors of each media file are compared with the feature vectors of each patch cluster. Based on the comparison, product groups are then generated. All media files with similar comparison outputs are grouped into one product group. Each product group includes one or more media files for one product. Apparatus for substantially performing the method as described herein is also provided.
Type: Application
Filed: March 29, 2018
Publication date: August 2, 2018
Inventors: Vikas Yadav, Balaji Krishnamurthy, Mausoom Sarkar, Rajiv Mangla, Gitesh Malik
-
Patent number: 10019655
Abstract: Systems and methods are disclosed herein for automatically identifying a query object within a visual medium. The technique generally involves receiving as input to a neural network a query object and a visual medium including the query object. The technique also involves generating, by the neural network, representations of the query object and the visual medium defining features of the query object and the visual medium. The technique also involves generating, by the neural network, a heat map using the representations. The heat map identifies a location of pixels corresponding to the query object within the visual medium and is usable to generate an updated visual medium highlighting the query object.
Type: Grant
Filed: August 31, 2016
Date of Patent: July 10, 2018
Assignee: Adobe Systems Incorporated
Inventors: Balaji Krishnamurthy, Mausoom Sarkar