Patents by Inventor Pranav Aggarwal
Pranav Aggarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11948374
Abstract: In some embodiments, apparatuses and methods are provided herein useful to train a machine learning algorithm to detect text of interest. In some embodiments, there is provided a system to detect vertically oriented text of interest including a first data set comprising a plurality of captured digital images each depicting an object of interest and a second data set comprising a plurality of augmented digital images each depicting a captured digital image augmented with a synthetic text image; a first control circuit configured to cause the machine learning algorithm to output a machine learning model trained to automatically detect occurrences of vertically oriented text of interest based on the first data set and the second data set; at least one camera; and a second control circuit configured to execute the machine learning model to automatically detect vertically oriented text of interest on the object of interest.
Type: Grant
Filed: July 20, 2021
Date of Patent: April 2, 2024
Assignee: WALMART APOLLO, LLC
Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
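The second data set described above is built by compositing synthetic text into real captured images. A minimal sketch of that augmentation step, assuming a toy image represented as a 2D list of intensities and a fake per-character "glyph" renderer (both functions and their names are hypothetical stand-ins, not the patented control circuits):

```python
def make_vertical_text_patch(text, glyph_height=3, glyph_width=3):
    """Render each character as a fixed-size intensity block, stacked
    top-to-bottom so the text reads vertically (a stand-in for real
    glyph rendering)."""
    patch = []
    for ch in text:
        value = ord(ch) % 256  # fake per-character pixel intensity
        for _ in range(glyph_height):
            patch.append([value] * glyph_width)
    return patch

def augment_with_synthetic_text(image, patch, top, left):
    """Composite a synthetic text patch into a captured image, producing
    one member of the augmented (second) training data set."""
    augmented = [row[:] for row in image]  # copy; keep the original intact
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            augmented[top + r][left + c] = value
    return augmented

# A 12x8 "captured digital image" of background pixels.
captured = [[0] * 8 for _ in range(12)]
patch = make_vertical_text_patch("AB")
augmented = augment_with_synthetic_text(captured, patch, top=1, left=2)
```

Pairs of (captured, augmented) images like these would then serve as the two training sets the abstract describes.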
-
Patent number: 11914641
Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a color embedding network trained using machine learning techniques to generate embedded color representations for color terms included in a text search query. For example, techniques described herein are used to represent color text in the same space as color embeddings (e.g., an embedding space created by determining a histogram of LAB-based colors in a three-dimensional (3D) space). Further, techniques are described for indexing color palettes for all the searchable images in the search space. Accordingly, color terms in a text query are directly converted into a color palette, and an image search system can return one or more search images with corresponding color palettes that are relevant to (e.g., within a threshold distance from) the color palette of the text query.
Type: Grant
Filed: February 26, 2021
Date of Patent: February 27, 2024
Assignee: ADOBE INC.
Inventors: Pranav Aggarwal, Ajinkya Kale, Baldo Faieta, Saeid Motiian, Venkata Naveen Kumar Yadav Marri
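The retrieval flow above can be sketched end to end: map a color term to a LAB-like point, then rank indexed image palettes by distance and keep those within a threshold. This is a toy illustration; the fixed `COLOR_TERMS` lookup stands in for the patent's learned color embedding network, and the single-point "palettes" stand in for real histograms:

```python
import math

# Hypothetical mapping from color terms to LAB-like points; the patent's
# learned color embedding network is replaced here by a fixed lookup.
COLOR_TERMS = {"red": (53, 80, 67), "blue": (32, 79, -108), "green": (88, -86, 83)}

def palette_distance(p, q):
    """Euclidean distance between two LAB-like palette points."""
    return math.dist(p, q)

def search_by_color(query_term, indexed_palettes, threshold=60.0):
    """Return image ids whose dominant-color palette falls within a
    threshold distance of the palette derived from the text query."""
    query_palette = COLOR_TERMS[query_term]
    hits = [(palette_distance(query_palette, pal), image_id)
            for image_id, pal in indexed_palettes.items()]
    return [image_id for d, image_id in sorted(hits) if d <= threshold]

# Pre-indexed dominant-color palettes for the searchable images.
index = {"sunset.jpg": (55, 75, 60), "ocean.jpg": (35, 70, -100), "forest.jpg": (80, -80, 75)}
```

With this index, a query for "red" returns only `sunset.jpg`, since the other palettes fall outside the threshold distance.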
-
Publication number: 20230419551
Abstract: Techniques for generating a novel image using tokenized image representations are disclosed. In some embodiments, a method of generating the novel image includes generating, via a first machine learning model, a first sequence of coded representations of a first image having one or more features; generating, via a second machine learning model, a second sequence of coded representations of a sketch image having one or more edge features associated with the one or more features; predicting, via a third machine learning model, one or more subsequent coded representations based on the first sequence of coded representations and the second sequence of coded representations; and based on the subsequent coded representations, generating, via the third machine learning model, a first portion of a reconstructed image having one or more image attributes of the first image, and a second portion of the reconstructed image associated with the one or more edge features.
Type: Application
Filed: June 22, 2022
Publication date: December 28, 2023
Inventors: Midhun Harikumar, Pranav Aggarwal, Ajinkya Gorakhnath Kale
-
Publication number: 20230401877
Abstract: This application relates to automatic processes for identifying and extracting information from images of documents of varying layouts. For example, a computing device may receive an image of a document, where the image includes a plurality of color channels. The computing device applies a character recognition process to the image to generate optical character recognition data. Further, the computing device determines an area of the image that includes one or more characters based on the optical character recognition data. The computing device adjusts a value of each of a plurality of pixels corresponding to the area of the image determined for each character based on a value of each corresponding character to generate a modified image. The computing device then applies a trained machine learning process to the modified image to generate output data. The output data characterizes characters, such as words and number values, within the original image.
Type: Application
Filed: May 27, 2022
Publication date: December 14, 2023
Inventors: Ramanujam Ramaswamy Srinivasa, Pranav Aggarwal
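The pixel-adjustment step described above can be sketched as follows: for each character box the OCR pass found, overwrite that region's pixels with a value derived from the character, yielding the "modified image" handed to the downstream model. The character-to-value encoding and the box format are illustrative assumptions, not the application's actual scheme:

```python
def encode_characters_into_pixels(image, ocr_boxes):
    """For each OCR-detected character, overwrite the pixels of its
    bounding box with a value derived from the character, yielding the
    'modified image' fed to the downstream model."""
    modified = [row[:] for row in image]  # copy; keep the scan intact
    for ch, (top, left, height, width) in ocr_boxes:
        value = ord(ch) % 256  # hypothetical character-to-value encoding
        for r in range(top, top + height):
            for c in range(left, left + width):
                modified[r][c] = value
    return modified

image = [[255] * 10 for _ in range(6)]            # blank document scan
boxes = [("7", (1, 1, 2, 2)), ("A", (3, 4, 2, 3))]  # (char, (top, left, h, w))
modified = encode_characters_into_pixels(image, boxes)
```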
-
Publication number: 20230315988
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
Type: Application
Filed: May 10, 2023
Publication date: October 5, 2023
Applicant: Adobe Inc.
Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale
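The "adjusting inferences" step above amounts to re-weighting the model's candidate-word scores so negative words lose out. A minimal sketch, assuming a hypothetical blocklist and a flat penalty on raw scores (the real system adjusts learned inferences, not a hand-written list):

```python
NEGATIVE_WORDS = {"ugly", "boring"}  # hypothetical blocklist

def pick_next_word(word_scores, penalty=5.0):
    """Re-rank candidate caption words, biasing selection away from
    negative words by penalizing their scores before taking the max."""
    adjusted = {w: s - penalty if w in NEGATIVE_WORDS else s
                for w, s in word_scores.items()}
    return max(adjusted, key=adjusted.get)

# Raw model scores for the next caption word; "ugly" would win unadjusted.
scores = {"ugly": 3.0, "weathered": 2.5, "boring": 2.8}
```

After the penalty, the neutral word "weathered" is selected even though "ugly" had the highest raw score.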
-
Patent number: 11756239
Abstract: Systems and methods for color replacement are described. Embodiments of the disclosure include a color replacement system that adjusts an image based on a user-input source color and target color. For example, the source color may be replaced with the target color throughout the entire image. In some embodiments, a user provides a speech or text input that identifies a source color to be replaced. The user may then provide a speech or text input identifying the target color that replaces the source color. A color replacement system creates an embedding of the source color, segments the image based on the source color embedding, and then replaces the color of the segmented portion of the image with the target color.
Type: Grant
Filed: April 26, 2021
Date of Patent: September 12, 2023
Assignee: ADOBE, INC.
Inventors: Pranav Aggarwal, Ajinkya Kale
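The segment-then-replace step can be sketched with plain RGB distance standing in for the learned color embedding: pixels within a threshold of the source color form the segment, and only those are recolored. This is an illustrative simplification of the patented pipeline:

```python
import math

def replace_color(pixels, source, target, threshold=60.0):
    """Segment the pixels close to the source color and replace them with
    the target color, leaving the rest of the image untouched."""
    result = []
    for px in pixels:
        if math.dist(px, source) <= threshold:  # pixel is in the segment
            result.append(target)
        else:
            result.append(px)
    return result

# Two reddish pixels and one blue pixel.
image = [(250, 10, 10), (245, 30, 20), (10, 20, 240)]
recolored = replace_color(image, source=(255, 0, 0), target=(0, 128, 0))
```

Only the two reddish pixels are replaced with green; the blue pixel is untouched.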
-
Patent number: 11734339
Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
Type: Grant
Filed: October 20, 2020
Date of Patent: August 22, 2023
Assignee: Adobe Inc.
Inventors: Ajinkya Kale, Zhe Lin, Pranav Aggarwal
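Once text and images live in the same embedding space, retrieval reduces to a nearest-neighbor lookup by embedding distance. A toy sketch with hand-written 3-d vectors standing in for the learned cross-lingual-multimodal embeddings:

```python
import math

def retrieve(query_embedding, image_embeddings):
    """Return the image id whose embedding is nearest to the query's
    cross-lingual-multimodal embedding in the shared space."""
    return min(image_embeddings,
               key=lambda image_id: math.dist(query_embedding, image_embeddings[image_id]))

# Hypothetical 3-d embeddings standing in for the learned multimodal space;
# a French query like "chien" and an English "dog" would map to nearby points.
images = {"dog.jpg": (0.9, 0.1, 0.0), "car.jpg": (0.0, 0.2, 0.9)}
query = (0.85, 0.15, 0.05)  # embedding of the text query "chien"
```

Because the space is cross-lingual, the same lookup works regardless of the query's language.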
-
Publication number: 20230206525
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, using a model, a learned image representation of a target image. The operations further include generating, using a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in the same embedding space. Additionally, the operations include convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image based on the convolving of the learned image representation of the target image with the text embedding.
Type: Application
Filed: March 3, 2023
Publication date: June 29, 2023
Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
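The convolution step above reduces, for a 1x1 "kernel", to taking a dot product between the text embedding and the image feature vector at every spatial location, then thresholding the activation map into a mask. A toy sketch with tiny hand-written vectors in place of learned representations:

```python
def segment_by_text(feature_map, text_embedding, threshold=0.5):
    """Dot-product a learned image representation with a text embedding at
    every spatial location (a 1x1 convolution), then threshold the
    resulting activation map into a binary object mask."""
    mask = []
    for row in feature_map:
        mask_row = []
        for feature in row:
            activation = sum(f * t for f, t in zip(feature, text_embedding))
            mask_row.append(1 if activation >= threshold else 0)
        mask.append(mask_row)
    return mask

# A 2x2 spatial grid of 3-d features; the text embedding shares that space.
features = [[(0.9, 0.1, 0.0), (0.0, 0.0, 1.0)],
            [(0.8, 0.2, 0.1), (0.1, 0.0, 0.9)]]
text = (1.0, 0.0, 0.0)  # hypothetical embedding of the query "cat"
```

Locations whose features align with the query light up in the mask, which is then used to cut the object out of the image.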
-
Patent number: 11687714
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
Type: Grant
Filed: August 20, 2020
Date of Patent: June 27, 2023
Assignee: Adobe Inc.
Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale
-
Patent number: 11645478
Abstract: Introduced here is an approach to translating tags assigned to digital images. As an example, embeddings may be extracted by a multimodal model from a tag to be translated and from the digital image with which the tag is associated. These embeddings can be compared to embeddings extracted by the multimodal model from a set of target tags associated with a target language. Such an approach allows similarity to be established along two dimensions, which ensures that the obstacles associated with direct translation can be avoided.
Type: Grant
Filed: November 4, 2020
Date of Patent: May 9, 2023
Assignee: Adobe Inc.
Inventors: Ritiz Tambi, Pranav Aggarwal, Ajinkya Kale
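The "two dimensions" of similarity can be sketched as averaging a candidate target tag's similarity to the source tag with its similarity to the image itself, so the image disambiguates words with multiple translations. The embeddings and tag names below are invented for illustration; the patent's multimodal model is replaced by hand-written vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def translate_tag(tag_emb, image_emb, target_tags):
    """Pick the target-language tag most similar to BOTH the source tag
    and the image it annotates, averaging the two similarities."""
    def score(candidate):
        emb = target_tags[candidate]
        return (cosine(tag_emb, emb) + cosine(image_emb, emb)) / 2
    return max(target_tags, key=score)

# Hypothetical multimodal embeddings: the ambiguous source tag "bank"
# annotates a river photo, so the image should steer the translation.
source_tag = (0.7, 0.7, 0.0)
image = (0.9, 0.4, 0.1)
targets = {"rive": (0.8, 0.6, 0.0), "banque": (0.0, 0.7, 0.7)}
```

Here the image embedding pulls the choice toward "rive" (riverbank) rather than "banque" (financial bank), which a text-only direct translation could not do.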
-
Patent number: 11615567
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in the same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
Type: Grant
Filed: November 18, 2020
Date of Patent: March 28, 2023
Assignee: Adobe Inc.
Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
-
Publication number: 20230021506
Abstract: In some embodiments, apparatuses and methods are provided herein useful to train a machine learning algorithm to detect text of interest. In some embodiments, there is provided a system to detect vertically oriented text of interest including a first data set comprising a plurality of captured digital images each depicting an object of interest and a second data set comprising a plurality of augmented digital images each depicting a captured digital image augmented with a synthetic text image; a first control circuit configured to cause the machine learning algorithm to output a machine learning model trained to automatically detect occurrences of vertically oriented text of interest based on the first data set and the second data set; at least one camera; and a second control circuit configured to execute the machine learning model to automatically detect vertically oriented text of interest on the object of interest.
Type: Application
Filed: July 20, 2021
Publication date: January 26, 2023
Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
-
Publication number: 20230025548
Abstract: In some embodiments, apparatuses and methods are provided herein useful to determine text on an object. In some embodiments, there is provided a system to determine text of interest on an object of interest including at least one camera and a control circuit configured to execute a machine learning model trained to identify the text of interest, group into a cluster each node point that is located substantially in the same location in the text of interest, determine a score value of each particular character in the cluster, identify the particular character that has a determined score value corresponding to at least a threshold score value relative to all characters in the cluster, assign the particular character having the determined score value corresponding to at least the threshold score value as a recognized character in the cluster, and transmit overlay data to a display monitor.
Type: Application
Filed: July 20, 2021
Publication date: January 26, 2023
Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
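The cluster-and-vote scheme above can be sketched as: bucket detections by coarse location, accumulate per-character scores within each bucket, and accept a character only if its share of the bucket's total score clears a threshold. The grid-bucketing and the score-share threshold are illustrative assumptions about how "substantially the same location" and "relative to all characters" might be realized:

```python
from collections import defaultdict

def recognize_characters(detections, cell=10, threshold=0.5):
    """Group per-frame character detections into location clusters, then
    assign each cluster the character whose share of the cluster's total
    score meets the threshold (None if no character dominates)."""
    clusters = defaultdict(lambda: defaultdict(float))
    for ch, score, (x, y) in detections:
        key = (x // cell, y // cell)   # coarse location bucket
        clusters[key][ch] += score
    recognized = {}
    for key, char_scores in clusters.items():
        total = sum(char_scores.values())
        best = max(char_scores, key=char_scores.get)
        recognized[key] = best if char_scores[best] / total >= threshold else None
    return recognized

# (character, confidence score, (x, y) location) from successive frames.
detections = [
    ("8", 0.9, (12, 14)), ("8", 0.8, (13, 15)), ("B", 0.2, (11, 14)),  # same spot
    ("7", 0.7, (42, 15)),                                              # another spot
]
```

The dominant "8" wins its cluster despite one "B" misread; each recognized character could then be rendered as overlay data on the display.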
-
Publication number: 20220343561
Abstract: Systems and methods for color replacement are described. Embodiments of the disclosure include a color replacement system that adjusts an image based on a user-input source color and target color. For example, the source color may be replaced with the target color throughout the entire image. In some embodiments, a user provides a speech or text input that identifies a source color to be replaced. The user may then provide a speech or text input identifying the target color that replaces the source color. A color replacement system creates an embedding of the source color, segments the image based on the source color embedding, and then replaces the color of the segmented portion of the image with the target color.
Type: Application
Filed: April 26, 2021
Publication date: October 27, 2022
Inventors: Pranav Aggarwal, Ajinkya Kale
-
Publication number: 20220277039
Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a color embedding network trained using machine learning techniques to generate embedded color representations for color terms included in a text search query. For example, techniques described herein are used to represent color text in the same space as color embeddings (e.g., an embedding space created by determining a histogram of LAB-based colors in a three-dimensional (3D) space). Further, techniques are described for indexing color palettes for all the searchable images in the search space. Accordingly, color terms in a text query are directly converted into a color palette, and an image search system can return one or more search images with corresponding color palettes that are relevant to (e.g., within a threshold distance from) the color palette of the text query.
Type: Application
Filed: February 26, 2021
Publication date: September 1, 2022
Inventors: Pranav Aggarwal, Ajinkya Kale, Baldo Faieta, Saeid Motiian, Venkata Naveen Kumar Yadav Marri
-
Publication number: 20220156992
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in the same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
Type: Application
Filed: November 18, 2020
Publication date: May 19, 2022
Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
-
Publication number: 20220138439
Abstract: Introduced here is an approach to translating tags assigned to digital images. As an example, embeddings may be extracted by a multimodal model from a tag to be translated and from the digital image with which the tag is associated. These embeddings can be compared to embeddings extracted by the multimodal model from a set of target tags associated with a target language. Such an approach allows similarity to be established along two dimensions, which ensures that the obstacles associated with direct translation can be avoided.
Type: Application
Filed: November 4, 2020
Publication date: May 5, 2022
Inventors: Ritiz Tambi, Pranav Aggarwal, Ajinkya Kale
-
Publication number: 20220121702
Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
Type: Application
Filed: October 20, 2020
Publication date: April 21, 2022
Inventors: Ajinkya Kale, Zhe Lin, Pranav Aggarwal
-
Publication number: 20220114361
Abstract: Embodiments are disclosed for training an image caption generator model to generate phrase tags for input images. The phrase tags can include short phrases that describe the contents of the images (e.g., objects depicted therein). Once trained, the image caption generator model can be used as an image phrase tagger to tag input images from an image library with phrase tags. The image library can be indexed based on the images' phrase tags. Subsequently, when the image library is queried, the query can be divided into phrases and the index can be used to identify matching images.
Type: Application
Filed: October 14, 2020
Publication date: April 14, 2022
Inventors: Ajinkya Kale, Pranav Aggarwal
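The index-and-query flow above is a classic inverted index keyed by phrase tags. A minimal sketch, where the comma splitter stands in for the application's phrase-division step and the phrase tags are assumed already produced by the trained tagger:

```python
from collections import defaultdict

def build_index(phrase_tags):
    """Build an inverted index from phrase tag to the images carrying it."""
    index = defaultdict(set)
    for image_id, phrases in phrase_tags.items():
        for phrase in phrases:
            index[phrase].add(image_id)
    return index

def query_index(index, query, split_phrases):
    """Split the query into phrases (the splitter is a stand-in for the
    application's phrase-division step) and return images matching any
    phrase."""
    matches = set()
    for phrase in split_phrases(query):
        matches |= index.get(phrase, set())
    return matches

# Phrase tags assumed to come from the trained image phrase tagger.
library = {
    "a.jpg": ["dog on beach", "sunny sky"],
    "b.jpg": ["city at night"],
}
index = build_index(library)
hits = query_index(index, "dog on beach, city at night",
                   split_phrases=lambda q: [p.strip() for p in q.split(",")])
```

Because phrases rather than single words are indexed, a multi-object query matches each image through the phrase describing it.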
-
Publication number: 20220058340
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
Type: Application
Filed: August 20, 2020
Publication date: February 24, 2022
Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale