Patents by Inventor Pranav Aggarwal
Pranav Aggarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11948374
Abstract: In some embodiments, apparatuses and methods are provided herein useful to train a machine learning algorithm to detect text of interest. In some embodiments, there is provided a system to detect vertically oriented text of interest including a first data set comprising a plurality of captured digital images each depicting an object of interest and a second data set comprising a plurality of augmented digital images each depicting a captured digital image augmented with a synthetic text image; a first control circuit configured to cause the machine learning algorithm to output a machine learning model trained to automatically detect occurrences of vertically oriented text of interest based on the first data set and the second data set; at least one camera; and a second control circuit configured to execute the machine learning model to automatically detect vertically oriented text of interest on the object of interest.
Type: Grant
Filed: July 20, 2021
Date of Patent: April 2, 2024
Assignee: WALMART APOLLO, LLC
Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
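The second data set described above is built by compositing synthetic text into real captured images. A minimal sketch of that augmentation step, assuming a toy image represented as a 2D list of intensities and a fake per-character "glyph" renderer (both functions and their names are hypothetical stand-ins, not the patented control circuits):

```python
def make_vertical_text_patch(text, glyph_height=3, glyph_width=3):
    """Render each character as a fixed-size intensity block, stacked
    top-to-bottom so the text reads vertically (a stand-in for real
    glyph rendering)."""
    patch = []
    for ch in text:
        value = ord(ch) % 256  # fake per-character pixel intensity
        for _ in range(glyph_height):
            patch.append([value] * glyph_width)
    return patch

def augment_with_synthetic_text(image, patch, top, left):
    """Composite a synthetic text patch into a captured image, producing
    one member of the augmented (second) training data set."""
    augmented = [row[:] for row in image]  # copy; keep the original intact
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            augmented[top + r][left + c] = value
    return augmented

# A 12x8 "captured digital image" of background pixels.
captured = [[0] * 8 for _ in range(12)]
patch = make_vertical_text_patch("AB")
augmented = augment_with_synthetic_text(captured, patch, top=1, left=2)
```

Pairs of (captured, augmented) images like these would then serve as the two training sets the abstract describes.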
-
Patent number: 11914641
Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a color embedding network trained using machine learning techniques to generate embedded color representations for color terms included in a text search query. For example, techniques described herein are used to represent color text in the same space as color embeddings (e.g., an embedding space created by determining a histogram of LAB-based colors in a three-dimensional (3D) space). Further, techniques are described for indexing color palettes for all the searchable images in the search space. Accordingly, color terms in a text query are directly converted into a color palette, and an image search system can return one or more search images with corresponding color palettes that are relevant to (e.g., within a threshold distance from) the color palette of the text query.
Type: Grant
Filed: February 26, 2021
Date of Patent: February 27, 2024
Assignee: ADOBE INC.
Inventors: Pranav Aggarwal, Ajinkya Kale, Baldo Faieta, Saeid Motiian, Venkata Naveen Kumar Yadav Marri
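The retrieval flow above can be sketched end to end: map a color term to a LAB-like point, then rank indexed image palettes by distance and keep those within a threshold. This is a toy illustration; the fixed `COLOR_TERMS` lookup stands in for the patent's learned color embedding network, and the single-point "palettes" stand in for real histograms:

```python
import math

# Hypothetical mapping from color terms to LAB-like points; the patent's
# learned color embedding network is replaced here by a fixed lookup.
COLOR_TERMS = {"red": (53, 80, 67), "blue": (32, 79, -108), "green": (88, -86, 83)}

def palette_distance(p, q):
    """Euclidean distance between two LAB-like palette points."""
    return math.dist(p, q)

def search_by_color(query_term, indexed_palettes, threshold=60.0):
    """Return image ids whose dominant-color palette falls within a
    threshold distance of the palette derived from the text query."""
    query_palette = COLOR_TERMS[query_term]
    hits = [(palette_distance(query_palette, pal), image_id)
            for image_id, pal in indexed_palettes.items()]
    return [image_id for d, image_id in sorted(hits) if d <= threshold]

# Pre-indexed dominant-color palettes for the searchable images.
index = {"sunset.jpg": (55, 75, 60), "ocean.jpg": (35, 70, -100), "forest.jpg": (80, -80, 75)}
```

With this index, a query for "red" returns only `sunset.jpg`, since the other palettes fall outside the threshold distance.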
-
Publication number: 20230419551
Abstract: Techniques for generating a novel image using tokenized image representations are disclosed. In some embodiments, a method of generating the novel image includes generating, via a first machine learning model, a first sequence of coded representations of a first image having one or more features; generating, via a second machine learning model, a second sequence of coded representations of a sketch image having one or more edge features associated with the one or more features; predicting, via a third machine learning model, one or more subsequent coded representations based on the first sequence of coded representations and the second sequence of coded representations; and based on the subsequent coded representations, generating, via the third machine learning model, a first portion of a reconstructed image having one or more image attributes of the first image, and a second portion of the reconstructed image associated with the one or more edge features.
Type: Application
Filed: June 22, 2022
Publication date: December 28, 2023
Inventors: Midhun Harikumar, Pranav Aggarwal, Ajinkya Gorakhnath Kale
-
Publication number: 20230401877
Abstract: This application relates to automatic processes for identifying and extracting information from images of documents of varying layouts. For example, a computing device may receive an image of a document, where the image includes a plurality of color channels. The computing device applies a character recognition process to the image to generate optical character recognition data. Further, the computing device determines an area of the image that includes one or more characters based on the optical character recognition data. The computing device adjusts a value of each of a plurality of pixels corresponding to the area of the image determined for each character based on a value of each corresponding character to generate a modified image. The computing device then applies a trained machine learning process to the modified image to generate output data. The output data characterizes characters, such as words and number values, within the original image.
Type: Application
Filed: May 27, 2022
Publication date: December 14, 2023
Inventors: Ramanujam Ramaswamy Srinivasa, Pranav Aggarwal
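The pixel-adjustment step described above can be sketched as follows: for each character box the OCR pass found, overwrite that region's pixels with a value derived from the character, yielding the "modified image" handed to the downstream model. The character-to-value encoding and the box format are illustrative assumptions, not the application's actual scheme:

```python
def encode_characters_into_pixels(image, ocr_boxes):
    """For each OCR-detected character, overwrite the pixels of its
    bounding box with a value derived from the character, yielding the
    'modified image' fed to the downstream model."""
    modified = [row[:] for row in image]  # copy; keep the scan intact
    for ch, (top, left, height, width) in ocr_boxes:
        value = ord(ch) % 256  # hypothetical character-to-value encoding
        for r in range(top, top + height):
            for c in range(left, left + width):
                modified[r][c] = value
    return modified

image = [[255] * 10 for _ in range(6)]            # blank document scan
boxes = [("7", (1, 1, 2, 2)), ("A", (3, 4, 2, 3))]  # (char, (top, left, h, w))
modified = encode_characters_into_pixels(image, boxes)
```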
-
Publication number: 20230315988
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
Type: Application
Filed: May 10, 2023
Publication date: October 5, 2023
Applicant: Adobe Inc.
Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale
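The "adjusting inferences" step above amounts to re-weighting the model's candidate-word scores so negative words lose out. A minimal sketch, assuming a hypothetical blocklist and a flat penalty on raw scores (the real system adjusts learned inferences, not a hand-written list):

```python
NEGATIVE_WORDS = {"ugly", "boring"}  # hypothetical blocklist

def pick_next_word(word_scores, penalty=5.0):
    """Re-rank candidate caption words, biasing selection away from
    negative words by penalizing their scores before taking the max."""
    adjusted = {w: s - penalty if w in NEGATIVE_WORDS else s
                for w, s in word_scores.items()}
    return max(adjusted, key=adjusted.get)

# Raw model scores for the next caption word; "ugly" would win unadjusted.
scores = {"ugly": 3.0, "weathered": 2.5, "boring": 2.8}
```

After the penalty, the neutral word "weathered" is selected even though "ugly" had the highest raw score.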
-
Patent number: 11756239
Abstract: Systems and methods for color replacement are described. Embodiments of the disclosure include a color replacement system that adjusts an image based on a user-input source color and target color. For example, the source color may be replaced with the target color throughout the entire image. In some embodiments, a user provides a speech or text input that identifies a source color to be replaced. The user may then provide a speech or text input identifying the target color that replaces the source color. A color replacement system creates an embedding of the source color, segments the image based on the source color embedding, and then replaces the color of the segmented portion of the image with the target color.
Type: Grant
Filed: April 26, 2021
Date of Patent: September 12, 2023
Assignee: ADOBE, INC.
Inventors: Pranav Aggarwal, Ajinkya Kale
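The segment-then-replace step can be sketched with plain RGB distance standing in for the learned color embedding: pixels within a threshold of the source color form the segment, and only those are recolored. This is an illustrative simplification of the patented pipeline:

```python
import math

def replace_color(pixels, source, target, threshold=60.0):
    """Segment the pixels close to the source color and replace them with
    the target color, leaving the rest of the image untouched."""
    result = []
    for px in pixels:
        if math.dist(px, source) <= threshold:  # pixel is in the segment
            result.append(target)
        else:
            result.append(px)
    return result

# Two reddish pixels and one blue pixel.
image = [(250, 10, 10), (245, 30, 20), (10, 20, 240)]
recolored = replace_color(image, source=(255, 0, 0), target=(0, 128, 0))
```

Only the two reddish pixels are replaced with green; the blue pixel is untouched.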
-
Patent number: 11734339
Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
Type: Grant
Filed: October 20, 2020
Date of Patent: August 22, 2023
Assignee: Adobe Inc.
Inventors: Ajinkya Kale, Zhe Lin, Pranav Aggarwal
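Once text and images live in the same embedding space, retrieval reduces to a nearest-neighbor lookup by embedding distance. A toy sketch with hand-written 3-d vectors standing in for the learned cross-lingual-multimodal embeddings:

```python
import math

def retrieve(query_embedding, image_embeddings):
    """Return the image id whose embedding is nearest to the query's
    cross-lingual-multimodal embedding in the shared space."""
    return min(image_embeddings,
               key=lambda image_id: math.dist(query_embedding, image_embeddings[image_id]))

# Hypothetical 3-d embeddings standing in for the learned multimodal space;
# a French query like "chien" and an English "dog" would map to nearby points.
images = {"dog.jpg": (0.9, 0.1, 0.0), "car.jpg": (0.0, 0.2, 0.9)}
query = (0.85, 0.15, 0.05)  # embedding of the text query "chien"
```

Because the space is cross-lingual, the same lookup works regardless of the query's language.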
-
Publication number: 20230206525
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, using a model, a learned image representation of a target image. The operations further include generating, using a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in the same embedding space. Additionally, the operations include convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image based on the convolving of the learned image representation of the target image with the text embedding.
Type: Application
Filed: March 3, 2023
Publication date: June 29, 2023
Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
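The convolution step above reduces, for a 1x1 "kernel", to taking a dot product between the text embedding and the image feature vector at every spatial location, then thresholding the activation map into a mask. A toy sketch with tiny hand-written vectors in place of learned representations:

```python
def segment_by_text(feature_map, text_embedding, threshold=0.5):
    """Dot-product a learned image representation with a text embedding at
    every spatial location (a 1x1 convolution), then threshold the
    resulting activation map into a binary object mask."""
    mask = []
    for row in feature_map:
        mask_row = []
        for feature in row:
            activation = sum(f * t for f, t in zip(feature, text_embedding))
            mask_row.append(1 if activation >= threshold else 0)
        mask.append(mask_row)
    return mask

# A 2x2 spatial grid of 3-d features; the text embedding shares that space.
features = [[(0.9, 0.1, 0.0), (0.0, 0.0, 1.0)],
            [(0.8, 0.2, 0.1), (0.1, 0.0, 0.9)]]
text = (1.0, 0.0, 0.0)  # hypothetical embedding of the query "cat"
```

Locations whose features align with the query light up in the mask, which is then used to cut the object out of the image.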
-
Patent number: 11687714
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
Type: Grant
Filed: August 20, 2020
Date of Patent: June 27, 2023
Assignee: Adobe Inc.
Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale
-
Patent number: 11645478
Abstract: Introduced here is an approach to translating tags assigned to digital images. As an example, embeddings may be extracted by a multimodal model from a tag to be translated and from the digital image with which the tag is associated. These embeddings can be compared to embeddings extracted by the multimodal model from a set of target tags associated with a target language. Such an approach allows similarity to be established along two dimensions, which ensures that the obstacles associated with direct translation can be avoided.
Type: Grant
Filed: November 4, 2020
Date of Patent: May 9, 2023
Assignee: Adobe Inc.
Inventors: Ritiz Tambi, Pranav Aggarwal, Ajinkya Kale
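The "two dimensions" of similarity can be sketched as averaging a candidate target tag's similarity to the source tag with its similarity to the image itself, so the image disambiguates words with multiple translations. The embeddings and tag names below are invented for illustration; the patent's multimodal model is replaced by hand-written vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def translate_tag(tag_emb, image_emb, target_tags):
    """Pick the target-language tag most similar to BOTH the source tag
    and the image it annotates, averaging the two similarities."""
    def score(candidate):
        emb = target_tags[candidate]
        return (cosine(tag_emb, emb) + cosine(image_emb, emb)) / 2
    return max(target_tags, key=score)

# Hypothetical multimodal embeddings: the ambiguous source tag "bank"
# annotates a river photo, so the image should steer the translation.
source_tag = (0.7, 0.7, 0.0)
image = (0.9, 0.4, 0.1)
targets = {"rive": (0.8, 0.6, 0.0), "banque": (0.0, 0.7, 0.7)}
```

Here the image embedding pulls the choice toward "rive" (riverbank) rather than "banque" (financial bank), which a text-only direct translation could not do.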
-
Patent number: 11615567
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in the same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
Type: Grant
Filed: November 18, 2020
Date of Patent: March 28, 2023
Assignee: Adobe Inc.
Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
-
Publication number: 20230021506
Abstract: In some embodiments, apparatuses and methods are provided herein useful to train a machine learning algorithm to detect text of interest. In some embodiments, there is provided a system to detect vertically oriented text of interest including a first data set comprising a plurality of captured digital images each depicting an object of interest and a second data set comprising a plurality of augmented digital images each depicting a captured digital image augmented with a synthetic text image; a first control circuit configured to cause the machine learning algorithm to output a machine learning model trained to automatically detect occurrences of vertically oriented text of interest based on the first data set and the second data set; at least one camera; and a second control circuit configured to execute the machine learning model to automatically detect vertically oriented text of interest on the object of interest.
Type: Application
Filed: July 20, 2021
Publication date: January 26, 2023
Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
-
Publication number: 20230025548
Abstract: In some embodiments, apparatuses and methods are provided herein useful to determine text on an object. In some embodiments, there is provided a system to determine text of interest on an object of interest including at least one camera and a control circuit configured to execute a machine learning model trained to identify the text of interest, group into a cluster each node point that is located substantially in the same location in the text of interest, determine a score value of each particular character in the cluster, identify the particular character that has a determined score value corresponding to at least a threshold score value relative to all characters in the cluster, assign the particular character having the determined score value corresponding to at least the threshold score value as a recognized character in the cluster, and transmit overlay data to a display monitor.
Type: Application
Filed: July 20, 2021
Publication date: January 26, 2023
Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
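The cluster-and-vote scheme above can be sketched as: bucket detections by coarse location, accumulate per-character scores within each bucket, and accept a character only if its share of the bucket's total score clears a threshold. The grid-bucketing and the score-share threshold are illustrative assumptions about how "substantially the same location" and "relative to all characters" might be realized:

```python
from collections import defaultdict

def recognize_characters(detections, cell=10, threshold=0.5):
    """Group per-frame character detections into location clusters, then
    assign each cluster the character whose share of the cluster's total
    score meets the threshold (None if no character dominates)."""
    clusters = defaultdict(lambda: defaultdict(float))
    for ch, score, (x, y) in detections:
        key = (x // cell, y // cell)   # coarse location bucket
        clusters[key][ch] += score
    recognized = {}
    for key, char_scores in clusters.items():
        total = sum(char_scores.values())
        best = max(char_scores, key=char_scores.get)
        recognized[key] = best if char_scores[best] / total >= threshold else None
    return recognized

# (character, confidence score, (x, y) location) from successive frames.
detections = [
    ("8", 0.9, (12, 14)), ("8", 0.8, (13, 15)), ("B", 0.2, (11, 14)),  # same spot
    ("7", 0.7, (42, 15)),                                              # another spot
]
```

The dominant "8" wins its cluster despite one "B" misread; each recognized character could then be rendered as overlay data on the display.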
-
Publication number: 20220343561
Abstract: Systems and methods for color replacement are described. Embodiments of the disclosure include a color replacement system that adjusts an image based on a user-input source color and target color. For example, the source color may be replaced with the target color throughout the entire image. In some embodiments, a user provides a speech or text input that identifies a source color to be replaced. The user may then provide a speech or text input identifying the target color that replaces the source color. A color replacement system creates an embedding of the source color, segments the image based on the source color embedding, and then replaces the color of the segmented portion of the image with the target color.
Type: Application
Filed: April 26, 2021
Publication date: October 27, 2022
Inventors: Pranav Aggarwal, Ajinkya Kale
-
Publication number: 20220277039
Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a color embedding network trained using machine learning techniques to generate embedded color representations for color terms included in a text search query. For example, techniques described herein are used to represent color text in the same space as color embeddings (e.g., an embedding space created by determining a histogram of LAB-based colors in a three-dimensional (3D) space). Further, techniques are described for indexing color palettes for all the searchable images in the search space. Accordingly, color terms in a text query are directly converted into a color palette, and an image search system can return one or more search images with corresponding color palettes that are relevant to (e.g., within a threshold distance from) the color palette of the text query.
Type: Application
Filed: February 26, 2021
Publication date: September 1, 2022
Inventors: Pranav Aggarwal, Ajinkya Kale, Baldo Faieta, Saeid Motiian, Venkata Naveen Kumar Yadav Marri
-
Publication number: 20220156992
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in the same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
Type: Application
Filed: November 18, 2020
Publication date: May 19, 2022
Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
-
Publication number: 20220138439
Abstract: Introduced here is an approach to translating tags assigned to digital images. As an example, embeddings may be extracted by a multimodal model from a tag to be translated and from the digital image with which the tag is associated. These embeddings can be compared to embeddings extracted by the multimodal model from a set of target tags associated with a target language. Such an approach allows similarity to be established along two dimensions, which ensures that the obstacles associated with direct translation can be avoided.
Type: Application
Filed: November 4, 2020
Publication date: May 5, 2022
Inventors: Ritiz Tambi, Pranav Aggarwal, Ajinkya Kale
-
Publication number: 20220121702
Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
Type: Application
Filed: October 20, 2020
Publication date: April 21, 2022
Inventors: Ajinkya Kale, Zhe Lin, Pranav Aggarwal
-
Publication number: 20220114361
Abstract: Embodiments are disclosed for training an image caption generator model to generate phrase tags for input images. The phrase tags can include short phrases that describe the contents of the images (e.g., objects depicted therein). Once trained, the image caption generator model can be used as an image phrase tagger to tag input images from an image library with phrase tags. The image library can be indexed based on the images' phrase tags. Subsequently, when the image library is queried, the query can be divided into phrases and the index can be used to identify matching images.
Type: Application
Filed: October 14, 2020
Publication date: April 14, 2022
Inventors: Ajinkya Kale, Pranav Aggarwal
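The index-and-query flow above is a classic inverted index keyed by phrase tags. A minimal sketch, where the comma splitter stands in for the application's phrase-division step and the phrase tags are assumed already produced by the trained tagger:

```python
from collections import defaultdict

def build_index(phrase_tags):
    """Build an inverted index from phrase tag to the images carrying it."""
    index = defaultdict(set)
    for image_id, phrases in phrase_tags.items():
        for phrase in phrases:
            index[phrase].add(image_id)
    return index

def query_index(index, query, split_phrases):
    """Split the query into phrases (the splitter is a stand-in for the
    application's phrase-division step) and return images matching any
    phrase."""
    matches = set()
    for phrase in split_phrases(query):
        matches |= index.get(phrase, set())
    return matches

# Phrase tags assumed to come from the trained image phrase tagger.
library = {
    "a.jpg": ["dog on beach", "sunny sky"],
    "b.jpg": ["city at night"],
}
index = build_index(library)
hits = query_index(index, "dog on beach, city at night",
                   split_phrases=lambda q: [p.strip() for p in q.split(",")])
```

Because phrases rather than single words are indexed, a multi-object query matches each image through the phrase describing it.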
-
Publication number: 20220058340
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
Type: Application
Filed: August 20, 2020
Publication date: February 24, 2022
Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale