Patents by Inventor Pranav Aggarwal

Pranav Aggarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11948374
    Abstract: In some embodiments, apparatuses and methods are provided herein useful to train a machine learning algorithm to detect text of interest. In some embodiments, there is provided a system to detect vertically oriented text of interest including a first data set comprising a plurality of captured digital images each depicting an object of interest and a second data set comprising a plurality of augmented digital images each depicting a captured digital image augmented with a synthetic text image; a first control circuit configured to cause the machine learning algorithm to output a machine learning model trained to automatically detect occurrences of vertically oriented text of interest based on the first data set and the second data set; at least one camera; and a second control circuit configured to execute the machine learning model to automatically detect vertically oriented text of interest on the object of interest.
    Type: Grant
    Filed: July 20, 2021
    Date of Patent: April 2, 2024
    Assignee: Walmart Apollo, LLC
    Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
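The augmentation described in the abstract above can be pictured with a toy sketch: a captured image (here a plain 2D grayscale array) is overlaid with a synthetic text patch rotated 90 degrees so that the training set contains vertically oriented text. The array format, patch values, and function names are illustrative assumptions, not the patent's implementation.

```python
def rotate_90(patch):
    """Rotate a 2D list clockwise so horizontal text becomes vertical."""
    return [list(row) for row in zip(*patch[::-1])]

def paste(image, patch, top, left):
    """Overlay the patch onto a copy of the image at (top, left)."""
    out = [row[:] for row in image]
    for r, row in enumerate(patch):
        for c, v in enumerate(row):
            out[top + r][left + c] = v
    return out

# A 4x4 "captured" image and a 1x3 horizontal "text" patch.
image = [[0] * 4 for _ in range(4)]
text_patch = [[1, 2, 3]]

# The augmented image now carries a vertical strip of synthetic text.
augmented = paste(image, rotate_90(text_patch), top=0, left=0)
```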
  • Patent number: 11914641
    Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a color embedding network trained using machine learning techniques to generate embedded color representations for color terms included in a text search query. For example, techniques described herein are used to represent color text in a same space as color embeddings (e.g., an embedding space created by determining a histogram of LAB-based colors in a three-dimensional (3D) space). Further, techniques are described for indexing color palettes for all the searchable images in the search space. Accordingly, color terms in a text query are directly converted into a color palette and an image search system can return one or more search images with corresponding color palettes that are relevant to (e.g., within a threshold distance from) the color palette of the text query.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: February 27, 2024
    Assignee: Adobe Inc.
    Inventors: Pranav Aggarwal, Ajinkya Kale, Baldo Faieta, Saeid Motiian, Venkata Naveen Kumar Yadav Marri
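The 3D color-histogram idea in the abstract above can be sketched minimally: colors are binned into a coarse three-dimensional histogram, and the flattened, normalized histogram serves as the palette embedding that query and image palettes are compared against by distance. The bin count, the use of raw (L, a, b) tuples scaled to [0, 1), and the Euclidean distance are assumptions for illustration.

```python
import math

BINS = 4  # bins per LAB axis (assumed)

def palette_embedding(colors):
    """Map (L, a, b) tuples in [0, 1)^3 to a flattened 3D histogram."""
    hist = [0.0] * (BINS ** 3)
    for L, a, b in colors:
        i = min(int(L * BINS), BINS - 1)
        j = min(int(a * BINS), BINS - 1)
        k = min(int(b * BINS), BINS - 1)
        hist[(i * BINS + j) * BINS + k] += 1.0
    total = sum(hist) or 1.0
    return [v / total for v in hist]

def palette_distance(p, q):
    """Euclidean distance between two palette embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

query = palette_embedding([(0.9, 0.1, 0.1)])      # palette for a color term
reddish = palette_embedding([(0.88, 0.12, 0.1)])  # similar image palette
bluish = palette_embedding([(0.1, 0.1, 0.9)])     # dissimilar image palette
```

An image search system would then rank images by `palette_distance` to the query palette and return those within a threshold.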
  • Publication number: 20230419551
    Abstract: Techniques for generating a novel image using tokenized image representations are disclosed. In some embodiments, a method of generating the novel image includes generating, via a first machine learning model, a first sequence of coded representations of a first image having one or more features; generating, via a second machine learning model, a second sequence of coded representations of a sketch image having one or more edge features associated with the one or more features; predicting, via a third machine learning model, one or more subsequent coded representations based on the first sequence of coded representations and the second sequence of coded representations; and based on the subsequent coded representations, generating, via the third machine learning model, a first portion of a reconstructed image having one or more image attributes of the first image, and a second portion of the reconstructed image associated with the one or more edge features.
    Type: Application
    Filed: June 22, 2022
    Publication date: December 28, 2023
    Inventors: Midhun Harikumar, Pranav Aggarwal, Ajinkya Gorakhnath Kale
  • Publication number: 20230401877
    Abstract: This application relates to automatic processes for identifying and extracting information from images of documents of varying layouts. For example, a computing device may receive an image of a document, where the image includes a plurality of color channels. The computing device applies a character recognition process to the image to generate optical character recognition data. Further, the computing device determines an area of the image that includes one or more characters based on the optical character recognition data. The computing device adjusts a value of each of a plurality of pixels corresponding to the area of the image determined for each character based on a value of each corresponding character to generate a modified image. The computing device then applies a trained machine learning process to the modified image to generate output data. The output data characterizes characters, such as words and number values, within the original image.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 14, 2023
    Inventors: Ramanujam Ramaswamy Srinivasa, Pranav Aggarwal
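The pixel-adjustment step in the abstract above can be sketched as follows: for each character the OCR pass localizes, the pixels in that character's bounding box are overwritten with a value derived from the character itself, producing a modified image a downstream model can consume. The value mapping (`ord()` modulo 256) and the box format are illustrative assumptions.

```python
def encode_characters(image, detections):
    """Write a per-character value into each detected character's area.

    detections: list of (char, top, left, height, width) boxes.
    """
    out = [row[:] for row in image]
    for char, top, left, h, w in detections:
        value = ord(char) % 256  # assumed mapping from character to pixel value
        for r in range(top, top + h):
            for c in range(left, left + w):
                out[r][c] = value
    return out

# A 3x4 blank image with one OCR detection covering a 1x2 area.
image = [[0] * 4 for _ in range(3)]
modified = encode_characters(image, [("A", 0, 0, 1, 2)])
```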
  • Publication number: 20230315988
    Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
    Type: Application
    Filed: May 10, 2023
    Publication date: October 5, 2023
    Applicant: Adobe Inc.
    Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale
  • Patent number: 11756239
    Abstract: Systems and methods for color replacement are described. Embodiments of the disclosure include a color replacement system that adjusts an image based on a user-input source color and target color. For example, the source color may be replaced with the target color throughout the entire image. In some embodiments, a user provides a speech or text input that identifies a source color to be replaced. The user may then provide a speech or text input identifying the target color that replaces the source color. A color replacement system creates an embedding of the source color, segments the image based on the source color embedding, and then replaces the color of the segmented portion of the image with the target color.
    Type: Grant
    Filed: April 26, 2021
    Date of Patent: September 12, 2023
    Assignee: Adobe Inc.
    Inventors: Pranav Aggarwal, Ajinkya Kale
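The replacement flow in the abstract above reduces to two steps: segment the pixels close to the source color, then recolor that segment with the target color. The sketch below uses raw RGB tuples and a fixed distance threshold as stand-ins; the patent segments using a learned color embedding rather than raw pixel distance.

```python
def close(c1, c2, thresh=60):
    """True if two RGB colors are within a Euclidean distance threshold."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5 < thresh

def replace_color(pixels, source, target):
    """Segment pixels near `source` and recolor them with `target`."""
    return [target if close(p, source) else p for p in pixels]

# Two red-ish pixels and one green pixel; replace red with blue.
image = [(250, 10, 10), (12, 240, 15), (245, 20, 5)]
result = replace_color(image, source=(255, 0, 0), target=(0, 0, 255))
```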
  • Patent number: 11734339
    Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
    Type: Grant
    Filed: October 20, 2020
    Date of Patent: August 22, 2023
    Assignee: Adobe Inc.
    Inventors: Ajinkya Kale, Zhe Lin, Pranav Aggarwal
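The retrieval step in the abstract above can be sketched under the assumption that the text query has already been mapped into the shared multimodal embedding space: the image whose embedding is nearest (by cosine distance here, an assumption) is returned. The embeddings and image names are toy values, not the patent's learned representations.

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def retrieve(query_embedding, image_index):
    """Return the image id with minimal embedding distance to the query."""
    return min(image_index,
               key=lambda k: cosine_distance(query_embedding, image_index[k]))

# Toy index of image embeddings in the shared space.
index = {"dog.jpg": [0.9, 0.1], "car.jpg": [0.1, 0.9]}
best = retrieve([0.8, 0.2], index)  # e.g. the embedding of a non-English query
```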
  • Publication number: 20230206525
    Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, using a model, a learned image representation of a target image. The operations further include generating, using a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image based on the convolving of the learned image representation of the target image with the text embedding.
    Type: Application
    Filed: March 3, 2023
    Publication date: June 29, 2023
    Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
  • Patent number: 11687714
    Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: June 27, 2023
    Assignee: Adobe Inc.
    Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale
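The inference adjustment in the abstract above can be pictured as logit biasing: before decoding, the caption model's word scores are penalized for words on a negative-word list, steering generation away from them. The score values, word list, and penalty size are illustrative assumptions.

```python
NEGATIVE_WORDS = {"ugly", "dirty"}
PENALTY = 5.0  # subtracted from the logit of each negative word (assumed)

def adjust_logits(logits):
    """Bias the model away from associating negative words with an image."""
    return {w: s - PENALTY if w in NEGATIVE_WORDS else s
            for w, s in logits.items()}

# Raw word scores for the next caption token; "ugly" narrowly leads.
logits = {"sunny": 2.1, "ugly": 2.3, "beach": 1.8}
adjusted = adjust_logits(logits)
best_word = max(adjusted, key=adjusted.get)  # decoding now avoids "ugly"
```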
  • Patent number: 11645478
    Abstract: Introduced here is an approach to translating tags assigned to digital images. As an example, a multimodal model may extract embeddings from a tag to be translated and from the digital image with which the tag is associated. These embeddings can be compared to embeddings the multimodal model extracts from a set of target tags associated with a target language. Such an approach allows similarity to be established along two dimensions, which helps avoid the obstacles associated with direct translation.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: May 9, 2023
    Assignee: Adobe Inc.
    Inventors: Ritiz Tambi, Pranav Aggarwal, Ajinkya Kale
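The two-dimensional similarity in the abstract above can be sketched by scoring each candidate target-language tag against both the source tag's embedding and the image's embedding, and picking the candidate with the best combined score. The toy embeddings, the cosine measure, and the equal weighting of the two similarities are assumptions.

```python
import math

def cos(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def best_translation(tag_emb, image_emb, candidates):
    """Pick the target tag most similar to both the source tag and the image."""
    return max(candidates,
               key=lambda c: cos(candidates[c], tag_emb) +
                             cos(candidates[c], image_emb))

tag_emb = [1.0, 0.0]    # embedding of the source-language tag (toy)
image_emb = [0.9, 0.1]  # embedding of the associated image (toy)
targets = {"perro": [0.95, 0.05], "gato": [0.0, 1.0]}
choice = best_translation(tag_emb, image_emb, targets)
```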
  • Patent number: 11615567
    Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: March 28, 2023
    Assignee: Adobe Inc.
    Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
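The class-activation-map step in the abstract above has a compact interpretation: when the learned image representation is a spatial grid of feature vectors living in the same space as the text embedding, "convolving" with the text embedding reduces to a per-location dot product, and the resulting map peaks where the image matches the query. The grid size and values below are assumptions.

```python
def class_activation_map(features, text_embedding):
    """Dot each spatial feature vector with the text embedding."""
    return [[sum(f * t for f, t in zip(cell, text_embedding))
             for cell in row]
            for row in features]

# A 2x2 grid of 2-d feature vectors; the bottom-right cell aligns
# with the text query's embedding direction.
features = [[[1.0, 0.0], [0.0, 0.1]],
            [[0.1, 0.0], [0.0, 1.0]]]
cam = class_activation_map(features, text_embedding=[0.0, 1.0])
```

Thresholding `cam` would then yield the object-segmented image described in the abstract.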
  • Publication number: 20230021506
    Abstract: In some embodiments, apparatuses and methods are provided herein useful to train a machine learning algorithm to detect text of interest. In some embodiments, there is provided a system to detect vertically oriented text of interest including a first data set comprising a plurality of captured digital images each depicting an object of interest and a second data set comprising a plurality of augmented digital images each depicting a captured digital image augmented with a synthetic text image; a first control circuit configured to cause the machine learning algorithm to output a machine learning model trained to automatically detect occurrences of vertically oriented text of interest based on the first data set and the second data set; at least one camera; and a second control circuit configured to execute the machine learning model to automatically detect vertically oriented text of interest on the object of interest.
    Type: Application
    Filed: July 20, 2021
    Publication date: January 26, 2023
    Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
  • Publication number: 20230025548
    Abstract: In some embodiments, apparatuses and methods are provided herein useful to determine text on an object. In some embodiments, there is provided a system to determine text of interest on an object of interest including at least one camera and a control circuit configured to execute a machine learning model trained to identify the text of interest, group into a cluster each node point that is located substantially in the same location in the text of interest, determine a score value of each particular character in the cluster, identify the particular character that has a determined score value corresponding to at least a threshold score value relative to all characters in the cluster, assign the particular character having the determined score value corresponding to at least the threshold score value as a recognized character in the cluster, and transmit to a display monitor overlay data.
    Type: Application
    Filed: July 20, 2021
    Publication date: January 26, 2023
    Inventors: Ramanujam Ramaswamy Srinivasa, Manish Kumar, Pranav Aggarwal
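The cluster-voting step in the abstract above can be sketched simply: detections landing at roughly the same location are grouped into a cluster, each candidate character is scored (here by its vote count), and a character is assigned only if its score clears a threshold relative to the whole cluster. The 50% threshold fraction is an assumption.

```python
from collections import Counter

def recognize(cluster, threshold=0.5):
    """Return the winning character if its share of votes >= threshold."""
    counts = Counter(cluster)
    char, votes = counts.most_common(1)[0]
    return char if votes / len(cluster) >= threshold else None

# Four detections at the same location; "8" dominates the cluster.
cluster = ["8", "8", "B", "8"]
winner = recognize(cluster)
```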
  • Publication number: 20220343561
    Abstract: Systems and methods for color replacement are described. Embodiments of the disclosure include a color replacement system that adjusts an image based on a user-input source color and target color. For example, the source color may be replaced with the target color throughout the entire image. In some embodiments, a user provides a speech or text input that identifies a source color to be replaced. The user may then provide a speech or text input identifying the target color that replaces the source color. A color replacement system creates an embedding of the source color, segments the image based on the source color embedding, and then replaces the color of the segmented portion of the image with the target color.
    Type: Application
    Filed: April 26, 2021
    Publication date: October 27, 2022
    Inventors: Pranav Aggarwal, Ajinkya Kale
  • Publication number: 20220277039
    Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a color embedding network trained using machine learning techniques to generate embedded color representations for color terms included in a text search query. For example, techniques described herein are used to represent color text in a same space as color embeddings (e.g., an embedding space created by determining a histogram of LAB-based colors in a three-dimensional (3D) space). Further, techniques are described for indexing color palettes for all the searchable images in the search space. Accordingly, color terms in a text query are directly converted into a color palette and an image search system can return one or more search images with corresponding color palettes that are relevant to (e.g., within a threshold distance from) the color palette of the text query.
    Type: Application
    Filed: February 26, 2021
    Publication date: September 1, 2022
    Inventors: Pranav Aggarwal, Ajinkya Kale, Baldo Faieta, Saeid Motiian, Venkata Naveen Kumar Yadav Marri
  • Publication number: 20220156992
    Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
    Type: Application
    Filed: November 18, 2020
    Publication date: May 19, 2022
    Inventors: Midhun Harikumar, Pranav Aggarwal, Baldo Faieta, Ajinkya Kale, Zhe Lin
  • Publication number: 20220138439
    Abstract: Introduced here is an approach to translating tags assigned to digital images. As an example, a multimodal model may extract embeddings from a tag to be translated and from the digital image with which the tag is associated. These embeddings can be compared to embeddings the multimodal model extracts from a set of target tags associated with a target language. Such an approach allows similarity to be established along two dimensions, which helps avoid the obstacles associated with direct translation.
    Type: Application
    Filed: November 4, 2020
    Publication date: May 5, 2022
    Inventors: Ritiz Tambi, Pranav Aggarwal, Ajinkya Kale
  • Publication number: 20220121702
    Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
    Type: Application
    Filed: October 20, 2020
    Publication date: April 21, 2022
    Inventors: Ajinkya Kale, Zhe Lin, Pranav Aggarwal
  • Publication number: 20220114361
    Abstract: Embodiments are disclosed for training an image caption generator model to generate phrase tags for input images. The phrase tags can include short phrases that describe the contents of the images (e.g., objects depicted therein). Once trained, the image caption generator model can be used as an image phrase tagger to tag input images from an image library with phrase tags. The image library can be indexed based on their phrase tags. Subsequently, when the image library is queried, the query can be divided into phrases and the index can be used to identify matching images.
    Type: Application
    Filed: October 14, 2020
    Publication date: April 14, 2022
    Inventors: Ajinkya Kale, Pranav Aggarwal
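The indexing-and-query idea in the abstract above can be sketched with an inverted index: each image's phrase tags map back to the image, a query is divided into phrases, and images matching any query phrase are returned. The phrase-splitting rule (split on commas) and the data shapes are illustrative assumptions.

```python
def build_index(tagged_images):
    """Map each phrase tag to the set of image ids carrying it."""
    index = {}
    for image_id, phrases in tagged_images.items():
        for phrase in phrases:
            index.setdefault(phrase, set()).add(image_id)
    return index

def search(index, query):
    """Divide the query into phrases and union the matching images."""
    phrases = [p.strip() for p in query.split(",")]
    hits = set()
    for p in phrases:
        hits |= index.get(p, set())
    return hits

index = build_index({"img1": ["dog on beach", "sunset"],
                     "img2": ["city skyline"]})
results = search(index, "dog on beach, sunset")
```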
  • Publication number: 20220058340
    Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
    Type: Application
    Filed: August 20, 2020
    Publication date: February 24, 2022
    Inventors: Pranav Aggarwal, Di Pu, Daniel ReMine, Ajinkya Kale