Patents by Inventor Christopher James Kelley
Christopher James Kelley has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250218139
Abstract: Systems and methods for providing visual indications of generative model responses can include obtaining a user input and processing the user input with a generative model to generate a model-generated response. The systems and methods can process the model-generated response and an image of an environment to generate an augmented image. The augmented image can include visual indicators of the model-generated response, which can include annotating the image based on detected features within the image. Generation of the augmented image can include object detection and annotation based on the content of the model-generated response.
Type: Application
Filed: February 25, 2025
Publication date: July 3, 2025
Inventors: Harshit Kharbanda, Louis Wang, Christopher James Kelley, Jessica Lee, Igor Bonaci, Daniel Valcarce Silva
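A minimal sketch of what the annotation step described in this abstract could look like, assuming a pre-computed list of detections and hypothetical helper names; it illustrates the idea, not the claimed implementation.

```python
# Illustrative sketch: pair parts of a model response with objects detected in
# the image so a renderer could draw on-image callouts ("visual indicators").
# DetectedObject and annotate_response are hypothetical stand-ins.
from dataclasses import dataclass

@dataclass
class DetectedObject:
    label: str
    box: tuple  # (x, y, width, height) in pixels

def annotate_response(response: str, detections: list[DetectedObject]) -> list[dict]:
    """Collect an annotation for each detected object mentioned in the response."""
    annotations = []
    for obj in detections:
        if obj.label.lower() in response.lower():
            annotations.append({"label": obj.label, "box": obj.box, "note": response})
    return annotations

# Example: a response about a detected "thermostat" yields one on-image callout.
detections = [DetectedObject("thermostat", (120, 40, 64, 64)),
              DetectedObject("window", (300, 10, 200, 150))]
print(annotate_response("Lower the thermostat to save energy.", detections))
```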
-
Patent number: 12346386
Abstract: A multimodal search system is described. The system can receive image data captured by a camera of a user device. Additionally, the system can receive audio data associated with the image data. The audio data can be captured by a microphone of the user device. Moreover, the system can process the image data to generate visual features. Furthermore, the system can process the audio data to generate a plurality of words. The system can generate a plurality of search terms based on the plurality of words and the visual features. Subsequently, the system can determine one or more search results associated with the plurality of search terms and provide the one or more search results as an output.
Type: Grant
Filed: April 25, 2023
Date of Patent: July 1, 2025
Assignee: GOOGLE LLC
Inventors: Harshit Kharbanda, Belinda Luna Zeng, Viviana Caso Corella, Christopher James Kelley, Jessica Lee, Pendar Yousefi, Dounia Berrada, Sundeep Vaddadi, Kai Yu, Balint Miklos, Severin Heiniger, Louis Wang
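A rough sketch of how spoken words and visual features might be merged into search terms, with the speech and vision models stubbed out as hypothetical placeholders; this is not the patented method, only an illustration of the flow.

```python
# Illustrative sketch of combining transcribed audio with visual labels into
# search terms. Both recognizers are toy stubs standing in for real models.
def transcribe_audio(audio_bytes: bytes) -> list[str]:
    # Hypothetical stub: pretend the microphone audio said this.
    return ["where", "can", "I", "buy", "this"]

def extract_visual_labels(image_bytes: bytes) -> list[str]:
    # Hypothetical stub: pretend a vision model labeled the photo.
    return ["ceramic mug", "blue glaze"]

def build_search_terms(words: list[str], labels: list[str]) -> list[str]:
    # Drop deictic words like "this" and splice in the visual labels instead.
    terms = [w for w in words if w.lower() not in {"this", "that", "it"}]
    return terms + labels

terms = build_search_terms(transcribe_audio(b""), extract_visual_labels(b""))
print(" ".join(terms))  # e.g. "where can I buy ceramic mug blue glaze"
```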
-
Publication number: 20250148782
Abstract: Systems and methods for providing scene understanding can include obtaining a plurality of images, stitching images associated with the scene, detecting objects in the scene, and providing information associated with the objects in the scene. The systems and methods can include determining filter tags or query tags that can be selected to filter the plurality of objects, which can then be provided as information to the user to provide further insight on the scene. The information may be provided in an augmented-reality experience via text or other user-interface elements anchored to objects in the images.
Type: Application
Filed: January 9, 2025
Publication date: May 8, 2025
Inventors: Jessica Lee, Christopher James Kelley, Alok Aggarwal, Harshit Kharbanda
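A small sketch of the filter-tag idea from this abstract, under the assumption that detections arrive as simple label/category records; the grouping logic is illustrative only.

```python
# Illustrative sketch: objects detected across a stitched scene are grouped
# under tags a user could select to narrow what is shown.
from collections import defaultdict

def derive_filter_tags(detections: list[dict]) -> dict[str, list[dict]]:
    """Group detected objects by a coarse category so each category can be
    offered as a selectable filter tag."""
    tags = defaultdict(list)
    for det in detections:
        tags[det["category"]].append(det)
    return dict(tags)

scene = [
    {"label": "espresso machine", "category": "appliance", "anchor": (0.2, 0.4)},
    {"label": "kettle", "category": "appliance", "anchor": (0.6, 0.5)},
    {"label": "cookbook", "category": "book", "anchor": (0.8, 0.3)},
]
tags = derive_filter_tags(scene)
print(sorted(tags))                             # tags offered in the UI
print([d["label"] for d in tags["appliance"]])  # objects kept when 'appliance' is selected
```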
-
Publication number: 20250124075
Abstract: Systems and methods for textual replacement can include the determination of a visual intent, which can trigger an interface for selecting an image to replace visual descriptors. The visually descriptive terms can be identified, and an indicator can be provided to indicate that the text replacement option may be initiated. An image can then be selected by a user to replace the visually descriptive terms.
Type: Application
Filed: December 23, 2024
Publication date: April 17, 2025
Inventors: Harshit Kharbanda, Christopher James Kelley, Pendar Yousefi
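A toy sketch of the text-replacement idea, assuming a hypothetical descriptor lexicon and a placeholder syntax for the selected image; neither detail is taken from the patent.

```python
# Illustrative sketch: visually descriptive words in a query are dropped and
# the user-selected image is anchored in their place.
VISUAL_DESCRIPTORS = {"red", "floral", "striped", "vintage"}  # hypothetical lexicon

def replace_visual_descriptors(query: str, image_ref: str) -> str:
    kept, replaced = [], False
    for word in query.split():
        if word.lower().strip(",.") in VISUAL_DESCRIPTORS:
            replaced = True        # drop the descriptive word...
            continue
        kept.append(word)
    if replaced:
        kept.insert(0, f"[image:{image_ref}]")  # ...and reference the picked image instead
    return " ".join(kept)

print(replace_visual_descriptors("red floral dress under 50 dollars", "IMG_0042"))
# [image:IMG_0042] dress under 50 dollars
```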
-
Patent number: 12277635
Abstract: A multimodal search system is described. The system can receive image data from a user device. Additionally, the system can receive a prompt associated with the image data. Moreover, the system can determine, using a computer vision model, a first object in the image data that is associated with the prompt. Furthermore, the system can receive, from the user device, a user indication on whether the image data includes the first object. Subsequently, in response to receiving the user indication, the system can generate a response using a large language model.
Type: Grant
Filed: December 7, 2023
Date of Patent: April 15, 2025
Assignee: GOOGLE LLC
Inventors: Harshit Kharbanda, Louis Wang, Christopher James Kelley, Jessica Lee
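A brief sketch of the confirm-then-answer flow described here, with the computer vision model and the language model replaced by hypothetical stubs.

```python
# Illustrative sketch: a vision model proposes the object the prompt refers to,
# the user confirms it, and only then is a response generated.
def detect_object_for_prompt(image_bytes: bytes, prompt: str) -> str:
    return "orchid"  # hypothetical computer-vision output

def generate_response(prompt: str, confirmed_object: str) -> str:
    return f"For your {confirmed_object}: water sparingly and keep in indirect light."

def answer(image_bytes: bytes, prompt: str, user_confirms) -> str:
    candidate = detect_object_for_prompt(image_bytes, prompt)
    if user_confirms(candidate):          # e.g. "Is this an orchid?" -> yes/no
        return generate_response(prompt, candidate)
    return "Could you tell me what the photo shows?"

print(answer(b"", "How do I care for this plant?", lambda obj: True))
```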
-
Patent number: 12271417
Abstract: Systems and methods for multi-image search can include obtaining two or more images and determining one or more search results that are based on the two or more images. The one or more search results can be determined based on determined shared attributes of the two or more images. The one or more search results may be based on feature embeddings associated with the two or more images. The two or more images may be obtained based on one or more user interactions with one or more databases.
Type: Grant
Filed: April 24, 2023
Date of Patent: April 8, 2025
Assignee: GOOGLE LLC
Inventors: Belinda Luna Zeng, Harshit Kharbanda, Christopher James Kelley, Erica Bjornsson, David William Hendon
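A compact sketch of searching from two images via their shared attributes; attribute extraction is assumed to have happened upstream, and the overlap-count ranking is an illustration rather than the claimed method.

```python
# Illustrative sketch: the query is whatever the two images have in common,
# and catalog items are ranked by how many shared attributes they match.
def shared_attributes(image_attrs: list[set[str]]) -> set[str]:
    return set.intersection(*image_attrs)

def rank_results(catalog: dict[str, set[str]], query_attrs: set[str]) -> list[str]:
    return sorted(catalog, key=lambda item: len(catalog[item] & query_attrs), reverse=True)

img_a = {"mid-century", "walnut", "chair"}
img_b = {"mid-century", "walnut", "table"}
query = shared_attributes([img_a, img_b])             # {'mid-century', 'walnut'}
catalog = {"walnut credenza": {"mid-century", "walnut"},
           "plastic stool": {"modern", "plastic"}}
print(rank_results(catalog, query))                   # walnut credenza ranks first
```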
-
Patent number: 12266065
Abstract: Systems and methods for providing visual indications of generative model responses can include obtaining a user input and processing the user input with a generative model to generate a model-generated response. The systems and methods can process the model-generated response and an image of an environment to generate an augmented image. The augmented image can include visual indicators of the model-generated response, which can include annotating the image based on detected features within the image. Generation of the augmented image can include object detection and annotation based on the content of the model-generated response.
Type: Grant
Filed: January 10, 2024
Date of Patent: April 1, 2025
Assignee: GOOGLE LLC
Inventors: Harshit Kharbanda, Louis Wang, Christopher James Kelley, Jessica Lee, Igor Bonaci, Daniel Valcarce Silva
-
Publication number: 20250087207
Abstract: The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of text as an output of the machine-learned language model as a result of the machine-learned language model processing the model input.
Type: Application
Filed: June 6, 2024
Publication date: March 13, 2025
Inventors: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Misha Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng
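A minimal sketch of choosing a response type from characteristics of the recognized text and assembling a prompt for a language model; the characteristic check and the prompt templates are assumptions made for illustration.

```python
# Illustrative sketch: pick a response type from properties of the text found
# in an image, then build the model input from a matching prompt template.
PROMPTS = {
    "summarize": "Summarize the following text:\n{text}",
    "define": "Explain the following term in simple language:\n{text}",
}

def choose_response_type(text: str) -> str:
    # A long passage gets summarized; a short snippet gets explained.
    return "summarize" if len(text.split()) > 40 else "define"

def build_model_input(text: str) -> str:
    response_type = choose_response_type(text)
    return PROMPTS[response_type].format(text=text)

ocr_text = "photosynthesis"
print(build_model_input(ocr_text))   # a short snippet selects the "define" prompt
```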
-
Patent number: 12230030
Abstract: Systems and methods for providing scene understanding can include obtaining a plurality of images, stitching images associated with the scene, detecting objects in the scene, and providing information associated with the objects in the scene. The systems and methods can include determining filter tags or query tags that can be selected to filter the plurality of objects, which can then be provided as information to the user to provide further insight on the scene. The information may be provided in an augmented-reality experience via text or other user-interface elements anchored to objects in the images.
Type: Grant
Filed: December 20, 2022
Date of Patent: February 18, 2025
Assignee: GOOGLE LLC
Inventors: Jessica Lee, Christopher James Kelley, Alok Aggarwal, Harshit Kharbanda
-
Patent number: 12216703
Abstract: Systems and methods for textual replacement can include the determination of a visual intent, which can trigger an interface for selecting an image to replace visual descriptors. The visually descriptive terms can be identified, and an indicator can be provided to indicate that the text replacement option may be initiated. An image can then be selected by a user to replace the visually descriptive terms.
Type: Grant
Filed: October 18, 2022
Date of Patent: February 4, 2025
Assignee: GOOGLE LLC
Inventors: Harshit Kharbanda, Christopher James Kelley, Pendar Yousefi
-
Publication number: 20240403362
Abstract: A multimodal search system using a video query is described. The system can receive video data captured by a camera of a user device. The video data can have a sequence of image frames. Additionally, the system can receive audio data associated with the video data captured by the user device. Moreover, the system can process, using one or more machine-learned models, the sequence of image frames to generate video embeddings related to the sequence of the image frames. The video embeddings can have a plurality of image embeddings associated with the sequence of image frames. Furthermore, the system can determine one or more video results based on the video embeddings and the audio data. Subsequently, the system can transmit, to the user device, the one or more video results.
Type: Application
Filed: May 31, 2023
Publication date: December 5, 2024
Inventors: Harshit Kharbanda, Belinda Luna Zeng, Viviana Caso Corella, Aashi Jain, David William Hendon, Christopher James Kelley, Jessica Lee, Dounia Berrada, Kai Yu, Louis Wang, Thomas J. Duerig, Radu Soricut, Robin Dua
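A toy sketch of pooling per-frame embeddings into a video embedding and scoring candidates against it; the frame encoder and the similarity measure are stand-ins, not the claimed models.

```python
# Illustrative sketch: mean-pool per-frame embeddings from a video query and
# score candidate results against the pooled vector.
def embed_frame(frame) -> list[float]:
    # Hypothetical stand-in: a tiny fake embedding derived from the frame id.
    return [float(frame % 3), float(frame % 5)]

def pool_video_embedding(frames: list[int]) -> list[float]:
    per_frame = [embed_frame(f) for f in frames]
    n = len(per_frame)
    return [sum(vals) / n for vals in zip(*per_frame)]   # mean-pool over frames

def score(candidate: list[float], query: list[float]) -> float:
    return sum(c * q for c, q in zip(candidate, query))  # dot-product similarity

video_embedding = pool_video_embedding(frames=list(range(8)))
candidates = {"how-to clip": [1.0, 2.0], "unrelated clip": [0.0, 0.1]}
best = max(candidates, key=lambda name: score(candidates[name], video_embedding))
print(best)   # the audio query would further refine or re-rank these results
```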
-
Publication number: 20240394768
Abstract: Systems and methods for searching using machine-learned model-generated outputs can provide a user with a medium for generating a theoretical dataset that can then be matched to a real world example. The systems and methods can include selecting a plurality of terms, which can be utilized to generate a prompt input that can be processed by a dataset generation model to generate a plurality of model-generated datasets. A selection can then be received that selects a particular model-generated dataset to use to query a database.
Type: Application
Filed: August 8, 2024
Publication date: November 28, 2024
Inventors: Harshit Kharbanda, Arash Sadr, Alice Au Quan, Belinda Luna Zeng, Christopher James Kelley, Jieming Yu, Minsang Choi
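A small sketch of the generate-pick-search flow, with a stub generator and an in-memory list standing in for the real dataset generation model and database.

```python
# Illustrative sketch: selected terms become a prompt, a generator proposes
# candidate descriptions, and the one the user picks queries the database.
def generate_candidates(terms: list[str], n: int = 3) -> list[str]:
    prompt = "a " + " ".join(terms)
    return [f"{prompt} (variation {i + 1})" for i in range(n)]   # stub generator

def query_database(description: str, database: list[str]) -> list[str]:
    words = set(description.lower().split())
    return [row for row in database if words & set(row.lower().split())]

candidates = generate_candidates(["cozy", "reading", "nook"])
chosen = candidates[1]                     # user selects one model-generated option
print(query_database(chosen, ["cozy corner armchair", "standing desk"]))
```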
-
Publication number: 20240378237
Abstract: Result images are retrieved based on a similarity to a query image. A set of textual inputs is processed with a machine-learned language model to obtain a language output comprising textual content, wherein the set of textual inputs comprises textual content from source documents that include the result images, and a prompt associated with the query image. The language output and the result images are provided to a user computing device. Information is received descriptive of an indication by a user that a first result image is visually dissimilar to the query image. Textual content associated with the source document that includes the first result image is removed from the set of textual inputs. The set of textual inputs is processed with the machine-learned language model to obtain a refined language output. The refined language output is provided to the user computing device.
Type: Application
Filed: May 9, 2023
Publication date: November 14, 2024
Inventors: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Belinda Luna Zeng, Louis Wang
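A short sketch of the refinement loop described here: the text tied to an image the user flags as visually dissimilar is dropped before the output is regenerated. The language model call is a toy stand-in.

```python
# Illustrative sketch: the model input is text from the result images' source
# pages; flagging an image removes its page's text and the output is re-run.
def run_language_model(snippets: dict[str, str], prompt: str) -> str:
    return prompt + " " + " | ".join(snippets.values())   # toy stand-in

def refine(snippets: dict[str, str], prompt: str, dissimilar_image: str) -> str:
    remaining = {img: text for img, text in snippets.items() if img != dissimilar_image}
    return run_language_model(remaining, prompt)

snippets = {"img_1.jpg": "Care guide for monstera plants.",
            "img_2.jpg": "Recipe for green smoothies."}
first = run_language_model(snippets, "Describe this plant:")
refined = refine(snippets, "Describe this plant:", dissimilar_image="img_2.jpg")
print(refined)   # the smoothie page no longer influences the answer
```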
-
Publication number: 20240378236
Abstract: A result image is retrieved based on a similarity between a query image and the result image. A first unit of text is obtained, wherein the first unit of text comprises at least a portion of textual content of a source document that includes the result image. A second unit of text is determined responsive to a prompt associated with the query image, wherein the second unit of text comprises one or more of (a) at least some of the first unit of text, or (b) text derived from the first unit of text. The second unit of text and the result image are provided for display within an interface.
Type: Application
Filed: May 9, 2023
Publication date: November 14, 2024
Inventors: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Belinda Luna Zeng, Louis Wang
-
Publication number: 20240362279
Abstract: A multimodal search system is described. The system can receive image data captured by a camera of a user device. Additionally, the system can receive audio data associated with the image data. The audio data can be captured by a microphone of the user device. Moreover, the system can process the image data to generate visual features. Furthermore, the system can process the audio data to generate a plurality of words. The system can generate a plurality of search terms based on the plurality of words and the visual features. Subsequently, the system can determine one or more search results associated with the plurality of search terms and provide the one or more search results as an output.
Type: Application
Filed: April 25, 2023
Publication date: October 31, 2024
Inventors: Harshit Kharbanda, Belinda Luna Zeng, Viviana Caso Corella, Christopher James Kelley, Jessica Lee, Pendar Yousefi, Dounia Berrada, Sundeep Vaddadi, Kai Yu, Balint Miklos, Severin Heiniger, Louis Wang
-
Publication number: 20240354332
Abstract: Systems and methods for multi-image search can include obtaining two or more images and determining one or more search results that are based on the two or more images. The one or more search results can be determined based on determined shared attributes of the two or more images. The one or more search results may be based on feature embeddings associated with the two or more images. The two or more images may be obtained based on one or more user interactions with one or more databases.
Type: Application
Filed: April 24, 2023
Publication date: October 24, 2024
Inventors: Belinda Luna Zeng, Harshit Kharbanda, Christopher James Kelley, Erica Bjornsson, David William Hendon
-
Patent number: 12086857
Abstract: Systems and methods for searching using machine-learned model-generated outputs can provide a user with a medium for generating a theoretical dataset that can then be matched to a real world example. The systems and methods can include selecting a plurality of terms, which can be utilized to generate a prompt input that can be processed by a dataset generation model to generate a plurality of model-generated datasets. A selection can then be received that selects a particular model-generated dataset to use to query a database.
Type: Grant
Filed: March 31, 2023
Date of Patent: September 10, 2024
Assignee: GOOGLE LLC
Inventors: Harshit Kharbanda, Arash Sadr, Alice Au Quan, Belinda Luna Zeng, Christopher James Kelley, Jieming Yu, Minsang Choi
-
Patent number: 12033620
Abstract: The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of text as an output of the machine-learned language model as a result of the machine-learned language model processing the model input.
Type: Grant
Filed: September 8, 2023
Date of Patent: July 9, 2024
Assignee: GOOGLE LLC
Inventors: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Mikhail Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng
-
Publication number: 20240202795
Abstract: Systems and methods for searching using machine-learned model-generated outputs can provide a user with a medium for generating a theoretical dataset that can then be matched to a real world example. The systems and methods can include selecting a plurality of terms, which can be utilized to generate a prompt input that can be processed by a dataset generation model to generate a plurality of model-generated datasets. A selection can then be received that selects a particular model-generated dataset to use to query a database.
Type: Application
Filed: March 31, 2023
Publication date: June 20, 2024
Inventors: Harshit Kharbanda, Arash Sadr, Alice Au Quan, Belinda Luna Zeng, Christopher James Kelley, Jieming Yu, Minsang Choi
-
Publication number: 20240126807
Abstract: Systems and methods for textual replacement can include the determination of a visual intent, which can trigger an interface for selecting an image to replace visual descriptors. The visually descriptive terms can be identified, and an indicator can be provided to indicate that the text replacement option may be initiated. An image can then be selected by a user to replace the visually descriptive terms.
Type: Application
Filed: October 18, 2022
Publication date: April 18, 2024
Inventors: Harshit Kharbanda, Christopher James Kelley, Pendar Yousefi