Patents by Inventor Karan Sikka

Karan Sikka is a named inventor on the patent filings listed below. The listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11934793
    Abstract: A method, apparatus and system for training an embedding space for content comprehension and response includes, for each layer of a hierarchical taxonomy having at least two layers including respective words resulting in layers of varying complexity, determining a set of words associated with a layer of the hierarchical taxonomy, determining a question-answer pair based on a question generated using at least one word of the set of words and at least one content domain, determining a vector representation for the generated question and for content related to the at least one content domain of the question-answer pair, and embedding the question vector representation and the content vector representations into a common embedding space where vector representations that are related are closer in the embedding space than unrelated embedded vector representations. Requests for content can then be fulfilled using the trained common embedding space.
    Type: Grant
    Filed: November 1, 2021
    Date of Patent: March 19, 2024
    Assignee: SRI International
    Inventors: Ajay Divakaran, Karan Sikka, Yi Yao, Yunye Gong, Stephanie Nunn, Pritish Sahu, Michael A. Cogswell, Jesse Hostetler, Sara Rutherford-Quach
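The entry above describes training a common embedding space in which related question and content vectors land near each other. Below is a minimal PyTorch sketch of that general idea, assuming precomputed question and content feature vectors and a standard triplet margin loss; the module names, dimensions, and loss choice are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CommonEmbedder(nn.Module):
    """Projects question and content features into one shared space."""
    def __init__(self, q_dim=768, c_dim=768, emb_dim=256):
        super().__init__()
        self.q_proj = nn.Linear(q_dim, emb_dim)
        self.c_proj = nn.Linear(c_dim, emb_dim)

    def forward(self, q_feats, c_feats):
        # L2-normalize so cosine similarity reduces to a dot product
        q = F.normalize(self.q_proj(q_feats), dim=-1)
        c = F.normalize(self.c_proj(c_feats), dim=-1)
        return q, c

def triplet_loss(q, c_pos, c_neg, margin=0.2):
    """Related pairs are pulled together; unrelated pairs pushed apart."""
    pos = (q * c_pos).sum(-1)  # similarity to related content
    neg = (q * c_neg).sum(-1)  # similarity to unrelated content
    return F.relu(margin - pos + neg).mean()
```

At inference time a content request would be fulfilled by embedding the query and returning the nearest content vectors in the trained space.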
  • Publication number: 20240054294
    Abstract: A method, apparatus and system for moderating multilingual content data, for example, presented during a communication session include receiving or pulling content data that can include multilingual content, classifying, using a first machine learning system, the content data by projecting the content data into a trained embedding space to determine at least one English-language classification for the content data, and determining, using a second machine learning system, if the content data violates at least one predetermined moderation rule, wherein the second machine learning system is trained to determine from English-language classifications determined by the first machine learning system if the content data violates moderation rules. In some embodiments, the method, apparatus and system can further include prohibiting a presentation of the content data related to the at least one English-language classification determined to violate the at least one predetermined moderation rule.
    Type: Application
    Filed: August 14, 2023
    Publication date: February 15, 2024
    Inventors: Karan Sikka, Meng Ye, Ajay Divakaran
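A sketch of the two-stage design described in the entry above: a first model maps multilingual content into a trained embedding space to produce English-language classifications, and a second model decides from those classifications whether a moderation rule is violated. Everything below (module names, dimensions, the rule model being a linear head) is an illustrative assumption, not the patent's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StageOneClassifier(nn.Module):
    """Projects multilingual content features into an embedding space and
    scores them against English-language class labels."""
    def __init__(self, feat_dim=512, emb_dim=256, n_classes=100):
        super().__init__()
        self.proj = nn.Linear(feat_dim, emb_dim)
        self.class_embs = nn.Parameter(torch.randn(n_classes, emb_dim))

    def forward(self, content_feats):
        z = F.normalize(self.proj(content_feats), dim=-1)
        labels = F.normalize(self.class_embs, dim=-1)
        return z @ labels.T  # per-class scores in English-label space

class StageTwoModerator(nn.Module):
    """Maps English-language classification scores to a violation score."""
    def __init__(self, n_classes=100):
        super().__init__()
        self.head = nn.Linear(n_classes, 1)

    def forward(self, class_scores):
        return torch.sigmoid(self.head(class_scores))  # P(violates a rule)
```

Content whose violation probability exceeds a threshold would then be withheld from presentation, per the last sentence of the abstract.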
  • Patent number: 11610384
    Abstract: A method, apparatus and system for zero shot object detection includes, in a semantic embedding space having embedded object class labels, training the space by embedding extracted features of bounding boxes and object class labels of labeled bounding boxes of known object classes into the space, determining regions in an image having unknown object classes on which to perform object detection as proposed bounding boxes, extracting features of the proposed bounding boxes, projecting the extracted features of the proposed bounding boxes into the space, computing a similarity measure between the projected features of the proposed bounding boxes and the embedded, extracted features of the bounding boxes of the known object classes in the space, and predicting an object class label for proposed bounding boxes by determining a nearest embedded object class label to the projected features of the proposed bounding boxes in the space based on the similarity measures.
    Type: Grant
    Filed: June 2, 2021
    Date of Patent: March 21, 2023
    Assignee: SRI International
    Inventors: Karan Sikka, Ajay Divakaran, Ankan Bansal
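The prediction step of the zero-shot detector above, reduced to a few lines: proposal features are projected into the semantic space and labeled with the nearest embedded class label. The projection layer and word-vector inputs here are placeholders (assumptions), and region-proposal generation is omitted.

```python
import torch
import torch.nn.functional as F

def predict_labels(box_feats, proj, label_embs, label_names):
    """box_feats: [N, D] features of proposed bounding boxes.
    label_embs:  [K, E] semantic embeddings of class labels, which may
    include classes never seen with bounding-box supervision."""
    z = F.normalize(proj(box_feats), dim=-1)   # project into semantic space
    l = F.normalize(label_embs, dim=-1)
    sims = z @ l.T                             # cosine similarity per class
    best = sims.argmax(dim=-1)                 # nearest embedded class label
    return [label_names[i] for i in best], sims
```

Because labels are matched by proximity in the shared space, classes absent from detection training can still be predicted as long as their semantic embeddings are available.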
  • Patent number: 11361470
    Abstract: A method, apparatus and system for visual localization includes extracting appearance features of an image, extracting semantic features of the image, fusing the extracted appearance features and semantic features, pooling and projecting the fused features into a semantic embedding space having been trained using fused appearance and semantic features of images having known locations, computing a similarity measure between the projected fused features and embedded, fused appearance and semantic features of images, and predicting a location of the image associated with the projected, fused features. An image can include at least one image from a plurality of modalities such as a Light Detection and Ranging image, a Radio Detection and Ranging image, or a 3D Computer Aided Design modeling image, and an image from a different sensor, such as an RGB image sensor, captured from a same geo-location, which is used to determine the semantic features of the multi-modal image.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: June 14, 2022
    Assignee: SRI International
    Inventors: Han-Pang Chiu, Zachary Seymour, Karan Sikka, Supun Samarasekera, Rakesh Kumar, Niluthpol Mithun
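A compact sketch of the localization pipeline described above: appearance and semantic features are fused, pooled, projected into the trained space, and matched against a database of embeddings with known geo-locations. The fusion-by-concatenation and mean pooling used here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseAndProject(nn.Module):
    def __init__(self, app_dim=512, sem_dim=128, emb_dim=256):
        super().__init__()
        self.fuse = nn.Linear(app_dim + sem_dim, emb_dim)

    def forward(self, appearance, semantics):
        # appearance, semantics: [num_regions, dim] features for one image
        fused = torch.relu(self.fuse(torch.cat([appearance, semantics], -1)))
        pooled = fused.mean(dim=0)          # pool over regions
        return F.normalize(pooled, dim=-1)  # project into embedding space

def localize(query_emb, db_embs, db_locations):
    """Predict the location of the nearest geotagged embedding."""
    sims = F.normalize(db_embs, dim=-1) @ query_emb
    return db_locations[sims.argmax().item()]
```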
  • Publication number: 20220138433
    Abstract: A method, apparatus and system for training an embedding space for content comprehension and response includes, for each layer of a hierarchical taxonomy having at least two layers including respective words resulting in layers of varying complexity, determining a set of words associated with a layer of the hierarchical taxonomy, determining a question-answer pair based on a question generated using at least one word of the set of words and at least one content domain, determining a vector representation for the generated question and for content related to the at least one content domain of the question-answer pair, and embedding the question vector representation and the content vector representations into a common embedding space where vector representations that are related are closer in the embedding space than unrelated embedded vector representations. Requests for content can then be fulfilled using the trained common embedding space.
    Type: Application
    Filed: November 1, 2021
    Publication date: May 5, 2022
    Inventors: Ajay Divakaran, Karan Sikka, Yi Yao, Yunye Gong, Stephanie Nunn, Pritish Sahu, Michael A. Cogswell, Jesse Hostetler, Sara Rutherford-Quach
  • Patent number: 11238631
    Abstract: A method, apparatus and system for visual grounding of a caption in an image include projecting at least two parsed phrases of the caption into a trained semantic embedding space, projecting extracted region proposals of the image into the trained semantic embedding space, aligning the extracted region proposals and the at least two parsed phrases, aggregating the aligned region proposals and the at least two parsed phrases to determine a caption-conditioned image representation and projecting the caption-conditioned image representation and the caption into a semantic embedding space to align the caption-conditioned image representation and the caption. The method, apparatus and system can further include a parser for parsing the caption into the at least two parsed phrases and a region proposal module for extracting the region proposals from the image.
    Type: Grant
    Filed: April 22, 2020
    Date of Patent: February 1, 2022
    Assignee: SRI International
    Inventors: Karan Sikka, Ajay Divakaran, Samyak Datta
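A sketch of the alignment-and-aggregation step in the grounding method above, assuming phrase and region features have already been projected into the shared space; the softmax attention used for alignment is one reasonable choice, not necessarily the patented one.

```python
import torch

def caption_conditioned_image(region_embs, phrase_embs, temp=0.1):
    """region_embs: [R, D] projected region proposals.
    phrase_embs: [P, D] projected parsed phrases (P >= 2)."""
    # align: each parsed phrase attends over the image regions
    att = torch.softmax(phrase_embs @ region_embs.T / temp, dim=-1)  # [P, R]
    aligned = att @ region_embs                                      # [P, D]
    # aggregate phrase-aligned regions into one caption-conditioned vector
    return aligned.mean(dim=0)
```

That vector and the caption embedding are then projected into a semantic space and trained to align, e.g., with a margin loss like the one sketched earlier in this listing.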
  • Patent number: 11210572
    Abstract: A method, apparatus and system for understanding visual content includes determining at least one region proposal for an image, attending at least one symbol of the proposed image region, attending a portion of the proposed image region using information regarding the attended symbol, extracting appearance features of the attended portion of the proposed image region, fusing the appearance features of the attended image region and features of the attended symbol, projecting the fused features into a semantic embedding space having been trained using fused attended appearance features and attended symbol features of images having known descriptive messages, computing a similarity measure between the projected, fused features and fused attended appearance features and attended symbol features embedded in the semantic embedding space having at least one associated descriptive message and predicting a descriptive message for an image associated with the projected, fused features.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: December 28, 2021
    Assignee: SRI International
    Inventors: Ajay Divakaran, Karan Sikka, Karuna Ahuja, Anirban Roy
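The symbol-conditioned attention and fusion in the entry above might look like the following, where an attended symbol (for example, text overlaid on an image) serves as the attention query over region appearance features; all names and the concatenation fusion are assumptions.

```python
import torch

def attend_and_fuse(region_feats, symbol_feat):
    """region_feats: [R, D] appearance features of a proposed image region.
    symbol_feat: [D] embedding of the attended symbol."""
    scores = region_feats @ symbol_feat   # symbol acts as attention query
    att = torch.softmax(scores, dim=0)    # weights over sub-regions
    appearance = (att.unsqueeze(-1) * region_feats).sum(dim=0)
    # fuse attended appearance features with the symbol features
    return torch.cat([appearance, symbol_feat], dim=-1)
```

The fused vector is then projected into the trained semantic space and matched by similarity to embeddings carrying known descriptive messages.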
  • Publication number: 20210295082
    Abstract: A method, apparatus and system for zero shot object detection includes, in a semantic embedding space having embedded object class labels, training the space by embedding extracted features of bounding boxes and object class labels of labeled bounding boxes of known object classes into the space, determining regions in an image having unknown object classes on which to perform object detection as proposed bounding boxes, extracting features of the proposed bounding boxes, projecting the extracted features of the proposed bounding boxes into the space, computing a similarity measure between the projected features of the proposed bounding boxes and the embedded, extracted features of the bounding boxes of the known object classes in the space, and predicting an object class label for proposed bounding boxes by determining a nearest embedded object class label to the projected features of the proposed bounding boxes in the space based on the similarity measures.
    Type: Application
    Filed: June 2, 2021
    Publication date: September 23, 2021
    Inventors: Karan Sikka, Ajay Divakaran, Ankan Bansal
  • Publication number: 20210297498
    Abstract: A method, apparatus and system for determining user-content associations for determining and providing user-preferred content using multimodal embeddings include creating an embedding space for multimodal content by creating a first modality vector representation of the multimodal content having a first modality, creating a second modality vector representation of the multimodal content having a second modality, creating a user vector representation, as a third modality, for each user associated with at least a portion of the multimodal content, and embedding the first and the second modality vector representations and the user vector representations in the common embedding space using at least a mixture of loss functions for each modality pair of the first, the at least second and the third modalities that pushes co-occurring pairs of multimodal content closer together.
    Type: Application
    Filed: March 4, 2021
    Publication date: September 23, 2021
    Inventors: Ajay Divakaran, Karan Sikka, Arijit Ray, Xiao Lin, Yi Yao
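The "mixture of loss functions for each modality pair" above suggests summing a pairwise contrastive term over all three pairs (image-text, image-user, text-user). A sketch under that assumption, with a batch-wise margin loss that treats index-matched rows as co-occurring pairs:

```python
import torch
import torch.nn.functional as F

def pair_loss(a, b, margin=0.2):
    """Margin loss over a batch: co-occurring pairs (a[i], b[i]) should be
    closer than mismatched pairs (a[i], b[j])."""
    sims = F.normalize(a, dim=-1) @ F.normalize(b, dim=-1).T   # [B, B]
    pos = sims.diag().unsqueeze(1)                             # matched pairs
    eye = torch.eye(sims.size(0), dtype=torch.bool, device=sims.device)
    loss = F.relu(margin - pos + sims).masked_fill(eye, 0.0)
    return loss.mean()

def three_modality_loss(img, txt, usr):
    # mixture of losses over every modality pair
    return pair_loss(img, txt) + pair_loss(img, usr) + pair_loss(txt, usr)
```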
  • Patent number: 11055555
    Abstract: A method, apparatus and system for zero shot object detection includes, in a semantic embedding space having embedded object class labels, training the space by embedding extracted features of bounding boxes and object class labels of labeled bounding boxes of known object classes into the space, determining regions in an image having unknown object classes on which to perform object detection as proposed bounding boxes, extracting features of the proposed bounding boxes, projecting the extracted features of the proposed bounding boxes into the space, computing a similarity measure between the projected features of the proposed bounding boxes and the embedded, extracted features of the bounding boxes of the known object classes in the space, and predicting an object class label for proposed bounding boxes by determining a nearest embedded object class label to the projected features of the proposed bounding boxes in the space based on the similarity measures.
    Type: Grant
    Filed: April 12, 2019
    Date of Patent: July 6, 2021
    Assignee: SRI International
    Inventors: Karan Sikka, Ajay Divakaran, Ankan Bansal
  • Publication number: 20210056742
    Abstract: A method, apparatus and system for visual grounding of a caption in an image include projecting at least two parsed phrases of the caption into a trained semantic embedding space, projecting extracted region proposals of the image into the trained semantic embedding space, aligning the extracted region proposals and the at least two parsed phrases, aggregating the aligned region proposals and the at least two parsed phrases to determine a caption-conditioned image representation and projecting the caption-conditioned image representation and the caption into a semantic embedding space to align the caption-conditioned image representation and the caption. The method, apparatus and system can further include a parser for parsing the caption into the at least two parsed phrases and a region proposal module for extracting the region proposals from the image.
    Type: Application
    Filed: April 22, 2020
    Publication date: February 25, 2021
    Inventors: Karan Sikka, Ajay Divakaran, Samyak Datta
  • Publication number: 20200357143
    Abstract: A method, apparatus and system for visual localization includes extracting appearance features of an image, extracting semantic features of the image, fusing the extracted appearance features and semantic features, pooling and projecting the fused features into a semantic embedding space having been trained using fused appearance and semantic features of images having known locations, computing a similarity measure between the projected fused features and embedded, fused appearance and semantic features of images, and predicting a location of the image associated with the projected, fused features. An image can include at least one image from a plurality of modalities such as a Light Detection and Ranging image, a Radio Detection and Ranging image, or a 3D Computer Aided Design modeling image, and an image from a different sensor, such as an RGB image sensor, captured from a same geo-location, which is used to determine the semantic features of the multi-modal image.
    Type: Application
    Filed: October 29, 2019
    Publication date: November 12, 2020
    Inventors: Han-Pang Chiu, Zachary Seymour, Karan Sikka, Supun Samarasekera, Rakesh Kumar, Niluthpol Mithun
  • Patent number: 10824916
    Abstract: Systems and methods for improving the accuracy of a computer system for object identification/classification through the use of weakly supervised learning are provided herein. In some embodiments, the method includes (a) receiving at least one set of curated data, wherein the curated data includes labeled images, (b) using the curated data to train a deep network model for identifying objects within images, wherein the trained deep network model has a first accuracy level for identifying objects, (c) receiving a first target accuracy level for object identification of the deep network model, (d) determining, automatically via the computer system, an amount of weakly labeled data needed to train the deep network model to achieve the first target accuracy level, and (e) augmenting the deep network model using weakly supervised learning and the weakly labeled data to achieve the first target accuracy level for object identification by the deep network model.
    Type: Grant
    Filed: September 10, 2018
    Date of Patent: November 3, 2020
    Assignee: SRI International
    Inventors: Karan Sikka, Ajay Divakaran, Parneet Kaur
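The interesting step above is estimating how much weakly labeled data is needed to reach a target accuracy. One simple, commonly used assumption (not stated in the patent) is that accuracy grows roughly logarithmically with training-set size; pilot runs then let you fit and invert that curve:

```python
import numpy as np

def weak_samples_needed(sample_counts, accuracies, target_acc):
    """Fit accuracy ~ a + b * log(n) to pilot runs, then solve for the n
    that reaches target_acc. The log-growth model is an assumption."""
    b, a = np.polyfit(np.log(sample_counts), accuracies, 1)
    return int(np.ceil(np.exp((target_acc - a) / b)))

# e.g. pilot runs with 1k/5k/20k weak labels reaching 70/78/84% accuracy
n = weak_samples_needed([1_000, 5_000, 20_000], [0.70, 0.78, 0.84], 0.90)
```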
  • Publication number: 20200193245
    Abstract: A method, apparatus and system for understanding visual content includes determining at least one region proposal for an image, attending at least one symbol of the proposed image region, attending a portion of the proposed image region using information regarding the attended symbol, extracting appearance features of the attended portion of the proposed image region, fusing the appearance features of the attended image region and features of the attended symbol, projecting the fused features into a semantic embedding space having been trained using fused attended appearance features and attended symbol features of images having known descriptive messages, computing a similarity measure between the projected, fused features and fused attended appearance features and attended symbol features embedded in the semantic embedding space having at least one associated descriptive message and predicting a descriptive message for an image associated with the projected, fused features.
    Type: Application
    Filed: December 17, 2019
    Publication date: June 18, 2020
    Inventors: Ajay Divakaran, Karan Sikka, Karuna Ahuja, Anirban Roy
  • Publication number: 20200134398
    Abstract: Inferring multimodal content intent in a common geometric space in order to improve recognition of influential impacts of content includes mapping the multimodal content in a common geometric space by embedding a multimodal feature vector representing a first modality of the multimodal content and a second modality of the multimodal content and inferring intent of the multimodal content mapped into the common geometric space such that connections between multimodal content result in an improvement in recognition of the influential impact of the multimodal content.
    Type: Application
    Filed: April 12, 2019
    Publication date: April 30, 2020
    Inventors: Julia Kruk, Jonah M. Lubin, Karan Sikka, Xiao Lin, Ajay Divakaran
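A minimal sketch of embedding a joint image-text feature vector into a common geometric space and classifying intent from it, per the entry above; the concatenation fusion, dimensions, and intent head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class IntentModel(nn.Module):
    """Maps an (image, text) feature pair to intent logits via a
    shared geometric embedding."""
    def __init__(self, img_dim=512, txt_dim=768, emb_dim=256, n_intents=8):
        super().__init__()
        self.proj = nn.Linear(img_dim + txt_dim, emb_dim)
        self.head = nn.Linear(emb_dim, n_intents)

    def forward(self, img_feats, txt_feats):
        multimodal = torch.cat([img_feats, txt_feats], dim=-1)
        z = torch.tanh(self.proj(multimodal))  # point in the common space
        return self.head(z)                    # intent logits
```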
  • Publication number: 20200082224
    Abstract: Systems and methods for improving the accuracy of a computer system for object identification/classification through the use of weakly supervised learning are provided herein. In some embodiments, the method includes (a) receiving at least one set of curated data, wherein the curated data includes labeled images, (b) using the curated data to train a deep network model for identifying objects within images, wherein the trained deep network model has a first accuracy level for identifying objects, (c) receiving a first target accuracy level for object identification of the deep network model, (d) determining, automatically via the computer system, an amount of weakly labeled data needed to train the deep network model to achieve the first target accuracy level, and (e) augmenting the deep network model using weakly supervised learning and the weakly labeled data to achieve the first target accuracy level for object identification by the deep network model.
    Type: Application
    Filed: September 10, 2018
    Publication date: March 12, 2020
    Inventors: Karan Sikka, Ajay Divakaran, Parneet Kaur
  • Publication number: 20190325243
    Abstract: A method, apparatus and system for zero shot object detection includes, in a semantic embedding space having embedded object class labels, training the space by embedding extracted features of bounding boxes and object class labels of labeled bounding boxes of known object classes into the space, determining regions in an image having unknown object classes on which to perform object detection as proposed bounding boxes, extracting features of the proposed bounding boxes, projecting the extracted features of the proposed bounding boxes into the space, computing a similarity measure between the projected features of the proposed bounding boxes and the embedded, extracted features of the bounding boxes of the known object classes in the space, and predicting an object class label for proposed bounding boxes by determining a nearest embedded object class label to the projected features of the proposed bounding boxes in the space based on the similarity measures.
    Type: Application
    Filed: April 12, 2019
    Publication date: October 24, 2019
    Inventors: Karan Sikka, Ajay Divakaran, Ankan Bansal
  • Publication number: 20190325342
    Abstract: Embedding multimodal content in a common geometric space includes, for each of a plurality of content of the multimodal content, creating a respective, first modality feature vector representative of content of the multimodal content having a first modality using a first machine learning model; for each of a plurality of content of the multimodal content, creating a respective, second modality feature vector representative of content of the multimodal content having a second modality using a second machine learning model; and semantically embedding the respective, first modality feature vectors and the respective, second modality feature vectors in a common geometric space that provides logarithm-like warping of distance space in the geometric space to capture hierarchical relationships between seemingly disparate, embedded modality feature vectors of content in the geometric space; wherein embedded modality feature vectors that are related, across modalities, are closer together in the geometric space than unrelated embedded modality feature vectors.
    Type: Application
    Filed: April 12, 2019
    Publication date: October 24, 2019
    Inventors: Karan Sikka, Ajay Divakaran, Julia Kruk
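The "logarithm-like warping of distance space" described above is reminiscent of hyperbolic embeddings; the Poincaré-ball distance below is one concrete instance of such a warped metric, offered as an analogy rather than as the patent's formula. Distances grow sharply near the ball's boundary, which is what lets a low-dimensional space encode tree-like hierarchy.

```python
import torch

def poincare_distance(u, v, eps=1e-5):
    """Geodesic distance in the Poincare ball (all points have norm < 1).
    The arcosh supplies the logarithm-like warping that favors hierarchies."""
    uu = (u * u).sum(dim=-1).clamp(max=1 - eps)
    vv = (v * v).sum(dim=-1).clamp(max=1 - eps)
    duv = ((u - v) ** 2).sum(dim=-1)
    x = 1 + 2 * duv / ((1 - uu) * (1 - vv))
    return torch.acosh(x.clamp(min=1 + eps))
```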