Patents by Inventor Ajay Divakaran

Ajay Divakaran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

User targeted content generation using multimodal embeddings

Patent number: 12367420

Abstract: A method, apparatus and system for determining user-content associations for determining and providing user-preferred content using multimodal embeddings include creating an embedding space for multimodal content by creating a first modality vector representation of the multimodal content having a first modality, creating a second modality vector representation of the multimodal content having a second modality, creating a user vector representation, as a third modality, for each user associated with at least a portion of the multimodal content, and embedding the first and the second modality vector representations and the user vector representations in the common embedding space using at least a mixture of loss functions for each modality pair of the first, the at least second and the third modalities that pushes closer co-occurring pairs of multimodal content.

Type: Grant

Filed: March 4, 2021

Date of Patent: July 22, 2025

Assignee: SRI International

Inventors: Ajay Divakaran, Karan Sikka, Arijit Ray, Xiao Lin, Yi Yao
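
As an illustration of the idea in the abstract above (pulling co-occurring pairs closer in a common embedding space), here is a minimal sketch of one pairwise margin ranking loss. It is not the patented method: the abstract speaks of a mixture of loss functions without naming them, so the cosine similarity, the margin value, and the function names `cosine` and `margin_ranking_loss` are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def margin_ranking_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that is zero once the co-occurring (positive) item is more
    similar to the anchor than the non-co-occurring (negative) item by at
    least `margin`. The margin of 0.2 is a hypothetical choice."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))

# Toy example: an image vector as anchor, a co-occurring text vector,
# and an unrelated user vector.
image_vec = [1.0, 0.0]
text_vec = [1.0, 0.0]
user_vec = [0.0, 1.0]
loss_good = margin_ranking_loss(image_vec, text_vec, user_vec)  # pair already close
loss_bad = margin_ranking_loss(image_vec, user_vec, text_vec)   # pair far apart
```

In a three-modality setting such as the one described, a loss of this kind would be evaluated for each modality pair and the results combined; how they are weighted is not stated in the abstract.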
METHOD AND SYSTEM USING DIVERSE CAPTIONS FOR IMPROVING LONG VIDEO RETRIEVAL

Publication number: 20250231986

Abstract: Embodiments of the present principles generally relate to methods, apparatuses, and systems for improved long video retrieval by training video language models (VLM) using diverse captions. In some embodiments, a method for improved long video retrieval may include generating a plurality of captions of varying dimensions using one or more Large Language Models (LLM); associating the plurality of captions of varying dimensions to one or more videos in one or more video data sets to generate one or more enhanced video data sets; generating an enhanced VLM by finetuning a pretrained video language model using the generated one or more enhanced video data sets; and retrieving one or more videos with a query using the enhanced VLM having a R@K rank.

Type: Application

Filed: December 13, 2024

Publication date: July 17, 2025

Inventors: Ajay DIVAKARAN, Matthew Aaron GWILLIAM, Michael COGSWELL, Karan SIKKA, Meng YE
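
The abstract above mentions an R@K rank. Assuming this refers to Recall@K as commonly used in retrieval evaluation (the application may define it differently), the following sketch shows how the metric is typically computed when each query has exactly one correct video. The function names and the toy data are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_id, k):
    """Return 1.0 if the relevant video appears in the top-k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0

def mean_recall_at_k(rankings, ground_truth, k):
    """Average Recall@K over all queries.

    rankings:     dict mapping query id -> list of video ids, best match first
    ground_truth: dict mapping query id -> the single correct video id
    """
    hits = [recall_at_k(rankings[q], ground_truth[q], k) for q in ground_truth]
    return sum(hits) / len(hits)

# Toy example with two text queries and three candidate videos.
rankings = {"q1": ["v3", "v1", "v2"], "q2": ["v2", "v3", "v1"]}
ground_truth = {"q1": "v1", "q2": "v1"}
r_at_2 = mean_recall_at_k(rankings, ground_truth, 2)
```

Here `q1` finds its video at rank 2 and `q2` at rank 3, so only one of the two queries counts as a hit at K = 2.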
Automated collaboration skills assessment

Patent number: 12340335

Abstract: In some examples, a computer-implemented collaboration assessment model identifies actions of each of two or more individuals depicted in video data; identifies, based at least on the identified actions of each of the two or more individuals depicted in the video data, first behaviors at a first collaboration assessment level; identifies, based at least on the identified actions of each of the two or more individuals depicted in the video data, second behaviors at a second collaboration assessment level different from the first collaboration assessment level; and generates and outputs, based at least on the first behaviors at the first collaboration assessment level and the second behaviors at the second collaboration assessment level, an indication of at least one of an assessment of a collaboration effort of the two or more individuals or respective assessments of individual contributions of the two or more individuals to the collaboration effort.

Type: Grant

Filed: June 15, 2021

Date of Patent: June 24, 2025

Assignee: SRI International

Inventors: Svati Dhamija, Amir Tamrakar, Nonye M. Alozie, Elizabeth McBride, Ajay Divakaran, Anirudh Som, Sujeong Kim, Bladimir Lopez-Prado
INSTRUCTION-GUIDED VISUAL EMBEDDINGS AND FEEDBACK-BASED LEARNING IN LARGE VISION-LANGUAGE MODELS

Publication number: 20250131027

Abstract: In an example, a method for fine-tuning a Large Visual Language Model (LVLM) includes providing visual queries, each of the visual queries comprises at least an image and a textual query related to the image; processing, by the LVLM, the visual queries to extract visual embeddings from the visual queries, wherein the LVLM comprises a Visual Language Model (VLM), a first Large Language Model (LLM), and a linear projection layer interconnecting the VLM and the LLM; for visual queries: i) generating, by the LVLM, a response to the corresponding visual query based on the corresponding visual embedding; ii) evaluating, by a second LLM, the generated response to verify that the generated response satisfies predefined criteria; and iii) providing, by the second LLM, a feedback to the LVLM, in response to the evaluating the generated response; and fine-tuning the LVLM using aggregated feedback provided by the second LLM for the visual queries.

Type: Application

Filed: October 23, 2024

Publication date: April 24, 2025

Inventors: Yangyi Chen, Karan Sikka, Michael A. Cogswell, Ajay Divakaran
LARGE LANGUAGE MODEL AUGMENTATION WITH KNOWLEDGE LANGUAGE MODELS

Publication number: 20250131212

Abstract: In an example, a method for generating responses by a Machine Learning (ML) system includes processing, by a first language model, a natural language instruction to generate an instruction representation based on a meaning of the natural language instruction; translating, by a translation module comprising an interface between the first language model and a second language model, the instruction representation into data indicating an intent of the natural language instruction, wherein the second language model is trained with domain specific knowledge; providing, by the translation module, the natural language instruction and the data indicating the intent of the natural language instruction to the second language model; and generating, by the second language model, a response based on the natural language instruction and the data indicating the intent of the natural language instruction.

Type: Application

Filed: October 18, 2024

Publication date: April 24, 2025

Inventors: Pengfei Yu, Yi Yao, Karan Sikka, Michael A. Cogswell, Ajay Divakaran
VPA with integrated object recognition and facial expression recognition

Patent number: 12282606

Abstract: Methods, computing devices, and computer-program products are provided for implementing a virtual personal assistant. In various implementations, a virtual personal assistant can be configured to receive sensory input, including at least two different types of information. The virtual personal assistant can further be configured to determine semantic information from the sensory input, and to identify a context-specific framework. The virtual personal assistant can further be configured to determine a current intent. Determining the current intent can include using the semantic information and the context-specific framework. The virtual personal assistant can further be configured to determine a current input state. Determining the current input state can include using the semantic information and one or more behavioral models. The behavioral models can include one or more interpretations of previously-provided semantic information.

Type: Grant

Filed: December 1, 2020

Date of Patent: April 22, 2025

Assignee: SRI International

Inventors: Ajay Divakaran, Amir Tamrakar, Girish Acharya, William Mark, Greg Ho, Jihua Huang, David Salter, Edgar Kalns, Michael Wessel, Min Yin, James Carpenter, Brent Mombourquette, Kenneth Nitz, Elizabeth Shriberg, Eric Law, Michael Frandsen, Hyong-Gyun Kim, Cory Albright, Andreas Tsiartas
MACHINE LEARNING MODEL PROMPT DEMONSTRATION SELECTION

Publication number: 20250124352

Abstract: Techniques are described for a machine learning system configured to generate respective sample embeddings for a plurality of sample statements. The machine learning system may further be configured to generate a statement embedding for a statement. The machine learning system may further be configured to determine, based on the sample embeddings and the statement embedding, respective similarity scores for the sample embeddings. The machine learning system may further be configured to select, based on the respective similarity scores for the sample embeddings, one or more sample statements from the plurality of sample statements. The machine learning system may further be configured to generate a prompt including the one or more sample statements, the statement, and at least one of respective ground-truth information or respective paraphrases for the selected one or more sample statements. The machine learning system may further be configured to provide the prompt to a machine learning model.

Type: Application

Filed: October 15, 2024

Publication date: April 17, 2025

Inventors: Anirudh Som, Karan Sikka, Ajay Divakaran, Helen Gent, Andreas Kathol, Dimitra Vergyri
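
The selection steps in the abstract above (embed the samples, embed the statement, score by similarity, pick samples, assemble a prompt) can be sketched roughly as follows. This is an illustrative reading, not the application's implementation: the embeddings are assumed to be precomputed lists of floats, cosine similarity is an assumed scoring function, and the names `select_demonstrations` and `build_prompt` as well as the prompt layout are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed scoring function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def select_demonstrations(sample_embeddings, statement_embedding, k):
    """Return the indices of the k samples most similar to the statement."""
    scores = [cosine(e, statement_embedding) for e in sample_embeddings]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def build_prompt(samples, labels, chosen, statement):
    """Assemble a few-shot prompt: the chosen samples with their ground-truth
    labels, followed by the new statement with an empty label slot."""
    blocks = [f"Statement: {samples[i]}\nLabel: {labels[i]}" for i in chosen]
    blocks.append(f"Statement: {statement}\nLabel:")
    return "\n\n".join(blocks)

# Toy data: three sample statements with made-up 2-D embeddings.
samples = ["The sky is green.", "Water boils at 100 C.", "The sky is purple."]
labels = ["false", "true", "false"]
sample_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
chosen = select_demonstrations(sample_embeddings, [1.0, 0.0], k=2)
prompt = build_prompt(samples, labels, chosen, "The sky is orange.")
```

The resulting `prompt` string would then be passed to whichever machine learning model is being queried.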
CAUSAL ANALYSIS WITH TIME SERIES DATA

Publication number: 20250110989

Abstract: In general, various aspects of the techniques are directed to causal analysis using large scale time series data. A computing system may convert large scale time series data to first time period records and second time period records according to a multi-scale time resolution. The computing system may implement a hierarchical machine learning model to generate embeddings that capture temporal characteristics of features of the large scale time series data. The computing system may generate a graph data structure indicating cause and effect correlations between features of the large scale time series data based on temporal dynamics captured in the first and second time period records and/or the embeddings.

Type: Application

Filed: September 24, 2024

Publication date: April 3, 2025

Inventors: Ajay Divakaran, Yi Yao, Julia Kruk, Jesse Hostetler, Jihua Huang
Analysis and design of dynamical system controllers using neural differential equations

Patent number: 12236330

Abstract: In general, the disclosure describes techniques for characterizing a dynamical system and a neural ordinary differential equation (NODE)-based controller for the dynamical system. An example analysis system is configured to: obtain a set of parameters of a NODE model used to implement the NODE-based controller, the NODE model trained to control the dynamical system; determine, based on the set of parameters, a system property of a combined system comprising the dynamical system and the NODE-based controller, the system property comprising one or more of an accuracy, safety, reliability, reachability, or controllability of the combined system; and output the system property to modify one or more of the dynamical system or the NODE-based controller to meet a required specification for the combined system.

Type: Grant

Filed: May 26, 2021

Date of Patent: February 25, 2025

Assignee: SRI International

Inventors: Ajay Divakaran, Anirban Roy, Susmit Jha
METHOD, APPARATUS AND SYSTEM FOR CONSISTENCY ENHANCED LARGE LANGUAGE MODELS

Publication number: 20250013873

Abstract: A method, apparatus, and system for training a language model for enhanced consistency include selecting at least a portion of the content data of the language model, generating reasoning statements in the form of natural language relevant to the selected portion of the content data, and training the language model using the generated reasoning statements such that a logical inference of the trained language model in response to a prompt directed to the selected portion of the content data is increased as compared with the logical inference of the language model in response to the same or similar prompt before the training of the language model to enhance the consistency of the language model with respect to the selected portion of the content data. The trained language model can be used to generate a logical inference having enhanced consistency for at least a portion of content data.

Type: Application

Filed: July 8, 2024

Publication date: January 9, 2025

Inventors: Ajay DIVAKARAN, Karan SIKKA, Michael COGSWELL, Yunye GONG, Yangyi CHEN
SYSTEM AND METHOD TO REVIEW ONLINE VIOLENCE AND EDUCATION

Publication number: 20240414394

Abstract: A computing system is configured to obtain a video that includes text elements and visual elements. The computing system is further configured to generate a plurality of text tokens representative of audio spoken in the video and a plurality of frame tokens representative of one or more frames of the video. The computing system is further configured to generate a set of features that includes a text feature, a frame feature, and a multi-modal feature, wherein the multi-modal feature is representative of multi-modal elements of the video, and wherein generating the set of features is based on the plurality of text tokens and the plurality of frame tokens. The computing system is further configured to associate the set of features with one or more labels to generate a multi-label classification of the video. The computing system is further configured to output an indication of the multi-label classification of the video.

Type: Application

Filed: June 7, 2024

Publication date: December 12, 2024

Inventors: Claire Christensen, Anirban Roy, Ajay Divakaran, Todd Grindal
CONFIDENCE CALIBRATION FOR SYSTEMS WITH CASCADED PREDICTIVE MODELS

Publication number: 20240403728

Abstract: In general, techniques are described that address the limitations of existing conformal prediction methods for cascaded models. In an example, a method includes receiving a first validation data set for validating performance of an upstream model of the two or more cascaded models and receiving a second validation data set for validating performance of a downstream model of the two or more cascaded models wherein the second validation data set is different than the first validation set; estimating system-level errors caused by predictions of the upstream model based on the first validation data set; estimating system-level errors caused by predictions of the downstream model based on the second validation data set; and generating a prediction confidence interval that indicates a confidence for the system based on the system-level errors caused by predictions of the upstream model and based on the system-level errors caused by predictions of the downstream model.

Type: Application

Filed: March 22, 2024

Publication date: December 5, 2024

Inventors: Yunye Gong, Yi Yao, Xiao Lin, Ajay Divakaran
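
For background on the abstract above, here is a minimal sketch of standard split conformal prediction for a single regression model, which is the kind of existing method the application says it extends to cascaded models. It does not implement the cascaded, two-validation-set procedure that the abstract describes. The function names and the toy residuals are hypothetical.

```python
import math

def conformal_quantile(residuals, alpha):
    """Finite-sample conformal quantile of absolute validation residuals.

    Uses rank ceil((n + 1) * (1 - alpha)); returns infinity when the
    validation set is too small to support the requested coverage.
    """
    n = len(residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(residuals)[rank - 1]

def prediction_interval(point_prediction, residuals, alpha):
    """Symmetric interval intended to cover the true value with
    probability at least 1 - alpha, assuming exchangeable data."""
    q = conformal_quantile(residuals, alpha)
    return (point_prediction - q, point_prediction + q)

# Absolute errors |y - y_hat| measured on a held-out validation set (toy values).
validation_residuals = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
interval = prediction_interval(2.0, validation_residuals, alpha=0.5)
```

With only nine validation points, requesting 95% coverage yields an infinite quantile, which illustrates why the size of each validation set matters.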
ADAPTING A LANGUAGE MODEL FOR MULTIMODAL MULTI-TASK LEARNING

Publication number: 20240338599

Abstract: A method, apparatus and system for adapting a language model for understanding domain-specific multimodal content include acquiring domain-specific multimodal content for at least one content domain and applying question/answer pairs to the acquired, domain-specific multimodal content for the at least one content domain to train the language model to learn tasks associated with the domain-specific multimodal content for the at least one domain. As such, the trained language model can be implemented to answer questions directed to the domain-specific multimodal content for the at least one domain.

Type: Application

Filed: March 28, 2024

Publication date: October 10, 2024

Inventors: Karan SIKKA, Michael COGSWELL, Pritish SAHU, Meng YE, Abrar RAHMAN, Rohit SRIDHAR, Ajay DIVAKARAN
METHOD AND SYSTEM FOR DETERMINING A MEASURE OF CONCEPTUAL CONSISTENCY IN LARGE LANGUAGE MODELS

Publication number: 20240242040

Abstract: Embodiments of the present principles generally relate to methods, apparatuses and systems for determining a measure of conceptual consistency in large language models for understanding of relevant concepts. In some embodiments, a method for measuring conceptual consistency may include prompting an LLM in order to extract answers to background queries and anchor tasks. The method also includes comparing background knowledge facts for a given anchor task associated with known answers with facts extracted from the LLM to determine an LLM performance. The method also includes determining a background knowledge score and an anchor task score based on the LLM's performance. The method also includes determining a conceptual consistency score for the LLM by predicting the anchor task score from the background knowledge score. The method also includes outputting an indication of the conceptual consistency score.

Type: Application

Filed: December 15, 2023

Publication date: July 18, 2024

Inventors: Michael COGSWELL, Ajay DIVAKARAN, Yunye GONG, Pritish SAHU
SPATIAL-TEMPORAL ANOMALY AND EVENT DETECTION USING NIGHT VISION SENSORS

Publication number: 20240212350

Abstract: In general, the disclosure describes techniques for joint spatiotemporal Artificial Intelligence (AI) models that can encompass multiple space and time resolutions through self-supervised learning. In an example, a method includes for each of a plurality of multimodal data, generating, by a computing system, using a first machine learning model, a respective modality feature vector representative of content of the multimodal data, wherein each of the generated modality feature vectors has a different modality; processing, by the computing system, each of generated modality feature vectors with a second machine learning model comprising an encoder model to generate event data comprising a plurality of events and/or activities of interest; and analyzing, by the computing system, the event data to generate anomaly data indicative of detected anomalies in the multimodal data.

Type: Application

Filed: June 7, 2023

Publication date: June 27, 2024

Inventors: Subhodev Das, Ajay Divakaran, Ali Chaudhry, Julia Kruk, Bo Dong
SYSTEM DESIGN FOR AN INTEGRATED LIFELONG MACHINE LEARNING AGENT

Publication number: 20240202538

Abstract: A method, apparatus and system for lifelong reinforcement learning include receiving features of a task, communicating the task features to a learning system, where the learning system learns or performs a task related to the features based on learning or performing similar previous tasks, determining from the features if the task has changed and if so, communicating the features of the changed task to the learning system, where the learning system learns or performs the changed task based on learning or performing similar previous tasks, automatically annotating feature characteristics of received features including differences between the features of the original task and the features of the changed task to enable the learning system to more efficiently learn or perform at least the changed task, and if the task has not changed, processing the task features of a current task by the learning system to learn or perform the current task.

Type: Application

Filed: December 11, 2023

Publication date: June 20, 2024

Inventors: Aswin NADAMUNI RAGHAVAN, Indranil SUR, Zachary DANIELS, Jesse HOSTETLER, Abrar RAHMAN, Ajay DIVAKARAN, Michael R. PIACENTINO
System and method for content comprehension and response

Patent number: 11934793

Abstract: A method, apparatus and system for training an embedding space for content comprehension and response include, for each layer of a hierarchical taxonomy having at least two layers including respective words resulting in layers of varying complexity, determining a set of words associated with a layer of the hierarchical taxonomy, determining a question answer pair based on a question generated using at least one word of the set of words and at least one content domain, determining a vector representation for the generated question and for content related to the at least one content domain of the question answer pair, and embedding the question vector representation and the content vector representations into a common embedding space where vector representations that are related are closer in the embedding space than unrelated embedded vector representations. Requests for content can then be fulfilled using the trained, common embedding space.

Type: Grant

Filed: November 1, 2021

Date of Patent: March 19, 2024

Assignee: SRI International

Inventors: Ajay Divakaran, Karan Sikka, Yi Yao, Yunye Gong, Stephanie Nunn, Pritish Sahu, Michael A. Cogswell, Jesse Hostetler, Sara Rutherford-Quach
HARDENING A DEEP NEURAL NETWORK AGAINST ADVERSARIAL ATTACKS USING A STOCHASTIC ENSEMBLE

Publication number: 20240062042

Abstract: In general, the disclosure describes techniques for implementing an MI-based attack detector. In an example, a method includes training a neural network using training data, applying stochastic quantization to one or more layers of the neural network, generating, using the trained neural network, an ensemble of neural networks having a plurality of quantized members, wherein at least one of weights or activations of each of the plurality of quantized members have different bit precision, and combining predictions of the plurality of quantized members of the ensemble to detect one or more adversarial attacks and/or determine performance of the ensemble of neural networks.

Type: Application

Filed: August 17, 2023

Publication date: February 22, 2024

Inventors: Aswin Nadamuni Raghavan, Saurabh Farkya, Jesse Albert Hostetler, Avraham Joshua Ziskind, Michael Piacentino, Ajay Divakaran, Zhengyu Chen
MULTILINGUAL CONTENT MODERATION USING MULTIPLE CRITERIA

Publication number: 20240054294

Abstract: A method, apparatus and system for moderating multilingual content data, for example, presented during a communication session include receiving or pulling content data that can include multilingual content, classifying, using a first machine learning system, the content data by projecting the content data into a trained embedding space to determine at least one English-language classification for the content data, and determining, using a second machine learning system, if the content data violates at least one predetermined moderation rule, wherein the second machine learning system is trained to determine from English-language classifications determined by the first machine learning system if the content data violates moderation rules. In some embodiments, the method, apparatus and system can further include prohibiting a presentation of the content data related to the at least one English-language classification determined to violate the at least one predetermined moderation rule.

Type: Application

Filed: August 14, 2023

Publication date: February 15, 2024

Inventors: Karan SIKKA, Meng YE, Ajay DIVAKARAN
ERROR-BASED EXPLANATIONS FOR ARTIFICIAL INTELLIGENCE BEHAVIOR

Publication number: 20240005654

Abstract: A computing system comprising a memory configured to store an artificial intelligence (AI) model and an image, and a computation engine executing one or more processors may be configured to perform the techniques for error-based explanations for AI behavior. The computation engine may execute the AI model to analyze the image to output a result. The AI model may, when analyzing the image to output the result, process, based on data indicative of the result, the image to assign an error score to each image feature extracted from the image, and obtain, based on the error scores, an error map. The AI model may next update, based on the error map and to obtain a first updated image, the image to visually indicate the error score assigned to each of the image features, and output one or more of the error scores, the error map, and the first updated image.

Type: Application

Filed: March 24, 2022

Publication date: January 4, 2024

Inventors: Arijit Ray, Michael A. Cogswell, Ajay Divakaran, Yi Yao, Giedrius T. Burachas, Kamran Alipour
