Patents by Inventor Devansh Arpit

Devansh Arpit has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250139411
    Abstract: Embodiments described herein provide a large language model (LLM)-based AI agent that adopts Monte-Carlo Tree Search (MCTS) to execute a task. The LLM is prompted with a task description, and it responds with its first attempted list of actions. Based on the success or failure of the first attempt, the LLM is prompted with an updated prompt that includes feedback from the first attempt based on a determined reward. The prompt may include a relative “score” for each action taken at each step. A numeric score may be mapped to a set of pre-defined text labels, such as “high” or “low” value, putting the score in a form better suited to an LLM prompt. In this way, the LLM is iteratively given prompts updated with the scores of the actions taken in each previous iteration, so that it traverses different paths of the tree in each iteration.
    Type: Application
    Filed: October 31, 2023
    Publication date: May 1, 2025
    Inventors: Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
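Not part of the filing — a toy sketch of the score-to-label mapping and prompt update the abstract describes; the thresholds, label set, and prompt wording are illustrative assumptions:

```python
# Hypothetical mapping from numeric action scores to coarse text labels,
# which are easier for an LLM to interpret inside a prompt.
def score_to_label(score: float) -> str:
    if score >= 0.66:
        return "high"
    if score >= 0.33:
        return "medium"
    return "low"

def build_feedback_prompt(task: str, history: list) -> str:
    """Build an updated prompt embedding labelled feedback for each
    (step, action, score) taken in previous iterations."""
    lines = [f"Task: {task}", "Feedback on previous attempts:"]
    for step, action, score in history:
        lines.append(f"  step {step}: '{action}' -> {score_to_label(score)} value")
    lines.append("Propose a new list of actions.")
    return "\n".join(lines)

history = [(1, "open drawer", 0.9), (2, "pick up key", 0.2)]
prompt = build_feedback_prompt("find the key", history)
```

Feeding such labelled feedback back into the prompt is what lets the agent explore a different branch of the search tree on each iteration.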
  • Publication number: 20250131246
    Abstract: Embodiments provide an attention mechanism that computes attention weights for an input sequence by employing a set of multi-head learnable vectors (referred to as “binder vectors”) to attend to the input sequence.
    Type: Application
    Filed: October 23, 2023
    Publication date: April 24, 2025
    Inventors: Devansh Arpit, Huan Wang, Caiming Xiong
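A minimal single-head NumPy sketch of the mechanism named in the abstract: a small set of learnable vectors (called “binder vectors” in the filing) attends over the input sequence. The scaling and shapes are standard attention conventions, not details from the patent:

```python
import numpy as np

def binder_attention(x, binders):
    """x: (seq_len, d) input sequence; binders: (n_binders, d) learnable
    vectors. Each binder computes softmax attention weights over the
    sequence and returns a weighted sum of its elements."""
    scores = binders @ x.T / np.sqrt(x.shape[1])        # (n_binders, seq_len)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over sequence
    return weights @ x                                  # (n_binders, d)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))         # sequence of 5 tokens, dimension 8
binders = rng.normal(size=(3, 8))   # 3 binder vectors
out = binder_attention(x, binders)  # (3, 8)
```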
  • Publication number: 20250124233
    Abstract: Systems and methods for editing a large language model are provided. The large language model generates a sequence of tokens, a first probability of a pre-edit output based on the sequence of tokens, and a second probability of a target output based on the sequence of tokens. A loss function is provided based on the first probability and the second probability. A plurality of gradients of the large language model with respect to the loss function is computed. An edit location of the large language model is determined based on the plurality of gradients. The large language model is edited by editing weights at the edit location of the large language model, such that the updated large language model generates the target output for an input including the sequence of tokens.
    Type: Application
    Filed: January 31, 2024
    Publication date: April 17, 2025
    Inventors: Itai Izhak Feigenbaum, Devansh Arpit, Shelby Heinecke, Juan Carlos Niebles Duque, Weiran Yao, Huan Wang, Caiming Xiong, Silvio Savarese
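Not the patented procedure — a toy illustration of gradient-guided edit localization on a two-layer model: the loss prefers the target token over the pre-edit token, and the layer with the largest gradient norm is picked as the edit location. The model, loss form, and finite-difference gradients are all stand-ins:

```python
import numpy as np

def forward(params, x):
    h = np.tanh(params["layer1"] @ x)
    logits = params["layer2"] @ h
    p = np.exp(logits - logits.max())
    return p / p.sum()

def edit_loss(params, x, pre_edit_tok, target_tok):
    # push probability mass from the pre-edit output to the target output
    p = forward(params, x)
    return np.log(p[pre_edit_tok] + 1e-9) - np.log(p[target_tok] + 1e-9)

def grad_norms(params, x, a, b, eps=1e-5):
    """Per-layer gradient norms of the edit loss via finite differences."""
    norms = {}
    for name, w in params.items():
        g = np.zeros_like(w)
        for idx in np.ndindex(w.shape):
            w[idx] += eps
            up = edit_loss(params, x, a, b)
            w[idx] -= 2 * eps
            down = edit_loss(params, x, a, b)
            w[idx] += eps
            g[idx] = (up - down) / (2 * eps)
        norms[name] = np.linalg.norm(g)
    return norms

rng = np.random.default_rng(1)
params = {"layer1": rng.normal(size=(4, 3)), "layer2": rng.normal(size=(5, 4))}
x = rng.normal(size=3)
norms = grad_norms(params, x, a=0, b=2)
edit_location = max(norms, key=norms.get)   # layer whose weights get edited
```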
  • Publication number: 20250053793
    Abstract: Embodiments described herein provide a method of predicting an action by a plurality of language model augmented agents (LAAs). In at least one embodiment, a controller receives a task instruction to be performed using an environment. The controller receives an observation of a first state from the environment. The controller selects an LAA from the plurality of LAAs based on the task instruction and the observation. The controller obtains an output from the selected LAA generated using an input combining the task instruction, the observation, and an LAA-specific prompt template. The controller determines the action based on the output. The controller causes the action to be performed on the environment, thereby causing the first state of the environment to change to a second state.
    Type: Application
    Filed: October 25, 2023
    Publication date: February 13, 2025
    Inventors: Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles Duque, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
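A minimal stand-in for the controller loop the abstract walks through: select an agent, fill its agent-specific prompt template with the task and observation, and turn the model output into an action. The keyword-based selection rule and the templates are illustrative only; the patent leaves the selection policy to the controller:

```python
# Agent-specific prompt templates (hypothetical names and wording).
LAA_PROMPTS = {
    "web": "You browse pages. Task: {task}\nObservation: {obs}\nAction:",
    "math": "You solve equations. Task: {task}\nObservation: {obs}\nAction:",
}

def select_laa(task: str, obs: str) -> str:
    # Toy selection rule: route tasks containing digits to the math agent.
    return "math" if any(c.isdigit() for c in task) else "web"

def controller_step(task: str, obs: str, call_llm) -> str:
    laa = select_laa(task, obs)
    prompt = LAA_PROMPTS[laa].format(task=task, obs=obs)
    return call_llm(prompt)   # the output is parsed into an action

# Stub LLM for illustration.
action = controller_step("compute 2+2", "terminal open", lambda p: "type 4")
```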
  • Publication number: 20250045567
    Abstract: Embodiments described herein provide for optimizing a language model (LM) agent. In at least one embodiment, an LM agent comprises an “actor” LM and a “retrospective” LM, which provides reflections on attempts by the actor LM. The reflections are used to update subsequent prompts to the actor LM. Optimizing the LM agent comprises fine-tuning parameters of the retrospective LM while keeping parameters of the actor LM frozen. A gradient may be determined by a change in reward from the environment based on actions taken by the actor LM with and without a reflection of the retrospective LM. Using this gradient, parameters of the retrospective LM may be updated via backpropagation.
    Type: Application
    Filed: October 31, 2023
    Publication date: February 6, 2025
    Inventors: Weiran Yao, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, Jianguo Zhang, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
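A sketch of the reward-difference signal described above, in the style of a REINFORCE update: the retrospective model's parameters move in proportion to how much its reflection improved the actor's reward, while the actor stays frozen. The scalar-parameter setup and learning rate are illustrative, not the patented system:

```python
import numpy as np

def reward_delta(reward_with_reflection: float, reward_without: float) -> float:
    # How much the reflection changed the environment reward.
    return reward_with_reflection - reward_without

def update_retrospective_params(params, grad_logp, delta, lr=0.1):
    # Policy-gradient-style step: only the retrospective model is updated,
    # scaled by the change in reward its reflection produced.
    return params + lr * delta * grad_logp

params = np.zeros(3)                       # retrospective LM parameters (toy)
grad_logp = np.array([0.5, -0.2, 0.1])     # d log p(reflection) / d params
delta = reward_delta(1.0, 0.4)             # reflection helped: +0.6
new_params = update_retrospective_params(params, grad_logp, delta)
```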
  • Publication number: 20230368078
    Abstract: A computing device may perform training of a set of machine learning models on a first data set associated with a first domain. In some examples, the training may include, for each machine learning model of the set of machine learning models, inputting, as values for a set of parameters of the respective sets of parameters and for an iteration of a set of iterations, a moving average of the set of parameters calculated over a threshold number of previous iterations. The computing device may select a set of model states that are generated during the training of the set of machine learning models based on a validation performance of the set of model states during the training. The computing device may then generate an ensembled machine learning model by aggregating the set of machine learning models corresponding to the selected set of model states.
    Type: Application
    Filed: May 16, 2022
    Publication date: November 16, 2023
    Inventors: Devansh Arpit, Huan Wang, Yingbo Zhou, Caiming Xiong
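A toy rendering of the two ingredients in the abstract: a moving average of recent parameter iterates, and ensembling the model states with the best validation scores by parameter averaging. Window size, selection count, and the averaging aggregation are illustrative choices:

```python
import numpy as np

def moving_average(param_history, window):
    """Average of the last `window` parameter iterates."""
    return np.mean(param_history[-window:], axis=0)

def select_and_ensemble(states, val_scores, k=2):
    """Keep the k model states with the best (lowest) validation scores
    and aggregate them by averaging their parameters."""
    order = np.argsort(val_scores)[:k]
    return np.mean([states[i] for i in order], axis=0)

history = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
avg = moving_average(history, window=2)

states = [np.array([0.0, 0.0]), np.array([2.0, 2.0]), np.array([10.0, 10.0])]
ensembled = select_and_ensemble(states, val_scores=[0.3, 0.1, 0.9], k=2)
```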
  • Publication number: 20230105322
    Abstract: Embodiments described herein provide a system and method for extracting information. The system receives, via a communication interface, a dataset of a plurality of data samples. The system determines, in response to an input data sample from the dataset, a set of feature vectors via a plurality of pre-trained feature extractors, respectively. The system retrieves a set of memory bank vectors that correspond to the input data sample. The system generates, via a plurality of Multi-Layer Perceptrons (MLPs), a mapped set of representations in response to an input of the set of memory bank vectors, respectively. The system determines a loss objective between the set of feature vectors and the combination of the mapped set of representations and a network of layers in the MLP. The system updates the parameters of the plurality of MLPs and the parameters of the memory bank vectors by minimizing the computed loss objective.
    Type: Application
    Filed: January 28, 2022
    Publication date: April 6, 2023
    Inventors: Bram Wallace, Devansh Arpit, Huan Wang, Caiming Xiong
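A toy version of the loss computation described above: a per-sample memory bank vector is mapped through one small MLP per pre-trained extractor, and the loss compares each mapped representation with the matching extractor's feature vector. The shapes, two-layer MLPs, and squared-error loss are assumptions for illustration:

```python
import numpy as np

def mlp_forward(w1, w2, z):
    # Minimal two-layer MLP with a tanh nonlinearity.
    return w2 @ np.tanh(w1 @ z)

def loss(memory_vec, mlps, feature_vecs):
    """Sum of squared errors between each mapped representation and the
    corresponding pre-trained extractor's feature vector."""
    total = 0.0
    for (w1, w2), f in zip(mlps, feature_vecs):
        mapped = mlp_forward(w1, w2, memory_vec)
        total += np.sum((mapped - f) ** 2)
    return total

rng = np.random.default_rng(2)
memory_vec = rng.normal(size=4)                        # one sample's memory bank entry
mlps = [(rng.normal(size=(6, 4)), rng.normal(size=(8, 6))) for _ in range(2)]
feature_vecs = [rng.normal(size=8) for _ in range(2)]  # from 2 pre-trained extractors
l = loss(memory_vec, mlps, feature_vecs)
```

Minimizing this objective with respect to both the MLP weights and the memory bank vectors is the update step the abstract describes.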
  • Publication number: 20220335257
    Abstract: A system uses a neural network to detect anomalies in time series data. The system trains the neural network for a fixed number of iterations using data from a time window of the time series, and uses the loss value at the end of the fixed number of iterations to identify anomalies in the time series data. For each time window, the system initializes the neural network to random values and trains it for a fixed number of iterations on the data of the window. After the fixed number of iterations, the system compares the loss values for the data points to a threshold value; data points whose loss values exceed the threshold are identified as anomalous.
    Type: Application
    Filed: April 15, 2021
    Publication date: October 20, 2022
    Inventors: Devansh Arpit, Huan Wang, Caiming Xiong
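A miniature version of the windowed procedure: re-initialise a small model at random, train it for a fixed number of steps on one window, then flag the points whose per-point loss stays above a threshold. A linear model stands in for the neural network, and the iteration count, learning rate, and threshold are illustrative:

```python
import numpy as np

def window_losses(t, y, iters=2000, lr=0.05, seed=0):
    """Train a freshly-initialised linear model on one window for a fixed
    number of iterations, then return the per-point squared losses."""
    rng = np.random.default_rng(seed)
    w, b = rng.normal(), rng.normal()          # random re-initialisation
    for _ in range(iters):                     # fixed number of iterations
        err = w * t + b - y
        w -= lr * 2 * np.mean(err * t)
        b -= lr * 2 * np.mean(err)
    return (w * t + b - y) ** 2                # per-point loss after training

t = np.linspace(0.0, 1.0, 50)
y = 2.0 * t + 1.0
y[25] += 5.0                                   # inject one anomalous point
losses = window_losses(t, y)
anomalies = np.where(losses > 1.0)[0]          # threshold on the loss
```

The anomalous point resists the fit that the rest of the window agrees on, so its loss stays high after the fixed training budget.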
  • Publication number: 20220108183
    Abstract: The embodiments are directed to training a momentum contrastive autoencoder using a contrastive learning framework. The contrastive learning framework learns a latent space distribution by matching latent representations of the momentum contrastive autoencoder to a pre-specified distribution, such as a distribution over a unit hyper-sphere. Once the latent space distribution is learned, samples for a new data set may be obtained from the latent space distribution. This results in a simple and scalable algorithm that avoids many of the optimization challenges of existing generative models, while retaining the advantage of efficient sampling.
    Type: Application
    Filed: January 21, 2021
    Publication date: April 7, 2022
    Inventor: Devansh Arpit
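A sketch of two mechanics named in the abstract: projecting latents onto the unit hyper-sphere (the pre-specified latent distribution) and the momentum update that keeps a slowly-moving copy of the encoder. The linear encoders and the momentum coefficient are illustrative choices, not details from the filing:

```python
import numpy as np

def to_unit_sphere(z):
    # Normalize each latent so the representation lies on the unit sphere.
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def momentum_update(key_params, query_params, m=0.99):
    # Exponential moving average: the key encoder trails the query encoder.
    return m * key_params + (1.0 - m) * query_params

rng = np.random.default_rng(3)
W_q = rng.normal(size=(4, 8))              # query encoder (trained)
W_k = W_q.copy()                           # momentum ("key") encoder
x = rng.normal(size=(5, 8))                # batch of 5 inputs

z_q = to_unit_sphere(x @ W_q.T)            # latents on the unit sphere
W_q += 0.01 * rng.normal(size=W_q.shape)   # stand-in for a gradient step
W_k = momentum_update(W_k, W_q)            # key encoder follows slowly
```

Once the latent distribution matches the sphere, new samples can be drawn directly from it and decoded, which is the efficient-sampling advantage the abstract claims.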
  • Publication number: 20170132785
    Abstract: The quality of surgeries in captured videos is modeled in a learning network. For this task, a dataset of surgical videos is given with a corresponding set of scores labeled by reviewers, to learn a model for quality assessment of surgical procedures. The learned model is then used to automatically assess the quality of a surgical procedure, which obviates the need for professional experts to manually inspect such videos. The quality assessment can be performed offline or in real time as the surgical procedure is being performed. Surgical actions are also localized in space and time to provide feedback to the surgeon as to which actions can be improved.
    Type: Application
    Filed: April 26, 2016
    Publication date: May 11, 2017
    Inventors: Safwan R. Wshah, Ahmed E. Ghazi, Raja Bala, Devansh Arpit
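The training setup described here is standard supervised regression: precomputed video features in, reviewer quality scores out. A minimal stand-in (ridge regression; the real system uses a learning network, and the feature pipeline is not specified in the abstract):

```python
import numpy as np

def fit_quality_model(features, scores, lam=1.0):
    """Closed-form ridge regression from clip features to quality scores."""
    d = features.shape[1]
    return np.linalg.solve(features.T @ features + lam * np.eye(d),
                           features.T @ scores)

def assess_quality(model, clip_features):
    return clip_features @ model           # predicted quality score

rng = np.random.default_rng(4)
feats = rng.normal(size=(30, 6))           # 30 labelled surgical clips (toy)
scores = feats @ np.array([1.0, 0, 0, 0, 0, 0]) + 0.1 * rng.normal(size=30)
model = fit_quality_model(feats, scores)
pred = assess_quality(model, feats[0])
```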
  • Patent number: 9471886
    Abstract: A method for feature transformation of a data set includes: receiving a data set including original feature samples with corresponding class labels; splitting the data set into a direction optimization set and a training set; using the direction optimization set to calculate an optimum transformation vector that maximizes inter-class separability and minimizes intra-class variance of the feature samples with respect to corresponding class labels; using the optimum transformation vector to transform the rest of the original feature samples of the data set to new feature samples with enhanced discriminative characteristics; and training a classifier using the new feature samples, wherein the method is performed by one or more processors.
    Type: Grant
    Filed: August 13, 2014
    Date of Patent: October 18, 2016
    Assignee: RAYTHEON BBN TECHNOLOGIES CORP.
    Inventors: Manasvi Tickoo, Devansh Arpit, Xiaodan Zhuang, Walter Andrews, Pradeep Natarajan
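The “optimum transformation vector” described in this claim matches the classic Fisher discriminant direction (maximize inter-class separability, minimize intra-class variance). A two-class NumPy sketch of that direction, computed on the direction-optimization split:

```python
import numpy as np

def fisher_direction(X0, X1):
    """Fisher discriminant direction for two classes: w = Sw^{-1} (mu1 - mu0),
    where Sw is the within-class scatter."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(5)
X0 = rng.normal(loc=0.0, size=(100, 3))    # class 0 samples
X1 = rng.normal(loc=2.0, size=(100, 3))    # class 1 samples
w = fisher_direction(X0, X1)
proj0, proj1 = X0 @ w, X1 @ w              # transformed 1-D features
```

Projecting the remaining training samples onto `w` yields the new features with enhanced discriminative characteristics on which the classifier is then trained.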
  • Publication number: 20150117766
    Abstract: A method for feature transformation of a data set includes: receiving a data set including original feature samples with corresponding class labels; splitting the data set into a direction optimization set and a training set; using the direction optimization set to calculate an optimum transformation vector that maximizes inter-class separability and minimizes intra-class variance of the feature samples with respect to corresponding class labels; using the optimum transformation vector to transform the rest of the original feature samples of the data set to new feature samples with enhanced discriminative characteristics; and training a classifier using the new feature samples, wherein the method is performed by one or more processors.
    Type: Application
    Filed: August 13, 2014
    Publication date: April 30, 2015
    Inventors: Manasvi Tickoo, Devansh Arpit, Xiaodan Zhuang, Walter Andrews, Pradeep Natarajan