Patents by Inventor Nakul Agarwal

Nakul Agarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12386890
    Abstract: Systems and methods for augmenting large video language models (LVLMs) by incorporating a plausible action anticipation framework. The framework augments the LVLM by taking the plausibility of an action sequence into account. Two objective functions, a counterfactual-based plausible action sequence learning loss and a long-horizon action repetition loss, are derived and used to train the LVLM to generate plausible anticipated action sequences. The augmented LVLM is then able to produce sequences in which each action is temporally and spatially factual with respect to the others.
    Type: Grant
    Filed: April 8, 2024
    Date of Patent: August 12, 2025
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo, Kwonjoon Lee
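
As a rough illustration of how the two auxiliary objectives named in this abstract could be combined with a standard sequence loss, the sketch below wires a counterfactual-margin term and a long-horizon repetition penalty into one training loss. Every function, weight, and tensor shape here is an assumption for illustration, not the patented formulation.

```python
import torch
import torch.nn.functional as F

def sequence_log_prob(logits: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
    """Sum of log-probabilities of `actions` (T,) under `logits` (T, V)."""
    log_probs = F.log_softmax(logits, dim=-1)
    return log_probs.gather(-1, actions.unsqueeze(-1)).squeeze(-1).sum()

def counterfactual_loss(logits, plausible, counterfactual, margin=1.0):
    # Hinge term: score the plausible sequence above the counterfactual one.
    gap = sequence_log_prob(logits, plausible) - sequence_log_prob(logits, counterfactual)
    return F.relu(margin - gap)

def repetition_loss(logits):
    # Penalize probability mass on repeating the previous step's top action.
    probs = F.softmax(logits, dim=-1)                    # (T, V)
    prev_top = probs[:-1].argmax(dim=-1)                 # top action at t-1
    return probs[1:].gather(-1, prev_top.unsqueeze(-1)).mean()

T, V = 8, 20                                             # horizon, action vocabulary
logits = torch.randn(T, V, requires_grad=True)           # model outputs (assumed)
plausible = torch.randint(V, (T,))                       # ground-truth future actions
cf = torch.randint(V, (T,))                              # counterfactual sequence

loss = (F.cross_entropy(logits, plausible)
        + counterfactual_loss(logits, plausible, cf)
        + 0.1 * repetition_loss(logits))                 # weights are assumptions
loss.backward()
```
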
  • Publication number: 20250199123
    Abstract: A sensor system includes a ranged sensor that generates time-series data indicating positions of objects in an environment, and at least one processor that receives the time-series data generated by the ranged sensor, encodes the time-series data into edge embeddings with an encoder, and computes edge features and edge logits of the objects in the environment, represented in a latent space, based on the edge embeddings. The at least one processor also disentangles the edge features in the latent space, and generates a representation of time-invariant latent characteristics of interactions between the edge features in the latent space.
    Type: Application
    Filed: March 25, 2024
    Publication date: June 19, 2025
    Inventors: Victoria Magdalena DAX, Jiachen LI, Enna SACHDEVA, Nakul AGARWAL, Mykel J. KOCHENDERFER
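
The sketch below illustrates, under assumed shapes and modules, the encode-then-classify portion of this pipeline: time-series object positions are encoded per object, paired into edge embeddings, and mapped to edge features and edge logits. The disentanglement step is omitted, and all names are illustrative.

```python
import torch
import torch.nn as nn

class EdgeEncoder(nn.Module):
    """Encode per-object position tracks, then score pairwise edges."""

    def __init__(self, d_latent: int = 32, n_edge_types: int = 3):
        super().__init__()
        self.temporal = nn.GRU(input_size=2, hidden_size=d_latent, batch_first=True)
        self.edge_mlp = nn.Sequential(nn.Linear(2 * d_latent, d_latent), nn.ReLU())
        self.edge_head = nn.Linear(d_latent, n_edge_types)

    def forward(self, positions: torch.Tensor):
        # positions: (N objects, T steps, 2) x/y tracks from the ranged sensor
        _, h = self.temporal(positions)                  # h: (1, N, d_latent)
        h = h.squeeze(0)                                 # per-object embedding
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        edge_features = self.edge_mlp(pairs)             # (N, N, d_latent)
        edge_logits = self.edge_head(edge_features)      # (N, N, n_edge_types)
        return edge_features, edge_logits

enc = EdgeEncoder()
feats, logits = enc(torch.randn(4, 10, 2))               # 4 objects, 10 time steps
```
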
  • Publication number: 20250166377
    Abstract: A method for forming versatile action models for video understanding may gather data from a video. The data may comprise textual video representations and other task-specific language inputs. The method may use pre-trained large language model (LLM) next-token prediction for action anticipation based on the data from the video.
    Type: Application
    Filed: March 22, 2024
    Publication date: May 22, 2025
    Applicants: Honda Motor Co., Ltd., Brown University
    Inventors: Shijie Wang, Qi Zhao, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun
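
A minimal sketch of the core idea: render a textual video representation as a prompt and let a pre-trained causal LM continue it via next-token prediction. Off-the-shelf GPT-2 and this prompt format are stand-in assumptions; the application does not specify either.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Textual video representation plus a task-specific instruction (both assumed).
observed = ["crack egg", "whisk egg", "heat pan"]
prompt = "Observed actions: " + ", ".join(observed) + ". Next actions:"

inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
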
  • Publication number: 20250014338
    Abstract: An electronic device and method for object-centric video representation for action prediction is provided. The electronic device extracts a first sequence of video segments from video content associated with a domain and detects a set of objects in the first sequence of video segments. The electronic device generates a set of embeddings based on the first sequence of video segments and the set of objects. The electronic device applies a PTE model on the set of embeddings. The electronic device predicts, based on the application, a set of object-action pairs associated with a second sequence of video segments of the video content. Each object-action pair includes an action to be executed using an object of the set of objects in a video segment of the second sequence of video segments. The second sequence of video segments succeeds the first sequence of video segments in a timeline of the video content.
    Type: Application
    Filed: December 14, 2023
    Publication date: January 9, 2025
    Applicants: Honda Motor Co., Ltd., Brown University
    Inventors: CE ZHANG, CHANGCHENG FU, SHIJIE WANG, NAKUL AGARWAL, KWONJOON LEE, CHIHO CHOI, CHEN SUN
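
The sketch below approximates the described flow under assumed dimensions: segment and object embeddings are jointly encoded by a generic transformer encoder (standing in for the abstract's PTE model), and two heads read out object-action pairs for future segments. Everything named here is illustrative.

```python
import torch
import torch.nn as nn

d, n_segments, n_objects = 64, 6, 3                      # assumed sizes
n_object_cls, n_action_cls = 10, 15

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2)
object_head = nn.Linear(d, n_object_cls)                 # which object is used
action_head = nn.Linear(d, n_action_cls)                 # which action is executed

segment_emb = torch.randn(1, n_segments, d)              # first sequence of segments
object_emb = torch.randn(1, n_objects, d)                # detected-object embeddings
tokens = torch.cat([segment_emb, object_emb], dim=1)     # joint set of embeddings

h = encoder(tokens)
future = h[:, :n_segments]                               # one slot per future segment
object_action_pairs = (object_head(future).argmax(-1),   # (1, n_segments) objects
                       action_head(future).argmax(-1))   # (1, n_segments) actions
```
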
  • Publication number: 20250014321
    Abstract: An electronic device and method for using neural language models for long-term action anticipation from videos is provided. The electronic device receives a video that includes one or more objects performing a physical task and generates, based on the video, a first set of tags that corresponds to a first sequence of actions associated with the physical task. The electronic device generates a first prompt for a neural language model based on the first set of tags and predicts, by application of the neural language model on the first prompt, a second set of tags that corresponds to a second sequence of actions associated with the physical task. The second sequence of actions succeeds the first sequence of actions. The electronic device controls a display device to display first prediction information based on the second set of tags.
    Type: Application
    Filed: December 14, 2023
    Publication date: January 9, 2025
    Applicants: Honda Motor Co., Ltd., Brown University
    Inventors: CE ZHANG, CHANGCHENG FU, SHIJIE WANG, QI ZHAO, CHEN SUN, NAKUL AGARWAL, KWONJOON LEE
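
A minimal sketch of the tag-to-prompt-to-tag round trip this abstract describes. The prompt template and parsing rules are assumptions, and the language model call is stubbed so the data flow is the focus.

```python
def build_prompt(observed_tags: list[str]) -> str:
    steps = "; ".join(f"{i + 1}. {t}" for i, t in enumerate(observed_tags))
    return (f"A person is performing a task. Steps so far: {steps}. "
            "List the next steps in the same format:")

def parse_tags(completion: str) -> list[str]:
    # "4. pour batter; 5. flip pancake" -> ["pour batter", "flip pancake"]
    parts = [p.strip() for p in completion.split(";") if p.strip()]
    return [p.split(".", 1)[1].strip() if "." in p else p for p in parts]

def language_model(prompt: str) -> str:
    # Stand-in for a real neural language model call.
    return "4. pour batter; 5. flip pancake"

first_tags = ["crack egg", "whisk egg", "heat pan"]       # recognized actions
second_tags = parse_tags(language_model(build_prompt(first_tags)))
print(second_tags)                                        # predicted future actions
```
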
  • Patent number: 12183090
    Abstract: According to one aspect, intersection scenario description may be implemented by receiving a video stream of a surrounding environment of an ego-vehicle, extracting tracklets and appearance features associated with dynamic objects from the surrounding environment, extracting motion features associated with dynamic objects from the surrounding environment based on the corresponding tracklets, passing the appearance features through an appearance neural network to generate an appearance model, passing the motion features through a motion neural network to generate a motion model, passing the appearance model and the motion model through a fusion network to generate a fusion output, passing the fusion output through a classifier to generate a classifier output, and passing the classifier output through a loss function to generate a multi-label classification output associated with the ego-vehicle, dynamic objects, and corresponding motion paths.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: December 31, 2024
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Nakul Agarwal, Yi-Ting Chen
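
The two-stream structure of this pipeline maps naturally to a small sketch: separate appearance and motion networks, a fusion network, a classifier, and a multi-label loss. Dimensions, layer choices, and the batch of per-track features below are all assumed.

```python
import torch
import torch.nn as nn

d_app, d_motion, d_fused, n_labels = 128, 64, 96, 12     # assumed sizes

appearance_net = nn.Sequential(nn.Linear(d_app, d_fused), nn.ReLU())
motion_net = nn.Sequential(nn.Linear(d_motion, d_fused), nn.ReLU())
fusion_net = nn.Sequential(nn.Linear(2 * d_fused, d_fused), nn.ReLU())
classifier = nn.Linear(d_fused, n_labels)
loss_fn = nn.BCEWithLogitsLoss()                         # multi-label objective

app_feats = torch.randn(8, d_app)                        # per-track appearance features
motion_feats = torch.randn(8, d_motion)                  # tracklet-derived motion features
targets = torch.randint(2, (8, n_labels)).float()        # multi-label scenario targets

fused = fusion_net(torch.cat([appearance_net(app_feats),
                              motion_net(motion_feats)], dim=-1))
loss = loss_fn(classifier(fused), targets)
loss.backward()
```
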
  • Patent number: 12169964
    Abstract: A system and method for providing weakly-supervised online action segmentation that include receiving image data associated with multi-view videos of a procedure, wherein the procedure involves a plurality of atomic actions. The system and method also include analyzing the image data using weakly-supervised action segmentation to identify each of the plurality of atomic actions by using an ordered sequence of action labels. The system and method additionally include training a neural network with data pertaining to the plurality of atomic actions based on the weakly-supervised action segmentation. The system and method further include executing online action segmentation to label atomic actions that are occurring in real time, based on the plurality of atomic actions on which the neural network was trained.
    Type: Grant
    Filed: February 1, 2022
    Date of Patent: December 17, 2024
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Reza Ghoddoosian, Isht Dwivedi, Nakul Agarwal, Chiho Choi, Behzad Dariush
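
Weak supervision here means only the ordered transcript of atomic actions is known per video. The sketch below uses a naive uniform split to turn a transcript into frame-level targets, a common weak-alignment baseline rather than the patented procedure, and trains a per-frame classifier that can then label frames online.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def uniform_frame_labels(transcript: list[int], n_frames: int) -> torch.Tensor:
    """Spread the ordered action labels uniformly over the frames."""
    bounds = torch.linspace(0, n_frames, len(transcript) + 1).long()
    labels = torch.empty(n_frames, dtype=torch.long)
    for i, action in enumerate(transcript):
        labels[bounds[i]:bounds[i + 1]] = action
    return labels

n_frames, d_feat, n_actions = 100, 32, 5                 # assumed sizes
frames = torch.randn(n_frames, d_feat)                   # per-frame visual features
transcript = [0, 3, 1]                                   # ordered weak supervision

model = nn.Linear(d_feat, n_actions)
loss = F.cross_entropy(model(frames), uniform_frame_labels(transcript, n_frames))
loss.backward()

online_labels = model(frames).argmax(-1)                 # frame-by-frame segmentation
```
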
  • Publication number: 20240404297
    Abstract: Systems and methods for training a neural network for generating a reasoning statement are provided. In one embodiment, a method includes receiving sensor data from a perspective of an ego agent. The method includes identifying a plurality of captured objects in the at least one roadway environment. The method includes receiving a set of ranking classifications for a captured object of the plurality of captured objects, where each ranking classification applies an importance attribute and includes an annotator reasoning statement, a natural language explanation for the applied attribute. The method includes generating a training dataset for the object type including the annotator reasoning statements of the set of ranking classifications that include the applied attribute from the plurality of importance attributes in the importance category. The method includes training the neural network to generate a generated reasoning statement based on the training dataset in response to a training agent detecting a detected object of the object type.
    Type: Application
    Filed: June 5, 2023
    Publication date: December 5, 2024
    Inventors: Enna SACHDEVA, Nakul AGARWAL, Sean F. ROELOFS, Jiachen LI, Behzad DARIUSH, Chiho CHOI
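
The dataset-construction step reads like the sketch below: ranking classifications carrying an importance attribute are grouped by object type, each paired with its annotator reasoning statement. The record fields are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class RankingClassification:
    object_type: str          # e.g., "pedestrian" (hypothetical field names)
    attribute: str            # applied importance attribute
    reasoning: str            # annotator's natural-language explanation

annotations = [
    RankingClassification("pedestrian", "high", "crossing in front of the ego path"),
    RankingClassification("pedestrian", "low", "walking away on the far sidewalk"),
    RankingClassification("vehicle", "high", "merging into the ego lane"),
]

# Group (attribute, reasoning statement) pairs by object type to form the
# training dataset for the reasoning-statement generator.
training_data = defaultdict(list)
for a in annotations:
    training_data[a.object_type].append((a.attribute, a.reasoning))

print(training_data["pedestrian"])
```
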
  • Publication number: 20240371166
    Abstract: According to one aspect, weakly-supervised action segmentation may include performing feature extraction to extract one or more features associated with a current frame of a video including a series of one or more actions, feeding one or more of the features to a recognition network to generate a predicted action score for the current frame of the video, feeding one or more of the features and the predicted action score to an action transition model to generate a potential subsequent action, feeding the potential subsequent action and the predicted action score to a hybrid segmentation model to generate a predicted sequence of actions from a first frame of the video to the current frame of the video, and segmenting or labeling one or more frames of the video based on the predicted sequence of actions from the first frame of the video to the current frame of the video.
    Type: Application
    Filed: April 27, 2023
    Publication date: November 7, 2024
    Inventors: Reza GHODDOOSIAN, Isht DWIVEDI, Nakul AGARWAL, Behzad DARIUSH
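
The per-frame loop this abstract outlines can be sketched as follows: a recognition network scores the current frame, a learned transition matrix (standing in for the action transition model) biases toward likely successors, and a greedy decoder (a simple stand-in for the hybrid segmentation model) emits the label sequence. All components are assumed.

```python
import torch
import torch.nn as nn

d_feat, n_actions = 32, 5                                # assumed sizes
recognizer = nn.Linear(d_feat, n_actions)                # frame -> action scores
transition = torch.randn(n_actions, n_actions)           # learned A->B scores (assumed)

frames = torch.randn(60, d_feat)                         # features up to the current frame
sequence: list[int] = []
prev = None
for feat in frames:
    scores = recognizer(feat)                            # recognition network
    if prev is not None:
        scores = scores + transition[prev]               # bias toward likely successors
    prev = int(scores.argmax())                          # greedy decode step
    sequence.append(prev)
# `sequence` now labels every frame from the first to the current one.
```
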
  • Patent number: 12094214
    Abstract: A system and method for providing an agent action anticipative transformer that include receiving image data associated with a video of a surrounding environment of an ego agent. The system and method additionally include analyzing the image data and extracting short range clips from the image data. The system and method also include analyzing the short range clips and extracting clip-level features associated with each of the short range clips. The system and method further include executing self-supervision using causal masking with respect to the extracted clip-level features to output action predictions and feature predictions to enable ego-centric action anticipation with respect to at least one target agent to autonomously control the ego agent.
    Type: Grant
    Filed: April 29, 2022
    Date of Patent: September 17, 2024
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Harshayu Girase, Nakul Agarwal, Chiho Choi
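
The causal-masking idea can be sketched with a standard transformer encoder over clip-level features: each clip attends only to the past, and two heads emit action predictions and next-clip feature predictions for self-supervision. Architecture details below are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d, n_clips, n_actions = 64, 8, 10                        # assumed sizes
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2)
action_head = nn.Linear(d, n_actions)                    # action predictions
feature_head = nn.Linear(d, d)                           # next-clip feature predictions

clips = torch.randn(1, n_clips, d)                       # clip-level features
causal = torch.triu(torch.full((n_clips, n_clips), float("-inf")), diagonal=1)

h = encoder(clips, mask=causal)                          # each clip sees only the past
action_preds = action_head(h)
feature_preds = feature_head(h)
self_sup_loss = F.mse_loss(feature_preds[:, :-1], clips[:, 1:])  # predict next features
```
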
  • Publication number: 20240161447
    Abstract: According to one aspect, spatial action localization in the future (SALF) may include feeding a frame from a time step of a video clip through an encoder to generate a latent feature, feeding the latent feature and one or more latent features from one or more previous time steps of the video clip through a future feature predictor to generate a cumulative information for the time step, feeding the cumulative information through a decoder to generate a predicted action area and a predicted action classification associated with the predicted action area, and implementing an action based on the predicted action area and the predicted action classification. The encoder may include a 2D convolutional neural network (CNN) and/or a 3D-CNN. The future feature predictor may be based on an ordinary differential equation (ODE) function.
    Type: Application
    Filed: April 14, 2023
    Publication date: May 16, 2024
    Inventors: Hyung-gun CHI, Kwonjoon LEE, Nakul AGARWAL, Yi XU, Chiho CHOI
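
The ODE-based future feature predictor can be illustrated with the generic neural-ODE pattern below: a small network defines the latent derivative, Euler steps roll the current latent forward, and a decoder emits a predicted action area and class. Shapes and integrator choice are assumed, not taken from the patent.

```python
import torch
import torch.nn as nn

d = 64                                                   # latent size (assumed)
ode_func = nn.Sequential(nn.Linear(d, d), nn.Tanh(), nn.Linear(d, d))
box_head = nn.Linear(d, 4)                               # predicted action area (box)
cls_head = nn.Linear(d, 10)                              # predicted action class

def predict_future(latent: torch.Tensor, steps: int = 5, dt: float = 0.1):
    # Euler integration of d(latent)/dt = ode_func(latent).
    for _ in range(steps):
        latent = latent + dt * ode_func(latent)
    return box_head(latent), cls_head(latent)

latent = torch.randn(1, d)                               # encoder output, current frame
pred_box, pred_cls = predict_future(latent)
```
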
  • Patent number: 11845464
    Abstract: Driver behavior risk assessment and pedestrian awareness may include receiving an input stream of images of an environment including one or more objects within the environment, estimating an intention of an ego vehicle based on the input stream of images and a temporal recurrent network (TRN), generating a scene representation based on the input stream of images and a graph neural network (GNN), generating a prediction of a situation based on the scene representation and the intention of the ego vehicle, and generating an influenced or non-influenced action determination based on the prediction of the situation and the scene representation.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: December 19, 2023
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Nakul Agarwal, Yi-Ting Chen
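
The two branches this abstract combines can be sketched as a recurrent intention estimator (standing in for the TRN) and one round of mean-aggregation message passing (standing in for the GNN), whose outputs feed a situation predictor. All modules and shapes below are illustrative.

```python
import torch
import torch.nn as nn

d = 32                                                   # feature size (assumed)
intent_rnn = nn.GRU(input_size=d, hidden_size=d, batch_first=True)
msg = nn.Linear(d, d)                                    # one message-passing round
situation_head = nn.Linear(2 * d, 4)                     # predicted situation classes

frame_feats = torch.randn(1, 10, d)                      # per-frame image features
node_feats = torch.randn(5, d)                           # one node per scene object

_, intention = intent_rnn(frame_feats)                   # (1, 1, d) ego intention
scene = torch.tanh(msg(node_feats)).mean(0, keepdim=True)  # pooled scene representation

situation = situation_head(torch.cat([intention.squeeze(0), scene], dim=-1))
```
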
  • Publication number: 20230351759
    Abstract: A system and method for providing an agent action anticipative transformer that include receiving image data associated with a video of a surrounding environment of an ego agent. The system and method additionally include analyzing the image data and extracting short range clips from the image data. The system and method also include analyzing the short range clips and extracting clip-level features associated with each of the short range clips. The system and method further include executing self-supervision using causal masking with respect to the extracted clip-level features to output action predictions and feature predictions to enable ego-centric action anticipation with respect to at least one target agent to autonomously control the ego agent.
    Type: Application
    Filed: April 29, 2022
    Publication date: November 2, 2023
    Inventors: Harshayu GIRASE, Nakul AGARWAL, Chiho CHOI
  • Publication number: 20230311942
    Abstract: Driver behavior risk assessment and pedestrian awareness may include receiving an input stream of images of an environment including one or more objects within the environment, estimating an intention of an ego vehicle based on the input stream of images and a temporal recurrent network (TRN), generating a scene representation based on the input stream of images and a graph neural network (GNN), generating a prediction of a situation based on the scene representation and the intention of the ego vehicle, and generating an influenced or non-influenced action determination based on the prediction of the situation and the scene representation.
    Type: Application
    Filed: June 8, 2023
    Publication date: October 5, 2023
    Inventors: Nakul AGARWAL, Yi-Ting CHEN
  • Patent number: 11741723
    Abstract: A system and method for performing intersection scenario retrieval that includes receiving a video stream of a surrounding environment of an ego vehicle. The system and method also include analyzing the video stream to trim the video stream into video clips of an intersection scene associated with the travel of the ego vehicle. The system and method additionally include annotating the ego vehicle, dynamic objects, and their motion paths that are included within the intersection scene with action units that describe an intersection scenario. The system and method further include retrieving at least one intersection scenario based on a query of an electronic dataset that stores a combination of action units to operably control a presentation of at least one intersection scenario video clip that includes the at least one intersection scenario.
    Type: Grant
    Filed: June 29, 2020
    Date of Patent: August 29, 2023
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Yi-Ting Chen, Nakul Agarwal, Behzad Dariush, Ahmed Taha
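
Retrieval by action units reduces, in sketch form, to a subset query over annotated clips: return every clip whose action-unit set contains all queried units. The unit strings and storage layout below are illustrative.

```python
# Each clip maps to its annotated action units (strings are illustrative).
dataset = {
    "clip_001.mp4": {"ego:left-turn", "pedestrian:crossing"},
    "clip_002.mp4": {"ego:straight", "vehicle:right-turn"},
    "clip_003.mp4": {"ego:left-turn", "vehicle:oncoming", "pedestrian:crossing"},
}

def retrieve(query: set[str]) -> list[str]:
    """Return clips whose annotations contain every queried action unit."""
    return [clip for clip, units in dataset.items() if query <= units]

print(retrieve({"ego:left-turn", "pedestrian:crossing"}))
# -> ['clip_001.mp4', 'clip_003.mp4']
```
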
  • Publication number: 20230154195
    Abstract: According to one aspect, intersection scenario description may be implemented by receiving a video stream of a surrounding environment of an ego-vehicle, extracting tracklets and appearance features associated with dynamic objects from the surrounding environment, extracting motion features associated with dynamic objects from the surrounding environment based on the corresponding tracklets, passing the appearance features through an appearance neural network to generate an appearance model, passing the motion features through a motion neural network to generate a motion model, passing the appearance model and the motion model through a fusion network to generate a fusion output, passing the fusion output through a classifier to generate a classifier output, and passing the classifier output through a loss function to generate a multi-label classification output associated with the ego-vehicle, dynamic objects, and corresponding motion paths.
    Type: Application
    Filed: June 30, 2022
    Publication date: May 18, 2023
    Inventors: Nakul AGARWAL, Yi-Ting CHEN
  • Publication number: 20230141037
    Abstract: A system and method for providing weakly-supervised online action segmentation that include receiving image data associated with multi-view videos of a procedure, wherein the procedure involves a plurality of atomic actions. The system and method also include analyzing the image data using weakly-supervised action segmentation to identify each of the plurality of atomic actions by using an ordered sequence of action labels. The system and method additionally include training a neural network with data pertaining to the plurality of atomic actions based on the weakly-supervised action segmentation. The system and method further include executing online action segmentation to label atomic actions that are occurring in real time, based on the plurality of atomic actions on which the neural network was trained.
    Type: Application
    Filed: February 1, 2022
    Publication date: May 11, 2023
    Inventors: Reza GHODDOOSIAN, Isht DWIVEDI, Nakul AGARWAL, Chiho CHOI, Behzad DARIUSH
  • Patent number: 11580743
    Abstract: A system and method for providing unsupervised domain adaptation for spatio-temporal action localization that includes receiving video data associated with a source domain and a target domain that are associated with a surrounding environment of a vehicle. The system and method also include analyzing the video data associated with the source domain and the target domain and determining a key frame of the source domain and a key frame of the target domain. The system and method additionally include completing an action localization model to model a temporal context of actions occurring within the key frame of the source domain and the key frame of the target domain and completing an action adaptation model to localize individuals and their actions and to classify the actions based on the video data. The system and method further include combining losses to complete spatio-temporal action localization of individuals and actions.
    Type: Grant
    Filed: March 25, 2022
    Date of Patent: February 14, 2023
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Yi-Ting Chen, Behzad Dariush, Nakul Agarwal, Ming-Hsuan Yang
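
The closing step, combining losses, can be illustrated with one common adaptation choice: a supervised action loss on source key frames plus a domain-confusion loss from a source-vs-target domain classifier. This is a generic pattern under assumed shapes, not necessarily the patented combination.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d, n_actions = 64, 8                                     # assumed sizes
backbone = nn.Linear(d, d)                               # shared feature extractor
action_head = nn.Linear(d, n_actions)                    # action classification
domain_head = nn.Linear(d, 2)                            # source vs. target

src = torch.randn(16, d)                                 # source key-frame features
tgt = torch.randn(16, d)                                 # target key-frame features
src_labels = torch.randint(n_actions, (16,))             # source action labels

h_src, h_tgt = backbone(src), backbone(tgt)
action_loss = F.cross_entropy(action_head(h_src), src_labels)
domain_logits = domain_head(torch.cat([h_src, h_tgt]))
domain_labels = torch.cat([torch.zeros(16), torch.ones(16)]).long()
domain_loss = F.cross_entropy(domain_logits, domain_labels)

total = action_loss + 0.5 * domain_loss                  # combined losses (weight assumed)
total.backward()
```
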
  • Patent number: 11403850
    Abstract: A system and method for providing unsupervised domain adaptation for spatio-temporal action localization that includes receiving video data associated with a surrounding environment of a vehicle. The system and method also include completing an action localization model to model a temporal context of actions occurring within the surrounding environment of the vehicle based on the video data and completing an action adaptation model to localize individuals and their actions and to classify the actions based on the video data. The system and method further include combining losses from the action localization model and the action adaptation model to complete spatio-temporal action localization of individuals and actions that occur within the surrounding environment of the vehicle.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: August 2, 2022
    Assignee: Honda Motor Co., Ltd.
    Inventors: Yi-Ting Chen, Behzad Dariush, Nakul Agarwal, Ming-Hsuan Yang
  • Publication number: 20220215661
    Abstract: A system and method for providing unsupervised domain adaptation for spatio-temporal action localization that includes receiving video data associated with a source domain and a target domain that are associated with a surrounding environment of a vehicle. The system and method also include analyzing the video data associated with the source domain and the target domain and determining a key frame of the source domain and a key frame of the target domain. The system and method additionally include completing an action localization model to model a temporal context of actions occurring within the key frame of the source domain and the key frame of the target domain and completing an action adaptation model to localize individuals and their actions and to classify the actions based on the video data. The system and method further include combining losses to complete spatio-temporal action localization of individuals and actions.
    Type: Application
    Filed: March 25, 2022
    Publication date: July 7, 2022
    Inventors: Yi-Ting CHEN, Behzad DARIUSH, Nakul AGARWAL, Ming-Hsuan YANG