Patents by Inventor Manmohan Chandraker

Manmohan Chandraker has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250145176
    Abstract: Methods and systems for operating a vehicle include prompting a large language model (LLM) to generate parameters for a rule-based planner based on historical data for vehicles in a road scene. A trajectory is generated using the parameters. A driving action is performed to implement the trajectory.
    Type: Application
    Filed: November 1, 2024
    Publication date: May 8, 2025
    Inventors: Manmohan Chandraker, Francesco Pittaluga, Vijay Kumar Baikampady Gopalkrishna, Sharan Satish Prema
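    A minimal Python sketch of the flow in the abstract above, with a hypothetical query_llm stand-in and an illustrative constant-acceleration planner; the parameter names (target_speed, time_headway) are assumptions, not the patent's.
      # Sketch: an LLM proposes parameters for a rule-based planner, which rolls out a trajectory.
      import json

      def query_llm(prompt: str) -> str:
          # Hypothetical stand-in for an LLM call; returns planner parameters as JSON.
          return json.dumps({"target_speed": 12.0, "time_headway": 1.5, "horizon_s": 4.0, "dt": 0.5})

      def rule_based_planner(params: dict, ego_speed: float):
          # Simple constant-acceleration rollout toward the target speed.
          traj, v, x = [], ego_speed, 0.0
          for _ in range(int(params["horizon_s"] / params["dt"])):
              a = max(min((params["target_speed"] - v) / params["time_headway"], 2.0), -3.0)
              v = max(v + a * params["dt"], 0.0)
              x += v * params["dt"]
              traj.append((x, v))
          return traj

      history = "lead vehicle 30 m ahead at 10 m/s; ego at 8 m/s"
      params = json.loads(query_llm(f"Propose planner parameters for: {history}"))
      print(rule_based_planner(params, ego_speed=8.0))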
  • Publication number: 20250148697
    Abstract: Methods and systems include training a model for rendering a three-dimensional volume using a loss function that includes a depth loss term and a distribution loss term that regularize an output of the model to produce realistic scenarios. A simulated scenario is generated based on an original scenario, with the simulated scenario including a different position and pose relative to the original scenario in a three-dimensional (3D) scene that is generated by the model from the original scenario. A self-driving model is trained for an autonomous vehicle using the simulated scenario.
    Type: Application
    Filed: November 4, 2024
    Publication date: May 8, 2025
    Inventors: Ziyu Jiang, Bingbing Zhuang, Manmohan Chandraker
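    A minimal sketch, in Python with NumPy, of a loss of the kind described above: a photometric term plus a depth term and a distribution term. Here the distribution term is a simple entropy penalty on ray sample weights, an assumption rather than the patent's exact regularizer, and all weights are illustrative.
      import numpy as np

      def total_loss(pred_rgb, gt_rgb, pred_depth, lidar_depth, ray_weights,
                     lambda_depth=0.1, lambda_dist=0.01):
          color = np.mean((pred_rgb - gt_rgb) ** 2)             # photometric term
          depth = np.mean((pred_depth - lidar_depth) ** 2)      # depth supervision term
          # Distribution term: encourage each ray's sample weights to be concentrated
          # (low entropy) so geometry stays crisp under novel positions and poses.
          w = ray_weights / (ray_weights.sum(axis=-1, keepdims=True) + 1e-8)
          dist = -np.mean(np.sum(w * np.log(w + 1e-8), axis=-1))
          return color + lambda_depth * depth + lambda_dist * dist

      rng = np.random.default_rng(0)
      print(total_loss(rng.random((4, 3)), rng.random((4, 3)),
                       rng.random(4), rng.random(4), rng.random((4, 16))))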
  • Publication number: 20250148736
    Abstract: A computer-implemented method for synthesizing an image includes extracting agent neural radiance fields (NeRFs) from driving video logs and storing agent NeRFs in a database. For a driving video log to be edited, a scene NeRF and agent NeRFs are extracted from the driving video log to be edited. One or more agent NeRFs are selected from the database to insert into or replace existing agents in a traffic scene of the driving video log based on photorealism criteria. The traffic scene is edited by inserting a selected agent NeRF into the traffic scene, replacing existing agents in the traffic scene with the selected agent NeRF, or removing one or more existing agents from the traffic scene. An image of the edited traffic scene is synthesized by composing edited agent NeRFs with the scene NeRF and performing volume rendering.
    Type: Application
    Filed: October 23, 2024
    Publication date: May 8, 2025
    Inventors: Bingbing Zhuang, Ziyu Jiang, Manmohan Chandraker, Shanlin Sun
  • Publication number: 20250148766
    Abstract: Systems and methods for leveraging semantic information for a multi-domain visual agent. Semantic information can be leveraged to obtain a multi-domain visual agent. To train the multi-domain visual agent, questions can be sampled from question templates for domain-specific label spaces to obtain a unified label space. The domain-specific labels from the domain-specific label spaces can be mapped into natural language descriptions (NLD) to obtain mapped NLD. The mapped NLD can be converted into prompts by combining the questions sampled from the unified label space and the annotations. The semantic information can be learned by iteratively generating outputs from tokens extracted from the prompts using a large-language model (LLM). The multi-domain visual agent (MDVA) can be trained using the semantic information.
    Type: Application
    Filed: November 1, 2024
    Publication date: May 8, 2025
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Masoud Faraki, Yumin Suh, Manmohan Chandraker
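    A minimal sketch of the prompt-construction idea above, with hypothetical question templates and label-to-description mappings; the template text and label names are assumptions for illustration.
      import random

      # Hypothetical question templates and domain-specific label -> natural language description map.
      templates = ["What {thing} is visible in the image?", "Is there a {thing} in the scene?"]
      label_to_nld = {"sedan": "a four-door passenger car", "ped": "a person walking"}

      def build_prompt(domain_label, rng):
          # Combine a sampled question with the mapped description to form a training prompt.
          question = rng.choice(templates).format(thing="object")
          return f"{question} Answer: {label_to_nld[domain_label]}."

      rng = random.Random(0)
      print(build_prompt("sedan", rng))
      print(build_prompt("ped", rng))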
  • Publication number: 20250148757
    Abstract: Systems and methods for a self-improving data engine for autonomous vehicles are presented. To train the self-improving data engine for autonomous vehicles (SIDE), multi-modality dense captioning (MMDC) models can detect unrecognized classes from diversified descriptions for input images. A vision-language model (VLM) can generate textual features from the diversified descriptions and image features from the corresponding images. Curated features, including curated textual features and curated image features, can be obtained by comparing similarity scores between the textual features and top-ranked image features based on their likelihood scores. Annotations, including bounding boxes and labels, can be generated for the curated features by comparing the similarity scores of labels generated by a zero-shot classifier and the curated textual features. The SIDE can be trained using the curated features, annotations, and feedback.
    Type: Application
    Filed: October 30, 2024
    Publication date: May 8, 2025
    Inventors: Jong-Chyi Su, Sparsh Garg, Samuel Schulter, Manmohan Chandraker, Mingfu Liang
  • Publication number: 20250148911
    Abstract: Methods and systems include determining actions for a plurality of agents in a driving scenario using a diffusion model, based on individual controllable behavior patterns for the agents. A state of the driving scenario is updated based on the determined actions for the plurality of agents. The determination of actions and the update of the state are repeated in a closed-loop fashion to generate simulated trajectories for the plurality of agents. A planner model is trained to select actions for an operating agent based on the simulated trajectories.
    Type: Application
    Filed: October 31, 2024
    Publication date: May 8, 2025
    Inventors: Manmohan Chandraker, Francesco Pittaluga, Bingbing Zhuang, Wei-Jer Chang
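    A minimal closed-loop rollout sketch of the loop described above; sample_actions is a hypothetical stand-in for the diffusion sampler, and the kinematic step and behavior parameters are illustrative assumptions.
      import numpy as np

      def sample_actions(state, behaviors, rng):
          # Stand-in for the diffusion sampler: accel/steer per agent, biased by its behavior pattern.
          return rng.normal(loc=behaviors, scale=0.2, size=(state.shape[0], 2))

      def step(state, actions, dt=0.1):
          # state: [x, y, heading, speed] per agent; simple kinematic update.
          x, y, h, v = state.T
          accel, steer = actions.T
          v = np.maximum(v + accel * dt, 0.0)
          h = h + steer * dt
          return np.stack([x + v * np.cos(h) * dt, y + v * np.sin(h) * dt, h, v], axis=1)

      rng = np.random.default_rng(0)
      state = np.array([[0.0, 0.0, 0.0, 10.0], [5.0, 3.5, 0.0, 8.0]])
      behaviors = np.array([[0.5, 0.0], [-0.2, 0.01]])   # per-agent controllable pattern
      trajectory = [state]
      for _ in range(50):                                # closed loop: sample, step, repeat
          state = step(state, sample_actions(state, behaviors, rng))
          trajectory.append(state)
      print(len(trajectory), trajectory[-1].round(2))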
  • Publication number: 20250139527
    Abstract: Systems and methods for a self-improving model for agentic visual program synthesis. An agent can be continuously trained using an optimal training tuple to perform a corrective action on a monitored entity, which in turn generates new input data for the training. To train the agent, an input question can be decomposed into vision model tasks to generate task outputs. The task outputs can be corrected based on feedback to obtain corrected task outputs. The optimal training tuple can be generated by comparing an optimal tuple threshold with a similarity score of the input image, the input question, and the corrected task outputs.
    Type: Application
    Filed: October 29, 2024
    Publication date: May 1, 2025
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Samuel Schulter, Manmohan Chandraker, Zaid Khan
  • Publication number: 20250118063
    Abstract: Systems and methods include detecting one or more objects in an image and generating one or more captions for the image. One or more predicted categories of the one or more objects detected in the image and the one or more captions are matched. From the one or more predicted categories, a category that is not successfully predicted in the image is identified. Data is curated to improve the category that is not successfully predicted in the image. A perception model is finetuned using the curated data.
    Type: Application
    Filed: September 20, 2024
    Publication date: April 10, 2025
    Inventors: Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Manmohan Chandraker, Mingfu Liang
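    A minimal sketch of the matching-and-curation idea above, using toy captions and a hypothetical category vocabulary; the real system matches model predictions rather than substring checks.
      # Compare detector categories against caption text to flag missed categories, then curate data.
      detections = {"car", "traffic light"}
      captions = ["a car stops while a cyclist crosses near a traffic light"]
      vocabulary = {"car", "cyclist", "traffic light", "bus"}

      mentioned = {c for c in vocabulary if any(c in cap for cap in captions)}
      missed = mentioned - detections          # categories captioned but not detected
      print("missed categories:", missed)

      # Curate: keep only images whose captions mention a missed category.
      pool = {"img_001": "a cyclist rides past a bus stop", "img_002": "an empty highway"}
      curated = [k for k, cap in pool.items() if any(c in cap for c in missed)]
      print("curated for finetuning:", curated)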
  • Publication number: 20250115254
    Abstract: Systems and methods for a hybrid motion planner for autonomous vehicles. A multi-lane intelligent driver model (MIDM) can generate trajectory predictions from collected data by considering adjacent lanes of an ego vehicle. A multi-lane hybrid planning driver model (MPDM) can be trained using open-loop ground truth data and closed-loop simulations to obtain a trained MPDM. The trained MPDM can predict planned trajectories from the collected data and the trajectory predictions to generate final trajectories for the autonomous vehicles. The final trajectories can be employed to control the autonomous vehicles.
    Type: Application
    Filed: October 3, 2024
    Publication date: April 10, 2025
    Inventors: Buyu Liu, Francesco Pittaluga, Bingbing Zhuang, Manmohan Chandraker, Samuel Sohn
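    The single-lane intelligent driver model (IDM) that the multi-lane variant above builds on, sketched in Python; the parameter values are illustrative defaults, not taken from the patent.
      import math

      def idm_accel(v, v_lead, gap, v0=15.0, T=1.5, a_max=1.5, b=2.0, s0=2.0, delta=4):
          dv = v - v_lead                                   # closing speed to the lead vehicle
          s_star = s0 + v * T + v * dv / (2 * math.sqrt(a_max * b))  # desired gap
          return a_max * (1 - (v / v0) ** delta - (s_star / max(gap, 0.1)) ** 2)

      # Ego at 12 m/s, lead at 10 m/s, 25 m ahead: the model commands gentle braking.
      print(round(idm_accel(12.0, 10.0, 25.0), 3))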
  • Publication number: 20250115278
    Abstract: Systems and methods for generating adversarial driving scenarios for autonomous vehicles. An artificial intelligence model can compute an adversarial loss function by minimizing the distance between predicted adversarially perturbed trajectories and corresponding generated neighbor future trajectories from the input data. A traffic violation loss function can be computed based on observed adversarial agents adhering to driving rules from the input data. A comfort loss function can be computed based on the predicted driving characteristics of adversarial vehicles relevant to the comfort of hypothetical passengers from the input data. A planner module can be trained for autonomous vehicles based on a combined loss function of the adversarial loss function, the traffic violation loss function, and the comfort loss function to generate adversarial driving scenarios. An autonomous vehicle can be controlled based on trajectories generated in the adversarial driving scenarios.
    Type: Application
    Filed: October 3, 2024
    Publication date: April 10, 2025
    Inventors: Francesco Pittaluga, Buyu Liu, Manmohan Chandraker, Kaiyuan Zhang
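    A minimal sketch of how the three loss terms above might be combined; the weights, penalty inputs, and jerk-based comfort proxy are assumptions for illustration.
      import numpy as np

      def combined_loss(adv_traj, neighbor_traj, violation_penalty, jerk,
                        w_adv=1.0, w_rule=0.5, w_comfort=0.1):
          l_adv = np.mean(np.linalg.norm(adv_traj - neighbor_traj, axis=-1))  # stay close to plausible neighbors
          l_rule = np.mean(violation_penalty)       # penalize breaking driving rules
          l_comfort = np.mean(jerk ** 2)            # penalize uncomfortable motion
          return w_adv * l_adv + w_rule * l_rule + w_comfort * l_comfort

      rng = np.random.default_rng(0)
      print(combined_loss(rng.random((20, 2)), rng.random((20, 2)),
                          rng.random(20), rng.random(19)))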
  • Publication number: 20250118009
    Abstract: A computer-implemented method for synthesizing an image includes capturing data from a scene and fusing grid-based representations of the scene from different encodings to inherit beneficial properties of the different encodings. The encodings include a Lidar encoding and a high-definition map encoding. Rays are rendered from the fused grid-based representations. A density and color are determined for points in the rays. Volume rendering is employed for the rays with the density and color. An image is synthesized from the volume-rendered rays with the density and the color.
    Type: Application
    Filed: October 1, 2024
    Publication date: April 10, 2025
    Inventors: Bingbing Zhuang, Ziyu Jiang, Buyu Liu, Manmohan Chandraker, Shanlin Sun
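    The standard volume rendering step referred to above, sketched for one ray with toy densities and colors; the sample count and spacing are illustrative.
      import numpy as np

      def render_ray(density, color, deltas):
          alpha = 1.0 - np.exp(-density * deltas)                  # opacity per sample
          trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1] + 1e-10]))  # transmittance
          weights = trans * alpha                                  # contribution per sample
          return (weights[:, None] * color).sum(axis=0)            # composited RGB

      n = 64
      density = np.linspace(0.0, 3.0, n)          # denser toward the far end of the ray
      color = np.tile([[0.2, 0.5, 0.8]], (n, 1))
      deltas = np.full(n, 0.05)                   # spacing between samples
      print(render_ray(density, color, deltas))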
  • Publication number: 20250115250
    Abstract: Methods and systems for motion detection include performing a first prediction to predict voxel occupancy based on a sequence of input point clouds including a current point cloud and a set of previous point clouds. A second prediction is performed to complete the voxel occupancy for the sequence of input point clouds using predicted voxel occupancy between the input point clouds. Motion detection is performed based on the completed voxel occupancy. An action is performed responsive to a detected motion.
    Type: Application
    Filed: October 1, 2024
    Publication date: April 10, 2025
    Inventors: Bingbing Zhuang, Manmohan Chandraker, Di Liu
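    A minimal sketch of building voxel occupancy from a point cloud, the representation both prediction stages above operate on; the grid extents, resolution, and crude motion cue are illustrative assumptions.
      import numpy as np

      def voxelize(points, grid_min=(-10, -10, -2), voxel_size=0.5, shape=(40, 40, 8)):
          # Map each point to a voxel index and mark that voxel occupied.
          idx = np.floor((points - np.array(grid_min)) / voxel_size).astype(int)
          valid = np.all((idx >= 0) & (idx < np.array(shape)), axis=1)
          occ = np.zeros(shape, dtype=bool)
          occ[tuple(idx[valid].T)] = True
          return occ

      rng = np.random.default_rng(0)
      cloud_t0 = rng.uniform([-10, -10, -2], [10, 10, 2], size=(500, 3))
      cloud_t1 = cloud_t0 + np.array([0.6, 0.0, 0.0])      # scene shifted between frames
      occ0, occ1 = voxelize(cloud_t0), voxelize(cloud_t1)
      moved = occ1 & ~occ0                                  # crude per-voxel motion cue
      print(int(occ0.sum()), int(moved.sum()))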
  • Publication number: 20250117029
    Abstract: Systems and methods for automatic multi-modality sensor calibration with near-infrared (NIR) images. Image keypoints from collected images and NIR keypoints from the NIR images can be detected. A deep-learning-based neural network that learns relation graphs between the image keypoints and the NIR keypoints can match the image keypoints and the NIR keypoints. Three-dimensional (3D) points from 3D point cloud data can be filtered based on corresponding 3D points from the NIR keypoints (NIR-to-3D points) to obtain filtered NIR-to-3D points. An extrinsic calibration can be optimized based on a reprojection error computed from the filtered NIR-to-3D points to obtain an optimized extrinsic calibration for an autonomous entity control system. An entity can be controlled by employing the optimized extrinsic calibration for the autonomous entity control system.
    Type: Application
    Filed: October 3, 2024
    Publication date: April 10, 2025
    Inventors: Tom Bu, Bingbing Zhuang, Manmohan Chandraker
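    The reprojection error above, sketched for a handful of 3D points with hypothetical intrinsics; a real calibration would minimize this error over the extrinsic parameters rather than just evaluate it.
      import numpy as np

      def reprojection_error(points_3d, pixels, R, t, K):
          cam = (R @ points_3d.T).T + t              # sensor frame -> camera frame (extrinsic)
          proj = (K @ cam.T).T
          uv = proj[:, :2] / proj[:, 2:3]            # perspective division
          return np.mean(np.linalg.norm(uv - pixels, axis=1))

      K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
      pts = np.array([[1.0, 0.2, 5.0], [-0.5, 0.1, 8.0], [0.3, -0.4, 6.0]])
      R_true, t_true = np.eye(3), np.array([0.1, 0.0, 0.0])
      pix = (K @ ((R_true @ pts.T).T + t_true).T).T
      pix = pix[:, :2] / pix[:, 2:3]
      print(reprojection_error(pts, pix, np.eye(3), np.zeros(3), K))  # error of a wrong extrinsic guess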
  • Publication number: 20250118010
    Abstract: A computer-implemented method for synthesizing an image includes capturing data from a scene and decomposing the captured scene into static objects, dynamic objects, and sky. Bounding boxes are generated for the dynamic objects and motion is simulated for the dynamic objects as static movement of the bounding boxes. The dynamic objects and the static objects are merged according to the density and color of sample points. The sky is blended into a merged version of the dynamic objects and the static objects, and an image is synthesized from volume-rendered rays.
    Type: Application
    Filed: October 1, 2024
    Publication date: April 10, 2025
    Inventors: Ziyu Jiang, Bingbing Zhuang, Manmohan Chandraker
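    A minimal sketch of a density-weighted merge of static and dynamic branches at shared sample points; the merge rule shown is a common convention and an assumption here, not necessarily the patent's exact formulation.
      import numpy as np

      def merge(density_s, color_s, density_d, color_d):
          # Sum densities; blend colors in proportion to each branch's density.
          density = density_s + density_d
          w_s = density_s / np.maximum(density, 1e-8)
          color = w_s[:, None] * color_s + (1 - w_s)[:, None] * color_d
          return density, color

      rng = np.random.default_rng(0)
      d, c = merge(rng.random(8), rng.random((8, 3)), rng.random(8), rng.random((8, 3)))
      print(d.round(2), c.round(2))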
  • Publication number: 20250118044
    Abstract: Systems and methods for identifying novel objects in an image include detecting one or more objects in an image and generating one or more captions for the image. One or more predicted categories of the one or more objects detected in the image and the one or more captions are matched to identify, from the one or more predicted categories, a category of a novel object in the image. An image feature and a text description feature are generated using a description of the novel object. A relevant image is selected using a similarity score between the image feature and the text description feature. A model is updated using the relevant image and associated description of the novel object.
    Type: Application
    Filed: September 20, 2024
    Publication date: April 10, 2025
    Inventors: Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Manmohan Chandraker, Mingfu Liang
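    A minimal sketch of the similarity-based selection above, using random toy vectors in place of real image and text-description features.
      import numpy as np

      def cosine(a, b):
          return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

      rng = np.random.default_rng(0)
      text_feat = rng.normal(size=128)                       # description of the novel object
      image_feats = {f"img_{i:03d}": rng.normal(size=128) for i in range(5)}
      scores = {k: cosine(v, text_feat) for k, v in image_feats.items()}
      best = max(scores, key=scores.get)
      print(best, round(scores[best], 3))                    # image selected for the model update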
  • Patent number: 12205324
    Abstract: A computer-implemented method for fusing geometrical and Convolutional Neural Network (CNN) relative camera pose is provided. The method includes receiving two images having different camera poses. The method further includes inputting the two images into a geometric solver branch to return, as a first solution, an estimated camera pose and an associated pose uncertainty value determined from a Jacobian of a reprojection error function. The method also includes inputting the two images into a CNN branch to return, as a second solution, a predicted camera pose and an associated pose uncertainty value. The method additionally includes fusing, by a processor device, the first solution and the second solution in a probabilistic manner using Bayes' rule to obtain a fused pose.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: January 21, 2025
    Assignee: NEC Corporation
    Inventors: Bingbing Zhuang, Manmohan Chandraker
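    The Bayes' rule fusion above, sketched for a single pose parameter under a Gaussian assumption: the fused estimate is the precision-weighted mean of the two branches; values are illustrative.
      def fuse(mu1, var1, mu2, var2):
          # Product of two Gaussians: precisions add, means are precision-weighted.
          w1, w2 = 1.0 / var1, 1.0 / var2
          var = 1.0 / (w1 + w2)
          mu = var * (w1 * mu1 + w2 * mu2)
          return mu, var

      # Geometric solver: 10.0 deg with high variance; CNN branch: 12.0 deg, more certain.
      print(fuse(10.0, 4.0, 12.0, 1.0))            # roughly (11.6, 0.8)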
  • Patent number: 12205356
    Abstract: Methods and systems for detecting faults include capturing an image of a scene using a camera. The image is embedded using a segmentation model that includes an image branch having an image embedding layer that embeds images into a joint latent space and a text branch having a text embedding layer that embeds text into the joint latent space. Semantic information is generated for a region of the image corresponding to a predetermined static object using the embedded image. A fault of the camera is identified based on a discrepancy between the semantic information and semantic information of the predetermined static object. The fault of the camera is corrected.
    Type: Grant
    Filed: March 23, 2023
    Date of Patent: January 21, 2025
    Assignee: NEC Corporation
    Inventors: Samuel Schulter, Sparsh Garg, Manmohan Chandraker
  • Publication number: 20240379234
    Abstract: Methods and systems for visual question answering include decomposing an initial question to generate a sub-question. The initial question and an image are applied to a visual question answering model to generate an answer and a confidence score. It is determined that the confidence score is below a threshold value. The sub-question is applied to the visual question answering model, responsive to the determination that the confidence score is below a threshold value, to generate a final answer.
    Type: Application
    Filed: May 9, 2024
    Publication date: November 14, 2024
    Inventors: Vijay Kumar Baikampady Gopalkrishnan, Samuel Schulter, Manmohan Chandraker
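    A minimal sketch of the control flow above, with a hypothetical canned vqa() stand-in and an illustrative confidence threshold.
      THRESHOLD = 0.7

      def vqa(image, question):
          # Stand-in for the visual question answering model: returns (answer, confidence).
          canned = {"Is the light red?": ("yes", 0.55), "What color is the light?": ("red", 0.9)}
          return canned.get(question, ("unknown", 0.1))

      def answer(image, question, sub_question):
          ans, conf = vqa(image, question)
          if conf >= THRESHOLD:
              return ans                             # confident enough: keep the direct answer
          sub_ans, _ = vqa(image, sub_question)      # otherwise fall back to the sub-question
          return sub_ans

      print(answer("frame.png", "Is the light red?", "What color is the light?"))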
  • Publication number: 20240378454
    Abstract: Systems and methods for optimizing models for open-vocabulary detection. Region proposals can be obtained by employing a pre-trained vision-language model and a pre-trained region proposal network. Object feature predictions can be obtained by employing a trained teacher neural network with the region proposals. Object feature predictions can be filtered above a threshold to obtain pseudo labels. A student neural network with a split-and-fusion detection head can be trained by utilizing the region proposals, base ground truth class labels and the pseudo labels. The pseudo labels can be optimized by reducing the noise from the pseudo labels by employing the trained split-and-fusion detection head of the trained student neural network to obtain optimized object detections. An action can be performed relative to a scene layout based on the optimized object detections.
    Type: Application
    Filed: May 9, 2024
    Publication date: November 14, 2024
    Inventors: Samuel Schulter, Yumin Suh, Manmohan Chandraker, Vijay Kumar Baikampady Gopalkrishna
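    A minimal sketch of the pseudo-label filtering step above, with toy teacher predictions and an illustrative threshold.
      # Teacher predictions on region proposals are kept as pseudo labels only above a confidence threshold.
      teacher_preds = [
          {"box": (10, 20, 50, 80), "label": "scooter", "score": 0.91},
          {"box": (5, 5, 30, 40), "label": "hydrant", "score": 0.42},
          {"box": (60, 10, 90, 70), "label": "stroller", "score": 0.77},
      ]
      THRESHOLD = 0.6
      pseudo_labels = [p for p in teacher_preds if p["score"] >= THRESHOLD]
      print([p["label"] for p in pseudo_labels])     # used alongside base ground truth labels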
  • Patent number: 12131557
    Abstract: A computer-implemented method for road layout prediction is provided. The method includes segmenting, by a first processor-based element, an RGB image to output pixel-level semantic segmentation results for the RGB image in a perspective view for both visible and occluded pixels in the perspective view based on contextual clues. The method further includes learning, by a second processor-based element, a mapping from the pixel-level semantic segmentation results for the RGB image in the perspective view to a top view of the RGB image using a road plane assumption. The method also includes generating, by a third processor-based element, an occlusion-aware parametric road layout prediction for road layout related attributes in the top view.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: October 29, 2024
    Assignee: NEC Corporation
    Inventors: Buyu Liu, Bingbing Zhuang, Manmohan Chandraker
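    A minimal sketch of the road plane assumption above as inverse perspective mapping: a pixel is back-projected with hypothetical intrinsics and intersected with the ground plane to get top-view coordinates; the camera height and axis conventions are assumptions.
      import numpy as np

      def pixel_to_top_view(u, v, K, cam_height=1.5):
          d = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing ray direction (camera y points down)
          if d[1] <= 0:
              return None                                # ray does not hit the ground plane
          s = cam_height / d[1]                          # scale so the ray reaches the ground
          p = s * d
          return p[0], p[2]                              # lateral offset, forward distance

      K = np.array([[800.0, 0, 640], [0, 800.0, 360], [0, 0, 1]])
      print(pixel_to_top_view(700, 500, K))              # a pixel below the horizon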