Patents by Inventor Manmohan Chandraker

Manmohan Chandraker has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250145176
    Abstract: Methods and systems for operating a vehicle include prompting a large language model (LLM) to generate parameters for a rule-based planner based on historical data for vehicles in a road scene. A trajectory is generated using the parameters. A driving action is performed to implement the trajectory.
    Type: Application
    Filed: November 1, 2024
    Publication date: May 8, 2025
    Inventors: Manmohan Chandraker, Francesco Pittaluga, Vijay Kumar Baikampady Gopalkrishna, Sharan Satish Prema
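    A minimal Python sketch of the flow in the abstract above, with a hypothetical query_llm stand-in and an illustrative constant-acceleration planner; the parameter names (target_speed, time_headway) are assumptions, not the patent's.
      # Sketch: an LLM proposes parameters for a rule-based planner, which rolls out a trajectory.
      import json

      def query_llm(prompt: str) -> str:
          # Hypothetical stand-in for an LLM call; returns planner parameters as JSON.
          return json.dumps({"target_speed": 12.0, "time_headway": 1.5, "horizon_s": 4.0, "dt": 0.5})

      def rule_based_planner(params: dict, ego_speed: float):
          # Simple constant-acceleration rollout toward the target speed.
          traj, v, x = [], ego_speed, 0.0
          for _ in range(int(params["horizon_s"] / params["dt"])):
              a = max(min((params["target_speed"] - v) / params["time_headway"], 2.0), -3.0)
              v = max(v + a * params["dt"], 0.0)
              x += v * params["dt"]
              traj.append((x, v))
          return traj

      history = "lead vehicle 30 m ahead at 10 m/s; ego at 8 m/s"
      params = json.loads(query_llm(f"Propose planner parameters for: {history}"))
      print(rule_based_planner(params, ego_speed=8.0))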
  • Publication number: 20250148697
    Abstract: Methods and systems include training a model for rendering a three-dimensional volume using a loss function that includes a depth loss term and a distribution loss term that regularize an output of the model to produce realistic scenarios. A simulated scenario is generated based on an original scenario, with the simulated scenario including a different position and pose relative to the original scenario in a three-dimensional (3D) scene that is generated by the model from the original scenario. A self-driving model is trained for an autonomous vehicle using the simulated scenario.
    Type: Application
    Filed: November 4, 2024
    Publication date: May 8, 2025
    Inventors: Ziyu Jiang, Bingbing Zhuang, Manmohan Chandraker
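    A minimal sketch, in Python with NumPy, of a loss of the kind described above: a photometric term plus a depth term and a distribution term. Here the distribution term is a simple entropy penalty on ray sample weights, an assumption rather than the patent's exact regularizer, and all weights are illustrative.
      import numpy as np

      def total_loss(pred_rgb, gt_rgb, pred_depth, lidar_depth, ray_weights,
                     lambda_depth=0.1, lambda_dist=0.01):
          color = np.mean((pred_rgb - gt_rgb) ** 2)             # photometric term
          depth = np.mean((pred_depth - lidar_depth) ** 2)      # depth supervision term
          # Distribution term: encourage each ray's sample weights to be concentrated
          # (low entropy) so geometry stays crisp under novel positions and poses.
          w = ray_weights / (ray_weights.sum(axis=-1, keepdims=True) + 1e-8)
          dist = -np.mean(np.sum(w * np.log(w + 1e-8), axis=-1))
          return color + lambda_depth * depth + lambda_dist * dist

      rng = np.random.default_rng(0)
      print(total_loss(rng.random((4, 3)), rng.random((4, 3)),
                       rng.random(4), rng.random(4), rng.random((4, 16))))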
  • Publication number: 20250148736
    Abstract: A computer-implemented method for synthesizing an image includes extracting agent neural radiance fields (NeRFs) from driving video logs and storing agent NeRFs in a database. For a driving video log to be edited, a scene NeRF and agent NeRFs are extracted from the driving video log to be edited. One or more agent NeRFs are selected from the database to insert into or replace existing agents in a traffic scene of the driving video log based on photorealism criteria. The traffic scene is edited by inserting a selected agent NeRF into the traffic scene, replacing existing agents in the traffic scene with the selected agent NeRF, or removing one or more existing agents from the traffic scene. An image of the edited traffic scene is synthesized by composing edited agent NeRFs with the scene NeRF and performing volume rendering.
    Type: Application
    Filed: October 23, 2024
    Publication date: May 8, 2025
    Inventors: Bingbing Zhuang, Ziyu Jiang, Manmohan Chandraker, Shanlin Sun
  • Publication number: 20250148766
    Abstract: Systems and methods for leveraging semantic information for a multi-domain visual agent. Semantic information can be leveraged to obtain a multi-domain visual agent. To train the multi-domain visual agent, questions can be sampled from question templates for domain-specific label spaces to obtain a unified label space. The domain-specific labels from the domain-specific label spaces can be mapped into natural language descriptions (NLD) to obtain mapped NLD. The mapped NLD can be converted into prompts by combining the questions sampled from the unified label space and the annotations. The semantic information can be learned by iteratively generating outputs from tokens extracted from the prompts using a large-language model (LLM). The multi-domain visual agent (MDVA) can be trained using the semantic information.
    Type: Application
    Filed: November 1, 2024
    Publication date: May 8, 2025
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Masoud Faraki, Yumin Suh, Manmohan Chandraker
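    A minimal sketch of the prompt-construction idea above, with hypothetical question templates and label-to-description mappings; the template text and label names are assumptions for illustration.
      import random

      # Hypothetical question templates and domain-specific label -> natural language description map.
      templates = ["What {thing} is visible in the image?", "Is there a {thing} in the scene?"]
      label_to_nld = {"sedan": "a four-door passenger car", "ped": "a person walking"}

      def build_prompt(domain_label, rng):
          # Combine a sampled question with the mapped description to form a training prompt.
          question = rng.choice(templates).format(thing="object")
          return f"{question} Answer: {label_to_nld[domain_label]}."

      rng = random.Random(0)
      print(build_prompt("sedan", rng))
      print(build_prompt("ped", rng))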
  • Publication number: 20250148757
    Abstract: Systems and methods for a self-improving data engine for autonomous vehicles are presented. To train the self-improving data engine for autonomous vehicles (SIDE), multi-modality dense captioning (MMDC) models can detect unrecognized classes from diversified descriptions for input images. A vision-language model (VLM) can generate textual features from the diversified descriptions and image features from the corresponding images. Curated features, including curated textual features and curated image features, can be obtained by comparing similarity scores between the textual features and top-ranked image features based on their likelihood scores. Annotations, including bounding boxes and labels, can be generated for the curated features by comparing the similarity scores of labels generated by a zero-shot classifier and the curated textual features. The SIDE can be trained using the curated features, annotations, and feedback.
    Type: Application
    Filed: October 30, 2024
    Publication date: May 8, 2025
    Inventors: Jong-Chyi Su, Sparsh Garg, Samuel Schulter, Manmohan Chandraker, Mingfu Liang
  • Publication number: 20250148911
    Abstract: Methods and systems include determining actions for a plurality of agents in a driving scenario using a diffusion model, based on individual controllable behavior patterns for the agents. A state of the driving scenario is updated based on the determined actions for the plurality of agents. The determination of actions and the update of the state are repeated in a closed-loop fashion to generate simulated trajectories for the plurality of agents. A planner model is trained to select actions for an operating agent based on the simulated trajectories.
    Type: Application
    Filed: October 31, 2024
    Publication date: May 8, 2025
    Inventors: Manmohan Chandraker, Francesco Pittaluga, Bingbing Zhuang, Wei-Jer Chang
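    A minimal closed-loop rollout sketch of the loop described above; sample_actions is a hypothetical stand-in for the diffusion sampler, and the kinematic step and behavior parameters are illustrative assumptions.
      import numpy as np

      def sample_actions(state, behaviors, rng):
          # Stand-in for the diffusion sampler: accel/steer per agent, biased by its behavior pattern.
          return rng.normal(loc=behaviors, scale=0.2, size=(state.shape[0], 2))

      def step(state, actions, dt=0.1):
          # state: [x, y, heading, speed] per agent; simple kinematic update.
          x, y, h, v = state.T
          accel, steer = actions.T
          v = np.maximum(v + accel * dt, 0.0)
          h = h + steer * dt
          return np.stack([x + v * np.cos(h) * dt, y + v * np.sin(h) * dt, h, v], axis=1)

      rng = np.random.default_rng(0)
      state = np.array([[0.0, 0.0, 0.0, 10.0], [5.0, 3.5, 0.0, 8.0]])
      behaviors = np.array([[0.5, 0.0], [-0.2, 0.01]])   # per-agent controllable pattern
      trajectory = [state]
      for _ in range(50):                                # closed loop: sample, step, repeat
          state = step(state, sample_actions(state, behaviors, rng))
          trajectory.append(state)
      print(len(trajectory), trajectory[-1].round(2))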
  • Publication number: 20250139527
    Abstract: Systems and methods for a self-improving model for agentic visual program synthesis. An agent can be continuously trained using an optimal training tuple to perform a corrective action on a monitored entity, which in turn generates new input data for the training. To train the agent, an input question can be decomposed into vision model tasks to generate task outputs. The task outputs can be corrected based on feedback to obtain corrected task outputs. The optimal training tuple can be generated by comparing an optimal tuple threshold with a similarity score of the input image, the input question, and the corrected task outputs.
    Type: Application
    Filed: October 29, 2024
    Publication date: May 1, 2025
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Samuel Schulter, Manmohan Chandraker, Zaid Khan
  • Publication number: 20250118063
    Abstract: Systems and methods include detecting one or more objects in an image and generating one or more captions for the image. One or more predicted categories of the one or more objects detected in the image and the one or more captions are matched. From the one or more predicted categories, a category that is not successfully predicted in the image is identified. Data is curated to improve the category that is not successfully predicted in the image. A perception model is finetuned using the curated data.
    Type: Application
    Filed: September 20, 2024
    Publication date: April 10, 2025
    Inventors: Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Manmohan Chandraker, Mingfu Liang
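    A minimal sketch of the matching-and-curation idea above, using toy captions and a hypothetical category vocabulary; the real system matches model predictions rather than substring checks.
      # Compare detector categories against caption text to flag missed categories, then curate data.
      detections = {"car", "traffic light"}
      captions = ["a car stops while a cyclist crosses near a traffic light"]
      vocabulary = {"car", "cyclist", "traffic light", "bus"}

      mentioned = {c for c in vocabulary if any(c in cap for cap in captions)}
      missed = mentioned - detections          # categories captioned but not detected
      print("missed categories:", missed)

      # Curate: keep only images whose captions mention a missed category.
      pool = {"img_001": "a cyclist rides past a bus stop", "img_002": "an empty highway"}
      curated = [k for k, cap in pool.items() if any(c in cap for c in missed)]
      print("curated for finetuning:", curated)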
  • Publication number: 20250115254
    Abstract: Systems and methods for a hybrid motion planner for autonomous vehicles. A multi-lane intelligent driver model (MIDM) can generate trajectory predictions from collected data by considering adjacent lanes of an ego vehicle. A multi-lane hybrid planning driver model (MPDM) can be trained using open-loop ground truth data and closed-loop simulations to obtain a trained MPDM. The trained MPDM can predict planned trajectories from the collected data and the trajectory predictions to generate final trajectories for the autonomous vehicles. The final trajectories can be employed to control the autonomous vehicles.
    Type: Application
    Filed: October 3, 2024
    Publication date: April 10, 2025
    Inventors: Buyu Liu, Francesco Pittaluga, Bingbing Zhuang, Manmohan Chandraker, Samuel Sohn
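    The single-lane intelligent driver model (IDM) that the multi-lane variant above builds on, sketched in Python; the parameter values are illustrative defaults, not taken from the patent.
      import math

      def idm_accel(v, v_lead, gap, v0=15.0, T=1.5, a_max=1.5, b=2.0, s0=2.0, delta=4):
          dv = v - v_lead                                   # closing speed to the lead vehicle
          s_star = s0 + v * T + v * dv / (2 * math.sqrt(a_max * b))  # desired gap
          return a_max * (1 - (v / v0) ** delta - (s_star / max(gap, 0.1)) ** 2)

      # Ego at 12 m/s, lead at 10 m/s, 25 m ahead: the model commands gentle braking.
      print(round(idm_accel(12.0, 10.0, 25.0), 3))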
  • Publication number: 20250115278
    Abstract: Systems and methods for generating adversarial driving scenarios for autonomous vehicles. An artificial intelligence model can compute an adversarial loss function by minimizing the distance between predicted adversarially perturbed trajectories and corresponding generated neighbor future trajectories from the input data. A traffic violation loss function can be computed based on observed adversarial agents adhering to driving rules from the input data. A comfort loss function can be computed based on the predicted driving characteristics of adversarial vehicles relevant to the comfort of hypothetical passengers from the input data. A planner module can be trained for autonomous vehicles based on a combined loss function of the adversarial loss function, the traffic violation loss function, and the comfort loss function to generate adversarial driving scenarios. An autonomous vehicle can be controlled based on trajectories generated in the adversarial driving scenarios.
    Type: Application
    Filed: October 3, 2024
    Publication date: April 10, 2025
    Inventors: Francesco Pittaluga, Buyu Liu, Manmohan Chandraker, Kaiyuan Zhang
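    A minimal sketch of how the three loss terms above might be combined; the weights, penalty inputs, and jerk-based comfort proxy are assumptions for illustration.
      import numpy as np

      def combined_loss(adv_traj, neighbor_traj, violation_penalty, jerk,
                        w_adv=1.0, w_rule=0.5, w_comfort=0.1):
          l_adv = np.mean(np.linalg.norm(adv_traj - neighbor_traj, axis=-1))  # stay close to plausible neighbors
          l_rule = np.mean(violation_penalty)       # penalize breaking driving rules
          l_comfort = np.mean(jerk ** 2)            # penalize uncomfortable motion
          return w_adv * l_adv + w_rule * l_rule + w_comfort * l_comfort

      rng = np.random.default_rng(0)
      print(combined_loss(rng.random((20, 2)), rng.random((20, 2)),
                          rng.random(20), rng.random(19)))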
  • Publication number: 20250118009
    Abstract: A computer-implemented method for synthesizing an image includes capturing data from a scene and fusing grid-based representations of the scene from different encodings to inherit beneficial properties of the different encodings. The encodings include a Lidar encoding and a high-definition map encoding. Rays are rendered from the fused grid-based representations. A density and color are determined for points in the rays. Volume rendering is employed for the rays with the density and color. An image is synthesized from the volume-rendered rays with the density and the color.
    Type: Application
    Filed: October 1, 2024
    Publication date: April 10, 2025
    Inventors: Bingbing Zhuang, Ziyu Jiang, Buyu Liu, Manmohan Chandraker, Shanlin Sun
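    The standard volume rendering step referred to above, sketched for one ray with toy densities and colors; the sample count and spacing are illustrative.
      import numpy as np

      def render_ray(density, color, deltas):
          alpha = 1.0 - np.exp(-density * deltas)                  # opacity per sample
          trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1] + 1e-10]))  # transmittance
          weights = trans * alpha                                  # contribution per sample
          return (weights[:, None] * color).sum(axis=0)            # composited RGB

      n = 64
      density = np.linspace(0.0, 3.0, n)          # denser toward the far end of the ray
      color = np.tile([[0.2, 0.5, 0.8]], (n, 1))
      deltas = np.full(n, 0.05)                   # spacing between samples
      print(render_ray(density, color, deltas))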
  • Publication number: 20250115250
    Abstract: Methods and systems for motion detection include performing a first prediction to predict voxel occupancy based on a sequence of input point clouds including a current point cloud and a set of previous point clouds. A second prediction is performed to complete the voxel occupancy for the sequence of input point clouds using predicted voxel occupancy between the input point clouds. Motion detection is performed based on the completed voxel occupancy. An action is performed responsive to a detected motion.
    Type: Application
    Filed: October 1, 2024
    Publication date: April 10, 2025
    Inventors: Bingbing Zhuang, Manmohan Chandraker, Di Liu
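    A minimal sketch of building voxel occupancy from a point cloud, the representation both prediction stages above operate on; the grid extents, resolution, and crude motion cue are illustrative assumptions.
      import numpy as np

      def voxelize(points, grid_min=(-10, -10, -2), voxel_size=0.5, shape=(40, 40, 8)):
          # Map each point to a voxel index and mark that voxel occupied.
          idx = np.floor((points - np.array(grid_min)) / voxel_size).astype(int)
          valid = np.all((idx >= 0) & (idx < np.array(shape)), axis=1)
          occ = np.zeros(shape, dtype=bool)
          occ[tuple(idx[valid].T)] = True
          return occ

      rng = np.random.default_rng(0)
      cloud_t0 = rng.uniform([-10, -10, -2], [10, 10, 2], size=(500, 3))
      cloud_t1 = cloud_t0 + np.array([0.6, 0.0, 0.0])      # scene shifted between frames
      occ0, occ1 = voxelize(cloud_t0), voxelize(cloud_t1)
      moved = occ1 & ~occ0                                  # crude per-voxel motion cue
      print(int(occ0.sum()), int(moved.sum()))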
  • Publication number: 20250117029
    Abstract: Systems and methods for automatic multi-modality sensor calibration with near-infrared (NIR) images. Image keypoints from collected images and NIR keypoints from the NIR images can be detected. A deep-learning-based neural network that learns relation graphs between the image keypoints and the NIR keypoints can match the image keypoints and the NIR keypoints. Three-dimensional (3D) points from 3D point cloud data can be filtered based on corresponding 3D points from the NIR keypoints (NIR-to-3D points) to obtain filtered NIR-to-3D points. An extrinsic calibration can be optimized based on a reprojection error computed from the filtered NIR-to-3D points to obtain an optimized extrinsic calibration for an autonomous entity control system. An entity can be controlled by employing the optimized extrinsic calibration for the autonomous entity control system.
    Type: Application
    Filed: October 3, 2024
    Publication date: April 10, 2025
    Inventors: Tom Bu, Bingbing Zhuang, Manmohan Chandraker
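    The reprojection error above, sketched for a handful of 3D points with hypothetical intrinsics; a real calibration would minimize this error over the extrinsic parameters rather than just evaluate it.
      import numpy as np

      def reprojection_error(points_3d, pixels, R, t, K):
          cam = (R @ points_3d.T).T + t              # sensor frame -> camera frame (extrinsic)
          proj = (K @ cam.T).T
          uv = proj[:, :2] / proj[:, 2:3]            # perspective division
          return np.mean(np.linalg.norm(uv - pixels, axis=1))

      K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
      pts = np.array([[1.0, 0.2, 5.0], [-0.5, 0.1, 8.0], [0.3, -0.4, 6.0]])
      R_true, t_true = np.eye(3), np.array([0.1, 0.0, 0.0])
      pix = (K @ ((R_true @ pts.T).T + t_true).T).T
      pix = pix[:, :2] / pix[:, 2:3]
      print(reprojection_error(pts, pix, np.eye(3), np.zeros(3), K))  # error of a wrong extrinsic guess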
  • Publication number: 20250118010
    Abstract: A computer-implemented method for synthesizing an image includes capturing data from a scene and decomposing the captured scene into static objects, dynamic objects, and sky. Bounding boxes are generated for the dynamic objects and motion is simulated for the dynamic objects as static movement of the bounding boxes. The dynamic objects and the static objects are merged according to the density and color of sample points. The sky is blended into a merged version of the dynamic objects and the static objects, and an image is synthesized from volume-rendered rays.
    Type: Application
    Filed: October 1, 2024
    Publication date: April 10, 2025
    Inventors: Ziyu Jiang, Bingbing Zhuang, Manmohan Chandraker
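    A minimal sketch of a density-weighted merge of static and dynamic branches at shared sample points; the merge rule shown is a common convention and an assumption here, not necessarily the patent's exact formulation.
      import numpy as np

      def merge(density_s, color_s, density_d, color_d):
          # Sum densities; blend colors in proportion to each branch's density.
          density = density_s + density_d
          w_s = density_s / np.maximum(density, 1e-8)
          color = w_s[:, None] * color_s + (1 - w_s)[:, None] * color_d
          return density, color

      rng = np.random.default_rng(0)
      d, c = merge(rng.random(8), rng.random((8, 3)), rng.random(8), rng.random((8, 3)))
      print(d.round(2), c.round(2))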
  • Publication number: 20250118044
    Abstract: Systems and methods for identifying novel objects in an image include detecting one or more objects in an image and generating one or more captions for the image. One or more predicted categories of the one or more objects detected in the image and the one or more captions are matched to identify, from the one or more predicted categories, a category of a novel object in the image. An image feature and a text description feature are generated using a description of the novel object. A relevant image is selected using a similarity score between the image feature and the text description feature. A model is updated using the relevant image and associated description of the novel object.
    Type: Application
    Filed: September 20, 2024
    Publication date: April 10, 2025
    Inventors: Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Manmohan Chandraker, Mingfu Liang
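    A minimal sketch of the similarity-based selection above, using random toy vectors in place of real image and text-description features.
      import numpy as np

      def cosine(a, b):
          return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

      rng = np.random.default_rng(0)
      text_feat = rng.normal(size=128)                       # description of the novel object
      image_feats = {f"img_{i:03d}": rng.normal(size=128) for i in range(5)}
      scores = {k: cosine(v, text_feat) for k, v in image_feats.items()}
      best = max(scores, key=scores.get)
      print(best, round(scores[best], 3))                    # image selected for the model update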
  • Patent number: 12205324
    Abstract: A computer-implemented method for fusing geometrical and Convolutional Neural Network (CNN) relative camera pose is provided. The method includes receiving two images having different camera poses. The method further includes inputting the two images into a geometric solver branch to return, as a first solution, an estimated camera pose and an associated pose uncertainty value determined from a Jacobian of a reprojection error function. The method also includes inputting the two images into a CNN branch to return, as a second solution, a predicted camera pose and an associated pose uncertainty value. The method additionally includes fusing, by a processor device, the first solution and the second solution in a probabilistic manner using Bayes' rule to obtain a fused pose.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: January 21, 2025
    Assignee: NEC Corporation
    Inventors: Bingbing Zhuang, Manmohan Chandraker
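    The Bayes' rule fusion above, sketched for a single pose parameter under a Gaussian assumption: the fused estimate is the precision-weighted mean of the two branches; values are illustrative.
      def fuse(mu1, var1, mu2, var2):
          # Product of two Gaussians: precisions add, means are precision-weighted.
          w1, w2 = 1.0 / var1, 1.0 / var2
          var = 1.0 / (w1 + w2)
          mu = var * (w1 * mu1 + w2 * mu2)
          return mu, var

      # Geometric solver: 10.0 deg with high variance; CNN branch: 12.0 deg, more certain.
      print(fuse(10.0, 4.0, 12.0, 1.0))            # roughly (11.6, 0.8)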
  • Patent number: 12205356
    Abstract: Methods and systems for detecting faults include capturing an image of a scene using a camera. The image is embedded using a segmentation model that includes an image branch having an image embedding layer that embeds images into a joint latent space and a text branch having a text embedding layer that embeds text into the joint latent space. Semantic information is generated for a region of the image corresponding to a predetermined static object using the embedded image. A fault of the camera is identified based on a discrepancy between the semantic information and semantic information of the predetermined static object. The fault of the camera is corrected.
    Type: Grant
    Filed: March 23, 2023
    Date of Patent: January 21, 2025
    Assignee: NEC Corporation
    Inventors: Samuel Schulter, Sparsh Garg, Manmohan Chandraker
  • Publication number: 20240379234
    Abstract: Methods and systems for visual question answering include decomposing an initial question to generate a sub-question. The initial question and an image are applied to a visual question answering model to generate an answer and a confidence score. It is determined that the confidence score is below a threshold value. The sub-question is applied to the visual question answering model, responsive to the determination that the confidence score is below a threshold value, to generate a final answer.
    Type: Application
    Filed: May 9, 2024
    Publication date: November 14, 2024
    Inventors: Vijay Kumar Baikampady Gopalkrishnan, Samuel Schulter, Manmohan Chandraker
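    A minimal sketch of the control flow above, with a hypothetical canned vqa() stand-in and an illustrative confidence threshold.
      THRESHOLD = 0.7

      def vqa(image, question):
          # Stand-in for the visual question answering model: returns (answer, confidence).
          canned = {"Is the light red?": ("yes", 0.55), "What color is the light?": ("red", 0.9)}
          return canned.get(question, ("unknown", 0.1))

      def answer(image, question, sub_question):
          ans, conf = vqa(image, question)
          if conf >= THRESHOLD:
              return ans                             # confident enough: keep the direct answer
          sub_ans, _ = vqa(image, sub_question)      # otherwise fall back to the sub-question
          return sub_ans

      print(answer("frame.png", "Is the light red?", "What color is the light?"))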
  • Publication number: 20240378454
    Abstract: Systems and methods for optimizing models for open-vocabulary detection. Region proposals can be obtained by employing a pre-trained vision-language model and a pre-trained region proposal network. Object feature predictions can be obtained by employing a trained teacher neural network with the region proposals. Object feature predictions can be filtered above a threshold to obtain pseudo labels. A student neural network with a split-and-fusion detection head can be trained by utilizing the region proposals, base ground truth class labels and the pseudo labels. The pseudo labels can be optimized by reducing the noise from the pseudo labels by employing the trained split-and-fusion detection head of the trained student neural network to obtain optimized object detections. An action can be performed relative to a scene layout based on the optimized object detections.
    Type: Application
    Filed: May 9, 2024
    Publication date: November 14, 2024
    Inventors: Samuel Schulter, Yumin Suh, Manmohan Chandraker, Vijay Kumar Baikampady Gopalkrishna
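    A minimal sketch of the pseudo-label filtering step above, with toy teacher predictions and an illustrative threshold.
      # Teacher predictions on region proposals are kept as pseudo labels only above a confidence threshold.
      teacher_preds = [
          {"box": (10, 20, 50, 80), "label": "scooter", "score": 0.91},
          {"box": (5, 5, 30, 40), "label": "hydrant", "score": 0.42},
          {"box": (60, 10, 90, 70), "label": "stroller", "score": 0.77},
      ]
      THRESHOLD = 0.6
      pseudo_labels = [p for p in teacher_preds if p["score"] >= THRESHOLD]
      print([p["label"] for p in pseudo_labels])     # used alongside base ground truth labels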
  • Patent number: 12131557
    Abstract: A computer-implemented method for road layout prediction is provided. The method includes segmenting, by a first processor-based element, an RGB image to output pixel-level semantic segmentation results for the RGB image in a perspective view for both visible and occluded pixels in the perspective view based on contextual clues. The method further includes learning, by a second processor-based element, a mapping from the pixel-level semantic segmentation results for the RGB image in the perspective view to a top view of the RGB image using a road plane assumption. The method also includes generating, by a third processor-based element, an occlusion-aware parametric road layout prediction for road layout related attributes in the top view.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: October 29, 2024
    Assignee: NEC Corporation
    Inventors: Buyu Liu, Bingbing Zhuang, Manmohan Chandraker
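    A minimal sketch of the road plane assumption above as inverse perspective mapping: a pixel is back-projected with hypothetical intrinsics and intersected with the ground plane to get top-view coordinates; the camera height and axis conventions are assumptions.
      import numpy as np

      def pixel_to_top_view(u, v, K, cam_height=1.5):
          d = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing ray direction (camera y points down)
          if d[1] <= 0:
              return None                                # ray does not hit the ground plane
          s = cam_height / d[1]                          # scale so the ray reaches the ground
          p = s * d
          return p[0], p[2]                              # lateral offset, forward distance

      K = np.array([[800.0, 0, 640], [0, 800.0, 360], [0, 0, 1]])
      print(pixel_to_top_view(700, 500, K))              # a pixel below the horizon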