Patents by Inventor Manmohan Chandraker

Manmohan Chandraker has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240379234
    Abstract: Methods and systems for visual question answering include decomposing an initial question to generate a sub-question. The initial question and an image are applied to a visual question answering model to generate an answer and a confidence score. It is determined that the confidence score is below a threshold value. The sub-question is applied to the visual question answering model, responsive to the determination that the confidence score is below the threshold value, to generate a final answer.
    Type: Application
    Filed: May 9, 2024
    Publication date: November 14, 2024
    Inventors: Vijay Kumar Baikampady Gopalkrishnan, Samuel Schulter, Manmohan Chandraker
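A minimal Python sketch of the confidence-gated flow described in publication 20240379234 above, assuming hypothetical `vqa_model` and `decompose` callables for the visual question answering model and the question decomposer; how the sub-answers feed the final answer is one plausible reading, not the filed method.

```python
def answer_with_decomposition(vqa_model, decompose, image, question, threshold=0.5):
    """Answer a question about an image, falling back to sub-questions
    when the model's confidence is below the threshold."""
    answer, confidence = vqa_model(image, question)
    if confidence >= threshold:
        return answer
    # Confidence too low: answer the sub-questions first, then pose the
    # original question again with the sub-answers as extra context.
    context = []
    for sub_question in decompose(question):
        sub_answer, _ = vqa_model(image, sub_question)
        context.append(f"{sub_question} {sub_answer}")
    final_answer, _ = vqa_model(image, " ".join(context + [question]))
    return final_answer
```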
  • Publication number: 20240378454
    Abstract: Systems and methods for optimizing models for open-vocabulary detection. Region proposals can be obtained by employing a pre-trained vision-language model and a pre-trained region proposal network. Object feature predictions can be obtained by employing a trained teacher neural network with the region proposals. Object feature predictions can be filtered above a threshold to obtain pseudo labels. A student neural network with a split-and-fusion detection head can be trained by utilizing the region proposals, base ground truth class labels and the pseudo labels. The pseudo labels can be optimized by employing the trained split-and-fusion detection head of the trained student neural network to reduce their noise and obtain optimized object detections. An action can be performed relative to a scene layout based on the optimized object detections.
    Type: Application
    Filed: May 9, 2024
    Publication date: November 14, 2024
    Inventors: Samuel Schulter, Yumin Suh, Manmohan Chandraker, Vijay Kumar Baikampady Gopalkrishna
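A short numpy sketch of the pseudo-label filtering step from publication 20240378454 above; the score threshold, array shapes, and function names are illustrative assumptions, and the split-and-fusion head itself is not modeled here.

```python
import numpy as np

def make_pseudo_labels(teacher_scores, proposals, score_threshold=0.8):
    """Keep teacher predictions whose confidence exceeds the threshold and use
    them as pseudo labels for the student (scores: (N, C), proposals: (N, 4))."""
    keep = teacher_scores.max(axis=1) > score_threshold
    return proposals[keep], teacher_scores[keep].argmax(axis=1)

def build_student_targets(base_gt_boxes, base_gt_classes, pseudo_boxes, pseudo_classes):
    """Student training targets: base-class ground truth plus filtered pseudo labels."""
    boxes = np.concatenate([base_gt_boxes, pseudo_boxes], axis=0)
    classes = np.concatenate([base_gt_classes, pseudo_classes], axis=0)
    return boxes, classes
```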
  • Patent number: 12131557
    Abstract: A computer-implemented method for road layout prediction is provided. The method includes segmenting, by a first processor-based element, an RGB image to output pixel-level semantic segmentation results for the RGB image in a perspective view for both visible and occluded pixels in the perspective view based on contextual clues. The method further includes learning, by a second processor-based element, a mapping from the pixel-level semantic segmentation results for the RGB image in the perspective view to a top view of the RGB image using a road plane assumption. The method also includes generating, by a third processor-based element, an occlusion-aware parametric road layout prediction for road layout related attributes in the top view.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: October 29, 2024
    Assignee: NEC Corporation
    Inventors: Buyu Liu, Bingbing Zhuang, Manmohan Chandraker
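A rough sketch of the road-plane assumption behind patent 12131557 above: if the road is planar, a single homography (here a hypothetical, pre-computed `H`) maps perspective-view segmentation labels onto a top view. The patent instead learns this mapping, so this is only an illustration of the geometric idea.

```python
import numpy as np

def perspective_to_top_view(seg_mask, H, out_size=(256, 256)):
    """Warp per-pixel road-layout labels from the perspective view into a top
    view under a planar-road assumption (H maps image pixels to ground cells)."""
    top = np.zeros(out_size, dtype=seg_mask.dtype)
    ys, xs = np.nonzero(seg_mask >= 0)                    # every labelled pixel
    pts = np.stack([xs, ys, np.ones_like(xs)]).astype(np.float64)
    ground = H @ pts
    ground /= ground[2:3]                                 # dehomogenize
    gx = np.clip(ground[0].astype(int), 0, out_size[1] - 1)
    gy = np.clip(ground[1].astype(int), 0, out_size[0] - 1)
    top[gy, gx] = seg_mask[ys, xs]
    return top
```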
  • Publication number: 20240354583
    Abstract: Methods and systems for training a model include annotating a subset of an unlabeled training dataset, which includes images of road scenes, with labels. A road defect detection model is iteratively trained, including adding pseudo-labels to a remainder of examples from the unlabeled training dataset and training the road defect detection model based on the labels and the pseudo-labels.
    Type: Application
    Filed: March 25, 2024
    Publication date: October 24, 2024
    Inventors: Sparsh Garg, Samuel Schulter, Bingbing Zhuang, Manmohan Chandraker
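The iterative pseudo-labeling described in publication 20240354583 above follows the usual self-training pattern; the sketch below assumes a hypothetical model object with `fit`/`predict` methods and a confidence cutoff, neither of which is specified in the abstract.

```python
def self_train(model, labeled_set, unlabeled_set, rounds=3, confidence=0.9):
    """Train on the labeled subset, pseudo-label the remaining images,
    and retrain on the union, repeating for a fixed number of rounds."""
    train_set = list(labeled_set)
    for _ in range(rounds):
        model.fit(train_set)
        pseudo = []
        for image in unlabeled_set:
            label, score = model.predict(image)
            if score >= confidence:                # keep only confident pseudo-labels
                pseudo.append((image, label))
        train_set = list(labeled_set) + pseudo
    return model
```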
  • Publication number: 20240354921
    Abstract: Systems and methods for road defect level prediction. A depth map is obtained from an image dataset received from input peripherals by employing a vision transformer model. A plurality of semantic maps is obtained from the image dataset by employing a semantic segmentation model to give pixel-wise segmentation results of road scenes to detect road pixels. Regions of interest (ROI) are detected by utilizing the road pixels. Road defect levels are predicted by fitting the ROI and the depth map into a road surface model to generate road points classified into road defect levels. The predicted road defect levels are visualized on a road map.
    Type: Application
    Filed: March 26, 2024
    Publication date: October 24, 2024
    Inventors: Sparsh Garg, Bingbing Zhuang, Samuel Schulter, Manmohan Chandraker
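A numpy sketch of one way to realize the road-surface fit in publication 20240354921 above: back-project the detected road pixels with the depth map, fit a plane, and bin the deviations into defect levels. The plane model, thresholds, and intrinsics handling are assumptions, not the filed road surface model.

```python
import numpy as np

def defect_levels(depth, road_mask, K_inv, thresholds=(0.02, 0.05)):
    """Classify road points into defect levels by their deviation from a
    least-squares plane fit (units follow the depth map's scale)."""
    v, u = np.nonzero(road_mask)                          # road pixel coordinates
    rays = K_inv @ np.stack([u, v, np.ones_like(u)]).astype(np.float64)
    points = (rays * depth[v, u]).T                       # (N, 3) road points
    A = np.c_[points[:, :2], np.ones(len(points))]
    coeff, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    residual = np.abs(points[:, 2] - A @ coeff)           # deviation from the plane
    return np.digitize(residual, thresholds)              # 0 = smooth, higher = worse
```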
  • Publication number: 20240354336
    Abstract: Systems and methods are provided for identifying and retrieving semantically similar images from a database. Semantic analysis is performed on an input query utilizing a vision language model to identify semantic concepts associated with the input query. A preliminary set of images is retrieved from the database for the identified semantic concepts. Relevant concepts are extracted for the images with a tokenizer by comparing the images against a predefined label space. A ranked list of relevant concepts is generated based on occurrence frequency within the set. The preliminary set of images is refined when the user selects specific relevant concepts from the ranked list, combining the input query with the selected concepts. Additional semantic analysis is iteratively performed to retrieve additional sets of images semantically similar to the combined input query and selected concepts until a threshold condition is met.
    Type: Application
    Filed: April 18, 2024
    Publication date: October 24, 2024
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Samuel Schulter, Manmohan Chandraker, Xiang Yu
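A compact sketch of the retrieve-rank-refine loop in publication 20240354336 above; `search`, `extract_concepts`, and `ask_user` are hypothetical stand-ins for the vision-language retrieval backend, the tokenizer-based concept extractor, and the user's selection.

```python
from collections import Counter

def refine_retrieval(search, extract_concepts, ask_user, query, max_rounds=3, top_k=50):
    """Retrieve images, rank the concepts occurring in the results by frequency,
    and fold the user's chosen concept back into the query until done."""
    images = search(query, top_k)
    for _ in range(max_rounds):
        counts = Counter(c for img in images for c in extract_concepts(img))
        ranked = [concept for concept, _ in counts.most_common()]
        choice = ask_user(ranked)
        if choice is None:                      # user is satisfied: stop refining
            break
        query = f"{query}, {choice}"            # combine the query with the chosen concept
        images = search(query, top_k)
    return images
```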
  • Publication number: 20240354642
    Abstract: Methods and systems for fine-tuning a model include generating a label space for a target domain. Text pseudo-labels are generated for images in an unlabeled dataset from the target domain based on the label space using a pre-trained vision language model. The pre-trained vision language model is fine-tuned for the target domain using the images with the text pseudo-labels.
    Type: Application
    Filed: March 26, 2024
    Publication date: October 24, 2024
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Manmohan Chandraker
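A sketch of CLIP-style zero-shot pseudo-labeling consistent with publication 20240354642 above; `encode_image`/`encode_text`, the prompt template, and the use of cosine similarity are assumptions about the pre-trained vision-language model rather than details from the filing.

```python
import numpy as np

def text_pseudo_labels(encode_image, encode_text, images, label_space):
    """Assign each unlabeled image the label text whose embedding is closest
    to the image embedding; the resulting pairs drive the fine-tuning step."""
    text_emb = np.stack([encode_text(f"a photo of a {name}") for name in label_space])
    text_emb /= np.linalg.norm(text_emb, axis=1, keepdims=True)
    labels = []
    for image in images:
        img_emb = encode_image(image)
        img_emb /= np.linalg.norm(img_emb)
        labels.append(label_space[int(np.argmax(text_emb @ img_emb))])
    return labels
```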
  • Publication number: 20240355090
    Abstract: Systems and methods are provided for matching one or more images using conditional similarity pseudo-labels, including analyzing an unlabeled dataset of images, accessing a foundational vision-language model trained on a plurality of image-text pairs, and defining a set of attributes each comprising multiple possible values for generating pseudo-labels based on notions of similarity (NoS). Text prompts are generated for each attribute value using a prompt template and encoding the text prompts using a text encoder of the foundational model. Each image in the dataset of images is processed through a vision encoder of the foundational model to obtain visual features, the visual features are compared against encoded text prompts to assign a pseudo-label for each attribute for each image, and a conditional similarity network (CSN) is trained with the pseudo-labeled images to generate a conditional similarity model.
    Type: Application
    Filed: April 18, 2024
    Publication date: October 24, 2024
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Samuel Schulter, Xiang Yu, Manmohan Chandraker
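For publication 20240355090 above, the per-attribute pseudo-labeling can be pictured as below: one prompt per attribute value, with the best-matching value kept as that attribute's label. The encoder interfaces and prompt template are hypothetical, and the conditional similarity network training itself is omitted.

```python
import numpy as np

def attribute_pseudo_labels(encode_image, encode_text, image, attributes,
                            template="a photo of a {} object"):
    """attributes maps each attribute name (e.g. 'color') to its possible values;
    the returned dict holds one pseudo-label per attribute for this image."""
    img = encode_image(image)
    img = img / np.linalg.norm(img)
    labels = {}
    for name, values in attributes.items():
        prompts = np.stack([encode_text(template.format(v)) for v in values])
        prompts /= np.linalg.norm(prompts, axis=1, keepdims=True)
        labels[name] = values[int(np.argmax(prompts @ img))]
    return labels
```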
  • Patent number: 12080100
    Abstract: A method for employing facial information in unsupervised person re-identification is presented. The method includes extracting, by a body feature extractor, body features from a first data stream, extracting, by a head feature extractor, head features from a second data stream, outputting a body descriptor vector from the body feature extractor, outputting a head descriptor vector from the head feature extractor, and concatenating the body descriptor vector and the head descriptor vector to enable a model to generate a descriptor vector.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: September 3, 2024
    Assignee: NEC Corporation
    Inventors: Yumin Suh, Xiang Yu, Yi-Hsuan Tsai, Masoud Faraki, Manmohan Chandraker
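The descriptor construction in patent 12080100 above reduces to a feature concatenation; a tiny PyTorch sketch, with the extractor modules and crop tensors assumed to exist:

```python
import torch

def person_descriptor(body_extractor, head_extractor, body_crop, head_crop):
    """Concatenate body and head descriptor vectors into one re-identification
    descriptor (inputs are batched image tensors, outputs are (B, D) features)."""
    with torch.no_grad():
        body_vec = body_extractor(body_crop)    # (B, D_body)
        head_vec = head_extractor(head_crop)    # (B, D_head)
    return torch.cat([body_vec, head_vec], dim=1)
```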
  • Patent number: 12045992
    Abstract: Methods and systems for training a model include combining data from multiple datasets, the datasets having different respective label spaces. Relationships between labels in the different label spaces are identified. A unified neural network model is trained, using the combined data and the identified relationships to generate a unified model, with a class relational binary cross-entropy loss.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: July 23, 2024
    Assignee: NEC Corporation
    Inventors: Yi-Hsuan Tsai, Masoud Faraki, Yumin Suh, Sparsh Garg, Manmohan Chandraker, Dongwan Kim
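One hedged reading of the class relational binary cross-entropy in patent 12045992 above, sketched in PyTorch: treat training as multi-label BCE over the unified label space, but mask out classes known to be related to the ground-truth class so they are not pushed toward zero. The masking rule and the `related` mapping are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def class_relational_bce(logits, target, related):
    """logits: (B, C) over the unified label space; target: (B,) class indices;
    related: dict mapping a class index to indices of related classes."""
    num_classes = logits.shape[1]
    positives = F.one_hot(target, num_classes).float()
    mask = torch.ones_like(positives)
    for i, t in enumerate(target.tolist()):
        mask[i, related.get(t, [])] = 0.0       # do not penalize related classes
        mask[i, t] = 1.0                        # always supervise the true class
    loss = F.binary_cross_entropy_with_logits(logits, positives, reduction="none")
    return (loss * mask).sum() / mask.sum()
```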
  • Publication number: 20240233314
    Abstract: A system for rich human analysis includes a memory and one or more processors in communication with the memory configured to extract images from cameras in a surveillance system and feed the images to a person detection and tracking system that deciphers human activity tasks. Attributes of persons detected and tracked by the person detection and tracking system are estimated by a rich human analysis system to identify attributes in accordance with set criteria, using a set of filters from the deeper convolutional layers of a feature extractor, where the filters are divided into N groups trained on N corresponding tasks with task-specific heads, such that one task is assigned to each group and each task loss updates only one subset of filters. One or more people that satisfy the attributes and the set criteria are identified.
    Type: Application
    Filed: March 21, 2024
    Publication date: July 11, 2024
    Inventors: Yumin Suh, Weijian Deng, Xiang Yu, Masoud Faraki, Manmohan Chandraker, Turgun Kashgari
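A simplified PyTorch reading of the grouped-filter idea in publication 20240233314 above: give each attribute task its own group of deeper filters feeding its own head, so a task's loss only reaches that group (the shared backbone and the exact grouping scheme are not modeled here).

```python
import torch
import torch.nn as nn

class GroupedTaskHeads(nn.Module):
    """N filter groups on top of shared features, one group and one head per task."""
    def __init__(self, in_channels, group_channels, num_classes_per_task):
        super().__init__()
        self.group_convs = nn.ModuleList([
            nn.Conv2d(in_channels, group_channels, kernel_size=3, padding=1)
            for _ in num_classes_per_task
        ])
        self.heads = nn.ModuleList([
            nn.Linear(group_channels, n) for n in num_classes_per_task
        ])

    def forward(self, shared_features):
        outputs = []
        for conv, head in zip(self.group_convs, self.heads):
            x = conv(shared_features).mean(dim=(2, 3))   # global average pooling
            outputs.append(head(x))
        return outputs                                    # one prediction per task
```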
  • Patent number: 12005892
    Abstract: A method for simultaneous multi-agent recurrent trajectory prediction is presented. The method includes reconstructing a topological layout of a scene from a dataset including real-world data, generating a road graph of the scene, the road graph capturing a hierarchical structure of interconnected lanes, incorporating vehicles from the scene on the generated road graph by utilizing tracklet information available in the dataset, assigning the vehicles to their closest lane identifications, and identifying diverse plausible behaviors for every vehicle in the scene. The method further includes sampling one behavior from the diverse plausible behaviors to select an associated velocity profile sampled from the real-world data of the dataset that resembles the sampled one behavior and feeding the road graph and the sampled velocity profile with a desired destination to a dynamics simulator to generate a plurality of simulated diverse trajectories output on a visualization device.
    Type: Grant
    Filed: November 5, 2020
    Date of Patent: June 11, 2024
    Assignee: NEC Corporation
    Inventors: Sriram Nochur Narayanan, Manmohan Chandraker
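A toy stand-in for the dynamics simulation step in patent 12005892 above: roll a vehicle along a lane centerline using a sampled velocity profile. The arc-length interpolation below is an illustrative simplification of the actual simulator.

```python
import numpy as np

def simulate_trajectory(lane_centerline, velocity_profile, dt=0.1):
    """lane_centerline: (M, 2) polyline; velocity_profile: speeds per time step."""
    seg_len = np.linalg.norm(np.diff(lane_centerline, axis=0), axis=1)
    cum_len = np.concatenate([[0.0], np.cumsum(seg_len)])
    distance, trajectory = 0.0, [lane_centerline[0]]
    for speed in velocity_profile:
        distance = min(distance + speed * dt, cum_len[-1])
        x = np.interp(distance, cum_len, lane_centerline[:, 0])
        y = np.interp(distance, cum_len, lane_centerline[:, 1])
        trajectory.append(np.array([x, y]))               # position at this arc length
    return np.stack(trajectory)
```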
  • Patent number: 11987236
    Abstract: A method provided for 3D object localization predicts pairs of 2D bounding boxes. Each pair corresponds to a detected object in each of the two consecutive input monocular images. The method generates, for each detected object, a relative motion estimation specifying a relative motion between the two images. The method constructs an object cost volume by aggregating temporal features from the two images using the pairs of 2D bounding boxes and the relative motion estimation to predict a range of object depth candidates and a confidence score for each object depth candidate and an object depth from the object depth candidates. The method updates the relative motion estimation based on the object cost volume and the object depth to provide a refined object motion and a refined object depth. The method reconstructs a 3D bounding box for each detected object based on the refined object motion and refined object depth.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: May 21, 2024
    Assignee: NEC Corporation
    Inventors: Pan Ji, Buyu Liu, Bingbing Zhuang, Manmohan Chandraker, Xiangyu Chen
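A common way to read off depth and confidence from a cost volume, as in patent 11987236 above, is a soft-argmin over the depth candidates; the sketch below shows that pattern only, not the construction of the object cost volume itself.

```python
import torch

def depth_from_cost_volume(cost_volume, depth_candidates):
    """cost_volume: (num_objects, num_candidates) matching costs;
    depth_candidates: (num_candidates,) candidate depths."""
    probs = torch.softmax(-cost_volume, dim=-1)          # low cost -> high probability
    depth = (probs * depth_candidates).sum(dim=-1)       # expected (soft-argmin) depth
    confidence = probs.max(dim=-1).values                # peakiness as a confidence score
    return depth, confidence
```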
  • Publication number: 20240152767
    Abstract: Systems and methods for training a visual question answer model include training a teacher model by performing image conditional visual question generation on a visual language model (VLM) and a targeted visual question answer dataset using images to generate question and answer pairs. Unlabeled images are pseudolabeled using the teacher model to decode synthetic question and answer pairs for the unlabeled images. The synthetic question and answer pairs for the unlabeled images are merged with real data from the targeted visual question answer dataset to generate a self-augmented training set. A student model is trained using the VLM and the self-augmented training set to return visual answers to text queries.
    Type: Application
    Filed: October 30, 2023
    Publication date: May 9, 2024
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Samuel Schulter, Xiang Yu, Zaid Khan, Manmohan Chandraker
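The self-augmentation step in publication 20240152767 above can be pictured as below; the `teacher.generate_qa` interface and the score filter are hypothetical, since the abstract does not state how generations are vetted.

```python
def self_augment(teacher, unlabeled_images, real_qa_pairs, min_score=0.7):
    """Decode synthetic question/answer pairs for unlabeled images with the
    teacher and merge them with real data into a self-augmented training set."""
    synthetic = []
    for image in unlabeled_images:
        question, answer, score = teacher.generate_qa(image)
        if score >= min_score:                            # keep confident generations only
            synthetic.append((image, question, answer))
    return list(real_qa_pairs) + synthetic                # training data for the student
```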
  • Patent number: 11977602
    Abstract: A method for training a model for face recognition is provided. The method forward trains a training batch of samples to form a face recognition model w(t), and calculates sample weights for the batch. The method obtains a training batch gradient with respect to model weights thereof and updates, using the gradient, the model w(t) to a face recognition model ŵ(t). The method forwards a validation batch of samples to the face recognition model ŵ(t). The method obtains a validation batch gradient, and updates, using the validation batch gradient and ŵ(t), a sample-level importance weight of samples in the training batch to obtain an updated sample-level importance weight. The method obtains a training batch upgraded gradient based on the updated sample-level importance weight of the training batch samples, and updates, using the upgraded gradient, the model w(t) to a trained model w(t+1) corresponding to a next iteration.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: May 7, 2024
    Assignee: NEC Corporation
    Inventors: Xiang Yu, Yi-Hsuan Tsai, Masoud Faraki, Ramin Moslemi, Manmohan Chandraker, Chang Liu
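The meta-learned sample weights in patent 11977602 above come from a bilevel update; the PyTorch sketch below uses a simpler first-order stand-in (weighting each training sample by how well its gradient aligns with the validation-batch gradient) to convey the idea, and is not the filed procedure.

```python
import torch

def reweight_by_validation(model, loss_fn, train_batch, val_batch):
    """Return normalized per-sample importance weights for the training batch."""
    params = [p for p in model.parameters() if p.requires_grad]
    val_x, val_y = val_batch
    val_grad = torch.autograd.grad(loss_fn(model(val_x), val_y), params)

    weights = []
    train_x, train_y = train_batch
    for x, y in zip(train_x, train_y):
        g = torch.autograd.grad(loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)), params)
        align = sum((gi * vi).sum() for gi, vi in zip(g, val_grad))
        weights.append(torch.clamp(align, min=0.0))       # ignore harmful samples
    weights = torch.stack(weights)
    return weights / (weights.sum() + 1e-8)
```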
  • Patent number: 11947626
    Abstract: A method for improving face recognition from unseen domains by learning semantically meaningful representations is presented. The method includes obtaining face images with associated identities from a plurality of datasets, randomly selecting two datasets of the plurality of datasets to train a model, sampling batch face images and their corresponding labels, sampling triplet samples including one anchor face image, a sample face image from a same identity, and a sample face image from a different identity than that of the one anchor face image, performing a forward pass by using the samples of the selected two datasets, finding representations of the face images by using a backbone convolutional neural network (CNN), generating covariances from the representations of the face images and the backbone CNN, the covariances being computed in different spaces by using positive pairs and negative pairs, and employing the covariances to compute a cross-domain similarity loss function.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: April 2, 2024
    Assignee: NEC Corporation
    Inventors: Masoud Faraki, Xiang Yu, Yi-Hsuan Tsai, Yumin Suh, Manmohan Chandraker
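A minimal sketch of a covariance-based cross-domain penalty in the spirit of patent 11947626 above; the positive/negative pair construction and the exact similarity loss in the filing are not reproduced here.

```python
import torch

def covariance(features):
    """Covariance of a batch of embeddings: (B, D) -> (D, D)."""
    centered = features - features.mean(dim=0, keepdim=True)
    return centered.t() @ centered / (features.shape[0] - 1)

def cross_domain_similarity_loss(feats_a, feats_b):
    """Penalize the gap between the embedding covariances of two face datasets."""
    return torch.norm(covariance(feats_a) - covariance(feats_b), p="fro") ** 2
```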
  • Publication number: 20240037187
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 1, 2024
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
  • Publication number: 20240037186
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 1, 2024
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
  • Publication number: 20240037188
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 1, 2024
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
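The three related publications above (20240037187, 20240037186, 20240037188) share one abstract; a hedged PyTorch sketch of a loss with a cross-modality contrastive part and a cross-domain regularization part follows. The pairing of modalities, the mean-matching regularizer, and the weighting are illustrative choices, not the filed loss.

```python
import torch
import torch.nn.functional as F

def info_nce(query, keys, temperature=0.07):
    """Standard InfoNCE: the i-th query should match the i-th key."""
    logits = F.normalize(query, dim=1) @ F.normalize(keys, dim=1).t() / temperature
    targets = torch.arange(query.shape[0], device=query.device)
    return F.cross_entropy(logits, targets)

def video_contrastive_loss(rgb_src, aux_src, rgb_tgt, aux_tgt, lam=0.5):
    """Cross-modality term (RGB clip vs. its second-modality clip, per domain)
    plus a simple cross-domain term pulling source and target statistics together."""
    cross_modality = info_nce(rgb_src, aux_src) + info_nce(rgb_tgt, aux_tgt)
    cross_domain = (rgb_src.mean(dim=0) - rgb_tgt.mean(dim=0)).pow(2).sum() \
                 + (aux_src.mean(dim=0) - aux_tgt.mean(dim=0)).pow(2).sum()
    return cross_modality + lam * cross_domain
```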
  • Patent number: 11816901
    Abstract: Methods and systems for training a trajectory prediction model and performing a vehicle maneuver include encoding a set of training data to generate encoded training vectors, where the training data includes trajectory information for agents over time. Trajectory scenarios are simulated based on the encoded training vectors, with each simulated trajectory scenario representing one or more agents with respective agent trajectories, to generate simulated training data. A predictive neural network model is trained using the simulated training data to generate predicted trajectory scenarios based on a detected scene.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: November 14, 2023
    Inventors: Sriram Nochur Narayanan, Buyu Liu, Ramin Moslemi, Francesco Pittaluga, Manmohan Chandraker
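A high-level training-loop sketch matching the flow of patent 11816901 above; the `encoder`, `simulator`, `model`, and `loss_fn` interfaces are hypothetical stand-ins, and the simulator is treated as a black box that turns encoded trajectories into (scene, future trajectories) pairs.

```python
def train_trajectory_predictor(encoder, simulator, model, loss_fn, optimizer,
                               real_scenes, epochs=10):
    """Encode real trajectory data, simulate diverse scenarios from the encodings,
    and train the predictive model on the simulated scenarios."""
    encoded = [encoder(scene) for scene in real_scenes]
    simulated = [simulator(code) for code in encoded]     # (detected_scene, future) pairs
    for _ in range(epochs):
        for detected_scene, future in simulated:
            optimizer.zero_grad()
            loss = loss_fn(model(detected_scene), future)
            loss.backward()
            optimizer.step()
    return model
```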