Patents by Inventor Manmohan Chandraker

Manmohan Chandraker has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11520923
    Abstract: A method for protecting visual private data by preventing data reconstruction from latent representations of deep networks is presented. The method includes obtaining latent features from an input image and learning, via an adversarial reconstruction learning framework, privacy-preserving feature representations that maintain utility performance while preventing data reconstruction. The framework simulates a black-box model inversion attack by training a decoder to reconstruct the input image from the latent features, and trains an encoder to maximize the reconstruction error, thereby preventing the decoder from inverting the latent features, while minimizing the task loss.
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: December 6, 2022
    Inventors: Kihyuk Sohn, Manmohan Chandraker, Yi-Hsuan Tsai
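A minimal PyTorch sketch of the adversarial reconstruction training described in patent 11520923: a decoder simulates the black-box inversion attack by reconstructing the input from the latent features, while the encoder maximizes that reconstruction error and minimizes the task loss. The architectures, optimizers, and adversarial weight are illustrative assumptions, not the patented implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder networks; the real architectures are not given in the abstract.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
decoder = nn.Sequential(nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
                        nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1))
task_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10))

opt_enc = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters()), lr=1e-4)
opt_dec = torch.optim.Adam(decoder.parameters(), lr=1e-4)

def train_step(images, labels, adv_weight=1.0):
    # 1) Simulated black-box inversion attack: the decoder learns to reconstruct
    #    the input image from the latent features (encoder frozen for this update).
    with torch.no_grad():
        latents = encoder(images)
    dec_loss = F.mse_loss(decoder(latents), images)
    opt_dec.zero_grad(); dec_loss.backward(); opt_dec.step()

    # 2) Privacy-preserving update: the encoder minimizes the task loss while
    #    maximizing the decoder's reconstruction error.
    opt_enc.zero_grad()
    latents = encoder(images)
    task_loss = F.cross_entropy(task_head(latents), labels)
    recon_err = F.mse_loss(decoder(latents), images)
    (task_loss - adv_weight * recon_err).backward()
    opt_enc.step()
    return task_loss.item(), recon_err.item()
```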
  • Patent number: 11518382
    Abstract: A method is provided for danger prediction. The method includes generating fully-annotated simulated training data for a machine learning model responsive to receiving a set of computer-selected simulator-adjusting parameters. The method further includes training the machine learning model using reinforcement learning on the fully-annotated simulated training data. The method also includes measuring an accuracy of the trained machine learning model relative to learning a discriminative function for a given task. The discriminative function predicts a given label for a given image from the fully-annotated simulated training data. The method additionally includes adjusting the computer-selected simulator-adjusting parameters and repeating said training and measuring steps responsive to the accuracy being below a threshold accuracy.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: December 6, 2022
    Inventors: Samuel Schulter, Nataniel Ruiz, Manmohan Chandraker
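A schematic Python sketch of the adjust-train-measure loop in patent 11518382; `render_annotated_data`, `train_with_rl`, and `measure_accuracy` are hypothetical placeholders for the simulator, the reinforcement-learning trainer, and the accuracy measurement on the discriminative task.

```python
import random

def adjust_parameters(params, step=0.1):
    """Hypothetical heuristic update of the simulator-adjusting parameters;
    the patent's computer-selected adjustment policy is abstracted away."""
    return {k: v + random.uniform(-step, step) for k, v in params.items()}

def tune_simulator(render_annotated_data, train_with_rl, measure_accuracy,
                   params, threshold=0.9, max_rounds=20):
    """Outer loop from the abstract: generate fully-annotated simulated data,
    train the model, measure its accuracy, and adjust the simulator parameters
    while the accuracy stays below the threshold."""
    model = None
    for _ in range(max_rounds):
        data = render_annotated_data(params)   # fully-annotated simulated training data
        model = train_with_rl(data)            # training of the machine learning model
        if measure_accuracy(model) >= threshold:
            break
        params = adjust_parameters(params)
    return model, params
```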
  • Patent number: 11468585
    Abstract: A method for improving geometry-based monocular structure from motion (SfM) by exploiting depth maps predicted by convolutional neural networks (CNNs) is presented. The method includes capturing a sequence of RGB images from an unlabeled monocular video stream obtained by a monocular camera, feeding the RGB images into a depth estimation/refinement module, outputting depth maps, feeding the depth maps and the RGB images to a pose estimation/refinement module, the depth maps and the RGB images collectively defining pseudo RGB-D images, outputting camera poses and point clouds, and constructing a 3D map of a surrounding environment displayed on a visualization device.
    Type: Grant
    Filed: August 7, 2020
    Date of Patent: October 11, 2022
    Inventors: Quoc-Huy Tran, Pan Ji, Manmohan Chandraker, Lokender Tiwari
  • Patent number: 11462112
    Abstract: A method is provided in an Advanced Driver-Assistance System (ADAS). The method extracts, from an input video stream including a plurality of images using a multi-task Convolutional Neural Network (CNN), shared features across different perception tasks. The perception tasks include object detection and other perception tasks. The method concurrently solves, using the multi-task CNN, the different perception tasks in a single pass by concurrently processing corresponding ones of the shared features by respective different branches of the multi-task CNN to provide a plurality of different perception task outputs. Each respective different branch corresponds to a respective one of the different perception tasks. The method forms a parametric representation of a driving scene as at least one top-view map responsive to the plurality of different perception task outputs.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: October 4, 2022
    Inventors: Quoc-Huy Tran, Samuel Schulter, Paul Vernaza, Buyu Liu, Pan Ji, Yi-Hsuan Tsai, Manmohan Chandraker
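A compact PyTorch sketch of the single-pass multi-task design in patent 11462112: a shared backbone is run once and its features are processed concurrently by per-task branches. The backbone, branch shapes, and task set are assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskPerception(nn.Module):
    """Shared features are computed once and processed by per-task branches."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(                    # shared feature extractor
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.branches = nn.ModuleDict({                   # one branch per perception task
            "detection":    nn.Conv2d(64, 4 + num_classes, 1),
            "segmentation": nn.Conv2d(64, num_classes, 1),
            "depth":        nn.Conv2d(64, 1, 1)})

    def forward(self, images):
        shared = self.backbone(images)                    # single pass over shared features
        return {name: head(shared) for name, head in self.branches.items()}

# Toy usage on a random image batch.
outputs = MultiTaskPerception()(torch.randn(1, 3, 128, 128))
```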
  • Patent number: 11455813
    Abstract: Systems and methods are provided for producing a road layout model. The method includes capturing digital images having a perspective view, converting each of the digital images into top-down images, and conveying a top-down image of time t to a neural network that performs a feature transform to form a feature map of time t. The method also includes transferring the feature map of the top-down image of time t to a feature transform module to warp the feature map to a time t+1, and conveying a top-down image of time t+1 to form a feature map of time t+1. The method also includes combining the warped feature map of time t with the feature map of time t+1 to form a combined feature map, transferring the combined feature map to a long short-term memory (LSTM) module to generate the road layout model, and displaying the road layout model.
    Type: Grant
    Filed: November 12, 2020
    Date of Patent: September 27, 2022
    Inventors: Buyu Liu, Bingbing Zhuang, Samuel Schulter, Manmohan Chandraker
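A simplified PyTorch sketch of the temporal aggregation in patent 11455813: per-frame top-view features are computed, the time-t features are warped toward t+1 by a feature transform, fused with the t+1 features, and passed to an LSTM. The convolutional modules, the warp (a plain convolution here), and the ordinary `nn.LSTM` standing in for the patent's LSTM module are all assumptions.

```python
import torch
import torch.nn as nn

feature_net = nn.Conv2d(3, 16, 3, padding=1)          # per-frame feature transform
feature_transform = nn.Conv2d(16, 16, 3, padding=1)   # hypothetical warp of features t -> t+1
combine = nn.Conv2d(32, 16, 1)                        # fuse warped(t) with features(t+1)
lstm = nn.LSTM(input_size=16, hidden_size=16, batch_first=True)
layout_head = nn.Linear(16, 8)                        # 8 assumed road-layout parameters

def predict_layout(topdown_t, topdown_t1, state=None):
    feat_t = feature_net(topdown_t)                   # feature map at time t
    feat_t1 = feature_net(topdown_t1)                 # feature map at time t+1
    warped = feature_transform(feat_t)                # warp time-t features toward t+1
    fused = combine(torch.cat([warped, feat_t1], dim=1))
    pooled = fused.mean(dim=(2, 3)).unsqueeze(1)      # one (batch, 1, channels) sequence step
    out, state = lstm(pooled, state)
    return layout_head(out[:, -1]), state

params, state = predict_layout(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```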
  • Patent number: 11373067
    Abstract: A method for implementing parametric models for scene representation to improve autonomous task performance includes generating an initial map of a scene based on at least one image corresponding to a perspective view of the scene, the initial map including a non-parametric top-view representation of the scene, implementing a parametric model to obtain a scene element representation based on the initial map, the scene element representation providing a description of one or more scene elements of the scene and corresponding to an estimated semantic layout of the scene, identifying one or more predicted locations of the one or more scene elements by performing three-dimensional localization based on the at least one image, and obtaining an overlay for performing an autonomous task by placing the one or more scene elements with the one or more respective predicted locations onto the scene element representation.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: June 28, 2022
    Inventors: Samuel Schulter, Ziyan Wang, Buyu Liu, Manmohan Chandraker
  • Publication number: 20220144256
    Abstract: A method for driving path prediction is provided. The method concatenates past trajectory features and lane centerline features in a channel dimension at an agent's respective location in a top view map to obtain concatenated features thereat. The method obtains convolutional features derived from the top view map, the concatenated features, and a single representation of the training scene that captures the vehicle and agent interactions. The method extracts hypercolumn descriptor vectors which include the convolutional features from the agent's respective location in the top view map. The method obtains primary and auxiliary trajectory predictions from the hypercolumn descriptor vectors. The method generates a respective score for each of the primary and auxiliary trajectory predictions.
    Type: Application
    Filed: November 8, 2021
    Publication date: May 12, 2022
    Inventors: Sriram Nochur Narayanan, Ramin Moslemi, Francesco Pittaluga, Buyu Liu, Manmohan Chandraker
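A rough sketch of the feature construction in publication 20220144256, assuming toy channel sizes: past-trajectory and lane-centerline features are concatenated in the channel dimension, convolutional features are computed, and a hypercolumn descriptor is read out at the agent's top-view location to feed hypothetical primary, auxiliary, and scoring heads.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed channel sizes for past-trajectory and lane-centerline feature maps.
traj_feats = torch.zeros(1, 8, 100, 100)
lane_feats = torch.zeros(1, 8, 100, 100)
agent_rc = (40, 60)                              # agent's (row, col) cell in the top-view map

# Concatenate the two feature maps in the channel dimension.
scene = torch.cat([traj_feats, lane_feats], dim=1)

# A small convolutional stack; responses from each layer form the hypercolumn.
conv1 = nn.Conv2d(16, 32, 3, padding=1)
conv2 = nn.Conv2d(32, 64, 3, padding=1)
f1 = F.relu(conv1(scene))
f2 = F.relu(conv2(f1))

# Hypercolumn descriptor: stack all layer responses at the agent's location.
r, c = agent_rc
hypercolumn = torch.cat([scene[0, :, r, c], f1[0, :, r, c], f2[0, :, r, c]])

# Hypothetical prediction and scoring heads (12 future (x, y) waypoints assumed).
primary_head = nn.Linear(hypercolumn.numel(), 2 * 12)
aux_head = nn.Linear(hypercolumn.numel(), 2 * 12)
score_head = nn.Linear(hypercolumn.numel(), 2)

primary = primary_head(hypercolumn).view(12, 2)
auxiliary = aux_head(hypercolumn).view(12, 2)
scores = score_head(hypercolumn)                 # one score per trajectory prediction
```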
  • Publication number: 20220147765
    Abstract: A method for improving face recognition from unseen domains by learning semantically meaningful representations is presented. The method includes obtaining face images with associated identities from a plurality of datasets, randomly selecting two datasets of the plurality of datasets to train a model, sampling batch face images and their corresponding labels, sampling triplets that each include an anchor face image, a face image of the same identity, and a face image of a different identity than that of the anchor, performing a forward pass using the samples of the two selected datasets, finding representations of the face images by using a backbone convolutional neural network (CNN), generating covariances from the representations of the face images and the backbone CNN, the covariances being computed in different spaces by using positive pairs and negative pairs, and employing the covariances to compute a cross-domain similarity loss function.
    Type: Application
    Filed: November 5, 2021
    Publication date: May 12, 2022
    Inventors: Masoud Faraki, Xiang Yu, Yi-Hsuan Tsai, Yumin Suh, Manmohan Chandraker
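A loose PyTorch sketch inspired by publication 20220147765: triplets drawn from two randomly selected datasets are embedded with a placeholder backbone CNN, and covariances of the embeddings enter a cross-domain term alongside a standard triplet loss. The specific covariance spaces and the loss weighting are assumptions, not the patented formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder backbone CNN producing 64-D face embeddings.
backbone = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64))

def covariance(x):
    """Feature covariance of a batch of embeddings (rows are samples)."""
    centered = x - x.mean(dim=0, keepdim=True)
    return centered.t() @ centered / max(x.shape[0] - 1, 1)

def cross_domain_loss(anchors_a, positives_b, negatives_b, margin=0.2, lam=0.1):
    emb_a = backbone(anchors_a)    # anchors from one sampled dataset
    emb_p = backbone(positives_b)  # same identities, other sampled dataset
    emb_n = backbone(negatives_b)  # different identities, other sampled dataset
    # Triplet term on the sampled (anchor, positive, negative) face images.
    triplet = F.triplet_margin_loss(emb_a, emb_p, emb_n, margin=margin)
    # Cross-domain similarity term comparing covariances built from the two domains.
    cross_domain = (covariance(emb_a) - covariance(torch.cat([emb_p, emb_n]))).pow(2).sum()
    return triplet + lam * cross_domain
```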
  • Publication number: 20220147767
    Abstract: A method for training a model for face recognition is provided. The method forward trains a training batch of samples to form a face recognition model w(t), and calculates sample weights for the batch. The method obtains a training batch gradient with respect to model weights thereof and updates, using the gradient, the model w(t) to a face recognition model ŵ(t). The method forwards a validation batch of samples to the face recognition model ŵ(t). The method obtains a validation batch gradient, and updates, using the validation batch gradient and ŵ(t), a sample-level importance weight of samples in the training batch to obtain an updated sample-level importance weight. The method obtains a training batch upgraded gradient based on the updated sample-level importance weight of the training batch samples, and updates, using the upgraded gradient, the model w(t) to a trained model w(t+1) corresponding to a next iteration.
    Type: Application
    Filed: November 8, 2021
    Publication date: May 12, 2022
    Inventors: Xiang Yu, Yi-Hsuan Tsai, Masoud Faraki, Ramin Moslemi, Manmohan Chandraker, Chang Liu
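A toy sketch of the meta-update schedule in publication 20220147767, using a single linear layer as a stand-in for the face recognition model: the training batch yields a virtual model ŵ(t), the validation batch gradient updates the sample-level importance weights, and the re-weighted training gradient produces w(t+1). The learning rate and the weight normalization are assumptions.

```python
import torch
import torch.nn.functional as F

# Toy stand-ins: a linear "face recognition model" and random feature batches.
w = torch.zeros(16, 5, requires_grad=True)                 # model weights w(t)
train_x, train_y = torch.randn(8, 16), torch.randint(0, 5, (8,))
val_x, val_y = torch.randn(8, 16), torch.randint(0, 5, (8,))
lr = 0.1

# 1) Forward the training batch with per-sample importance weights eps.
eps = torch.zeros(8, requires_grad=True)
train_losses = F.cross_entropy(train_x @ w, train_y, reduction="none")
weighted_loss = (eps * train_losses).sum()

# 2) Virtual update: w_hat(t) = w(t) - lr * grad, kept in the autograd graph.
grad_w = torch.autograd.grad(weighted_loss, w, create_graph=True)[0]
w_hat = w - lr * grad_w

# 3) Forward the validation batch through w_hat(t) and backprop to eps.
val_loss = F.cross_entropy(val_x @ w_hat, val_y)
grad_eps = torch.autograd.grad(val_loss, eps)[0]

# 4) Updated sample-level importance weights (clamped and normalized).
sample_weights = torch.clamp(-grad_eps, min=0)
sample_weights = sample_weights / (sample_weights.sum() + 1e-8)

# 5) Upgraded training gradient with the new weights, then the real step to w(t+1).
final_loss = (sample_weights.detach()
              * F.cross_entropy(train_x @ w, train_y, reduction="none")).sum()
grad_final = torch.autograd.grad(final_loss, w)[0]
with torch.no_grad():
    w_next = w - lr * grad_final
```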
  • Publication number: 20220147746
    Abstract: A computer-implemented method for road layout prediction is provided. The method includes segmenting, by a first processor-based element, an RGB image to output pixel-level semantic segmentation results for the RGB image in a perspective view for both visible and occluded pixels in the perspective view based on contextual clues. The method further includes learning, by a second processor-based element, a mapping from the pixel-level semantic segmentation results for the RGB image in the perspective view to a top view of the RGB image using a road plane assumption. The method also includes generating, by a third processor-based element, an occlusion-aware parametric road layout prediction for road layout related attributes in the top view.
    Type: Application
    Filed: November 8, 2021
    Publication date: May 12, 2022
    Inventors: Buyu Liu, Bingbing Zhuang, Manmohan Chandraker
  • Publication number: 20220147761
    Abstract: Video methods and systems include extracting features of a first modality and a second modality from a labeled first training dataset in a first domain and an unlabeled second training dataset in a second domain. A video analysis model is trained using contrastive learning on the extracted features, including optimization of a loss function that includes a cross-domain regularization part and a cross-modality regularization part.
    Type: Application
    Filed: November 8, 2021
    Publication date: May 12, 2022
    Inventors: Yi-Hsuan Tsai, Xiang Yu, Bingbing Zhuang, Manmohan Chandraker, Donghyun Kim
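An illustrative sketch of the loss structure named in publication 20220147761, assuming two modalities (for example RGB and flow) per clip: a supervised term on the labeled source domain plus contrastive cross-modality and cross-domain regularization terms. The InfoNCE pairing strategy and the weights are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(queries, keys, temperature=0.07):
    """Standard InfoNCE: index-matched (query, key) pairs are positives, the rest negatives."""
    logits = F.normalize(queries, dim=1) @ F.normalize(keys, dim=1).t() / temperature
    return F.cross_entropy(logits, torch.arange(queries.shape[0]))

def total_loss(src_rgb, src_flow, tgt_rgb, tgt_flow, src_logits, src_labels,
               lam_domain=0.5, lam_modality=0.5):
    """Supervised term on the labeled source domain, a cross-modality term tying the
    two modalities of each clip together, and a cross-domain term tying source and
    target features together; the exact pairing across domains is a simplification."""
    task = F.cross_entropy(src_logits, src_labels)
    cross_modality = info_nce(src_rgb, src_flow) + info_nce(tgt_rgb, tgt_flow)
    cross_domain = info_nce(src_rgb, tgt_rgb)
    return task + lam_modality * cross_modality + lam_domain * cross_domain
```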
  • Publication number: 20220147735
    Abstract: A method for employing facial information in unsupervised person re-identification is presented. The method includes extracting, by a body feature extractor, body features from a first data stream, extracting, by a head feature extractor, head features from a second data stream, outputting a body descriptor vector from the body feature extractor, outputting a head descriptor vector from the head feature extractor, and concatenating the body descriptor vector and the head descriptor vector to enable a model to generate a descriptor vector.
    Type: Application
    Filed: November 5, 2021
    Publication date: May 12, 2022
    Inventors: Yumin Suh, Xiang Yu, Yi-Hsuan Tsai, Masoud Faraki, Manmohan Chandraker
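A minimal PyTorch sketch of the two-stream descriptor in publication 20220147735: a body feature extractor and a head feature extractor each produce a descriptor vector, and the two are concatenated. The placeholder CNNs and vector sizes are assumptions.

```python
import torch
import torch.nn as nn

class PersonDescriptor(nn.Module):
    """Concatenates a body descriptor vector and a head descriptor vector."""
    def __init__(self):
        super().__init__()
        self.body_extractor = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 128))
        self.head_extractor = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64))

    def forward(self, body_crop, head_crop):
        body_vec = self.body_extractor(body_crop)   # body descriptor vector
        head_vec = self.head_extractor(head_crop)   # head descriptor vector
        return torch.cat([body_vec, head_vec], dim=1)

descriptor = PersonDescriptor()(torch.randn(1, 3, 256, 128), torch.randn(1, 3, 64, 64))
```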
  • Publication number: 20220148220
    Abstract: A computer-implemented method for fusing geometrical and Convolutional Neural Network (CNN) relative camera pose is provided. The method includes receiving two images having different camera poses. The method further includes inputting the two images into a geometric solver branch to return, as a first solution, an estimated camera pose and an associated pose uncertainty value determined from a Jacobian of a reprojection error function. The method also includes inputting the two images into a CNN branch to return, as a second solution, a predicted camera pose and an associated pose uncertainty value. The method additionally includes fusing, by a processor device, the first solution and the second solution in a probabilistic manner using Bayes' rule to obtain a fused pose.
    Type: Application
    Filed: November 5, 2021
    Publication date: May 12, 2022
    Inventors: Bingbing Zhuang, Manmohan Chandraker
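A small NumPy sketch of the probabilistic fusion in publication 20220148220, assuming an independent per-parameter Gaussian uncertainty model: Bayes' rule for Gaussians then reduces to precision-weighted averaging of the geometric and CNN pose estimates.

```python
import numpy as np

def fuse_poses(pose_geo, var_geo, pose_cnn, var_cnn):
    """Fuse two pose estimates with associated uncertainties via Bayes' rule for
    Gaussians, i.e. precision-weighted averaging. A diagonal uncertainty model is
    assumed; the patent derives the geometric branch's uncertainty from the
    Jacobian of a reprojection error function."""
    prec_geo = 1.0 / np.asarray(var_geo)
    prec_cnn = 1.0 / np.asarray(var_cnn)
    fused_var = 1.0 / (prec_geo + prec_cnn)
    fused_pose = fused_var * (prec_geo * np.asarray(pose_geo) + prec_cnn * np.asarray(pose_cnn))
    return fused_pose, fused_var

# Toy 6-DoF pose (3 rotation + 3 translation parameters).
pose, var = fuse_poses([0.1, 0.0, 0.2, 1.0, 0.5, 0.0], [0.01] * 6,
                       [0.12, 0.02, 0.18, 1.1, 0.45, 0.05], [0.04] * 6)
```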
  • Publication number: 20220148189
    Abstract: Methods and systems for training a model include combining data from multiple datasets, the datasets having different respective label spaces. Relationships between labels in the different label spaces are identified. A unified neural network model is trained, using the combined data and the identified relationships to generate a unified model, with a class relational binary cross-entropy loss.
    Type: Application
    Filed: November 5, 2021
    Publication date: May 12, 2022
    Inventors: Yi-Hsuan Tsai, Masoud Faraki, Yumin Suh, Sparsh Garg, Manmohan Chandraker, Dongwan Kim
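An illustrative sketch of a class-relational binary cross-entropy over a unified label space, in the spirit of publication 20220148189: classes that a relation matrix marks as related to a sample's label are excluded from the negative term rather than penalized. The relation matrix and the masking rule are assumptions.

```python
import torch
import torch.nn.functional as F

def class_relational_bce(logits, targets, relation):
    """Binary cross-entropy over the unified label space where related (but
    non-target) classes are ignored instead of being treated as negatives."""
    num_classes = logits.shape[1]
    one_hot = F.one_hot(targets, num_classes).float()
    related = relation[targets].float()                 # related classes per sample
    weight = 1.0 - (related - one_hot).clamp(min=0)     # zero loss on related non-target classes
    return F.binary_cross_entropy_with_logits(logits, one_hot, weight=weight)

# Toy usage: 4 unified classes where classes 2 and 3 name the same concept in two datasets.
relation = torch.eye(4)
relation[2, 3] = relation[3, 2] = 1.0
loss = class_relational_bce(torch.randn(6, 4), torch.tensor([0, 1, 2, 3, 2, 1]), relation)
```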
  • Patent number: 11321853
    Abstract: A computer-implemented method for implementing a self-supervised visual odometry framework using long-term modeling is provided. Within a pose network of the framework, which includes a plurality of pose encoders, a convolutional long short-term memory (ConvLSTM) module having a first-layer ConvLSTM and a second-layer ConvLSTM, and a pose prediction layer, the method performs a first stage of training over a first image sequence using photometric loss, depth smoothness loss and pose cycle consistency loss, and performs a second stage of training to finetune the second-layer ConvLSTM over a second image sequence longer than the first image sequence.
    Type: Grant
    Filed: July 27, 2020
    Date of Patent: May 3, 2022
    Inventors: Pan Ji, Quoc-Huy Tran, Manmohan Chandraker, Yuliang Zou
  • Patent number: 11314993
    Abstract: An action recognition system is provided that includes a device configured to capture a video sequence formed from a set of unlabeled testing video frames. The system further includes a processor configured to pre-train a recognition engine formed from a reference set of CNNs on a still image domain that includes labeled training still image frames. The processor adapts the recognition engine to a video domain to form an adapted engine, by applying non-reference CNNs to domains that include the still image and video domains and a degraded image domain that includes labeled synthetically degraded versions of the frames in the still image domain. The video domain includes random unlabeled training video frames. The processor recognizes, using the adapted engine, an action performed by at least one object in the sequence, and controls a device to perform a response action in response to an action type of the action.
    Type: Grant
    Filed: February 6, 2018
    Date of Patent: April 26, 2022
    Inventors: Kihyuk Sohn, Xiang Yu, Manmohan Chandraker
  • Publication number: 20220121953
    Abstract: A method for multi-task learning via gradient split for rich human analysis is presented. The method includes extracting images from training data having a plurality of datasets, each dataset associated with one task, feeding the training data into a neural network model including a feature extractor and task-specific heads, wherein the feature extractor has a feature extractor shared component and a feature extractor task-specific component, dividing the filters of the deeper convolutional layers of the feature extractor into N groups, N being the number of tasks, assigning one task to each group of the N groups, and manipulating gradients so that each task loss updates only one subset of filters.
    Type: Application
    Filed: October 7, 2021
    Publication date: April 21, 2022
    Inventors: Yumin Suh, Xiang Yu, Masoud Faraki, Manmohan Chandraker, Weijian Deng
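A rough PyTorch sketch of the gradient-split idea in publication 20220121953: the filters of a deeper convolutional layer are divided into one group per task, and each task's gradient is masked so it updates only its own group, while the shared component receives gradients from every task. Layer sizes and the group assignment are assumptions.

```python
import torch
import torch.nn as nn

num_tasks = 3
shared = nn.Conv2d(3, 32, 3, padding=1)        # feature-extractor shared component
deep = nn.Conv2d(32, 96, 3, padding=1)         # deeper layer whose 96 filters are split
heads = nn.ModuleList(nn.Linear(96, 5) for _ in range(num_tasks))
groups = torch.arange(96).chunk(num_tasks)     # one filter group per task

opt = torch.optim.SGD([*shared.parameters(), *deep.parameters(), *heads.parameters()], lr=0.01)

def train_step(batches):
    """batches: one (images, labels) pair per task. Each task's loss is backpropagated
    separately, and only that task's filter group keeps its gradient on the deep layer."""
    opt.zero_grad()
    masked_w = torch.zeros_like(deep.weight)
    masked_b = torch.zeros_like(deep.bias)
    for task_id, (images, labels) in enumerate(batches):
        feats = torch.relu(deep(torch.relu(shared(images))))
        pooled = feats.mean(dim=(2, 3))
        loss = nn.functional.cross_entropy(heads[task_id](pooled), labels)
        loss.backward()                        # shared layer and head accumulate normally
        idx = groups[task_id]                  # keep only this task's filter group
        masked_w[idx] = deep.weight.grad[idx]
        masked_b[idx] = deep.bias.grad[idx]
        deep.weight.grad.zero_(); deep.bias.grad.zero_()
    deep.weight.grad.copy_(masked_w); deep.bias.grad.copy_(masked_b)
    opt.step()
```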
  • Publication number: 20220111869
    Abstract: Methods and systems for determining a path include detecting objects within a perspective image that shows a scene. Depth is predicted within the perspective image. Semantic segmentation is performed on the perspective image. An attention map is generated using the detected objects and the predicted depth. A refined top-down view of the scene is generated using the predicted depth and the semantic segmentation. A parametric top-down representation of the scene is determined using a relational graph model. A path through the scene is determined using the parametric top-down representation.
    Type: Application
    Filed: October 6, 2021
    Publication date: April 14, 2022
    Inventors: Buyu Liu, Pan Ji, Bingbing Zhuang, Manmohan Chandraker, Uday Kusupati
  • Patent number: 11301716
    Abstract: A method is provided for unsupervised domain adaptation for video classification. The method learns a transformation for each of a plurality of target video clips taken from a set of target videos, responsive to original features extracted from the target video clips. The transformation corrects differences between the target domain corresponding to the target video clips and a source domain corresponding to source video clips taken from a set of source videos. The method adapts the target domain to the source domain by applying the transformation to the extracted original features to obtain transformed features for the plurality of target video clips. The method converts the original and transformed features of same ones of the target video clips into a single classification feature for each of the target videos. The method classifies a new target video relative to the set of source videos using the single classification feature for each of the target videos.
    Type: Grant
    Filed: July 18, 2019
    Date of Patent: April 12, 2022
    Inventors: Gaurav Sharma, Manmohan Chandraker, Jinwoo Choi
  • Publication number: 20220108226
    Abstract: A method for employing a general label space voting-based differentially private federated learning (DPFL) framework is presented. The method includes labeling a first subset of unlabeled data from a first global server, to generate first pseudo-labeled data, by employing a first voting-based DPFL computation where each agent trains a local agent model by using private local data associated with the agent, labeling a second subset of unlabeled data from a second global server, to generate second pseudo-labeled data, by employing a second voting-based DPFL computation where each agent maintains a data-independent feature extractor, and training a global model by using the first and second pseudo-labeled data to provide provable differential privacy (DP) guarantees for both instance-level and agent-level privacy regimes.
    Type: Application
    Filed: October 1, 2021
    Publication date: April 7, 2022
    Inventors: Xiang Yu, Yi-Hsuan Tsai, Francesco Pittaluga, Masoud Faraki, Manmohan Chandraker, Yuqing Zhu
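A small sketch of the voting step suggested by publication 20220108226: each agent's locally trained model votes a label for an unlabeled server sample, noise is added to the vote histogram, and the noisy winner becomes the pseudo-label used to train the global model. The Laplace mechanism and its scale here are generic differential-privacy choices, not the patent's exact analysis.

```python
import numpy as np

def dp_vote_label(agent_predictions, num_classes, noise_scale=1.0, rng=None):
    """Noisy-argmax voting to pseudo-label one unlabeled server sample: build a
    histogram of the agents' predicted labels, add Laplace noise calibrated by
    noise_scale, and return the noisy winner as the pseudo-label."""
    rng = rng or np.random.default_rng()
    votes = np.bincount(agent_predictions, minlength=num_classes).astype(float)
    votes += rng.laplace(scale=noise_scale, size=num_classes)
    return int(np.argmax(votes))

# Example: ten agents vote on one sample over a 5-class label space.
pseudo_label = dp_vote_label([2, 2, 1, 2, 4, 2, 2, 1, 2, 2], num_classes=5)
```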