Patents by Inventor Yumin Suh

Yumin Suh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LEVERAGING SEMANTIC INFORMATION FOR A MULTI-DOMAIN VISUAL AGENT

Publication number: 20250148766

Abstract: Systems and methods for leveraging semantic information for a multi-domain visual agent. Semantic information can be leveraged to obtain a multi-domain visual agent. To train the multi-domain visual agent, questions can be sampled from question templates for domain-specific label spaces to obtain a unified label space. The domain-specific labels from the domain-specific label spaces can be mapped into natural language descriptions (NLD) to obtain mapped NLD. The mapped NLD can be converted into prompts by combining the questions sampled from the unified label space and the annotations. The semantic information can be learned by iteratively generating outputs from tokens extracted from the prompts using a large-language model (LLM). The multi-domain visual agent (MDVA) can be trained using the semantic information.

Type: Application

Filed: November 1, 2024

Publication date: May 8, 2025

Inventors: Vijay Kumar Baikampady Gopalkrishna, Masoud Faraki, Yumin Suh, Manmohan Chandraker

VISUAL OBJECT DETECTION USING EXPLICIT NEGATIVES

Publication number: 20250118053

Abstract: Systems and methods for visual object detection using explicit negatives. To train an artificial intelligence model with explicit negatives, a data sampler can sample input data from a language-based dataset to select images with annotations. A negative generation engine can generate explicit negatives representing sentences that include contradicting words that are semantically related to the annotations by using an external knowledgebase. A model trainer can minimize the classification loss of positive labels while decreasing the confidence score of the explicit negatives for the artificial intelligence model. The negative generation engine can be optimized to generate next explicit negatives. The artificial intelligence model can backpropagate using positive labels and the next explicit negatives to generate supervisory loss corresponding to the next explicit negatives. The artificial intelligence model can detect objects from an input image.
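As a rough illustration of the negative generation step only, the sketch below swaps one word of an annotation for a contradicting word. The `CONTRADICTIONS` table is a hypothetical stand-in for the external knowledge base, and `explicit_negatives` is an illustrative name, not one taken from the application.

```python
# Hypothetical stand-in for an external knowledge base: each word maps to
# semantically related words that contradict it.
CONTRADICTIONS = {
    "red": ["blue", "green"],
    "open": ["closed"],
}

def explicit_negatives(annotation, knowledge_base=CONTRADICTIONS):
    """Return sentences that differ from the annotation by one contradicting word."""
    words = annotation.split()
    negatives = []
    for i, word in enumerate(words):
        for alternative in knowledge_base.get(word, []):
            negatives.append(" ".join(words[:i] + [alternative] + words[i + 1:]))
    return negatives

negatives = explicit_negatives("a red car with the door open")
for sentence in negatives:
    print(sentence)
```

During training, sentences like these would be scored by the detector and pushed toward low confidence, while the original annotation is kept as the positive label.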

Type: Application

Filed: October 3, 2024

Publication date: April 10, 2025

Inventors: Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna, Yumin Suh

SYSTEM ENABLEMENT BASED ON IMAGE QUALITY ANALYSIS

Publication number: 20250118067

Abstract: Systems and methods include generating a detection output for an image over multiple iterations by applying a dropout randomly to a different convolutional layer of a learning model for each iteration. The detection outputs are clustered, on labels, for each iteration. A total surface area for the clusters is computed over the iteration. A confidence is computed for the image using the total surface area for the clusters as an uncertainty score. A system is disabled if the confidence is below a threshold.
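One possible reading of this pipeline is sketched below, assuming the per-iteration detections (each produced with dropout on a different layer) are already available as `(label, box)` pairs. Measuring a cluster by its enclosing box, normalizing by the image area, and the 0.5 threshold are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict

def enclosing_area(boxes):
    """Area of the smallest axis-aligned box covering every (x1, y1, x2, y2) box."""
    x1 = min(b[0] for b in boxes)
    y1 = min(b[1] for b in boxes)
    x2 = max(b[2] for b in boxes)
    y2 = max(b[3] for b in boxes)
    return (x2 - x1) * (y2 - y1)

def image_confidence(runs, image_area):
    """Cluster detections by label across runs and turn total cluster area into a confidence."""
    clusters = defaultdict(list)
    for detections in runs:  # one list of (label, box) pairs per dropout iteration
        for label, box in detections:
            clusters[label].append(box)
    total_area = sum(enclosing_area(boxes) for boxes in clusters.values())
    uncertainty = min(1.0, total_area / image_area)  # assumed normalization
    return 1.0 - uncertainty

def system_enabled(runs, image_area, threshold=0.5):
    """Disable the downstream system when confidence falls below the threshold."""
    return image_confidence(runs, image_area) >= threshold

# Detections that agree across iterations versus detections that drift.
stable_runs = [[("car", (10, 10, 30, 30))] for _ in range(3)]
shaky_runs = [
    [("car", (10, 10, 30, 30))],
    [("car", (0, 0, 50, 50))],
    [("car", (40, 40, 90, 90))],
]
print(system_enabled(stable_runs, image_area=100 * 100))
print(system_enabled(shaky_runs, image_area=100 * 100))
```

The intuition is that when dropout barely changes the detections, the clusters stay tight and the image is treated as reliable; widely scattered boxes inflate the area and lower the confidence.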

Type: Application

Filed: September 17, 2024

Publication date: April 10, 2025

Inventors: Sparsh Garg, Samuel Schulter, Yumin Suh

EFFICIENT TRANSFORMER-BASED PANOPTIC SEGMENTATION

Publication number: 20250117947

Abstract: Methods and systems for segmentation include encoding an image using a backbone model to generate feature maps. An exit point is determined based on one of the feature maps. The feature maps are processed with a dynamic transformer encoder that includes layers, exiting the dynamic transformer encoder at a layer identified by the exit point. An output of the dynamic transformer encoder is decoded to output a segmentation of the image.
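The early-exit idea can be sketched in a few lines. Here the gate `choose_exit_point` is a hypothetical rule that maps the mean activation of the last feature map to a layer index, and each encoder layer is a toy function; a real system would presumably use a learned gate and actual transformer layers.

```python
import math

def choose_exit_point(feature_map, num_layers):
    """Hypothetical gate: map the mean activation to a layer index in [1, num_layers]."""
    mean = sum(feature_map) / len(feature_map)
    score = min(1.0, max(0.0, mean))
    return max(1, math.ceil(score * num_layers))

def run_dynamic_encoder(tokens, layers, exit_point):
    """Apply encoder layers in order and stop after the layer given by exit_point."""
    for layer in layers[:exit_point]:
        tokens = layer(tokens)
    return tokens

# Toy stand-ins for six encoder layers; each one just shifts the tokens.
layers = [lambda tokens: [t + 1.0 for t in tokens] for _ in range(6)]
feature_maps = [[0.9, 0.8, 0.7], [0.5, 0.5]]  # assumed backbone outputs

exit_point = choose_exit_point(feature_maps[-1], len(layers))
encoded = run_dynamic_encoder(feature_maps[-1], layers, exit_point)
print(exit_point, encoded)
```

Skipping the remaining layers is where the efficiency gain comes from: easy images leave the encoder early, harder ones use more of its depth.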

Type: Application

Filed: September 23, 2024

Publication date: April 10, 2025

Inventors: Abhishek Aich, Yumin Suh, Samuel Schulter, Manyi Yao

LANGUAGE-BASED OBJECT DETECTION AND DATA AUGMENTATION FOR SELF-DRIVING VEHICLE OPERATION

Publication number: 20250115276

Abstract: Methods and systems for object detection include generating a negative description for an input image of a road scene, based on a positive description of the input image, using a language model. A negative image is generated based on the input image and the negative description by replacing a portion of the input image that is described by the positive description with content that is described by the negative description using a generative image model. An object detection model is trained with the input image, the positive description, the negative description, and the negative image. An object is identified within a driving scene using the trained object detection model. A driving action is performed in a self-driving vehicle responsive to the identified object.

Type: Application

Filed: October 2, 2024

Publication date: April 10, 2025

Inventors: Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna, Yumin Suh

OPTIMIZING MODELS FOR OPEN-VOCABULARY DETECTION

Publication number: 20240378454

Abstract: Systems and methods for optimizing models for open-vocabulary detection. Region proposals can be obtained by employing a pre-trained vision-language model and a pre-trained region proposal network. Object feature predictions can be obtained by employing a trained teacher neural network with the region proposals. Object feature predictions can be filtered above a threshold to obtain pseudo labels. A student neural network with a split-and-fusion detection head can be trained by utilizing the region proposals, base ground truth class labels and the pseudo labels. The pseudo labels can be optimized by reducing the noise from the pseudo labels by employing the trained split-and-fusion detection head of the trained student neural network to obtain optimized object detections. An action can be performed relative to a scene layout based on the optimized object detections.
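The pseudo-labeling step, keeping only teacher predictions whose score clears a threshold, can be illustrated as follows. The prediction format and the 0.8 threshold are assumptions for the sake of the example.

```python
def pseudo_labels(predictions, threshold=0.8):
    """Keep teacher predictions whose confidence exceeds the threshold."""
    return [
        (p["box"], p["label"])
        for p in predictions
        if p["score"] > threshold
    ]

# Hypothetical teacher outputs for three region proposals.
teacher_predictions = [
    {"box": (12, 30, 80, 120), "label": "scooter", "score": 0.93},
    {"box": (200, 40, 260, 90), "label": "stroller", "score": 0.41},
    {"box": (5, 5, 50, 60), "label": "traffic cone", "score": 0.86},
]
kept = pseudo_labels(teacher_predictions)
print(kept)
```

The surviving boxes and labels would then be mixed with the base ground-truth labels when training the student network.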

Type: Application

Filed: May 9, 2024

Publication date: November 14, 2024

Inventors: Samuel Schulter, Yumin Suh, Manmohan Chandraker, Vijay Kumar Baikampady Gopalkrishna

DIVERSE FUTURE PREDICTIONS FOR PEDESTRIANS

Publication number: 20240351582

Abstract: Methods and systems for trajectory prediction include encoding trajectories of agents in a scene from past images of the scene. Lane centerlines are encoded for agents in the scene. The agents in the scene are encoded using the encoded trajectories and the encoded lane centerlines. A hypercolumn trajectory is decoded from the encoded agents to generate predicted trajectories for the agents. A vehicle is automatically operated responsive to the predicted trajectories.

Type: Application

Filed: April 18, 2024

Publication date: October 24, 2024

Inventors: Buyu Liu, Sriram Nochur Narayanan, Bingbing Zhuang, Yumin Suh

Face-aware person re-identification system

Patent number: 12080100

Abstract: A method for employing facial information in unsupervised person re-identification is presented. The method includes extracting, by a body feature extractor, body features from a first data stream, extracting, by a head feature extractor, head features from a second data stream, outputting a body descriptor vector from the body feature extractor, outputting a head descriptor vector from the head feature extractor, and concatenating the body descriptor vector and the head descriptor vector to enable a model to generate a descriptor vector.
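A minimal sketch of the concatenation step is shown below, assuming both extractors have already produced their descriptor vectors. The L2 normalization at the end is an added assumption, common in re-identification pipelines but not stated in the abstract.

```python
import math

def person_descriptor(body_vec, head_vec):
    """Concatenate body and head descriptors, then L2-normalize (normalization is assumed)."""
    combined = list(body_vec) + list(head_vec)
    norm = math.sqrt(sum(v * v for v in combined)) or 1.0
    return [v / norm for v in combined]

# Toy descriptor vectors standing in for the two feature extractors' outputs.
descriptor = person_descriptor(body_vec=[3.0, 0.0], head_vec=[4.0])
print(descriptor)
```

Two people can then be compared with a dot product between their descriptors, so that both body appearance and facial cues contribute to the match.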

Type: Grant

Filed: November 5, 2021

Date of Patent: September 3, 2024

Assignee: NEC Corporation

Inventors: Yumin Suh, Xiang Yu, Yi-Hsuan Tsai, Masoud Faraki, Manmohan Chandraker

Multi-domain semantic segmentation with label shifts

Patent number: 12045992

Abstract: Methods and systems for training a model include combining data from multiple datasets, the datasets having different respective label spaces. Relationships between labels in the different label spaces are identified. A unified neural network model is trained, using the combined data and the identified relationships to generate a unified model, with a class relational binary cross-entropy loss.
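The abstract does not give the form of the class relational binary cross-entropy loss. One plausible reading, sketched here purely as an assumption, is a per-class binary cross-entropy over the unified label space in which every unified class related to the annotated label counts as a positive target.

```python
import math

def class_relational_bce(probs, related):
    """Mean binary cross-entropy over unified classes.

    probs:   dict mapping each unified class to its predicted probability
    related: set of unified classes treated as positives for this sample
             (assumed to come from the identified label relationships)
    """
    loss = 0.0
    for cls, p in probs.items():
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp for numerical safety
        target = 1.0 if cls in related else 0.0
        loss -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return loss / len(probs)

# A pixel labeled "rider" in one dataset; another dataset calls the same thing "cyclist".
probs = {"rider": 0.9, "cyclist": 0.8, "road": 0.1}
relational = class_relational_bce(probs, related={"rider", "cyclist"})
strict = class_relational_bce(probs, related={"rider"})
print(relational, strict)
```

Under this reading the model is not penalized for also predicting a related label from another dataset, which is the kind of conflict that arises when label spaces differ.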

Type: Grant

Filed: November 5, 2021

Date of Patent: July 23, 2024

Assignee: NEC Corporation

Inventors: Yi-Hsuan Tsai, Masoud Faraki, Yumin Suh, Sparsh Garg, Manmohan Chandraker, Dongwan Kim

GRADIENT SPLIT SYSTEM FOR RICH HUMAN ANALYSIS

Publication number: 20240233314

Abstract: A system for rich human analysis includes a memory and one or more processors in communication with the memory configured to extract images from a camera in a surveillance system and feed the images to a person detection and tracking system that deciphers human activity tasks. Attributes of persons detected and tracked by the person detection and tracking system are estimated by a rich human analysis system to identify attributes in accordance with set criteria using a set of filters of deeper layers of convolutional layers of a feature extractor where the filters are divided into N groups trained on N corresponding tasks corresponding to task-specific heads such that one task is assigned to each group of the N groups and that each task loss updates only one subset of filters. One or more people that satisfy the attributes and the set criteria are identified.

Type: Application

Filed: March 21, 2024

Publication date: July 11, 2024

Inventors: Yumin Suh, Weijian Deng, Xiang Yu, Masoud Faraki, Manmohan Chandraker, Turgun Kashgari

SINGLE TRAINING SEQUENCE FOR NEURAL NETWORK USEABLE FOR MULTI-TASK SCENARIOS

Publication number: 20240160927

Abstract: Systems and methods for performing multiple tasks with a single artificial intelligence model that can include training a supernet model for an application by splitting the application into tasks, and splitting the supernet model into subnets. The methods and systems can further assign the tasks computing budgets, and match the tasks to subnets by matching the computing budget of the tasks to the computing capacity of the subnets. Further, the methods and systems can perform the tasks with matching subnets to produce parameters that are used by the supernet to perform the application. The supernet combines all of the tasks to produce a model for the application and the supernet retains weights for the tasks to be used in subsequent applications.
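The budget-matching step could look roughly like the sketch below, which assigns each task the most expensive subnet that still fits its computing budget. The greedy rule, the cost units, and all names are assumptions; the abstract only says that budgets are matched to subnet capacity.

```python
def match_tasks_to_subnets(task_budgets, subnet_costs):
    """Give each task the largest subnet whose cost fits its budget (None if nothing fits)."""
    assignment = {}
    for task, budget in task_budgets.items():
        affordable = [
            (cost, name) for name, cost in subnet_costs.items() if cost <= budget
        ]
        assignment[task] = max(affordable)[1] if affordable else None
    return assignment

# Hypothetical budgets and subnet costs in the same arbitrary compute unit.
task_budgets = {"detection": 5.0, "segmentation": 2.0, "depth": 0.5}
subnet_costs = {"small": 1.0, "medium": 2.0, "large": 4.0}
assignment = match_tasks_to_subnets(task_budgets, subnet_costs)
print(assignment)
```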

Type: Application

Filed: November 7, 2023

Publication date: May 16, 2024

Inventors: Yumin Suh, Samuel Schulter, Xiang Yu, Abhishek Aich

Face recognition from unseen domains via learning of semantic features

Patent number: 11947626

Abstract: A method for improving face recognition from unseen domains by learning semantically meaningful representations is presented. The method includes obtaining face images with associated identities from a plurality of datasets, randomly selecting two datasets of the plurality of datasets to train a model, sampling batch face images and their corresponding labels, sampling triplet samples including one anchor face image, a sample face image from a same identity, and a sample face image from a different identity than that of the one anchor face image, performing a forward pass by using the samples of the selected two datasets, finding representations of the face images by using a backbone convolutional neural network (CNN), generating covariances from the representations of the face images and the backbone CNN, the covariances made in different spaces by using positive pairs and negative pairs, and employing the covariances to compute a cross-domain similarity loss function.
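The triplet sampling step (one anchor, one image of the same identity, one image of a different identity) can be sketched as follows. The function name and the list-of-identities input format are illustrative assumptions.

```python
import random

def sample_triplet(identities, rng):
    """Return (anchor, positive, negative) indices into a list of identity labels."""
    by_identity = {}
    for index, identity in enumerate(identities):
        by_identity.setdefault(identity, []).append(index)
    # The anchor identity needs at least two images to supply a positive.
    candidates = [i for i, indices in by_identity.items() if len(indices) >= 2]
    anchor_identity = rng.choice(candidates)
    anchor, positive = rng.sample(by_identity[anchor_identity], 2)
    other_identities = [i for i in by_identity if i != anchor_identity]
    negative = rng.choice(by_identity[rng.choice(other_identities)])
    return anchor, positive, negative

identities = ["ann", "ann", "bob", "cho", "cho"]
anchor, positive, negative = sample_triplet(identities, random.Random(0))
print(anchor, positive, negative)
```

The positive and negative pairs produced this way are what the covariances in the abstract would be computed from.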

Type: Grant

Filed: November 5, 2021

Date of Patent: April 2, 2024

Assignee: NEC Corporation

Inventors: Masoud Faraki, Xiang Yu, Yi-Hsuan Tsai, Yumin Suh, Manmohan Chandraker

OPEN-VOCABULARY OBJECT DETECTION WITH VISION AND LANGUAGE SUPERVISION

Publication number: 20240078816

Abstract: A computer-implemented method for training a neural network to predict object categories without manual annotation is provided. The method includes feeding training datasets including at least images and data annotations to an object detection neural network, converting, by a text prompter, the data annotations into natural text inputs, converting, by a text embedder, the natural text inputs into embeddings, minimizing objective functions during training to adjust parameters of the object detection neural network, and predicting, by the object detection neural network, objects within images and videos.
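To make the prompter and embedder roles concrete, here is a toy sketch. The prompt template is an assumption, and `toy_text_embedder` is a deliberately crude letter-count embedding standing in for a real pre-trained text encoder.

```python
def text_prompter(label):
    """Turn a data annotation into a natural-text input (template is an assumption)."""
    return f"a photo of a {label}"

def toy_text_embedder(text, dim=8):
    """Toy embedding: count letters into `dim` buckets. A real system would use a language model."""
    vector = [0.0] * dim
    for ch in text.lower():
        if "a" <= ch <= "z":
            vector[(ord(ch) - ord("a")) % dim] += 1.0
    return vector

prompt = text_prompter("traffic cone")
embedding = toy_text_embedder(prompt)
print(prompt, embedding)
```

In an open-vocabulary detector, embeddings of this kind replace a fixed classification layer, so new categories can be added by writing a new prompt rather than retraining the classifier.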

Type: Application

Filed: August 11, 2023

Publication date: March 7, 2024

Inventors: Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna, Yumin Suh, Shiyu Zhao

OBJECT DETECTION IN DRIVER ASSISTANCE SYSTEM

Publication number: 20240071092

Abstract: A computer-implemented method for detecting objects within an advanced driver assistance system (ADAS) is provided. The method includes obtaining road scene datasets from a plurality of cameras, including at least road scene images and road scene data annotations, to be provided to an object detection neural network communicating with an open-vocabulary detector of a vehicle, converting, by a text prompter, the road scene data annotations into natural text inputs, converting, by a text embedder, the natural text inputs into embeddings, minimizing objective functions during training to adjust parameters of the object detection neural network, and detecting, by the object detection neural network, objects within the road scene datasets to provide alerts or notifications to a driver of the vehicle pertaining to the detected objects.

Type: Application

Filed: August 11, 2023

Publication date: February 29, 2024

Inventors: Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna, Yumin Suh

CONTROLLABLE DYNAMIC MULTI-TASK ARCHITECTURES

Publication number: 20230196122

Abstract: Systems and methods for generating a hypernetwork configured to be trained for a plurality of tasks; receiving a task preference vector identifying a hierarchical priority for the plurality of tasks, and a resource constraint as a tuple; finding tree sub-structures and the corresponding modulation of features for every tuple within an N-stream anchor network; optimizing a branching regularized loss function to train an edge hypernet; and training a weight hypernet, keeping the anchor net and the edge hypernet fixed.

Type: Application

Filed: August 31, 2022

Publication date: June 22, 2023

Inventors: Yumin Suh, Samuel Schulter, Xiang Yu, Masoud Faraki, Manmohan Chandraker, Dripta Raychaudhuri

DOMAIN GENERALIZABLE CONTINUAL LEARNING USING COVARIANCES

Publication number: 20230153572

Abstract: A computer-implemented method for model training is provided. The method includes receiving, by a hardware processor, sets of images, each set corresponding to a respective task. The method further includes training, by the hardware processor, a task-based neural network classifier having a center and a covariance matrix for each of a plurality of classes in a last layer of the task-based neural network classifier and a plurality of convolutional layers preceding the last layer, by using a similarity between an image feature of a last convolutional layer from among the plurality of convolutional layers and the center and the covariance matrix for a given one of the plurality of classes, the similarity minimizing an impact of a data model forgetting problem.
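A similarity built from a class center and covariance matrix is naturally expressed with the Mahalanobis distance. The sketch below uses the negative squared Mahalanobis distance as the similarity, which is an assumption; the abstract does not state the exact formula. It requires NumPy.

```python
import numpy as np

def class_similarity(feature, center, covariance):
    """Negative squared Mahalanobis distance between a feature and a class center."""
    diff = feature - center
    return -float(diff @ np.linalg.solve(covariance, diff))

# Two hypothetical classes, each with its own center and covariance matrix.
centers = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covariances = [np.eye(2), 4.0 * np.eye(2)]

feature = np.array([0.5, 0.5])
scores = [class_similarity(feature, c, s) for c, s in zip(centers, covariances)]
predicted_class = int(np.argmax(scores))
print(scores, predicted_class)
```

Because each class keeps its own center and covariance, adding classes for a new task does not have to overwrite the statistics of earlier ones, which is one way such a classifier could limit forgetting.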

Type: Application

Filed: October 21, 2022

Publication date: May 18, 2023

Inventors: Masoud Faraki, Yi-Hsuan Tsai, Xiang Yu, Samuel Schulter, Yumin Suh, Christian Simon

FACE RECOGNITION FROM UNSEEN DOMAINS VIA LEARNING OF SEMANTIC FEATURES

Publication number: 20220147765

Abstract: A method for improving face recognition from unseen domains by learning semantically meaningful representations is presented. The method includes obtaining face images with associated identities from a plurality of datasets, randomly selecting two datasets of the plurality of datasets to train a model, sampling batch face images and their corresponding labels, sampling triplet samples including one anchor face image, a sample face image from a same identity, and a sample face image from a different identity than that of the one anchor face image, performing a forward pass by using the samples of the selected two datasets, finding representations of the face images by using a backbone convolutional neural network (CNN), generating covariances from the representations of the face images and the backbone CNN, the covariances made in different spaces by using positive pairs and negative pairs, and employing the covariances to compute a cross-domain similarity loss function.

Type: Application

Filed: November 5, 2021

Publication date: May 12, 2022

Inventors: Masoud Faraki, Xiang Yu, Yi-Hsuan Tsai, Yumin Suh, Manmohan Chandraker

MULTI-DOMAIN SEMANTIC SEGMENTATION WITH LABEL SHIFTS

Publication number: 20220148189

Abstract: Methods and systems for training a model include combining data from multiple datasets, the datasets having different respective label spaces. Relationships between labels in the different label spaces are identified. A unified neural network model is trained, using the combined data and the identified relationships to generate a unified model, with a class relational binary cross-entropy loss.

Type: Application

Filed: November 5, 2021

Publication date: May 12, 2022

Inventors: Yi-Hsuan Tsai, Masoud Faraki, Yumin Suh, Sparsh Garg, Manmohan Chandraker, Dongwan Kim

FACE-AWARE PERSON RE-IDENTIFICATION SYSTEM

Publication number: 20220147735

Abstract: A method for employing facial information in unsupervised person re-identification is presented. The method includes extracting, by a body feature extractor, body features from a first data stream, extracting, by a head feature extractor, head features from a second data stream, outputting a body descriptor vector from the body feature extractor, outputting a head descriptor vector from the head feature extractor, and concatenating the body descriptor vector and the head descriptor vector to enable a model to generate a descriptor vector.

Type: Application

Filed: November 5, 2021

Publication date: May 12, 2022

Inventors: Yumin Suh, Xiang Yu, Yi-Hsuan Tsai, Masoud Faraki, Manmohan Chandraker

MULTI-TASK LEARNING VIA GRADIENT SPLIT FOR RICH HUMAN ANALYSIS

Publication number: 20220121953

Abstract: A method for multi-task learning via gradient split for rich human analysis is presented. The method includes extracting images from training data having a plurality of datasets, each dataset associated with one task, feeding the training data into a neural network model including a feature extractor and task-specific heads, wherein the feature extractor has a feature extractor shared component and a feature extractor task-specific component, dividing filters of deeper layers of convolutional layers of the feature extractor into N groups, N being a number of tasks, assigning one task to each group of the N groups, and manipulating gradients so that each task loss updates only one subset of filters.
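The gradient manipulation described here can be pictured as a mask over filters. In this sketch each filter is reduced to a single gradient value, filters are split into contiguous equal groups, and only the gradient from the owning task is kept; the contiguous grouping and the scalar-per-filter simplification are assumptions.

```python
def split_gradients(task_grads, num_filters):
    """Combine per-task filter gradients so each task updates only its own filter group.

    task_grads: one list of per-filter gradients for each task
    """
    num_tasks = len(task_grads)
    # Assumed grouping: contiguous blocks of filters, one block per task.
    group_of = [f * num_tasks // num_filters for f in range(num_filters)]
    update = [0.0] * num_filters
    for task, grads in enumerate(task_grads):
        for f in range(num_filters):
            if group_of[f] == task:
                update[f] += grads[f]
    return update

# Two tasks, four filters: task 0 owns filters 0-1, task 1 owns filters 2-3.
task_grads = [
    [1.0, 1.0, 1.0, 1.0],      # gradients from task 0's loss
    [10.0, 10.0, 10.0, 10.0],  # gradients from task 1's loss
]
update = split_gradients(task_grads, num_filters=4)
print(update)
```

Keeping the updates disjoint means one task's loss cannot pull another task's filters in a conflicting direction, while the shallower shared layers still learn from every task.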

Type: Application

Filed: October 7, 2021

Publication date: April 21, 2022

Inventors: Yumin Suh, Xiang Yu, Masoud Faraki, Manmohan Chandraker, Weijian Deng