Patents by Inventor Samuel Schulter

Samuel Schulter has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240078816
    Abstract: A computer-implemented method for training a neural network to predict object categories without manual annotation is provided. The method includes feeding training datasets including at least images and data annotations to an object detection neural network, converting, by a text prompter, the data annotations into natural text inputs, converting, by a text embedder, the natural text inputs into embeddings, minimizing objective functions during training to adjust parameters of the object detection neural network, and predicting, by the object detection neural network, objects within images and videos.
    Type: Application
    Filed: August 11, 2023
    Publication date: March 7, 2024
    Inventors: Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna, Yumin Suh, Shiyu Zhao
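
The prompt-and-embed classification described in this entry can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the claimed method: the prompt template, the cosine-similarity objective, and all tensor shapes are placeholders, and the region and text embeddings below are random stand-ins for the detector and text embedder outputs.

```python
# Hedged sketch: scoring detector region features against text embeddings of
# natural-language category prompts, in the spirit of the abstract above.
# All names, shapes, and the prompt template are illustrative assumptions.
import torch
import torch.nn.functional as F

def make_prompts(category_names):
    # "Text prompter": turn plain annotation labels into natural text inputs.
    return [f"a photo of a {name}" for name in category_names]

def classification_loss(region_features, text_embeddings, target_ids, tau=0.07):
    # Region features from the detector are scored against every category
    # embedding; the training objective is cross-entropy over similarities.
    region_features = F.normalize(region_features, dim=-1)
    text_embeddings = F.normalize(text_embeddings, dim=-1)
    logits = region_features @ text_embeddings.t() / tau
    return F.cross_entropy(logits, target_ids)

print(make_prompts(["car", "pedestrian", "traffic light"]))
# Dummy tensors standing in for detector outputs and text-embedder outputs.
regions = torch.randn(8, 256)      # 8 detected regions
texts = torch.randn(3, 256)        # 3 embedded category prompts
targets = torch.randint(0, 3, (8,))
print(classification_loss(regions, texts, targets).item())
```
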
  • Publication number: 20240071092
    Abstract: A computer-implemented method for detecting objects within an advanced driver assistance system (ADAS) is provided. The method includes obtaining road scene datasets from a plurality of cameras, including at least road scene images and road scene data annotations, to be provided to an object detection neural network communicating with an open-vocabulary detector of a vehicle, converting, by a text prompter, the road scene data annotations into natural text inputs, converting, by a text embedder, the natural text inputs into embeddings, minimizing objective functions during training to adjust parameters of the object detection neural network, and detecting, by the object detection neural network, objects within the road scene datasets to provide alerts or notifications to a driver of the vehicle pertaining to the detected objects.
    Type: Application
    Filed: August 11, 2023
    Publication date: February 29, 2024
    Inventors: Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna, Yumin Suh
  • Publication number: 20240071105
    Abstract: Methods and systems for training a model include pre-training a backbone model with a pre-training decoder, using an unlabeled dataset with multiple distinct sensor data modalities that derive from different sensor types. The backbone model is fine-tuned with an output decoder after pre-training, using a labeled dataset with the multiple modalities.
    Type: Application
    Filed: August 22, 2023
    Publication date: February 29, 2024
    Inventors: Samuel Schulter, Bingbing Zhuang, Vijay Kumar Baikampady Gopalkrishna, Sparsh Garg, Zhixing Zhang
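
A minimal sketch of the two-stage recipe in this entry, assuming a toy fusion of two sensor modalities, a reconstruction objective for pre-training, and a classification objective for fine-tuning; the real backbone, decoders, and losses are not specified here and are placeholders.

```python
# Hedged sketch of the two-stage recipe in the abstract: pre-train a shared
# backbone with a pre-training decoder on unlabeled multi-modal data, then
# fine-tune the same backbone with an output decoder on labeled data.
# Architectures, losses, and data shapes are illustrative assumptions.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 128))
pretrain_decoder = nn.Linear(128, 64)   # e.g. reconstructs the fused input
output_decoder = nn.Linear(128, 10)     # e.g. predicts task labels

def fuse(camera_feat, lidar_feat):
    # Toy fusion of two sensor modalities into one input vector.
    return torch.cat([camera_feat, lidar_feat], dim=-1)

# Stage 1: self-supervised pre-training on unlabeled multi-modal samples.
opt = torch.optim.Adam(list(backbone.parameters()) + list(pretrain_decoder.parameters()), lr=1e-3)
for _ in range(100):
    cam, lidar = torch.randn(16, 32), torch.randn(16, 32)   # unlabeled batch
    x = fuse(cam, lidar)
    loss = nn.functional.mse_loss(pretrain_decoder(backbone(x)), x)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: supervised fine-tuning with the output decoder on labeled data.
opt = torch.optim.Adam(list(backbone.parameters()) + list(output_decoder.parameters()), lr=1e-4)
for _ in range(100):
    cam, lidar = torch.randn(16, 32), torch.randn(16, 32)
    labels = torch.randint(0, 10, (16,))                     # labeled batch
    loss = nn.functional.cross_entropy(output_decoder(backbone(fuse(cam, lidar))), labels)
    opt.zero_grad(); loss.backward(); opt.step()
```
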
  • Publication number: 20230281999
    Abstract: Methods and systems for identifying road hazards include capturing an image of a road scene using a camera. The image is embedded using a segmentation model that includes an image branch having an image embedding layer that embeds images into a joint latent space and a text branch having a text embedding layer that embeds text into the joint latent space. A mask is generated for an object within the image using the segmentation model. A probability is determined that the object matches a road hazard using the segmentation model. A signal is generated responsive to the probability to ameliorate a danger posed by the road hazard.
    Type: Application
    Filed: March 23, 2023
    Publication date: September 7, 2023
    Inventors: Samuel Schulter, Sparsh Garg
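
One way to picture the joint image/text latent space in this entry is the small sketch below: a masked object embedding is compared to text embeddings to estimate the probability that the object is a road hazard. The encoders are random stubs, and the class names, temperature, and threshold are assumptions made only for illustration.

```python
# Hedged sketch of the joint latent space described above: an image-branch
# mask embedding is scored against text-branch class embeddings, and the
# softmax probability for the hazard class drives the signal.
import torch
import torch.nn.functional as F

def hazard_probability(object_embedding, text_embeddings, hazard_index, tau=0.07):
    # Both branches are assumed to map into the same (joint) latent space.
    obj = F.normalize(object_embedding, dim=-1)
    txt = F.normalize(text_embeddings, dim=-1)
    probs = F.softmax(obj @ txt.t() / tau, dim=-1)
    return probs[..., hazard_index]

# Dummy embeddings standing in for the image-branch mask feature and the
# text-branch embeddings of ["road hazard", "normal road", "vehicle"].
object_embedding = torch.randn(1, 256)
text_embeddings = torch.randn(3, 256)

p = hazard_probability(object_embedding, text_embeddings, hazard_index=0)
if p.item() > 0.5:            # illustrative threshold
    print("hazard signal")    # stand-in for the ameliorating signal
```
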
  • Publication number: 20230281858
    Abstract: A method for object detection obtains, from a set of RGB images lacking annotations, a set of regions that include potential objects, a bounding box, and an objectness score indicating a region prediction confidence. The method obtains, by a region scorer for each region in the set, a category from a fixed set of categories and a confidence for the category responsive to the objectness score. The method duplicates each region in the set to obtain a first and a second patch. The method encodes the patches to obtain an image vector. The method encodes a template sentence using the category to obtain a text vector for each category. The method compares the image vector to the text vector via a similarity function to obtain a similarity probability based on the confidence. The method defines a final set of pseudo labels based on the similarity probability being above a threshold.
    Type: Application
    Filed: February 21, 2023
    Publication date: September 7, 2023
    Inventors: Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna
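
The pseudo-labeling step in this entry lends itself to a short sketch: region crops are embedded, template sentences are embedded, and a region keeps a pseudo label only if its image-text similarity, weighted by the region's confidence, clears a threshold. The encoders are random stubs and the threshold value is an assumption, not the patented configuration.

```python
# Hedged sketch of similarity-based pseudo-label filtering for unlabeled
# RGB images, in the spirit of the abstract above.
import torch
import torch.nn.functional as F

def pseudo_labels(patch_embeds, text_embeds, confidences, threshold=0.6, tau=0.07):
    patch_embeds = F.normalize(patch_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    sims = F.softmax(patch_embeds @ text_embeds.t() / tau, dim=-1)  # per-category probability
    best_prob, best_cat = sims.max(dim=-1)
    score = best_prob * confidences      # fold in the objectness-based confidence
    keep = score > threshold
    return best_cat[keep], score[keep]

# Dummy tensors: 5 region patches, 3 categories from a fixed label set
# encoded via template sentences such as "a photo of a {category}".
patches = torch.randn(5, 256)
templates = torch.randn(3, 256)
conf = torch.rand(5)
labels, scores = pseudo_labels(patches, templates, conf)
print(labels.tolist(), scores.tolist())
```
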
  • Publication number: 20230281826
    Abstract: Methods and systems for training an image segmentation model include embedding training images, from multiple training datasets having differing label spaces, in a joint latent space to generate first features. Textual labels of the training images are embedded in the joint latent space to generate second features. A segmentation model is trained using the first features and the second features.
    Type: Application
    Filed: March 6, 2023
    Publication date: September 7, 2023
    Inventor: Samuel Schulter
  • Publication number: 20230281977
    Abstract: Methods and systems for detecting faults include capturing an image of a scene using a camera. The image is embedded using a segmentation model that includes an image branch having an image embedding layer that embeds images into a joint latent space and a text branch having a text embedding layer that embeds text into the joint latent space. Semantic information is generated for a region of the image corresponding to a predetermined static object using the embedded image. A fault of the camera is identified based on a discrepancy between the semantic information and semantic information of the predetermined static object. The fault of the camera is corrected.
    Type: Application
    Filed: March 23, 2023
    Publication date: September 7, 2023
    Inventors: Samuel Schulter, Sparsh Garg, Manmohan Chandraker
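
The fault check in this entry can be pictured as comparing the live semantics of a region that should contain a known static object against a stored reference. In the sketch below the discrepancy measure (total variation distance) and the threshold are assumptions chosen only for illustration.

```python
# Hedged sketch of the fault check described above: a large discrepancy
# between current and reference semantics for the static-object region
# flags a camera fault.
import torch
import torch.nn.functional as F

def camera_fault(current_semantics, reference_semantics, threshold=0.3):
    # Both inputs are assumed to be per-class probability vectors for the
    # image region covering the predetermined static object.
    discrepancy = 0.5 * torch.sum(torch.abs(current_semantics - reference_semantics))
    return discrepancy.item() > threshold

reference = F.softmax(torch.randn(10), dim=-1)   # stored semantics of the static object
current = F.softmax(torch.randn(10), dim=-1)     # semantics from the live camera image
if camera_fault(current, reference):
    print("camera fault detected - trigger correction")
```
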
  • Publication number: 20230281963
    Abstract: A method is provided for pretraining vision and language models that includes receiving image-text pairs, each including an image and a text describing the image. The method encodes, by a neural network (NN) based visual encoder, an image into a set of feature vectors corresponding to input image patches and a CLS token which represents a global image feature. The method parses, by a text tokenizer, the text into a set of feature vectors as tokens for each word in the text. The method encodes the CLS token from the NN based visual encoder and the tokens from the text tokenizer into a set of features by a NN based text and multimodal encoder that shares weights for encoding both the CLS token and the tokens. The method accumulates the weights from multiple iterations as an exponential moving average of the weights during the pretraining until an error is reduced to be under a predetermined threshold amount.
    Type: Application
    Filed: February 28, 2023
    Publication date: September 7, 2023
    Inventors: Vijay Kumar Baikampady Gopalkrishna, Xiang Yu, Samuel Schulter
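
The exponential-moving-average weight accumulation mentioned at the end of this abstract is a standard construction; a minimal sketch follows. The decay value and the stand-in encoder are assumptions, not the patented configuration.

```python
# Hedged sketch of EMA weight accumulation during pretraining: a momentum
# copy of the encoder is updated after every training iteration.
import copy
import torch

model = torch.nn.Linear(128, 128)     # stand-in for the shared text/multimodal encoder
ema_model = copy.deepcopy(model)
for p in ema_model.parameters():
    p.requires_grad_(False)

@torch.no_grad()
def ema_update(model, ema_model, decay=0.999):
    for p, ema_p in zip(model.parameters(), ema_model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

# Inside the pretraining loop one would call, after each optimizer step:
#     ema_update(model, ema_model)
```
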
  • Publication number: 20230196122
    Abstract: Systems and methods are provided for generating a hypernetwork configured to be trained for a plurality of tasks; receiving a task preference vector identifying a hierarchical priority for the plurality of tasks, and a resource constraint as a tuple; finding tree sub-structures and the corresponding modulation of features for every tuple within an N-stream anchor network; optimizing a branching regularized loss function to train an edge hypernet; and training a weight hypernet, keeping the anchor net and the edge hypernet fixed.
    Type: Application
    Filed: August 31, 2022
    Publication date: June 22, 2023
    Inventors: Yumin Suh, Samuel Schulter, Xiang Yu, Masoud Faraki, Manmohan Chandraker, Dripta Raychaudhuri
  • Publication number: 20230154104
    Abstract: A method for achieving high-fidelity novel view synthesis and 3D reconstruction for large-scale scenes is presented. The method includes obtaining images from a video stream received from a plurality of video image capturing devices, grouping the images into different image clusters representing a large-scale 3D scene, training a neural radiance field (NeRF) and an uncertainty multilayer perceptron (MLP) for each of the image clusters to generate a plurality of NeRFs and a plurality of uncertainty MLPs for the large-scale 3D scene, applying a rendering loss and an entropy loss to the plurality of NeRFs, performing uncertainty-based fusion on the plurality of NeRFs to define a fused NeRF, jointly fine-tuning the plurality of NeRFs and the plurality of uncertainty MLPs, and, during inference, applying the fused NeRF for novel view synthesis of the large-scale 3D scene.
    Type: Application
    Filed: October 11, 2022
    Publication date: May 18, 2023
    Inventors: Bingbing Zhuang, Samuel Schulter, Yi-Hsuan Tsai, Buyu Liu, Nanbo Li
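
One plausible reading of the uncertainty-based fusion step in this entry is an inverse-variance weighting of the per-cluster NeRF predictions, sketched below. The shapes and the weighting rule are assumptions made for illustration, not the claimed fusion scheme.

```python
# Hedged sketch of uncertainty-based fusion: per-cluster NeRF renderings are
# combined with weights derived from each model's predicted uncertainty.
import torch

def fuse_predictions(colors, uncertainties, eps=1e-6):
    # colors: (num_nerfs, num_rays, 3); uncertainties: (num_nerfs, num_rays)
    weights = 1.0 / (uncertainties + eps)
    weights = weights / weights.sum(dim=0, keepdim=True)
    return (weights.unsqueeze(-1) * colors).sum(dim=0)

colors = torch.rand(4, 1024, 3)        # renderings from 4 cluster NeRFs
sigma2 = torch.rand(4, 1024) + 0.01    # per-ray uncertainties from the MLPs
fused = fuse_predictions(colors, sigma2)
print(fused.shape)  # torch.Size([1024, 3])
```
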
  • Publication number: 20230153572
    Abstract: A computer-implemented method for model training is provided. The method includes receiving, by a hardware processor, sets of images, each set corresponding to a respective task. The method further includes training, by the hardware processor, a task-based neural network classifier having a center and a covariance matrix for each of a plurality of classes in a last layer of the task-based neural network classifier and a plurality of convolutional layers preceding the last layer, by using a similarity between an image feature of a last convolutional layer from among the plurality of convolutional layers and the center and the covariance matrix for a given one of the plurality of classes, the similarity minimizing an impact of a data model forgetting problem.
    Type: Application
    Filed: October 21, 2022
    Publication date: May 18, 2023
    Inventors: Masoud Faraki, Yi-Hsuan Tsai, Xiang Yu, Samuel Schulter, Yumin Suh, Christian Simon
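
The class-center and covariance similarity in this abstract can be illustrated with a Mahalanobis-style score of a last-layer feature against each class, as in the sketch below. The feature dimension, number of classes, and use of full covariance matrices are illustrative assumptions.

```python
# Hedged sketch: score a feature against per-class centers and covariance
# matrices; higher (less negative) score means higher similarity.
import torch

def class_scores(feature, centers, covariances):
    # feature: (d,), centers: (num_classes, d), covariances: (num_classes, d, d)
    scores = []
    for mu, cov in zip(centers, covariances):
        diff = (feature - mu).unsqueeze(-1)                     # (d, 1)
        m2 = diff.transpose(0, 1) @ torch.linalg.inv(cov) @ diff
        scores.append(-m2.squeeze())                            # negated squared distance
    return torch.stack(scores)

d, num_classes = 16, 5
feature = torch.randn(d)
centers = torch.randn(num_classes, d)
covariances = torch.stack([torch.eye(d) for _ in range(num_classes)])
print(class_scores(feature, centers, covariances))
```
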
  • Publication number: 20230088335
    Abstract: Systems and methods are provided for road hazard analysis. The method includes obtaining sensor data of a road environment including a road and observable surroundings, and applying labels to the sensor data. The method further includes training a first neural network model to identify road hazards, training a second neural network model to identify faded lane markings, and training a third neural network model to identify overhanging trees and blocking foliage. The method further includes implementing the trained neural network models to detect road hazards in a real road setting.
    Type: Application
    Filed: September 9, 2022
    Publication date: March 23, 2023
    Inventors: Sparsh Garg, Samuel Schulter, Vijay Kumar Baikampady Gopalkrishna
  • Publication number: 20230081913
    Abstract: Systems and methods are provided for multi-modal test-time adaptation. The method includes inputting a digital image into a pre-trained Camera Intra-modal Pseudo-label Generator, and inputting a point cloud set into a pre-trained Lidar Intra-modal Pseudo-label Generator. The method further includes applying a fast two-dimensional (2D) model, and a slow 2D model, to the inputted digital image to apply pseudo-labels, and applying a fast three-dimensional (3D) model, and a slow 3D model, to the inputted point cloud set to apply pseudo-labels. The method further includes fusing pseudo-label predictions from the fast models and the slow models through an Inter-modal Pseudo-label Refinement module to obtain robust pseudo labels, and measuring a prediction consistency for the pseudo-labels.
    Type: Application
    Filed: September 6, 2022
    Publication date: March 16, 2023
    Inventors: Yi-Hsuan Tsai, Bingbing Zhuang, Samuel Schulter, Buyu Liu, Sparsh Garg, Ramin Moslemi, Inkyu Shin
  • Patent number: 11604943
    Abstract: Systems and methods for domain adaptation for structured output via disentangled representations are provided. The system receives a ground truth of a source domain. The ground truth is used in a task loss function for a first convolutional neural network that predicts at least one output based on inputs from the source domain and a target domain. The system clusters the ground truth of the source domain into a predetermined number of clusters, and predicts, via a second convolutional neural network, a structure of label patches. The structure includes an assignment of each of the at least one output of the first convolutional neural network to the predetermined number of clusters. A cluster loss is computed for the predicted structure of label patches, and an adversarial loss function is applied to the predicted structure of label patches to align the source domain and the target domain on a structural level.
    Type: Grant
    Filed: May 1, 2019
    Date of Patent: March 14, 2023
    Inventors: Yi-Hsuan Tsai, Samuel Schulter, Kihyuk Sohn, Manmohan Chandraker
  • Patent number: 11518382
    Abstract: A method is provided for danger prediction. The method includes generating fully-annotated simulated training data for a machine learning model responsive to receiving a set of computer-selected simulator-adjusting parameters. The method further includes training the machine learning model using reinforcement learning on the fully-annotated simulated training data. The method also includes measuring an accuracy of the trained machine learning model relative to learning a discriminative function for a given task. The discriminative function predicts a given label for a given image from the fully-annotated simulated training data. The method additionally includes adjusting the computer-selected simulator-adjusting parameters and repeating said training and measuring steps responsive to the accuracy being below a threshold accuracy.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: December 6, 2022
    Inventors: Samuel Schulter, Nataniel Ruiz, Manmohan Chandraker
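
The outer loop of this patent (generate simulated data, train, measure, adjust the simulator parameters, repeat until the accuracy threshold is met) can be sketched as below. All helper functions are hypothetical placeholders, and the random parameter perturbation stands in for the reinforcement-learning-driven adjustment described in the abstract.

```python
# Hedged sketch of the train / measure / adjust loop over simulator
# parameters described in the abstract above.
import random

def generate_simulated_data(params):        # placeholder simulator
    return [("image", "label")] * 100

def train_and_measure(dataset):             # placeholder training + evaluation
    return random.uniform(0.0, 1.0)         # stands in for measured accuracy

def adjust_parameters(params):              # placeholder for the RL-driven adjustment
    return {k: v + random.uniform(-0.1, 0.1) for k, v in params.items()}

params = {"lighting": 0.5, "traffic_density": 0.5}   # illustrative simulator parameters
threshold = 0.9
accuracy = 0.0
while accuracy < threshold:
    data = generate_simulated_data(params)
    accuracy = train_and_measure(data)
    if accuracy < threshold:
        params = adjust_parameters(params)
print("final parameters:", params)
```
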
  • Patent number: 11468680
    Abstract: A method for performing video domain adaptation for human action recognition is presented. The method includes using annotated source data from a source video and unannotated target data from a target video in an unsupervised domain adaptation setting, identifying and aligning discriminative clips in the source and target videos via an attention mechanism, and learning spatial-background invariant human action representations by employing a self-supervised clip order prediction loss for both the annotated source data and the unannotated target data.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: October 11, 2022
    Inventors: Gaurav Sharma, Samuel Schulter, Jinwoo Choi
  • Patent number: 11468591
    Abstract: Systems and methods for road typology scene annotation are provided. A method for road typology scene annotation includes receiving an image having a road scene. The image is received from an imaging device. The method populates, using a machine learning model, a set of attribute settings with values representing the road scene. An annotation interface is implemented and configured to adjust values of the attribute settings to correspond with the road scene. Based on the values of the attribute settings, a simulated overhead view of the respective road scene is generated.
    Type: Grant
    Filed: June 2, 2020
    Date of Patent: October 11, 2022
    Inventor: Samuel Schulter
  • Patent number: 11462112
    Abstract: A method is provided in an Advanced Driver-Assistance System (ADAS). The method extracts, from an input video stream including a plurality of images using a multi-task Convolutional Neural Network (CNN), shared features across different perception tasks. The perception tasks include object detection and other perception tasks. The method concurrently solves, using the multi-task CNN, the different perception tasks in a single pass by concurrently processing corresponding ones of the shared features by respective different branches of the multi-task CNN to provide a plurality of different perception task outputs. Each respective different branch corresponds to a respective one of the different perception tasks. The method forms a parametric representation of a driving scene as at least one top-view map responsive to the plurality of different perception task outputs.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: October 4, 2022
    Inventors: Quoc-Huy Tran, Samuel Schulter, Paul Vernaza, Buyu Liu, Pan Ji, Yi-Hsuan Tsai, Manmohan Chandraker
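
The shared-backbone, multi-branch layout in this entry is a common pattern; a minimal sketch follows. The layer sizes, the specific heads, and their output channels are illustrative assumptions rather than the patented architecture.

```python
# Hedged sketch of a multi-task CNN: one pass through shared convolutional
# features, then separate heads for detection and other perception tasks.
import torch
import torch.nn as nn

class MultiTaskCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(                    # shared feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.detection_head = nn.Conv2d(32, 6, 1)       # e.g. boxes + objectness
        self.segmentation_head = nn.Conv2d(32, 19, 1)   # e.g. semantic classes
        self.depth_head = nn.Conv2d(32, 1, 1)           # e.g. per-pixel depth

    def forward(self, images):
        feats = self.shared(images)                     # single pass, shared features
        return {
            "detection": self.detection_head(feats),
            "segmentation": self.segmentation_head(feats),
            "depth": self.depth_head(feats),
        }

outputs = MultiTaskCNN()(torch.randn(1, 3, 128, 128))
print({k: tuple(v.shape) for k, v in outputs.items()})
```
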
  • Patent number: 11455813
    Abstract: Systems and methods are provided for producing a road layout model. The method includes capturing digital images having a perspective view, converting each of the digital images into top-down images, and conveying a top-down image of time t to a neural network that performs a feature transform to form a feature map of time t. The method also includes transferring the feature map of the top-down image of time t to a feature transform module to warp the feature map to a time t+1, and conveying a top-down image of time t+1 to form a feature map of time t+1. The method also includes combining the warped feature map of time t with the feature map of time t+1 to form a combined feature map, transferring the combined feature map to a long short-term memory (LSTM) module to generate the road layout model, and displaying the road layout model.
    Type: Grant
    Filed: November 12, 2020
    Date of Patent: September 27, 2022
    Inventors: Buyu Liu, Bingbing Zhuang, Samuel Schulter, Manmohan Chandraker
  • Patent number: 11373067
    Abstract: A method for implementing parametric models for scene representation to improve autonomous task performance includes generating an initial map of a scene based on at least one image corresponding to a perspective view of the scene, the initial map including a non-parametric top-view representation of the scene, implementing a parametric model to obtain a scene element representation based on the initial map, the scene element representation providing a description of one or more scene elements of the scene and corresponding to an estimated semantic layout of the scene, identifying one or more predicted locations of the one or more scene elements by performing three-dimensional localization based on the at least one image, and obtaining an overlay for performing an autonomous task by placing the one or more scene elements with the one or more respective predicted locations onto the scene element representation.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: June 28, 2022
    Inventors: Samuel Schulter, Ziyan Wang, Buyu Liu, Manmohan Chandraker