Patents by Inventor Rares A. Ambrus

Rares A. Ambrus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240153197
    Abstract: An example method includes generating embeddings of image data that includes multiple images, where each image has a different viewpoint of a scene, generating a latent space and a decoder, wherein the decoder receives embeddings as input to generate an output viewpoint, for each viewpoint in the image data, determining a volumetric rendering view synthesis loss and a multi-view photometric loss, and applying an optimization algorithm to the latent space and the decoder over a number of epochs until the volumetric rendering view synthesis loss is within a volumetric threshold and the multi-view photometric loss is within a multi-view threshold.
    Type: Application
    Filed: August 3, 2023
    Publication date: May 9, 2024
    Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
  • Publication number: 20240153107
    Abstract: Systems and methods for performing three-dimensional multi-object tracking are disclosed herein. In one example, a method includes the steps of determining a residual based on augmented current frame detection bounding boxes, augmented previous frame detection bounding boxes, augmented current frame shape descriptors, and augmented previous frame shape descriptors and predicting an affinity matrix using the residual. The residual indicates a spatiotemporal and shape similarity between current detections in a current frame point cloud data and previous detections in a previous frame point cloud data. The affinity matrix indicates associations between the previous detections and the current detections, as well as the augmented anchors.
    Type: Application
    Filed: May 10, 2023
    Publication date: May 9, 2024
    Applicants: Toyota Research Institute, Inc., The Board of Trustees of the Leland Stanford Junior University, Toyota Jidosha Kabushiki Kaisha
    Inventors: Jie Li, Rares A. Ambrus, Taraneh Sadjadpour, Christin Jeannette Bohg
  • Patent number: 11966234
    Abstract: A method for controlling an ego agent includes capturing a two-dimensional (2D) image of an environment adjacent to the ego agent. The method also includes generating a semantically segmented image of the environment based on the 2D image. The method further includes generating a depth map of the environment based on the semantically segmented image. The method additionally includes generating a three-dimensional (3D) estimate of the environment based on the depth map. The method also includes controlling an action of the ego agent based on the identified location.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: April 23, 2024
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Jie Li, Rares A. Ambrus, Sudeep Pillai, Adrien Gaidon
  • Patent number: 11948310
    Abstract: Systems and methods described herein relate to jointly training a machine-learning-based monocular optical flow, depth, and scene flow estimator. One embodiment processes a pair of temporally adjacent monocular image frames using a first neural network structure to produce a first optical flow estimate; processes the pair of temporally adjacent monocular image frames using a second neural network structure to produce an estimated depth map and an estimated scene flow; processes the estimated depth map and the estimated scene flow using the second neural network structure to produce a second optical flow estimate; and imposes a consistency loss between the first optical flow estimate and the second optical flow estimate that minimizes a difference between the first optical flow estimate and the second optical flow estimate to improve performance of the first neural network structure in estimating optical flow and the second neural network structure in estimating depth and scene flow.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: April 2, 2024
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Kuan-Hui Lee, Adrien David Gaidon
  • Patent number: 11948309
    Abstract: Systems and methods described herein relate to jointly training a machine-learning-based monocular optical flow, depth, and scene flow estimator. One embodiment processes a pair of temporally adjacent monocular image frames using a first neural network structure to produce an optical flow estimate and to extract, from at least one image frame in the pair of temporally adjacent monocular image frames, a set of encoded image context features; triangulates the optical flow estimate to generate a depth map; extracts a set of encoded depth context features from the depth map using a depth context encoder; and combines the set of encoded image context features and the set of encoded depth context features to improve performance of a second neural network structure in estimating depth and scene flow.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: April 2, 2024
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Kuan-Hui Lee, Adrien David Gaidon
  • Patent number: 11915487
    Abstract: Systems and methods to improve machine learning by explicitly over-fitting environmental data obtained by an imaging system, such as a monocular camera, are disclosed. The system includes training self-supervised depth and pose networks on monocular visual data collected from a certain area over multiple passes. Pose and depth networks may be trained by extracting data from multiple images of a single environment or trajectory, allowing the system to overfit the image data.
    Type: Grant
    Filed: May 5, 2020
    Date of Patent: February 27, 2024
    Assignee: Toyota Research Institute, Inc.
    Inventors: Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Adrien David Gaidon
  • Patent number: 11900626
    Abstract: A method for learning depth-aware keypoints and associated descriptors from monocular video for ego-motion estimation is described. The method includes training a keypoint network and a depth network to learn depth-aware keypoints and the associated descriptors. The training is based on a target image and a context image from successive images of the monocular video. The method also includes lifting 2D keypoints from the target image to learn 3D keypoints based on a learned depth map from the depth network. The method further includes estimating ego-motion from the target image to the context image based on the learned 3D keypoints.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: February 13, 2024
    Assignee: Toyota Research Institute, Inc.
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Adrien David Gaidon
  • Patent number: 11891094
    Abstract: Information that identifies a location can be received. In response to a receipt of the information that identifies the location, a file can be retrieved. The file can be for the location. The file can include image data and a set of node data. The set of node data can include information that identifies nodes in a neural network, information that identifies inputs of the nodes, and values of weights to be applied to the inputs. In response to a retrieval of the file, the weights can be applied to the inputs of the nodes and the image data can be received for the neural network. In response to an application of the weights and a receipt of the image data, the neural network can be executed to produce a digital map for the location. The digital map for the location can be transmitted to an automotive navigation system.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: February 6, 2024
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
  • Patent number: 11887248
    Abstract: Systems and methods described herein relate to reconstructing a scene in three dimensions from a two-dimensional image. One embodiment processes an image using a detection transformer to detect an object in the scene and to generate a NOCS map of the object and a background depth map; uses MLPs to relate the object to a differentiable database of object priors (PriorDB); recovers, from the NOCS map, a partial 3D object shape; estimates an initial object pose; fits a PriorDB object prior to align in geometry and appearance with the partial 3D shape to produce a complete shape and refines the initial pose estimate; generates an editable and re-renderable 3D scene reconstruction based, at least in part, on the complete shape, the refined pose estimate, and the depth map; and controls the operation of a robot based, at least in part, on the editable and re-renderable 3D scene reconstruction.
    Type: Grant
    Filed: March 16, 2022
    Date of Patent: January 30, 2024
    Assignees: Toyota Research Institute, Inc., Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
    Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Dennis Park, Joshua Tenenbaum, Jiajun Wu, Fredo Durand, Vincent Sitzmann
  • Publication number: 20240029286
    Abstract: A method of generating additional supervision data to improve learning of a geometrically-consistent latent scene representation with a geometric scene representation architecture is provided. The method includes receiving, with a computing device, a latent scene representation encoding a pointcloud from images of a scene captured by a plurality of cameras each with known intrinsics and poses, generating a virtual camera having a viewpoint different from viewpoints of the plurality of cameras, projecting information from the pointcloud onto the viewpoint of the virtual camera, and decoding the latent scene representation based on the virtual camera thereby generating an RGB image and depth map corresponding to the viewpoint of the virtual camera for implementation as additional supervision data.
    Type: Application
    Filed: February 16, 2023
    Publication date: January 25, 2024
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, Toyota Technological Institute at Chicago
    Inventors: Vitor Guizilini, Igor Vasiljevic, Adrien D. Gaidon, Jiading Fang, Gregory Shakhnarovich, Matthew R. Walter, Rares A. Ambrus
  • Publication number: 20240028792
    Abstract: The disclosure provides implicit representations for multi-object 3D shape, 6D pose and size, and appearance optimization, including obtaining shape, 6D pose and size, and appearance codes. Training is employed using shape and appearance priors from an implicit joint differential database. 2D masks are also obtained and are used in an optimization process that utilizes a combined loss minimizing function and an Octree-based coarse-to-fine differentiable optimization to jointly optimize the latest shape, appearance, pose and size, and 2D masks. An object surface is recovered from the latest shape codes to a desired resolution level. The database represents shapes as Signed Distance Fields (SDF), and appearance as Texture Fields (TF).
    Type: Application
    Filed: July 19, 2022
    Publication date: January 25, 2024
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Muhammad Zubair Irshad, Sergey Zakharov, Rares A. Ambrus, Adrien D. Gaidon
  • Publication number: 20240013409
    Abstract: A method for multiple object tracking includes receiving, with a computing device, a point cloud dataset, detecting one or more objects in the point cloud dataset, each of the detected one or more objects defined by points of the point cloud dataset and a bounding box, querying one or more historical tracklets for historical tracklet states corresponding to each of the one or more detected objects, implementing a 4D encoding backbone comprising two branches: a first branch configured to compute per-point features for each of the one or more objects and the corresponding historical tracklet states, and a second branch configured to obtain 4D point features, concatenating the per-point features and the 4D point features, and predicting, with a decoder receiving the concatenated per-point features, current tracklet states for each of the one or more objects.
    Type: Application
    Filed: May 26, 2023
    Publication date: January 11, 2024
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, The Board of Trustees of the Leland Stanford Junior University
    Inventors: Colton Stearns, Jie Li, Rares A. Ambrus, Vitor Campagnolo Guizilini, Sergey Zakharov, Adrien D. Gaidon, Davis Rempe, Tolga Birdal, Leonidas J. Guibas
  • Patent number: 11868439
    Abstract: Systems, methods, and other embodiments described herein relate to training a multi-task network using real and virtual data. In one embodiment, a method includes acquiring training data that includes real data and virtual data for training a multi-task network that performs at least depth prediction and semantic segmentation. The method includes generating a first output from the multi-task network using the real data and second output from the multi-task network using the virtual data. The method includes generating a mixed loss by analyzing the first output to produce a real loss and the second output to produce a virtual loss. The method includes updating the multi-task network using the mixed loss.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: January 9, 2024
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Adrien David Gaidon, Jie Li, Rares A. Ambrus
  • Publication number: 20240005540
    Abstract: Systems, methods, and other embodiments described herein relate to an improved approach to training a depth model to derive depth estimates from monocular images using cost volumes. In one embodiment, a method includes predicting, using a depth model, depth values from at least one input image that is a monocular image. The method includes generating a cost volume by sampling the depth values corresponding to bins of the cost volume. The method includes determining loss values for the bins of the cost volume. The method includes training the depth model according to the loss values of the cost volume.
    Type: Application
    Filed: May 27, 2022
    Publication date: January 4, 2024
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
  • Patent number: 11854280
    Abstract: A method for 3D object detection is described. The method includes detecting semantic keypoints from monocular images of a video stream capturing a 3D object. The method also includes inferring a 3D bounding box of the 3D object corresponding to the detected semantic vehicle keypoints. The method further includes scoring the inferred 3D bounding box of the 3D object. The method also includes detecting the 3D object according to a final 3D bounding box generated based on the scoring of the inferred 3D bounding box.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: December 26, 2023
    Assignee: Toyota Research Institute, Inc.
    Inventors: Arjun Bhargava, Haofeng Chen, Adrien David Gaidon, Rares A. Ambrus, Sudeep Pillai
  • Publication number: 20230386060
    Abstract: Systems, methods, and other embodiments described herein relate to an improved approach to training a depth model to derive depth estimates from monocular images using histograms to assess photometric losses. In one embodiment, a method includes determining loss values according to a photometric loss function. The loss values are associated with a depth map derived from an input image that is a monocular image. The method includes generating histograms for the loss values corresponding to different regions of a target image. The method includes, responsive to identifying erroneous values of the loss values, masking the erroneous values to avoid considering the erroneous values during training of the depth model.
    Type: Application
    Filed: May 27, 2022
    Publication date: November 30, 2023
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
  • Publication number: 20230386059
    Abstract: Systems, methods, and other embodiments described herein relate to an improved approach to training a depth model for monocular depth estimation by warping depth features prior to decoding. In one embodiment, a method includes encoding, using an encoder of a depth model, a source image into depth features of a scene depicted by the source image. The method includes warping the depth features into warped features of a target frame of a target image associated with the source image. The method includes decoding, using a decoder of the depth model, the warped features into a depth map. The method includes training the depth model according to a loss derived from the depth map.
    Type: Application
    Filed: May 27, 2022
    Publication date: November 30, 2023
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
  • Publication number: 20230377180
    Abstract: In accordance with one embodiment of the present disclosure, a method includes receiving a set of images, each image depicting a view of a scene, generating sparse depth data from each image of the set of images, training a monocular depth estimation model with the sparse depth data, generating, with the trained monocular depth estimation model, depth data and uncertainty data for each image, training a NeRF model with the set of images, wherein the training is constrained by the depth data and uncertainty data, and rendering, with the trained NeRF model, a new image having a new view of the scene.
    Type: Application
    Filed: May 18, 2022
    Publication date: November 23, 2023
    Applicant: Toyota Research Institute, Inc.
    Inventors: Rares Ambrus, Sergey Zakharov, Vitor C. Guizilini, Adrien Gaidon
  • Patent number: 11822621
    Abstract: Systems and methods described herein relate to training a machine-learning-based monocular depth estimator.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: November 21, 2023
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon, Jie Li
  • Patent number: 11798288
    Abstract: Described are systems and methods for self-learned label refinement of a training set. In one example, a system includes a processor and a memory having a training set generation module that causes the processor to train a model using an image as an input to the model and 2D bounding boxes based on 3D bounding boxes as ground truths, select a first subset from predicted 2D bounding boxes previously outputted by the model, retrain the model using the image as the input and the first subset as ground truths, select a second subset of predicted 2D bounding boxes previously outputted by the model, and generate the training set by selecting the 3D bounding boxes from a master set of 3D bounding boxes that have corresponding 2D bounding boxes that form the second subset.
    Type: Grant
    Filed: May 25, 2021
    Date of Patent: October 24, 2023
    Assignee: Toyota Research Institute, Inc.
    Inventors: Dennis Park, Rares A. Ambrus, Vitor Guizilini, Jie Li, Adrien David Gaidon