Patents by Inventor Igor Vasiljevic
Igor Vasiljevic has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12293548
Abstract: Systems, methods, and other embodiments described herein relate to estimating scaled depth maps by sampling variational representations of an image using a learning model. In one embodiment, a method includes encoding data embeddings by a learning model to form conditioned latent representations using attention networks, the data embeddings including features about an image from a camera and calibration information about the camera. The method also includes computing a probability distribution of the conditioned latent representations by factoring scale priors. The method also includes sampling the probability distribution to generate variations for the data embeddings. The method also includes estimating scaled depth maps of a scene from the variations at different coordinates using the attention networks.
Type: Grant
Filed: October 13, 2023
Date of Patent: May 6, 2025
Assignees: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Campagnolo Guizilini, Igor Vasiljevic, Dian Chen, Adrien David Gaidon, Rares A. Ambrus
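As a rough illustration of the sampling step this abstract describes, the hedged PyTorch sketch below conditions image tokens on calibration tokens with cross-attention, draws a reparameterized sample from the conditioned distribution, and decodes per-token scaled depth. The module layout, names, and dimensions are assumptions for illustration, not the patented architecture.

```python
import torch
import torch.nn as nn

class VariationalDepthHead(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        # Cross-attention conditions image features on camera calibration.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=8, batch_first=True)
        self.to_mu = nn.Linear(feat_dim, feat_dim)
        self.to_logvar = nn.Linear(feat_dim, feat_dim)
        self.to_depth = nn.Linear(feat_dim, 1)

    def forward(self, image_tokens, calib_tokens):
        # Conditioned latent representations from image and calibration embeddings.
        cond, _ = self.attn(image_tokens, calib_tokens, calib_tokens)
        mu, logvar = self.to_mu(cond), self.to_logvar(cond)
        # Reparameterized sample from the conditioned distribution.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Positive per-token depth via softplus.
        return nn.functional.softplus(self.to_depth(z))

head = VariationalDepthHead()
depth = head(torch.randn(1, 100, 256), torch.randn(1, 4, 256))
print(depth.shape)  # torch.Size([1, 100, 1])
```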
-
Publication number: 20250095380
Abstract: Systems and methods described herein relate to self-supervised scale-aware learning of camera extrinsic parameters. One embodiment processes the instantaneous velocity between a target image and a context image captured by a first camera; jointly trains a depth network and a pose network based on scaling by the instantaneous velocity; produces a depth map using the depth network; produces the ego-motion of the first camera using the pose network; generates a synthesized image from the target image using a reprojection operation based on the depth map, the ego-motion, the context image, and the camera intrinsics; determines a photometric loss by comparing the synthesized image to the target image; generates a photometric consistency constraint using a gradient from the photometric loss; determines a pose consistency constraint between the first camera and a second camera; and optimizes the photometric consistency constraint, the pose consistency constraint, the depth network, and the pose network to generate estimated extrinsic parameters.
Type: Application
Filed: September 18, 2023
Publication date: March 20, 2025
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Takayuki Kanai, Vitor Campagnolo Guizilini, Rares A. Ambrus, Adrien Gaidon, Igor Vasiljevic
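A minimal sketch of the scale-recovery idea in this abstract: the predicted pose translation is rescaled so its magnitude matches the metric displacement implied by the instantaneous velocity, and a simple L1 term compares the synthesized image to the target. Function names are hypothetical, and the full method's SSIM term and reprojection machinery are omitted.

```python
import torch

def scale_translation(pred_translation, speed, dt):
    # Rescale the predicted translation so its magnitude matches the metric
    # displacement implied by the instantaneous velocity (speed * dt).
    norm = pred_translation.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    return pred_translation * (speed * dt) / norm

def l1_photometric_loss(synthesized, target):
    # L1 component of the photometric loss between synthesized and target images.
    return (synthesized - target).abs().mean()

t = scale_translation(torch.tensor([[0.3, 0.0, 0.9]]), speed=10.0, dt=0.1)
print(t.norm())  # ~1.0 meter of travel at 10 m/s over 0.1 s
```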
-
Patent number: 12175708
Abstract: Systems and methods described herein relate to self-supervised learning of camera intrinsic parameters from a sequence of images. One embodiment produces a depth map from a current image frame captured by a camera; generates a point cloud from the depth map using a differentiable unprojection operation; produces a camera pose estimate from the current image frame and a context image frame; produces a warped point cloud based on the camera pose estimate; generates a warped image frame from the warped point cloud using a differentiable projection operation; compares the warped image frame with the context image frame to produce a self-supervised photometric loss; updates a set of estimated camera intrinsic parameters on a per-image-sequence basis using one or more gradients from the self-supervised photometric loss; and generates, based on a converged set of learned camera intrinsic parameters, a rectified image frame from an image frame captured by the camera.
Type: Grant
Filed: March 11, 2022
Date of Patent: December 24, 2024
Assignees: Toyota Research Institute, Inc., Toyota Technological Institute at Chicago
Inventors: Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Igor Vasiljevic, Jiading Fang, Gregory Shakhnarovich, Matthew R. Walter
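The differentiable unprojection and projection operations mentioned here follow the standard pinhole relations x = (u - cx) d / fx and y = (v - cy) d / fy. The sketch below uses a plain pinhole model (fx, fy, cx, cy) as a stand-in for the learned intrinsics; it illustrates the geometry, not the patented training loop.

```python
import torch

def unproject(depth, fx, fy, cx, cy):
    # Lift each pixel (u, v) with depth d to a 3D point (X, Y, Z).
    h, w = depth.shape
    v, u = torch.meshgrid(torch.arange(h, dtype=depth.dtype),
                          torch.arange(w, dtype=depth.dtype), indexing="ij")
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return torch.stack((x, y, depth), dim=-1)  # (H, W, 3) point cloud

def project(points, fx, fy, cx, cy):
    # Map 3D points back to pixel coordinates (the differentiable projection).
    z = points[..., 2].clamp(min=1e-6)
    u = points[..., 0] / z * fx + cx
    v = points[..., 1] / z * fy + cy
    return torch.stack((u, v), dim=-1)

cloud = unproject(torch.ones(480, 640), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
pixels = project(cloud, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```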
-
Publication number: 20240354991
Abstract: Systems, methods, and other embodiments described herein relate to estimating scaled depth maps by sampling variational representations of an image using a learning model. In one embodiment, a method includes encoding data embeddings by a learning model to form conditioned latent representations using attention networks, the data embeddings including features about an image from a camera and calibration information about the camera. The method also includes computing a probability distribution of the conditioned latent representations by factoring scale priors. The method also includes sampling the probability distribution to generate variations for the data embeddings. The method also includes estimating scaled depth maps of a scene from the variations at different coordinates using the attention networks.
Type: Application
Filed: October 13, 2023
Publication date: October 24, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Campagnolo Guizilini, Igor Vasiljevic, Dian Chen, Adrien David Gaidon, Rares A. Ambrus
-
Publication number: 20240354973
Abstract: Systems, methods, and other embodiments described herein relate to augmenting image embeddings using derived geometries for estimating scaled depth. In one embodiment, a method includes generating a geometric viewing vector using pixel coordinates and intrinsic parameters of a camera for an image captured of a scene. The method also includes deriving geometric embeddings from the geometric viewing vector associated with the image for the camera. The method also includes computing a representation by augmenting image embeddings with the geometric embeddings, the image embeddings associated with visual characteristics of the image. The method also includes estimating a scaled depth of the image from the representation.
Type: Application
Filed: September 12, 2023
Publication date: October 24, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Campagnolo Guizilini, Igor Vasiljevic, Dian Chen, Adrien David Gaidon, Rares A. Ambrus
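A geometric viewing vector of the kind described is conventionally the per-pixel ray direction obtained by mapping homogeneous pixel coordinates through the inverse intrinsics. A sketch under that assumption (the downstream embedding MLP is left out and is hypothetical):

```python
import torch

def viewing_vectors(h, w, K):
    # Homogeneous pixel grid mapped through K^{-1}, normalized to unit rays.
    v, u = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                          torch.arange(w, dtype=torch.float32), indexing="ij")
    pix = torch.stack((u, v, torch.ones_like(u)), dim=-1)  # (H, W, 3)
    rays = pix @ torch.linalg.inv(K).T
    return rays / rays.norm(dim=-1, keepdim=True)

K = torch.tensor([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
rays = viewing_vectors(480, 640, K)  # pass through an MLP to get embeddings
```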
-
Publication number: 20240354974
Abstract: Systems, methods, and other embodiments described herein relate to augmenting an image frame during training in a way that enhances scene geometries and transformation capabilities for depth prediction. In one embodiment, a method includes generating rays with camera intrinsics to form a grid for an image frame. The method also includes injecting noise, by an encoder during training of a learning model, to individually perturb pixels within pixel boundaries for the rays, the pixel boundaries defined by the grid. The method also includes removing a subset of the rays randomly by the encoder and extracting features from the rays. The method also includes comparing scaled depth estimates to a ground truth for a grid resolution using the features and adjusting the learning model from the comparison.
Type: Application
Filed: November 17, 2023
Publication date: October 24, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Campagnolo Guizilini, Igor Vasiljevic, Dian Chen, Adrien David Gaidon, Rares Andrei Ambrus
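The two augmentations named in this abstract, per-pixel jitter within cell boundaries and random ray removal, can be sketched in a few lines. The function below is a hedged illustration with an assumed keep probability, not the patented encoder.

```python
import torch

def jitter_and_drop_rays(u, v, keep_prob=0.75):
    # Perturb each (float) pixel coordinate inside its own cell, [-0.5, 0.5).
    u_j = u + torch.rand_like(u) - 0.5
    v_j = v + torch.rand_like(v) - 0.5
    # Randomly remove a subset of rays so training sees sparse, jittered grids.
    keep = torch.rand(u.shape) < keep_prob
    return u_j[keep], v_j[keep]

v, u = torch.meshgrid(torch.arange(480.0), torch.arange(640.0), indexing="ij")
u_aug, v_aug = jitter_and_drop_rays(u, v)
```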
-
Publication number: 20240331268
Abstract: Systems, methods, and other embodiments described herein relate to generating an image by interpolating features estimated from a learning model. In one embodiment, a method includes sampling three-dimensional (3D) points of a light ray that crosses a frustum space associated with a single-view camera, the 3D points reflecting depth estimates derived from data that the single-view camera generates for a scene. The method also includes deriving feature values for the 3D points using tri-linear interpolation across feature planes of the frustum space, the feature planes being estimated by a learning model. The method also includes inferring an image in two dimensions (2D) by translating the feature values and compositing the data with volumetric rendering for the scene. The method also includes executing a control task by a controller using the image.
Type: Application
Filed: March 29, 2023
Publication date: October 3, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, Toyota Technological Institute at Chicago
Inventors: Jiading Fang, Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Gregory Shakhnarovich, Matthew R. Walter, Adrien David Gaidon
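Tri-linear interpolation of ray-point features can be expressed directly with PyTorch's grid_sample on a 5-D tensor. The snippet below is a sketch in which a dense (D, H, W) feature volume stands in for the frustum feature planes; shapes are illustrative.

```python
import torch
import torch.nn.functional as F

# A (N, C, D, H, W) feature volume stands in for the frustum feature planes.
features = torch.randn(1, 32, 16, 48, 64)
# 128 sampled ray points with normalized (x, y, z) coordinates in [-1, 1].
points = torch.rand(1, 1, 1, 128, 3) * 2 - 1
# On 5-D input, mode="bilinear" performs tri-linear interpolation.
sampled = F.grid_sample(features, points, mode="bilinear", align_corners=True)
print(sampled.shape)  # torch.Size([1, 32, 1, 1, 128])
```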
-
Publication number: 20240249465
Abstract: Systems and methods for enhanced computer vision capabilities, particularly depth synthesis, which may be applicable to autonomous vehicle operation, are described. A vehicle may be equipped with a geometric scene representation (GSR) architecture for synthesizing depth views at arbitrary viewpoints. The synthesized depth views enable advanced functions, including depth interpolation and depth extrapolation, which are useful for various computer vision applications for autonomous vehicles, such as predicting depth maps from unseen locations. For example, a vehicle includes a processor device synthesizing depth views at multiple viewpoints, where the multiple viewpoints are derived from image data of the surrounding environment of the vehicle.
Type: Application
Filed: January 19, 2023
Publication date: July 25, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, Toyota Technological Institute at Chicago
Inventors: Vitor Guizilini, Igor Vasiljevic, Adrien D. Gaidon, Greg Shakhnarovich, Matthew Walter, Jiading Fang, Rares A. Ambrus
-
Patent number: 12033341
Abstract: A method for scale-aware depth estimation using multi-camera projection loss is described. The method includes determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes training a scale-aware depth estimation model and an ego-motion estimation model according to the multi-camera photometric loss. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the scale-aware depth estimation model and the ego-motion estimation model. The method also includes planning a vehicle control action of the ego vehicle according to the 360° point cloud of the scene surrounding the ego vehicle.
Type: Grant
Filed: July 30, 2021
Date of Patent: July 9, 2024
Assignees: Toyota Research Institute, Inc., Toyota Technological Institute at Chicago
Inventors: Vitor Guizilini, Rares Andrei Ambrus, Adrien David Gaidon, Igor Vasiljevic, Gregory Shakhnarovich
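A multi-camera photometric loss, in its simplest form, averages per-camera image reconstruction error over every camera in the rig. The sketch below shows only the L1 component; the full method also typically includes a structural similarity (SSIM) term and the reprojection step that produces the synthesized images.

```python
import torch

def multi_camera_photometric_loss(synthesized, target):
    # synthesized/target: (num_cams, 3, H, W); average the per-camera L1 terms.
    per_camera = (synthesized - target).abs().mean(dim=(1, 2, 3))
    return per_camera.mean()
```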
-
Publication number: 20240161471
Abstract: Systems and methods described herein support enhanced computer vision capabilities which may be applicable to, for example, autonomous vehicle operation. An example method includes generating, through training, a shared latent space based on (i) image data that includes multiple images, where each image has a different viewing frame of a scene, and (ii) first and second types of embeddings, and training a decoder based on the first type of embeddings. The method also includes generating an embedding based on the first type of embeddings that is representative of a novel viewing frame of the scene, decoding, with the decoder, the shared latent space using cross-attention with the generated embedding, and generating the novel viewing frame of the scene based on an output of the decoder.
Type: Application
Filed: August 3, 2023
Publication date: May 16, 2024
Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
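Decoding a shared latent space "using cross-attention with the generated embedding" maps naturally onto a cross-attention layer whose queries are the novel-view embedding and whose keys and values are the latent tokens. The module below is a hedged sketch of that pattern; layer choices and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class CrossAttentionDecoder(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.to_rgb = nn.Linear(dim, 3)

    def forward(self, query_embedding, latent):
        # Queries describe the novel viewing frame; keys/values are the
        # shared latent space produced during training.
        out, _ = self.attn(query_embedding, latent, latent)
        return self.to_rgb(out)  # one RGB value per query token

decoder = CrossAttentionDecoder()
rgb = decoder(torch.randn(1, 4096, 256), torch.randn(1, 512, 256))
```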
-
Publication number: 20240161389
Abstract: Systems and methods described herein support enhanced computer vision capabilities which may be applicable to, for example, autonomous vehicle operation. An example method includes generating a latent space and a decoder based on image data that includes multiple images, where each image has a different viewing frame of a scene. The method also includes generating a volumetric embedding that is representative of a novel viewing frame of the scene. The method includes decoding, with the decoder, the latent space using cross-attention with the volumetric embedding, and generating a novel viewing frame of the scene based on an output of the decoder.
Type: Application
Filed: August 3, 2023
Publication date: May 16, 2024
Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
-
Publication number: 20240161510
Abstract: Systems and methods described herein support enhanced computer vision capabilities which may be applicable to, for example, autonomous vehicle operation. An example method includes training a shared latent space and a first decoder based on first image data that includes multiple images of a first scene, and training the shared latent space and a second decoder based on second image data that includes multiple images. The method also includes generating a volumetric embedding that is representative of a novel viewing frame of the first scene. Further, the method includes decoding, with the first decoder, the shared latent space with the volumetric embedding, and generating the novel viewing frame of the first scene based on the output of the first decoder.
Type: Application
Filed: August 3, 2023
Publication date: May 16, 2024
Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
-
Publication number: 20240153197
Abstract: An example method includes generating embeddings of image data that includes multiple images, where each image has a different viewpoint of a scene; generating a latent space and a decoder, wherein the decoder receives embeddings as input to generate an output viewpoint; for each viewpoint in the image data, determining a volumetric rendering view synthesis loss and a multi-view photometric loss; and applying an optimization algorithm to the latent space and the decoder over a number of epochs until the volumetric rendering view synthesis loss is within a volumetric threshold and the multi-view photometric loss is within a multi-view threshold.
Type: Application
Filed: August 3, 2023
Publication date: May 9, 2024
Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
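The stopping rule this abstract describes, optimizing until both losses fall within their thresholds, can be sketched as a plain training loop. The model, optimizer, and losses_fn below are placeholders; the threshold values are illustrative assumptions.

```python
import torch

def optimize(model, optimizer, batches, losses_fn,
             vol_thresh=1e-3, photo_thresh=1e-3, max_epochs=100):
    for epoch in range(max_epochs):
        for batch in batches:
            optimizer.zero_grad()
            vol_loss, photo_loss = losses_fn(model, batch)
            (vol_loss + photo_loss).backward()
            optimizer.step()
        # Stop once both losses are within their respective thresholds.
        if vol_loss.item() < vol_thresh and photo_loss.item() < photo_thresh:
            return epoch
    return max_epochs
```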
-
Publication number: 20240029286
Abstract: A method of generating additional supervision data to improve learning of a geometrically-consistent latent scene representation with a geometric scene representation architecture is provided. The method includes receiving, with a computing device, a latent scene representation encoding a pointcloud from images of a scene captured by a plurality of cameras each with known intrinsics and poses, generating a virtual camera having a viewpoint different from viewpoints of the plurality of cameras, projecting information from the pointcloud onto the viewpoint of the virtual camera, and decoding the latent scene representation based on the virtual camera thereby generating an RGB image and depth map corresponding to the viewpoint of the virtual camera for implementation as additional supervision data.
Type: Application
Filed: February 16, 2023
Publication date: January 25, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, Toyota Technological Institute at Chicago
Inventors: Vitor Guizilini, Igor Vasiljevic, Adrien D. Gaidon, Jiading Fang, Gregory Shakhnarovich, Matthew R. Walter, Rares A. Ambrus
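Projecting pointcloud information onto a virtual camera's viewpoint reduces to a rigid transform followed by a pinhole projection. The sketch below assumes a world-to-camera pose (R, t) and intrinsics K for the virtual camera; it shows the geometry only, not the decoding step.

```python
import torch

def project_to_virtual_camera(points, R, t, K):
    # points: (N, 3) pointcloud; (R, t) is the virtual camera's world-to-camera
    # pose and K its intrinsics; returns pixel locations and per-point depth.
    cam = points @ R.T + t
    z = cam[:, 2:3].clamp(min=1e-6)
    pix = (cam / z) @ K.T
    return pix[:, :2], z.squeeze(1)
```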
-
Patent number: 11875521
Abstract: A method for self-supervised depth and ego-motion estimation is described. The method includes determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes generating a self-occlusion mask by manually segmenting self-occluded areas of images captured by the multi-camera rig of the ego vehicle. The method further includes multiplying the multi-camera photometric loss with the self-occlusion mask to form a self-occlusion masked photometric loss. The method also includes training a depth estimation model and an ego-motion estimation model according to the self-occlusion masked photometric loss. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the depth estimation model and the ego-motion estimation model.
Type: Grant
Filed: July 26, 2021
Date of Patent: January 16, 2024
Assignee: Toyota Research Institute, Inc.
Inventors: Vitor Guizilini, Rares Andrei Ambrus, Adrien David Gaidon, Igor Vasiljevic, Gregory Shakhnarovich
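Multiplying the photometric loss by a self-occlusion mask zeroes out pixels where the vehicle occludes its own cameras. A minimal sketch, assuming a per-pixel loss map and a binary mask as inputs:

```python
import torch

def masked_photometric_loss(per_pixel_loss, mask):
    # mask: 1 for valid pixels, 0 where the rig occludes itself; masked
    # pixels contribute nothing to the photometric loss.
    return (per_pixel_loss * mask).sum() / mask.sum().clamp(min=1.0)
```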
-
Patent number: 11727589
Abstract: A method for multi-camera monocular depth estimation using pose averaging is described. The method includes determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes determining a multi-camera pose consistency constraint (PCC) loss associated with the multi-camera rig of the ego vehicle. The method further includes adjusting the multi-camera photometric loss according to the multi-camera PCC loss to form a multi-camera PCC photometric loss. The method also includes training a multi-camera depth estimation model and an ego-motion estimation model according to the multi-camera PCC photometric loss. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the trained multi-camera depth estimation model and the ego-motion estimation model.
Type: Grant
Filed: July 16, 2021
Date of Patent: August 15, 2023
Assignee: Toyota Research Institute, Inc.
Inventors: Vitor Guizilini, Rares Andrei Ambrus, Adrien David Gaidon, Igor Vasiljevic, Gregory Shakhnarovich
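Since all cameras on a rigid rig share the same ego-motion, a pose consistency constraint can penalize each camera's predicted motion for deviating from the rig-wide average. The sketch below shows the translation part only; rotations would be handled analogously, and the exact PCC formulation in the patent may differ.

```python
import torch

def pose_consistency_loss(translations):
    # translations: (num_cams, 3) per-camera ego-motion translations; penalize
    # deviation from the rig-wide average (rotations handled analogously).
    mean_t = translations.mean(dim=0, keepdim=True)
    return (translations - mean_t).norm(dim=-1).mean()
```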
-
Patent number: 11704821
Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes projecting lifted 3D points onto an image plane according to a predicted ray vector based on a monocular depth model, a monocular pose model, and a camera center of a camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
Type: Grant
Filed: January 21, 2022
Date of Patent: July 18, 2023
Assignee: Toyota Research Institute, Inc.
Inventors: Vitor Guizilini, Sudeep Pillai, Adrien David Gaidon, Rares A. Ambrus, Igor Vasiljevic
-
Patent number: 11704822
Abstract: Systems and methods for self-supervised depth estimation using image frames captured from a camera mounted on a vehicle comprise: receiving a first image from the camera mounted at a first location on the vehicle; receiving a second image from the camera mounted at a second location on the vehicle; predicting a depth map for the first image; warping the first image to a perspective of the camera mounted at the second location on the vehicle to arrive at a warped first image; projecting the warped first image onto the second image; determining a loss based on the projection; and updating the predicted depth values for the first image.
Type: Grant
Filed: January 13, 2022
Date of Patent: July 18, 2023
Assignee: Toyota Research Institute, Inc.
Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Adrien Gaidon
-
Patent number: 11688090
Abstract: A method for multi-camera self-supervised depth evaluation is described. The method includes training a self-supervised depth estimation model and an ego-motion estimation model according to a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes generating a single-scale correction factor according to a depth map of each camera of the multi-camera rig during a time-step. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the self-supervised depth estimation model and the ego-motion estimation model. The method also includes scaling the 360° point cloud according to the single-scale correction factor to form an aligned 360° point cloud.
Type: Grant
Filed: July 15, 2021
Date of Patent: June 27, 2023
Assignee: Toyota Research Institute, Inc.
Inventors: Vitor Guizilini, Rares Andrei Ambrus, Adrien David Gaidon, Igor Vasiljevic, Gregory Shakhnarovich
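A single-scale correction factor is one scalar shared by all cameras that aligns the predicted cloud to metric scale. One common way to compute such a factor is a median ratio against a metric reference; the sketch below uses that approach as an assumption, since the patent does not specify the exact formula here.

```python
import torch

def single_scale_correction(pred_depths, ref_depths):
    # One shared scalar (a median ratio against a metric reference) rescales
    # the whole 360-degree point cloud.
    return torch.median(ref_depths / pred_depths.clamp(min=1e-6))

# aligned_cloud = point_cloud * single_scale_correction(pred, ref)
```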
-
Patent number: 11652972
Abstract: Systems, methods, and other embodiments described herein relate to improving depth estimates for monocular images using a neural camera model that is independent of a camera type. In one embodiment, a method includes receiving a monocular image from a pair of training images derived from a monocular video. The method includes generating, using a ray surface network, a ray surface that approximates an image character of the monocular image as produced by a camera having the camera type. The method includes creating a synthesized image according to at least the ray surface and a depth map associated with the monocular image.
Type: Grant
Filed: June 12, 2020
Date of Patent: May 16, 2023
Assignee: Toyota Research Institute, Inc.
Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon