Patents by Inventor Vitor Guizilini

Vitor Guizilini has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220245843
    Abstract: Systems and methods for self-supervised learning for visual odometry using camera images may include: estimating correspondences between keypoints of a target camera image and keypoints of a context camera image; based on the keypoint correspondences, lifting a set of 2D keypoints to 3D using a neural camera model; and projecting the 3D keypoints into the context camera image using the neural camera model. Some embodiments may use the neural camera model to achieve the lifting and projecting of keypoints without a known or calibrated camera model.
    Type: Application
    Filed: April 17, 2022
    Publication date: August 4, 2022
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien Gaidon
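    Below is a minimal, hedged sketch of the lifting and projection steps this abstract describes. A fixed per-pixel ray table stands in for the learned neural camera model, and all names and shapes are illustrative assumptions, not the patented implementation.

    ```python
    # Sketch only: `rays` (H, W, 3 unit vectors) stands in for the neural
    # camera model; a real system would predict it per image.
    import numpy as np

    def lift_keypoints(keypoints, depths, rays):
        """Lift 2D keypoints to 3D: each point travels its depth along its ray."""
        u, v = keypoints[:, 0], keypoints[:, 1]
        return depths[:, None] * rays[v, u]           # (N, 3)

    def project_points(points_3d, rays):
        """Project 3D points by finding the pixel whose ray best matches each
        point's viewing direction (argmax of cosine similarity)."""
        H, W, _ = rays.shape
        dirs = points_3d / np.linalg.norm(points_3d, axis=1, keepdims=True)
        scores = rays.reshape(-1, 3) @ dirs.T         # (H*W, N)
        idx = scores.argmax(axis=0)
        return np.stack([idx % W, idx // W], axis=1)  # (N, 2) pixel coords

    # Toy usage with a pinhole-like ray table.
    H, W, f = 8, 8, 4.0
    v, u = np.mgrid[0:H, 0:W]
    rays = np.dstack([(u - W / 2) / f, (v - H / 2) / f, np.ones((H, W))])
    rays /= np.linalg.norm(rays, axis=2, keepdims=True)
    kpts = np.array([[2, 3], [5, 6]])
    pts3d = lift_keypoints(kpts, np.array([1.5, 2.0]), rays)
    print(project_points(pts3d, rays))                # recovers [[2 3] [5 6]]
    ```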
  • Patent number: 11398043
    Abstract: Systems and methods for generating depth models and depth maps from images obtained from an imaging system are presented. A self-supervised neural network may be capable of regularizing depth information from surface normals. Rather than rely on separate depth and surface normal networks, surface normal information is extracted from the depth information and a smoothness function is applied to the surface normals instead of a depth gradient. Smoothing the surface normals may provide an improved representation of environmental structures by smoothing texture-less areas while preserving sharp boundaries between structures.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: July 26, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus
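    A hedged sketch of the normal-based regularization described above: normals are derived from depth gradients, and the smoothness penalty is applied to the normals rather than to the raw depth gradient. The finite-difference normal formula and loss are illustrative assumptions.

    ```python
    import torch
    import torch.nn.functional as F

    def normals_from_depth(depth, fx=1.0, fy=1.0):
        """depth: (B, 1, H, W) -> unit surface normals (B, 3, H, W)."""
        dzdx = F.pad(depth[..., :, 1:] - depth[..., :, :-1],
                     (0, 1, 0, 0), mode="replicate")
        dzdy = F.pad(depth[..., 1:, :] - depth[..., :-1, :],
                     (0, 0, 0, 1), mode="replicate")
        n = torch.cat([-dzdx * fx, -dzdy * fy, torch.ones_like(depth)], dim=1)
        return n / n.norm(dim=1, keepdim=True)

    def normal_smoothness_loss(normals):
        # L1 change between adjacent normals: flattens texture-less planes
        # while sharp boundaries remain as sparse, large residuals.
        dx = (normals[..., :, 1:] - normals[..., :, :-1]).abs().mean()
        dy = (normals[..., 1:, :] - normals[..., :-1, :]).abs().mean()
        return dx + dy

    depth = torch.rand(1, 1, 32, 32, requires_grad=True)
    normal_smoothness_loss(normals_from_depth(depth)).backward()
    ```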
  • Patent number: 11398095
    Abstract: A method includes capturing a two-dimensional (2D) image of an environment adjacent to an ego vehicle, the environment including at least a dynamic object and a static object. The method also includes generating, via a depth estimation network, a depth map of the environment based on the 2D image, where an accuracy of a depth estimate for the dynamic object in the depth map is greater than an accuracy of a depth estimate for the static object in the depth map. The method further includes generating a three-dimensional (3D) estimate of the environment based on the depth map and identifying a location of the dynamic object in the 3D estimate. The method additionally includes controlling an action of the ego vehicle based on the identified location.
    Type: Grant
    Filed: June 23, 2020
    Date of Patent: July 26, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Adrien David Gaidon
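    The "3D estimate from the depth map" step is, in essence, a standard unprojection; a minimal sketch follows, with illustrative intrinsics K (an assumption, not the patent's network).

    ```python
    import numpy as np

    def unproject(depth, K):
        """depth: (H, W) metric depth -> (H*W, 3) points in camera frame."""
        H, W = depth.shape
        v, u = np.mgrid[0:H, 0:W]
        pix = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3).T  # (3, N)
        return (np.linalg.inv(K) @ pix * depth.reshape(1, -1)).T      # (N, 3)

    K = np.array([[100.0, 0, 32], [0, 100.0, 24], [0, 0, 1]])
    points = unproject(np.full((48, 64), 5.0), K)   # a flat wall 5 m ahead
    print(points.shape, points[:, 2].mean())        # (3072, 3) 5.0
    ```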
  • Patent number: 11386567
    Abstract: Systems, methods, and other embodiments described herein relate to semi-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes training the depth model according to a first stage that is self-supervised and that includes using first training data comprising pairs of training images, respective ones of the pairs including separate frames depicting a scene of a monocular video. The method includes training the depth model according to a second stage that is weakly supervised and that includes using second training data to produce depth maps according to the depth model. The second training data comprises individual images with corresponding sparse depth data and provides for updating the depth model according to second-stage loss values that are based, at least in part, on the depth maps and the depth data.
    Type: Grant
    Filed: December 3, 2019
    Date of Patent: July 12, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Sudeep Pillai, Rares A. Ambrus, Jie Li, Adrien David Gaidon
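    A schematic PyTorch skeleton of the two-stage recipe: stage one trains self-supervised on image pairs from monocular video, stage two trains weakly supervised on single images with sparse depth. The model, losses, and data are placeholders (assumptions) that only make the stage structure concrete.

    ```python
    import torch

    def photometric_loss(pred_depth, frame_a, frame_b):
        # Placeholder for a self-supervised view-synthesis loss on the pair.
        return (pred_depth.mean() - 1.0) ** 2

    def sparse_depth_loss(pred_depth, sparse, mask):
        # Weak supervision: penalize error only where sparse depth exists.
        return ((pred_depth - sparse).abs() * mask).sum() / mask.sum()

    depth_net = torch.nn.Conv2d(3, 1, 3, padding=1)   # stand-in depth model
    opt = torch.optim.Adam(depth_net.parameters(), lr=1e-4)

    # Stage 1: self-supervised on pairs of frames from monocular video.
    for fa, fb in [(torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32))]:
        loss = photometric_loss(depth_net(fa), fa, fb)
        opt.zero_grad(); loss.backward(); opt.step()

    # Stage 2: weakly supervised on images with corresponding sparse depth.
    for img, sd, m in [(torch.rand(1, 3, 32, 32), torch.rand(1, 1, 32, 32),
                        (torch.rand(1, 1, 32, 32) > 0.95).float())]:
        loss = sparse_depth_loss(depth_net(img), sd, m)
        opt.zero_grad(); loss.backward(); opt.step()
    ```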
  • Publication number: 20220207270
    Abstract: A bird's eye view feature map, augmented with semantic information, can be used to detect an object in an environment. A point cloud data set augmented with the semantic information that is associated with identities of classes of objects can be obtained. Features can be extracted from the point cloud data set. Based on the features, an initial bird's eye view feature map can be produced. Because operations performed on the point cloud data set to extract the features or to produce the initial bird's eye view feature map can have an effect of diminishing an ability to distinguish the semantic information in the initial bird's eye view feature map, the initial bird's eye view feature map can be augmented with the semantic information to produce an augmented bird's eye view feature map. Based on the augmented bird's eye view feature map, the object in the environment can be detected.
    Type: Application
    Filed: December 31, 2020
    Publication date: June 30, 2022
    Inventors: Jie Li, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon, Jia-En Pan
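    A hedged sketch of the augmentation step: per-point class labels are scattered into the grid as one-hot channels and concatenated with the learned BEV features, restoring semantics that feature extraction may have washed out. Grid extents and shapes are illustrative assumptions.

    ```python
    import torch

    def augment_bev(bev_feats, points_xy, labels, num_classes, extent=50.0):
        """bev_feats: (C, H, W); points_xy: (N, 2) metric; labels: (N,)."""
        C, H, W = bev_feats.shape
        sem = torch.zeros(num_classes, H, W)
        ix = ((points_xy[:, 0] + extent) / (2 * extent) * (W - 1)).long()
        iy = ((points_xy[:, 1] + extent) / (2 * extent) * (H - 1)).long()
        ok = (ix >= 0) & (ix < W) & (iy >= 0) & (iy < H)
        sem[labels[ok], iy[ok], ix[ok]] = 1.0        # one-hot semantic layers
        return torch.cat([bev_feats, sem], dim=0)    # (C + num_classes, H, W)

    feats = torch.rand(64, 128, 128)
    pts = torch.rand(1000, 2) * 100 - 50             # points in [-50, 50) m
    lbl = torch.randint(0, 10, (1000,))
    print(augment_bev(feats, pts, lbl, 10).shape)    # (74, 128, 128)
    ```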
  • Patent number: 11341719
    Abstract: A method is presented. The method includes estimating an ego-motion of an agent based on a current image from a sequence of images and at least one previous image from the sequence of images. Each image in the sequence of images may be a two-dimensional (2D) image. The method also includes estimating a depth of the current image based on the at least one previous image. The estimated depth accounts for a depth uncertainty measurement in the current image and the at least one previous image. The method further includes generating a three-dimensional (3D) reconstruction of the current image based on the estimated ego-motion and the estimated depth. The method still further includes controlling an action of the agent based on the three-dimensional reconstruction.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: May 24, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Adrien David Gaidon
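    One common way a depth estimate "accounts for" uncertainty is to predict a log-variance alongside depth and train with a Gaussian negative log-likelihood, which down-weights unreliable pixels. The sketch below shows that generic mechanism; it is an assumption, not the patent's specific formulation.

    ```python
    import torch

    def heteroscedastic_nll(pred_depth, log_var, target_depth):
        # Residuals are scaled by predicted uncertainty; the log_var term
        # keeps the network from claiming infinite uncertainty everywhere.
        return (0.5 * torch.exp(-log_var) * (pred_depth - target_depth) ** 2
                + 0.5 * log_var).mean()

    pred = torch.rand(1, 1, 16, 16, requires_grad=True)
    log_var = torch.zeros(1, 1, 16, 16, requires_grad=True)
    heteroscedastic_nll(pred, log_var, torch.rand(1, 1, 16, 16)).backward()
    ```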
  • Publication number: 20220156525
    Abstract: Systems, methods, and other embodiments described herein relate to training a multi-task network using real and virtual data. In one embodiment, a method includes acquiring training data that includes real data and virtual data for training a multi-task network that performs at least depth prediction and semantic segmentation. The method includes generating a first output from the multi-task network using the real data and a second output from the multi-task network using the virtual data. The method includes generating a mixed loss by analyzing the first output to produce a real loss and the second output to produce a virtual loss. The method includes updating the multi-task network using the mixed loss.
    Type: Application
    Filed: March 29, 2021
    Publication date: May 19, 2022
    Inventors: Vitor Guizilini, Adrien David Gaidon, Jie Li, Rares A. Ambrus
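    A schematic sketch of the mixed-loss update: the same network processes a real batch and a virtual batch, the two losses are formed separately, then blended before a single optimizer step. The blending weight and placeholder losses are assumptions.

    ```python
    import torch

    net = torch.nn.Conv2d(3, 2, 3, padding=1)  # stand-in multi-task network
    opt = torch.optim.Adam(net.parameters(), lr=1e-4)

    real_batch = torch.rand(2, 3, 32, 32)
    virtual_batch = torch.rand(2, 3, 32, 32)

    real_loss = net(real_batch).pow(2).mean()        # placeholder real loss
    virtual_loss = net(virtual_batch).pow(2).mean()  # placeholder virtual loss
    alpha = 0.5                                      # assumed blending weight
    mixed_loss = alpha * real_loss + (1 - alpha) * virtual_loss

    opt.zero_grad(); mixed_loss.backward(); opt.step()
    ```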
  • Publication number: 20220156971
    Abstract: Systems and methods described herein relate to training a machine-learning-based monocular depth estimator.
    Type: Application
    Filed: March 31, 2021
    Publication date: May 19, 2022
    Inventors: Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon, Jie Li
  • Publication number: 20220148204
    Abstract: Systems, methods, and other embodiments described herein relate to determining depths of a scene from a monocular image. In one embodiment, a method includes generating depth features from sensor data according to whether the sensor data includes sparse depth data. The method includes selectively injecting the depth features into a depth model. The method includes generating a depth map from at least a monocular image using the depth model, which is guided by the depth features when they are injected. The method includes providing the depth map as depth estimates of objects represented in the monocular image.
    Type: Application
    Filed: February 16, 2021
    Publication date: May 12, 2022
    Inventors: Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon
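    The "selective injection" can be pictured as a gate on an auxiliary encoder: when sparse depth accompanies the image, its features are added into the image branch; otherwise the model runs on the image alone. A minimal sketch with illustrative module names follows.

    ```python
    import torch

    image_encoder = torch.nn.Conv2d(3, 16, 3, padding=1)
    depth_encoder = torch.nn.Conv2d(1, 16, 3, padding=1)
    decoder = torch.nn.Conv2d(16, 1, 3, padding=1)

    def estimate_depth(image, sparse_depth=None):
        feats = image_encoder(image)
        if sparse_depth is not None:          # inject only when depth exists
            feats = feats + depth_encoder(sparse_depth)
        return decoder(feats)

    img = torch.rand(1, 3, 32, 32)
    print(estimate_depth(img).shape)                            # image only
    print(estimate_depth(img, torch.rand(1, 1, 32, 32)).shape)  # guided
    ```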
  • Publication number: 20220148206
    Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes projecting lifted 3D points onto an image plane according to a predicted ray vector based on a monocular depth model, a monocular pose model, and a camera center of a camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
    Type: Application
    Filed: January 21, 2022
    Publication date: May 12, 2022
    Applicant: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Sudeep Pillai, Adrien David Gaidon, Rares A. Ambrus, Igor Vasiljevic
  • Publication number: 20220148203
    Abstract: Systems, methods, and other embodiments described herein relate to training a depth model for joint depth completion and prediction. In one arrangement, a method includes generating depth features from sparse depth data according to a sparse auxiliary network (SAN) of a depth model. The method includes generating a first depth map from a monocular image and a second depth map from the monocular image and the depth features using the depth model. The method includes generating a depth loss from the second depth map and the sparse depth data and an image loss from the first depth map and the sparse depth data. The method includes updating the depth model including the SAN using the depth loss and the image loss.
    Type: Application
    Filed: January 21, 2021
    Publication date: May 12, 2022
    Inventors: Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon
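    A schematic sketch of the joint training: one pass without the sparse depth features yields the prediction map, one pass with them yields the completion map, and both are supervised against the sparse points. Modules and losses are placeholders.

    ```python
    import torch

    def sparse_l1(pred, sparse, mask):
        return ((pred - sparse).abs() * mask).sum() / mask.sum()

    image = torch.rand(1, 3, 32, 32)
    sparse = torch.rand(1, 1, 32, 32)
    mask = (torch.rand(1, 1, 32, 32) > 0.9).float()

    encoder = torch.nn.Conv2d(3, 16, 3, padding=1)
    san = torch.nn.Conv2d(1, 16, 3, padding=1)   # stand-in for the SAN branch
    head = torch.nn.Conv2d(16, 1, 3, padding=1)

    first_map = head(encoder(image))                         # prediction
    second_map = head(encoder(image) + san(sparse * mask))   # completion

    image_loss = sparse_l1(first_map, sparse, mask)
    depth_loss = sparse_l1(second_map, sparse, mask)
    (image_loss + depth_loss).backward()   # updates encoder, SAN, and head
    ```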
  • Publication number: 20220148202
    Abstract: Systems, methods, and other embodiments described herein relate to determining depths of a scene from a monocular image. In one embodiment, a method includes generating depth features from depth data using a sparse auxiliary network (SAN) by i) sparsifying the depth data, ii) applying sparse residual blocks of the SAN to the depth data, and iii) densifying the depth features. The method includes generating a depth map from the depth features and a monocular image that corresponds with the depth data according to a depth model that includes the SAN. The method includes providing the depth map as depth estimates of objects represented in the monocular image.
    Type: Application
    Filed: January 7, 2021
    Publication date: May 12, 2022
    Inventors: Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon
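    A loose sketch of the three SAN steps named in the abstract: (i) sparsify into data plus validity mask, (ii) mask-aware residual blocks, (iii) densify via normalized convolution. Real sparse convolutions would replace this dense emulation; everything here is an assumption.

    ```python
    import torch
    import torch.nn.functional as F

    def sparsify(depth):
        mask = (depth > 0).float()            # valid where a sample exists
        return depth * mask, mask

    class SparseResBlock(torch.nn.Module):
        def __init__(self, ch):
            super().__init__()
            self.conv = torch.nn.Conv2d(ch, ch, 3, padding=1)
        def forward(self, x, mask):
            # Residual update kept to valid locations only.
            return x + F.relu(self.conv(x)) * mask, mask

    def densify(x, mask, k=5):
        # Normalized convolution: average valid neighbors, ignore holes.
        w = torch.ones(x.shape[1], 1, k, k)
        num = F.conv2d(x * mask, w, padding=k // 2, groups=x.shape[1])
        den = F.conv2d(mask.expand_as(x), w, padding=k // 2, groups=x.shape[1])
        return num / den.clamp(min=1e-6)

    sparse = torch.rand(1, 1, 32, 32) * (torch.rand(1, 1, 32, 32) > 0.9)
    x, mask = sparsify(sparse)
    x, mask = SparseResBlock(1)(x, mask)
    print(densify(x, mask).shape)             # torch.Size([1, 1, 32, 32])
    ```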
  • Patent number: 11328517
    Abstract: A system and method generate feature space data that may be used for object detection. The system includes one or more processors and a memory. The memory may include one or more modules having instructions that, when executed by the one or more processors, cause the one or more processors to obtain a two-dimensional image of a scene, generate an output depth map based on the two-dimensional image of the scene, generate a pseudo-LIDAR point cloud based on the output depth map, generate a bird's eye view (BEV) feature space based on the pseudo-LIDAR point cloud, and modify the BEV feature space to generate an improved BEV feature space using a feature space neural network that was trained using a training LIDAR feature space as a ground truth based on a LIDAR point cloud.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: May 10, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Victor Vaquero Gomez, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
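    The pseudo-LIDAR-to-BEV step amounts to binning unprojected points into a top-down grid; a minimal sketch with occupancy and max-height channels follows. Grid extents are illustrative, and the feature-space network that refines the BEV map is not shown.

    ```python
    import numpy as np

    def points_to_bev(points, extent=40.0, res=0.25):
        """points: (N, 3), x right / y forward / z up -> (2, H, W) BEV map."""
        H = W = int(2 * extent / res)
        bev = np.zeros((2, H, W))             # ch 0: occupancy, ch 1: max z
        ix = ((points[:, 0] + extent) / res).astype(int)
        iy = ((points[:, 1] + extent) / res).astype(int)
        ok = (ix >= 0) & (ix < W) & (iy >= 0) & (iy < H)
        for x, y, z in zip(ix[ok], iy[ok], points[ok, 2]):
            bev[0, y, x] = 1.0
            bev[1, y, x] = max(bev[1, y, x], z)
        return bev

    pseudo_lidar = np.random.rand(5000, 3) * [60, 60, 3] - [30, 0, 0]
    print(points_to_bev(pseudo_lidar).shape)  # (2, 320, 320)
    ```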
  • Publication number: 20220138975
    Abstract: Systems and methods for self-supervised depth estimation using image frames captured from a camera mounted on a vehicle comprise: receiving a first image from the camera mounted at a first location on the vehicle; receiving a second image from the camera mounted at a second location on the vehicle; predicting a depth map for the first image; warping the first image to a perspective of the camera mounted at the second location on the vehicle to arrive at a warped first image; projecting the warped first image onto the second image; determining a loss based on the projection; and updating the predicted depth values for the first image.
    Type: Application
    Filed: January 13, 2022
    Publication date: May 5, 2022
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Adrien Gaidon
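    A hedged sketch of the warp-and-compare core: the predicted depth lifts the first image to 3D, the relative pose between the two camera mountings moves the points, and re-projection plus sampling renders the first image from the second viewpoint for a photometric loss. Intrinsics, pose, and the loss are placeholders.

    ```python
    import torch
    import torch.nn.functional as F

    def warp(image, depth, K, T):
        """image: (1, C, H, W); depth: (1, 1, H, W); K: (3, 3); T: (4, 4)."""
        _, _, H, W = image.shape
        v, u = torch.meshgrid(torch.arange(H).float(),
                              torch.arange(W).float(), indexing="ij")
        pix = torch.stack([u, v, torch.ones_like(u)]).reshape(3, -1)
        cam = torch.linalg.inv(K) @ pix * depth.reshape(1, -1)  # lift to 3D
        cam = T[:3, :3] @ cam + T[:3, 3:]                       # move frames
        proj = K @ cam
        uv = proj[:2] / proj[2:].clamp(min=1e-6)                # re-project
        grid = torch.stack([uv[0] / (W - 1) * 2 - 1,            # to [-1, 1]
                            uv[1] / (H - 1) * 2 - 1], -1).reshape(1, H, W, 2)
        return F.grid_sample(image, grid, align_corners=True)

    img = torch.rand(1, 3, 32, 32)
    depth = torch.full((1, 1, 32, 32), 5.0)
    K = torch.tensor([[30.0, 0, 16], [0, 30.0, 16], [0, 0, 1]])
    T = torch.eye(4); T[0, 3] = 0.2               # small sideways baseline
    loss = (warp(img, depth, K, T) - img).abs().mean()  # photometric stand-in
    ```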
  • Patent number: 11321859
    Abstract: A method for scene reconstruction includes generating a depth estimate and a first pose estimate from a current image. The method also includes generating a second pose estimate based on the current image and one or more previous images in a sequence of images. The method further includes generating a warped image by warping each pixel in the current image based on the depth estimate, the first pose estimate, and the second pose estimate. The method still further includes controlling an action of an agent based on the warped image.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: May 3, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Adrien David Gaidon
  • Patent number: 11321863
    Abstract: Systems, methods, and other embodiments described herein relate to generating depth estimates of an environment depicted in a monocular image. In one embodiment, a method includes identifying semantic features in the monocular image according to a semantic model. The method includes injecting the semantic features into a depth model using pixel-adaptive convolutions. The method includes generating a depth map from the monocular image using the depth model that is guided by the semantic features. The pixel-adaptive convolutions are integrated into a decoder of the depth model. The method includes providing the depth map as the depth estimates for the monocular image.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: May 3, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Jie Li, Adrien David Gaidon
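    A simplified sketch of a pixel-adaptive convolution: a shared kernel is modulated per pixel by a Gaussian affinity between the center's guiding (semantic) feature and each neighbor's, so filtering adapts at semantic boundaries. This is a toy version of the published PAC operator, not the patented decoder.

    ```python
    import torch
    import torch.nn.functional as F

    def pixel_adaptive_conv(x, guide, weight, k=3):
        """x: (B, C, H, W); guide: (B, G, H, W); weight: (O, C, k, k)."""
        B, C, H, W = x.shape
        xu = F.unfold(x, k, padding=k // 2).view(B, C, k * k, H * W)
        gu = F.unfold(guide, k, padding=k // 2).view(B, -1, k * k, H * W)
        center = guide.view(B, -1, 1, H * W)
        aff = torch.exp(-0.5 * ((gu - center) ** 2).sum(1, keepdim=True))
        out = torch.einsum("bckn,ock->bon", xu * aff, weight.view(-1, C, k * k))
        return out.view(B, -1, H, W)

    x = torch.rand(1, 8, 16, 16)
    semantics = torch.rand(1, 4, 16, 16)   # guidance from a semantic model
    w = torch.rand(8, 8, 3, 3)
    print(pixel_adaptive_conv(x, semantics, w).shape)  # (1, 8, 16, 16)
    ```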
  • Patent number: 11321862
    Abstract: Systems and methods for self-supervised depth estimation using image frames captured from a plurality of cameras mounted on a vehicle may include: receiving a first image from a camera mounted at a first location on the vehicle, the first image comprising pixels representing a scene of the environment of the vehicle; receiving a reference image from a camera mounted at a second location on the vehicle, the reference image comprising pixels representing a scene of the environment; predicting a depth map for the first image, the depth map comprising predicted depth values for pixels of the first image; warping the first image to a perspective of the camera mounted at the second location on the vehicle to arrive at a warped first image; projecting the warped first image onto the reference image; determining a loss based on the projection; and updating the predicted depth values for the first image.
    Type: Grant
    Filed: September 15, 2020
    Date of Patent: May 3, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Adrien Gaidon
  • Patent number: 11315269
    Abstract: A system for generating point clouds having surface normal information includes one or more processors and a memory having a depth map generating module, a point cloud generating module, and a surface normal generating module. The depth map generating module causes the one or more processors to generate a depth map from one or more images of a scene. The point cloud generating module causes the one or more processors to generate a point cloud from the depth map having a plurality of points corresponding to one or more pixels of the depth map. The surface normal generating module causes the one or more processors to generate surface normal information for at least a portion of the one or more pixels of the depth map and inject the surface normal information into the point cloud such that the plurality of points of the point cloud include three-dimensional location information and surface normal information.
    Type: Grant
    Filed: August 24, 2020
    Date of Patent: April 26, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Victor Vaquero Gomez, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
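    The injection step itself is a concatenation: each unprojected point gains its pixel's normal, giving (x, y, z, nx, ny, nz) per point. A minimal sketch with stand-in inputs follows (the depth-to-points and depth-to-normals steps mirror the earlier sketches).

    ```python
    import numpy as np

    H, W = 48, 64
    points = np.random.rand(H * W, 3)       # stand-in: unprojected depth map
    normals = np.random.rand(H, W, 3)       # stand-in: per-pixel normals
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)

    cloud = np.hstack([points, normals.reshape(-1, 3)])   # (H*W, 6)
    print(cloud.shape)                                    # (3072, 6)
    ```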
  • Publication number: 20220108463
    Abstract: A method for using an artificial neural network associated with an agent to estimate depth, includes receiving, at the artificial neural network, an input image captured via a sensor associated with the agent. The method also includes upsampling, at each decoding layer of a plurality of decoding layers of the artificial neural network, decoded features associated with the input image to a resolution associated with a final output of the artificial neural network. The method further includes concatenating, at each decoding layer, the upsampled decoded features with features obtained at a convolution layer associated with a respective decoding layer. The method still further includes estimating, at a recurrent module of the artificial neural network, a depth of the input image based on receiving the concatenated upsampled decoded features from each decoding layer. The method also includes controlling an action of the agent based on the depth estimate.
    Type: Application
    Filed: December 17, 2021
    Publication date: April 7, 2022
    Applicant: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Adrien David Gaidon
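    A hedged sketch of the decoder wiring described above: each decoding layer's features are upsampled to the output resolution and fed, step by step, into a recurrent update whose final state is decoded to depth. The plain convolutional update standing in for a ConvGRU-style module, and all channel sizes, are assumptions.

    ```python
    import torch
    import torch.nn.functional as F

    final_hw = (64, 64)
    feats = [torch.rand(1, 16, s, s) for s in (8, 16, 32)]  # decoder outputs

    update = torch.nn.Conv2d(32, 16, 3, padding=1)  # recurrent update cell
    head = torch.nn.Conv2d(16, 1, 3, padding=1)     # depth regression head

    h = torch.zeros(1, 16, *final_hw)
    for f in feats:
        f_up = F.interpolate(f, size=final_hw, mode="bilinear",
                             align_corners=False)
        h = torch.tanh(update(torch.cat([h, f_up], dim=1)))  # concat + update
    print(head(h).shape)                            # torch.Size([1, 1, 64, 64])
    ```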
  • Publication number: 20220084232
    Abstract: Systems and methods for map construction using a video sequence captured on a camera of a vehicle in an environment, comprising: receiving a video sequence from the camera, the video sequence including a plurality of image frames capturing a scene of the environment of the vehicle; using a neural camera model to predict a depth map and a ray surface for the plurality of image frames in the received video sequence; and constructing a map of the scene of the environment based on image data captured in the plurality of frames and depth information in the predicted depth maps.
    Type: Application
    Filed: September 15, 2020
    Publication date: March 17, 2022
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien Gaidon