Patents by Inventor Sudeep Pillai

Sudeep Pillai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11138751
    Abstract: System, methods, and other embodiments described herein relate to training a depth model for monocular depth estimation. In one embodiment, a method includes generating, as part of training the depth model according to a supervised training stage, a depth map from a first image of a pair of training images using the depth model. The pair of training images are separate frames depicting a scene from a monocular video. The method includes generating a transformation from the first image and a second image of the pair using a pose model. The method includes computing a supervised loss based, at least in part, on reprojecting the depth map and training depth data onto an image space of the second image according to at least the transformation. The method includes updating the depth model and the pose model according to at least the supervised loss.
    Type: Grant
    Filed: November 20, 2019
    Date of Patent: October 5, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Sudeep Pillai, Rares A. Ambrus, Jie Li, Adrien David Gaidon
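The reprojection step this abstract describes can be sketched in a few lines. The following is a minimal pure-Python illustration assuming a hypothetical pinhole camera (made-up intrinsics `F`, `CX`, `CY`) and an L1 comparison in the second image's depth space; it is not the patented implementation, which operates on full depth maps inside a trained network.

```python
# Hypothetical pinhole intrinsics: focal length F, principal point (CX, CY).
F, CX, CY = 500.0, 320.0, 240.0

def unproject(u, v, depth):
    """Lift pixel (u, v) with its depth into a 3D camera-frame point."""
    return ((u - CX) * depth / F, (v - CY) * depth / F, depth)

def transform(point, rotation, translation):
    """Apply a rigid-body transform (3x3 rotation, 3-vector translation)."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def project(point):
    """Project a 3D point onto the image plane; keep its depth."""
    x, y, z = point
    return (F * x / z + CX, F * y / z + CY, z)

def supervised_loss(pixels, pred_depths, true_depths, rotation, translation):
    """Reproject predicted and ground-truth depths into the second image's
    space (via the pose model's transformation) and penalize the
    discrepancy there with an L1 term."""
    total = 0.0
    for (u, v), dp, dt in zip(pixels, pred_depths, true_depths):
        _, _, zp = project(transform(unproject(u, v, dp), rotation, translation))
        _, _, zt = project(transform(unproject(u, v, dt), rotation, translation))
        total += abs(zp - zt)
    return total / len(pixels)

# Identity rotation, 0.5 m of forward motion between the two frames.
R_ID = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss = supervised_loss([(300, 200), (350, 260)], [10.0, 8.5], [10.2, 8.0],
                       R_ID, [0.0, 0.0, 0.5])
```

In training, this loss (alongside self-supervised terms) would backpropagate into both the depth and pose models.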
  • Publication number: 20210281814
    Abstract: System, methods, and other embodiments described herein relate to improving depth estimates for monocular images using a neural camera model that is independent of a camera type. In one embodiment, a method includes receiving a monocular image from a pair of training images derived from a monocular video. The method includes generating, using a ray surface network, a ray surface that approximates an image character of the monocular image as produced by a camera having the camera type. The method includes creating a synthesized image according to at least the ray surface and a depth map associated with the monocular image.
    Type: Application
    Filed: June 12, 2020
    Publication date: September 9, 2021
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
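The central idea here, a per-pixel "ray surface" replacing analytic intrinsics, can be illustrated with a toy sketch. The four-pixel ray table and the `lift`/`lift_all` helpers below are hypothetical stand-ins for the learned ray surface network:

```python
import math

def unit(v):
    """Normalize a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

# Hypothetical ray surface for a 2x2 image: one learned unit ray per pixel,
# standing in for analytic intrinsics so the same machinery covers pinhole,
# fisheye, or catadioptric cameras alike.
ray_surface = {
    (0, 0): unit((-0.1, -0.1, 1.0)),
    (0, 1): unit((0.1, -0.1, 1.0)),
    (1, 0): unit((-0.1, 0.1, 1.0)),
    (1, 1): unit((0.1, 0.1, 1.0)),
}

def lift(pixel, depth):
    """Unproject a pixel to 3D by scaling its ray by the estimated depth."""
    return tuple(depth * c for c in ray_surface[pixel])

def lift_all(depth_map):
    """Lift every pixel; view synthesis would then project these points into
    a second camera to produce the synthesized image."""
    return {p: lift(p, d) for p, d in depth_map.items()}

points = lift_all({(0, 0): 2.0, (1, 1): 4.0})
```

Because the ray directions are per-pixel values rather than a closed-form projection function, the same lifting code works regardless of camera type.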
  • Patent number: 11107230
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates from a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, flipping, by a disparity model, the monocular image to generate a flipped image. The disparity model is a machine learning algorithm. The method includes analyzing, using the disparity model, the monocular image and the flipped image to generate disparity maps including a monocular disparity map corresponding to the monocular image and a flipped disparity map corresponding with the flipped image. The method includes generating, in the disparity model, a fused disparity map from the monocular disparity map and the flipped disparity map. The method includes providing the fused disparity map as the depth estimates of objects represented in the monocular image.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: August 31, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Sudeep Pillai, Rares A. Ambrus, Adrien David Gaidon
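The flip-and-fuse data flow can be sketched directly. Here `disparity_model` is a trivial per-pixel stand-in for the learned network, and fusion is a plain average; the patent performs fusion inside the disparity model itself:

```python
def hflip(rows):
    """Horizontally flip an image or map stored as a list of rows."""
    return [list(reversed(r)) for r in rows]

def disparity_model(image):
    # Stand-in for the learned disparity network: any per-pixel function
    # suffices to illustrate the data flow.
    return [[pix / 10 for pix in row] for row in image]

def fused_disparity(image):
    d = disparity_model(image)               # disparity of the original
    d_flipped = disparity_model(hflip(image))  # disparity of the flipped image
    d_aligned = hflip(d_flipped)              # flip back into alignment
    # Fuse the two maps, here by simple averaging.
    return [[(a + b) / 2 for a, b in zip(ra, rb)]
            for ra, rb in zip(d, d_aligned)]

fused = fused_disparity([[10, 20], [30, 40]])
```

With a purely per-pixel stand-in the average is a no-op; the benefit appears with a real network, whose occlusion artifacts hug opposite image borders in the two passes, so fusion suppresses them.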
  • Publication number: 20210237764
    Abstract: A method for learning depth-aware keypoints and associated descriptors from monocular video for ego-motion estimation is described. The method includes training a keypoint network and a depth network to learn depth-aware keypoints and the associated descriptors. The training is based on a target image and a context image from successive images of the monocular video. The method also includes lifting 2D keypoints from the target image to learn 3D keypoints based on a learned depth map from the depth network. The method further includes estimating ego-motion from the target image to the context image based on the learned 3D keypoints.
    Type: Application
    Filed: November 9, 2020
    Publication date: August 5, 2021
    Applicant: Toyota Research Institute, Inc.
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Adrien David Gaidon
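Lifting 2D keypoints to 3D with predicted depth is what makes the keypoints "depth-aware". A toy sketch follows, assuming a hypothetical pinhole camera and reducing ego-motion to a translation estimate between matched 3D sets; the patent estimates a full pose:

```python
# Hypothetical pinhole intrinsics; the abstract does not prescribe these.
F, CX, CY = 500.0, 320.0, 240.0

def lift(u, v, depth):
    """Lift a 2D keypoint into 3D using the depth network's prediction."""
    return ((u - CX) * depth / F, (v - CY) * depth / F, depth)

def estimate_translation(target_pts, context_pts):
    """Reduced ego-motion estimate: the translation between matched 3D
    keypoint sets, computed as the centroid difference."""
    n = len(target_pts)
    return tuple(
        sum(c[i] for c in context_pts) / n - sum(t[i] for t in target_pts) / n
        for i in range(3)
    )

target = [lift(300, 200, 10.0), lift(340, 260, 12.0)]
context = [(x, y, z + 1.0) for x, y, z in target]  # scene 1 m deeper
motion = estimate_translation(target, context)
```

A full solution would solve for rotation as well (e.g. a Procrustes alignment over the matched 3D keypoints), but the lift-then-align structure is the same.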
  • Publication number: 20210237774
    Abstract: A method for learning depth-aware keypoints and associated descriptors from monocular video for monocular visual odometry is described. The method includes training a keypoint network and a depth network to learn depth-aware keypoints and the associated descriptors. The training is based on a target image and a context image from successive images of the monocular video. The method also includes lifting 2D keypoints from the target image to learn 3D keypoints based on a learned depth map from the depth network. The method further includes estimating a trajectory of an ego-vehicle based on the learned 3D keypoints.
    Type: Application
    Filed: November 9, 2020
    Publication date: August 5, 2021
    Applicant: Toyota Research Institute, Inc.
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Adrien David Gaidon
  • Publication number: 20210118163
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates of an environment depicted in a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, processing the monocular image according to a depth model to generate a depth map. Processing the monocular image includes encoding the monocular image according to encoding layers of the depth model including iteratively encoding features of the monocular image to generate feature maps at successively refined representations using packing blocks within the encoding layers. Processing the monocular image further includes decoding the feature maps according to decoding layers of the depth model including iteratively decoding the feature maps associated with separate ones of the packing blocks using unpacking blocks of the decoding layers to generate the depth map. The method includes providing the depth map as the depth estimates of objects represented in the monocular image.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
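The packing/unpacking blocks are built around space-to-depth and depth-to-space operations, which trade spatial resolution for channels without discarding pixels the way pooling does. A minimal nested-list sketch of that invertible folding (the patented blocks additionally wrap learned layers around it):

```python
def space_to_depth(rows, block=2):
    """Packing: fold each block x block spatial patch into a channel vector,
    halving resolution without discarding any pixels."""
    return [
        [[rows[i + di][j + dj] for di in range(block) for dj in range(block)]
         for j in range(0, len(rows[0]), block)]
        for i in range(0, len(rows), block)
    ]

def depth_to_space(rows, block=2):
    """Unpacking: the exact inverse, unfolding channels back into space."""
    out = [[0] * (len(rows[0]) * block) for _ in range(len(rows) * block)]
    for i, row in enumerate(rows):
        for j, patch in enumerate(row):
            for k, val in enumerate(patch):
                out[i * block + k // block][j * block + k % block] = val
    return out

img = [[1, 2], [3, 4]]
packed = space_to_depth(img)
assert depth_to_space(packed) == img  # lossless round trip
```

The lossless round trip is the point: fine spatial detail survives the encoder's downsampling and can be recovered by the decoder's unpacking blocks.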
  • Publication number: 20210118184
    Abstract: System, methods, and other embodiments described herein relate to self-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes processing a first image of a pair according to the depth model to generate a depth map. The method includes processing the first image and a second image of the pair according to a pose model to generate a transformation that defines a relationship between the pair. The pair of images are separate frames depicting a scene of a monocular video. The method includes generating a monocular loss and a pose loss, the pose loss including at least a velocity component that accounts for motion of a camera between the training images. The method includes updating the pose model according to the pose loss and the depth model according to the monocular loss to improve scale awareness of the depth model in producing depth estimates.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sudeep Pillai, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
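The velocity component of the pose loss can be written down compactly: the translation the pose model predicts should have the magnitude implied by the vehicle's measured speed, which is what injects metric scale into an otherwise scale-ambiguous monocular setup. A sketch:

```python
import math

def velocity_loss(translation, speed, dt):
    """Penalize disagreement between the pose network's translation magnitude
    and the distance implied by measured speed over the frame interval."""
    return abs(math.sqrt(sum(t * t for t in translation)) - speed * dt)

# The pose model says the camera moved (0, 0, 1.5) m between frames captured
# 0.1 s apart while the vehicle reported 10 m/s:
loss = velocity_loss((0.0, 0.0, 1.5), speed=10.0, dt=0.1)
```

Minimizing this term pulls the pose model toward metrically scaled translations, and, through the shared photometric objective, the depth model inherits the same scale.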
  • Publication number: 20210089836
    Abstract: Systems and methods for training a neural keypoint detection network are disclosed herein. One embodiment extracts a portion of an input image; applies a transformation to the portion of the input image to produce a transformed portion of the input image; processes the portion of the input image and the transformed portion of the input image using the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image; and processes the one or more candidate keypoint pairs using an inlier-outlier neural network, the inlier-outlier neural network producing an indirect supervisory signal to train the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image.
    Type: Application
    Filed: March 31, 2020
    Publication date: March 25, 2021
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim
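The "indirect supervisory signal" can be sketched as scores over candidate pairs that weight the detector's training loss. In this hypothetical sketch the inlier-outlier network is replaced by a geometric-consistency score against the known transformation; in the patent, that scoring is itself a learned network:

```python
def candidate_pairs(keypoints, warp):
    """Pair each keypoint with its (here, perfect) detection in the
    transformed portion of the image."""
    return [(kp, warp(kp)) for kp in keypoints]

def inlier_score(pair, warp, tol=1.0):
    """Stand-in for the inlier-outlier network: score how well a candidate
    pair agrees with the known transformation."""
    (x, y), (u, v) = pair
    tx, ty = warp((x, y))
    err = ((u - tx) ** 2 + (v - ty) ** 2) ** 0.5
    return max(0.0, 1.0 - err / tol)

def indirect_supervision(pairs, warp):
    """Scores act as per-pair weights on the detector's loss: confident
    inliers push the keypoint network harder than likely outliers."""
    return [inlier_score(p, warp) for p in pairs]

shift = lambda p: (p[0] + 5, p[1])  # toy known transformation
weights = indirect_supervision(candidate_pairs([(10, 10), (20, 30)], shift), shift)
```

The supervision is "indirect" because no keypoint labels exist; the detector improves only through the downstream network's judgment of its candidate matches.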
  • Publication number: 20210089890
    Abstract: Systems and methods for detecting and matching keypoints between different views of a scene are disclosed herein. One embodiment acquires first and second images; subdivides the first and second images into first and second pluralities of cells, respectively; processes both pluralities of cells using a neural keypoint detection network to identify a first keypoint for a particular cell in the first plurality of cells and a second keypoint for a particular cell in the second plurality of cells, at least one of the first and second keypoints lying in a cell other than the particular cell in the first or second plurality of cells for which it was identified; and classifies the first keypoint and the second keypoint as a matching keypoint pair based, at least in part, on a comparison between a first descriptor associated with the first keypoint and a second descriptor associated with the second keypoint.
    Type: Application
    Filed: March 31, 2020
    Publication date: March 25, 2021
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim
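The cell-based detection and descriptor matching can be sketched as follows. `detect_per_cell` and `match` are deliberate simplifications (one keypoint per cell by raw response, nearest-descriptor matching) of the trained network described above, which can also regress a keypoint to a location outside its own cell:

```python
def detect_per_cell(response, cell=2):
    """Pick the strongest-response pixel for each cell x cell region of a
    response map, yielding one keypoint per cell."""
    keypoints = []
    for i in range(0, len(response), cell):
        for j in range(0, len(response[0]), cell):
            _, best = max(
                (response[i + di][j + dj], (i + di, j + dj))
                for di in range(cell) for dj in range(cell)
            )
            keypoints.append(best)
    return keypoints

def match(desc_a, desc_b):
    """Greedy nearest-descriptor matching (squared L2) between two views'
    keypoint descriptor tables."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return {ka: min(desc_b, key=lambda kb: dist(desc_a[ka], desc_b[kb]))
            for ka in desc_a}

kps = detect_per_cell([[1, 5], [0, 3]])   # one 2x2 cell
pairs = match({(0, 1): (1.0, 0.0)},
              {(2, 3): (0.9, 0.1), (7, 7): (0.0, 1.0)})
```

A production matcher would add a mutual-consistency or ratio test before classifying a pair as a match.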
  • Publication number: 20210004646
    Abstract: System, methods, and other embodiments described herein relate to semi-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes training the depth model according to a first stage that is self-supervised and that includes using first training data that comprises pairs of training images. Respective ones of the pairs include separate frames depicting a scene of a monocular video. The method includes training the depth model according to a second stage that is weakly supervised and that includes using second training data to produce depth maps according to the depth model. The second training data comprises individual images with corresponding sparse depth data, and provides for updating the depth model according to second stage loss values that are based, at least in part, on the depth maps and the depth data.
    Type: Application
    Filed: December 3, 2019
    Publication date: January 7, 2021
    Inventors: Vitor Guizilini, Sudeep Pillai, Rares A. Ambrus, Jie Li, Adrien David Gaidon
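The distinctive piece of the second, weakly supervised stage is that loss is computed only at pixels where sparse depth exists. A sketch, with made-up pixel coordinates standing in for sparse returns (e.g. from LiDAR):

```python
def sparse_depth_loss(depth_map, sparse_depth):
    """Stage-two weakly supervised loss: average L1 error at only the pixels
    where sparse ground-truth depth exists; all other pixels are ignored."""
    errors = [abs(depth_map[pixel] - d) for pixel, d in sparse_depth.items()]
    return sum(errors) / len(errors)

# Dense prediction versus two labeled pixels; the third pixel contributes
# nothing because no ground truth covers it.
prediction = {(0, 0): 9.0, (5, 5): 12.0, (9, 9): 7.0}
sparse_gt = {(0, 0): 10.0, (5, 5): 10.0}
loss = sparse_depth_loss(prediction, sparse_gt)
```

Stage one would train on image pairs with a self-supervised photometric objective; this sparse term then refines the model wherever labels are available.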
  • Publication number: 20210004976
    Abstract: System, methods, and other embodiments described herein relate to training a depth model for monocular depth estimation. In one embodiment, a method includes generating, as part of training the depth model according to a supervised training stage, a depth map from a first image of a pair of training images using the depth model. The pair of training images are separate frames depicting a scene from a monocular video. The method includes generating a transformation from the first image and a second image of the pair using a pose model. The method includes computing a supervised loss based, at least in part, on reprojecting the depth map and training depth data onto an image space of the second image according to at least the transformation. The method includes updating the depth model and the pose model according to at least the supervised loss.
    Type: Application
    Filed: November 20, 2019
    Publication date: January 7, 2021
    Inventors: Vitor Guizilini, Sudeep Pillai, Rares A. Ambrus, Jie Li, Adrien David Gaidon
  • Publication number: 20210004660
    Abstract: System, methods, and other embodiments described herein relate to estimating ego-motion. In one embodiment, a method for estimating ego-motion based on a plurality of input images in a self-supervised system includes receiving a source image and a target image, determining a depth estimation Dt based on the target image, determining a depth estimation Ds based on a source image, and determining an ego-motion estimation in a form of a six degrees-of-freedom (6 DOF) transformation between the target image and the source image by inputting the depth estimations (Dt, Ds), the target image, and the source image into a two-stream network architecture trained to output the 6 DOF transformation based at least in part on the depth estimations (Dt, Ds), the target image, and the source image.
    Type: Application
    Filed: October 16, 2019
    Publication date: January 7, 2021
    Inventors: Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Jie Li, Adrien David Gaidon
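The network's 6 DOF output can be illustrated by composing it into a rigid transform. The XYZ Euler-angle parameterization below is one common convention and an assumption here; the abstract does not specify how the six numbers are decoded:

```python
import math

def six_dof_to_transform(rx, ry, rz, tx, ty, tz):
    """Compose a rigid transform from a 6 DOF vector (XYZ Euler angles plus
    translation), with R = Rz @ Ry @ Rx."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rotation = [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]
    return rotation, (tx, ty, tz)

def apply(transform, point):
    """Apply the rigid transform to a 3D point."""
    r, t = transform
    return tuple(sum(r[i][j] * point[j] for j in range(3)) + t[i]
                 for i in range(3))

# Pure forward motion: no rotation, 1 m along the camera's z axis.
T = six_dof_to_transform(0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
moved = apply(T, (0.0, 0.0, 5.0))
```

The two-stream architecture itself (depth and image streams feeding the regression) is a learned component that this sketch does not attempt to reproduce.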
  • Publication number: 20210004974
    Abstract: System, methods, and other embodiments described herein relate to semi-supervised training of a depth model using a neural camera model that is independent of a camera type. In one embodiment, a method includes acquiring training data including at least a pair of training images and depth data associated with the training images. The method includes training the depth model using the training data to generate a self-supervised loss from the pair of training images and a supervised loss from the depth data. Training the depth model includes learning the camera type by generating, using a ray surface model, a ray surface that approximates an image character of the training images as produced by a camera having the camera type. The method includes providing the depth model to infer depths from monocular images in a device.
    Type: Application
    Filed: June 19, 2020
    Publication date: January 7, 2021
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
  • Publication number: 20200134379
    Abstract: Acquiring labeled data can be a significant bottleneck in the development of machine learning models that are accurate and efficient enough to enable safety-critical applications, such as automated driving. The process of labeling of driving logs can be automated. Unlabeled real-world driving logs, which include data captured by one or more vehicle sensors, can be automatically labeled to generate one or more labeled real-world driving logs. The automatic labeling can include analysis-by-synthesis on the unlabeled real-world driving logs to generate simulated driving logs, which can include reconstructed driving scenes or portions thereof. The automatic labeling can further include simulation-to-real automatic labeling on the simulated driving logs and the unlabeled real-world driving logs to generate one or more labeled real-world driving logs. The automatically labeled real-world driving logs can be stored in one or more data stores for subsequent training, validation, evaluation, and/or model management.
    Type: Application
    Filed: October 30, 2018
    Publication date: April 30, 2020
    Inventors: Adrien David Gaidon, James J. Kuffner, Jr., Sudeep Pillai
  • Publication number: 20200090359
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates from a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, flipping, by a disparity model, the monocular image to generate a flipped image. The disparity model is a machine learning algorithm. The method includes analyzing, using the disparity model, the monocular image and the flipped image to generate disparity maps including a monocular disparity map corresponding to the monocular image and a flipped disparity map corresponding with the flipped image. The method includes generating, in the disparity model, a fused disparity map from the monocular disparity map and the flipped disparity map. The method includes providing the fused disparity map as the depth estimates of objects represented in the monocular image.
    Type: Application
    Filed: February 15, 2019
    Publication date: March 19, 2020
    Inventors: Sudeep Pillai, Rares A. Ambrus, Adrien David Gaidon
  • Patent number: 10477178
    Abstract: A tunable and iterative stereo mapping technique is provided, capable of identifying disparities at or substantially faster than real-time (e.g., frame-rate of 120 Hz). The method includes identifying a plurality of points in an image, determining disparity values for each of the points in the image and generating a piece-wise planar mesh based on the points and their respective disparity values. A disparity interpolation can be performed on candidate planes using estimated plane parameters for the candidate planes and a disparity image can be generated having a plurality of regions based on the disparity interpolation. Multiple iterations can be performed until the image is reconstructed with an appropriate resolution based on predetermined thresholds. The thresholds can be modified to provide a tunable system, changing the threshold values to increase the resolution of the final reconstructed image and/or the computation speed of the stereo mapping technique.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: November 12, 2019
    Assignee: Massachusetts Institute of Technology
    Inventors: John Joseph Leonard, Sudeep Pillai
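The plane-fitting and interpolation step at the heart of this technique can be sketched directly: fit a disparity plane d = a*x + b*y + c through three support points, then evaluate it to fill in disparities across the region. A minimal version using Cramer's rule:

```python
def fit_plane(p1, p2, p3):
    """Fit the disparity plane d = a*x + b*y + c through three support
    points (x, y, disparity), solving the 3x3 system by Cramer's rule."""
    (x1, y1, d1), (x2, y2, d2), (x3, y3, d3) = p1, p2, p3
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    a = (d1 * (y2 - y3) - y1 * (d2 - d3) + (d2 * y3 - d3 * y2)) / det
    b = (x1 * (d2 - d3) - d1 * (x2 - x3) + (x2 * d3 - x3 * d2)) / det
    c = (x1 * (y2 * d3 - y3 * d2) - y1 * (x2 * d3 - x3 * d2)
         + d1 * (x2 * y3 - x3 * y2)) / det
    return a, b, c

def interpolate(plane, x, y):
    """Evaluate the fitted plane to fill in disparity inside the region."""
    a, b, c = plane
    return a * x + b * y + c

# Support points sampled from the plane d = 1 + x + 2*y:
plane = fit_plane((0, 0, 1), (1, 0, 2), (0, 1, 3))
filled = interpolate(plane, 2, 2)
```

Interpolating whole planar regions from a handful of matched support points is what lets the method trade resolution against speed across iterations.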
  • Publication number: 20190020861
    Abstract: A tunable and iterative stereo mapping technique is provided, capable of identifying disparities at or substantially faster than real-time (e.g., frame-rate of 120 Hz). The method includes identifying a plurality of points in an image, determining disparity values for each of the points in the image and generating a piece-wise planar mesh based on the points and their respective disparity values. A disparity interpolation can be performed on candidate planes using estimated plane parameters for the candidate planes and a disparity image can be generated having a plurality of regions based on the disparity interpolation. Multiple iterations can be performed until the image is reconstructed with an appropriate resolution based on predetermined thresholds. The thresholds can be modified to provide a tunable system, changing the threshold values to increase the resolution of the final reconstructed image and/or the computation speed of the stereo mapping technique.
    Type: Application
    Filed: June 29, 2017
    Publication date: January 17, 2019
    Inventors: John Joseph Leonard, Sudeep Pillai