Patents by Inventor Rares A. Ambrus

Rares A. Ambrus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11144818
    Abstract: System, methods, and other embodiments described herein relate to estimating ego-motion. In one embodiment, a method for estimating ego-motion based on a plurality of input images in a self-supervised system includes receiving a source image and a target image, determining a depth estimation Dt based on the target image, determining a depth estimation Ds based on a source image, and determining an ego-motion estimation in a form of a six degrees-of-freedom (6 DOF) transformation between the target image and the source image by inputting the depth estimations (Dt, Ds), the target image, and the source image into a two-stream network architecture trained to output the 6 DOF transformation based at least in part on the depth estimations (Dt, Ds), the target image, and the source image.
    Type: Grant
    Filed: October 16, 2019
    Date of Patent: October 12, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Jie Li, Adrien David Gaidon
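The 6 DOF output described in the abstract above is conventionally expressed as a translation plus three rotation angles. A minimal sketch of turning such a vector into the homogeneous transform between the target and source frames follows; the Euler-angle parameterization and rotation order are assumptions for illustration, not fixed by the patent:

```python
import numpy as np

def six_dof_to_matrix(xi):
    """Convert a 6-DOF vector (tx, ty, tz, rx, ry, rz) into a 4x4
    homogeneous transform, the output form the abstract describes.
    Rotations are Euler angles in radians, composed in z-y-x order
    (an illustrative choice, not from the patent)."""
    tx, ty, tz, rx, ry, rz = xi
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # rotation block
    T[:3, 3] = [tx, ty, tz]    # translation column
    return T
```

The two-stream network itself would regress `xi`; this helper only illustrates the transform that such a vector encodes.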
  • Patent number: 11138751
    Abstract: System, methods, and other embodiments described herein relate to training a depth model for monocular depth estimation. In one embodiment, a method includes generating, as part of training the depth model according to a supervised training stage, a depth map from a first image of a pair of training images using the depth model. The pair of training images are separate frames depicting a scene from a monocular video. The method includes generating a transformation from the first image and a second image of the pair using a pose model. The method includes computing a supervised loss based, at least in part, on reprojecting the depth map and training depth data onto an image space of the second image according to at least the transformation. The method includes updating the depth model and the pose model according to at least the supervised loss.
    Type: Grant
    Filed: November 20, 2019
    Date of Patent: October 5, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Sudeep Pillai, Rares A. Ambrus, Jie Li, Adrien David Gaidon
  • Publication number: 20210281814
    Abstract: System, methods, and other embodiments described herein relate to improving depth estimates for monocular images using a neural camera model that is independent of a camera type. In one embodiment, a method includes receiving a monocular image from a pair of training images derived from a monocular video. The method includes generating, using a ray surface network, a ray surface that approximates an image character of the monocular image as produced by a camera having the camera type. The method includes creating a synthesized image according to at least the ray surface and a depth map associated with the monocular image.
    Type: Application
    Filed: June 12, 2020
    Publication date: September 9, 2021
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
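The ray-surface idea above replaces fixed pinhole intrinsics with a learned per-pixel viewing ray, so unprojection reduces to a pointwise product. A toy sketch under that assumption (shapes and names are illustrative; the `rays` array would come from the ray surface network):

```python
import numpy as np

def unproject(depth, rays):
    """Lift a depth map to 3D points using per-pixel rays: instead of a
    single intrinsics matrix, every pixel carries its own viewing ray,
    and a 3D point is just depth times that ray.
    depth: (H, W) array; rays: (H, W, 3) array of unit rays."""
    return depth[..., None] * rays
```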
  • Patent number: 11107230
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates from a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, flipping, by a disparity model, the monocular image to generate a flipped image. The disparity model is a machine learning algorithm. The method includes analyzing, using the disparity model, the monocular image and the flipped image to generate disparity maps including a monocular disparity map corresponding to the monocular image and a flipped disparity map corresponding with the flipped image. The method includes generating, in the disparity model, a fused disparity map from the monocular disparity map and the flipped disparity map. The method includes providing the fused disparity map as the depth estimates of objects represented in the monocular image.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: August 31, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Sudeep Pillai, Rares A. Ambrus, Adrien David Gaidon
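The flip-and-fuse step in the abstract above lends itself to a compact sketch: run the model on the image and on its mirror, mirror the second map back into alignment, and combine the two. Here `disparity_fn` stands in for the learned disparity model and simple averaging stands in for the patent's fusion step; both are assumptions for illustration:

```python
import numpy as np

def fused_disparity(image, disparity_fn):
    """Estimate disparity on an image and its horizontal mirror, un-flip
    the mirrored estimate, and average the two maps.
    image: (H, W) array; disparity_fn: placeholder for a learned model."""
    d = disparity_fn(image)
    d_flipped = disparity_fn(image[:, ::-1])[:, ::-1]  # flip in, flip back out
    return 0.5 * (d + d_flipped)
```

With a left-right-symmetric model the two maps agree; in practice the averaging suppresses artifacts near occlusion boundaries that appear on only one side.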
  • Publication number: 20210237774
    Abstract: A method for learning depth-aware keypoints and associated descriptors from monocular video for monocular visual odometry is described. The method includes training a keypoint network and a depth network to learn depth-aware keypoints and the associated descriptors. The training is based on a target image and a context image from successive images of the monocular video. The method also includes lifting 2D keypoints from the target image to learn 3D keypoints based on a learned depth map from the depth network. The method further includes estimating a trajectory of an ego-vehicle based on the learned 3D keypoints.
    Type: Application
    Filed: November 9, 2020
    Publication date: August 5, 2021
    Applicant: Toyota Research Institute, Inc.
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Adrien David Gaidon

  • Publication number: 20210237764
    Abstract: A method for learning depth-aware keypoints and associated descriptors from monocular video for ego-motion estimation is described. The method includes training a keypoint network and a depth network to learn depth-aware keypoints and the associated descriptors. The training is based on a target image and a context image from successive images of the monocular video. The method also includes lifting 2D keypoints from the target image to learn 3D keypoints based on a learned depth map from the depth network. The method further includes estimating ego-motion from the target image to the context image based on the learned 3D keypoints.
    Type: Application
    Filed: November 9, 2020
    Publication date: August 5, 2021
    Applicant: Toyota Research Institute, Inc.
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Adrien David Gaidon
  • Patent number: 11054839
    Abstract: A mobile robotic device receives point cloud data corresponding to an internal space of a facility and processes the data to generate a map of the facility that enables the mobile robotic device to move within the internal space. The processing of the point cloud data includes segmentation of the data into planar primitives that are identified as ceiling, floor, or wall primitives. Openings in the wall primitives are identified as doors or occlusions. Viewpoints for the processed planar primitives are generated and a complex cell data structure is generated with vertices representing faces of the structure and edges representing walls. After an energy minimization of the complex cell structure is performed, adjacent regions of space are evaluated for merger and a map of the internal space is generated. The mobile robotic device moves through the internal space of the facility with reference to the map.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: July 6, 2021
    Assignee: Robert Bosch GmbH
    Inventors: Axel Wendt, Rares Ambrus, Sebastian Claici
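The labelling of planar primitives as ceiling, floor, or wall in the abstract above can be illustrated by thresholding each segment's normal against the vertical axis. This is a toy approximation; the patent's actual criteria and thresholds are not specified here:

```python
import numpy as np

def classify_plane(normal, up=(0.0, 0.0, 1.0), tol_deg=15.0):
    """Label a planar segment by its normal: near-parallel to 'up' is a
    floor, near-antiparallel is a ceiling, anything else is a wall.
    The tolerance angle is illustrative, not from the patent."""
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)
    cos_a = float(np.dot(n, np.asarray(up, dtype=float)))
    if cos_a > np.cos(np.radians(tol_deg)):
        return "floor"    # surface normal points up
    if cos_a < -np.cos(np.radians(tol_deg)):
        return "ceiling"  # surface normal points down
    return "wall"
```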
  • Publication number: 20210118163
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates of an environment depicted in a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, processing the monocular image according to a depth model to generate a depth map. Processing the monocular images includes encoding the monocular image according to encoding layers of the depth model including iteratively encoding features of the monocular image to generate feature maps at successively refined representations using packing blocks within the encoding layers. Processing the monocular image further includes decoding the feature maps according to decoding layers of the depth model including iteratively decoding the features maps associated with separate ones of the packing blocks using unpacking blocks of the decoding layers to generate the depth map. The method includes providing the depth map as the depth estimates of objects represented in the monocular image.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
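The packing and unpacking blocks above are built around space-to-depth and depth-to-space rearrangements, which fold spatial detail into channels rather than discarding it as pooling would. A NumPy sketch of just that rearrangement, with the learned convolution stages the abstract pairs with it omitted for brevity:

```python
import numpy as np

def pack(features, r=2):
    """Space-to-depth 'packing': fold each r x r spatial block into the
    channel dimension, halving resolution without losing information.
    Layout assumed (H, W, C) for illustration."""
    h, w, c = features.shape
    out = features.reshape(h // r, r, w // r, r, c)
    out = out.transpose(0, 2, 1, 3, 4)           # group the r x r block last
    return out.reshape(h // r, w // r, r * r * c)

def unpack(features, r=2):
    """Depth-to-space 'unpacking': the inverse rearrangement used on the
    decoder side to recover spatial resolution."""
    h, w, c = features.shape
    out = features.reshape(h, w, r, r, c // (r * r))
    out = out.transpose(0, 2, 1, 3, 4)           # interleave blocks back
    return out.reshape(h * r, w * r, c // (r * r))
```

Because the rearrangement is a bijection, `unpack(pack(x))` recovers `x` exactly, which is the property that distinguishes packing from lossy striding or pooling.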
  • Publication number: 20210118184
    Abstract: System, methods, and other embodiments described herein relate to self-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes processing a first image of a pair according to the depth model to generate a depth map. The method includes processing the first image and a second image of the pair according to a pose model to generate a transformation that defines a relationship between the pair. The pair of images are separate frames depicting a scene of a monocular video. The method includes generating a monocular loss and a pose loss, the pose loss including at least a velocity component that accounts for motion of a camera between the training images. The method includes updating the pose model according to the pose loss and the depth model according to the monocular loss to improve scale awareness of the depth model in producing depth estimates.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 22, 2021
    Inventors: Sudeep Pillai, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
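The velocity component described above admits a one-line illustration: the norm of the predicted inter-frame translation should match the distance implied by the measured camera speed, which is what pins otherwise scale-ambiguous monocular estimates to metric scale. Names below are illustrative, not from the patent:

```python
import numpy as np

def velocity_loss(pred_translation, speed, dt):
    """Penalize disagreement between the magnitude of the predicted
    inter-frame translation and the distance travelled according to the
    measured camera speed over the frame interval dt (seconds)."""
    return abs(np.linalg.norm(pred_translation) - speed * dt)
```

In training, this term would be added to the photometric pose loss so the pose network, and through it the depth network, learns metrically scaled outputs.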
  • Publication number: 20210090280
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates of an environment depicted in a monocular image. In one embodiment, a method includes identifying semantic features in the monocular image according to a semantic model. The method includes injecting the semantic features into a depth model using pixel-adaptive convolutions. The method includes generating a depth map from the monocular image using the depth model that is guided by the semantic features. The pixel-adaptive convolutions are integrated into a decoder of the depth model. The method includes providing the depth map as the depth estimates for the monocular image.
    Type: Application
    Filed: February 28, 2020
    Publication date: March 25, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Jie Li, Adrien David Gaidon
  • Publication number: 20210089890
    Abstract: Systems and methods for detecting and matching keypoints between different views of a scene are disclosed herein. One embodiment acquires first and second images; subdivides the first and second images into first and second pluralities of cells, respectively; processes both pluralities of cells using a neural keypoint detection network to identify a first keypoint for a particular cell in the first plurality of cells and a second keypoint for a particular cell in the second plurality of cells, at least one of the first and second keypoints lying in a cell other than the particular cell in the first or second plurality of cells for which it was identified; and classifies the first keypoint and the second keypoint as a matching keypoint pair based, at least in part, on a comparison between a first descriptor associated with the first keypoint and a second descriptor associated with the second keypoint.
    Type: Application
    Filed: March 31, 2020
    Publication date: March 25, 2021
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim
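The descriptor-comparison step in the abstract above can be illustrated with plain nearest-neighbour matching plus a ratio test; the patent's own matching criterion may differ, so treat this as a generic stand-in:

```python
import numpy as np

def match_keypoints(desc_a, desc_b, ratio=0.8):
    """Match descriptors between two views: for each descriptor in view A,
    find its nearest neighbour in view B, and accept the pair only if the
    best distance clearly beats the second best (Lowe's ratio test).
    desc_a: (N, D) array; desc_b: (M, D) array, M >= 2."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```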
  • Publication number: 20210089836
    Abstract: Systems and methods for training a neural keypoint detection network are disclosed herein. One embodiment extracts a portion of an input image; applies a transformation to the portion of the input image to produce a transformed portion of the input image; processes the portion of the input image and the transformed portion of the input image using the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image; and processes the one or more candidate keypoint pairs using an inlier-outlier neural network, the inlier-outlier neural network producing an indirect supervisory signal to train the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image.
    Type: Application
    Filed: March 31, 2020
    Publication date: March 25, 2021
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim
  • Publication number: 20210090277
    Abstract: System, methods, and other embodiments described herein relate to self-supervised training for monocular depth estimation. In one embodiment, a method includes filtering disfavored images from first training data to produce second training data that is a subsampled version of the first training data. The disfavored images correspond to anomaly maps within a set of depth maps. A first depth model, initially trained on the first training data, generates the depth maps from that data. The method includes training a second depth model according to a self-supervised training process using the second training data. The method includes providing the second depth model to infer distances from monocular images.
    Type: Application
    Filed: March 24, 2020
    Publication date: March 25, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Rui Hou, Jie Li, Adrien David Gaidon
  • Publication number: 20210004646
    Abstract: System, methods, and other embodiments described herein relate to semi-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes training the depth model according to a first stage that is self-supervised and that includes using first training data that comprises pairs of training images. Respective ones of the pairs include separate frames depicting a scene of a monocular video. The method includes training the depth model according to a second stage that is weakly supervised and that includes using second training data to produce depth maps according to the depth model. The second training data comprises individual images with corresponding sparse depth data and provides for updating the depth model according to second-stage loss values that are based, at least in part, on the depth maps and the depth data.
    Type: Application
    Filed: December 3, 2019
    Publication date: January 7, 2021
    Inventors: Vitor Guizilini, Sudeep Pillai, Rares A. Ambrus, Jie Li, Adrien David Gaidon
  • Publication number: 20210004976
    Abstract: System, methods, and other embodiments described herein relate to training a depth model for monocular depth estimation. In one embodiment, a method includes generating, as part of training the depth model according to a supervised training stage, a depth map from a first image of a pair of training images using the depth model. The pair of training images are separate frames depicting a scene from a monocular video. The method includes generating a transformation from the first image and a second image of the pair using a pose model. The method includes computing a supervised loss based, at least in part, on reprojecting the depth map and training depth data onto an image space of the second image according to at least the transformation. The method includes updating the depth model and the pose model according to at least the supervised loss.
    Type: Application
    Filed: November 20, 2019
    Publication date: January 7, 2021
    Inventors: Vitor Guizilini, Sudeep Pillai, Rares A. Ambrus, Jie Li, Adrien David Gaidon
  • Publication number: 20210004660
    Abstract: System, methods, and other embodiments described herein relate to estimating ego-motion. In one embodiment, a method for estimating ego-motion based on a plurality of input images in a self-supervised system includes receiving a source image and a target image, determining a depth estimation Dt based on the target image, determining a depth estimation Ds based on a source image, and determining an ego-motion estimation in a form of a six degrees-of-freedom (6 DOF) transformation between the target image and the source image by inputting the depth estimations (Dt, Ds), the target image, and the source image into a two-stream network architecture trained to output the 6 DOF transformation based at least in part on the depth estimations (Dt, Ds), the target image, and the source image.
    Type: Application
    Filed: October 16, 2019
    Publication date: January 7, 2021
    Inventors: Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Jie Li, Adrien David Gaidon
  • Publication number: 20210004974
    Abstract: System, methods, and other embodiments described herein relate to semi-supervised training of a depth model using a neural camera model that is independent of a camera type. In one embodiment, a method includes acquiring training data including at least a pair of training images and depth data associated with the training images. The method includes training the depth model using the training data to generate a self-supervised loss from the pair of training images and a supervised loss from the depth data. Training the depth model includes learning the camera type by generating, using a ray surface model, a ray surface that approximates an image character of the training images as produced by a camera having the camera type. The method includes providing the depth model to infer depths from monocular images in a device.
    Type: Application
    Filed: June 19, 2020
    Publication date: January 7, 2021
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
  • Publication number: 20200090359
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates from a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, flipping, by a disparity model, the monocular image to generate a flipped image. The disparity model is a machine learning algorithm. The method includes analyzing, using the disparity model, the monocular image and the flipped image to generate disparity maps including a monocular disparity map corresponding to the monocular image and a flipped disparity map corresponding with the flipped image. The method includes generating, in the disparity model, a fused disparity map from the monocular disparity map and the flipped disparity map. The method includes providing the fused disparity map as the depth estimates of objects represented in the monocular image.
    Type: Application
    Filed: February 15, 2019
    Publication date: March 19, 2020
    Inventors: Sudeep Pillai, Rares A. Ambrus, Adrien David Gaidon
  • Publication number: 20190324474
    Abstract: A mobile robotic device receives point cloud data corresponding to an internal space of a facility and processes the data to generate a map of the facility that enables the mobile robotic device to move within the internal space. The processing of the point cloud data includes segmentation of the data into planar primitives that are identified as ceiling, floor, or wall primitives. Openings in the wall primitives are identified as doors or occlusions. Viewpoints for the processed planar primitives are generated and a complex cell data structure is generated with vertices representing faces of the structure and edges representing walls. After an energy minimization of the complex cell structure is performed, adjacent regions of space are evaluated for merger and a map of the internal space is generated. The mobile robotic device moves through the internal space of the facility with reference to the map.
    Type: Application
    Filed: December 28, 2017
    Publication date: October 24, 2019
    Inventors: Axel Wendt, Rares Ambrus, Sebastian Claici