Patents by Inventor Rares A. Ambrus

Rares A. Ambrus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220084230
    Abstract: Systems and methods for self-supervised depth estimation using image frames captured from a vehicle-mounted camera may include: receiving a first image captured by the camera while the camera is mounted at a first location on the vehicle, the first image comprising pixels representing a scene of the environment of the vehicle; receiving a reference image captured by the camera while the camera is mounted at a second location on the vehicle, the reference image comprising pixels representing a scene of the environment; predicting a depth map for the first image comprising predicted depth values for pixels of the first image; warping the first image to a perspective of the camera at the second location on the vehicle to arrive at a warped first image; projecting the warped first image onto the reference image; determining a loss based on the projection; and updating predicted depth values for the first image.
    Type: Application
    Filed: September 15, 2020
    Publication date: March 17, 2022
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Adrien Gaidon
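
To make the warp-and-compare objective concrete, here is a minimal PyTorch sketch assuming known intrinsics `K` and a known relative pose `T` between the two mounting locations. It uses the standard inverse-warping formulation (sampling the reference image at the projected coordinates), which differs slightly in direction from the abstract's wording; all names are illustrative, not taken from the patent.

```python
import torch
import torch.nn.functional as F

def backproject(depth, K_inv):
    """Lift every pixel to a 3D point using its predicted depth."""
    b, _, h, w = depth.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=depth.device),
        torch.arange(w, device=depth.device), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).float().reshape(3, -1)
    rays = K_inv @ pix                       # (B, 3, H*W) camera rays
    return depth.reshape(b, 1, -1) * rays    # scale each ray by its depth

def photometric_loss(first, reference, depth, K, T):
    """Synthesize the first image from the reference view and penalize the
    photometric difference. T maps points from the first mount to the second."""
    b, _, h, w = first.shape
    pts = T[:, :3, :3] @ backproject(depth, torch.linalg.inv(K)) + T[:, :3, 3:]
    proj = K @ pts                           # reproject into the second view
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)
    u = 2.0 * uv[:, 0] / (w - 1) - 1.0       # normalize for grid_sample
    v = 2.0 * uv[:, 1] / (h - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(b, h, w, 2)
    warped = F.grid_sample(reference, grid, align_corners=True)
    return (warped - first).abs().mean()     # L1 photometric loss
```
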
  • Publication number: 20220058817
    Abstract: A system for generating point clouds having surface normal information includes one or more processors and a memory having a depth map generating module, a point cloud generating module, and a surface normal generating module. The depth map generating module causes the one or more processors to generate a depth map from one or more images of a scene. The point cloud generating module causes the one or more processors to generate, from the depth map, a point cloud having a plurality of points corresponding to one or more pixels of the depth map. The surface normal generating module causes the one or more processors to generate surface normal information for at least a portion of the one or more pixels of the depth map and inject the surface normal information into the point cloud such that the plurality of points of the point cloud include three-dimensional location information and surface normal information.
    Type: Application
    Filed: August 24, 2020
    Publication date: February 24, 2022
    Inventors: Victor Vaquero Gomez, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
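
A minimal NumPy sketch of the depth-map-to-point-cloud path with injected normals, assuming pinhole intrinsics (fx, fy, cx, cy); the finite-difference normal estimate is one common choice, not necessarily the patented one.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Backproject a (H, W) depth map to per-pixel 3D points."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    x = (us - cx) * depth / fx
    y = (vs - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)            # (H, W, 3)

def points_with_normals(depth, fx, fy, cx, cy):
    pts = depth_to_points(depth, fx, fy, cx, cy)
    # Finite differences along image rows/columns approximate surface
    # tangents; their cross product gives an (unnormalized) normal.
    du = np.gradient(pts, axis=1)
    dv = np.gradient(pts, axis=0)
    normals = np.cross(du, dv)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True) + 1e-8
    # Inject normals: each point carries 3D location + surface normal.
    return np.concatenate([pts, normals], axis=-1).reshape(-1, 6)
```
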
  • Publication number: 20220055663
    Abstract: A method for behavior cloned vehicle trajectory planning is described. The method includes perceiving vehicles proximate an ego vehicle in a driving environment, including determining a scalar confidence value for each perceived vehicle. The method also includes generating a bird's-eye-view (BEV) grid showing the ego vehicle and each perceived vehicle based on each scalar confidence value. The method further includes ignoring at least one of the perceived vehicles when the scalar confidence value of the at least one of the perceived vehicles is less than a predetermined value. The method also includes selecting an ego vehicle trajectory based on a cloned expert vehicle behavior policy according to remaining perceived vehicles.
    Type: Application
    Filed: August 21, 2020
    Publication date: February 24, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Andreas BUEHLER, Adrien David GAIDON, Rares A. AMBRUS, Guy ROSMAN, Wolfram BURGARD
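
A minimal sketch of the confidence-gated BEV rasterization described above; the `Detection` fields, grid layout, and threshold value are all illustrative assumptions, not values from the patent.

```python
from dataclasses import dataclass
import numpy as np

CONF_THRESHOLD = 0.5  # hypothetical "predetermined value"

@dataclass
class Detection:
    x: float           # meters ahead of the ego vehicle
    y: float           # meters left of the ego vehicle
    confidence: float  # scalar confidence of the perceived vehicle

def make_bev_grid(detections, size=128, meters_per_cell=0.5):
    grid = np.zeros((size, size), dtype=np.float32)
    grid[size // 2, size // 2] = 1.0  # ego vehicle at the grid center
    for det in detections:
        if det.confidence < CONF_THRESHOLD:
            continue  # ignore low-confidence perceived vehicles
        row = size // 2 - int(det.x / meters_per_cell)
        col = size // 2 + int(det.y / meters_per_cell)
        if 0 <= row < size and 0 <= col < size:
            grid[row, col] = det.confidence
    return grid  # input to the behavior-cloned trajectory policy
```
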
  • Patent number: 11256986
    Abstract: Systems and methods for training a neural keypoint detection network are disclosed herein. One embodiment extracts a portion of an input image; applies a transformation to the portion of the input image to produce a transformed portion of the input image; processes the portion of the input image and the transformed portion of the input image using the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image; and processes the one or more candidate keypoint pairs using an inlier-outlier neural network, the inlier-outlier neural network producing an indirect supervisory signal to train the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: February 22, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim
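
A minimal PyTorch sketch of the candidate-pair step: the crop and its transformed copy both pass through the keypoint network, and nearest-neighbor descriptor matches become candidate pairs for the inlier-outlier network (not shown) to score. `keypoint_net` is a stand-in, not the patented model.

```python
import torch
import torch.nn.functional as F

def candidate_pairs(keypoint_net, crop, warped_crop):
    """Return candidate keypoint pairs between a crop and its warped copy."""
    kpts_a, desc_a = keypoint_net(crop)          # (N, 2), (N, D)
    kpts_b, desc_b = keypoint_net(warped_crop)   # (M, 2), (M, D)
    # Cosine similarity between all descriptor pairs.
    sim = F.normalize(desc_a, dim=1) @ F.normalize(desc_b, dim=1).T
    nn_idx = sim.argmax(dim=1)                   # best match per keypoint
    return [(kpts_a[i], kpts_b[j]) for i, j in enumerate(nn_idx)]
```

A known transformation (e.g., `warped_crop = torch.rot90(crop, 1, dims=(-2, -1))`) makes it cheap to check which candidate pairs are geometrically consistent.
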
  • Patent number: 11257231
    Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes training a monocular depth model and a monocular pose model to learn monocular depth estimation and monocular pose estimation based on a target image and context images from monocular video captured by the camera agnostic network. The method also includes lifting 3D points from image pixels of the target image according to the context images. The method further includes projecting the lifted 3D points onto an image plane according to a predicted ray vector based on the monocular depth model, the monocular pose model, and a camera center of the camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: February 22, 2022
    Assignee: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor Guizilini, Sudeep Pillai, Adrien David Gaidon, Rares A. Ambrus, Igor Vasiljevic
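
The "lifting" step for a camera-agnostic model can be sketched compactly: instead of pinhole intrinsics, a per-pixel ray surface predicted by the network scales with depth. A minimal PyTorch sketch under assumed shapes and names:

```python
import torch
import torch.nn.functional as F

def lift_with_ray_surface(depth, ray_surface, camera_center):
    """depth: (B, 1, H, W); ray_surface: (B, 3, H, W), one ray per pixel;
    camera_center: (B, 3), the origin shared by all rays."""
    rays = F.normalize(ray_surface, dim=1)                    # unit rays
    return camera_center[:, :, None, None] + depth * rays     # (B, 3, H, W)
```
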
  • Publication number: 20220036650
    Abstract: The embodiments disclosed herein describe vehicles, systems and methods for multi-resolution fusion of pseudo-LiDAR features. In one aspect, a method for multi-resolution fusion of pseudo-LiDAR features includes receiving image data from one or more image sensors, generating a point cloud from the image data, generating, from the point cloud, a first bird's eye view map having a first resolution, generating, from the point cloud, a second bird's eye view map having a second resolution, and generating a combined bird's eye view map by combining features of the first bird's eye view map with features from the second bird's eye view map.
    Type: Application
    Filed: July 28, 2020
    Publication date: February 3, 2022
    Applicant: Toyota Research Institute, Inc.
    Inventors: Victor Vaquero Gomez, Rares A. Ambrus, Vitor Guizilini, Adrien D. Gaidon
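
A minimal PyTorch sketch of the fusion step, assuming the coarse BEV map is upsampled to the fine grid and the two feature maps are concatenated channel-wise; the channel counts and the 1x1 mixing convolution are illustrative choices, not details from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BevFusion(nn.Module):
    def __init__(self, fine_ch=64, coarse_ch=64, out_ch=64):
        super().__init__()
        self.mix = nn.Conv2d(fine_ch + coarse_ch, out_ch, kernel_size=1)

    def forward(self, bev_fine, bev_coarse):
        # Bring the low-resolution map up to the fine grid, then fuse.
        coarse_up = F.interpolate(
            bev_coarse, size=bev_fine.shape[-2:], mode="bilinear",
            align_corners=False)
        return self.mix(torch.cat([bev_fine, coarse_up], dim=1))
```
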
  • Publication number: 20220024048
    Abstract: A deformable sensor comprises an enclosure comprising a deformable membrane, the enclosure configured to be filled with a medium, and an imaging sensor, disposed within the enclosure, having a field of view configured to be directed toward a bottom surface of the deformable membrane. The imaging sensor is configured to capture an image of the deformable membrane. The deformable sensor is configured to determine depth values for a plurality of points on the deformable membrane based on the image captured by the imaging sensor and a trained neural network.
    Type: Application
    Filed: January 13, 2021
    Publication date: January 27, 2022
    Applicant: Toyota Research Institute, Inc.
    Inventors: Rares A. Ambrus, Vitor Guizilini, Naveen Suresh Kuppuswamy, Andrew M. Beaulieu, Adrien D. Gaidon, Alexander Alspach
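
A minimal sketch of the learned image-to-membrane-depth mapping; this tiny encoder-decoder is a stand-in under assumed shapes, not the network the patent trains.

```python
import torch.nn as nn

# Maps the internal camera's (B, 3, H, W) image of the membrane to a
# (B, 1, H, W) map of depth values across the deformable membrane.
membrane_depth_net = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
)
# depth = membrane_depth_net(image)
```
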
  • Publication number: 20220026918
    Abstract: A method for controlling an ego agent includes capturing a two-dimensional (2D) image of an environment adjacent to the ego agent. The method also includes generating a semantically segmented image of the environment based on the 2D image. The method further includes generating a depth map of the environment based on the semantically segmented image. The method additionally includes generating a three-dimensional (3D) estimate of the environment based on the depth map. The method also includes controlling an action of the ego agent based on the 3D estimate of the environment.
    Type: Application
    Filed: July 23, 2020
    Publication date: January 27, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Jie LI, Rares A. AMBRUS, Sudeep PILLAI, Adrien GAIDON
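
The pipeline reads naturally as function composition; a minimal sketch with placeholder stages (none of these names come from the patent):

```python
def control_ego_agent(image_2d, segment, estimate_depth, lift_to_3d, plan):
    semantic = segment(image_2d)          # semantically segmented image
    depth_map = estimate_depth(semantic)  # depth from the segmented image
    scene_3d = lift_to_3d(depth_map)      # 3D estimate of the environment
    return plan(scene_3d)                 # action for the ego agent
```
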
  • Publication number: 20210407117
    Abstract: Systems and methods for extracting ground plane information directly from monocular images using self-supervised depth networks are disclosed. Self-supervised depth networks are used to generate a three-dimensional reconstruction of observed structures. From this reconstruction the system may generate surface normals. The surface normals can be calculated directly from depth maps in a way that is far less computationally expensive, and more accurate, than surface normal extraction from standard LiDAR data. Surface normals facing substantially the same direction and facing upwards may be determined to reflect a ground plane.
    Type: Application
    Filed: June 26, 2020
    Publication date: December 30, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Rares A. AMBRUS, Adrien David GAIDON
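
A minimal NumPy sketch of the ground-plane test: normals derived from the depth map (see the normals sketch earlier in this listing) are kept where they point mostly "up". The y-down camera convention and the cosine threshold are assumptions.

```python
import numpy as np

def ground_mask(normals, up=(0.0, -1.0, 0.0), cos_thresh=0.9):
    """normals: (H, W, 3) unit normals; `up` assumes a y-down camera frame."""
    up = np.asarray(up) / np.linalg.norm(up)
    cos = normals @ up                # cosine of the angle to "up"
    return cos > cos_thresh           # True where the surface is ~ground
```
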
  • Publication number: 20210407115
    Abstract: Systems and methods for generating depth models and depth maps from images obtained from an imaging system are presented. A self-supervised neural network may be capable of regularizing depth information from surface normals. Rather than rely on separate depth and surface normal networks, surface normal information is extracted from the depth information and a smoothness function is applied to the surface normals instead of a depth gradient. Smoothing the surface normals may provide an improved representation of environmental structures by smoothing texture-less areas while preserving sharp boundaries between structures.
    Type: Application
    Filed: June 26, 2020
    Publication date: December 30, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Adrien David GAIDON, Rares A. AMBRUS
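
A minimal PyTorch sketch of a smoothness penalty on surface normals rather than on the depth gradient; the edge-aware weighting by image gradients is a common convention and an assumption here, not a detail from the abstract.

```python
import torch

def normal_smoothness(normals, image):
    """normals: (B, 3, H, W); image: (B, 3, H, W)."""
    dn_x = (normals[..., :, 1:] - normals[..., :, :-1]).abs().mean(1)
    dn_y = (normals[..., 1:, :] - normals[..., :-1, :]).abs().mean(1)
    di_x = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1)
    di_y = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1)
    # Penalize normal changes less where the image itself has edges,
    # smoothing texture-less areas while preserving sharp boundaries.
    return (dn_x * torch.exp(-di_x)).mean() + (dn_y * torch.exp(-di_y)).mean()
```
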
  • Patent number: 11210802
    Abstract: Systems, methods, and other embodiments described herein relate to self-supervised training for monocular depth estimation. In one embodiment, a method includes filtering disfavored images from first training data to produce second training data that is a subsampled version of the first training data. The disfavored images correspond with anomaly maps within a set of depth maps generated by a first depth model that was initially trained on the first training data. The method includes training a second depth model according to a self-supervised training process using the second training data. The method includes providing the second depth model to infer distances from monocular images.
    Type: Grant
    Filed: March 24, 2020
    Date of Patent: December 28, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Rui Hou, Jie Li, Adrien David Gaidon
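
A minimal sketch of the filtering step, with `anomaly_score` standing in for however the anomaly maps are actually scored; the threshold is illustrative.

```python
def filter_training_data(first_training_data, depth_maps, anomaly_score,
                         threshold=0.8):
    """Drop images whose depth maps look anomalous, yielding the second,
    subsampled training set used to train the second depth model."""
    second_training_data = []
    for image, depth_map in zip(first_training_data, depth_maps):
        if anomaly_score(depth_map) < threshold:  # keep well-behaved images
            second_training_data.append(image)
    return second_training_data
```
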
  • Publication number: 20210398301
    Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes training a monocular depth model and a monocular pose model to learn monocular depth estimation and monocular pose estimation based on a target image and context images from monocular video captured by the camera agnostic network. The method also includes lifting 3D points from image pixels of the target image according to the context images. The method further includes projecting the lifted 3D points onto an image plane according to a predicted ray vector based on the monocular depth model, the monocular pose model, and a camera center of the camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
    Type: Application
    Filed: June 17, 2020
    Publication date: December 23, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON, Rares A. AMBRUS, Igor VASILJEVIC
  • Publication number: 20210387649
    Abstract: A representation of a spatial structure of objects in an image can be determined. A mode of a neural network can be set, in response to a receipt of the image and a receipt of a facing direction of a camera that produced the image. The mode can account for the facing direction. The facing direction can include one or more of a first facing direction of a first camera disposed on a vehicle or a second facing direction of a second camera disposed on the vehicle. The neural network can be executed, in response to the mode having been set, to determine the representation of the spatial structure of the objects in the image. The representation of the spatial structure of the objects in the image can be transmitted to an automotive navigation system to determine a distance between the vehicle and a specific object in the image.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Sudeep Pillai, Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon
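
A minimal sketch of one reading of the mode-setting step: each camera facing direction selects its own network variant before execution. Treating a "mode" as a per-direction set of weights is an assumption, as are the file names.

```python
import torch

MODE_WEIGHTS = {"front": "front_facing.pt", "rear": "rear_facing.pt"}

def set_mode(model, facing_direction):
    """Set the network's mode to account for the camera facing direction."""
    model.load_state_dict(torch.load(MODE_WEIGHTS[facing_direction]))
    return model  # executed next to produce the spatial-structure output
```
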
  • Publication number: 20210390714
    Abstract: A two-dimensional image can be received. A depth map can be produced, via a first neural network, from the two-dimensional image. A bird's eye view image can be produced, via a second neural network, from the depth map. The second neural network can implement a machine learning algorithm that preserves spatial gradient information associated with one or more objects included in the depth map and causes a position of a pixel in an object, included in the bird's eye view image, to be represented by a differentiable function. Three-dimensional objects can be detected, via a third neural network, from the two-dimensional image, the bird's eye view image, and the spatial gradient information. A combination of the first neural network, the second neural network, and the third neural network can be end-to-end trainable and can be included in a perception system.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
  • Publication number: 20210387648
    Abstract: Information that identifies a location can be received. In response to a receipt of the information that identifies the location, a file can be retrieved. The file can be for the location. The file can include image data and a set of node data. The set of node data can include information that identifies nodes in a neural network, information that identifies inputs of the nodes, and values of weights to be applied to the inputs. In response to a retrieval of the file, the weights can be applied to the inputs of the nodes and the image data can be received for the neural network. In response to an application of the weights and a receipt of the image data, the neural network can be executed to produce a digital map for the location. The digital map for the location can be transmitted to an automotive navigation system.
    Type: Application
    Filed: June 10, 2020
    Publication date: December 16, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
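
A minimal PyTorch sketch of the retrieve-apply-execute flow; the file name and key layout are assumptions, not a format from the patent.

```python
import torch

def map_for_location(location_id, model):
    """Retrieve the per-location file, apply its weights, run the network."""
    blob = torch.load(f"{location_id}.pt")       # hypothetical file layout
    model.load_state_dict(blob["node_weights"])  # weights for the nodes
    with torch.no_grad():
        return model(blob["image_data"])         # digital map for location
```
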
  • Publication number: 20210365697
    Abstract: A system and method generate feature space data that may be used for object detection. The system includes one or more processors and a memory. The memory may include one or more modules having instructions that, when executed by the one or more processors, cause the one or more processors to obtain a two-dimensional image of a scene, generate an output depth map based on the two-dimensional image of the scene, generate a pseudo-LIDAR point cloud based on the output depth map, generate a bird's eye view (BEV) feature space based on the pseudo-LIDAR point cloud, and modify the BEV feature space to generate an improved BEV feature space using a feature space neural network that was trained by using a training LIDAR feature space as a ground truth based on a LIDAR point cloud.
    Type: Application
    Filed: May 20, 2020
    Publication date: November 25, 2021
    Inventors: Victor Vaquero Gomez, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
  • Patent number: 11176709
    Abstract: Systems, methods, and other embodiments described herein relate to self-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes processing a first image of a pair according to the depth model to generate a depth map. The method includes processing the first image and a second image of the pair according to a pose model to generate a transformation that defines a relationship between the pair. The pair of images are separate frames depicting a scene of a monocular video. The method includes generating a monocular loss and a pose loss, the pose loss including at least a velocity component that accounts for motion of a camera between the training images. The method includes updating the pose model according to the pose loss and the depth model according to the monocular loss to improve scale awareness of the depth model in producing depth estimates.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: November 16, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Sudeep Pillai, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
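
The velocity component of the pose loss can be sketched in a few lines of PyTorch: the norm of the predicted inter-frame translation should match the distance implied by measured speed, which anchors the metric scale of the depth estimates. Shapes and names are assumed.

```python
import torch

def velocity_loss(pred_translation, speed, dt):
    """pred_translation: (B, 3) predicted camera translation between frames;
    speed: (B,) measured speed in m/s; dt: (B,) inter-frame time in seconds."""
    return (pred_translation.norm(dim=1) - speed * dt).abs().mean()
```
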
  • Publication number: 20210350222
    Abstract: Systems and methods to improve machine learning by explicitly over-fitting environmental data obtained by an imaging system, such as a monocular camera, are disclosed. The system includes training self-supervised depth and pose networks on monocular visual data collected from a certain area over multiple passes. Pose and depth networks may be trained by extracting data from multiple images of a single environment or trajectory, allowing the system to overfit the image data.
    Type: Application
    Filed: May 5, 2020
    Publication date: November 11, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Rares A. AMBRUS, Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON
  • Patent number: 11145074
    Abstract: Systems, methods, and other embodiments described herein relate to generating depth estimates of an environment depicted in a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, processing the monocular image according to a depth model to generate a depth map. Processing the monocular image includes encoding the monocular image according to encoding layers of the depth model, including iteratively encoding features of the monocular image to generate feature maps at successively refined representations using packing blocks within the encoding layers. Processing the monocular image further includes decoding the feature maps according to decoding layers of the depth model, including iteratively decoding the feature maps associated with separate ones of the packing blocks using unpacking blocks of the decoding layers to generate the depth map. The method includes providing the depth map as the depth estimates of objects represented in the monocular image.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: October 12, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
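
A minimal PyTorch sketch of a packing block: space-to-depth folds spatial detail into channels before convolution, so downsampling discards less information; an unpacking block would mirror it with `PixelShuffle`. Channel sizes are illustrative, and the patented blocks may be more elaborate.

```python
import torch.nn as nn

class PackingBlock(nn.Module):
    def __init__(self, in_ch, out_ch, r=2):
        super().__init__()
        # (C, H, W) -> (C*r*r, H/r, W/r): fold space into channels.
        self.pack = nn.PixelUnshuffle(r)
        self.conv = nn.Conv2d(in_ch * r * r, out_ch, 3, padding=1)

    def forward(self, x):
        return self.conv(self.pack(x))
```
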
  • Patent number: 11144818
    Abstract: Systems, methods, and other embodiments described herein relate to estimating ego-motion. In one embodiment, a method for estimating ego-motion based on a plurality of input images in a self-supervised system includes receiving a source image and a target image, determining a depth estimation Dt based on the target image, determining a depth estimation Ds based on the source image, and determining an ego-motion estimation in a form of a six degrees-of-freedom (6 DOF) transformation between the target image and the source image by inputting the depth estimations (Dt, Ds), the target image, and the source image into a two-stream network architecture trained to output the 6 DOF transformation based at least in part on the depth estimations (Dt, Ds), the target image, and the source image.
    Type: Grant
    Filed: October 16, 2019
    Date of Patent: October 12, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Jie Li, Adrien David Gaidon
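
A minimal PyTorch sketch of the two-stream idea: one stream sees appearance (the image pair), the other sees structure (both depth estimations), and their pooled features are merged to regress the 6 DOF transformation. All architectural details here are illustrative, not from the patent.

```python
import torch
import torch.nn as nn

class TwoStreamEgoMotion(nn.Module):
    def __init__(self):
        super().__init__()
        def stream(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.rgb_stream = stream(6)    # target + source images, stacked
        self.depth_stream = stream(2)  # Dt + Ds depth estimations, stacked
        self.head = nn.Linear(128, 6)  # 3 translation + 3 rotation params

    def forward(self, target, source, dt, ds):
        a = self.rgb_stream(torch.cat([target, source], dim=1))
        b = self.depth_stream(torch.cat([dt, ds], dim=1))
        return self.head(torch.cat([a, b], dim=1))  # 6 DOF transformation
```
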