Patents by Inventor Adrien David GAIDON

Adrien David GAIDON has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210398301
    Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes training a monocular depth model and a monocular pose model to learn monocular depth estimation and monocular pose estimation based on a target image and context images from monocular video captured by the camera agnostic network. The method also includes lifting 3D points from image pixels of the target image according to the context images. The method further includes projecting the lifted 3D points onto an image plane according to a predicted ray vector based on the monocular depth model, the monocular pose model, and a camera center of the camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
    Type: Application
    Filed: June 17, 2020
    Publication date: December 23, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON, Rares A. AMBRUS, Igor VASILJEVIC
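The lifting and reprojection steps described in this abstract can be sketched in the standard pinhole special case (the application's camera-agnostic ray-surface model generalizes this; the intrinsics matrix `K` and function names below are illustrative, not from the filing):

```python
import numpy as np

def unproject(depth, K):
    """Lift each pixel of a depth map to a 3D point (pinhole camera model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N homogeneous pixels
    rays = np.linalg.inv(K) @ pix                                      # per-pixel ray vectors
    return (rays * depth.reshape(1, -1)).T                             # N x 3 points

def project(points, K):
    """Project 3D points back onto the image plane."""
    proj = (K @ points.T).T
    return proj[:, :2] / proj[:, 2:3]

# Round trip: lift a constant-depth map to 3D, then reproject to pixels.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
depth = np.full((4, 4), 2.0)
pts = unproject(depth, K)
uv = project(pts, K)
```

A learned ray surface replaces the fixed `np.linalg.inv(K) @ pix` rays with predicted per-pixel ray vectors, which is what makes the network camera-agnostic.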
  • Publication number: 20210387649
    Abstract: A representation of a spatial structure of objects in an image can be determined. A mode of a neural network can be set, in response to a receipt of the image and a receipt of a facing direction of a camera that produced the image. The mode can account for the facing direction. The facing direction can include one or more of a first facing direction of a first camera disposed on a vehicle or a second facing direction of a second camera disposed on the vehicle. The neural network can be executed, in response to the mode having been set, to determine the representation of the spatial structure of the objects in the image. The representation of the spatial structure of the objects in the image can be transmitted to an automotive navigation system to determine a distance between the vehicle and a specific object in the image.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Sudeep Pillai, Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon
  • Publication number: 20210387648
    Abstract: Information that identifies a location can be received. In response to a receipt of the information that identifies the location, a file can be retrieved. The file can be for the location. The file can include image data and a set of node data. The set of node data can include information that identifies nodes in a neural network, information that identifies inputs of the nodes, and values of weights to be applied to the inputs. In response to a retrieval of the file, the weights can be applied to the inputs of the nodes and the image data can be received for the neural network. In response to an application of the weights and a receipt of the image data, the neural network can be executed to produce a digital map for the location. The digital map for the location can be transmitted to an automotive navigation system.
    Type: Application
    Filed: June 10, 2020
    Publication date: December 16, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
  • Publication number: 20210390714
    Abstract: A two dimensional image can be received. A depth map can be produced, via a first neural network, from the two dimensional image. A bird's eye view image can be produced, via a second neural network, from the depth map. The second neural network can implement a machine learning algorithm that preserves spatial gradient information associated with one or more objects included in the depth map and causes a position of a pixel in an object, included in the bird's eye view image, to be represented by a differentiable function. Three dimensional objects can be detected, via a third neural network, from the two dimensional image, the bird's eye view image, and the spatial gradient information. A combination of the first neural network, the second neural network, and the third neural network can be end-to-end trainable and can be included in a perception system.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
  • Publication number: 20210390718
    Abstract: A method for estimating depth is presented. The method includes generating, at each decoding layer of a neural network, decoded features of an input image. The method also includes upsampling, at each decoding layer, the decoded features to a resolution of a final output of the neural network. The method still further includes concatenating, at each decoding layer, the upsampled decoded features with features generated at a convolution layer of the neural network. The method additionally includes sequentially receiving the concatenated upsampled decoded features at a long-short term memory (LSTM) module of the neural network from each decoding layer. The method still further includes generating, at the LSTM module, a depth estimate of the input image after receiving the concatenated upsampled inverse depth estimate from a final layer of a decoder of the neural network. The method also includes controlling an action of an agent based on the depth estimate.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Adrien David GAIDON
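The upsample-and-concatenate step from this abstract can be sketched as follows, with nearest-neighbor upsampling standing in for the network's learned upsampling and the LSTM stage omitted (shapes are illustrative):

```python
import numpy as np

def upsample_nn(feat, out_h, out_w):
    """Nearest-neighbor upsampling of a (C, H, W) feature map to (C, out_h, out_w)."""
    c, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return feat[:, rows][:, :, cols]

# Decoder features at three resolutions, each lifted to the 8x8 output
# resolution and concatenated along the channel axis before the (omitted)
# LSTM module consumes them sequentially.
decoded = [np.random.rand(4, 2, 2), np.random.rand(4, 4, 4), np.random.rand(4, 8, 8)]
stacked = np.concatenate([upsample_nn(f, 8, 8) for f in decoded], axis=0)
```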
  • Publication number: 20210389773
    Abstract: System, methods, and other embodiments described herein relate to determining driving behaviors for controlling a vehicle. In one embodiment, a method includes generating, using textual descriptions in combination with driving log snippets, a joint feature space that represents a coordinated mapping between the textual descriptions and the driving log snippets. The method includes training a policy network to generate identified behaviors from the driving behaviors according to a correspondence between an observed context that is mapped onto the joint feature space and the driving behaviors defined in the joint feature space resulting from at least the textual descriptions. The method includes providing a behavior cloning model including at least an encoder, the joint feature space, and the policy network to generate control behaviors from the driving behaviors defined in the joint feature space according to acquired observations of a surrounding environment of the vehicle.
    Type: Application
    Filed: June 10, 2020
    Publication date: December 16, 2021
    Inventor: Adrien David Gaidon
  • Publication number: 20210383553
    Abstract: A method includes generating a first warped image based on a pose and a depth estimated from a current image and a previous image in a sequence of images captured by a camera of an agent. The method also includes estimating a motion of a dynamic object between the previous image and the current image. The method further includes generating a second warped image from the first warped image based on the estimated motion. The method still further includes controlling an action of the agent based on the second warped image.

    Type: Application
    Filed: June 4, 2020
    Publication date: December 9, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor Guizilini, Adrien David Gaidon
  • Publication number: 20210383240
    Abstract: A neural architecture search system for generating a neural network includes one or more processors and a memory. The memory includes a generator module, a self-supervised training module, and an output module. The modules cause the one or more processors to generate a candidate neural network by a controller neural network, obtain training data, generate an output by the candidate neural network performing a specific task using the training data as an input, determine a loss value using a loss function that considers the output of the candidate neural network and at least a portion of the training data, adjust the one or more model weights of the controller neural network based on the loss value, and output the candidate neural network. The candidate neural network may be derived from the controller neural network and one or more model weights of the controller neural network.
    Type: Application
    Filed: June 9, 2020
    Publication date: December 9, 2021
    Inventors: Adrien David Gaidon, Jie Li, Vitor Guizilini
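The propose-evaluate-update loop of the search system above can be illustrated with a toy sketch, where random sampling of one hyperparameter (layer width) stands in for the controller network and a tiny random-feature regressor stands in for the candidate network; the task, data, and names are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy regression task standing in for the "specific task" in the abstract.
x = rng.normal(size=(64, 3))
y = x @ np.array([1.0, -2.0, 0.5])

def evaluate(width):
    """Fit a tiny random-feature model of the given width; return its task loss."""
    w_in = rng.normal(size=(3, width))
    feats = np.tanh(x @ w_in)
    w_out, *_ = np.linalg.lstsq(feats, y, rcond=None)
    return float(np.mean((feats @ w_out - y) ** 2))

# Search loop: propose a candidate, score it with the loss function,
# keep the best-scoring architecture.
best_loss, best_width = min((evaluate(w), w) for w in (2, 8, 32, 128))
```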
  • Publication number: 20210365697
    Abstract: A system and method generate feature space data that may be used for object detection. The system includes one or more processors and a memory. The memory may include one or more modules having instructions that, when executed by the one or more processors, cause the one or more processors to obtain a two-dimensional image of a scene, generate an output depth map based on the two-dimensional image of the scene, generate a pseudo-LIDAR point cloud based on the output depth map, generate a bird's eye view (BEV) feature space based on the pseudo-LIDAR point cloud, and modify the BEV feature space to generate an improved BEV feature space using a feature space neural network that was trained by using a training LIDAR feature space as a ground truth based on a LIDAR point cloud.
    Type: Application
    Filed: May 20, 2020
    Publication date: November 25, 2021
    Inventors: Victor Vaquero Gomez, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
  • Publication number: 20210365733
    Abstract: A method for image reconstruction and domain transfer through an invertible depth network is described. The method includes training a first invertible depth network model using a first image dataset corresponding to a first geographic region to estimate a first depth map. The method also includes retraining the first invertible depth network model using a second image dataset corresponding to a second geographic region to estimate a second depth map. The method further includes reconstructing, by the first invertible depth network model, a third image dataset based on the second depth map. The method also includes training a second invertible depth network model using the third image dataset corresponding to the first geographic region and the second geographic region to estimate a third depth map.
    Type: Application
    Filed: May 20, 2020
    Publication date: November 25, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Adrien David GAIDON
  • Publication number: 20210358296
    Abstract: Systems and methods determining velocity of an object associated with a three-dimensional (3D) scene may include: a LIDAR system generating two sets of 3D point cloud data of the scene from two consecutive point cloud sweeps; a pillar feature network encoding data of the point cloud data to extract two-dimensional (2D) bird's-eye-view embeddings for each of the point cloud data sets in the form of pseudo images, wherein the 2D bird's-eye-view embeddings for a first of the two point cloud data sets comprises pillar features for the first point cloud data set and the 2D bird's-eye-view embeddings for a second of the two point cloud data sets comprises pillar features for the second point cloud data set; and a feature pyramid network encoding the pillar features and performing a 2D optical flow estimation to estimate the velocity of the object.
    Type: Application
    Filed: May 18, 2020
    Publication date: November 18, 2021
    Inventors: Kuan-Hui LEE, Matthew T. Kliemann, Adrien David Gaidon
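The pillar-encoding step shared by this and the following application can be sketched as scattering points into ground-plane cells to form a pseudo-image; a single mean-height channel stands in for the learned pillar features, and the flow-based velocity estimation over two such pseudo-images is omitted:

```python
import numpy as np

def pillar_pseudo_image(points, grid=(8, 8), cell=1.0):
    """Group points into vertical pillars on the ground plane and build a
    one-channel pseudo-image holding the mean point height per pillar."""
    img = np.zeros(grid)
    counts = np.zeros(grid)
    ix = (points[:, 0] / cell).astype(int)
    iy = (points[:, 1] / cell).astype(int)
    ok = (ix >= 0) & (ix < grid[0]) & (iy >= 0) & (iy < grid[1])
    np.add.at(img, (ix[ok], iy[ok]), points[ok, 2])   # accumulate heights
    np.add.at(counts, (ix[ok], iy[ok]), 1)            # count points per pillar
    return np.divide(img, counts, out=np.zeros_like(img), where=counts > 0)

# One sweep encoded as a pseudo-image; a 2D optical-flow network (omitted)
# would compare two consecutive sweeps to estimate object velocity.
sweep = np.array([[0.5, 0.5, 1.0], [0.5, 0.6, 3.0], [3.2, 4.1, 2.0]])
pseudo = pillar_pseudo_image(sweep)
```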
  • Publication number: 20210358137
    Abstract: Systems and methods determining velocity of an object associated with a three-dimensional (3D) scene may include: a LIDAR system generating two sets of 3D point cloud data of the scene from two consecutive point cloud sweeps; a pillar feature network encoding data of the point cloud data to extract two-dimensional (2D) bird's-eye-view embeddings for each of the point cloud data sets in the form of pseudo images, wherein the 2D bird's-eye-view embeddings for a first of the two point cloud data sets comprises pillar features for the first point cloud data set and the 2D bird's-eye-view embeddings for a second of the two point cloud data sets comprises pillar features for the second point cloud data set; and a feature pyramid network encoding the pillar features and performing a 2D optical flow estimation to estimate the velocity of the object.
    Type: Application
    Filed: May 18, 2020
    Publication date: November 18, 2021
    Inventors: KUAN-HUI LEE, SUDEEP PILLAI, ADRIEN DAVID GAIDON
  • Patent number: 11176709
    Abstract: System, methods, and other embodiments described herein relate to self-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes processing a first image of a pair according to the depth model to generate a depth map. The method includes processing the first image and a second image of the pair according to a pose model to generate a transformation that defines a relationship between the pair. The pair of images are separate frames depicting a scene of a monocular video. The method includes generating a monocular loss and a pose loss, the pose loss including at least a velocity component that accounts for motion of a camera between the training images. The method includes updating the pose model according to the pose loss and the depth model according to the monocular loss to improve scale awareness of the depth model in producing depth estimates.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: November 16, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Sudeep Pillai, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
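The velocity component of the pose loss described above can be sketched as penalizing the gap between the norm of the predicted inter-frame translation and the distance implied by measured speed (a minimal sketch; the function name and values are illustrative):

```python
import numpy as np

def velocity_loss(translation, speed, dt):
    """Velocity term of the pose loss: mismatch between the norm of the
    predicted inter-frame translation and speed * elapsed time, which
    anchors the otherwise scale-ambiguous monocular estimate to metric scale."""
    return abs(np.linalg.norm(translation) - speed * dt)

# Predicted pose translation vs. a measured speed of 10 m/s over 0.1 s.
t_pred = np.array([0.9, 0.0, 0.1])
loss = velocity_loss(t_pred, speed=10.0, dt=0.1)
```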
  • Publication number: 20210350222
    Abstract: Systems and methods to improve machine learning by explicitly over-fitting environmental data obtained by an imaging system, such as a monocular camera, are disclosed. The system includes training self-supervised depth and pose networks on monocular visual data collected from a certain area over multiple passes. Pose and depth networks may be trained by extracting data from multiple images of a single environment or trajectory, allowing the system to overfit the image data.
    Type: Application
    Filed: May 5, 2020
    Publication date: November 11, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Rares A. AMBRUS, Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON
  • Publication number: 20210350616
    Abstract: A method is presented. The method includes estimating an ego-motion of an agent based on a current image from a sequence of images and at least one previous image from the sequence of images. Each image in the sequence of images may be a two-dimensional (2D) image. The method also includes estimating a depth of the current image based on the at least one previous image. The estimated depth accounts for a depth uncertainty measurement in the current image and the at least one previous image. The method further includes generating a three-dimensional (3D) reconstruction of the current image based on the estimated ego-motion and the estimated depth. The method still further includes controlling an action of the agent based on the three-dimensional reconstruction.
    Type: Application
    Filed: May 7, 2020
    Publication date: November 11, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Adrien David GAIDON
  • Publication number: 20210334976
    Abstract: Systems and methods for panoptic segmentation of an image of a scene, comprising: receiving a synthetic data set as a simulation data set in a simulation domain, the simulation data set comprising a plurality of synthetic data objects; disentangling the synthetic data objects by class for a plurality of object classes; training each class of the plurality of classes separately by applying a Generative Adversarial Network (GAN) to each class from the data set in the simulation domain to create a generated instance for each class; combining the generated instances for each class with labels for the objects in each class to obtain a fake instance of an object; fusing the fake instances to create a fused image; and applying a GAN to the fused image and a corresponding real data set in a real-world domain to obtain an updated data set. The process can be repeated across multiple iterations.
    Type: Application
    Filed: April 24, 2020
    Publication date: October 28, 2021
    Inventors: KUAN-HUI LEE, JIE LI, ADRIEN DAVID GAIDON
  • Publication number: 20210326676
    Abstract: One or more embodiments of the present disclosure include systems and methods that use neural architecture fusion to learn how to combine multiple separate pre-trained networks by fusing their architectures into a single network for better computational efficiency and higher accuracy. For example, a computer implemented method of the disclosure includes obtaining multiple trained networks. Each of the trained networks may be associated with a respective task and has a respective architecture. The method further includes generating a directed acyclic graph that represents at least a partial union of the architectures of the trained networks. The method additionally includes defining a joint objective for the directed acyclic graph that combines a performance term and a distillation term. The method also includes optimizing the joint objective over the directed acyclic graph.
    Type: Application
    Filed: April 20, 2020
    Publication date: October 21, 2021
    Inventors: ADRIEN DAVID GAIDON, JIE LI
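The "partial union" of architectures described above can be illustrated by representing each pre-trained network as a set of directed edges and merging them into one DAG (the layer names are invented for illustration; the joint performance-plus-distillation objective is omitted):

```python
# Two network architectures as DAG edge sets; shared layers appear once
# in the fused graph, so computation up to the branch point is reused.
net_a = {("input", "conv1"), ("conv1", "conv2"), ("conv2", "head_a")}
net_b = {("input", "conv1"), ("conv1", "conv3"), ("conv3", "head_b")}

fused = net_a | net_b                               # union of the edge sets
nodes = {n for edge in fused for n in edge}         # all layers in the fused DAG
```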
  • Publication number: 20210326601
    Abstract: A method for keypoint matching includes determining a first set of keypoints corresponding to a current environment of the agent. The method further includes determining a second set of keypoints from a pre-built map of the current environment. The method still further includes identifying matching pairs of keypoints from the first set of keypoints and the second set of keypoints based on geometrical similarities between respective keypoints of the first set of keypoints and the second set of keypoints. The method also includes determining a current location of the agent based on the identified matching pairs of keypoints. The method further includes controlling an action of the agent based on the current location.
    Type: Application
    Filed: April 15, 2021
    Publication date: October 21, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Jiexiong TANG, Rares Andrei AMBRUS, Jie LI, Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON
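The pair-identification step above can be sketched with mutual-nearest-neighbor matching of keypoint descriptors (a common baseline, used here purely as an illustration; the descriptors and function name are invented):

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Match keypoint descriptors that are each other's nearest neighbor in L2 distance."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    a2b = d.argmin(axis=1)   # best map keypoint for each query keypoint
    b2a = d.argmin(axis=0)   # best query keypoint for each map keypoint
    return [(i, j) for i, j in enumerate(a2b) if b2a[j] == i]

query = np.array([[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])
mapped = np.array([[1.0, 0.1], [0.1, 1.0]])  # pre-built map keypoints, perturbed
pairs = mutual_nn_matches(query, mapped)
```

The mutual check discards one-sided matches (here the ambiguous third query keypoint), which is what makes the surviving pairs reliable enough for localization.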
  • Publication number: 20210318140
    Abstract: A method for localization performed by an agent includes receiving a query image of a current environment of the agent captured by a sensor integrated with the agent. The method also includes receiving a target image comprising a first set of keypoints matching a second set of keypoints of the query image. The first set of keypoints may be generated based on a task specified for the agent. The method still further includes determining a current location based on the target image.
    Type: Application
    Filed: April 14, 2021
    Publication date: October 14, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Jiexiong TANG, Rares Andrei AMBRUS, Hanme KIM, Vitor GUIZILINI, Adrien David GAIDON, Xipeng WANG, Jeff WALLS, SR., Sudeep PILLAI
  • Publication number: 20210319236
    Abstract: A method for keypoint matching includes receiving an input image obtained by a sensor of an agent. The method also includes identifying a set of keypoints of the received image. The method further includes augmenting a descriptor of each of the keypoints with semantic information of the input image. The method also includes identifying a target image based on one or more semantically augmented descriptors of the target image matching one or more semantically augmented descriptors of the input image. The method further includes controlling an action of the agent in response to identifying the target image.
    Type: Application
    Filed: April 14, 2021
    Publication date: October 14, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Jiexiong TANG, Rares Andrei AMBRUS, Vitor GUIZILINI, Adrien David GAIDON
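The descriptor-augmentation step above can be sketched by appending a one-hot semantic label to each keypoint descriptor, so that descriptors from different classes cannot match (the class list and function name are illustrative assumptions):

```python
import numpy as np

CLASSES = ["road", "building", "sign"]  # illustrative semantic label set

def augment(descriptor, label):
    """Append a one-hot semantic label to a keypoint descriptor so that
    matching can reject pairs drawn from different semantic classes."""
    onehot = np.zeros(len(CLASSES))
    onehot[CLASSES.index(label)] = 1.0
    return np.concatenate([descriptor, onehot])

a = augment(np.array([0.2, 0.8]), "sign")
b = augment(np.array([0.2, 0.8]), "road")
same_class = bool(a[2:] @ b[2:])  # zero dot product -> different classes
```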