Patents by Inventor Sudeep Pillai

Sudeep Pillai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220148206
    Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes projecting lifted 3D points onto an image plane according to a predicted ray vector based on a monocular depth model, a monocular pose model, and a camera center of a camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
    Type: Application
    Filed: January 21, 2022
    Publication date: May 12, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON, Rares A. AMBRUS, Igor VASILJEVIC
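To make the lift-and-project mechanics in the abstract above concrete, here is a minimal NumPy sketch. The per-pixel ray surface, the single camera center, and the hard nearest-ray association in `project` are illustrative assumptions, not the patent's actual formulation; a trainable system would need a soft, differentiable ray association.

```python
import numpy as np

def lift(depth, rays, center):
    """Lift pixels to 3D: P(u, v) = center + depth(u, v) * ray(u, v).

    depth:  (H, W) predicted depth map
    rays:   (H, W, 3) unit ray surface predicted per pixel
    center: (3,) camera center
    """
    return center + depth[..., None] * rays

def project(points, rays, center):
    """Project 3D points back to pixels by picking the best-aligned ray.

    A hard argmax is used here for clarity only.
    """
    H, W, _ = rays.shape
    dirs = points - center                        # (N, 3)
    dirs = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)
    scores = rays.reshape(-1, 3) @ dirs.T         # cosine per (ray, point)
    idx = scores.argmax(axis=0)                   # best ray per point
    return np.stack(np.unravel_index(idx, (H, W)), axis=-1)
```

Predicting the warped target image then amounts to lifting target pixels with the predicted depth, transforming them with the predicted pose, and sampling the context image at the projected locations.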
  • Publication number: 20220114375
    Abstract: Systems and methods are provided for developing and updating training datasets for traffic light detection/perception models. Vehicle-to-infrastructure (V2I)-based information may indicate a particular traffic light state or state transition. This information can be compared to a traffic light perception prediction. When the prediction is inconsistent with the V2I-based information, data regarding the relevant conditions and traffic light(s) can be saved and uploaded to a training database to update or refine the training dataset(s) maintained therein. In this way, an existing traffic light perception model can be updated and improved, and/or a better traffic light perception model can be developed.
    Type: Application
    Filed: October 8, 2020
    Publication date: April 14, 2022
    Inventors: Kun-Hsin Chen, Peiyan Gong, Shunsho Kaku, Sudeep Pillai, Hai Jin, Sarah Yoo, David L. Garber, Ryan W. Wolcott
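A small sketch of the consistency check this abstract describes, assuming the V2I signal can be treated as ground truth; the queue, field names, and state strings are illustrative stand-ins, not the patent's interfaces:

```python
from dataclasses import dataclass, field

@dataclass
class TrainingUploadQueue:
    samples: list = field(default_factory=list)

    def add(self, image, v2i_state, predicted_state):
        # Bank the raw image with the trusted V2I state as its label.
        self.samples.append({"image": image,
                             "label": v2i_state,
                             "rejected_prediction": predicted_state})

def check_consistency(image, v2i_state, model, queue):
    """Compare the perception model's output with the V2I signal;
    on disagreement, save the sample for dataset refinement."""
    predicted = model(image)          # e.g. "RED" / "YELLOW" / "GREEN"
    if predicted != v2i_state:
        queue.add(image, v2i_state, predicted)
    return predicted
```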
  • Publication number: 20220101045
    Abstract: A method for traffic light auto-labeling includes aggregating vehicle-to-infrastructure (V2I) traffic light signals at an intersection to determine transition states of each driving lane at the intersection during operation of an ego vehicle. The method also includes automatically labeling image training data to form auto-labeled image training data for a traffic light recognition model within the ego vehicle according to the determined transition states of each driving lane at the intersection. The method further includes planning a trajectory of the ego vehicle to comply with a right-of-way according to the determined transition states of each driving lane at the intersection according to a trained traffic light detection model. A federated learning module may train the traffic light recognition model using the auto-labeled image training data during the operation of the ego vehicle.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Kun-Hsin CHEN, Sudeep PILLAI, Shunsho KAKU, Hai JIN, Peiyan GONG, Wolfram BURGARD
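As a toy illustration of the auto-labeling step, assuming V2I-derived transition states indexed by driving lane and capture time (the data layout is hypothetical):

```python
def auto_label(frames, transition_states):
    """Pair each captured image with the V2I-derived state of the signal
    governing its lane at capture time, yielding labeled training data
    with no manual annotation.

    frames: iterable of (image, lane_id, timestamp)
    transition_states: {lane_id: {timestamp: "RED" | "YELLOW" | "GREEN"}}
    """
    return [(image, transition_states[lane][t])
            for image, lane, t in frames
            if t in transition_states.get(lane, {})]
```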
  • Publication number: 20220084232
    Abstract: Systems and methods for map construction using a video sequence captured on a camera of a vehicle in an environment, comprising: receiving a video sequence from the camera, the video sequence including a plurality of image frames capturing a scene of the environment of the vehicle; using a neural camera model to predict a depth map and a ray surface for the plurality of image frames in the received video sequence; and constructing a map of the scene of the environment based on image data captured in the plurality of frames and depth information in the predicted depth maps.
    Type: Application
    Filed: September 15, 2020
    Publication date: March 17, 2022
    Inventors: VITOR GUIZILINI, IGOR VASILJEVIC, RARES A. AMBRUS, SUDEEP PILLAI, ADRIEN GAIDON
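A compact sketch of the map-construction loop, assuming each frame arrives with its predicted depth, ray surface, posed camera center, and RGB data already in a common world frame (all variable names are illustrative):

```python
import numpy as np

def build_map(frames):
    """Accumulate frames into one colored point-cloud map of the scene.

    frames: iterable of (depth (H, W), rays (H, W, 3), center (3,),
            rgb (H, W, 3)) tuples.
    """
    points, colors = [], []
    for depth, rays, center, rgb in frames:
        pts = center + depth[..., None] * rays   # unproject via ray surface
        points.append(pts.reshape(-1, 3))
        colors.append(rgb.reshape(-1, 3))
    return np.concatenate(points), np.concatenate(colors)
```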
  • Publication number: 20220084231
    Abstract: Systems and methods for self-supervised learning for visual odometry using camera images captured on a camera may include: using a keypoint network to learn a keypoint matrix for a target image and a context image captured by the camera; using the learned descriptors to estimate correspondences between the target image and the context image; based on the keypoint correspondences, lifting a set of 2D keypoints to 3D, using a learned neural camera model; estimating a transformation between the target image and the context image using 3D-2D keypoint correspondences; and projecting the 3D keypoints into the context image using the learned neural camera model.
    Type: Application
    Filed: September 15, 2020
    Publication date: March 17, 2022
    Inventors: Vitor Guizilini, Igor Vasiljevic, Rares A. Ambrus, Sudeep Pillai, Adrien Gaidon
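For the transformation-estimation step, here is a minimal rigid-alignment sketch. One hedge: the abstract's pipeline solves from 3D-2D correspondences (a PnP-style problem); the 3D-3D Kabsch solver below is a simpler stand-in that shows the shape of the computation:

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src + t ~ dst.

    src, dst: (N, 3) matched 3D keypoints (Kabsch algorithm).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, mu_d - R @ mu_s
```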
  • Patent number: 11257231
    Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes training a monocular depth model and a monocular pose model to learn monocular depth estimation and monocular pose estimation based on a target image and context images from monocular video captured by the camera agnostic network. The method also includes lifting 3D points from image pixels of the target image according to the context images. The method further includes projecting the lifted 3D points onto an image plane according to a predicted ray vector based on the monocular depth model, the monocular pose model, and a camera center of the camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: February 22, 2022
    Assignee: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor Guizilini, Sudeep Pillai, Adrien David Gaidon, Rares A. Ambrus, Igor Vasiljevic
  • Patent number: 11256986
    Abstract: Systems and methods for training a neural keypoint detection network are disclosed herein. One embodiment extracts a portion of an input image; applies a transformation to the portion of the input image to produce a transformed portion of the input image; processes the portion of the input image and the transformed portion of the input image using the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image; and processes the one or more candidate keypoint pairs using an inlier-outlier neural network, the inlier-outlier neural network producing an indirect supervisory signal to train the neural keypoint detection network to identify one or more candidate keypoint pairs between the portion of the input image and the transformed portion of the input image.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: February 22, 2022
    Assignee: Toyota Research Institute, Inc.
    Inventors: Jiexiong Tang, Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim
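A runnable toy of the "indirect supervisory signal" idea above: the inlier-outlier network scores candidate pairs, and the loss on those scores is backpropagated through the pair coordinates to the upstream keypoint detector. The tiny MLP and the loss form are assumptions for illustration:

```python
import torch
import torch.nn as nn

class InlierOutlierNet(nn.Module):
    """Toy stand-in: scores each candidate pair's geometry
    (u1, v1, u2, v2) with an inlier probability."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(4, 32), nn.ReLU(),
                                 nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, pairs):                 # pairs: (N, 4)
        return self.mlp(pairs).squeeze(-1)    # (N,) inlier scores

def indirect_supervision_loss(pairs, io_net):
    """The inlier-outlier scores drive the keypoint network: gradients
    flow back through the pair coordinates, nudging the detector toward
    geometrically consistent keypoints."""
    return -torch.log(io_net(pairs) + 1e-6).mean()

# Toy usage: candidate pairs as produced by a detector (requires_grad so
# the detector upstream would receive gradients).
pairs = torch.rand(128, 4, requires_grad=True)
loss = indirect_supervision_loss(pairs, InlierOutlierNet())
loss.backward()
```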
  • Publication number: 20220026918
    Abstract: A method for controlling an ego agent includes capturing a two-dimensional (2D) image of an environment adjacent to the ego agent. The method also includes generating a semantically segmented image of the environment based on the 2D image. The method further includes generating a depth map of the environment based on the semantically segmented image. The method additionally includes generating a three-dimensional (3D) estimate of the environment based on the depth map, and identifying a location within the environment from that 3D estimate. The method also includes controlling an action of the ego agent based on the identified location.
    Type: Application
    Filed: July 23, 2020
    Publication date: January 27, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Jie LI, Rares A. AMBRUS, Sudeep PILLAI, Adrien GAIDON
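The abstract describes a straight pipeline; a sketch follows, with the learned models injected as callables. The pinhole back-projection and the intrinsics values are assumptions (the patent does not commit to a camera model here):

```python
import numpy as np

def unproject(depth, fx, fy, cx, cy):
    """Back-project a depth map into a 3D point grid with a pinhole
    model (an illustrative assumption)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)

def control_pipeline(image, segment, estimate_depth, plan):
    """image -> semantic segmentation -> depth -> 3D estimate -> action.
    segment / estimate_depth / plan are injected callables standing in
    for the learned models and the controller."""
    sem = segment(image)
    depth = estimate_depth(sem)
    cloud = unproject(depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    return plan(cloud)
```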
  • Publication number: 20220005217
    Abstract: A method for estimating depth of a scene includes selecting an image of the scene from a sequence of images of the scene captured via an in-vehicle sensor of a first agent. The method also includes identifying previously captured images of the scene. The method further includes selecting a set of images from the previously captured images based on each image of the set of images satisfying depth criteria. The method still further includes estimating the depth of the scene based on the selected image and the selected set of images.
    Type: Application
    Filed: July 6, 2021
    Publication date: January 6, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Jiexiong TANG, Rares Andrei AMBRUS, Sudeep PILLAI, Vitor GUIZILINI, Adrien David GAIDON
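A sketch of the frame-selection step, assuming the unspecified "depth criteria" amount to a usable stereo baseline between camera poses; the thresholds are purely an illustrative stand-in:

```python
import numpy as np

def select_context_frames(candidates, target_pose, k=5,
                          min_baseline=0.5, max_baseline=5.0):
    """Pick previously captured frames useful for multi-view depth:
    enough translation for parallax, not so much that overlap is lost.

    candidates: iterable of (4x4 pose, frame); target_pose: 4x4 pose.
    """
    scored = []
    for pose, frame in candidates:
        baseline = np.linalg.norm(pose[:3, 3] - target_pose[:3, 3])
        if min_baseline <= baseline <= max_baseline:
            scored.append((baseline, frame))
    scored.sort(key=lambda s: s[0])
    return [frame for _, frame in scored[:k]]
```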
  • Publication number: 20210398301
    Abstract: A method for monocular depth/pose estimation in a camera agnostic network is described. The method includes training a monocular depth model and a monocular pose model to learn monocular depth estimation and monocular pose estimation based on a target image and context images from monocular video captured by the camera agnostic network. The method also includes lifting 3D points from image pixels of the target image according to the context images. The method further includes projecting the lifted 3D points onto an image plane according to a predicted ray vector based on the monocular depth model, the monocular pose model, and a camera center of the camera agnostic network. The method also includes predicting a warped target image from a predicted depth map of the monocular depth model, a ray surface of the predicted ray vector, and a projection of the lifted 3D points according to the camera agnostic network.
    Type: Application
    Filed: June 17, 2020
    Publication date: December 23, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON, Rares A. AMBRUS, Igor VASILJEVIC
  • Publication number: 20210387649
    Abstract: A representation of a spatial structure of objects in an image can be determined. A mode of a neural network can be set, in response to a receipt of the image and a receipt of a facing direction of a camera that produced the image. The mode can account for the facing direction. The facing direction can include one or more of a first facing direction of a first camera disposed on a vehicle or a second facing direction of a second camera disposed on the vehicle. The neural network can be executed, in response to the mode having been set, to determine the representation of the spatial structure of the objects in the image. The representation of the spatial structure of the objects in the image can be transmitted to an automotive navigation system to determine a distance between the vehicle and a specific object in the image.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Sudeep Pillai, Vitor Guizilini, Rares A. Ambrus, Adrien David Gaidon
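One plausible, and here purely hypothetical, reading of "setting a mode that accounts for the facing direction" is a shared backbone with per-direction heads, as in this toy PyTorch stand-in:

```python
import torch
import torch.nn as nn

class DirectionConditionedNet(nn.Module):
    """Toy stand-in: one backbone, per-direction heads; 'setting the
    mode' selects the head matching the camera's facing direction."""
    def __init__(self, directions=("front", "rear", "left", "right")):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, 3, padding=1)
        self.heads = nn.ModuleDict({d: nn.Conv2d(8, 1, 1)
                                    for d in directions})
        self.mode = "front"

    def set_mode(self, facing_direction):
        self.mode = facing_direction

    def forward(self, image):
        return self.heads[self.mode](self.backbone(image).relu())

# Toy usage: switch to the rear camera's mode before inference.
net = DirectionConditionedNet()
net.set_mode("rear")
print(net(torch.rand(1, 3, 64, 64)).shape)   # torch.Size([1, 1, 64, 64])
```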
  • Publication number: 20210390714
    Abstract: A two dimensional image can be received. A depth map can be produced, via a first neural network, from the two dimensional image. A bird's eye view image can be produced, via a second neural network, from the depth map. The second neural network can implement a machine learning algorithm that preserves spatial gradient information associated with one or more objects included in the depth map and causes a position of a pixel in an object, included in the bird's eye view image, to be represented by a differentiable function. Three dimensional objects can be detected, via a third neural network, from the two dimensional image, the bird's eye view image, and the spatial gradient information. A combination of the first neural network, the second neural network, and the third neural network can be end-to-end trainable and can be included in a perception system.
    Type: Application
    Filed: June 11, 2020
    Publication date: December 16, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
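The key property claimed is that a pixel's position in the bird's-eye-view image is a differentiable function. A standard way to get that property is bilinear splatting, sketched below in PyTorch; the grid size, cell size, and splatting choice are assumptions, not the patented formulation:

```python
import torch

def splat_to_bev(points, feats, grid=(64, 64), cell=0.5):
    """Differentiably splat per-point features into a BEV grid: each
    point spreads its feature over the 4 nearest cells with bilinear
    weights, so cell values are differentiable in point coordinates.

    points: (N, 2) x/z ground-plane coordinates; feats: (N, C).
    """
    H, W = grid
    gx = points[:, 0] / cell + W / 2 - 0.5
    gz = points[:, 1] / cell + H / 2 - 0.5
    x0, z0 = gx.floor(), gz.floor()
    bev = feats.new_zeros(H * W, feats.shape[1])
    for dx in (0, 1):
        for dz in (0, 1):
            xi, zi = x0 + dx, z0 + dz
            w = (1 - (gx - xi).abs()) * (1 - (gz - zi).abs())
            ok = (xi >= 0) & (xi < W) & (zi >= 0) & (zi < H)
            idx = (zi * W + xi).long().clamp(0, H * W - 1)
            bev.index_add_(0, idx[ok], feats[ok] * w[ok, None])
    return bev.view(H, W, -1)
```

Because each point's contribution is weighted by its fractional cell offset, gradients flow from BEV cells back to the point coordinates, which is what keeps the depth-to-BEV-to-detection stack end-to-end trainable.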
  • Publication number: 20210387648
    Abstract: Information that identifies a location can be received. In response to a receipt of the information that identifies the location, a file can be retrieved. The file can be for the location. The file can include image data and a set of node data. The set of node data can include information that identifies nodes in a neural network, information that identifies inputs of the nodes, and values of weights to be applied to the inputs. In response to a retrieval of the file, the weights can be applied to the inputs of the nodes and the image data can be received for the neural network. In response to an application of the weights and a receipt of the image data, the neural network can be executed to produce a digital map for the location. The digital map for the location can be transmitted to an automotive navigation system.
    Type: Application
    Filed: June 10, 2020
    Publication date: December 16, 2021
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
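A sketch of what such a location-keyed file and its retrieval might look like. The JSON layout and field names are hypothetical; the abstract only requires node ids, their inputs, and weight values alongside image data:

```python
import json
from dataclasses import dataclass

@dataclass
class LocationFile:
    """Per-location payload: imagery plus the node data needed to
    configure the map-producing network."""
    location_id: str
    image_data: str   # e.g. base64-encoded imagery
    node_data: list   # [{"node": ..., "inputs": [...], "weights": [...]}]

def retrieve(location_id, store):
    """Fetch the file for a location; return (weights-by-node, imagery)
    ready to be applied to the network before it is executed."""
    raw = store[location_id]          # e.g. a dict or key-value DB
    f = LocationFile(**json.loads(raw))
    weights = {n["node"]: dict(zip(n["inputs"], n["weights"]))
               for n in f.node_data}
    return weights, f.image_data
```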
  • Publication number: 20210358137
    Abstract: Systems and methods for determining the velocity of an object associated with a three-dimensional (3D) scene may include: a LIDAR system generating two sets of 3D point cloud data of the scene from two consecutive point cloud sweeps; a pillar feature network encoding data of the point cloud data to extract two-dimensional (2D) bird's-eye-view embeddings for each of the point cloud data sets in the form of pseudo images, wherein the 2D bird's-eye-view embeddings for a first of the two point cloud data sets comprise pillar features for the first point cloud data set and the 2D bird's-eye-view embeddings for a second of the two point cloud data sets comprise pillar features for the second point cloud data set; and a feature pyramid network encoding the pillar features and performing a 2D optical flow estimation to estimate the velocity of the object.
    Type: Application
    Filed: May 18, 2020
    Publication date: November 18, 2021
    Inventors: KUAN-HUI LEE, SUDEEP PILLAI, ADRIEN DAVID GAIDON
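A minimal NumPy sketch of the pillar pseudo-image encoding and the flow-to-velocity conversion; max-pooling per cell, the grid dimensions, and the sweep interval are illustrative choices:

```python
import numpy as np

def pillar_pseudo_image(points, feats, grid=(128, 128), cell=0.25):
    """Scatter per-point features into BEV pillars by max-pooling all
    points falling in each cell, yielding a pseudo image (C, H, W)."""
    H, W = grid
    img = np.zeros((feats.shape[1], H, W), dtype=np.float32)
    ix = (points[:, 0] / cell + W // 2).astype(int)
    iz = (points[:, 1] / cell + H // 2).astype(int)
    ok = (ix >= 0) & (ix < W) & (iz >= 0) & (iz < H)
    for x, z, f in zip(ix[ok], iz[ok], feats[ok]):
        img[:, z, x] = np.maximum(img[:, z, x], f)
    return img

def flow_to_velocity(flow_at_object, cell=0.25, dt=0.1):
    """Convert an estimated BEV optical-flow vector (cells per sweep) at
    an object's location into metric velocity; dt is the sweep interval."""
    return flow_at_object * cell / dt
```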
  • Patent number: 11176709
    Abstract: System, methods, and other embodiments described herein relate to self-supervised training of a depth model for monocular depth estimation. In one embodiment, a method includes processing a first image of a pair according to the depth model to generate a depth map. The method includes processing the first image and a second image of the pair according to a pose model to generate a transformation that defines a relationship between the pair. The pair of images are separate frames depicting a scene of a monocular video. The method includes generating a monocular loss and a pose loss, the pose loss including at least a velocity component that accounts for motion of a camera between the training images. The method includes updating the pose model according to the pose loss and the depth model according to the monocular loss to improve scale awareness of the depth model in producing depth estimates.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: November 16, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Sudeep Pillai, Rares A. Ambrus, Vitor Guizilini, Adrien David Gaidon
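The velocity component of the pose loss can be sketched in a few lines: the norm of the predicted inter-frame translation is pulled toward the distance the vehicle actually traveled, which is what injects metric scale. Variable names and the L1 form are assumptions:

```python
import torch

def velocity_loss(pred_translation, speed, dt):
    """Scale-awareness term: the predicted inter-frame translation norm
    should match the distance actually traveled (speed * dt, e.g. from
    wheel odometry)."""
    return (pred_translation.norm(dim=-1) - speed * dt).abs().mean()

# Toy usage: the pose net predicted t = (0.9, 0, 0) m while the car
# moved 1.0 m between frames -> loss 0.1 pulls the estimate to scale.
t = torch.tensor([[0.9, 0.0, 0.0]])
print(velocity_loss(t, speed=torch.tensor([10.0]), dt=0.1))
```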
  • Publication number: 20210350222
    Abstract: Systems and methods to improve machine learning by explicitly overfitting environmental data obtained by an imaging system, such as a monocular camera, are disclosed. The system includes training self-supervised depth and pose networks on monocular visual data collected from a certain area over multiple passes. Pose and depth networks may be trained by extracting data from multiple images of a single environment or trajectory, allowing the system to overfit the image data.
    Type: Application
    Filed: May 5, 2020
    Publication date: November 11, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Rares A. AMBRUS, Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON
  • Publication number: 20210326601
    Abstract: A method for keypoint matching includes determining a first set of keypoints corresponding to a current environment of the agent. The method further includes determining a second set of keypoints from a pre-built map of the current environment. The method still further includes identifying matching pairs of keypoints from the first set of keypoints and the second set of keypoints based on geometrical similarities between respective keypoints of the first set of keypoints and the second set of keypoints. The method also includes determining a current location of the agent based on the identified matching pairs of keypoints. The method further includes controlling an action of the agent based on the current location.
    Type: Application
    Filed: April 15, 2021
    Publication date: October 21, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Jiexiong TANG, Rares Andrei AMBRUS, Jie LI, Vitor GUIZILINI, Sudeep PILLAI, Adrien David GAIDON
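A toy version of the matching-and-locating steps, using mutual nearest-neighbor descriptor matching; averaging matched map positions stands in for the patent's actual location solve:

```python
import numpy as np

def match_keypoints(live_desc, map_desc, max_dist=0.7):
    """Mutual nearest-neighbor matches between live keypoint descriptors
    and the pre-built map's descriptors."""
    d = np.linalg.norm(live_desc[:, None] - map_desc[None], axis=-1)
    ab, ba = d.argmin(1), d.argmin(0)
    return [(i, j) for i, j in enumerate(ab)
            if ba[j] == i and d[i, j] < max_dist]

def locate(matches, map_positions):
    """Crude location estimate: average the map positions of matched
    keypoints (a real system would solve for a full pose instead)."""
    idx = [j for _, j in matches]
    return map_positions[idx].mean(axis=0)
```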
  • Publication number: 20210318140
    Abstract: A method for localization performed by an agent includes receiving a query image of a current environment of the agent captured by a sensor integrated with the agent. The method also includes receiving a target image comprising a first set of keypoints matching a second set of keypoints of the query image. The first set of keypoints may be generated based on a task specified for the agent. The method still further includes determining a current location based on the target image.
    Type: Application
    Filed: April 14, 2021
    Publication date: October 14, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Jiexiong TANG, Rares Andrei AMBRUS, Hanme KIM, Vitor GUIZILINI, Adrien David GAIDON, Xipeng WANG, Jeff WALLS, SR., Sudeep PILLAI
  • Patent number: 11145074
    Abstract: System, methods, and other embodiments described herein relate to generating depth estimates of an environment depicted in a monocular image. In one embodiment, a method includes, in response to receiving the monocular image, processing the monocular image according to a depth model to generate a depth map. Processing the monocular image includes encoding the monocular image according to encoding layers of the depth model, including iteratively encoding features of the monocular image to generate feature maps at successively refined representations using packing blocks within the encoding layers. Processing the monocular image further includes decoding the feature maps according to decoding layers of the depth model, including iteratively decoding the feature maps associated with separate ones of the packing blocks using unpacking blocks of the decoding layers to generate the depth map. The method includes providing the depth map as the depth estimates of objects represented in the monocular image.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: October 12, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sudeep Pillai, Adrien David Gaidon
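A simplified PyTorch rendering of the packing idea: fold space into channels so downsampling discards nothing, then learn to compress the folded channels. The actual encoder blocks are more elaborate; this is only the gist:

```python
import torch
import torch.nn as nn

class PackingBlock(nn.Module):
    """Downsample without discarding detail: fold each 2x2 spatial
    neighborhood into channels (space-to-depth), then compress the
    4x channels with a learned convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch * 4, out_ch, 3, padding=1)

    def forward(self, x):
        x = nn.functional.pixel_unshuffle(x, 2)   # (B, 4C, H/2, W/2)
        return self.conv(x)

# Toy usage: halve resolution of a 3-channel image into 64 feature maps.
print(PackingBlock(3, 64)(torch.rand(1, 3, 64, 64)).shape)
# -> torch.Size([1, 64, 32, 32])
```

An unpacking block on the decoder side mirrors this with depth-to-space (pixel shuffle) to recover resolution.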
  • Patent number: 11144818
    Abstract: System, methods, and other embodiments described herein relate to estimating ego-motion. In one embodiment, a method for estimating ego-motion based on a plurality of input images in a self-supervised system includes receiving a source image and a target image, determining a depth estimation Dt based on the target image, determining a depth estimation Ds based on the source image, and determining an ego-motion estimation in a form of a six degrees-of-freedom (6 DOF) transformation between the target image and the source image by inputting the depth estimations (Dt, Ds), the target image, and the source image into a two-stream network architecture trained to output the 6 DOF transformation based at least in part on the depth estimations (Dt, Ds), the target image, and the source image.
    Type: Grant
    Filed: October 16, 2019
    Date of Patent: October 12, 2021
    Assignee: Toyota Research Institute, Inc.
    Inventors: Rares A. Ambrus, Vitor Guizilini, Sudeep Pillai, Jie Li, Adrien David Gaidon
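A toy two-stream layout matching the abstract's inputs and output: the image pair in one stream, the depth pair (Dt, Ds) in the other, fused to regress the 6 DOF transform (3 translations + 3 rotations). All layer sizes are illustrative:

```python
import torch
import torch.nn as nn

class TwoStreamEgoMotion(nn.Module):
    """Toy two-stream architecture: an RGB stream for the image pair,
    a depth stream for (Dt, Ds), fused into a 6 DOF regression head."""
    def __init__(self):
        super().__init__()
        self.rgb = nn.Sequential(nn.Conv2d(6, 16, 7, stride=4), nn.ReLU())
        self.dep = nn.Sequential(nn.Conv2d(2, 16, 7, stride=4), nn.ReLU())
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(32, 6))

    def forward(self, img_t, img_s, d_t, d_s):
        a = self.rgb(torch.cat([img_t, img_s], dim=1))
        b = self.dep(torch.cat([d_t, d_s], dim=1))
        return self.head(torch.cat([a, b], dim=1))   # (B, 6) 6 DOF

# Toy usage:
net = TwoStreamEgoMotion()
x = torch.rand(1, 3, 128, 128)
d = torch.rand(1, 1, 128, 128)
print(net(x, x, d, d).shape)   # torch.Size([1, 6])
```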