Patents by Inventor Adrien GAIDON

Adrien GAIDON has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220026918
    Abstract: A method for controlling an ego agent includes capturing a two-dimensional (2D) image of an environment adjacent to the ego agent. The method also includes generating a semantically segmented image of the environment based on the 2D image. The method further includes generating a depth map of the environment based on the semantically segmented image. The method additionally includes generating a three-dimensional (3D) estimate of the environment based on the depth map. The method also includes controlling an action of the ego agent based on a location identified from the 3D estimate.
    Type: Application
    Filed: July 23, 2020
    Publication date: January 27, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Jie LI, Rares A. AMBRUS, Sudeep PILLAI, Adrien GAIDON
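The final lifting step described in the abstract above — turning a per-pixel depth map into a 3D estimate — can be sketched minimally with a pinhole camera model. This is an illustrative reconstruction, not the patented method; the function name, intrinsics (`fx`, `fy`, `cx`, `cy`), and values are all hypothetical.

```python
# Hypothetical sketch: unproject a depth map into 3D camera-frame points
# using pinhole intrinsics. Values are illustrative only.

def unproject(depth, fx, fy, cx, cy):
    """Lift a 2D depth map (list of rows of depths) into 3D points."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, z))
    return points

pts = unproject([[2.0, 2.0], [4.0, 4.0]], fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```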
  • Publication number: 20210398014
    Abstract: A method for controlling an ego agent includes periodically receiving policy information comprising a spatial environment observation and a current state of the ego agent. The method also includes selecting, for each received set of policy information, a low-level policy from a number of low-level policies. The low-level policy may be selected based on a high-level policy. The method further includes controlling an action of the ego agent based on the selected low-level policy.
    Type: Application
    Filed: August 25, 2020
    Publication date: December 23, 2021
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
    Inventors: Zhangjie CAO, Erdem BIYIK, Woodrow Zhouyuan WANG, Allan RAVENTOS, Adrien GAIDON, Guy ROSMAN, Dorsa SADIGH
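The hierarchical control idea in the abstract above — a high-level policy choosing among low-level policies, which then produce the action — can be sketched as follows. All policy names, thresholds, and state fields are invented for illustration; the patent does not specify them.

```python
# Hedged sketch of hierarchical policy selection. The "cautious"/"nominal"
# policies and the obstacle-distance rule are illustrative assumptions.

def high_level_policy(observation):
    """Select a low-level policy key from the spatial observation."""
    return "cautious" if observation["obstacle_distance"] < 5.0 else "nominal"

low_level_policies = {
    "cautious": lambda state: {"throttle": 0.1, "steer": state["heading_error"]},
    "nominal":  lambda state: {"throttle": 0.6, "steer": state["heading_error"]},
}

def control_step(observation, state):
    """One control cycle: high-level selection, then low-level action."""
    policy = low_level_policies[high_level_policy(observation)]
    return policy(state)

action = control_step({"obstacle_distance": 3.0}, {"heading_error": 0.2})
```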
  • Patent number: 11205082
    Abstract: A system and method for predicting pedestrian intent is provided. A prediction circuit comprising a plurality of gated recurrent units (GRUs) receives a sequence of images captured by a camera. The prediction circuit parses each frame of the sequence of images to identify one or more pedestrians and one or more objects. Using the parsed data, which comprises the identified pedestrians and objects, the prediction circuit generates a pedestrian-centric spatiotemporal graph. The prediction circuit uses the pedestrian-centric graph to determine, for each frame of the sequence of images, a probability of one or more pedestrians crossing a street.
    Type: Grant
    Filed: October 8, 2019
    Date of Patent: December 21, 2021
    Assignees: TOYOTA RESEARCH INSTITUTE, INC., The Board of Trustees of the Leland Stanford Junior University
    Inventors: Ehsan Adeli-Mosabbeb, Kuan Lee, Adrien Gaidon, Bingbin Liu, Zhangjie Cao, Juan Carlos Niebles
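The per-frame crossing probability described above comes from recurrent units over a pedestrian-centric graph. A toy recurrent update can illustrate the shape of the computation; the state-update rule and the "evidence" input are invented stand-ins, not the patented GRU-based circuit.

```python
import math

# Toy sketch of a per-frame recurrent update producing a crossing
# probability. The 0.5 decay and per-frame evidence are assumptions.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def crossing_probability(frames, h=0.0):
    """Return one crossing probability per frame from evidence scores."""
    probs = []
    for evidence in frames:          # e.g. proximity-to-curb per frame
        h = 0.5 * h + evidence       # simplistic recurrent state update
        probs.append(sigmoid(h))
    return probs

probs = crossing_probability([0.0, 2.0])
```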
  • Patent number: 11144761
    Abstract: A system for applying video data to a neural network (NN) for online multi-class multi-object tracking includes a computer programmed to perform an image classification method including the operations of receiving a video sequence; detecting candidate objects in each of a previous and a current video frame; transforming the previous and current video frames into a temporal difference input image; applying the temporal difference input image to a pre-trained neural network (NN) (or deep convolutional network) comprising an ordered sequence of layers; and based on a classification value received by the neural network, associating a pair of detected candidate objects in the previous and current frames as belonging to one of matching objects and different objects.
    Type: Grant
    Filed: April 4, 2016
    Date of Patent: October 12, 2021
    Assignee: XEROX CORPORATION
    Inventor: Adrien Gaidon
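The temporal-difference transformation in the abstract above is the one concrete preprocessing step: the network sees a per-pixel difference of consecutive frames rather than the raw frames. A minimal grayscale sketch (illustrative values, not the patented pipeline):

```python
# Hedged sketch: build the temporal difference input image from two
# consecutive grayscale frames (lists of pixel rows).

def temporal_difference(prev_frame, curr_frame):
    """Per-pixel difference image that would be fed to the classifier."""
    return [[c - p for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]

diff = temporal_difference([[10, 10], [10, 10]], [[10, 30], [10, 10]])
```

Non-zero entries mark pixels that changed between frames, which is the motion cue the classifier uses to decide whether two detections match.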
  • Patent number: 11074438
    Abstract: A method for predicting spatial positions of several key points on a human body in the near future in an egocentric setting is described. The method includes generating a frame-level supervision for human poses. The method also includes suppressing noise and filling missing joints of the human body using a pose completion module. The method further includes splitting the poses into a global stream and a local stream. Furthermore, the method includes combining the global stream and the local stream to forecast future human locomotion.
    Type: Grant
    Filed: October 1, 2019
    Date of Patent: July 27, 2021
    Assignees: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
    Inventors: Karttikeya Mangalam, Ehsan Adeli-Mosabbeb, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles Duque
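The global/local split described in the abstract above can be sketched as decomposing a pose into a reference trajectory and joint offsets relative to it, then recombining. The choice of the first joint as the reference (e.g., a hip) and all coordinates are illustrative assumptions.

```python
# Hedged sketch of splitting a 2D pose into a global stream (reference
# joint) and a local stream (offsets), then recombining them.

def split_pose(joints, ref_index=0):
    """Return (global reference joint, joints relative to it)."""
    ref = joints[ref_index]
    local = [(x - ref[0], y - ref[1]) for x, y in joints]
    return ref, local

def combine(ref, local):
    """Recombine the global and local streams into absolute joints."""
    return [(x + ref[0], y + ref[1]) for x, y in local]

ref, local = split_pose([(2.0, 3.0), (2.5, 1.0)])
restored = combine(ref, local)
```

In the patented method the two streams are forecast separately before being combined; here the split/combine round-trip simply shows the decomposition is lossless.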
  • Publication number: 20210134002
    Abstract: A method for monocular 3D object perception is described. The method includes sampling multiple, stochastic latent variables from a learned latent feature distribution of an RGB image for a 2D object detected in the RGB image. The method also includes lifting a 3D proposal for each stochastic latent variable sampled for the detected 2D object. The method further includes selecting a 3D proposal for the detected 2D object using a proposal selection algorithm to reduce 3D proposal lifting overlap. The method also includes planning a trajectory of an ego vehicle according to a 3D location and pose of the 2D object according to the selected 3D proposal.
    Type: Application
    Filed: January 24, 2020
    Publication date: May 6, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Yu YAO, Wadim KEHL, Adrien GAIDON
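The proposal-selection step in the abstract above reduces overlap among lifted 3D proposals. A greedy NMS-style rule over 1D footprints can illustrate the idea; the scoring, the interval footprints, and the zero-overlap criterion are simplifying assumptions, not the patent's algorithm.

```python
# Illustrative greedy selection: keep the highest-scoring proposal whose
# footprint does not overlap any already-selected proposal.

def overlap_1d(a, b):
    """Length of the overlap between intervals a=(lo,hi) and b=(lo,hi)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def select_proposals(proposals):
    """proposals: list of (score, (lo, hi)) footprint intervals."""
    chosen = []
    for score, span in sorted(proposals, reverse=True):
        if all(overlap_1d(span, s) == 0.0 for _, s in chosen):
            chosen.append((score, span))
    return chosen

picked = select_proposals([(0.9, (0.0, 2.0)),
                           (0.8, (1.0, 3.0)),   # overlaps the first; dropped
                           (0.7, (4.0, 5.0))])
```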
  • Publication number: 20210103742
    Abstract: A system and method for predicting pedestrian intent is provided. A prediction circuit comprising a plurality of gated recurrent units (GRUs) receives a sequence of images captured by a camera. The prediction circuit parses each frame of the sequence of images to identify one or more pedestrians and one or more objects. Using the parsed data, which comprises the identified pedestrians and objects, the prediction circuit generates a pedestrian-centric spatiotemporal graph. The prediction circuit uses the pedestrian-centric graph to determine, for each frame of the sequence of images, a probability of one or more pedestrians crossing a street.
    Type: Application
    Filed: October 8, 2019
    Publication date: April 8, 2021
    Inventors: Ehsan Adeli-Mosabbeb, Kuan Lee, Adrien Gaidon, Bingbin Liu, Zhangjie Cao, Juan Carlos Niebles
  • Publication number: 20210097266
    Abstract: A method for predicting spatial positions of several key points on a human body in the near future in an egocentric setting is described. The method includes generating a frame-level supervision for human poses. The method also includes suppressing noise and filling missing joints of the human body using a pose completion module. The method further includes splitting the poses into a global stream and a local stream. Furthermore, the method includes combining the global stream and the local stream to forecast future human locomotion.
    Type: Application
    Filed: October 1, 2019
    Publication date: April 1, 2021
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
    Inventors: Karttikeya MANGALAM, Ehsan ADELI-MOSABBEB, Kuan-Hui LEE, Adrien GAIDON, Juan Carlos NIEBLES DUQUE
  • Publication number: 20200234066
    Abstract: A method for performing vehicle taillight recognition is described. The method includes extracting spatial features from a sequence of images of a real-world traffic scene during operation of an ego vehicle. The method includes selectively focusing a convolutional neural network (CNN) of a CNN-long short-term memory (CNN-LSTM) framework on a selected region of the sequence of images according to a spatial attention model for a vehicle taillight recognition task. The method includes selecting, by an LSTM network of the CNN-LSTM framework, frames within the selected region of the sequence of images according to a temporal attention model for the vehicle taillight recognition task. The method includes inferring, according to the selected frames within the selected region of the sequence of images, an intent of an ado vehicle according to a taillight state. The method includes planning a trajectory of the ego vehicle from the intent inferred from the ado vehicle.
    Type: Application
    Filed: April 19, 2019
    Publication date: July 23, 2020
    Inventors: Kuan-Hui LEE, Takaaki TAGAWA, Jia-En M. PAN, Adrien GAIDON, Bertrand DOUILLARD
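The temporal attention model in the abstract above weights frames before pooling them. A softmax-weighted pooling step is the standard form of such a model and can be sketched as follows; the feature and score values are illustrative, and the patented framework applies this inside a CNN-LSTM rather than to scalars.

```python
import math

# Hedged sketch of softmax temporal attention: per-frame relevance scores
# become weights, and frame features are pooled by the weighted sum.

def temporal_attention(frame_feats, scores):
    """Pool per-frame features using softmax of per-frame scores."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * f for w, f in zip(weights, frame_feats))

pooled = temporal_attention([1.0, 3.0], [0.0, 0.0])  # equal scores
```

With equal scores the weights are uniform, so the pooled value is the plain mean; higher scores shift the pooled feature toward the corresponding frames.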
  • Patent number: 10019652
    Abstract: A system and method for assessing video performance analysis are provided. A computer graphics engine clones real-world data in a virtual world by decomposing the real-world data into visual components and objects in one or more object categories and populates the virtual world with virtual visual components and virtual objects. A scripting component controls the virtual visual components and the virtual objects in the virtual world based on the real-world data. A synthetic clone of the video sequence is generated based on the script controlling the virtual visual components and the virtual objects. The real-world data is compared with the synthetic clone of the video sequence, and a transferability of conclusions from the virtual world to the real world is assessed based on this comparison.
    Type: Grant
    Filed: February 23, 2016
    Date of Patent: July 10, 2018
    Assignee: Xerox Corporation
    Inventors: Qiao Wang, Adrien Gaidon, Eleonora Vig
  • Patent number: 9984315
    Abstract: Methods and systems for online domain adaptation for multi-object tracking. Video of an area of interest can be captured with an image-capturing unit. The video can be analyzed with a pre-trained object detector using online domain adaptation, which combines convex multi-task learning with an associated self-tuning stochastic optimization procedure to jointly adapt, online, all trackers associated with the pre-trained object detector as well as a pre-trained category-level model derived from those trackers, in order to efficiently track a plurality of objects in the captured video.
    Type: Grant
    Filed: May 5, 2015
    Date of Patent: May 29, 2018
    Assignee: Conduent Business Services, LLC
    Inventors: Adrien Gaidon, Eleonora Vig
  • Patent number: 9946933
    Abstract: A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
    Type: Grant
    Filed: August 18, 2016
    Date of Patent: April 17, 2018
    Assignee: XEROX CORPORATION
    Inventors: César Roberto De Souza, Adrien Gaidon, Eleonora Vig, Antonio M. Lopez
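The aggregation step in the abstract above pools local descriptors from the input video and its transformations into one feature vector. A minimal sketch with a horizontal flip as the transformation and mean-pooling as the aggregation; both choices, and all values, are illustrative stand-ins for the learned processing layers.

```python
# Hedged sketch: local descriptors from a video and one transformation
# (a horizontal flip) are mean-pooled into a single feature vector.

def flip(frame_descriptor):
    """Stand-in transformation: reverse the descriptor's components."""
    return list(reversed(frame_descriptor))

def mean_pool(descriptors):
    """Aggregate a list of equal-length descriptors by the element-wise mean."""
    n = len(descriptors)
    dims = len(descriptors[0])
    return [sum(d[i] for d in descriptors) / n for i in range(dims)]

video = [[1.0, 2.0], [3.0, 4.0]]           # per-frame local descriptors
transformed = [flip(f) for f in video]      # transformation of the video
aggregated = mean_pool(video + transformed)
```

In the patented method the aggregation layers are learned without supervision and a supervised classifier operates on the aggregated vector; the sketch only shows the pooling shape.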
  • Publication number: 20180053057
    Abstract: A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
    Type: Application
    Filed: August 18, 2016
    Publication date: February 22, 2018
    Applicant: Xerox Corporation
    Inventors: César Roberto De Souza, Adrien Gaidon, Eleonora Vig, Antonio M. Lopez
  • Patent number: 9792492
    Abstract: A method for extracting a representation from an image includes inputting an image to a pre-trained neural network. The gradient of a loss function is computed with respect to parameters of the neural network, for the image. A gradient representation is extracted for the image based on the computed gradients, which can be used, for example, for classification or retrieval.
    Type: Grant
    Filed: July 7, 2015
    Date of Patent: October 17, 2017
    Assignee: XEROX CORPORATION
    Inventors: Albert Gordo Soldevila, Adrien Gaidon, Florent C. Perronnin
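The gradient representation in the abstract above uses the gradient of a loss with respect to network parameters as an image descriptor. A toy one-parameter model with a squared loss shows the computation; the patented method uses a pre-trained neural network, so the model, loss, and values here are purely illustrative.

```python
# Hedged sketch: gradient of a squared loss wrt a single parameter w,
# used as a one-dimensional "descriptor" for the input x.

def gradient_representation(x, target, w):
    """d/dw (w*x - target)^2 = 2 * (w*x - target) * x"""
    pred = w * x
    return 2.0 * (pred - target) * x

g = gradient_representation(x=3.0, target=1.0, w=0.5)
```

For a real network the same idea yields one gradient entry per parameter, giving a high-dimensional vector usable for classification or retrieval.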
  • Publication number: 20170286774
    Abstract: A system for applying video data to a neural network (NN) for online multi-class multi-object tracking includes a computer programmed to perform an image classification method including the operations of receiving a video sequence; detecting candidate objects in each of a previous and a current video frame; transforming the previous and current video frames into a temporal difference input image; applying the temporal difference input image to a pre-trained neural network (NN) (or deep convolutional network) comprising an ordered sequence of layers; and based on a classification value received by the neural network, associating a pair of detected candidate objects in the previous and current frames as belonging to one of matching objects and different objects.
    Type: Application
    Filed: April 4, 2016
    Publication date: October 5, 2017
    Applicant: Xerox Corporation
    Inventor: Adrien Gaidon
  • Publication number: 20170243083
    Abstract: A system and method for assessing video performance analysis are provided. A computer graphics engine clones real-world data in a virtual world by decomposing the real-world data into visual components and objects in one or more object categories and populates the virtual world with virtual visual components and virtual objects. A scripting component controls the virtual visual components and the virtual objects in the virtual world based on the real-world data. A synthetic clone of the video sequence is generated based on the script controlling the virtual visual components and the virtual objects. The real-world data is compared with the synthetic clone of the video sequence, and a transferability of conclusions from the virtual world to the real world is assessed based on this comparison.
    Type: Application
    Filed: February 23, 2016
    Publication date: August 24, 2017
    Applicant: Xerox Corporation
    Inventors: Qiao Wang, Adrien Gaidon, Eleonora Vig
  • Patent number: 9697439
    Abstract: An object detection method includes for each of a set of patches of an image, encoding features of the patch with a non-linear mapping function, and computing per-patch statistics based on the encoded features for approximating a window-level non-linear operation by a patch-level operation. Then, windows are extracted from the image, each window comprising a sub-set of the set of patches. Each of the windows is scored based on the computed patch statistics of the respective sub-set of patches. Objects, if any, can then be detected in the image, based on the window scores. The method and system allow the non-linear operations to be performed only at the patch level, reducing the computation time of the method, since there are generally many more windows than patches, while not impacting performance unduly, as compared to a system which performs non-linear operations at the window level.
    Type: Grant
    Filed: October 2, 2014
    Date of Patent: July 4, 2017
    Assignee: XEROX CORPORATION
    Inventors: Adrien Gaidon, Diane Larlus-Larrondo, Florent C. Perronnin
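The efficiency argument in the abstract above — do the non-linear work once per patch, then score the many windows with a cheap operation over precomputed patch statistics — can be sketched as follows. The squaring non-linearity and the sum-based window score are illustrative stand-ins for the patented encoding and scoring.

```python
# Hedged sketch: non-linear encoding applied once per patch, then each
# window scored by a cheap linear sum over its patches' statistics.

def encode_patch(feature):
    """Stand-in for the non-linear mapping applied at the patch level."""
    return feature * feature

def score_window(patch_ids, patch_stats):
    """Window-level score as a linear combination of patch statistics."""
    return sum(patch_stats[i] for i in patch_ids)

patch_stats = [encode_patch(f) for f in [1.0, 2.0, 3.0]]  # computed once
window_score = score_window([0, 2], patch_stats)           # reused per window
```

Because windows greatly outnumber patches, paying the non-linear cost only in `encode_patch` and reusing `patch_stats` across windows is what saves computation.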
  • Publication number: 20170011280
    Abstract: A method for extracting a representation from an image includes inputting an image to a pre-trained neural network. The gradient of a loss function is computed with respect to parameters of the neural network, for the image. A gradient representation is extracted for the image based on the computed gradients, which can be used, for example, for classification or retrieval.
    Type: Application
    Filed: July 7, 2015
    Publication date: January 12, 2017
    Applicant: Xerox Corporation
    Inventors: Albert Gordo Soldevila, Adrien Gaidon, Florent C. Perronnin
  • Publication number: 20160328613
    Abstract: Methods and systems for online domain adaptation for multi-object tracking. Video of an area of interest can be captured with an image-capturing unit. The video can be analyzed with a pre-trained object detector using online domain adaptation, which combines convex multi-task learning with an associated self-tuning stochastic optimization procedure to jointly adapt, online, all trackers associated with the pre-trained object detector as well as a pre-trained category-level model derived from those trackers, in order to efficiently track a plurality of objects in the captured video.
    Type: Application
    Filed: May 5, 2015
    Publication date: November 10, 2016
    Inventors: Adrien Gaidon, Eleonora Vig
  • Publication number: 20160314351
    Abstract: A graphical user interface (GUI) of a business process management (BPM) system is provided to construct a process model that is displayed on a graphical display device as a graphical representation comprising nodes representing process events, activities, or decision points and including computer vision (CV) nodes representing video stream processing, with flow connectors defining operational sequences of nodes and data flow between nodes of the process model. The process model is executed to perform a process represented by the process model including executing CV nodes of the process model by performing video stream processing represented by the CV nodes of the process model. The available CV nodes include a set of video pattern detection nodes, and a set of video pattern relation nodes defining a video grammar of relations between video patterns detectable by the video pattern detection nodes.
    Type: Application
    Filed: April 27, 2015
    Publication date: October 27, 2016
    Inventors: Adrian Corneliu Mos, Adrien Gaidon, Eleonora Vig