Patents by Inventor Adrien GAIDON
Adrien GAIDON has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220026918
Abstract: A method for controlling an ego agent includes capturing a two-dimensional (2D) image of an environment adjacent to the ego agent. The method also includes generating a semantically segmented image of the environment based on the 2D image. The method further includes generating a depth map of the environment based on the semantically segmented image. The method additionally includes generating a three-dimensional (3D) estimate of the environment based on the depth map. The method also includes controlling an action of the ego agent based on the 3D estimate.
Type: Application
Filed: July 23, 2020
Publication date: January 27, 2022
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventors: Vitor GUIZILINI, Jie LI, Rares A. AMBRUS, Sudeep PILLAI, Adrien GAIDON
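The staged pipeline this abstract describes (2D image → semantic segmentation → depth map → 3D estimate) can be sketched as below. The `semantic_segmentation` and `estimate_depth` functions are toy stand-ins for the patent's learned networks, and the pinhole back-projection is a standard technique assumed for the lifting step, not taken from the patent text:

```python
import numpy as np

def semantic_segmentation(image):
    """Stand-in for a learned segmentation network: one class label per pixel.
    Placeholder logic: threshold mean intensity into two coarse classes."""
    return (image.mean(axis=-1) > 0.5).astype(np.int32)

def estimate_depth(seg_map):
    """Stand-in for a depth network conditioned on the segmented image.
    Placeholder logic: a nominal depth per semantic class."""
    depth_per_class = {0: 20.0, 1: 5.0}   # e.g. background far, obstacle near
    return np.vectorize(depth_per_class.get)(seg_map).astype(np.float64)

def lift_to_3d(depth_map, fx=500.0, fy=500.0, cx=None, cy=None):
    """Back-project each pixel to a 3D point with a pinhole camera model."""
    h, w = depth_map.shape
    cx = w / 2 if cx is None else cx
    cy = h / 2 if cy is None else cy
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_map
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)   # (H, W, 3) point map

image = np.random.rand(48, 64, 3)    # toy RGB frame
seg = semantic_segmentation(image)   # (48, 64) per-pixel labels
depth = estimate_depth(seg)          # (48, 64) depth in metres
points = lift_to_3d(depth)           # (48, 64, 3) 3D estimate of the scene
```

The point is the data flow: each stage consumes only the previous stage's output, so the depth network sees semantics rather than raw pixels.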
-
Publication number: 20210398014
Abstract: A method for controlling an ego agent includes periodically receiving policy information comprising a spatial environment observation and a current state of the ego agent. The method also includes selecting, for each received policy information, a low-level policy from a number of low-level policies. The low-level policy may be selected based on a high-level policy. The method further includes controlling an action of the ego agent based on the selected low-level policy.
Type: Application
Filed: August 25, 2020
Publication date: December 23, 2021
Applicants: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
Inventors: Zhangjie CAO, Erdem BIYIK, Woodrow Zhouyuan WANG, Allan RAVENTOS, Adrien GAIDON, Guy ROSMAN, Dorsa SADIGH
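The hierarchical control structure in this abstract (a high-level policy that picks among low-level policies, which then produce the action) can be sketched as follows; both policies here are hypothetical hand-written rules standing in for the learned policies of the patent:

```python
# Low-level policies: each maps (observation, state) to an action.
def cautious_policy(obs, state):
    return 0.2 * (obs - state)    # small corrective step

def assertive_policy(obs, state):
    return 1.0 * (obs - state)    # full corrective step

LOW_LEVEL_POLICIES = [cautious_policy, assertive_policy]

def high_level_policy(obs, state):
    """Select a low-level policy index from the current policy information.
    Placeholder rule: be assertive when far from the observed target."""
    return 1 if abs(obs - state) > 1.0 else 0

def control_step(obs, state):
    """One control cycle: high-level selection, then low-level action."""
    idx = high_level_policy(obs, state)
    action = LOW_LEVEL_POLICIES[idx](obs, state)
    return idx, action

idx, action = control_step(obs=3.0, state=0.0)   # far from target: assertive
```

The design separates *which behavior* to use (high level) from *how to execute it* (low level), which is the division of labor the claim describes.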
-
Patent number: 11205082
Abstract: A system and method for predicting pedestrian intent is provided. A prediction circuit comprising a plurality of gated recurrent units (GRUs) receives a sequence of images captured by a camera. The prediction circuit parses each frame of the sequence of images to identify one or more pedestrians and one or more objects. Using the parsed data, the prediction circuit generates a pedestrian-centric spatiotemporal graph, the parsed data comprising one or more identified pedestrians and one or more identified objects. The prediction circuit uses the pedestrian-centric graph to determine a probability of one or more pedestrians crossing a street for each frame of the sequence of images.
Type: Grant
Filed: October 8, 2019
Date of Patent: December 21, 2021
Assignees: TOYOTA RESEARCH INSTITUTE, INC., The Board of Trustees of the Leland Stanford Junior University
Inventors: Ehsan Adeli-Mosabbeb, Kuan Lee, Adrien Gaidon, Bingbin Liu, Zhangjie Cao, Juan Carlos Niebles
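A minimal numerical sketch of the recurrent part of this idea: a single GRU cell consuming per-frame pedestrian features and emitting a per-frame crossing probability. The weights are random (untrained) stand-ins, and the graph construction and feature extraction steps of the patent are abstracted into the input features:

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_H = 8, 16   # per-frame feature size, GRU hidden size (arbitrary)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Random parameters -- stand-ins for the trained GRU weights.
Wz, Uz = rng.normal(size=(D_H, D_IN)), rng.normal(size=(D_H, D_H))
Wr, Ur = rng.normal(size=(D_H, D_IN)), rng.normal(size=(D_H, D_H))
Wh, Uh = rng.normal(size=(D_H, D_IN)), rng.normal(size=(D_H, D_H))
w_out = rng.normal(size=D_H)

def gru_step(x, h):
    """One gated-recurrent-unit update (standard GRU equations)."""
    z = sigmoid(Wz @ x + Uz @ h)            # update gate
    r = sigmoid(Wr @ x + Ur @ h)            # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
    return (1 - z) * h + z * h_tilde

def crossing_probabilities(frames):
    """Per-frame probability that the pedestrian will cross the street."""
    h = np.zeros(D_H)
    probs = []
    for x in frames:      # x: features of the pedestrian + nearby objects
        h = gru_step(x, h)
        probs.append(sigmoid(w_out @ h))
    return probs

probs = crossing_probabilities(rng.normal(size=(10, D_IN)))
```

The recurrence is what lets the prediction at frame t depend on the pedestrian's whole observed history, not just the current frame.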
-
Patent number: 11144761
Abstract: A system for applying video data to a neural network (NN) for online multi-class multi-object tracking includes a computer programmed to perform an image classification method including the operations of receiving a video sequence; detecting candidate objects in each of a previous and a current video frame; transforming the previous and current video frames into a temporal difference input image; applying the temporal difference input image to a pre-trained neural network (NN) (or deep convolutional network) comprising an ordered sequence of layers; and, based on a classification value received from the neural network, associating a pair of detected candidate objects in the previous and current frames as belonging to one of matching objects and different objects.
Type: Grant
Filed: April 4, 2016
Date of Patent: October 12, 2021
Assignee: XEROX CORPORATION
Inventor: Adrien Gaidon
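The key input representation here is easy to show concretely: the network sees the signed difference between two frames rather than the frames themselves. The `associate` classifier below is a trivial motion-energy threshold standing in for the patent's trained network:

```python
import numpy as np

def temporal_difference_image(prev_frame, curr_frame):
    """Signed per-pixel difference fed to the network as its input image."""
    return curr_frame.astype(np.float64) - prev_frame.astype(np.float64)

def associate(prev_patch, curr_patch, threshold=0.1):
    """Toy stand-in for the trained classifier: low mean motion energy in
    the difference image -> 'matching' objects, otherwise 'different'."""
    diff = temporal_difference_image(prev_patch, curr_patch)
    return "matching" if np.abs(diff).mean() < threshold else "different"

prev = np.zeros((32, 32))   # candidate object patch in the previous frame
same = prev + 0.01          # nearly identical patch in the current frame
other = prev + 0.9          # very different patch in the current frame
```

Feeding differences makes the appearance change between frames, rather than absolute appearance, the signal the classifier learns from.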
-
Patent number: 11074438
Abstract: A method for predicting spatial positions of several key points on a human body in the near future in an egocentric setting is described. The method includes generating a frame-level supervision for human poses. The method also includes suppressing noise and filling missing joints of the human body using a pose completion module. The method further includes splitting the poses into a global stream and a local stream. Furthermore, the method includes combining the global stream and the local stream to forecast future human locomotion.
Type: Grant
Filed: October 1, 2019
Date of Patent: July 27, 2021
Assignees: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
Inventors: Karttikeya Mangalam, Ehsan Adeli-Mosabbeb, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles Duque
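The global/local split can be illustrated with a tiny sketch: the global stream is the body-centre trajectory, the local stream is each joint's offset from that centre, and the forecast recombines them. A constant-velocity rule stands in for the two learned forecasting streams of the patent:

```python
import numpy as np

def split_streams(poses):
    """Split pose sequences into a global stream (body-centre trajectory)
    and a local stream (joint positions relative to that centre)."""
    centres = poses.mean(axis=1, keepdims=True)   # (T, 1, 2)
    return centres[:, 0, :], poses - centres      # global (T,2), local (T,J,2)

def forecast(poses, horizon=3):
    """Constant-velocity stand-ins for the two learned streams,
    recombined into full-body poses for each future step."""
    g, l = split_streams(poses)
    g_vel = g[-1] - g[-2]                          # global velocity estimate
    future = []
    for t in range(1, horizon + 1):
        future.append(g[-1] + t * g_vel + l[-1])   # recombine the streams
    return np.stack(future)                         # (horizon, J, 2)

T, J = 5, 13                                        # frames, body key points
poses = np.cumsum(np.ones((T, J, 2)), axis=0)       # toy poses drifting diagonally
pred = forecast(poses)
```

Separating the streams lets whole-body motion and articulation be modeled at their own scales before being combined.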
-
Publication number: 20210134002
Abstract: A method for monocular 3D object perception is described. The method includes sampling multiple, stochastic latent variables from a learned latent feature distribution of an RGB image for a 2D object detected in the RGB image. The method also includes lifting a 3D proposal for each stochastic latent variable sampled for the detected 2D object. The method further includes selecting a 3D proposal for the detected 2D object using a proposal selection algorithm to reduce 3D proposal lifting overlap. The method also includes planning a trajectory of an ego vehicle according to a 3D location and pose of the 2D object according to the selected 3D proposal.
Type: Application
Filed: January 24, 2020
Publication date: May 6, 2021
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventors: Yu YAO, Wadim KEHL, Adrien GAIDON
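The sample-lift-select loop can be sketched in miniature. Everything below is illustrative: the "lifting" maps a latent to a (score, depth) pair, and a greedy NMS-style rule stands in for the patent's proposal selection algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_latents(n, dim=4):
    """Sample stochastic latent variables for one detected 2D object."""
    return rng.normal(size=(n, dim))

def lift_proposal(latent):
    """Stand-in lifting: map a latent to a (score, depth) 3D proposal."""
    score = 1.0 / (1.0 + np.exp(-latent[0]))   # confidence from first dim
    depth = 10.0 + 2.0 * latent[1]             # metres along the camera ray
    return score, depth

def select_proposals(proposals, min_separation=1.0):
    """Greedy selection: keep the best-scoring proposal, drop near-duplicates
    (an NMS-style stand-in for the overlap-reducing selection step)."""
    kept = []
    for score, depth in sorted(proposals, reverse=True):
        if all(abs(depth - d) >= min_separation for _, d in kept):
            kept.append((score, depth))
    return kept

proposals = [lift_proposal(z) for z in sample_latents(16)]
selected = select_proposals(proposals)
```

Sampling many latents turns a single ambiguous monocular detection into a set of hypotheses, and selection prunes the redundant ones.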
-
Publication number: 20210103742
Abstract: A system and method for predicting pedestrian intent is provided. A prediction circuit comprising a plurality of gated recurrent units (GRUs) receives a sequence of images captured by a camera. The prediction circuit parses each frame of the sequence of images to identify one or more pedestrians and one or more objects. Using the parsed data, the prediction circuit generates a pedestrian-centric spatiotemporal graph, the parsed data comprising one or more identified pedestrians and one or more identified objects. The prediction circuit uses the pedestrian-centric graph to determine a probability of one or more pedestrians crossing a street for each frame of the sequence of images.
Type: Application
Filed: October 8, 2019
Publication date: April 8, 2021
Inventors: Ehsan Adeli-Mosabbeb, Kuan Lee, Adrien Gaidon, Bingbin Liu, Zhangjie Cao, Juan Carlos Niebles
-
Publication number: 20210097266
Abstract: A method for predicting spatial positions of several key points on a human body in the near future in an egocentric setting is described. The method includes generating a frame-level supervision for human poses. The method also includes suppressing noise and filling missing joints of the human body using a pose completion module. The method further includes splitting the poses into a global stream and a local stream. Furthermore, the method includes combining the global stream and the local stream to forecast future human locomotion.
Type: Application
Filed: October 1, 2019
Publication date: April 1, 2021
Applicants: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
Inventors: Karttikeya MANGALAM, Ehsan ADELI-MOSABBEB, Kuan-Hui LEE, Adrien GAIDON, Juan Carlos NIEBLES DUQUE
-
Publication number: 20200234066
Abstract: A method for performing vehicle taillight recognition is described. The method includes extracting spatial features from a sequence of images of a real-world traffic scene during operation of an ego vehicle. The method includes selectively focusing a convolutional neural network (CNN) of a CNN-long short-term memory (CNN-LSTM) framework on a selected region of the sequence of images according to a spatial attention model for a vehicle taillight recognition task. The method includes selecting, by an LSTM network of the CNN-LSTM framework, frames within the selected region of the sequence of images according to a temporal attention model for the vehicle taillight recognition task. The method includes inferring, according to the selected frames within the selected region of the sequence of images, an intent of an ado vehicle according to a taillight state. The method includes planning a trajectory of the ego vehicle from the intent inferred from the ado vehicle.
Type: Application
Filed: April 19, 2019
Publication date: July 23, 2020
Inventors: Kuan-Hui LEE, Takaaki TAGAWA, Jia-En M. PAN, Adrien GAIDON, Bertrand DOUILLARD
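The two attention stages can be sketched generically: a spatial softmax over cells of a CNN feature map, then a temporal softmax over frames. The query vector and random features are placeholders; the patent's learned attention models would produce these weights from training:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def spatial_attention(feature_map, query):
    """Weight each spatial cell of a CNN feature map by task relevance,
    then pool into a single per-frame feature."""
    h, w, c = feature_map.shape
    cells = feature_map.reshape(h * w, c)
    weights = softmax(cells @ query)            # one weight per cell
    return weights @ cells                      # attended feature, shape (c,)

def temporal_attention(frame_features, query):
    """Weight frames across the sequence, then pool into one clip feature."""
    weights = softmax(frame_features @ query)   # one weight per frame
    return weights @ frame_features

rng = np.random.default_rng(2)
C = 8
query = rng.normal(size=C)                      # placeholder attention query
frames = [rng.normal(size=(4, 4, C)) for _ in range(6)]   # CNN feature maps
per_frame = np.stack([spatial_attention(f, query) for f in frames])
clip_feature = temporal_attention(per_frame, query)   # input to a classifier
```

Spatial attention narrows *where* in each frame to look (the taillight region); temporal attention narrows *which frames* carry the state change.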
-
Patent number: 10019652
Abstract: A system and method are suited for assessing video performance analysis. A computer graphics engine clones real-world data in a virtual world by decomposing the real-world data into visual components and objects in one or more object categories and populates the virtual world with virtual visual components and virtual objects. A scripting component controls the virtual visual components and the virtual objects in the virtual world based on the set of real-world data. A synthetic clone of the video sequence is generated based on the script controlling the virtual visual components and the virtual objects. The real-world data is compared with the synthetic clone of the video sequence and a transferability of conclusions from the virtual world to the real world is assessed based on this comparison.
Type: Grant
Filed: February 23, 2016
Date of Patent: July 10, 2018
Assignee: Xerox Corporation
Inventors: Qiao Wang, Adrien Gaidon, Eleonora Vig
-
Patent number: 9984315
Abstract: Methods and systems for online domain adaptation for multi-object tracking. Video of an area of interest can be captured with an image-capturing unit. The video (e.g., video images) can be analyzed with a pre-trained object detector utilizing online domain adaptation including convex multi-task learning and an associated self-tuning stochastic optimization procedure to jointly adapt online all trackers associated with the pre-trained object detector and a pre-trained category-level model from the trackers in order to efficiently track a plurality of objects in the video captured by the image-capturing unit.
Type: Grant
Filed: May 5, 2015
Date of Patent: May 29, 2018
Assignee: Conduent Business Services, LLC
Inventors: Adrien Gaidon, Eleonora Vig
-
Patent number: 9946933
Abstract: A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
Type: Grant
Filed: August 18, 2016
Date of Patent: April 17, 2018
Assignee: XEROX CORPORATION
Inventors: César Roberto De Souza, Adrien Gaidon, Eleonora Vig, Antonio M. Lopez
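The described flow (input video plus transformations → local descriptors → unsupervised aggregation → supervised classifier) can be sketched end to end. Every component below is a simple stand-in: the flip is one example transformation, per-frame mean/std stand in for local descriptors, mean pooling stands in for the learned unsupervised layers, and a logistic unit stands in for the supervised layers:

```python
import numpy as np

rng = np.random.default_rng(5)

def horizontal_flip(video):
    """One example transformation of the input video."""
    return video[:, :, ::-1]

def local_descriptors(video):
    """Stand-in for dense local descriptors: per-frame mean and std."""
    t = video.reshape(video.shape[0], -1)
    return np.stack([t.mean(axis=1), t.std(axis=1)], axis=1)   # (T, 2)

def aggregate(descriptor_sets):
    """Stand-in for the unsupervised aggregation layers: mean pooling."""
    return np.concatenate([d.mean(axis=0) for d in descriptor_sets])

def classify(feature, w, b):
    """Stand-in for the supervised layers: linear score -> probability."""
    return 1.0 / (1.0 + np.exp(-(w @ feature + b)))

video = rng.random((8, 16, 16))              # T frames of 16x16 grey pixels
views = [video, horizontal_flip(video)]      # input + one transformation
feature = aggregate([local_descriptors(v) for v in views])
w, b = rng.normal(size=feature.size), 0.0
score = classify(feature, w, b)
```

The structural point is the two-stage learning: the aggregation layers are fit without labels, and only the final classification layers need supervision.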
-
Publication number: 20180053057
Abstract: A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
Type: Application
Filed: August 18, 2016
Publication date: February 22, 2018
Applicant: Xerox Corporation
Inventors: César Roberto De Souza, Adrien Gaidon, Eleonora Vig, Antonio M. Lopez
-
Patent number: 9792492
Abstract: A method for extracting a representation from an image includes inputting an image to a pre-trained neural network. The gradient of a loss function is computed with respect to parameters of the neural network, for the image. A gradient representation is extracted for the image based on the computed gradients, which can be used, for example, for classification or retrieval.
Type: Grant
Filed: July 7, 2015
Date of Patent: October 17, 2017
Assignee: XEROX CORPORATION
Inventors: Albert Gordo Soldevila, Adrien Gaidon, Florent C. Perronnin
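The core idea, a fixed-length descriptor built from parameter gradients, can be shown on the smallest possible "network": a single logistic unit. This is a toy model chosen for clarity, not the architecture of the patent:

```python
import numpy as np

def loss_and_grad(w, b, x, y):
    """Logistic loss of a tiny linear 'network' and its parameter gradients."""
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    dz = p - y                      # d loss / d z
    return loss, dz * x, dz         # gradients w.r.t. w and b

def gradient_representation(w, b, x, y):
    """Concatenate all parameter gradients into one fixed-length descriptor."""
    _, gw, gb = loss_and_grad(w, b, x, y)
    return np.concatenate([gw, [gb]])

rng = np.random.default_rng(3)
w, b = rng.normal(size=5), 0.1      # pre-trained (here: random) parameters
x = rng.normal(size=5)              # stands in for an input image
rep = gradient_representation(w, b, x, y=1.0)
```

Because the gradient vector has one entry per parameter, every image yields a descriptor of the same dimensionality regardless of image content, which is what makes it usable for classification or retrieval.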
-
Publication number: 20170286774
Abstract: A system for applying video data to a neural network (NN) for online multi-class multi-object tracking includes a computer programmed to perform an image classification method including the operations of receiving a video sequence; detecting candidate objects in each of a previous and a current video frame; transforming the previous and current video frames into a temporal difference input image; applying the temporal difference input image to a pre-trained neural network (NN) (or deep convolutional network) comprising an ordered sequence of layers; and, based on a classification value received from the neural network, associating a pair of detected candidate objects in the previous and current frames as belonging to one of matching objects and different objects.
Type: Application
Filed: April 4, 2016
Publication date: October 5, 2017
Applicant: Xerox Corporation
Inventor: Adrien Gaidon
-
Publication number: 20170243083
Abstract: A system and method are suited for assessing video performance analysis. A computer graphics engine clones real-world data in a virtual world by decomposing the real-world data into visual components and objects in one or more object categories and populates the virtual world with virtual visual components and virtual objects. A scripting component controls the virtual visual components and the virtual objects in the virtual world based on the set of real-world data. A synthetic clone of the video sequence is generated based on the script controlling the virtual visual components and the virtual objects. The real-world data is compared with the synthetic clone of the video sequence and a transferability of conclusions from the virtual world to the real world is assessed based on this comparison.
Type: Application
Filed: February 23, 2016
Publication date: August 24, 2017
Applicant: Xerox Corporation
Inventors: Qiao Wang, Adrien Gaidon, Eleonora Vig
-
Patent number: 9697439
Abstract: An object detection method includes, for each of a set of patches of an image, encoding features of the patch with a non-linear mapping function, and computing per-patch statistics based on the encoded features for approximating a window-level non-linear operation by a patch-level operation. Then, windows are extracted from the image, each window comprising a sub-set of the set of patches. Each of the windows is scored based on the computed patch statistics of the respective sub-set of patches. Objects, if any, can then be detected in the image, based on the window scores. The method and system allow the non-linear operations to be performed only at the patch level, reducing the computation time of the method, since there are generally many more windows than patches, while not impacting performance unduly, as compared to a system which performs non-linear operations at the window level.
Type: Grant
Filed: October 2, 2014
Date of Patent: July 4, 2017
Assignee: XEROX CORPORATION
Inventors: Adrien Gaidon, Diane Larlus-Larrondo, Florent C. Perronnin
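The efficiency argument can be made concrete with a sketch: encode each patch non-linearly once, then score every window as a *linear* function of the sum of its patches' encodings, so overlapping windows reuse the expensive step. The signed-square-root encoding below is just one example of a non-linear mapping, not necessarily the one the patent uses:

```python
import numpy as np

def encode_patch(patch):
    """Non-linear patch encoding computed ONCE per patch (signed sqrt here)."""
    v = patch.ravel()
    return np.sign(v) * np.sqrt(np.abs(v))

def score_windows(patches, windows, w):
    """Each window is a set of patch indices; its score is a linear function
    of the SUM of per-patch encodings, so no per-window non-linearity runs."""
    encoded = [encode_patch(p) for p in patches]       # patch-level statistics
    return [w @ sum(encoded[i] for i in idx) for idx in windows]

rng = np.random.default_rng(4)
patches = [rng.normal(size=(4, 4)) for _ in range(6)]
windows = [(0, 1, 2), (1, 2, 3), (3, 4, 5)]            # overlapping subsets
w = rng.normal(size=16)                                 # linear scoring weights
scores = score_windows(patches, windows, w)
```

With many more windows than patches (sliding-window detection typically has thousands of windows over hundreds of patches), moving the non-linearity to the patch level is where the speedup comes from.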
-
Publication number: 20170011280
Abstract: A method for extracting a representation from an image includes inputting an image to a pre-trained neural network. The gradient of a loss function is computed with respect to parameters of the neural network, for the image. A gradient representation is extracted for the image based on the computed gradients, which can be used, for example, for classification or retrieval.
Type: Application
Filed: July 7, 2015
Publication date: January 12, 2017
Applicant: Xerox Corporation
Inventors: Albert Gordo Soldevila, Adrien Gaidon, Florent C. Perronnin
-
Publication number: 20160328613
Abstract: Methods and systems for online domain adaptation for multi-object tracking. Video of an area of interest can be captured with an image-capturing unit. The video (e.g., video images) can be analyzed with a pre-trained object detector utilizing online domain adaptation including convex multi-task learning and an associated self-tuning stochastic optimization procedure to jointly adapt online all trackers associated with the pre-trained object detector and a pre-trained category-level model from the trackers in order to efficiently track a plurality of objects in the video captured by the image-capturing unit.
Type: Application
Filed: May 5, 2015
Publication date: November 10, 2016
Inventors: Adrien Gaidon, Eleonora Vig
-
Publication number: 20160314351
Abstract: A graphical user interface (GUI) of a business process management (BPM) system is provided to construct a process model that is displayed on a graphical display device as a graphical representation comprising nodes representing process events, activities, or decision points and including computer vision (CV) nodes representing video stream processing, with flow connectors defining operational sequences of nodes and data flow between nodes of the process model. The process model is executed to perform a process represented by the process model, including executing CV nodes of the process model by performing video stream processing represented by the CV nodes of the process model. The available CV nodes include a set of video pattern detection nodes, and a set of video pattern relation nodes defining a video grammar of relations between video patterns detectable by the video pattern detection nodes.
Type: Application
Filed: April 27, 2015
Publication date: October 27, 2016
Inventors: Adrian Corneliu Mos, Adrien Gaidon, Eleonora Vig