Patents by Inventor Adrien GAIDON

Adrien GAIDON has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220026918
    Abstract: A method for controlling an ego agent includes capturing a two-dimensional (2D) image of an environment adjacent to the ego agent. The method also includes generating a semantically segmented image of the environment based on the 2D image. The method further includes generating a depth map of the environment based on the semantically segmented image. The method additionally includes generating a three-dimensional (3D) estimate of the environment based on the depth map. The method also includes controlling an action of the ego agent based on a location identified from the 3D estimate.
    Type: Application
    Filed: July 23, 2020
    Publication date: January 27, 2022
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Vitor GUIZILINI, Jie LI, Rares A. AMBRUS, Sudeep PILLAI, Adrien GAIDON
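The final lifting step described in the abstract above — turning a per-pixel depth map into a 3D estimate — can be sketched minimally with a pinhole camera model. This is an illustrative reconstruction, not the patented method; the function name, intrinsics (`fx`, `fy`, `cx`, `cy`), and values are all hypothetical.

```python
# Hypothetical sketch: unproject a depth map into 3D camera-frame points
# using pinhole intrinsics. Values are illustrative only.

def unproject(depth, fx, fy, cx, cy):
    """Lift a 2D depth map (list of rows of depths) into 3D points."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, z))
    return points

pts = unproject([[2.0, 2.0], [4.0, 4.0]], fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```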
  • Publication number: 20210398014
    Abstract: A method for controlling an ego agent includes periodically receiving policy information comprising a spatial environment observation and a current state of the ego agent. The method also includes selecting, for each received set of policy information, a low-level policy from a number of low-level policies. The low-level policy may be selected based on a high-level policy. The method further includes controlling an action of the ego agent based on the selected low-level policy.
    Type: Application
    Filed: August 25, 2020
    Publication date: December 23, 2021
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
    Inventors: Zhangjie CAO, Erdem BIYIK, Woodrow Zhouyuan WANG, Allan RAVENTOS, Adrien GAIDON, Guy ROSMAN, Dorsa SADIGH
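The hierarchical control idea in the abstract above — a high-level policy choosing among low-level policies, which then produce the action — can be sketched as follows. All policy names, thresholds, and state fields are invented for illustration; the patent does not specify them.

```python
# Hedged sketch of hierarchical policy selection. The "cautious"/"nominal"
# policies and the obstacle-distance rule are illustrative assumptions.

def high_level_policy(observation):
    """Select a low-level policy key from the spatial observation."""
    return "cautious" if observation["obstacle_distance"] < 5.0 else "nominal"

low_level_policies = {
    "cautious": lambda state: {"throttle": 0.1, "steer": state["heading_error"]},
    "nominal":  lambda state: {"throttle": 0.6, "steer": state["heading_error"]},
}

def control_step(observation, state):
    """One control cycle: high-level selection, then low-level action."""
    policy = low_level_policies[high_level_policy(observation)]
    return policy(state)

action = control_step({"obstacle_distance": 3.0}, {"heading_error": 0.2})
```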
  • Patent number: 11205082
    Abstract: A system and method for predicting pedestrian intent is provided. A prediction circuit comprising a plurality of gated recurrent units (GRUs) receives a sequence of images captured by a camera. The prediction circuit parses each frame of the sequence of images to identify one or more pedestrians and one or more objects. Using the parsed data, which comprises the identified pedestrians and objects, the prediction circuit generates a pedestrian-centric spatiotemporal graph. The prediction circuit uses the pedestrian-centric graph to determine, for each frame of the sequence of images, a probability of one or more pedestrians crossing a street.
    Type: Grant
    Filed: October 8, 2019
    Date of Patent: December 21, 2021
    Assignees: TOYOTA RESEARCH INSTITUTE, INC., The Board of Trustees of the Leland Stanford Junior University
    Inventors: Ehsan Adeli-Mosabbeb, Kuan Lee, Adrien Gaidon, Bingbin Liu, Zhangjie Cao, Juan Carlos Niebles
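The per-frame crossing probability described above comes from recurrent units over a pedestrian-centric graph. A toy recurrent update can illustrate the shape of the computation; the state-update rule and the "evidence" input are invented stand-ins, not the patented GRU-based circuit.

```python
import math

# Toy sketch of a per-frame recurrent update producing a crossing
# probability. The 0.5 decay and per-frame evidence are assumptions.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def crossing_probability(frames, h=0.0):
    """Return one crossing probability per frame from evidence scores."""
    probs = []
    for evidence in frames:          # e.g. proximity-to-curb per frame
        h = 0.5 * h + evidence       # simplistic recurrent state update
        probs.append(sigmoid(h))
    return probs

probs = crossing_probability([0.0, 2.0])
```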
  • Patent number: 11144761
    Abstract: A system for applying video data to a neural network (NN) for online multi-class multi-object tracking includes a computer programmed to perform an image classification method including the operations of receiving a video sequence; detecting candidate objects in each of a previous and a current video frame; transforming the previous and current video frames into a temporal difference input image; applying the temporal difference input image to a pre-trained neural network (NN) (or deep convolutional network) comprising an ordered sequence of layers; and based on a classification value received by the neural network, associating a pair of detected candidate objects in the previous and current frames as belonging to one of matching objects and different objects.
    Type: Grant
    Filed: April 4, 2016
    Date of Patent: October 12, 2021
    Assignee: XEROX CORPORATION
    Inventor: Adrien Gaidon
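The temporal-difference transformation in the abstract above is the one concrete preprocessing step: the network sees a per-pixel difference of consecutive frames rather than the raw frames. A minimal grayscale sketch (illustrative values, not the patented pipeline):

```python
# Hedged sketch: build the temporal difference input image from two
# consecutive grayscale frames (lists of pixel rows).

def temporal_difference(prev_frame, curr_frame):
    """Per-pixel difference image that would be fed to the classifier."""
    return [[c - p for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]

diff = temporal_difference([[10, 10], [10, 10]], [[10, 30], [10, 10]])
```

Non-zero entries mark pixels that changed between frames, which is the motion cue the classifier uses to decide whether two detections match.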
  • Patent number: 11074438
    Abstract: A method for predicting spatial positions of several key points on a human body in the near future in an egocentric setting is described. The method includes generating a frame-level supervision for human poses. The method also includes suppressing noise and filling missing joints of the human body using a pose completion module. The method further includes splitting the poses into a global stream and a local stream. Furthermore, the method includes combining the global stream and the local stream to forecast future human locomotion.
    Type: Grant
    Filed: October 1, 2019
    Date of Patent: July 27, 2021
    Assignees: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
    Inventors: Karttikeya Mangalam, Ehsan Adeli-Mosabbeb, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles Duque
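The global/local split described in the abstract above can be sketched as decomposing a pose into a reference trajectory and joint offsets relative to it, then recombining. The choice of the first joint as the reference (e.g., a hip) and all coordinates are illustrative assumptions.

```python
# Hedged sketch of splitting a 2D pose into a global stream (reference
# joint) and a local stream (offsets), then recombining them.

def split_pose(joints, ref_index=0):
    """Return (global reference joint, joints relative to it)."""
    ref = joints[ref_index]
    local = [(x - ref[0], y - ref[1]) for x, y in joints]
    return ref, local

def combine(ref, local):
    """Recombine the global and local streams into absolute joints."""
    return [(x + ref[0], y + ref[1]) for x, y in local]

ref, local = split_pose([(2.0, 3.0), (2.5, 1.0)])
restored = combine(ref, local)
```

In the patented method the two streams are forecast separately before being combined; here the split/combine round-trip simply shows the decomposition is lossless.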
  • Publication number: 20210134002
    Abstract: A method for monocular 3D object perception is described. The method includes sampling multiple, stochastic latent variables from a learned latent feature distribution of an RGB image for a 2D object detected in the RGB image. The method also includes lifting a 3D proposal for each stochastic latent variable sampled for the detected 2D object. The method further includes selecting a 3D proposal for the detected 2D object using a proposal selection algorithm to reduce 3D proposal lifting overlap. The method also includes planning a trajectory of an ego vehicle according to a 3D location and pose of the 2D object according to the selected 3D proposal.
    Type: Application
    Filed: January 24, 2020
    Publication date: May 6, 2021
    Applicant: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Yu YAO, Wadim KEHL, Adrien GAIDON
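The proposal-selection step in the abstract above reduces overlap among lifted 3D proposals. A greedy NMS-style rule over 1D footprints can illustrate the idea; the scoring, the interval footprints, and the zero-overlap criterion are simplifying assumptions, not the patent's algorithm.

```python
# Illustrative greedy selection: keep the highest-scoring proposal whose
# footprint does not overlap any already-selected proposal.

def overlap_1d(a, b):
    """Length of the overlap between intervals a=(lo,hi) and b=(lo,hi)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def select_proposals(proposals):
    """proposals: list of (score, (lo, hi)) footprint intervals."""
    chosen = []
    for score, span in sorted(proposals, reverse=True):
        if all(overlap_1d(span, s) == 0.0 for _, s in chosen):
            chosen.append((score, span))
    return chosen

picked = select_proposals([(0.9, (0.0, 2.0)),
                           (0.8, (1.0, 3.0)),   # overlaps the first; dropped
                           (0.7, (4.0, 5.0))])
```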
  • Publication number: 20210103742
    Abstract: A system and method for predicting pedestrian intent is provided. A prediction circuit comprising a plurality of gated recurrent units (GRUs) receives a sequence of images captured by a camera. The prediction circuit parses each frame of the sequence of images to identify one or more pedestrians and one or more objects. Using the parsed data, which comprises the identified pedestrians and objects, the prediction circuit generates a pedestrian-centric spatiotemporal graph. The prediction circuit uses the pedestrian-centric graph to determine, for each frame of the sequence of images, a probability of one or more pedestrians crossing a street.
    Type: Application
    Filed: October 8, 2019
    Publication date: April 8, 2021
    Inventors: Ehsan Adeli-Mosabbeb, Kuan Lee, Adrien Gaidon, Bingbin Liu, Zhangjie Cao, Juan Carlos Niebles
  • Publication number: 20210097266
    Abstract: A method for predicting spatial positions of several key points on a human body in the near future in an egocentric setting is described. The method includes generating a frame-level supervision for human poses. The method also includes suppressing noise and filling missing joints of the human body using a pose completion module. The method further includes splitting the poses into a global stream and a local stream. Furthermore, the method includes combining the global stream and the local stream to forecast future human locomotion.
    Type: Application
    Filed: October 1, 2019
    Publication date: April 1, 2021
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
    Inventors: Karttikeya MANGALAM, Ehsan ADELI-MOSABBEB, Kuan-Hui LEE, Adrien GAIDON, Juan Carlos NIEBLES DUQUE
  • Publication number: 20200234066
    Abstract: A method for performing vehicle taillight recognition is described. The method includes extracting spatial features from a sequence of images of a real-world traffic scene during operation of an ego vehicle. The method includes selectively focusing a convolutional neural network (CNN) of a CNN-long short-term memory (CNN-LSTM) framework on a selected region of the sequence of images according to a spatial attention model for a vehicle taillight recognition task. The method includes selecting, by an LSTM network of the CNN-LSTM framework, frames within the selected region of the sequence of images according to a temporal attention model for the vehicle taillight recognition task. The method includes inferring, according to the selected frames within the selected region of the sequence of images, an intent of an ado vehicle according to a taillight state. The method includes planning a trajectory of the ego vehicle from the intent inferred from the ado vehicle.
    Type: Application
    Filed: April 19, 2019
    Publication date: July 23, 2020
    Inventors: Kuan-Hui LEE, Takaaki TAGAWA, Jia-En M. PAN, Adrien GAIDON, Bertrand DOUILLARD
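The temporal attention model in the abstract above weights frames before pooling them. A softmax-weighted pooling step is the standard form of such a model and can be sketched as follows; the feature and score values are illustrative, and the patented framework applies this inside a CNN-LSTM rather than to scalars.

```python
import math

# Hedged sketch of softmax temporal attention: per-frame relevance scores
# become weights, and frame features are pooled by the weighted sum.

def temporal_attention(frame_feats, scores):
    """Pool per-frame features using softmax of per-frame scores."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * f for w, f in zip(weights, frame_feats))

pooled = temporal_attention([1.0, 3.0], [0.0, 0.0])  # equal scores
```

With equal scores the weights are uniform, so the pooled value is the plain mean; higher scores shift the pooled feature toward the corresponding frames.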
  • Patent number: 10019652
    Abstract: A system and method for assessing video performance analysis are provided. A computer graphics engine clones real-world data in a virtual world by decomposing the real-world data into visual components and objects in one or more object categories and populates the virtual world with virtual visual components and virtual objects. A scripting component controls the virtual visual components and the virtual objects in the virtual world based on the real-world data. A synthetic clone of the video sequence is generated based on the script controlling the virtual visual components and the virtual objects. The real-world data is compared with the synthetic clone of the video sequence, and a transferability of conclusions from the virtual world to the real world is assessed based on this comparison.
    Type: Grant
    Filed: February 23, 2016
    Date of Patent: July 10, 2018
    Assignee: Xerox Corporation
    Inventors: Qiao Wang, Adrien Gaidon, Eleonora Vig
  • Patent number: 9984315
    Abstract: Methods and systems for online domain adaptation for multi-object tracking. Video of an area of interest can be captured with an image-capturing unit. The video can be analyzed with a pre-trained object detector using online domain adaptation, which combines convex multi-task learning with an associated self-tuning stochastic optimization procedure to jointly adapt, online, all trackers associated with the pre-trained object detector as well as a pre-trained category-level model derived from those trackers, in order to efficiently track a plurality of objects in the captured video.
    Type: Grant
    Filed: May 5, 2015
    Date of Patent: May 29, 2018
    Assignee: Conduent Business Services, LLC
    Inventors: Adrien Gaidon, Eleonora Vig
  • Patent number: 9946933
    Abstract: A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
    Type: Grant
    Filed: August 18, 2016
    Date of Patent: April 17, 2018
    Assignee: XEROX CORPORATION
    Inventors: César Roberto De Souza, Adrien Gaidon, Eleonora Vig, Antonio M. Lopez
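The aggregation step in the abstract above pools local descriptors from the input video and its transformations into one feature vector. A minimal sketch with a horizontal flip as the transformation and mean-pooling as the aggregation; both choices, and all values, are illustrative stand-ins for the learned processing layers.

```python
# Hedged sketch: local descriptors from a video and one transformation
# (a horizontal flip) are mean-pooled into a single feature vector.

def flip(frame_descriptor):
    """Stand-in transformation: reverse the descriptor's components."""
    return list(reversed(frame_descriptor))

def mean_pool(descriptors):
    """Aggregate a list of equal-length descriptors by the element-wise mean."""
    n = len(descriptors)
    dims = len(descriptors[0])
    return [sum(d[i] for d in descriptors) / n for i in range(dims)]

video = [[1.0, 2.0], [3.0, 4.0]]           # per-frame local descriptors
transformed = [flip(f) for f in video]      # transformation of the video
aggregated = mean_pool(video + transformed)
```

In the patented method the aggregation layers are learned without supervision and a supervised classifier operates on the aggregated vector; the sketch only shows the pooling shape.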
  • Publication number: 20180053057
    Abstract: A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
    Type: Application
    Filed: August 18, 2016
    Publication date: February 22, 2018
    Applicant: Xerox Corporation
    Inventors: César Roberto De Souza, Adrien Gaidon, Eleonora Vig, Antonio M. Lopez
  • Patent number: 9792492
    Abstract: A method for extracting a representation from an image includes inputting an image to a pre-trained neural network. The gradient of a loss function is computed with respect to parameters of the neural network, for the image. A gradient representation is extracted for the image based on the computed gradients, which can be used, for example, for classification or retrieval.
    Type: Grant
    Filed: July 7, 2015
    Date of Patent: October 17, 2017
    Assignee: XEROX CORPORATION
    Inventors: Albert Gordo Soldevila, Adrien Gaidon, Florent C. Perronnin
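The gradient representation in the abstract above uses the gradient of a loss with respect to network parameters as an image descriptor. A toy one-parameter model with a squared loss shows the computation; the patented method uses a pre-trained neural network, so the model, loss, and values here are purely illustrative.

```python
# Hedged sketch: gradient of a squared loss wrt a single parameter w,
# used as a one-dimensional "descriptor" for the input x.

def gradient_representation(x, target, w):
    """d/dw (w*x - target)^2 = 2 * (w*x - target) * x"""
    pred = w * x
    return 2.0 * (pred - target) * x

g = gradient_representation(x=3.0, target=1.0, w=0.5)
```

For a real network the same idea yields one gradient entry per parameter, giving a high-dimensional vector usable for classification or retrieval.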
  • Publication number: 20170286774
    Abstract: A system for applying video data to a neural network (NN) for online multi-class multi-object tracking includes a computer programmed to perform an image classification method including the operations of receiving a video sequence; detecting candidate objects in each of a previous and a current video frame; transforming the previous and current video frames into a temporal difference input image; applying the temporal difference input image to a pre-trained neural network (NN) (or deep convolutional network) comprising an ordered sequence of layers; and based on a classification value received by the neural network, associating a pair of detected candidate objects in the previous and current frames as belonging to one of matching objects and different objects.
    Type: Application
    Filed: April 4, 2016
    Publication date: October 5, 2017
    Applicant: Xerox Corporation
    Inventor: Adrien Gaidon
  • Publication number: 20170243083
    Abstract: A system and method for assessing video performance analysis are provided. A computer graphics engine clones real-world data in a virtual world by decomposing the real-world data into visual components and objects in one or more object categories and populates the virtual world with virtual visual components and virtual objects. A scripting component controls the virtual visual components and the virtual objects in the virtual world based on the real-world data. A synthetic clone of the video sequence is generated based on the script controlling the virtual visual components and the virtual objects. The real-world data is compared with the synthetic clone of the video sequence, and a transferability of conclusions from the virtual world to the real world is assessed based on this comparison.
    Type: Application
    Filed: February 23, 2016
    Publication date: August 24, 2017
    Applicant: Xerox Corporation
    Inventors: Qiao Wang, Adrien Gaidon, Eleonora Vig
  • Patent number: 9697439
    Abstract: An object detection method includes for each of a set of patches of an image, encoding features of the patch with a non-linear mapping function, and computing per-patch statistics based on the encoded features for approximating a window-level non-linear operation by a patch-level operation. Then, windows are extracted from the image, each window comprising a sub-set of the set of patches. Each of the windows is scored based on the computed patch statistics of the respective sub-set of patches. Objects, if any, can then be detected in the image, based on the window scores. The method and system allow the non-linear operations to be performed only at the patch level, reducing the computation time of the method, since there are generally many more windows than patches, while not impacting performance unduly, as compared to a system which performs non-linear operations at the window level.
    Type: Grant
    Filed: October 2, 2014
    Date of Patent: July 4, 2017
    Assignee: XEROX CORPORATION
    Inventors: Adrien Gaidon, Diane Larlus-Larrondo, Florent C. Perronnin
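The efficiency argument in the abstract above — do the non-linear work once per patch, then score the many windows with a cheap operation over precomputed patch statistics — can be sketched as follows. The squaring non-linearity and the sum-based window score are illustrative stand-ins for the patented encoding and scoring.

```python
# Hedged sketch: non-linear encoding applied once per patch, then each
# window scored by a cheap linear sum over its patches' statistics.

def encode_patch(feature):
    """Stand-in for the non-linear mapping applied at the patch level."""
    return feature * feature

def score_window(patch_ids, patch_stats):
    """Window-level score as a linear combination of patch statistics."""
    return sum(patch_stats[i] for i in patch_ids)

patch_stats = [encode_patch(f) for f in [1.0, 2.0, 3.0]]  # computed once
window_score = score_window([0, 2], patch_stats)           # reused per window
```

Because windows greatly outnumber patches, paying the non-linear cost only in `encode_patch` and reusing `patch_stats` across windows is what saves computation.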
  • Publication number: 20170011280
    Abstract: A method for extracting a representation from an image includes inputting an image to a pre-trained neural network. The gradient of a loss function is computed with respect to parameters of the neural network, for the image. A gradient representation is extracted for the image based on the computed gradients, which can be used, for example, for classification or retrieval.
    Type: Application
    Filed: July 7, 2015
    Publication date: January 12, 2017
    Applicant: Xerox Corporation
    Inventors: Albert Gordo Soldevila, Adrien Gaidon, Florent C. Perronnin
  • Publication number: 20160328613
    Abstract: Methods and systems for online domain adaptation for multi-object tracking. Video of an area of interest can be captured with an image-capturing unit. The video can be analyzed with a pre-trained object detector using online domain adaptation, which combines convex multi-task learning with an associated self-tuning stochastic optimization procedure to jointly adapt, online, all trackers associated with the pre-trained object detector as well as a pre-trained category-level model derived from those trackers, in order to efficiently track a plurality of objects in the captured video.
    Type: Application
    Filed: May 5, 2015
    Publication date: November 10, 2016
    Inventors: Adrien Gaidon, Eleonora Vig
  • Publication number: 20160314351
    Abstract: A graphical user interface (GUI) of a business process management (BPM) system is provided to construct a process model that is displayed on a graphical display device as a graphical representation comprising nodes representing process events, activities, or decision points and including computer vision (CV) nodes representing video stream processing, with flow connectors defining operational sequences of nodes and data flow between nodes of the process model. The process model is executed to perform a process represented by the process model including executing CV nodes of the process model by performing video stream processing represented by the CV nodes of the process model. The available CV nodes include a set of video pattern detection nodes, and a set of video pattern relation nodes defining a video grammar of relations between video patterns detectable by the video pattern detection nodes.
    Type: Application
    Filed: April 27, 2015
    Publication date: October 27, 2016
    Inventors: Adrian Corneliu Mos, Adrien Gaidon, Eleonora Vig