Patents by Inventor Alireza Nakhaei Sarvedani

Alireza Nakhaei Sarvedani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Cooperative multi-goal, multi-agent, multi-stage reinforcement learning

Patent number: 11657266

Abstract: According to one aspect, cooperative multi-goal, multi-agent, multi-stage (CM3) reinforcement learning may include training a first agent using a first policy gradient and a first critic using a first loss function to learn goals in a single-agent environment using a Markov decision process, training a number of agents based on the first policy gradient and a second policy gradient and a second critic based on the first loss function and a second loss function to learn cooperation between the agents in a multi-agent environment using a Markov game to instantiate a second agent neural network, each of the agents instantiated with the first agent neural network in a pre-trained fashion, and generating a CM3 network policy based on the first agent neural network and the second agent neural network. The CM3 network policy may be implemented in a CM3 based autonomous vehicle to facilitate autonomous driving.

Type: Grant

Filed: November 16, 2018

Date of Patent: May 23, 2023

Assignee: HONDA MOTOR CO., LTD.

Inventors: Jiachen Yang, Alireza Nakhaei Sarvedani, David Francis Isele, Kikuo Fujimura
System and method for multi-agent reinforcement learning with periodic parameter sharing

Patent number: 11657251

Abstract: A system and method for multi-agent reinforcement learning with periodic parameter sharing that include inputting at least one occupancy grid to a convolutional neural network (CNN) and at least one vehicle dynamic parameter into a first fully connected layer and concatenating outputs of the CNN and the first fully connected layer. The system and method also include providing Q value estimates for agent actions based on processing of the concatenated outputs and choosing at least one autonomous action to be executed by at least one of: an ego agent and a target agent. The system and method further include processing a multi-agent policy that accounts for operation of the ego agent and the target agent with respect to one another within a multi-agent environment based on the at least one autonomous action to be executed by at least one of: the ego agent and the target agent.

Type: Grant

Filed: November 11, 2019

Date of Patent: May 23, 2023

Assignee: HONDA MOTOR CO., LTD.

Inventors: Alireza Nakhaei Sarvedani, Kikuo Fujimura, Safa Cicek
Probabilistic-based lane-change decision making and motion planning system and method thereof

Patent number: 11608067

Abstract: A system and method for providing probabilistic-based lane-change decision making and motion planning that include receiving data associated with a roadway environment of an ego vehicle. The system and method also include performing gap analysis to determine at least one gap between neighboring vehicles that are traveling within the target lane to filter out an optimal merging entrance for the ego vehicle to merge into the target lane and determining a probability value associated with an intention of a driver of a following neighboring vehicle to yield to allow the ego vehicle to merge into the target lane. The system and method further include controlling the ego vehicle to autonomously continue traveling within the current lane or autonomously merge from current lane to the target lane based on at least one of: if the optimal merging entrance is filtered out and if the probability value indicates an intention of the driver to yield.

Type: Grant

Filed: September 11, 2020

Date of Patent: March 21, 2023

Assignee: HONDA MOTOR CO., LTD.

Inventors: Peng Xu, Alireza Nakhaei Sarvedani, Kikuo Fujimura
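The gap analysis and yield-probability gating in the abstract above can be pictured with a small sketch. This is not the patented method: the names `Gap`, `find_gaps`, `best_gap`, and `decide`, the minimum gap length, and the 0.5 probability threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gap:
    rear: float   # longitudinal position of the following vehicle (m)
    front: float  # longitudinal position of the leading vehicle (m)

    @property
    def length(self) -> float:
        return self.front - self.rear


def find_gaps(positions: List[float]) -> List[Gap]:
    """Return the gaps between consecutive vehicles in the target lane."""
    ordered = sorted(positions)
    return [Gap(rear, front) for rear, front in zip(ordered, ordered[1:])]


def best_gap(gaps: List[Gap], ego_position: float, min_length: float) -> Optional[Gap]:
    """Pick the sufficiently long gap whose centre is closest to the ego vehicle.

    The "closest centre" criterion is an assumption made for this sketch.
    """
    usable = [g for g in gaps if g.length >= min_length]
    if not usable:
        return None
    return min(usable, key=lambda g: abs((g.rear + g.front) / 2 - ego_position))


def decide(gap: Optional[Gap], yield_probability: float, threshold: float = 0.5) -> str:
    """Merge only if a gap was selected and the follower seems likely to yield."""
    if gap is not None and yield_probability >= threshold:
        return "merge"
    return "keep_lane"
```

For example, with target-lane vehicles at 0, 30, 38, and 70 m and the ego vehicle at 50 m, `best_gap(find_gaps([0.0, 30.0, 38.0, 70.0]), 50.0, 12.0)` selects the gap between 38 m and 70 m, and `decide` returns `"merge"` only when the estimated yield probability reaches the threshold.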
System and method for providing cooperation-aware lane change control in dense traffic

Patent number: 11608083

Abstract: A system and method for providing cooperation-aware lane change control in dense traffic that include receiving vehicle dynamic data associated with an ego vehicle and receiving environment data associated with a surrounding environment of the ego vehicle. The system and method also include utilizing a controller that includes an analyzer to analyze the vehicle dynamic data and a recurrent neural network to analyze the environment data. The system and method further include executing a heuristic algorithm that sequentially evaluates the future states of the ego vehicle and the predicted interactive motions of the surrounding vehicles to promote the cooperation-aware lane change control in the dense traffic.

Type: Grant

Filed: April 9, 2020

Date of Patent: March 21, 2023

Assignee: HONDA MOTOR CO., LTD.

Inventors: Alireza Nakhaei Sarvedani, Kikuo Fujimura, Chiho Choi, Sangjae Bae, Dhruv Mauria Saxena
Model-free reinforcement learning

Patent number: 11465650

Abstract: A system for generating a model-free reinforcement learning policy may include a processor, a memory, and a simulator. The simulator may be implemented via the processor and the memory. The simulator may generate a simulated traffic scenario including two or more lanes, an ego-vehicle, a dead end position, and one or more traffic participants. The dead end position may be a position by which a lane change for the ego-vehicle may be desired. The simulated traffic scenario may be associated with an occupancy map, a relative velocity map, a relative displacement map, and a relative heading map at each time step within the simulated traffic scenario. The simulator may model the ego-vehicle and one or more of the traffic participants using a kinematic bicycle model. The simulator may build a policy based on the simulated traffic scenario using an actor-critic network. The policy may be implemented on an autonomous vehicle.

Type: Grant

Filed: April 6, 2020

Date of Patent: October 11, 2022

Assignee: HONDA MOTOR CO., LTD.

Inventors: Dhruv Mauria Saxena, Sangjae Bae, Alireza Nakhaei Sarvedani, Kikuo Fujimura
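The abstract above mentions modelling vehicles with a kinematic bicycle model. A minimal sketch of one common textbook form of that model (rear-axle reference point, forward Euler integration) is shown below. The wheelbase, time step, and the names `VehicleState` and `bicycle_step` are assumptions for illustration and are not taken from the patent.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleState:
    x: float        # position along the road (m)
    y: float        # lateral position (m)
    heading: float  # yaw angle (rad)
    speed: float    # longitudinal speed (m/s)


def bicycle_step(state: VehicleState, accel: float, steer: float,
                 wheelbase: float = 2.7, dt: float = 0.1) -> VehicleState:
    """Advance the state by one time step using forward Euler integration."""
    x = state.x + state.speed * math.cos(state.heading) * dt
    y = state.y + state.speed * math.sin(state.heading) * dt
    # Yaw rate of a kinematic bicycle: v / L * tan(steering angle).
    heading = state.heading + state.speed / wheelbase * math.tan(steer) * dt
    # Clamp at zero so braking never produces reverse motion in this sketch.
    speed = max(0.0, state.speed + accel * dt)
    return VehicleState(x, y, heading, speed)
```

With zero steering and zero acceleration the vehicle simply moves straight ahead at constant speed; a positive steering angle increases the heading at each step.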
PROBABILISTIC-BASED LANE-CHANGE DECISION MAKING AND MOTION PLANNING SYSTEM AND METHOD THEREOF

Publication number: 20220048513

Abstract: A system and method for providing probabilistic-based lane-change decision making and motion planning that include receiving data associated with a roadway environment of an ego vehicle. The system and method also include performing gap analysis to determine at least one gap between neighboring vehicles that are traveling within the target lane to filter out an optimal merging entrance for the ego vehicle to merge into the target lane and determining a probability value associated with an intention of a driver of a following neighboring vehicle to yield to allow the ego vehicle to merge into the target lane. The system and method further include controlling the ego vehicle to autonomously continue traveling within the current lane or autonomously merge from current lane to the target lane based on at least one of: if the optimal merging entrance is filtered out and if the probability value indicates an intention of the driver to yield.

Type: Application

Filed: September 11, 2020

Publication date: February 17, 2022

Inventors: Peng Xu, Alireza Nakhaei Sarvedani, Kikuo Fujimura
Reinforcement learning with scene decomposition for navigating complex environments

Patent number: 11242050

Abstract: Systems and methods for providing navigation to a vehicle may include receiving observation data from one or more sensors of the vehicle, generating projection data corresponding to the one or more traffic participants based on the observation data for each time step within a time period, and predicting interactions between the vehicle, the one or more traffic participants, and the one or more obstacles, based on the projection data of the one or more traffic participants. The systems and methods may further include determining a set of actions by the vehicle corresponding to a probability of the vehicle safely arriving at a target location based on the predicted interactions, and selecting one or more actions from the set of actions and providing the one or more actions to a navigation system of the vehicle, wherein the navigation system uses the navigation data to provide navigation instructions to the vehicle.

Type: Grant

Filed: May 29, 2019

Date of Patent: February 8, 2022

Assignee: HONDA MOTOR CO., LTD.

Inventors: Maxime Bouton, Alireza Nakhaei Sarvedani, Kikuo Fujimura, Mykel John Kochenderfer
REINFORCEMENT LEARNING WITH ITERATIVE REASONING FOR MERGING IN DENSE TRAFFIC

Publication number: 20210271988

Abstract: According to one aspect, a system for reinforcement learning with iterative reasoning may include a memory for storing computer readable code and a processor operatively coupled to the memory, the processor configured to receive a level-0 policy and a desired reasoning level n. The processor may repeat for k=1 . . . n times, the following: populate a training environment with a level-(k-1) first agent, populate the training environment with a level-(k-1) second agent, and train a level-k agent based on the level-(k-1) first agent and the level-(k-1) second agent to derive a level-k policy.

Type: Application

Filed: July 28, 2020

Publication date: September 2, 2021

Inventors: Maxime Bouton, David Francis Isele, Alireza Nakhaei Sarvedani, Mykel Kochenderfer, Kikuo Fujimura
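The iterative loop in the abstract above (start from a level-0 policy, then train each level-k agent against agents one level below) can be sketched as follows. The `Policy` type, the function name `train_iteratively`, and the caller-supplied `train` callback are hypothetical placeholders; the actual training procedure is not specified here.

```python
from typing import Callable, List

# Hypothetical policy type: maps an observation to a discrete action.
Policy = Callable[[float], int]


def train_iteratively(level0: Policy, n: int,
                      train: Callable[[Policy, Policy], Policy]) -> List[Policy]:
    """Return the policies for levels 0..n.

    `train` stands in for whatever reinforcement learning routine produces a
    new policy in an environment populated by the two given agents.
    """
    policies = [level0]
    for k in range(1, n + 1):
        first_agent = policies[k - 1]   # level-(k-1) first agent
        second_agent = policies[k - 1]  # level-(k-1) second agent
        policies.append(train(first_agent, second_agent))
    return policies
```

Keeping every intermediate policy in the returned list makes it possible to inspect or reuse lower reasoning levels after training.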
Interaction-aware decision making

Patent number: 11093829

Abstract: Interaction-aware decision making may include training a first agent based on a first policy gradient, training a first critic based on a first loss function to learn goals in a single-agent environment using a Markov decision process, training a number N of agents based on the first policy gradient, training a second policy gradient and a second critic based on the first loss function and a second loss function to learn goals in a multi-agent environment using a Markov game to instantiate a second agent neural network, and generating an interaction-aware decision making network policy based on the first agent neural network and the second agent neural network. The N number of agents may be associated with a driver type indicative of a level of cooperation. When a collision occurs, a negative reward or penalty may be assigned to each agent involved based on a lane priority level of respective agents.

Type: Grant

Filed: April 29, 2019

Date of Patent: August 17, 2021

Assignee: Honda Motor Co., Ltd.

Inventors: Yeping Hu, Alireza Nakhaei Sarvedani, Masayoshi Tomizuka, Kikuo Fujimura
Reinforcement learning on autonomous vehicles

Patent number: 10990096

Abstract: The present disclosure generally relates to methods and systems for controlling an autonomous vehicle. The vehicle may collect scenario information from one or more sensors mounted on a vehicle. The vehicle may determine a high-level option for a fixed time horizon based on the scenario information. The vehicle may apply a prediction algorithm to the high-level option to mask undesired low-level behaviors for completing the high-level option where a collision is predicted to occur. The vehicle may evaluate a restricted subspace of low-level behaviors using a reinforcement learning system. The vehicle may control the vehicle to perform the high-level option by executing a low-level behavior selected from the restricted subspace. The vehicle may adjust the reinforcement learning system by evaluating a metric of the executed low-level behavior.

Type: Grant

Filed: April 27, 2018

Date of Patent: April 27, 2021

Assignee: HONDA MOTOR CO., LTD.

Inventors: David Isele, Alireza Nakhaei Sarvedani, Kikuo Fujimura
MODEL-FREE REINFORCEMENT LEARNING

Publication number: 20210086798

Abstract: A system for generating a model-free reinforcement learning policy may include a processor, a memory, and a simulator. The simulator may be implemented via the processor and the memory. The simulator may generate a simulated traffic scenario including two or more lanes, an ego-vehicle, a dead end position, and one or more traffic participants. The dead end position may be a position by which a lane change for the ego-vehicle may be desired. The simulated traffic scenario may be associated with an occupancy map, a relative velocity map, a relative displacement map, and a relative heading map at each time step within the simulated traffic scenario. The simulator may model the ego-vehicle and one or more of the traffic participants using a kinematic bicycle model. The simulator may build a policy based on the simulated traffic scenario using an actor-critic network. The policy may be implemented on an autonomous vehicle.

Type: Application

Filed: April 6, 2020

Publication date: March 25, 2021

Inventors: Dhruv Mauria Saxena, Sangjae Bae, Alireza Nakhaei Sarvedani, Kikuo Fujimura
SYSTEM AND METHOD FOR PROVIDING COOPERATION-AWARE LANE CHANGE CONTROL IN DENSE TRAFFIC

Publication number: 20210078603

Abstract: A system and method for providing cooperation-aware lane change control in dense traffic that include receiving vehicle dynamic data associated with an ego vehicle and receiving environment data associated with a surrounding environment of the ego vehicle. The system and method also include utilizing a controller that includes an analyzer to analyze the vehicle dynamic data and a recurrent neural network to analyze the environment data. The system and method further include executing a heuristic algorithm that sequentially evaluates the future states of the ego vehicle and the predicted interactive motions of the surrounding vehicles to promote the cooperation-aware lane change control in the dense traffic.

Type: Application

Filed: April 9, 2020

Publication date: March 18, 2021

Inventors: Alireza Nakhaei Sarvedani, Kikuo Fujimura, Chiho Choi, Sangjae Bae, Dhruv Mauria Saxena
Utility decomposition with deep corrections

Patent number: 10795360

Abstract: One or more aspects of utility decomposition with deep corrections are described herein. An entity may be detected within an environment through which an autonomous vehicle is travelling. The entity may be associated with a current velocity and a current position. The autonomous vehicle may be associated with a current position and a current velocity. Additionally, the autonomous vehicle may have a target position or desired destination. A Partially Observable Markov Decision Process (POMDP) model may be built based on the current velocities and current positions of different entities and the autonomous vehicle. Utility decomposition may be performed to break tasks or problems down into sub-tasks or sub-problems. A correction term may be generated using multi-fidelity modeling. A driving parameter may be implemented for a component of the autonomous vehicle based on the POMDP model and the correction term to operate the autonomous vehicle autonomously.

Type: Grant

Filed: April 6, 2018

Date of Patent: October 6, 2020

Assignee: Honda Motor Co., Ltd.

Inventor: Alireza Nakhaei Sarvedani
Autonomous vehicle policy generation

Patent number: 10739776

Abstract: According to one aspect, an autonomous vehicle policy generation system may include a state input generator generating a set of attributes associated with an autonomous vehicle undergoing training, a traffic simulator simulating a simulation environment including the autonomous vehicle, a roadway associated with a number of lanes, and another vehicle within the simulation environment, a Q-masker determining a mask to be applied to a subset of a set of possible actions for the autonomous vehicle for a time interval, and an action generator exploring a remaining set of actions from the set of possible actions and determining an autonomous vehicle policy for the time interval based on the remaining set of actions and the set of attributes associated with the autonomous vehicle.

Type: Grant

Filed: August 14, 2018

Date of Patent: August 11, 2020

Assignee: Honda Motor Co., Ltd.

Inventors: Mustafa Mukadam, Alireza Nakhaei Sarvedani, Akansel Cosgun, Kikuo Fujimura
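The Q-masker in the abstract above removes a subset of actions before the remaining ones are explored. A minimal sketch of that idea, applied to greedy action selection, is given below. The action names, the lane-boundary rule in `lane_mask`, and the convention that lane 0 is the leftmost lane are assumptions made for illustration.

```python
from typing import Dict, Set


def lane_mask(lane_index: int, num_lanes: int) -> Set[str]:
    """Mask lane changes that would leave the roadway (lane 0 is leftmost)."""
    masked = set()
    if lane_index == 0:
        masked.add("left")
    if lane_index == num_lanes - 1:
        masked.add("right")
    return masked


def masked_greedy_action(q_values: Dict[str, float], masked: Set[str]) -> str:
    """Pick the highest-valued action among those that are not masked."""
    allowed = {a: q for a, q in q_values.items() if a not in masked}
    if not allowed:
        raise ValueError("all actions are masked")
    return max(allowed, key=allowed.get)
```

In the leftmost lane, a "left" action is never selected even if its estimated value is the highest, so the learner does not need to spend experience discovering that the action is invalid.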
Keyframe based autonomous vehicle operation

Patent number: 10739774

Abstract: According to one aspect, keyframe based autonomous vehicle operation may include collecting vehicle state information and collecting environment state information. A size of an object within the environment, a distance between the object and the autonomous vehicle, and a lane structure of the environment through which the autonomous vehicle is travelling may be determined. A matching keyframe model may be selected based on the size of the object, the distance from the object to the autonomous vehicle, the lane structure of the environment, and the vehicle state information. Suggested limits for a driving parameter associated with autonomous vehicle operation may be generated based on the selected keyframe model. The autonomous vehicle may be commanded to operate autonomously according to the suggested limits for the driving parameter.

Type: Grant

Filed: April 6, 2018

Date of Patent: August 11, 2020

Assignee: Honda Motor Co., Ltd.

Inventors: Priyam Parashar, Kikuo Fujimura, Alireza Nakhaei Sarvedani, Akansel Cosgun
REINFORCEMENT LEARNING WITH SCENE DECOMPOSITION FOR NAVIGATING COMPLEX ENVIRONMENTS

Publication number: 20200247402

Abstract: Systems and methods for providing navigation to a vehicle may include receiving observation data from one or more sensors of the vehicle, generating projection data corresponding to the one or more traffic participants based on the observation data for each time step within a time period, and predicting interactions between the vehicle, the one or more traffic participants, and the one or more obstacles, based on the projection data of the one or more traffic participants. The systems and methods may further include determining a set of actions by the vehicle corresponding to a probability of the vehicle safely arriving at a target location based on the predicted interactions, and selecting one or more actions from the set of actions and providing the one or more actions to a navigation system of the vehicle, wherein the navigation system uses the navigation data to provide navigation instructions to the vehicle.

Type: Application

Filed: May 29, 2019

Publication date: August 6, 2020

Inventors: Maxime Bouton, Alireza Nakhaei Sarvedani, Kikuo Fujimura, Mykel John Kochenderfer
COOPERATIVE MULTI-GOAL, MULTI-AGENT, MULTI-STAGE REINFORCEMENT LEARNING

Publication number: 20200160168

Abstract: According to one aspect, cooperative multi-goal, multi-agent, multi-stage (CM3) reinforcement learning may include training a first agent using a first policy gradient and a first critic using a first loss function to learn goals in a single-agent environment using a Markov decision process, training a number of agents based on the first policy gradient and a second policy gradient and a second critic based on the first loss function and a second loss function to learn cooperation between the agents in a multi-agent environment using a Markov game to instantiate a second agent neural network, each of the agents instantiated with the first agent neural network in a pre-trained fashion, and generating a CM3 network policy based on the first agent neural network and the second agent neural network. The CM3 network policy may be implemented in a CM3 based autonomous vehicle to facilitate autonomous driving.

Type: Application

Filed: November 16, 2018

Publication date: May 21, 2020

Inventors: Jiachen Yang, Alireza Nakhaei Sarvedani, David Francis Isele, Kikuo Fujimura
SYSTEM AND METHOD FOR MULTI-AGENT REINFORCEMENT LEARNING WITH PERIODIC PARAMETER SHARING

Publication number: 20200151564

Abstract: A system and method for multi-agent reinforcement learning with periodic parameter sharing that include inputting at least one occupancy grid to a convolutional neural network (CNN) and at least one vehicle dynamic parameter into a first fully connected layer and concatenating outputs of the CNN and the first fully connected layer. The system and method also include providing Q value estimates for agent actions based on processing of the concatenated outputs and choosing at least one autonomous action to be executed by at least one of: an ego agent and a target agent. The system and method further include processing a multi-agent policy that accounts for operation of the ego agent and the target agent with respect to one another within a multi-agent environment based on the at least one autonomous action to be executed by at least one of: the ego agent and the target agent.

Type: Application

Filed: November 11, 2019

Publication date: May 14, 2020

Inventors: Alireza Nakhaei Sarvedani, Kikuo Fujimura, Safa Cicek
REINFORCEMENT LEARNING ON AUTONOMOUS VEHICLES

Publication number: 20190332110

Abstract: The present disclosure generally relates to methods and systems for controlling an autonomous vehicle. The vehicle may collect scenario information from one or more sensors mounted on a vehicle. The vehicle may determine a high-level option for a fixed time horizon based on the scenario information. The vehicle may apply a prediction algorithm to the high-level option to mask undesired low-level behaviors for completing the high-level option where a collision is predicted to occur. The vehicle may evaluate a restricted subspace of low-level behaviors using a reinforcement learning system. The vehicle may control the vehicle to perform the high-level option by executing a low-level behavior selected from the restricted subspace. The vehicle may adjust the reinforcement learning system by evaluating a metric of the executed low-level behavior.

Type: Application

Filed: April 27, 2018

Publication date: October 31, 2019

Inventors: David Isele, Alireza Nakhaei Sarvedani, Kikuo Fujimura
UTILITY DECOMPOSITION WITH DEEP CORRECTIONS

Publication number: 20190310632

Abstract: One or more aspects of utility decomposition with deep corrections are described herein. An entity may be detected within an environment through which an autonomous vehicle is travelling. The entity may be associated with a current velocity and a current position. The autonomous vehicle may be associated with a current position and a current velocity. Additionally, the autonomous vehicle may have a target position or desired destination. A Partially Observable Markov Decision Process (POMDP) model may be built based on the current velocities and current positions of different entities and the autonomous vehicle. Utility decomposition may be performed to break tasks or problems down into sub-tasks or sub-problems. A correction term may be generated using multi-fidelity modeling. A driving parameter may be implemented for a component of the autonomous vehicle based on the POMDP model and the correction term to operate the autonomous vehicle autonomously.

Type: Application

Filed: April 6, 2018

Publication date: October 10, 2019

Inventor: Alireza Nakhaei Sarvedani
