Patents by Inventor Harm Hendrik VAN SEIJEN

Harm Hendrik VAN SEIJEN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230199031
    Abstract: A secured exploration agent for reinforcement learning (RL) is provided. Securing an exploration agent involves training it to avoid dead-end states and dead-end trajectories: during training, the exploration agent “learns” to identify and avoid dead-end states of a Markov Decision Process (MDP). The secured exploration agent explores the environment safely and efficiently, significantly reducing the training time, cost, and safety concerns associated with conventional RL, and it is employed to guide the behavior of a corresponding exploitation agent. During training, the exploration agent's policy is iteratively updated to reflect the estimated probability that each state is a dead-end: the probability that the agent chooses an action leading to a given state is reduced in proportion to the estimated probability that that state is a dead-end. (A minimal Python sketch of this technique appears after this listing.)
    Type: Application
    Filed: February 17, 2023
    Publication date: June 22, 2023
    Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI
  • Patent number: 11616813
    Abstract: A secured exploration agent for reinforcement learning (RL) is provided. Securing an exploration agent involves training it to avoid dead-end states and dead-end trajectories: during training, the exploration agent “learns” to identify and avoid dead-end states of a Markov Decision Process (MDP). The secured exploration agent explores the environment safely and efficiently, significantly reducing the training time, cost, and safety concerns associated with conventional RL, and it is employed to guide the behavior of a corresponding exploitation agent. During training, the exploration agent's policy is iteratively updated to reflect the estimated probability that each state is a dead-end: the probability that the agent chooses an action leading to a given state is reduced in proportion to the estimated probability that that state is a dead-end.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: March 28, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Harm Hendrik Van Seijen, Seyed Mehdi Fatemi Booshehri
  • Patent number: 10977551
    Abstract: Aspects provided herein are relevant to machine learning techniques, including decomposing single-agent reinforcement learning problems into simpler problems addressed by multiple agents. Actions proposed by the multiple agents are then aggregated using an aggregator, which selects an action to take with respect to an environment. Aspects provided herein are also relevant to a hybrid reward model. (A minimal Python sketch of this decomposition appears after this listing.)
    Type: Grant
    Filed: June 27, 2017
    Date of Patent: April 13, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Harm Hendrik Van Seijen, Seyed Mehdi Fatemi Booshehri, Romain Michel Henri Laroche, Joshua Samuel Romoff
  • Publication number: 20200076857
    Abstract: A secured exploration agent for reinforcement learning (RL) is provided. Securing an exploration agent involves training it to avoid dead-end states and dead-end trajectories: during training, the exploration agent “learns” to identify and avoid dead-end states of a Markov Decision Process (MDP). The secured exploration agent explores the environment safely and efficiently, significantly reducing the training time, cost, and safety concerns associated with conventional RL, and it is employed to guide the behavior of a corresponding exploitation agent. During training, the exploration agent's policy is iteratively updated to reflect the estimated probability that each state is a dead-end: the probability that the agent chooses an action leading to a given state is reduced in proportion to the estimated probability that that state is a dead-end.
    Type: Application
    Filed: August 28, 2019
    Publication date: March 5, 2020
    Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI
  • Publication number: 20180165603
    Abstract: Aspects provided herein are relevant to machine learning techniques, including decomposing single-agent reinforcement learning problems into simpler problems addressed by multiple agents. Actions proposed by the multiple agents are then aggregated using an aggregator, which selects an action to take with respect to an environment. Aspects provided herein are also relevant to a hybrid reward model.
    Type: Application
    Filed: June 27, 2017
    Publication date: June 14, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI, Romain Michel Henri LAROCHE, Joshua Samuel ROMOFF
  • Publication number: 20180165602
    Abstract: Aspects provided herein are relevant to machine learning techniques, including decomposing single-agent reinforcement learning problems into simpler problems addressed by multiple agents. Actions proposed by the multiple agents are then aggregated using an aggregator, which selects an action to take with respect to an environment. Aspects provided herein are also relevant to a hybrid reward model.
    Type: Application
    Filed: June 27, 2017
    Publication date: June 14, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Harm Hendrik VAN SEIJEN, Seyed Mehdi FATEMI BOOSHEHRI, Romain Michel Henri LAROCHE, Joshua Samuel ROMOFF
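
The secured-exploration abstracts above (patent 11616813 and publications 20230199031 and 20200076857) describe a concrete mechanism: estimate, for each state-action pair, the probability of eventually reaching a dead-end, and shrink the exploration policy's probability of selecting risky actions accordingly. Below is a minimal tabular Python sketch of that idea. The toy MDP, the update rule, and every identifier (TRANSITIONS, q_dead, exploration_policy) are illustrative assumptions, not details taken from the patent claims.

    import random
    from collections import defaultdict

    # Toy MDP (an assumption for illustration): states 0-4, actions 0/1.
    # State 4 is a dead-end, state 3 is a safe terminal state.
    TRANSITIONS = {
        (0, 0): 1, (0, 1): 2,
        (1, 0): 3, (1, 1): 4,   # action 1 in state 1 enters the dead-end
        (2, 0): 3, (2, 1): 3,
    }
    DEAD_ENDS, TERMINALS = {4}, {3, 4}
    ACTIONS = [0, 1]

    # q_dead(s, a) in [-1, 0] estimates minus the probability that taking
    # action a in state s eventually reaches a dead-end (reward -1 on
    # entering a dead-end, 0 otherwise, undiscounted).
    q_dead = defaultdict(float)
    ALPHA = 0.1

    for _ in range(5000):
        s = 0
        while s not in TERMINALS:
            a = random.choice(ACTIONS)        # uniform behavior while learning
            s2 = TRANSITIONS[(s, a)]
            r = -1.0 if s2 in DEAD_ENDS else 0.0
            target = r + (0.0 if s2 in TERMINALS
                          else max(q_dead[(s2, b)] for b in ACTIONS))
            q_dead[(s, a)] += ALPHA * (target - q_dead[(s, a)])
            s = s2

    def exploration_policy(s):
        # Reduce each action's selection probability in line with its
        # estimated chance of leading to a dead-end: 1 + q_dead(s, a) is
        # roughly the probability that the action does NOT lead to one.
        weights = [max(0.0, 1.0 + q_dead[(s, a)]) for a in ACTIONS]
        total = sum(weights)
        if total == 0.0:                      # every action looks doomed:
            return [1.0 / len(ACTIONS)] * len(ACTIONS)  # fall back to uniform
        return [w / total for w in weights]

    print(exploration_policy(1))  # action 1 (into the dead-end) is suppressed

Running the script prints a distribution for state 1 close to [1.0, 0.0]: the action that enters the dead-end is almost never selected, mirroring the abstract's probability-reduction step.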
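
The remaining entries (patent 10977551 and publications 20180165603 and 20180165602) cover decomposing a single-agent RL problem into simpler per-reward-component problems, each addressed by its own agent, with an aggregator selecting the action actually taken; this is the hybrid reward model. The sketch below follows the same hedged convention: the chain environment, the two reward components, and names such as step and aggregate are assumptions made for illustration only.

    import random
    from collections import defaultdict

    # Toy chain environment (an assumption for illustration): states 0-4,
    # action 1 moves right, action 0 moves left. Two reward sources -- a
    # "fruit" at state 2 (component 0) and a "goal" at state 4 (component 1);
    # the hybrid reward is their sum, but each agent sees only its component.
    N_STATES, ACTIONS = 5, [0, 1]

    def step(s, a):
        s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
        r_components = (1.0 if s2 == 2 else 0.0,
                        2.0 if s2 == 4 else 0.0)
        return s2, r_components, s2 == N_STATES - 1

    # One tabular Q-learner ("agent") per reward component.
    N_COMPONENTS, GAMMA, ALPHA = 2, 0.9, 0.1
    q = [defaultdict(float) for _ in range(N_COMPONENTS)]

    def aggregate(s):
        # Aggregator: sum the per-agent action values and act greedily on
        # the sum, i.e. select a single action for the whole system.
        return max(ACTIONS, key=lambda a: sum(qk[(s, a)] for qk in q))

    for _ in range(3000):
        s, done = 0, False
        while not done:
            a = random.choice(ACTIONS)        # off-policy uniform behavior
            s2, rs, done = step(s, a)
            for k in range(N_COMPONENTS):     # each agent learns from its
                best = max(q[k][(s2, b)] for b in ACTIONS)  # own component
                target = rs[k] + (0.0 if done else GAMMA * best)
                q[k][(s, a)] += ALPHA * (target - q[k][(s, a)])
            s = s2

    print(aggregate(0))  # the aggregated policy moves right toward the rewards

Each per-component learner solves an easier problem than the full task, and the aggregator recombines their value estimates; printing aggregate(0) yields action 1, i.e., the combined policy heads toward both reward sources.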