Patents by Inventor Bilal KARTAL

Bilal KARTAL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11720792
    Abstract: Disclosed are systems, methods, and devices for generating a visualization of a deep reinforcement learning (DRL) process. State data is received, reflective of states of an environment explored by a DRL agent, each state corresponding to a time step. For each given state, saliency metrics are calculated by processing the state data, each metric measuring the saliency of a feature at the time step corresponding to the given state. A graphical visualization is generated having at least two dimensions, in which each feature of the environment is graphically represented along a first axis, each time step is represented along a second axis, and a plurality of graphical markers represent corresponding saliency metrics, each marker having a size commensurate with the magnitude of the saliency metric it represents and a location along the first and second axes corresponding to that metric's feature and time step. (See the code sketch after this entry.)
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: August 8, 2023
    Assignee: ROYAL BANK OF CANADA
    Inventors: Matthew Edmund Taylor, Bilal Kartal, Pablo Francisco Hernandez Leal, Nathan Douglas, Dianna Yim, Frank Maurer
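As a rough illustration of the visualization described in the entry above, the sketch below plots stand-in saliency metrics as markers whose area scales with their magnitude, with environment features along one axis and time steps along the other. The random saliency values, the feature names, and the choice of matplotlib are illustrative assumptions, not details taken from the patent.

```python
# A rough sketch of a feature-by-time-step saliency plot, assuming numpy
# and matplotlib. The random `saliency` array and the feature names are
# stand-ins; real metrics would come from processing recorded state data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_steps, n_features = 50, 6
feature_names = [f"feature_{i}" for i in range(n_features)]  # hypothetical

# One stand-in saliency metric per (time step, feature) pair.
saliency = np.abs(rng.normal(size=(n_steps, n_features)))

# Marker coordinates: x = time step, y = feature index.
xs, ys = np.meshgrid(np.arange(n_steps), np.arange(n_features), indexing="ij")

fig, ax = plt.subplots(figsize=(10, 3))
# Marker area scales with saliency magnitude, as the abstract describes.
ax.scatter(xs.ravel(), ys.ravel(),
           s=200 * saliency.ravel() / saliency.max(), alpha=0.6)
ax.set_xlabel("time step")
ax.set_yticks(range(n_features))
ax.set_yticklabels(feature_names)
ax.set_ylabel("environment feature")
plt.tight_layout()
plt.show()
```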
  • Patent number: 11574148
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to train a neural network is described in various embodiments. A plurality of hardware processors or threads operate in coordination such that each functions as a worker agent configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination, and update global network parameters based at least on the local gradient computation, training the neural network through modifications of weighted interconnections between interconnected computing units as gradient computation is conducted across a plurality of iterations of the target computing environment. The loss determination includes at least a policy loss term (actor), a value loss term (critic), and an auxiliary control loss. Variations are further described in which the neural network is adapted to include terminal state prediction and action guidance. (See the code sketch after this entry.)
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: February 7, 2023
    Assignee: ROYAL BANK OF CANADA
    Inventors: Bilal Kartal, Pablo Francisco Hernandez Leal, Matthew Edmund Taylor
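The loss structure named in the entry above (an actor term, a critic term, and an auxiliary control loss such as terminal state prediction) can be sketched minimally as follows in PyTorch. The coefficients, the entropy bonus, and the binary terminal-prediction target are standard actor-critic conventions assumed for illustration, not the patented formulation.

```python
# A minimal PyTorch sketch of a combined actor-critic loss with an
# auxiliary control term, here binary terminal-state prediction. The
# coefficients, entropy bonus, and the head producing `terminal_logits`
# are illustrative assumptions.
import torch.nn.functional as F

def actor_critic_aux_loss(logits, values, terminal_logits,
                          actions, returns, terminal_labels,
                          value_coef=0.5, aux_coef=0.5, entropy_coef=0.01):
    log_probs = F.log_softmax(logits, dim=-1)
    advantages = returns - values.detach()

    # Policy (actor) loss: advantage-weighted negative log-likelihood.
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    policy_loss = -(chosen * advantages).mean()

    # Value (critic) loss: squared error against observed returns.
    value_loss = F.mse_loss(values, returns)

    # Auxiliary control loss: predict whether a state is terminal.
    aux_loss = F.binary_cross_entropy_with_logits(terminal_logits,
                                                  terminal_labels)

    # Entropy bonus, standard in asynchronous actor-critic training.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()

    return (policy_loss + value_coef * value_loss
            + aux_coef * aux_loss - entropy_coef * entropy)
```

Minimizing the auxiliary term alongside the actor and critic terms gives the shared representation an extra training signal without changing how actions are selected.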
  • Patent number: 11295174
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to include agent modeling for training a neural network is described. A plurality of hardware processors or threads operate in coordination such that each functions as a worker process configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination mechanism, and update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Variations are further described in which the neural network is adapted to include a latent space that tracks agent policy features. (See the code sketch after this entry.)
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: April 5, 2022
    Assignee: ROYAL BANK OF CANADA
    Inventors: Pablo Francisco Hernandez Leal, Bilal Kartal, Matthew Edmund Taylor
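A hedged sketch of the agent-modeling idea in the entry above: a shared latent space feeds an actor head, a critic head, and a supervised head trained with cross-entropy to predict another agent's observed actions. The layer shapes, head layout, and loss weights are assumptions for illustration.

```python
# A hedged sketch of an actor-critic network with an agent-modeling head.
# Shapes, head layout, and the 0.1 weight are assumptions, not details
# from the patent.
import torch.nn as nn
import torch.nn.functional as F

class AgentModelingNet(nn.Module):
    def __init__(self, obs_dim, n_actions, latent_dim=64):
        super().__init__()
        # Shared latent space that can also track agent policy features.
        self.encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU())
        self.policy_head = nn.Linear(latent_dim, n_actions)    # actor
        self.value_head = nn.Linear(latent_dim, 1)             # critic
        self.opponent_head = nn.Linear(latent_dim, n_actions)  # agent model

    def forward(self, obs):
        z = self.encoder(obs)
        return (self.policy_head(z),
                self.value_head(z).squeeze(-1),
                self.opponent_head(z))

def combined_loss(policy_loss, value_loss, opponent_logits,
                  observed_opponent_actions, am_coef=0.1):
    # Supervised cross-entropy term over the other agent's observed actions.
    am_loss = F.cross_entropy(opponent_logits, observed_opponent_actions)
    return policy_loss + 0.5 * value_loss + am_coef * am_loss
```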
  • Publication number: 20210312282
    Abstract: Systems and methods are provided for facilitating explainability of decision-making by reinforcement learning agents. A reinforcement learning agent is instantiated which generates, via a function approximation representation, learned outputs governing its decision-making. Data records of a plurality of past inputs for the agent are stored, each past input including values of a plurality of state variables; data records of a plurality of past learned outputs of the agent are also stored. A group definition data structure defining groups of the state variables is received. For a given past input and a given group, data reflective of a perturbed input is generated by altering a value of at least one state variable and presented to the reinforcement learning agent to obtain a perturbed learned output; a distance metric is then generated, reflective of the magnitude of difference between the perturbed learned output and the past learned output. (See the code sketch after this entry.)
    Type: Application
    Filed: April 1, 2021
    Publication date: October 7, 2021
    Inventors: Pablo Francisco HERNANDEZ LEAL, Ruitong HUANG, Bilal KARTAL, Changjian LI, Matthew Edmund TAYLOR, Alexander BRANDIMARTE, Pui Shing LAM
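The perturb-and-compare procedure in the entry above can be sketched as below: alter a state variable within a given group, re-query the agent, and report a distance between the perturbed and original outputs. The agent.predict interface, the Gaussian perturbation, and the Euclidean distance are assumptions; the publication does not prescribe them.

```python
# A minimal sketch of the perturb-and-compare procedure, assuming a
# hypothetical `agent.predict(input) -> output vector` interface. The
# Gaussian perturbation and Euclidean distance are illustrative choices.
import numpy as np

def perturbation_distance(agent, past_input, past_output,
                          group_indices, noise_scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    perturbed = np.array(past_input, dtype=float)

    # Alter the value of at least one state variable in the given group.
    idx = rng.choice(group_indices)
    perturbed[idx] += rng.normal(scale=noise_scale)

    # Present the perturbed input to the agent for a perturbed output.
    perturbed_output = agent.predict(perturbed)

    # Distance metric: magnitude of the difference between outputs.
    return float(np.linalg.norm(np.asarray(perturbed_output)
                                - np.asarray(past_output)))
```

A small distance suggests the grouped variables had little influence on the decision; a large one flags them as influential.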
  • Publication number: 20210073912
    Abstract: Disclosed are systems, methods, and devices for training a learning agent. A learning agent that maintains a reinforcement learning neural network is instantiated. State data reflective of a state of an environment explored by the learning agent is received. An uncertainty metric is calculated upon processing the state data, measuring the epistemic uncertainty of the learning agent. Upon determining that the uncertainty metric exceeds a pre-defined threshold, a request signal requesting an action suggestion from a demonstrator is sent; a suggestion signal reflective of the action suggestion is received; and an action signal to implement the action suggestion is sent. (See the code sketch after this entry.)
    Type: Application
    Filed: September 3, 2020
    Publication date: March 11, 2021
    Inventors: Felipe Leno DA SILVA, Pablo Francisco HERNANDEZ LEAL, Bilal KARTAL, Matthew Edmund TAYLOR
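A minimal sketch of the control flow in the entry above: the agent acts autonomously unless an epistemic-uncertainty estimate exceeds a threshold, in which case it requests and implements a demonstrator's suggestion. Using ensemble disagreement as the uncertainty metric and the demonstrator.suggest_action interface are illustrative assumptions.

```python
# A hedged sketch of the act-or-ask control flow. Disagreement across an
# ensemble's action-value estimates as the epistemic-uncertainty metric,
# and the `demonstrator.suggest_action(state)` interface, are assumptions.
import numpy as np

def act_or_ask(ensemble_q_values, demonstrator, state, threshold=0.5):
    q = np.asarray(ensemble_q_values)  # shape: (n_members, n_actions)

    # Epistemic-uncertainty proxy: mean variance across ensemble members.
    uncertainty = q.var(axis=0).mean()

    if uncertainty > threshold:
        # Request and implement the demonstrator's action suggestion.
        return demonstrator.suggest_action(state)

    # Otherwise act greedily on the ensemble's mean value estimate.
    return int(q.mean(axis=0).argmax())
```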
  • Publication number: 20210034974
    Abstract: Disclosed are systems, methods, and devices for generating a visualization of a deep reinforcement learning (DRL) process. State data is received, reflective of states of an environment explored by a DRL agent, each state corresponding to a time step. For each given state, saliency metrics are calculated by processing the state data, each metric measuring the saliency of a feature at the time step corresponding to the given state. A graphical visualization is generated having at least two dimensions, in which each feature of the environment is graphically represented along a first axis, each time step is represented along a second axis, and a plurality of graphical markers represent corresponding saliency metrics, each marker having a size commensurate with the magnitude of the saliency metric it represents and a location along the first and second axes corresponding to that metric's feature and time step.
    Type: Application
    Filed: July 31, 2020
    Publication date: February 4, 2021
    Inventors: Matthew Edmund TAYLOR, Bilal KARTAL, Pablo Francisco HERNANDEZ LEAL, Nathan DOUGLAS, Dianna YIM, Frank MAURER
  • Publication number: 20200143208
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to include agent modeling for training a neural network is described. A plurality of hardware processors or threads operate in coordination such that each functions as a worker process configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination mechanism, and update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Variations are further described in which the neural network is adapted to include a latent space that tracks agent policy features.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 7, 2020
    Inventors: Pablo Francisco HERNANDEZ LEAL, Bilal KARTAL, Matthew Edmund TAYLOR
  • Publication number: 20200143206
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to train a neural network is described in various embodiments. A plurality of hardware processors or threads operate in coordination such that each functions as a worker agent configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination, and update global network parameters based at least on the local gradient computation, training the neural network through modifications of weighted interconnections between interconnected computing units as gradient computation is conducted across a plurality of iterations of the target computing environment. The loss determination includes at least a policy loss term (actor), a value loss term (critic), and an auxiliary control loss. Variations are further described in which the neural network is adapted to include terminal state prediction and action guidance.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 7, 2020
    Inventors: Bilal KARTAL, Pablo Francisco HERNANDEZ LEAL, Matthew Edmund TAYLOR