Patents by Inventor Bilal KARTAL

Bilal KARTAL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11720792
    Abstract: Disclosed are systems, methods, and devices for generating a visualization of a deep reinforcement learning (DRL) process. State data is received, reflective of states of an environment explored by a DRL agent, each state corresponding to a time step. For each given state, saliency metrics are calculated by processing the state data, each metric measuring the saliency of a feature at the time step corresponding to the given state. A graphical visualization is generated having at least two dimensions, in which each feature of the environment is graphically represented along a first axis, each time step is represented along a second axis, and a plurality of graphical markers represent corresponding saliency metrics, each marker having a size commensurate with the magnitude of the saliency metric it represents and a location along the first and second axes corresponding to that metric's feature and time step. (See the code sketch after this entry.)
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: August 8, 2023
    Assignee: ROYAL BANK OF CANADA
    Inventors: Matthew Edmund Taylor, Bilal Kartal, Pablo Francisco Hernandez Leal, Nathan Douglas, Dianna Yim, Frank Maurer
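As a rough illustration of the visualization described in the entry above, the sketch below plots stand-in saliency metrics as markers whose area scales with their magnitude, with environment features along one axis and time steps along the other. The random saliency values, the feature names, and the choice of matplotlib are illustrative assumptions, not details taken from the patent.

```python
# A rough sketch of a feature-by-time-step saliency plot, assuming numpy
# and matplotlib. The random `saliency` array and the feature names are
# stand-ins; real metrics would come from processing recorded state data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_steps, n_features = 50, 6
feature_names = [f"feature_{i}" for i in range(n_features)]  # hypothetical

# One stand-in saliency metric per (time step, feature) pair.
saliency = np.abs(rng.normal(size=(n_steps, n_features)))

# Marker coordinates: x = time step, y = feature index.
xs, ys = np.meshgrid(np.arange(n_steps), np.arange(n_features), indexing="ij")

fig, ax = plt.subplots(figsize=(10, 3))
# Marker area scales with saliency magnitude, as the abstract describes.
ax.scatter(xs.ravel(), ys.ravel(),
           s=200 * saliency.ravel() / saliency.max(), alpha=0.6)
ax.set_xlabel("time step")
ax.set_yticks(range(n_features))
ax.set_yticklabels(feature_names)
ax.set_ylabel("environment feature")
plt.tight_layout()
plt.show()
```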
  • Patent number: 11574148
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to train a neural network is described in various embodiments. A plurality of hardware processors or threads operate in coordination such that each functions as a worker agent configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination, and update global network parameters based at least on the local gradient computation, training the neural network through modifications of weighted interconnections between interconnected computing units as gradient computation is conducted across a plurality of iterations of the target computing environment. The loss determination includes at least a policy loss term (actor), a value loss term (critic), and an auxiliary control loss. Variations are further described in which the neural network is adapted to include terminal state prediction and action guidance. (See the code sketch after this entry.)
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: February 7, 2023
    Assignee: ROYAL BANK OF CANADA
    Inventors: Bilal Kartal, Pablo Francisco Hernandez Leal, Matthew Edmund Taylor
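The loss structure named in the entry above (an actor term, a critic term, and an auxiliary control loss such as terminal state prediction) can be sketched minimally as follows in PyTorch. The coefficients, the entropy bonus, and the binary terminal-prediction target are standard actor-critic conventions assumed for illustration, not the patented formulation.

```python
# A minimal PyTorch sketch of a combined actor-critic loss with an
# auxiliary control term, here binary terminal-state prediction. The
# coefficients, entropy bonus, and the head producing `terminal_logits`
# are illustrative assumptions.
import torch.nn.functional as F

def actor_critic_aux_loss(logits, values, terminal_logits,
                          actions, returns, terminal_labels,
                          value_coef=0.5, aux_coef=0.5, entropy_coef=0.01):
    log_probs = F.log_softmax(logits, dim=-1)
    advantages = returns - values.detach()

    # Policy (actor) loss: advantage-weighted negative log-likelihood.
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    policy_loss = -(chosen * advantages).mean()

    # Value (critic) loss: squared error against observed returns.
    value_loss = F.mse_loss(values, returns)

    # Auxiliary control loss: predict whether a state is terminal.
    aux_loss = F.binary_cross_entropy_with_logits(terminal_logits,
                                                  terminal_labels)

    # Entropy bonus, standard in asynchronous actor-critic training.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()

    return (policy_loss + value_coef * value_loss
            + aux_coef * aux_loss - entropy_coef * entropy)
```

Minimizing the auxiliary term alongside the actor and critic terms gives the shared representation an extra training signal without changing how actions are selected.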
  • Patent number: 11295174
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to include agent modeling for training a neural network is described. A plurality of hardware processors or threads operate in coordination such that each functions as a worker process configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination mechanism, and update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Variations are further described in which the neural network is adapted to include a latent space that tracks agent policy features. (See the code sketch after this entry.)
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: April 5, 2022
    Assignee: ROYAL BANK OF CANADA
    Inventors: Pablo Francisco Hernandez Leal, Bilal Kartal, Matthew Edmund Taylor
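A hedged sketch of the agent-modeling idea in the entry above: a shared latent space feeds an actor head, a critic head, and a supervised head trained with cross-entropy to predict another agent's observed actions. The layer shapes, head layout, and loss weights are assumptions for illustration.

```python
# A hedged sketch of an actor-critic network with an agent-modeling head.
# Shapes, head layout, and the 0.1 weight are assumptions, not details
# from the patent.
import torch.nn as nn
import torch.nn.functional as F

class AgentModelingNet(nn.Module):
    def __init__(self, obs_dim, n_actions, latent_dim=64):
        super().__init__()
        # Shared latent space that can also track agent policy features.
        self.encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU())
        self.policy_head = nn.Linear(latent_dim, n_actions)    # actor
        self.value_head = nn.Linear(latent_dim, 1)             # critic
        self.opponent_head = nn.Linear(latent_dim, n_actions)  # agent model

    def forward(self, obs):
        z = self.encoder(obs)
        return (self.policy_head(z),
                self.value_head(z).squeeze(-1),
                self.opponent_head(z))

def combined_loss(policy_loss, value_loss, opponent_logits,
                  observed_opponent_actions, am_coef=0.1):
    # Supervised cross-entropy term over the other agent's observed actions.
    am_loss = F.cross_entropy(opponent_logits, observed_opponent_actions)
    return policy_loss + 0.5 * value_loss + am_coef * am_loss
```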
  • Publication number: 20210312282
    Abstract: Systems and methods are provided for facilitating explainability of decision-making by reinforcement learning agents. A reinforcement learning agent is instantiated which generates, via a function approximation representation, learned outputs governing its decision-making. Data records of a plurality of past inputs for the agent are stored, each past input including values of a plurality of state variables; data records of a plurality of past learned outputs of the agent are also stored. A group definition data structure defining groups of the state variables is received. For a given past input and a given group, data reflective of a perturbed input is generated by altering a value of at least one state variable and presented to the reinforcement learning agent to obtain a perturbed learned output; a distance metric is then generated, reflective of the magnitude of difference between the perturbed learned output and the past learned output. (See the code sketch after this entry.)
    Type: Application
    Filed: April 1, 2021
    Publication date: October 7, 2021
    Inventors: Pablo Francisco HERNANDEZ LEAL, Ruitong HUANG, Bilal KARTAL, Changjian LI, Matthew Edmund TAYLOR, Alexander BRANDIMARTE, Pui Shing LAM
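The perturb-and-compare procedure in the entry above can be sketched as below: alter a state variable within a given group, re-query the agent, and report a distance between the perturbed and original outputs. The agent.predict interface, the Gaussian perturbation, and the Euclidean distance are assumptions; the publication does not prescribe them.

```python
# A minimal sketch of the perturb-and-compare procedure, assuming a
# hypothetical `agent.predict(input) -> output vector` interface. The
# Gaussian perturbation and Euclidean distance are illustrative choices.
import numpy as np

def perturbation_distance(agent, past_input, past_output,
                          group_indices, noise_scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    perturbed = np.array(past_input, dtype=float)

    # Alter the value of at least one state variable in the given group.
    idx = rng.choice(group_indices)
    perturbed[idx] += rng.normal(scale=noise_scale)

    # Present the perturbed input to the agent for a perturbed output.
    perturbed_output = agent.predict(perturbed)

    # Distance metric: magnitude of the difference between outputs.
    return float(np.linalg.norm(np.asarray(perturbed_output)
                                - np.asarray(past_output)))
```

A small distance suggests the grouped variables had little influence on the decision; a large one flags them as influential.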
  • Publication number: 20210073912
    Abstract: Disclosed are systems, methods, and devices for training a learning agent. A learning agent that maintains a reinforcement learning neural network is instantiated. State data reflective of a state of an environment explored by the learning agent is received. An uncertainty metric is calculated upon processing the state data, measuring the epistemic uncertainty of the learning agent. Upon determining that the uncertainty metric exceeds a pre-defined threshold, a request signal requesting an action suggestion from a demonstrator is sent; a suggestion signal reflective of the action suggestion is received; and an action signal to implement the action suggestion is sent. (See the code sketch after this entry.)
    Type: Application
    Filed: September 3, 2020
    Publication date: March 11, 2021
    Inventors: Felipe Leno DA SILVA, Pablo Francisco HERNANDEZ LEAL, Bilal KARTAL, Matthew Edmund TAYLOR
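A minimal sketch of the control flow in the entry above: the agent acts autonomously unless an epistemic-uncertainty estimate exceeds a threshold, in which case it requests and implements a demonstrator's suggestion. Using ensemble disagreement as the uncertainty metric and the demonstrator.suggest_action interface are illustrative assumptions.

```python
# A hedged sketch of the act-or-ask control flow. Disagreement across an
# ensemble's action-value estimates as the epistemic-uncertainty metric,
# and the `demonstrator.suggest_action(state)` interface, are assumptions.
import numpy as np

def act_or_ask(ensemble_q_values, demonstrator, state, threshold=0.5):
    q = np.asarray(ensemble_q_values)  # shape: (n_members, n_actions)

    # Epistemic-uncertainty proxy: mean variance across ensemble members.
    uncertainty = q.var(axis=0).mean()

    if uncertainty > threshold:
        # Request and implement the demonstrator's action suggestion.
        return demonstrator.suggest_action(state)

    # Otherwise act greedily on the ensemble's mean value estimate.
    return int(q.mean(axis=0).argmax())
```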
  • Publication number: 20210034974
    Abstract: Disclosed are systems, methods, and devices for generating a visualization of a deep reinforcement learning (DRL) process. State data is received, reflective of states of an environment explored by a DRL agent, each state corresponding to a time step. For each given state, saliency metrics are calculated by processing the state data, each metric measuring the saliency of a feature at the time step corresponding to the given state. A graphical visualization is generated having at least two dimensions, in which each feature of the environment is graphically represented along a first axis, each time step is represented along a second axis, and a plurality of graphical markers represent corresponding saliency metrics, each marker having a size commensurate with the magnitude of the saliency metric it represents and a location along the first and second axes corresponding to that metric's feature and time step.
    Type: Application
    Filed: July 31, 2020
    Publication date: February 4, 2021
    Inventors: Matthew Edmund TAYLOR, Bilal KARTAL, Pablo Francisco HERNANDEZ LEAL, Nathan DOUGLAS, Dianna YIM, Frank MAURER
  • Publication number: 20200143208
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to include agent modeling for training a neural network is described. A plurality of hardware processors or threads operate in coordination such that each functions as a worker process configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination mechanism, and update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Variations are further described in which the neural network is adapted to include a latent space that tracks agent policy features.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 7, 2020
    Inventors: Pablo Francisco HERNANDEZ LEAL, Bilal KARTAL, Matthew Edmund TAYLOR
  • Publication number: 20200143206
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to train a neural network is described in various embodiments. A plurality of hardware processors or threads operate in coordination such that each functions as a worker agent configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination, and update global network parameters based at least on the local gradient computation, training the neural network through modifications of weighted interconnections between interconnected computing units as gradient computation is conducted across a plurality of iterations of the target computing environment. The loss determination includes at least a policy loss term (actor), a value loss term (critic), and an auxiliary control loss. Variations are further described in which the neural network is adapted to include terminal state prediction and action guidance.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 7, 2020
    Inventors: Bilal KARTAL, Pablo Francisco HERNANDEZ LEAL, Matthew Edmund TAYLOR