Patents by Inventor Matthew Edmund TAYLOR

Matthew Edmund TAYLOR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11720792
    Abstract: Disclosed are systems, methods, and devices for generating a visualization of a deep reinforcement learning (DRL) process. State data is received, reflective of states of an environment explored by a DRL agent, each state corresponding to a time step. For each given state, saliency metrics are calculated by processing the state data, each metric measuring the saliency of a feature at the time step corresponding to the given state. A graphical visualization is generated, having at least two dimensions, in which each feature of the environment is graphically represented along a first axis, each time step is represented along a second axis, and a plurality of graphical markers represent corresponding saliency metrics, each graphical marker having a size commensurate with the magnitude of the particular saliency metric represented and a location along the first and second axes corresponding to the feature and time step for that metric.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: August 8, 2023
    Assignee: ROYAL BANK OF CANADA
    Inventors: Matthew Edmund Taylor, Bilal Kartal, Pablo Francisco Hernandez Leal, Nathan Douglas, Dianna Yim, Frank Maurer
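
As a rough illustration of the visualization this abstract describes, here is a minimal matplotlib sketch that sizes each marker by saliency magnitude and places it at its (time step, feature) coordinate. The feature names and the random saliency values are illustrative placeholders, not data or code from the patent.

```python
# A hedged sketch of the two-axis saliency visualization: features on one
# axis, time steps on the other, marker size proportional to saliency.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
features = ["position", "velocity", "angle", "angular_velocity"]  # hypothetical
n_steps = 50
saliency = np.abs(rng.normal(size=(len(features), n_steps)))  # placeholder metrics

fig, ax = plt.subplots(figsize=(10, 3))
for f_idx, name in enumerate(features):
    for t in range(n_steps):
        # Size commensurate with saliency magnitude; location encodes
        # (time step, feature), matching the layout in the abstract.
        ax.scatter(t, f_idx, s=80 * saliency[f_idx, t], color="tab:blue", alpha=0.6)
ax.set_yticks(range(len(features)))
ax.set_yticklabels(features)
ax.set_xlabel("time step")
ax.set_ylabel("feature")
plt.tight_layout()
plt.show()
```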
  • Patent number: 11574148
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning for training a neural network is described in various embodiments, through coordinated operation of a plurality of hardware processors or threads such that each functions as a worker agent configured to simultaneously interact with a target computing environment for local gradient computation based on a loss determination and to update global network parameters based at least on the local gradient computation, training the neural network through modifications of weighted interconnections between interconnected computing units as gradient computation is conducted across a plurality of iterations of the target computing environment. The loss determination includes at least a policy loss term (actor), a value loss term (critic), and an auxiliary control loss. Further variations are described in which the neural network is adapted to include terminal state prediction and action guidance.
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: February 7, 2023
    Assignee: ROYAL BANK OF CANADA
    Inventors: Bilal Kartal, Pablo Francisco Hernandez Leal, Matthew Edmund Taylor
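
The combined loss this abstract describes can be sketched compactly. The following PyTorch snippet is a hedged illustration, not the patented formulation: it combines an actor term, a critic term, and, as one plausible auxiliary control loss, a terminal-state-prediction head; all tensors, shapes, and weightings are assumptions.

```python
import torch
import torch.nn.functional as F

def a3c_loss_with_auxiliary(log_probs, values, returns,
                            terminal_logits, terminal_labels, aux_weight=0.5):
    advantages = returns - values.detach()
    policy_loss = -(log_probs * advantages).mean()      # policy loss term (actor)
    value_loss = F.mse_loss(values, returns)            # value loss term (critic)
    aux_loss = F.binary_cross_entropy_with_logits(      # auxiliary control loss:
        terminal_logits, terminal_labels)               # predict terminal states
    return policy_loss + 0.5 * value_loss + aux_weight * aux_loss

# Dummy batch showing the call shape; real values would come from a worker's rollout.
n = 8
loss = a3c_loss_with_auxiliary(
    log_probs=torch.randn(n, requires_grad=True),
    values=torch.randn(n, requires_grad=True),
    returns=torch.randn(n),
    terminal_logits=torch.randn(n, requires_grad=True),
    terminal_labels=torch.randint(0, 2, (n,)).float(),
)
loss.backward()  # gradients would feed the worker's local gradient computation
```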
  • Patent number: 11308401
    Abstract: Systems, methods, and computer readable media directed to interactive reinforcement learning with dynamic reuse of prior knowledge are described in various embodiments. The interactive reinforcement learning is adapted for providing computer implemented systems for dynamic action selection based on confidence levels associated with demonstrator data or portions thereof.
    Type: Grant
    Filed: January 31, 2019
    Date of Patent: April 19, 2022
    Assignee: ROYAL BANK OF CANADA
    Inventors: Matthew Edmund Taylor, Zhaodong Wang
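
A minimal sketch of the confidence-gated action selection idea follows. The confidence function, threshold, and toy policies are hypothetical stand-ins, not taken from the patent.

```python
# Reuse a demonstrator's suggestion only when its confidence for the current
# state is high enough; otherwise fall back to the learner's own policy.
import random

def select_action(state, learner_policy, demonstrator, confidence, threshold=0.7):
    """Pick the demonstrator's action when confidence(state) >= threshold."""
    if confidence(state) >= threshold:
        return demonstrator(state)   # dynamic reuse of prior knowledge
    return learner_policy(state)     # learner acts on its own

# Toy usage with stand-in callables.
action = select_action(
    state=(0, 1),
    learner_policy=lambda s: random.choice(["left", "right"]),
    demonstrator=lambda s: "right",
    confidence=lambda s: 0.9,  # e.g. density of nearby demonstration states
)
print(action)  # "right", since confidence exceeds the threshold
```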
  • Patent number: 11295174
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to include agent modeling for training a neural network is described. Coordinated operation of a plurality of hardware processors or threads is utilized such that each functions as a worker process configured to simultaneously interact with a target computing environment for local gradient computation based on a loss determination mechanism and to update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Further variations are described in which the neural network is adapted to include a latent space to track agent policy features.
    Type: Grant
    Filed: November 5, 2019
    Date of Patent: April 5, 2022
    Assignee: ROYAL BANK OF CANADA
    Inventors: Pablo Francisco Hernandez Leal, Bilal Kartal, Matthew Edmund Taylor
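
To make the agent-modeling variant concrete, here is a hedged PyTorch sketch of a network with a shared latent space feeding actor, critic, and an opponent-action head trained with supervised cross entropy. Layer sizes and the single-layer encoder are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActorCriticWithAgentModel(nn.Module):
    def __init__(self, obs_dim, n_actions, n_opponent_actions, latent_dim=32):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, latent_dim)  # shared latent space
        self.policy_head = nn.Linear(latent_dim, n_actions)             # actor
        self.value_head = nn.Linear(latent_dim, 1)                      # critic
        self.opponent_head = nn.Linear(latent_dim, n_opponent_actions)  # agent model

    def forward(self, obs):
        z = torch.relu(self.encoder(obs))
        return self.policy_head(z), self.value_head(z).squeeze(-1), self.opponent_head(z)

net = ActorCriticWithAgentModel(obs_dim=10, n_actions=4, n_opponent_actions=4)
obs = torch.randn(8, 10)
opponent_actions = torch.randint(0, 4, (8,))
_, _, opp_logits = net(obs)
# Supervised cross-entropy loss on observed opponent actions, added to the
# usual policy and value terms during training.
agent_model_loss = F.cross_entropy(opp_logits, opponent_actions)
```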
  • Publication number: 20210312282
    Abstract: Systems and methods are provided for facilitating explainability of decision-making by reinforcement learning agents. A reinforcement learning agent is instantiated which generates, via a function approximation representation, learned outputs governing its decision-making. Data records of a plurality of past inputs for the agent are stored, each of the past inputs including values of a plurality of state variables. Data records of a plurality of past learned outputs of the agent are also stored. A group definition data structure defining groups of the state variables is received. For a given past input and a given group, data reflective of a perturbed input is generated by altering a value of at least one state variable and is presented to the reinforcement learning agent to obtain a perturbed learned output; a distance metric is then generated reflective of the magnitude of the difference between the perturbed learned output and the past learned output.
    Type: Application
    Filed: April 1, 2021
    Publication date: October 7, 2021
    Inventors: Pablo Francisco HERNANDEZ LEAL, Ruitong HUANG, Bilal KARTAL, Changjian LI, Matthew Edmund TAYLOR, Alexander BRANDIMARTE, Pui Shing LAM
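
A minimal numpy sketch of the perturbation probe this abstract describes: perturb the variables in one group, re-query the agent, and measure how far its output moves. The toy linear agent, the Gaussian perturbation rule, and the Euclidean distance metric are assumptions for illustration.

```python
import numpy as np

def group_saliency(agent, past_input, groups, noise_scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    baseline = agent(past_input)  # the stored past learned output
    scores = {}
    for name, idx in groups.items():
        perturbed = past_input.copy()
        perturbed[idx] += rng.normal(scale=noise_scale, size=len(idx))
        # Distance metric: magnitude of the difference between the perturbed
        # learned output and the past learned output.
        scores[name] = float(np.linalg.norm(agent(perturbed) - baseline))
    return scores

# Toy agent: a fixed linear map standing in for the learned function approximator.
W = np.arange(12, dtype=float).reshape(3, 4)
agent = lambda x: W @ x
print(group_saliency(agent, np.ones(4), {"pose": [0, 1], "velocity": [2, 3]}))
```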
  • Publication number: 20210073912
    Abstract: Disclosed are systems, methods, and devices for training a learning agent. A learning agent that maintains a reinforcement learning neural network is instantiated. State data reflective of a state of an environment explored by the learning agent is received. An uncertainty metric is calculated upon processing the state data, measuring the epistemic uncertainty of the learning agent. Upon determining that the uncertainty metric exceeds a pre-defined threshold: a request signal requesting an action suggestion from a demonstrator is sent; a suggestion signal reflective of the action suggestion is received; and an action signal to implement the action suggestion is sent.
    Type: Application
    Filed: September 3, 2020
    Publication date: March 11, 2021
    Inventors: Felipe Leno DA SILVA, Pablo Francisco HERNANDEZ LEAL, Bilal KARTAL, Matthew Edmund TAYLOR
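
One way to picture the uncertainty gate is the following sketch, which uses disagreement across an ensemble of Q-functions as a common proxy for epistemic uncertainty; the ensemble, the variance proxy, and the threshold value are illustrative assumptions, not the patented metric.

```python
import numpy as np

def act_or_ask(state, ensemble, demonstrator, threshold=0.5):
    q_values = np.stack([q(state) for q in ensemble])  # (n_members, n_actions)
    uncertainty = q_values.std(axis=0).mean()          # epistemic-uncertainty proxy
    if uncertainty > threshold:
        return demonstrator(state)                     # request an action suggestion
    return int(q_values.mean(axis=0).argmax())         # act greedily on own estimate

# Toy ensemble: each member returns fixed random "Q-values" over 3 actions.
rng = np.random.default_rng(1)
ensemble = [lambda s, w=rng.normal(size=(3,)): w for _ in range(5)]
print(act_or_ask(state=None, ensemble=ensemble, demonstrator=lambda s: 0))
```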
  • Publication number: 20210034974
    Abstract: Disclosed are systems, methods, and devices for generating a visualization of a deep reinforcement learning (DRL) process. State data is received, reflective of states of an environment explored by a DRL agent, each state corresponding to a time step. For each given state, saliency metrics are calculated by processing the state data, each metric measuring the saliency of a feature at the time step corresponding to the given state. A graphical visualization is generated, having at least two dimensions, in which each feature of the environment is graphically represented along a first axis, each time step is represented along a second axis, and a plurality of graphical markers represent corresponding saliency metrics, each graphical marker having a size commensurate with the magnitude of the particular saliency metric represented and a location along the first and second axes corresponding to the feature and time step for that metric.
    Type: Application
    Filed: July 31, 2020
    Publication date: February 4, 2021
    Inventors: Matthew Edmund TAYLOR, Bilal KARTAL, Pablo Francisco HERNANDEZ LEAL, Nathan DOUGLAS, Dianna YIM, Frank MAURER
  • Publication number: 20200279136
    Abstract: A system for a machine reinforcement learning architecture for an environment with a plurality of agents includes at least one memory and at least one processor configured to provide a multi-agent reinforcement learning architecture based on a mean field Q function that includes multiple types of agents, wherein each type of agent has a corresponding mean field.
    Type: Application
    Filed: February 28, 2020
    Publication date: September 3, 2020
    Inventors: Sriram Ganapathi Subramanian, Pascal Poupart, Matthew Edmund Taylor, Nidhi Hegde
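
A small numpy sketch of the per-type mean field idea: each agent type contributes its own mean action, which a type-aware Q-function would then consume. Shapes and the averaging rule are illustrative assumptions.

```python
import numpy as np

def mean_actions_by_type(neighbor_actions_onehot, neighbor_types, n_types):
    """One mean field (average one-hot action) per agent type."""
    dims = neighbor_actions_onehot.shape[1]
    means = np.zeros((n_types, dims))
    for t in range(n_types):
        members = neighbor_actions_onehot[neighbor_types == t]
        if len(members):
            means[t] = members.mean(axis=0)
    return means

# Toy neighborhood: 5 neighbors of 2 types, 3 possible actions.
actions = np.eye(3)[[0, 2, 1, 1, 0]]
types = np.array([0, 0, 1, 1, 1])
print(mean_actions_by_type(actions, types, n_types=2))
# Each row is the mean field a per-type Q-function would condition on.
```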
  • Publication number: 20200143208
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning to include agent modeling for training a neural network is described. Coordinated operation of a plurality of hardware processors or threads is utilized such that each functions as a worker process configured to simultaneously interact with a target computing environment for local gradient computation based on a loss determination mechanism and to update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Further variations are described in which the neural network is adapted to include a latent space to track agent policy features.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 7, 2020
    Inventors: Pablo Francisco HERNANDEZ LEAL, Bilal KARTAL, Matthew Edmund TAYLOR
  • Publication number: 20200143206
    Abstract: A computer system and method for extending parallelized asynchronous reinforcement learning for training a neural network is described in various embodiments, through coordinated operation of a plurality of hardware processors or threads such that each functions as a worker agent configured to simultaneously interact with a target computing environment for local gradient computation based on a loss determination and to update global network parameters based at least on the local gradient computation, training the neural network through modifications of weighted interconnections between interconnected computing units as gradient computation is conducted across a plurality of iterations of the target computing environment. The loss determination includes at least a policy loss term (actor), a value loss term (critic), and an auxiliary control loss. Further variations are described in which the neural network is adapted to include terminal state prediction and action guidance.
    Type: Application
    Filed: November 5, 2019
    Publication date: May 7, 2020
    Inventors: Bilal KARTAL, Pablo Francisco HERNANDEZ LEAL, Matthew Edmund TAYLOR
  • Publication number: 20190236458
    Abstract: Systems, methods, and computer readable media directed to interactive reinforcement learning with dynamic reuse of prior knowledge are described in various embodiments. The interactive reinforcement learning is adapted for providing computer implemented systems for dynamic action selection based on confidence levels associated with demonstrator data or portions thereof.
    Type: Application
    Filed: January 31, 2019
    Publication date: August 1, 2019
    Inventors: Matthew Edmund TAYLOR, Zhaodong WANG
  • Publication number: 20190236455
    Abstract: Disclosed herein are a system and method for providing a machine learning architecture based on monitored demonstrations. The system may include: a non-transitory computer-readable memory storage; at least one processor configured for dynamically training a machine learning architecture for performing one or more sequential tasks, the at least one processor configured to provide: a data receiver for receiving one or more demonstrator data sets, each demonstrator data set including a data structure representing one or more state-action pairs; a neural network of the machine learning architecture, the neural network including a group of nodes in one or more layers; and a pre-training engine configured for processing the one or more demonstrator data sets to extract one or more features, the extracted features being used to pre-train the neural network based on the one or more state-action pairs observed in one or more interactions with the environment.
    Type: Application
    Filed: January 31, 2019
    Publication date: August 1, 2019
    Inventors: Matthew Edmund TAYLOR, Gabriel Victor DE LA CRUZ, JR., Yunshu DU
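
A hedged PyTorch sketch of the pre-training step this abstract describes: fit the policy network to demonstrator state-action pairs with a supervised loss before reinforcement learning begins. The network size, optimizer, and synthetic demonstration data are assumptions for illustration.

```python
import torch
import torch.nn as nn

states = torch.randn(256, 8)             # demonstrator observations
actions = torch.randint(0, 4, (256,))    # demonstrator action labels

policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):  # pre-training passes over the demonstration set
    opt.zero_grad()
    loss = loss_fn(policy(states), actions)
    loss.backward()
    opt.step()
# The pre-trained weights would then initialize the RL learner's policy network.
```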