Patents by Inventor Chenjun XIAO

Chenjun XIAO has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11593693
    Abstract: Systems and methods of updating a multi-level data structure for controlling an agent. The method may include accessing a data structure defining one or more nodes. A non-leaf node of the one or more nodes may be associated with one or more edges for traversing to a subsequent node. An edge of the one or more edges may be associated with a visit count and a softmax state-action value estimate. The method may include: identifying a node trajectory including a series of nodes based on an asymptotically converging sampling policy, where the node trajectory includes a root node and a leaf node of the data structure; determining a reward indication associated with the node trajectory; and, for at least one non-leaf node, updating the visit count and the softmax state-action value estimate associated with one or more edges of the non-leaf node based on the determined reward indication.
    Type: Grant
    Filed: January 23, 2020
    Date of Patent: February 28, 2023
    Assignee: ROYAL BANK OF CANADA
    Inventors: Chenjun Xiao, Ruitong Huang
  • Publication number: 20200234167
    Abstract: Systems and methods of updating a multi-level data structure for controlling an agent. The method may include accessing a data structure defining one or more nodes. A non-leaf node of the one or more nodes may be associated with one or more edges for traversing to a subsequent node. An edge of the one or more edges may be associated with a visit count and a softmax state-action value estimate. The method may include: identifying a node trajectory including a series of nodes based on an asymptotically converging sampling policy, where the node trajectory includes a root node and a leaf node of the data structure; determining a reward indication associated with the node trajectory; and, for at least one non-leaf node, updating the visit count and the softmax state-action value estimate associated with one or more edges of the non-leaf node based on the determined reward indication.
    Type: Application
    Filed: January 23, 2020
    Publication date: July 23, 2020
    Inventors: Chenjun XIAO, Ruitong HUANG
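
The abstract shared by the two records above describes a tree-search update loop: select a root-to-leaf trajectory with a sampling policy, obtain a reward, then update per-edge visit counts and softmax value estimates. The following is a minimal Python sketch of that loop, not the patented implementation. Everything not named in the abstract is an assumption made for illustration: the `Node`/`Edge` classes, the `build_tree` helper, the temperature `tau`, the exploration weight `epsilon`, the specific policy (a softmax mixed with a uniform distribution whose weight decays as visits grow), and the log-sum-exp backup rule.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Edge:
    visit_count: int = 0
    softmax_q: float = 0.0   # softmax state-action value estimate
    child: "Node | None" = None


@dataclass
class Node:
    edges: list = field(default_factory=list)  # an empty list marks a leaf
    reward: float = 0.0                        # assumed: fixed reward at a leaf


def softmax_value(node, tau):
    """Assumed backup rule: tau * log(sum(exp(q / tau))) over a node's edges."""
    m = max(e.softmax_q for e in node.edges)
    total = sum(math.exp((e.softmax_q - m) / tau) for e in node.edges)
    return m + tau * math.log(total)


def sampling_policy(node, tau, epsilon):
    """Assumed policy: softmax over edge estimates mixed with a uniform
    distribution; the uniform weight shrinks as the node is visited more."""
    n = len(node.edges)
    visits = sum(e.visit_count for e in node.edges)
    lam = min(1.0, epsilon * n / math.log(visits + 2))
    m = max(e.softmax_q for e in node.edges)
    weights = [math.exp((e.softmax_q - m) / tau) for e in node.edges]
    z = sum(weights)
    return [(1 - lam) * w / z + lam / n for w in weights]


def select_trajectory(root, tau, epsilon, rng):
    """Walk from the root to a leaf, recording (node, edge index) pairs."""
    path, node = [], root
    while node.edges:
        probs = sampling_policy(node, tau, epsilon)
        idx = rng.choices(range(len(node.edges)), weights=probs)[0]
        path.append((node, idx))
        node = node.edges[idx].child
    return path, node


def backup(path, reward, tau):
    """Update visit counts and softmax estimates from the leaf back to the root."""
    for node, idx in reversed(path):
        edge = node.edges[idx]
        edge.visit_count += 1
        child = edge.child
        edge.softmax_q = softmax_value(child, tau) if child.edges else reward


def build_tree(leaf_rewards):
    """Hypothetical helper: depth-2 binary tree with four fixed leaf rewards."""
    leaves = [Node(reward=r) for r in leaf_rewards]
    inner = [Node(edges=[Edge(child=leaves[2 * i]), Edge(child=leaves[2 * i + 1])])
             for i in range(2)]
    return Node(edges=[Edge(child=n) for n in inner])


def search(root, iterations, tau=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    for _ in range(iterations):
        path, leaf = select_trajectory(root, tau, epsilon, rng)
        backup(path, leaf.reward, tau)
```

As a usage sketch, `root = build_tree([0.0, 0.2, 1.0, 0.4])` followed by `search(root, 200)` runs 200 select-and-update iterations; the per-edge `visit_count` and `softmax_q` values at `root.edges` can then be inspected to choose an action for the agent.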