Patents by Inventor Stephan Tao ZHENG

Stephan Tao ZHENG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS AND METHODS FOR END-TO-END MULTI-AGENT REINFORCEMENT LEARNING ON A GRAPHICS PROCESSING UNIT

Publication number: 20230237352

Abstract: Embodiments provide a fast multi-agent reinforcement learning (RL) pipeline that runs the full RL workflow end-to-end on a single GPU, using a single store of data for simulation roll-outs, inference, and training. Specifically, simulations and the agents in each simulation are run in tandem, taking advantage of the parallel capabilities of the GPU. This significantly reduces costly GPU-CPU communication and copying, which in turn improves simulation sampling and learning rates. A large number of simulations may thus be run concurrently on the GPU, greatly improving the efficiency of RL training.

Type: Application

Filed: January 21, 2022

Publication date: July 27, 2023

Inventors: Tian Lan, Stephan Tao Zheng, Sunil Srinivasa

Solving sparse reward tasks using self-balancing shaped rewards

Patent number: 11620572

Abstract: Approaches for using self-balancing shaped rewards include randomly selecting a start and goal state, traversing first and second trajectories for moving from the start state toward the goal state, where a first terminal state of the first trajectory is closer to the goal state than a second terminal state of the second trajectory, updating rewards for the first and second trajectories using a self-balancing reward function based on the terminal states of the other trajectory, determining a gradient for the goal-oriented task module, and updating one or more parameters of the goal-oriented task module based on the gradient. The second trajectory contributes to the determination of the gradient, and the first trajectory contributes to the determination of the gradient when the first terminal state is within a first threshold distance of the second terminal state or the first terminal state is within a second threshold distance of the goal state.

Type: Grant

Filed: August 20, 2019

Date of Patent: April 4, 2023

Assignee: salesforce.com, inc.

Inventors: Alexander Richard Trott, Stephan Tao Zheng
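
The selection rule in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the patent: the Euclidean distance metric, the function names, the parameter names `eps` and `delta`, and in particular the form of the reward (negative distance to the goal plus distance from the other trajectory's terminal state) are assumptions made for illustration only. The one part that follows the abstract directly is which trajectories contribute to the gradient.

```python
import math


def distance(a, b):
    """Euclidean distance between two states given as coordinate tuples (assumed metric)."""
    return math.dist(a, b)


def self_balancing_reward(own_terminal, other_terminal, goal):
    """Hypothetical reward form: move toward the goal, away from the other terminal state."""
    return -distance(own_terminal, goal) + distance(own_terminal, other_terminal)


def trajectories_for_gradient(terminal_a, terminal_b, goal, eps, delta):
    """Return the labels ("a", "b") of the trajectories that contribute to the gradient.

    The trajectory ending farther from the goal (the "second" trajectory) always
    contributes. The closer one (the "first" trajectory) contributes only if its
    terminal state is within eps of the other terminal state or within delta of
    the goal.
    """
    if distance(terminal_a, goal) <= distance(terminal_b, goal):
        first, second = ("a", terminal_a), ("b", terminal_b)
    else:
        first, second = ("b", terminal_b), ("a", terminal_a)

    selected = [second[0]]
    near_other = distance(first[1], second[1]) <= eps
    near_goal = distance(first[1], goal) <= delta
    if near_other or near_goal:
        selected.append(first[0])
    return selected
```

For example, with the goal at `(0.0, 0.0)`, terminal states `(1.0, 0.0)` and `(5.0, 0.0)`, and both thresholds set to `0.5`, only the farther trajectory is selected, because the closer one is neither near the other terminal state nor near the goal.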

Learning world graphs to accelerate hierarchical reinforcement learning

Patent number: 11562251

Abstract: Systems and methods are provided for learning world graphs to accelerate hierarchical reinforcement learning (HRL) for the training of a machine learning system. The systems and methods employ or implement a two-stage framework or approach that includes (1) unsupervised world graph discovery, and (2) accelerated hierarchical reinforcement learning by integrating the graph.

Type: Grant

Filed: August 6, 2019

Date of Patent: January 24, 2023

Assignee: Salesforce.com, Inc.

Inventors: Wenling Shang, Alexander Richard Trott, Stephan Tao Zheng

SOLVING SPARSE REWARD TASKS USING SELF-BALANCING SHAPED REWARDS

Publication number: 20200364614

Abstract: Approaches for using self-balancing shaped rewards include randomly selecting a start and goal state, traversing first and second trajectories for moving from the start state toward the goal state, where a first terminal state of the first trajectory is closer to the goal state than a second terminal state of the second trajectory, updating rewards for the first and second trajectories using a self-balancing reward function based on the terminal states of the other trajectory, determining a gradient for the goal-oriented task module, and updating one or more parameters of the goal-oriented task module based on the gradient. The second trajectory contributes to the determination of the gradient, and the first trajectory contributes to the determination of the gradient when the first terminal state is within a first threshold distance of the second terminal state or the first terminal state is within a second threshold distance of the goal state.

Type: Application

Filed: August 20, 2019

Publication date: November 19, 2020

Inventors: Alexander Richard Trott, Stephan Tao Zheng

Learning World Graphs to Accelerate Hierarchical Reinforcement Learning

Publication number: 20200364580

Abstract: Systems and methods are provided for learning world graphs to accelerate hierarchical reinforcement learning (HRL) for the training of a machine learning system. The systems and methods employ or implement a two-stage framework or approach that includes (1) unsupervised world graph discovery, and (2) accelerated hierarchical reinforcement learning by integrating the graph.

Type: Application

Filed: August 6, 2019

Publication date: November 19, 2020

Inventors: Wenling SHANG, Alexander Richard TROTT, Stephan Tao ZHENG
