Patents by Inventor William Clinton Dabney
William Clinton Dabney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240135182
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
Type: Application
Filed: December 15, 2023
Publication date: April 25, 2024
Inventors: Georg Ostrovski, William Clinton Dabney
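The quantile-function procedure in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: `quantile_network` is a hypothetical stand-in for the learned quantile function network, and its toy return shape is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantile_network(action, observation, tau):
    # Stand-in for a learned quantile function network: maps a probability
    # value tau in (0, 1) to an estimated quantile of the return
    # distribution for the (observation, action) pair.
    base = np.dot(observation, np.sin(action + np.arange(observation.size)))
    return base + np.log(tau / (1.0 - tau))  # toy shape, monotone in tau

def select_action(observation, num_actions=4, num_samples=8):
    means = []
    for action in range(num_actions):
        # Randomly sample probability values and estimate their quantiles.
        taus = rng.uniform(0.01, 0.99, size=num_samples)
        quantiles = [quantile_network(action, observation, t) for t in taus]
        # Measure of central tendency of the estimated quantile values.
        means.append(np.mean(quantiles))
    # Select the action using the measures of central tendency.
    return int(np.argmax(means))

obs = rng.normal(size=5)
print(select_action(obs))
```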
-
Patent number: 11887000
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
Type: Grant
Filed: February 15, 2023
Date of Patent: January 30, 2024
Assignee: DeepMind Technologies Limited
Inventors: Georg Ostrovski, William Clinton Dabney
-
Patent number: 11755879
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing and storing inputs for use in a neural network. One of the methods includes receiving input data for storage in a memory system comprising a first set of memory blocks, the memory blocks having an associated order; passing the input data to a highest ordered memory block; for each memory block for which there is a lower ordered memory block: applying a filter function to data currently stored by the memory block to generate filtered data and passing the filtered data to a lower ordered memory block; and for each memory block: combining the data currently stored in the memory block with the data passed to the memory block to generate updated data, and storing the updated data in the memory block.
Type: Grant
Filed: February 11, 2019
Date of Patent: September 12, 2023
Assignee: DeepMind Technologies Limited
Inventors: Razvan Pascanu, William Clinton Dabney, Thomas Stepleton
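The ordered memory-block update in the abstract above can be sketched as below. This is a toy illustration under stated assumptions: `filter_fn` here is a simple decay, chosen for the sketch, and "combining" is taken to be addition; the patented method does not fix either choice.

```python
import numpy as np

def filter_fn(data, decay=0.5):
    # Hypothetical stand-in filter function: simple exponential decay.
    return decay * data

def step(blocks, input_data):
    """One update of an ordered memory system (index 0 = highest order)."""
    passed = [None] * len(blocks)
    passed[0] = input_data  # input data goes to the highest-ordered block
    # Each block with a lower-ordered block passes filtered data down,
    # computed from the data it currently stores (pre-update).
    for i in range(len(blocks) - 1):
        passed[i + 1] = filter_fn(blocks[i])
    # Each block combines its current contents with the data passed to it.
    return [b + p for b, p in zip(blocks, passed)]

blocks = [np.zeros(3) for _ in range(3)]
x = np.ones(3)
blocks = step(blocks, x)
blocks = step(blocks, x)
print(blocks[0])  # the highest-ordered block accumulates the raw inputs
```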
-
Publication number: 20230196108
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
Type: Application
Filed: February 15, 2023
Publication date: June 22, 2023
Inventors: Georg Ostrovski, William Clinton Dabney
-
Patent number: 11610118
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
Type: Grant
Filed: February 11, 2019
Date of Patent: March 21, 2023
Assignee: DeepMind Technologies Limited
Inventors: Georg Ostrovski, William Clinton Dabney
-
Publication number: 20210089908
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes sampling a behavior modulation in accordance with a current probability distribution; for each of one or more time steps: processing an input comprising an observation characterizing a current state of the environment at the time step using an action selection neural network to generate a respective action score for each action in a set of possible actions that can be performed by the agent; modifying the action scores using the sampled behavior modulation; and selecting the action to be performed by the agent at the time step based on the modified action scores; determining a fitness measure corresponding to the sampled behavior modulation; and updating the current probability distribution over the set of possible behavior modulations using the fitness measure corresponding to the behavior modulation.
Type: Application
Filed: September 25, 2020
Publication date: March 25, 2021
Inventors: Tom Schaul, Diana Luiza Borsa, Fengning Ding, David Szepesvari, Georg Ostrovski, Simon Osindero, William Clinton Dabney
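The sample-modulate-update loop in the abstract above can be sketched roughly as follows. Everything concrete here is an assumption for illustration: temperatures as the behavior modulations, a toy fitness signal, and a multiplicative fitness-weighted update rule, none of which the abstract specifies.

```python
import numpy as np

rng = np.random.default_rng(1)
modulations = np.array([0.1, 0.5, 1.0, 2.0])  # assumed: exploration temperatures
probs = np.full(len(modulations), 0.25)       # current probability distribution

def action_scores(observation):
    # Stand-in for an action selection neural network.
    return observation @ np.ones((observation.size, 3)) + np.arange(3)

def run_episode(observation, temperature):
    # Modify the action scores using the sampled behavior modulation,
    # then select an action (Gumbel trick = sampling from the softmax).
    scores = action_scores(observation) / temperature
    action = int(np.argmax(scores + rng.gumbel(size=scores.size)))
    return float(action)  # toy fitness measure for the sketch

for _ in range(50):
    i = rng.choice(len(modulations), p=probs)  # sample a behavior modulation
    fitness = run_episode(rng.normal(size=4), modulations[i])
    probs[i] *= np.exp(0.1 * fitness)          # fitness-weighted update
    probs /= probs.sum()                       # renormalize the distribution
```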
-
Publication number: 20210064970
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. A current observation characterizing a current state of the environment is received. For each action in a set of multiple actions that can be performed by the agent to interact with the environment, a probability distribution is determined over possible Q returns for the action-current observation pair. For each action, a measure of central tendency of the possible Q returns with respect to the probability distributions for the action-current observation pair is determined. An action to be performed by the agent in response to the current observation is selected using the measures of central tendency.
Type: Application
Filed: November 16, 2020
Publication date: March 4, 2021
Inventors: Marc Gendron-Bellemare, William Clinton Dabney
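The abstract above can be sketched with a categorical return distribution over a fixed grid of possible Q returns. The fixed support and the stand-in `return_distribution` are assumptions for the sketch; the mean is used as the measure of central tendency.

```python
import numpy as np

# Fixed grid of possible Q returns (an assumption for this sketch).
support = np.linspace(-10.0, 10.0, 51)

def return_distribution(observation, action):
    # Stand-in for a learned head that outputs a categorical distribution
    # over the support for each (observation, action) pair.
    logits = 0.1 * (observation.sum() + action) * support
    p = np.exp(logits - logits.max())
    return p / p.sum()

def greedy_action(observation, num_actions=3):
    means = []
    for a in range(num_actions):
        dist = return_distribution(observation, a)
        # Expected return as the measure of central tendency.
        means.append(float(dist @ support))
    return int(np.argmax(means))

obs = np.array([0.2, -0.1, 0.4])
print(greedy_action(obs))
```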
-
Patent number: 10860920
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. A current observation characterizing a current state of the environment is received. For each action in a set of multiple actions that can be performed by the agent to interact with the environment, a probability distribution is determined over possible Q returns for the action-current observation pair. For each action, a measure of central tendency of the possible Q returns with respect to the probability distributions for the action-current observation pair is determined. An action to be performed by the agent in response to the current observation is selected using the measures of central tendency.
Type: Grant
Filed: July 10, 2019
Date of Patent: December 8, 2020
Assignee: DeepMind Technologies Limited
Inventors: Marc Gendron-Bellemare, William Clinton Dabney
-
Publication number: 20200364557
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
Type: Application
Filed: February 11, 2019
Publication date: November 19, 2020
Inventors: Georg Ostrovski, William Clinton Dabney
-
Patent number: 10755177
Abstract: A voice user interface (VUI) system uses collaborative filtering to expand its own knowledge base. The system is designed to improve the accuracy and performance of the Natural Language Understanding (NLU) processing that underlies VUIs. The system leverages the knowledge of system users to crowdsource new information.
Type: Grant
Filed: December 31, 2015
Date of Patent: August 25, 2020
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: William Clinton Dabney, Arpit Gupta, Faisal Ladhak, Markus Dreyer, Anjishnu Kumar
-
Publication number: 20190332923
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. A current observation characterizing a current state of the environment is received. For each action in a set of multiple actions that can be performed by the agent to interact with the environment, a probability distribution is determined over possible Q returns for the action-current observation pair. For each action, a measure of central tendency of the possible Q returns with respect to the probability distributions for the action-current observation pair is determined. An action to be performed by the agent in response to the current observation is selected using the measures of central tendency.
Type: Application
Filed: July 10, 2019
Publication date: October 31, 2019
Inventors: Marc Gendron-Bellemare, William Clinton Dabney
-
Publication number: 20190251419
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing and storing inputs for use in a neural network. One of the methods includes receiving input data for storage in a memory system comprising a first set of memory blocks, the memory blocks having an associated order; passing the input data to a highest ordered memory block; for each memory block for which there is a lower ordered memory block: applying a filter function to data currently stored by the memory block to generate filtered data and passing the filtered data to a lower ordered memory block; and for each memory block: combining the data currently stored in the memory block with the data passed to the memory block to generate updated data, and storing the updated data in the memory block.
Type: Application
Filed: February 11, 2019
Publication date: August 15, 2019
Inventors: Razvan Pascanu, William Clinton Dabney, Thomas Stepleton
-
Patent number: 10070244
Abstract: An audio system has multiple loudspeaker devices to produce sound corresponding to different channels of a multi-channel audio signal such as a surround sound audio signal. The loudspeaker devices may have speech recognition capabilities. In response to a command spoken by a user, the loudspeaker devices automatically determine their positions and configure themselves to receive appropriate channels based on the positions. In order to determine the positions, a first of the loudspeaker devices analyzes sound representing the user command to determine the position of the first loudspeaker device relative to the user. The first loudspeaker also produces responsive speech indicating to the user that the loudspeaker devices have been or are being configured. The other loudspeaker devices analyze the sound representing the responsive speech to determine their positions relative to the first loudspeaker device and report their positions to the first loudspeaker device.
Type: Grant
Filed: September 30, 2015
Date of Patent: September 4, 2018
Assignee: Amazon Technologies, Inc.
Inventor: William Clinton Dabney