Patents by Inventor William Clinton Dabney

William Clinton Dabney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240135182
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
    Type: Application
    Filed: December 15, 2023
    Publication date: April 25, 2024
    Inventors: Georg Ostrovski, William Clinton Dabney
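The action-selection procedure in the abstract above (which also appears in the related entries below) can be illustrated with a minimal sketch. This is not the patented implementation: `quantile_network` is a stand-in for a trained quantile function network, and the observation size, number of sampled probability values, and the use of the sample mean as the measure of central tendency are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantile_network(action, observation, tau, weights):
    """Stand-in for a trained quantile function network.

    Maps (action, observation, probability value tau) to an estimated quantile
    value of the return distribution. Here it is just a random linear function
    so the sketch runs end to end.
    """
    features = np.concatenate([observation, [float(action)], [tau]])
    return float(features @ weights)

def select_action(observation, num_actions, weights, num_samples=8):
    """Pick the action whose sampled quantile values have the highest mean."""
    central_tendencies = []
    for action in range(num_actions):
        # Randomly sample probability values in (0, 1).
        taus = rng.uniform(0.0, 1.0, size=num_samples)
        # Estimate a quantile value of the return distribution for each tau.
        quantiles = [quantile_network(action, observation, tau, weights) for tau in taus]
        # Measure of central tendency: here, the sample mean.
        central_tendencies.append(np.mean(quantiles))
    return int(np.argmax(central_tendencies))

# Usage with a toy observation and randomly initialised "network" weights.
observation = rng.normal(size=4)
weights = rng.normal(size=4 + 2)   # observation dims + action + tau
print(select_action(observation, num_actions=3, weights=weights))
```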
  • Patent number: 11887000
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
    Type: Grant
    Filed: February 15, 2023
    Date of Patent: January 30, 2024
    Assignee: DeepMind Technologies Limited
    Inventors: Georg Ostrovski, William Clinton Dabney
  • Patent number: 11755879
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing and storing inputs for use in a neural network. One of the methods includes receiving input data for storage in a memory system comprising a first set of memory blocks, the memory blocks having an associated order; passing the input data to a highest ordered memory block; for each memory block for which there is a lower ordered memory block: applying a filter function to data currently stored by the memory block to generate filtered data and passing the filtered data to a lower ordered memory block; and for each memory block: combining the data currently stored in the memory block with the data passed to the memory block to generate updated data, and storing the updated data in the memory block.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: September 12, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Razvan Pascanu, William Clinton Dabney, Thomas Stepleton
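The memory update described in the abstract above (and in the related application below) can be sketched as follows. This is a rough illustration, not the patented method: the decay-style filter function, the averaging used to combine stored and passed data, and the block sizes are placeholder assumptions.

```python
import numpy as np

def update_memory(blocks, input_data, filter_strength=0.5):
    """One update of an ordered set of memory blocks, as a rough sketch.

    blocks[0] is the highest-ordered block. Each update:
      1. the new input data is passed to the highest-ordered block,
      2. every block with a lower-ordered block passes a filtered copy of its
         currently stored data down to that lower-ordered block,
      3. every block combines its currently stored data with the data passed
         to it and stores the result.
    The filter and combination functions here (scaling and averaging) are
    placeholder choices, not the patented ones.
    """
    passed = [np.zeros_like(block) for block in blocks]
    passed[0] = input_data
    for i in range(len(blocks) - 1):
        # Filter the currently stored data and pass it to the lower-ordered block.
        passed[i + 1] = filter_strength * blocks[i]
    # Combine stored data with the data passed to each block.
    return [0.5 * (block + incoming) for block, incoming in zip(blocks, passed)]

# Usage: three memory blocks holding 4-dimensional data.
blocks = [np.zeros(4) for _ in range(3)]
for step in range(5):
    blocks = update_memory(blocks, input_data=np.full(4, float(step)))
print([block.round(3) for block in blocks])
```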
  • Publication number: 20230196108
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
    Type: Application
    Filed: February 15, 2023
    Publication date: June 22, 2023
    Inventors: Georg Ostrovski, William Clinton Dabney
  • Patent number: 11610118
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: March 21, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Georg Ostrovski, William Clinton Dabney
  • Publication number: 20210089908
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes sampling a behavior modulation in accordance with a current probability distribution; for each of one or more time steps: processing an input comprising an observation characterizing a current state of the environment at the time step using an action selection neural network to generate a respective action score for each action in a set of possible actions that can be performed by the agent; modifying the action scores using the sampled behavior modulation; and selecting the action to be performed by the agent at the time step based on the modified action scores; determining a fitness measure corresponding to the sampled behavior modulation; and updating the current probability distribution over the set of possible behavior modulations using the fitness measure corresponding to the behavior modulation.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 25, 2021
    Inventors: Tom Schaul, Diana Luiza Borsa, Fengning Ding, David Szepesvari, Georg Ostrovski, Simon Osindero, William Clinton Dabney
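A rough sketch of the control loop described in this abstract, under several assumptions of convenience: the set of behavior modulations is taken to be a few softmax temperatures, the action selection network is a random linear map, the fitness measure is a toy episode return, and the fitness-proportional update of the distribution is an illustrative choice rather than the patented rule.

```python
import numpy as np

rng = np.random.default_rng(0)

def action_scores(observation, weights):
    """Stand-in for an action selection neural network (here a linear map)."""
    return observation @ weights

# A discrete set of possible behavior modulations; here each modulation is a
# softmax temperature applied to the action scores. This choice, and the
# fitness-proportional update below, are illustrative assumptions.
modulations = np.array([0.1, 0.5, 1.0, 2.0])
distribution = np.ones(len(modulations)) / len(modulations)

weights = rng.normal(size=(4, 3))            # 4-dim observations, 3 actions
for episode in range(20):
    # Sample a behavior modulation from the current distribution.
    idx = rng.choice(len(modulations), p=distribution)
    temperature = modulations[idx]

    episode_return = 0.0
    for t in range(10):
        observation = rng.normal(size=4)
        scores = action_scores(observation, weights)
        # Modify the action scores with the sampled modulation.
        modified = scores / temperature
        probs = np.exp(modified - modified.max())
        probs /= probs.sum()
        action = rng.choice(len(probs), p=probs)
        episode_return += float(scores[action])  # toy reward signal

    # Use the episode return as the fitness measure and nudge the distribution
    # toward modulations with higher fitness.
    distribution[idx] *= np.exp(0.1 * episode_return)
    distribution /= distribution.sum()

print(distribution.round(3))
```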
  • Publication number: 20210064970
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. A current observation characterizing a current state of the environment is received. For each action in a set of multiple actions that can be performed by the agent to interact with the environment, a probability distribution is determined over possible Q returns for the action-current observation pair. For each action, a measure of central tendency of the possible Q returns with respect to the probability distributions for the action-current observation pair is determined. An action to be performed by the agent in response to the current observation is selected using the measures of central tendency.
    Type: Application
    Filed: November 16, 2020
    Publication date: March 4, 2021
    Inventors: Marc Gendron-Bellemare, William Clinton Dabney
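The selection rule in this abstract (shared by the related entries below) can be sketched with a categorical distribution over a fixed set of possible Q returns. The atom support, the stand-in `return_distribution` network, and the use of the expected return as the measure of central tendency are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed support of possible Q returns (the "atoms"), one common way to
# represent a categorical return distribution; the values are illustrative.
atoms = np.linspace(-10.0, 10.0, num=51)

def return_distribution(observation, action, weights):
    """Stand-in for a network mapping an action-observation pair to a
    probability distribution over the possible Q returns."""
    logits = np.concatenate([observation, [float(action)]]) @ weights
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def select_action(observation, num_actions, weights):
    """Choose the action whose return distribution has the highest mean."""
    central_tendencies = []
    for action in range(num_actions):
        probs = return_distribution(observation, action, weights)
        # Measure of central tendency: the expected return under the distribution.
        central_tendencies.append(float(probs @ atoms))
    return int(np.argmax(central_tendencies))

# Usage with a toy observation and random "network" weights.
observation = rng.normal(size=4)
weights = rng.normal(size=(4 + 1, len(atoms)))
print(select_action(observation, num_actions=3, weights=weights))
```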
  • Patent number: 10860920
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. A current observation characterizing a current state of the environment is received. For each action in a set of multiple actions that can be performed by the agent to interact with the environment, a probability distribution is determined over possible Q returns for the action-current observation pair. For each action, a measure of central tendency of the possible Q returns with respect to the probability distributions for the action-current observation pair is determined. An action to be performed by the agent in response to the current observation is selected using the measures of central tendency.
    Type: Grant
    Filed: July 10, 2019
    Date of Patent: December 8, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Marc Gendron-Bellemare, William Clinton Dabney
  • Publication number: 20200364557
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
    Type: Application
    Filed: February 11, 2019
    Publication date: November 19, 2020
    Inventors: Georg Ostrovski, William Clinton Dabney
  • Patent number: 10755177
    Abstract: A voice user interface (VUI) system uses collaborative filtering to expand its own knowledge base. The system is designed to improve the accuracy and performance of the Natural Language Understanding (NLU) processing that underlies VUIs, and it leverages the knowledge of system users to crowdsource new information.
    Type: Grant
    Filed: December 31, 2015
    Date of Patent: August 25, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: William Clinton Dabney, Arpit Gupta, Faisal Ladhak, Markus Dreyer, Anjishnu Kumar
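The abstract gives little algorithmic detail, so the following is only a generic collaborative-filtering sketch, not the patented method: the user-by-entry confirmation matrix and the cosine-similarity weighting are assumptions used to suggest candidate knowledge-base entries.

```python
import numpy as np

# A minimal, generic collaborative-filtering sketch (not the patented method):
# rows are users, columns are candidate knowledge-base entries (for example
# utterance-to-intent mappings), and a 1 means that user confirmed the entry.
confirmations = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

def suggest_entries(user_index, matrix, top_k=2):
    """Suggest unconfirmed entries for a user from similar users' confirmations."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True) + 1e-9
    similarity = (matrix / norms) @ (matrix[user_index] / norms[user_index])
    scores = similarity @ matrix                   # weight entries by user similarity
    scores[matrix[user_index] > 0] = -np.inf       # skip already-confirmed entries
    return np.argsort(scores)[::-1][:top_k]

print(suggest_entries(0, confirmations))
```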
  • Publication number: 20190332923
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. A current observation characterizing a current state of the environment is received. For each action in a set of multiple actions that can be performed by the agent to interact with the environment, a probability distribution is determined over possible Q returns for the action-current observation pair. For each action, a measure of central tendency of the possible Q returns with respect to the probability distributions for the action-current observation pair is determined. An action to be performed by the agent in response to the current observation is selected using the measures of central tendency.
    Type: Application
    Filed: July 10, 2019
    Publication date: October 31, 2019
    Inventors: Marc Gendron-Bellemare, William Clinton Dabney
  • Publication number: 20190251419
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing and storing inputs for use in a neural network. One of the methods includes receiving input data for storage in a memory system comprising a first set of memory blocks, the memory blocks having an associated order; passing the input data to a highest ordered memory block; for each memory block for which there is a lower ordered memory block: applying a filter function to data currently stored by the memory block to generate filtered data and passing the filtered data to a lower ordered memory block; and for each memory block: combining the data currently stored in the memory block with the data passed to the memory block to generate updated data, and storing the updated data in the memory block.
    Type: Application
    Filed: February 11, 2019
    Publication date: August 15, 2019
    Inventors: Razvan Pascanu, William Clinton Dabney, Thomas Stepleton
  • Patent number: 10070244
    Abstract: An audio system has multiple loudspeaker devices to produce sound corresponding to different channels of a multi-channel audio signal such as a surround sound audio signal. The loudspeaker devices may have speech recognition capabilities. In response to a command spoken by a user, the loudspeaker devices automatically determine their positions and configure themselves to receive appropriate channels based on the positions. In order to determine the positions, a first of the loudspeaker devices analyzes sound representing the user command to determine the position of the first loudspeaker device relative to the user. The first loudspeaker also produces responsive speech indicating to the user that the loudspeaker devices have been or are being configured. The other loudspeaker devices analyze the sound representing the responsive speech to determine their positions relative to the first loudspeaker device and report their positions to the first loudspeaker device.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: September 4, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: William Clinton Dabney
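The abstract above describes a self-configuration protocol; the sketch below covers only its final step under assumed details: once each loudspeaker device has estimated an angle relative to the listener, it is assigned the channel whose nominal angle is closest. The acoustic position estimation itself is not shown, and the channel layout, angles, and function names are illustrative assumptions.

```python
# A rough sketch of the final configuration step only: given each device's
# estimated angle in degrees relative to the listener (0 is directly in front),
# assign it the nearest channel of an assumed 5-channel surround layout.
CHANNEL_ANGLES = {
    "front-left": -30, "center": 0, "front-right": 30,
    "surround-left": -110, "surround-right": 110,
}

def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180) % 360 - 180)

def assign_channels(device_angles):
    """Map each device name to the channel whose nominal angle is closest."""
    return {
        device: min(CHANNEL_ANGLES, key=lambda ch: angular_distance(angle, CHANNEL_ANGLES[ch]))
        for device, angle in device_angles.items()
    }

# Usage: positions reported back to the first loudspeaker device.
print(assign_channels({"device-a": -25, "device-b": 5, "device-c": 120}))
```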