Patents Assigned to DeepMind Technologies
-
Patent number: 11714996
Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also described is a neural network system, implemented by one or more computers, for representing a space of probabilistic motor primitives. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.
Type: Grant
Filed: July 25, 2022
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov
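A minimal Python sketch of the linear-feedback idea in the abstract: the executed action is the expert's recorded action plus a correction that is linear in the deviation from the expert's state, with the gain derived from a state-action Jacobian. The pseudo-inverse gain and the toy dynamics below are assumptions for illustration, not the patented construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_feedback_action(expert_action, expert_state, current_state, jacobian):
    """Stabilise an expert's recorded action with linear state feedback.

    The gain K = -pinv(J) is chosen so that the induced state change opposes
    the deviation from the expert's state; this choice is an assumption made
    for illustration.
    """
    K = -np.linalg.pinv(jacobian)                  # feedback gain from the Jacobian
    return expert_action + K @ (current_state - expert_state)

# Toy setting: the state responds to actions through J, so J is the state-action Jacobian.
J = rng.normal(size=(4, 2)) * 0.5
expert_state, expert_action = rng.normal(size=4), rng.normal(size=2)
perturbed_state = expert_state + 0.1 * rng.normal(size=4)
print(linear_feedback_action(expert_action, expert_state, perturbed_state, J))
```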
-
Patent number: 11714993
Abstract: Methods, systems, and apparatus for classifying a new example using a comparison set of comparison examples. One method includes maintaining a comparison set, the comparison set including comparison examples and a respective label vector for each of the comparison examples, each label vector including a respective score for each label in a predetermined set of labels; receiving a new example; determining a respective attention weight for each comparison example by applying a neural network attention mechanism to the new example and to the comparison examples; and generating a respective label score for each label in the predetermined set of labels from, for each of the comparison examples, the respective attention weight for the comparison example and the respective label vector for the comparison example, in which the respective label score for each of the labels represents a likelihood that the label is a correct label for the new example.
Type: Grant
Filed: April 6, 2021
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Charles Blundell, Oriol Vinyals
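A hedged sketch of the comparison-set classification the abstract describes: attention weights come from a similarity between embeddings (cosine similarity and a softmax are common choices assumed here, not fixed by the abstract), and label scores are the attention-weighted sum of the comparison label vectors. The `embed` function and toy data are placeholders.

```python
import numpy as np

def attention_classify(new_example, comparison_examples, label_vectors, embed):
    """Label a new example from a comparison set via attention weights."""
    q = embed(new_example)                      # embedding of the new example
    keys = np.stack([embed(x) for x in comparison_examples])
    # Cosine similarity between the new example and each comparison example.
    sims = keys @ q / (np.linalg.norm(keys, axis=1) * np.linalg.norm(q) + 1e-8)
    # Softmax over similarities gives one attention weight per comparison example.
    w = np.exp(sims - sims.max())
    w /= w.sum()
    # Label scores: attention-weighted combination of the comparison label vectors.
    return w @ np.stack(label_vectors)

# Toy usage with an identity "embedding" and one-hot label vectors.
rng = np.random.default_rng(0)
comparison = [rng.normal(size=4) for _ in range(6)]
labels = [np.eye(3)[i % 3] for i in range(6)]
scores = attention_classify(rng.normal(size=4), comparison, labels, embed=lambda x: x)
print(scores, scores.argmax())
```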
-
Patent number: 11704541
Abstract: There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.
Type: Grant
Filed: October 29, 2018
Date of Patent: July 18, 2023
Assignee: DeepMind Technologies Limited
Inventors: Yujia Li, Christopher James Dyer, Oriol Vinyals
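A small sketch of sampling a graph through a sequence of node- and edge-generation decisions, as the abstract describes. The decision probabilities are stubbed out; in the patented system they would come from the neural networks representing the distribution over decision sequences.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_node_prob(nodes, edges):
    # Placeholder for a learned network; a real system conditions on the partial graph.
    return 0.9 if len(nodes) < 8 else 0.1

def add_edge_prob(nodes, edges, new_node, candidate):
    # Placeholder edge probability; a trained network would score this node pair.
    return 0.4

def sample_graph(max_nodes=12):
    """Sample a graph by a sequence of node- and edge-generation decisions."""
    nodes, edges = [], set()
    while len(nodes) < max_nodes and rng.random() < add_node_prob(nodes, edges):
        v = len(nodes)
        nodes.append(v)
        for u in nodes[:-1]:  # decide, for each existing node, whether to connect it
            if rng.random() < add_edge_prob(nodes, edges, v, u):
                edges.add((u, v))
    return nodes, edges

print(sample_graph())
```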
-
Patent number: 11693627
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using neural networks having contiguous sparsity patterns. One of the methods includes storing a first parameter matrix of a neural network having a contiguous sparsity pattern in storage associated with a computing device. The computing device performs an inference pass of the neural network to generate an output vector, including reading, from the storage associated with the computing device, one or more activation values from the input vector, reading, from the storage associated with the computing device, a block of non-zero parameter values, and multiplying each of the one or more activation values by one or more of the block of non-zero parameter values.
Type: Grant
Filed: February 11, 2019
Date of Patent: July 4, 2023
Assignee: DeepMind Technologies Limited
Inventors: Karen Simonyan, Nal Emmerich Kalchbrenner, Erich Konrad Elsen
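A sketch of why contiguous (block) sparsity helps at inference time: only dense blocks of non-zero parameters are stored, and each block multiplies a contiguous slice of activations. The storage layout and block shapes below are illustrative assumptions, not the patented kernel.

```python
import numpy as np

def block_sparse_matvec(blocks, x):
    """Multiply a block-sparse parameter matrix by an activation vector.

    `blocks` is a list of (row, col, dense_block) triples: only contiguous
    blocks of non-zero parameters are stored, so the inference pass reads a
    slice of activations and one dense block of weights at a time.
    """
    n_rows = max(r + b.shape[0] for r, _, b in blocks)
    y = np.zeros(n_rows)
    for row, col, block in blocks:
        h, w = block.shape
        # Read a contiguous slice of activations and one contiguous block of weights.
        y[row:row + h] += block @ x[col:col + w]
    return y

# An 8x8 matrix stored as two 4x4 non-zero blocks on the diagonal.
rng = np.random.default_rng(0)
blocks = [(0, 0, rng.normal(size=(4, 4))), (4, 4, rng.normal(size=(4, 4)))]
x = rng.normal(size=8)
dense = np.zeros((8, 8))
dense[:4, :4], dense[4:, 4:] = blocks[0][2], blocks[1][2]
assert np.allclose(block_sparse_matvec(blocks, x), dense @ x)
```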
-
Patent number: 11676035
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. The neural network has a plurality of differentiable weights and a plurality of non-differentiable weights. One of the methods includes determining trained values of the plurality of differentiable weights and the non-differentiable weights by repeatedly performing operations that include determining an update to the current values of the plurality of differentiable weights using a machine learning gradient-based training technique and determining, using an evolution strategies (ES) technique, an update to the current values of a plurality of distribution parameters.
Type: Grant
Filed: January 23, 2020
Date of Patent: June 13, 2023
Assignee: DeepMind Technologies Limited
Inventors: Karel Lenc, Karen Simonyan, Tom Schaul, Erich Konrad Elsen
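A toy illustration of the hybrid training scheme in the abstract: differentiable weights take gradient steps, while non-differentiable weights are handled by an evolution strategies update to the parameters (here, the mean) of a Gaussian search distribution. The objective, population size, and learning rates are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(diff_w, nondiff_w):
    # Toy objective: differentiable in diff_w; nondiff_w enters through a step function.
    return float(np.sum(diff_w ** 2) + np.sum(np.sign(nondiff_w)))

# Differentiable weights: plain gradient descent (analytic gradient of the toy loss).
diff_w = rng.normal(size=5)
# Non-differentiable weights: evolution strategies on the mean of a Gaussian
# search distribution (the "distribution parameters" of the abstract).
mu, sigma, pop, lr_es = rng.normal(size=5), 0.1, 16, 0.05

for step in range(200):
    diff_w -= 0.1 * 2 * diff_w                      # gradient-based update
    eps = rng.normal(size=(pop, mu.size))           # ES perturbations
    returns = np.array([-loss(diff_w, mu + sigma * e) for e in eps])
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    mu += lr_es / (pop * sigma) * eps.T @ returns   # vanilla ES gradient estimate

print(loss(diff_w, mu))
```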
-
Patent number: 11675855
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for re-ranking a collection of documents according to a first metric and subject to a constraint on a function of one or more second metrics. One of the methods includes: obtaining, for each document in the first collection of documents, a respective first metric value corresponding to the first metric and respective one or more second metric values corresponding to the one or more second metrics; re-ranking the first collection of documents, comprising: determining the constraint on the function of one or more second metrics by computing a first threshold value using a variable threshold function that takes as input second metric values for the documents in the first collection of documents; and determining the re-ranking for the first collection of documents by solving a constrained optimization for the first metric constrained by the first threshold value.
Type: Grant
Filed: November 18, 2020
Date of Patent: June 13, 2023
Assignee: DeepMind Technologies Limited
Inventors: Anton Zhernov, Krishnamurthy Dvijotham, Xiaohong Gong, Amogh S. Asgekar
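A rough sketch of re-ranking under a variable threshold: the constraint threshold is computed from the collection's own second-metric values (the median here, as an assumed choice of threshold function), and the constrained optimization over the first metric is solved greedily rather than exactly.

```python
import numpy as np

def rerank(first_metric, second_metric, threshold_fn, k):
    """Pick a top-k re-ranking by the first metric subject to a second-metric constraint.

    The constraint: the mean second-metric value of the selected documents must
    reach a threshold computed by `threshold_fn` from the collection itself.
    The greedy feasibility check below is a sketch, not an exact solver.
    """
    threshold = threshold_fn(second_metric)
    order = list(np.argsort(-first_metric))          # best first-metric value first
    chosen = []
    for idx in order:
        candidate = chosen + [idx]
        remaining = k - len(candidate)
        # Optimistically fill the rest with the best remaining second-metric values.
        pool = [i for i in order if i not in candidate]
        best_rest = sorted(second_metric[pool], reverse=True)[:remaining]
        if (sum(second_metric[candidate]) + sum(best_rest)) / k >= threshold:
            chosen = candidate
        if len(chosen) == k:
            break
    return [int(i) for i in chosen]

rng = np.random.default_rng(0)
relevance, quality = rng.random(10), rng.random(10)
print(rerank(relevance, quality, threshold_fn=lambda v: float(np.median(v)), k=5))
```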
-
Patent number: 11662210
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment. The action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.
Type: Grant
Filed: May 18, 2022
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
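A forward-pass sketch of the two-network arrangement in the abstract: a grid cell network maps velocity to a grid cell representation and a position estimate, and an action selection network consumes the grid cell representation together with an observation. The tiny MLPs, sizes, and random inputs are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    return [(rng.normal(size=(m, n)) * 0.1, np.zeros(n)) for m, n in zip(sizes, sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# Grid cell network: velocity -> grid cell representation -> position estimate.
grid_net = mlp([3, 64, 32])        # input: (dx, dy, angular velocity); output: grid code
pos_head = mlp([32, 2])            # linear readout of the agent's (x, y) position
# Action selection network: grid code + observation -> action selection output.
policy_net = mlp([32 + 16, 64, 4])

velocity = rng.normal(size=3)
observation = rng.normal(size=16)
grid_code = forward(grid_net, velocity)            # grid cell representation
position_estimate = forward(pos_head, grid_code)   # estimate of the agent's position
action_scores = forward(policy_net, np.concatenate([grid_code, observation]))
print(position_estimate, action_scores.argmax())
```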
-
Patent number: 11663441
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network, wherein the action selection policy neural network is configured to process an observation characterizing a state of an environment to generate an action selection policy output, wherein the action selection policy output is used to select an action to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining an observation characterizing a state of the environment subsequent to the agent performing a selected action; generating a latent representation of the observation; processing the latent representation of the observation using a discriminator neural network to generate an imitation score; determining a reward from the imitation score; and adjusting the current values of the action selection policy neural network parameters based on the reward using a reinforcement learning training technique.
Type: Grant
Filed: September 27, 2019
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: Scott Ellison Reed, Yusuf Aytar, Ziyu Wang, Tom Paine, Sergio Gomez Colmenarejo, David Budden, Tobias Pfaff, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Alexander Novikov
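An illustrative sketch of turning a discriminator's imitation score into a reward, as the abstract outlines. The encoder, discriminator, and the log-score reward mapping are assumptions; only the latent-to-score-to-reward flow is taken from the abstract, and a one-sample REINFORCE-style surrogate stands in for the reinforcement learning technique.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in networks: an encoder producing a latent representation of the
# observation, and a discriminator scoring how "expert-like" that latent is.
W_enc = rng.normal(size=(16, 8)) * 0.1
w_disc = rng.normal(size=8)

def imitation_reward(observation):
    latent = np.tanh(observation @ W_enc)          # latent representation of the observation
    score = sigmoid(latent @ w_disc)               # discriminator's imitation score
    return float(np.log(score + 1e-8))             # one common score -> reward mapping (assumed)

log_prob_selected_action = -1.2                    # log pi(a | s) for the action that was taken
reward = imitation_reward(rng.normal(size=16))
surrogate_loss = -log_prob_selected_action * reward   # minimised to adjust the policy parameters
print(reward, surrogate_loss)
```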
-
Patent number: 11663475
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
Type: Grant
Filed: September 15, 2022
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
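A minimal sketch of pairing an action selection network for continuous actions with a distribution Q network: the critic outputs a categorical distribution over return atoms, and the actor is nudged towards actions with higher expected return under that distribution. Linear stand-in networks, the atom support, and a finite-difference gradient are assumptions to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

atoms = np.linspace(-1.0, 1.0, 11)                 # support of the return distribution
W_actor = rng.normal(size=(8, 2)) * 0.1            # observation -> continuous action
W_critic = rng.normal(size=(8 + 2, atoms.size)) * 0.1

def actor(obs):
    return np.tanh(obs @ W_actor)

def critic_distribution(obs, action):
    logits = np.concatenate([obs, action]) @ W_critic
    p = np.exp(logits - logits.max())
    return p / p.sum()

def expected_return(obs, action):
    return float(critic_distribution(obs, action) @ atoms)

# Training signal for the actor: push the action towards higher expected return
# under the distributional critic (finite-difference gradient for brevity).
obs = rng.normal(size=8)
a = actor(obs)
eps = 1e-4
grad = np.array([(expected_return(obs, a + eps * np.eye(2)[i]) -
                  expected_return(obs, a - eps * np.eye(2)[i])) / (2 * eps)
                 for i in range(2)])
print(a, expected_return(obs, a), grad)
```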
-
Patent number: 11651208
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. A reinforcement learning neural network selects actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The reinforcement learning neural network has at least one input to receive an input observation characterizing a state of the environment and at least one output for determining an action to be performed by the agent in response to the input observation. The system includes a reward function network coupled to the reinforcement learning neural network. The reward function network has an input to receive reward data characterizing a reward provided by one or more states of the environment and is configured to determine a reward function to provide one or more target values for training the reinforcement learning neural network.
Type: Grant
Filed: May 22, 2018
Date of Patent: May 16, 2023
Assignee: DeepMind Technologies Limited
Inventors: Zhongwen Xu, Hado Phillip van Hasselt, Joseph Varughese Modayil, Andre da Motta Salles Barreto, David Silver
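A toy sketch of the coupling the abstract describes: a reward function network maps reward data to a learned reward, which then supplies target values (here a one-step TD target, as an assumed example) for training the reinforcement learning network. The linear reward network and feature sizes are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "reward function network": a linear map from reward data (the raw
# environment reward plus a state feature) to a scalar learned reward.
w_reward = rng.normal(size=3) * 0.1

def reward_fn(raw_reward, state_features):
    return float(np.concatenate([[raw_reward], state_features]) @ w_reward)

# The learned reward then supplies the target values used to train the RL
# network, e.g. a one-step temporal-difference target for a value estimate.
def td_target(raw_reward, state_features, next_value, gamma=0.99):
    return reward_fn(raw_reward, state_features) + gamma * next_value

print(td_target(raw_reward=1.0, state_features=rng.normal(size=2), next_value=0.5))
```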
-
Patent number: 11636283
Abstract: A variational autoencoder (VAE) neural network system, comprising an encoder neural network to encode an input data item to define a posterior distribution for a set of latent variables, and a decoder neural network to generate an output data item representing values of a set of latent variables sampled from the posterior distribution. The system is configured for training with an objective function including a term dependent on a difference between the posterior distribution and a prior distribution. The prior and posterior distributions are arranged so that they cannot be matched to one another. The VAE system may be used for compressing and decompressing data.
Type: Grant
Filed: June 1, 2020
Date of Patent: April 25, 2023
Assignee: DeepMind Technologies Limited
Inventors: Benjamin Poole, Aaron Gerard Antonius van den Oord, Ali Razavi-Nematollahi, Oriol Vinyals
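A sketch of the key property in the abstract: if the posterior is constrained so that it can never equal the prior, the KL term of the VAE objective has a strictly positive floor. Squashing the posterior standard deviation into a range that excludes the prior's value is one way to arrange this, shown here as an assumption rather than the patented construction.

```python
import numpy as np

def gaussian_kl(mu_q, sigma_q, mu_p=0.0, sigma_p=1.0):
    """KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ) per latent dimension."""
    return (np.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2) - 0.5)

# Constrain the posterior standard deviation to a range excluding the prior's
# value of 1, so the posterior can never match the prior and the KL term in the
# training objective keeps a strictly positive floor.
def constrained_sigma(raw, low=0.2, high=0.8):
    return low + (high - low) / (1.0 + np.exp(-raw))   # sigmoid squashed into [low, high]

rng = np.random.default_rng(0)
mu_q = rng.normal(size=8) * 0.01          # encoder mean output (nearly zero)
sigma_q = constrained_sigma(rng.normal(size=8))
kl_term = gaussian_kl(mu_q, sigma_q).sum()
print(kl_term)                            # strictly > 0 even if the encoder "collapses"
```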
-
Patent number: 11636347
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining a graph of nodes and edges that represents an interaction history of the agent with the environment; generating an encoded representation of the graph representing the interaction history of the agent with the environment; processing an input based on the encoded representation of the graph using an action selection neural network, in accordance with current values of action selection neural network parameters, to generate an action selection output; and selecting an action from a plurality of possible actions to be performed by the agent using the action selection output generated by the action selection neural network.
Type: Grant
Filed: January 22, 2020
Date of Patent: April 25, 2023
Assignee: DeepMind Technologies Limited
Inventors: Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli
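A small sketch of encoding a graph-structured interaction history and feeding the encoding to an action selection head. Sum-style message passing with mean pooling is a common encoder choice assumed for illustration; the weights, sizes, and example graph are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_graph(node_feats, edges, W_msg, rounds=2):
    """Encode a graph of the agent's interaction history.

    Each round sums a node's neighbours' features through a shared linear map;
    the graph encoding is the mean of the final node states.
    """
    h = node_feats.copy()
    for _ in range(rounds):
        msgs = np.zeros_like(h)
        for u, v in edges:                      # undirected message passing
            msgs[v] += h[u] @ W_msg
            msgs[u] += h[v] @ W_msg
        h = np.tanh(h + msgs)
    return h.mean(axis=0)                       # encoded representation of the graph

n_nodes, dim, n_actions = 5, 8, 4
node_feats = rng.normal(size=(n_nodes, dim))
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
W_msg = rng.normal(size=(dim, dim)) * 0.1
W_policy = rng.normal(size=(dim, n_actions)) * 0.1

encoding = encode_graph(node_feats, edges, W_msg)
action_scores = encoding @ W_policy             # action selection output
print(int(action_scores.argmax()))
```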
-
Patent number: 11625604
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.
Type: Grant
Filed: October 29, 2018
Date of Patent: April 11, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan
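A single-process sketch of the actor/learner split in the abstract: several actor replicas of the policy write experience into a shared buffer, a learner consumes batches and produces new parameters, and actors periodically copy those parameters back. Real systems run these as separate computing units; the toy policy and update rule are stand-ins.

```python
import random
from collections import deque

def actor_step(params, env_state):
    action = (params + env_state) % 3              # stand-in for the policy network
    reward = 1.0 if action == env_state % 3 else 0.0
    return (env_state, action, reward)

def learner_update(params, batch):
    mean_reward = sum(r for _, _, r in batch) / len(batch)
    return params + 0.1 * mean_reward              # stand-in for a gradient update

replay = deque(maxlen=1000)                        # experience shared by actors and learner
learner_params = 0.0
actor_replicas = [learner_params] * 4              # each actor keeps its own parameter replica

for step in range(200):
    for i in range(len(actor_replicas)):           # actor operations: generate experience
        replay.append(actor_step(actor_replicas[i], env_state=random.randrange(10)))
    if len(replay) >= 32:                          # learner operations: sample and update
        learner_params = learner_update(learner_params, random.sample(list(replay), 32))
    if step % 10 == 0:                             # actors periodically sync parameters
        actor_replicas = [learner_params] * len(actor_replicas)

print(learner_params)
```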
-
Patent number: 11627165
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network having a plurality of policy parameters and used to select actions to be performed by an agent to control the agent to perform a particular task while interacting with one or more other agents in an environment. In one aspect, the method includes: maintaining data specifying a pool of candidate action selection policies; maintaining data specifying a respective matchmaking policy; and training the policy neural network using a reinforcement learning technique to update the policy parameters. The policy parameters define policies to be used in controlling the agent to perform the particular task.
Type: Grant
Filed: January 24, 2020
Date of Patent: April 11, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Silver, Oriol Vinyals, Maxwell Elliot Jaderberg
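An illustrative sketch of training against a maintained pool of candidate policies with a matchmaking rule: opponents the learner does not yet beat reliably are sampled more often, and snapshots of the learner are periodically added to the pool. The scalar skill model, the logistic matchmaking weighting, and the update rule are all assumptions, not the patented training scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pool of candidate action selection policies, represented here by scalar "skills".
pool = [{"skill": s} for s in rng.normal(size=5)]
learner_skill = 0.0

def win_prob(a, b):
    return 1.0 / (1.0 + np.exp(-(a - b)))          # logistic model of the match outcome

def pick_opponent(pool, learner_skill):
    probs = np.array([1.0 - win_prob(learner_skill, c["skill"]) for c in pool])
    probs = probs / probs.sum()                     # matchmaking distribution over the pool
    return int(rng.choice(len(pool), p=probs))

for step in range(500):
    opp = pool[pick_opponent(pool, learner_skill)]
    won = rng.random() < win_prob(learner_skill, opp["skill"])
    learner_skill += 0.05 * (1.0 if won else -0.2)  # stand-in for the RL policy update
    if step % 100 == 99:                            # periodically add a snapshot to the pool
        pool.append({"skill": learner_skill})

print(learner_skill, len(pool))
```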
-
Patent number: 11615310
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training machine learning models. One method includes obtaining a machine learning model, wherein the machine learning model comprises one or more model parameters, and the machine learning model is trained using gradient descent techniques to optimize an objective function; determining an update rule for the model parameters using a recurrent neural network (RNN); and applying a determined update rule for a final time step in a sequence of multiple time steps to the model parameters.
Type: Grant
Filed: May 19, 2017
Date of Patent: March 28, 2023
Assignee: DeepMind Technologies Limited
Inventors: Misha Man Ray Denil, Tom Schaul, Marcin Andrychowicz, Joao Ferdinando Gomes de Freitas, Sergio Gomez Colmenarejo, Matthew William Hoffman, David Benjamin Pfau
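A sketch of the interface the abstract describes: a recurrent update rule consumes the gradient of the objective (plus its own hidden state) and emits the parameter update applied over a sequence of time steps. Here the recurrent weights are fixed so the rule mimics gradient descent with momentum; in the patented setting those weights would themselves be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Objective to be optimized by the learned update rule: f(theta) = ||theta - target||^2.
target = rng.normal(size=4)
grad_f = lambda theta: 2 * (theta - target)

# A tiny recurrent "optimizer": per-parameter hidden state, with the update a
# linear function of [gradient, hidden state]. The weights are hand-picked here
# purely to illustrate the interface.
W_update = np.array([-0.1, -0.05])      # contribution of gradient and hidden state to the update
W_hidden = np.array([0.9, 1.0])         # hidden-state decay and gradient accumulation

theta = rng.normal(size=4)
hidden = np.zeros(4)
for step in range(100):                  # apply the update rule over a sequence of time steps
    g = grad_f(theta)
    update = W_update[0] * g + W_update[1] * hidden
    hidden = W_hidden[0] * hidden + W_hidden[1] * g
    theta = theta + update

print(np.abs(theta - target).max())      # theta has moved close to the target
```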
-
Patent number: 11610118
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
Type: Grant
Filed: February 11, 2019
Date of Patent: March 21, 2023
Assignee: DeepMind Technologies Limited
Inventors: Georg Ostrovski, William Clinton Dabney
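A compact sketch of the selection rule in the abstract: sample several probability values, query a quantile function network for the corresponding return quantiles per action, average them, and pick the best action. The quantile "network" below is a closed-form stand-in with arbitrary parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantile_value(action, observation, tau, params):
    """Stand-in for the quantile function network: maps (action, observation,
    probability value tau) to an estimated quantile of the return distribution."""
    w_a, w_o = params
    return w_a[action] + observation @ w_o + 0.5 * np.log(tau / (1 - tau))

def select_action(observation, n_actions, params, n_taus=8):
    """Sample probability values, estimate the corresponding return quantiles per
    action, and pick the action with the best measure of central tendency (the mean)."""
    taus = rng.uniform(0.01, 0.99, size=n_taus)       # randomly sampled probability values
    means = [np.mean([quantile_value(a, observation, t, params) for t in taus])
             for a in range(n_actions)]
    return int(np.argmax(means))

params = (rng.normal(size=4), rng.normal(size=16) * 0.1)
print(select_action(rng.normal(size=16), n_actions=4, params=params))
```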
-
Patent number: 11604985
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes training a neural network having multiple network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having multiple hyperparameters. The method includes: maintaining multiple candidate neural networks and, for each of the multiple candidate neural networks, data specifying: (i) respective values of network parameters for the candidate neural network, (ii) respective values of hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and, for each of the multiple candidate neural networks, repeatedly performing additional training operations.
Type: Grant
Filed: November 22, 2018
Date of Patent: March 14, 2023
Assignee: DeepMind Technologies Limited
Inventors: Maxwell Elliot Jaderberg, Wojciech Czarnecki, Timothy Frederick Goldie Green, Valentin Clement Dalibard
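A population-based-training style sketch of the abstract: each candidate carries network parameters, hyperparameter values, and a quality measure; candidates train for a while, then weaker candidates copy the best one's weights and perturb its hyperparameters. The exploit/explore rule, the toy task, and the single tuned hyperparameter (learning rate) are assumed details.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_step(weights, lr):
    # Stand-in for gradient training on the real task: minimise ||weights||^2.
    return weights - lr * 2 * weights

def evaluate(weights):
    return -float(np.sum(weights ** 2))              # quality measure (higher is better)

# Population of candidates, each with network parameters, a hyperparameter, and a quality measure.
population = [{"weights": rng.normal(size=8),
               "lr": 10 ** rng.uniform(-3, -1),
               "quality": -np.inf} for _ in range(6)]

for generation in range(20):
    for cand in population:
        for _ in range(10):                          # additional training operations
            cand["weights"] = train_step(cand["weights"], cand["lr"])
        cand["quality"] = evaluate(cand["weights"])
    best = max(population, key=lambda c: c["quality"])
    for cand in population:
        if cand["quality"] < best["quality"]:        # exploit: copy the better candidate
            cand["weights"] = best["weights"].copy()
            cand["lr"] = best["lr"] * 10 ** rng.uniform(-0.2, 0.2)   # explore: perturb hyperparameter

print(best["quality"], best["lr"])
```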
-
Patent number: 11604941
Abstract: A method of training an action selection neural network to perform a demonstrated task using a supervised learning technique. The action selection neural network is configured to receive demonstration data comprising actions to perform the task and rewards received for performing the actions. The action selection neural network has auxiliary prediction task neural networks on one or more of its intermediate outputs. The action selection policy neural network is trained using multiple combined losses, concurrently with the auxiliary prediction task neural networks.
Type: Grant
Filed: October 29, 2018
Date of Patent: March 14, 2023
Assignee: DeepMind Technologies Limited
Inventor: Todd Andrew Hester
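A sketch of the combined loss the abstract mentions: a supervised imitation loss on the demonstrated action plus an auxiliary prediction loss attached to an intermediate output of the network. The auxiliary target (predicting the demonstration reward) and the loss weight are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

W1 = rng.normal(size=(16, 32)) * 0.1      # first layer of the action selection network
W_act = rng.normal(size=(32, 4)) * 0.1    # action head
W_aux = rng.normal(size=(32, 1)) * 0.1    # auxiliary head on the intermediate output

def combined_loss(observation, demo_action, demo_reward, aux_weight=0.5):
    hidden = np.tanh(observation @ W1)                       # intermediate output
    action_probs = softmax(hidden @ W_act)
    bc_loss = -np.log(action_probs[demo_action] + 1e-8)      # supervised imitation loss
    reward_pred = float(hidden @ W_aux)
    aux_loss = (reward_pred - demo_reward) ** 2              # auxiliary prediction loss
    return bc_loss + aux_weight * aux_loss

print(combined_loss(rng.normal(size=16), demo_action=2, demo_reward=1.0))
```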
-
Patent number: 11604997
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network. The policy neural network is used to select actions to be performed by an agent that interacts with an environment by receiving an observation characterizing a state of the environment and performing an action from a set of actions in response to the received observation. A trajectory is obtained from a replay memory, and a final update to current values of the policy network parameters is determined for each training observation in the trajectory. The final updates to the current values of the policy network parameters are determined from selected action updates and leave-one-out updates.
Type: Grant
Filed: June 11, 2018
Date of Patent: March 14, 2023
Assignee: DeepMind Technologies Limited
Inventors: Marc Gendron-Bellemare, Mohammad Gheshlaghi Azar, Audrunas Gruslys, Remi Munos
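A hedged sketch of a leave-one-out baseline for policy updates: each sample's baseline is the mean return of the other samples, so its update is centred without using its own return. This is a generic leave-one-out estimator shown for illustration; the patent's exact combination of selected-action and leave-one-out updates is more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def loo_policy_surrogate(log_probs, returns):
    """Surrogate objective whose gradient is a policy update with a leave-one-out baseline."""
    returns = np.asarray(returns, dtype=float)
    n = returns.size
    baselines = (returns.sum() - returns) / (n - 1)         # leave-one-out mean returns
    advantages = returns - baselines
    return np.mean(advantages * np.asarray(log_probs))      # differentiate wrt policy parameters

log_probs = rng.normal(size=16)         # log pi(a_t | s_t) for the selected actions
returns = rng.normal(size=16) + 1.0     # observed returns from a replay trajectory
print(loo_policy_surrogate(log_probs, returns))
```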
-
Patent number: 11593640
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from the neural network output for the time step as a system output for the time step; maintaining a current state of the external memory; determining, from the neural network output for the time step, memory state parameters for the time step; updating the current state of the external memory using the memory state parameters for the time step; reading data from the external memory in accordance with the updated state of the external memory; and combining the data read from the external memory with a system input for the next time step to generate the neural network input for the next time step.
Type: Grant
Filed: September 9, 2019
Date of Patent: February 28, 2023
Assignee: DeepMind Technologies Limited
Inventors: Edward Thomas Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Philip Blunsom
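A sketch of one time step of the external-memory loop in the abstract: the controller output is split into memory state parameters, the memory state is updated, data is read back, and the read data is combined with the next system input to form the next network input. The soft-write parameterisation and sizes are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def memory_step(memory, controller_output, system_input):
    """One external-memory time step: update, read, and build the next network input."""
    n_slots, width = memory.shape
    # Memory state parameters for the time step: a write vector and soft write/read weights.
    write_vec = controller_output[:width]
    write_w = softmax(controller_output[width:width + n_slots])
    read_w = softmax(controller_output[width + n_slots:width + 2 * n_slots])
    # Update the current state of the external memory with a soft write.
    memory = memory + np.outer(write_w, write_vec)
    # Read from the updated memory.
    read_data = read_w @ memory
    # Combine the read data with the next system input to form the next network input.
    next_input = np.concatenate([system_input, read_data])
    return memory, next_input

rng = np.random.default_rng(0)
memory = np.zeros((4, 8))                           # 4 memory slots of width 8
controller_output = rng.normal(size=8 + 4 + 4)      # write vector + write/read weights
memory, next_input = memory_step(memory, controller_output, system_input=rng.normal(size=3))
print(memory.shape, next_input.shape)
```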