Patents Assigned to DeepMind Technologies Limited
-
Patent number: 11727281
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent that interacts with an environment. In one aspect, a system comprises: an action selection subsystem that selects actions to be performed by the agent using an action selection policy generated using an action selection neural network; a reward subsystem that is configured to: receive an observation characterizing a current state of the environment and an observation characterizing a goal state of the environment; generate a reward using an embedded representation of the observation characterizing the current state of the environment and an embedded representation of the observation characterizing the goal state of the environment; and a training subsystem that is configured to train the action selection neural network based on the rewards generated by the reward subsystem using reinforcement learning techniques.
Type: Grant
Filed: January 27, 2022
Date of Patent: August 15, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Constantine Patrick Warde-Farley, Volodymyr Mnih
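A minimal sketch of the reward computation this abstract describes, under assumptions not in the source: a fixed linear map stands in for the learned embedding network, and cosine similarity between the embedded current and goal observations stands in for the learned reward:

```python
import numpy as np

def embed(obs, W):
    # stand-in for the embedding network: a fixed linear map
    return W @ obs

def goal_reward(current_obs, goal_obs, W):
    # reward is the cosine similarity between the embedded current
    # observation and the embedded goal observation
    e_cur, e_goal = embed(current_obs, W), embed(goal_obs, W)
    return float(e_cur @ e_goal /
                 (np.linalg.norm(e_cur) * np.linalg.norm(e_goal)))
```

Reaching the goal state drives the reward toward 1, giving the training subsystem a dense signal without a hand-written reward function.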
-
Patent number: 11720781
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for interleaving matrix operations of a gated activation unit. One of the methods includes receiving a plurality of weight matrices of a gated activation unit of the neural network, the gated activation unit having two or more layers, each layer defining operations comprising: (i) a matrix operation between a weight matrix for the layer and concatenated input vectors and (ii) a nonlinear activation operation using a result of the matrix operation. Rows of the plurality of weight matrices are interleaved by assigning groups of corresponding rows to respective thread blocks, each thread block being a computation unit for execution by an independent processing unit of a plurality of independent processing units of a parallel processing device.
Type: Grant
Filed: October 20, 2017
Date of Patent: August 8, 2023
Assignee: DeepMind Technologies Limited
Inventor: Erich Konrad Elsen
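A toy sketch of the row interleaving described above. The helper name and block size are illustrative, not from the patent; the idea shown is only the grouping of corresponding row ranges from each gate's weight matrix so that one thread block could compute every gate for those rows:

```python
import numpy as np

def interleave_rows(matrices, rows_per_block):
    # group corresponding rows of each gate's weight matrix together so
    # that a single thread block can compute every gate for those rows
    n_rows = matrices[0].shape[0]
    blocks = []
    for start in range(0, n_rows, rows_per_block):
        blocks.append(np.vstack([m[start:start + rows_per_block]
                                 for m in matrices]))
    return blocks
```

With the rows co-located, each processing unit reads the shared input vector once and produces the partial results for all gates of its row group.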
-
Patent number: 11720796
Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.
Type: Grant
Filed: April 23, 2020
Date of Patent: August 8, 2023
Assignee: DeepMind Technologies Limited
Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adrià Puigdomènech Badia
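A sketch of the per-action Q-value lookup in the abstract, assuming (hypothetically) Euclidean distance as the distance measure and inverse-distance weighting to combine the returns of the p nearest keys:

```python
import numpy as np

def q_from_memory(keys, returns, query_key, p):
    # keys: (N, d) stored key embeddings for one action;
    # returns: (N,) return estimates those keys map to
    dists = np.linalg.norm(keys - query_key, axis=1)
    nearest = np.argsort(dists)[:p]
    # inverse-distance weighting of the p nearest stored returns
    w = 1.0 / (dists[nearest] + 1e-3)
    return float(np.sum(w * returns[nearest]) / np.sum(w))
```

Running this once per action and taking the argmax over the resulting Q values gives the action-selection step the abstract ends with.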
-
Patent number: 11715009
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network including a first subnetwork followed by a second subnetwork on training inputs by optimizing an objective function. In one aspect, a method includes processing a training input using the neural network to generate a training model output, including processing a subnetwork input for the training input using the first subnetwork to generate a subnetwork activation for the training input in accordance with current values of parameters of the first subnetwork, and providing the subnetwork activation as input to the second subnetwork; determining a synthetic gradient of the objective function for the first subnetwork by processing the subnetwork activation using a synthetic gradient model in accordance with current values of parameters of the synthetic gradient model; and updating the current values of the parameters of the first subnetwork using the synthetic gradient.
Type: Grant
Filed: May 19, 2017
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Oriol Vinyals, Alexander Benjamin Graves, Wojciech Czarnecki, Koray Kavukcuoglu, Simon Osindero, Maxwell Elliot Jaderberg
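A heavily simplified sketch of the decoupled update above, with hypothetical shapes and both the first subnetwork and the synthetic gradient model reduced to linear maps. The synthetic gradient model predicts the gradient of the objective with respect to the activation, so the first subnetwork can update before the true backward pass arrives:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))          # first subnetwork (linear, for illustration)
M = rng.normal(size=(4, 4)) * 0.01    # synthetic gradient model (linear, for illustration)

def first_subnetwork(x):
    return W1 @ x

def synthetic_grad(activation):
    # predicts dLoss/dActivation so W1 can update without waiting
    # for the true backward pass through the second subnetwork
    return M @ activation

x = rng.normal(size=8)
h = first_subnetwork(x)
# decoupled update: chain rule through h = W1 @ x using the predicted gradient
W1_updated = W1 - 0.1 * np.outer(synthetic_grad(h), x)
```

In the full method the synthetic gradient model itself is trained toward the true gradients once they become available.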
-
Patent number: 11714994
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning from delayed outcomes using neural networks. One of the methods includes receiving an input observation; generating, from the input observation, an output label distribution over possible labels for the input observation at a final time, comprising: processing the input observation using a first neural network configured to process the input observation to generate a distribution over possible values for an intermediate indicator at a first time earlier than the final time; generating, from the distribution, an input value for the intermediate indicator; and processing the input value for the intermediate indicator using a second neural network configured to process the input value for the intermediate indicator to determine the output label distribution over possible values for the input observation at the final time; and providing an output derived from the output label distribution.
Type: Grant
Filed: March 11, 2019
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Huiyi Hu, Ray Jiang, Timothy Arthur Mann, Sven Adrian Gowal, Balaji Lakshminarayanan, András György
-
Patent number: 11714993
Abstract: Methods, systems, and apparatus for classifying a new example using a comparison set of comparison examples. One method includes maintaining a comparison set, the comparison set including comparison examples and a respective label vector for each of the comparison examples, each label vector including a respective score for each label in a predetermined set of labels; receiving a new example; determining a respective attention weight for each comparison example by applying a neural network attention mechanism to the new example and to the comparison examples; and generating a respective label score for each label in the predetermined set of labels from, for each of the comparison examples, the respective attention weight for the comparison example and the respective label vector for the comparison example, in which the respective label score for each of the labels represents a likelihood that the label is a correct label for the new example.
Type: Grant
Filed: April 6, 2021
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Charles Blundell, Oriol Vinyals
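A minimal sketch of the attention-weighted labelling above. The learned attention mechanism is replaced, purely for illustration, by a softmax over negative squared distances; the label-score step (attention weights times label vectors) follows the abstract directly:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def label_scores(new_example, comparison_examples, label_vectors):
    # attention weights: softmax over negative squared distances to each
    # comparison example (a stand-in for the learned attention mechanism)
    sims = -np.sum((comparison_examples - new_example) ** 2, axis=1)
    attention = softmax(sims)
    # score for each label: attention-weighted sum of the label vectors
    return attention @ label_vectors
```

Because the attention weights sum to one and the label vectors are distributions, the output is itself a distribution over labels for the new example.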
-
Patent number: 11714996
Abstract: A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also described is a neural network system, implemented by one or more computers, for representing a space of probabilistic motor primitives. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on one or more of the latent variables and a state.
Type: Grant
Filed: July 25, 2022
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Leonard Hasenclever, Vu Pham, Joshua Merel, Alexandre Galashov
-
Patent number: 11712799
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.
Type: Grant
Filed: September 14, 2020
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Serkan Cabi, Ziyu Wang, Alexander Novikov, Ksenia Konyushkova, Sergio Gomez Colmenarejo, Scott Ellison Reed, Misha Man Ray Denil, Jonathan Karl Scholz, Oleg O. Sushkov, Rae Chan Jeong, David Barker, David Budden, Mel Vecerik, Yusuf Aytar, Joao Ferdinando Gomes de Freitas
-
Patent number: 11714990
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.
Type: Grant
Filed: May 22, 2020
Date of Patent: August 1, 2023
Assignee: DeepMind Technologies Limited
Inventors: Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Zhaohan Guo, Bilal Piot, Steven James Kapturowski, Olivier Tieleman, Charles Blundell
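A sketch of the reward combination above, under assumptions not stated in the abstract: the exploration reward is taken to be a novelty bonus (mean distance from the current observation embedding to its k nearest neighbours in this episode's memory), and the exploration importance factor enters as a scalar weight on that bonus:

```python
import numpy as np

def exploration_reward(embedding, episode_memory, k=3):
    # novelty bonus: mean distance from the current embedding to its
    # k nearest neighbours among embeddings seen earlier in the episode
    if len(episode_memory) == 0:
        return 1.0
    d = np.sort(np.linalg.norm(np.asarray(episode_memory) - embedding, axis=1))
    return float(d[:k].mean())

def overall_reward(extrinsic, embedding, episode_memory, beta):
    # beta plays the role of the exploration importance factor:
    # larger beta weights the exploration reward more heavily
    return extrinsic + beta * exploration_reward(embedding, episode_memory)
```

Conditioning the action selection network on beta, as the abstract describes, lets one network represent a whole family of policies from purely exploitative (beta near zero) to strongly exploratory.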
-
Patent number: 11704541
Abstract: There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.
Type: Grant
Filed: October 29, 2018
Date of Patent: July 18, 2023
Assignee: DeepMind Technologies Limited
Inventors: Yujia Li, Christopher James Dyer, Oriol Vinyals
-
Patent number: 11693627
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using neural networks having contiguous sparsity patterns. One of the methods includes storing a first parameter matrix of a neural network having a contiguous sparsity pattern in storage associated with a computing device. The computing device performs an inference pass of the neural network to generate an output vector, including reading, from the storage associated with the computing device, one or more activation values from the input vector, reading, from the storage associated with the computing device, a block of non-zero parameter values, and multiplying each of the one or more activation values by one or more of the block of non-zero parameter values.
Type: Grant
Filed: February 11, 2019
Date of Patent: July 4, 2023
Assignee: DeepMind Technologies Limited
Inventors: Karen Simonyan, Nal Emmerich Kalchbrenner, Erich Konrad Elsen
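A simplified sketch of the inference step above: a matrix with contiguous sparsity is stored as a list of dense non-zero blocks (a hypothetical representation, chosen for illustration), so only the activations each block touches need to be read and multiplied:

```python
import numpy as np

def block_sparse_matvec(blocks, x):
    """blocks: list of (row_start, col_start, dense_block) describing the
    non-zero blocks of a parameter matrix with contiguous sparsity."""
    n_out = max(r + b.shape[0] for r, _, b in blocks)
    y = np.zeros(n_out)
    for r, c, b in blocks:
        # read only the activations this block touches, multiply, accumulate
        y[r:r + b.shape[0]] += b @ x[c:c + b.shape[1]]
    return y
```

Because each block's parameters and the activations it needs are contiguous in memory, the loads are sequential rather than scattered, which is the practical benefit of this sparsity pattern.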
-
Patent number: 11676035
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. The neural network has a plurality of differentiable weights and a plurality of non-differentiable weights. One of the methods includes determining trained values of the plurality of differentiable weights and the non-differentiable weights by repeatedly performing operations that include determining an update to the current values of the plurality of differentiable weights using a machine learning gradient-based training technique and determining, using an evolution strategies (ES) technique, an update to the current values of a plurality of distribution parameters.
Type: Grant
Filed: January 23, 2020
Date of Patent: June 13, 2023
Assignee: DeepMind Technologies Limited
Inventors: Karel Lenc, Karen Simonyan, Tom Schaul, Erich Konrad Elsen
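A sketch of the ES side of the hybrid update above, i.e. the piece that handles weights a gradient cannot reach. All hyperparameters are illustrative; the core idea is estimating a search gradient from fitness evaluations at Gaussian perturbations of the parameters:

```python
import numpy as np

def es_update(theta, fitness_fn, sigma=0.1, lr=0.01, n=200, seed=0):
    # evolution strategies: estimate the gradient of expected fitness from
    # fitness evaluations at Gaussian perturbations of the parameters
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=(n, theta.size))
    f = np.array([fitness_fn(theta + sigma * e) for e in eps])
    # fitness-weighted average of the perturbation directions
    grad = (f - f.mean()) @ eps / (n * sigma)
    return theta + lr * grad
```

Because the estimate uses only fitness evaluations, it applies equally to non-differentiable weights, while the differentiable weights keep their ordinary gradient-based update.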
-
Patent number: 11675855
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for re-ranking a collection of documents according to a first metric and subject to a constraint on a function of one or more second metrics. One of the methods includes: obtaining, for each document in the first collection of documents, a respective first metric value corresponding to the first metric and respective one or more second metric values corresponding to the one or more second metrics; re-ranking the first collection of documents, comprising: determining the constraint on the function of one or more second metrics by computing a first threshold value using a variable threshold function that takes as input second metric values for the documents in the first collection of documents; and determining the re-ranking for the first collection of documents by solving a constrained optimization for the first metric constrained by the first threshold value.
Type: Grant
Filed: November 18, 2020
Date of Patent: June 13, 2023
Assignee: DeepMind Technologies Limited
Inventors: Anton Zhernov, Krishnamurthy Dvijotham, Xiaohong Gong, Amogh S. Asgekar
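A toy sketch of the two steps above. A simple greedy rule stands in for the constrained optimization (the patent's actual solver is not specified here): the threshold is computed from the collection's second-metric values by a caller-supplied variable threshold function, then documents satisfying the constraint are ranked first by the first metric:

```python
def rerank(docs, threshold_fn):
    # docs: list of (first_metric, second_metric) pairs;
    # threshold_fn is the variable threshold function over second metrics
    threshold = threshold_fn([second for _, second in docs])
    # greedy stand-in for the constrained optimisation: documents meeting
    # the constraint rank first, each group ordered by the first metric
    ok = sorted((d for d in docs if d[1] >= threshold), key=lambda d: -d[0])
    rest = sorted((d for d in docs if d[1] < threshold), key=lambda d: -d[0])
    return ok + rest
```

Because the threshold is a function of the collection itself rather than a fixed constant, the strictness of the constraint adapts to each collection being re-ranked.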
-
Patent number: 11663441
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network, wherein the action selection policy neural network is configured to process an observation characterizing a state of an environment to generate an action selection policy output, wherein the action selection policy output is used to select an action to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining an observation characterizing a state of the environment subsequent to the agent performing a selected action; generating a latent representation of the observation; processing the latent representation of the observation using a discriminator neural network to generate an imitation score; determining a reward from the imitation score; and adjusting the current values of the action selection policy neural network parameters based on the reward using a reinforcement learning training technique.
Type: Grant
Filed: September 27, 2019
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: Scott Ellison Reed, Yusuf Aytar, Ziyu Wang, Tom Paine, Sergio Gomez Colmenarejo, David Budden, Tobias Pfaff, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Alexander Novikov
-
Patent number: 11663475
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
Type: Grant
Filed: September 15, 2022
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
-
Patent number: 11662210
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment. The action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.
Type: Grant
Filed: May 18, 2022
Date of Patent: May 30, 2023
Assignee: DeepMind Technologies Limited
Inventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
-
Patent number: 11651208
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. A reinforcement learning neural network selects actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The reinforcement learning neural network has at least one input to receive an input observation characterizing a state of the environment and at least one output for determining an action to be performed by the agent in response to the input observation. The system includes a reward function network coupled to the reinforcement learning neural network. The reward function network has an input to receive reward data characterizing a reward provided by one or more states of the environment and is configured to determine a reward function to provide one or more target values for training the reinforcement learning neural network.
Type: Grant
Filed: May 22, 2018
Date of Patent: May 16, 2023
Assignee: DeepMind Technologies Limited
Inventors: Zhongwen Xu, Hado Phillip van Hasselt, Joseph Varughese Modayil, Andre da Motta Salles Barreto, David Silver
-
Patent number: 11636347
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining a graph of nodes and edges that represents an interaction history of the agent with the environment; generating an encoded representation of the graph representing the interaction history of the agent with the environment; processing an input based on the encoded representation of the graph using an action selection neural network, in accordance with current values of action selection neural network parameters, to generate an action selection output; and selecting an action from a plurality of possible actions to be performed by the agent using the action selection output generated by the action selection neural network.
Type: Grant
Filed: January 22, 2020
Date of Patent: April 25, 2023
Assignee: DeepMind Technologies Limited
Inventors: Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli
-
Patent number: 11636283
Abstract: A variational autoencoder (VAE) neural network system, comprising an encoder neural network to encode an input data item to define a posterior distribution for a set of latent variables, and a decoder neural network to generate an output data item representing values of a set of latent variables sampled from the posterior distribution. The system is configured for training with an objective function including a term dependent on a difference between the posterior distribution and a prior distribution. The prior and posterior distributions are arranged so that they cannot be matched to one another. The VAE system may be used for compressing and decompressing data.
Type: Grant
Filed: June 1, 2020
Date of Patent: April 25, 2023
Assignee: DeepMind Technologies Limited
Inventors: Benjamin Poole, Aaron Gerard Antonius van den Oord, Ali Razavi-Nematollahi, Oriol Vinyals
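The "term dependent on a difference between the posterior distribution and a prior distribution" in a VAE objective is conventionally a KL divergence. As a sketch, assuming (for illustration only) diagonal-Gaussian posterior and prior:

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    # KL(q || p) between diagonal Gaussians: the objective term that
    # measures the mismatch between posterior q and prior p
    return 0.5 * float(np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0))
```

The abstract's point is that the two distribution families are deliberately chosen so this term cannot be driven to zero, which prevents the posterior from collapsing onto the prior.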
-
Patent number: 11627165
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network having a plurality of policy parameters and used to select actions to be performed by an agent to control the agent to perform a particular task while interacting with one or more other agents in an environment. In one aspect, the method includes: maintaining data specifying a pool of candidate action selection policies; maintaining data specifying a respective matchmaking policy; and training the policy neural network using a reinforcement learning technique to update the policy parameters. The policy parameters define policies to be used in controlling the agent to perform the particular task.
Type: Grant
Filed: January 24, 2020
Date of Patent: April 11, 2023
Assignee: DeepMind Technologies Limited
Inventors: David Silver, Oriol Vinyals, Maxwell Elliot Jaderberg