Patents by Inventor Koray Kavukcuoglu
Koray Kavukcuoglu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10679126Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.Type: GrantFiled: July 15, 2019Date of Patent: June 9, 2020Assignee: DeepMind Technologies LimitedInventors: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets
-
Publication number: 20200151515Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.Type: ApplicationFiled: January 17, 2020Publication date: May 14, 2020Inventors: Fabio Viola, Piotr Wojciech Mirowski, Andrea Banino, Razvan Pascanu, Hubert Josef Soyer, Andrew James Ballard, Sudarshan Kumaran, Raia Thais Hadsell, Laurent Sifre, Rostislav Goroshin, Koray Kavukcuoglu, Misha Man Ray Denil
-
Publication number: 20200117992Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed training of reinforcement learning systems. One of the methods includes receiving, by a learner, current values of the parameters of the Q network from a parameter server, wherein each learner maintains a respective learner Q network replica and a respective target Q network replica; updating, by the learner, the parameters of the learner Q network replica maintained by the learner using the current values; selecting, by the learner, an experience tuple from a respective replay memory; computing, by the learner, a gradient from the experience tuple using the learner Q network replica maintained by the learner and the target Q network replica maintained by the learner; and providing, by the learner, the computed gradient to the parameter server.Type: ApplicationFiled: October 14, 2019Publication date: April 16, 2020Inventors: Praveen Deepak Srinivasan, Rory Fearon, Cagdas Alcicek, Arun Sarath Nair, Samuel Blackwell, Vedavyas Panneershelvam, Alessandro De Maria, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Mustafa Suleyman
-
Patent number: 10572776Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.Type: GrantFiled: May 3, 2019Date of Patent: February 25, 2020Assignee: DeepMind Technologies LimitedInventors: Fabio Viola, Piotr Wojciech Mirowski, Andrea Banino, Razvan Pascanu, Hubert Josef Soyer, Andrew James Ballard, Sudarshan Kumaran, Raia Thais Hadsell, Laurent Sifre, Rostislav Goroshin, Koray Kavukcuoglu, Misha Man Ray Denil
-
Publication number: 20190340509Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.Type: ApplicationFiled: July 15, 2019Publication date: November 7, 2019Inventors: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets
-
Publication number: 20190332938Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model. In one aspect, a method includes receiving training data for training the machine learning model on a plurality of tasks, where each task includes multiple batches of training data. A task is selected in accordance with a current task selection policy. A batch of training data is selected from the selected task. The machine learning model is trained on the selected batch of training data to determine updated values of the model parameters. A learning progress measure that represents a progress of the training of the machine learning model as a result of training the machine learning model on the selected batch of training data is determined. The current task selection policy is updated using the learning progress measure.Type: ApplicationFiled: July 10, 2019Publication date: October 31, 2019Inventors: Marc Gendron-Bellemare, Jacob Lee Menick, Alexander Benjamin Graves, Koray Kavukcuoglu, Remi Munos
-
Patent number: 10445641Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed training of reinforcement learning systems. One of the methods includes receiving, by a learner, current values of the parameters of the Q network from a parameter server, wherein each learner maintains a respective learner Q network replica and a respective target Q network replica; updating, by the learner, the parameters of the learner Q network replica maintained by the learner using the current values; selecting, by the learner, an experience tuple from a respective replay memory; computing, by the learner, a gradient from the experience tuple using the learner Q network replica maintained by the learner and the target Q network replica maintained by the learner; and providing, by the learner, the computed gradient to the parameter server.Type: GrantFiled: February 4, 2016Date of Patent: October 15, 2019Assignee: Deepmind Technologies LimitedInventors: Praveen Deepak Srinivasan, Rory Fearon, Cagdas Alcicek, Arun Sarath Nair, Samuel Blackwell, Vedavyas Panneershelvam, Alessandro De Maria, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Mustafa Suleyman
-
Publication number: 20190266449Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.Type: ApplicationFiled: May 3, 2019Publication date: August 29, 2019Inventors: Fabio Viola, Piotr Wojciech Mirowski, Andrea Banino, Razvan Pascanu, Hubert Josef Soyer, Andrew James Ballard, Sudarshan Kumaran, Raia Thais Hadsell, Laurent Sifre, Rostislav Goroshin, Koray Kavukcuoglu, Misha Man Ray Denil
-
Publication number: 20190258938Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. The method includes: training an action selection policy neural network, and during the training of the action selection neural network, training one or more auxiliary control neural networks and a reward prediction neural network. Each of the auxiliary control neural networks is configured to receive a respective intermediate output generated by the action selection policy neural network and generate a policy output for a corresponding auxiliary control task. The reward prediction neural network is configured to receive one or more intermediate outputs generated by the action selection policy neural network and generate a corresponding predicted reward.Type: ApplicationFiled: May 3, 2019Publication date: August 22, 2019Inventors: Volodymyr Mnih, Wojciech Czarnecki, Maxwell Elliot Jaderberg, Tom Schaul, David Silver, Koray Kavukcuoglu
-
Publication number: 20190258929Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.Type: ApplicationFiled: May 3, 2019Publication date: August 22, 2019Inventors: Volodymyr Mnih, Adria Puigdomenech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
-
Patent number: 10346741Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.Type: GrantFiled: May 11, 2018Date of Patent: July 9, 2019Assignee: DeepMind Technologies LimitedInventors: Volodymyr Mnih, Adrià Puigdomènech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
-
Patent number: 10223617Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using recurrent attention. One of the methods includes determining a location in the first image; extracting a glimpse from the first image using the location; generating a glimpse representation of the extracted glimpse; processing the glimpse representation using a recurrent neural network to update a current internal state of the recurrent neural network to generate a new internal state; processing the new internal state to select a location in a next image in the image sequence after the first image; and processing the new internal state to select an action from a predetermined set of possible actions.Type: GrantFiled: June 4, 2015Date of Patent: March 5, 2019Assignee: DeepMind Technologies LimitedInventors: Volodymyr Mnih, Koray Kavukcuoglu
-
Publication number: 20180330185Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate spatial transformation parameters for the spatial transformation, and sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map.Type: ApplicationFiled: July 20, 2018Publication date: November 15, 2018Inventors: Maxwell Elliot Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu
-
Publication number: 20180260708Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.Type: ApplicationFiled: May 11, 2018Publication date: September 13, 2018Inventors: Volodymyr Mnih, Adrià Puigdomènech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
-
Patent number: 10032089Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate spatial transformation parameters for the spatial transformation, and sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map.Type: GrantFiled: June 6, 2016Date of Patent: July 24, 2018Assignee: DeepMind Technologies LimitedInventors: Maxwell Elliot Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu
-
Publication number: 20170337464Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.Type: ApplicationFiled: December 30, 2016Publication date: November 23, 2017Inventors: Neil Charles Rabinowitz, Guillaume Desjardins, Andrei-Alexandru Rusu, Koray Kavukcuoglu, Raia Thais Hadsell, Razvan Pascanu, James Kirkpatrick, Hubert Josef Soyer
-
Publication number: 20170278018Abstract: We describe a method of reinforcement learning for a subject system having multiple states and actions to move from one state to the next. Training data is generated by operating on the system with a succession of actions and used to train a second neural network. Target values for training the second neural network are derived from a first neural network which is generated by copying weights of the second neural network at intervals.Type: ApplicationFiled: June 9, 2017Publication date: September 28, 2017Inventors: Volodymyr Mnih, Koray Kavukcuoglu
-
Patent number: 9679258Abstract: We describe a method of reinforcement learning for a subject system having multiple states and actions to move from one state to the next. Training data is generated by operating on the system with a succession of actions and used to train a second neural network. Target values for training the second neural network are derived from a first neural network which is generated by copying weights of the second neural network at intervals.Type: GrantFiled: December 5, 2013Date of Patent: June 13, 2017Assignee: Google Inc.Inventors: Volodymyr Mnih, Koray Kavukcuoglu
-
Publication number: 20170140270Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.Type: ApplicationFiled: November 11, 2016Publication date: May 18, 2017Inventors: Volodymyr Mnih, Adrià Puigdomènech Badia, Alexander Benjamin Graves, Timothy James Alexander Harley, David Silver, Koray Kavukcuoglu
-
Publication number: 20160358038Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate spatial transformation parameters for the spatial transformation, and sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map.Type: ApplicationFiled: June 6, 2016Publication date: December 8, 2016Inventors: Maxwell Elliot Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu