Patents by Inventor Benigno URIA-MARTINEZ
Benigno URIA-MARTINEZ has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11720796Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.Type: GrantFiled: April 23, 2020Date of Patent: August 8, 2023Assignee: DeepMind Technologies LimitedInventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adrià Puigdomènech Badia
-
Patent number: 11662210Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.Type: GrantFiled: May 18, 2022Date of Patent: May 30, 2023Assignee: DeepMind Technologies LimitedInventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
-
Publication number: 20230124261Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a spatial embedding neural network that is configured to process data characterizing motion of an agent that is interacting with an environment to generate spatial embeddings. In one aspect, a method comprises: processing data characterizing the motion of the agent in the environment at the current time step using a spatial embedding neural network to generate a current spatial embedding for the current time step; determining a predicted score and a target score for each of a plurality of slots in an external memory, wherein each slot stores: (i) a representation of an observation characterizing a state of the environment, and (ii) a spatial embedding; and determining an update to values of the set of spatial embedding neural network parameters based on an error between the predicted scores and the target scores.Type: ApplicationFiled: May 12, 2021Publication date: April 20, 2023Inventors: Benigno Uria-Martínez, Andrea Banino, Borja Ibarz Gabardos, Vinicius Zambaldi, Charles Blundell
-
Publication number: 20220276056Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.Type: ApplicationFiled: May 18, 2022Publication date: September 1, 2022Inventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
-
Patent number: 11365972Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.Type: GrantFiled: February 19, 2020Date of Patent: June 21, 2022Assignee: DeepMind Technologies LimitedInventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
-
Patent number: 11010663Abstract: Systems, methods, and apparatus, including computer programs encoded on a computer storage medium, related to associative long short-term memory (LSTM) neural network layers configured to maintain N copies of an internal state for the associative LSTM layer, N being an integer greater than one. In one aspect, a system includes a recurrent neural network including an associative LSTM layer, wherein the associative LSTM layer is configured to, for each time step, receive a layer input, update each of the N copies of the internal state using the layer input for the time step and a layer output generated by the associative LSTM layer for a preceding time step, and generate a layer output for the time step using the N updated copies of the internal state.Type: GrantFiled: December 30, 2016Date of Patent: May 18, 2021Assignee: DeepMind Technologies LimitedInventors: Ivo Danihelka, Nal Emmerich Kalchbrenner, Gregory Duncan Wayne, Benigno Uría-Martínez, Alexander Benjamin Graves
-
Publication number: 20200265317Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.Type: ApplicationFiled: April 23, 2020Publication date: August 20, 2020Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adrià Puigdomènech Badia
-
Publication number: 20200191574Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.Type: ApplicationFiled: February 19, 2020Publication date: June 18, 2020Inventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martínez
-
Patent number: 10664753Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.Type: GrantFiled: June 19, 2019Date of Patent: May 26, 2020Assignee: DeepMind Technologies LimitedInventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adria Puigdomenech Badia
-
Patent number: 10605608Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.Type: GrantFiled: May 9, 2019Date of Patent: March 31, 2020Assignee: DeepMind Technologies LimitedInventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martinez
-
Publication number: 20190346272Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.Type: ApplicationFiled: May 9, 2019Publication date: November 14, 2019Inventors: Andrea Banino, Sudarshan Kumaran, Raia Thais Hadsell, Benigno Uria-Martinez
-
Publication number: 20190303764Abstract: A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.Type: ApplicationFiled: June 19, 2019Publication date: October 3, 2019Inventors: Benigno Uria-Martínez, Alexander Pritzel, Charles Blundell, Adria Puigdomenech Badia
-
Publication number: 20190205757Abstract: Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One method includes maintaining return data that maps each observation-action pair to a respective return, the action in each observation-action pair being an action that was performed by the agent in response to the observation in the observation-action pair and the respective return mapped to by each of the observation-action pairs being a return that resulted from the agent performing the action in the observation-action pair; receiving a current observation; determining whether the current observation matches any observation identified in the return data; and in response to determining that the current observation matches a first observation identified in the return data, selecting an action to be performed by the agent using the returns mapped to by observation-action pairs in the return data that include the first observation.Type: ApplicationFiled: May 18, 2017Publication date: July 4, 2019Inventors: Charles BLUNDELL, Benigno URIA-MARTINEZ
-
Publication number: 20170228642Abstract: Systems, methods, and apparatus, including computer programs encoded on a computer storage medium, related to associative long short-term memory (LSTM) neural network layers configured to maintain N copies of an internal state for the associative LSTM layer, N being an integer greater than one. In one aspect, a system includes a recurrent neural network including an associative LSTM layer, wherein the associative LSTM layer is configured to, for each time step, receive a layer input, update each of the N copies of the internal state using the layer input for the time step and a layer output generated by the associative LSTM layer for a preceding time step, and generate a layer output for the time step using the N updated copies of the internal state.Type: ApplicationFiled: December 30, 2016Publication date: August 10, 2017Inventors: Ivo Danihelka, Nal Emmerich Kalchbrenner, Gregory Duncan Wayne, Benigno Uría-Martínez, Alexander Benjamin Graves