Patents by Inventor Kurt Hartwig Graepel

Kurt Hartwig Graepel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240046112
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating control policies for controlling agents in an environment. One of the methods includes, at each of a plurality of iterations: obtaining a current joint control policy for a plurality of agents, the current joint control policy specifying a respective current control policy for each agent; and updating the current joint control policy, comprising, for each agent: generating a respective reward estimate for each of a plurality of alternate control policies that is an estimate of a reward received by the agent if the agent is controlled using the alternate control policy while the other agents are controlled using the respective current control policies; computing a best response for the agent from the respective reward estimates; and updating the respective current control policy for the agent using the best response for the agent.
    Type: Application
    Filed: February 7, 2022
    Publication date: February 8, 2024
    Inventors: Luke Christopher Marris, Paul Fernand Michel Muller, Marc Lanctot, Thore Kurt Hartwig Graepel
  • Publication number: 20220374683
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting an optimal feature point in a continuous domain for a group of agents. A computer-implemented system obtains, for each of a plurality of agents, respective training data that comprises a respective utility score for each of a plurality of discrete points in the continuous domain. The system trains, for each of the plurality of agents and on the respective training data for the agents, a respective neural network that is configured to receive an input comprising a point in the continuous domain and to generate as output a predicted utility score for the agent at the point.
    Type: Application
    Filed: February 9, 2022
    Publication date: November 24, 2022
    Inventors: Thomas Edward Eccles, Ian Michael Gemp, János Kramár, Marta Garnelo Abellanas, Dan Rosenbaum, Yoram Bachrach, Thore Kurt Hartwig Graepel
  • Publication number: 20220261635
    Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media, for training a policy neural network by repeatedly updating the policy neural network at each of a plurality of training iterations. One of the methods includes generating training data for the training iteration by controlling the agent in accordance with an improved policy that selects actions in response to input state representations. A best response computation is performed using (i) a candidate policy generated from respective policy neural networks as of one or more preceding iterations and (ii) a candidate value neural network. The candidate value neural network is configured to generate a value output that is an estimate of a value of the environment being in the state characterized by a state representation to complete a particular task. The policy neural network is updated by training the policy neural network on the training data.
    Type: Application
    Filed: January 7, 2022
    Publication date: August 18, 2022
    Inventors: Thomas William Anthony, Thomas Edward Eccles, Andrea Tacchetti, János Kramár, Ian Michael Gemp, Thomas Chalmers Hudson, Nicolas Pierre Mickaël Porcel, Marc Lanctot, Julien Perolat, Richard Everett, Thore Kurt Hartwig Graepel, Yoram Bachrach
  • Patent number: 11250475
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for efficiently allocating resources among participants. Methods can include receiving valuation data specifying, for each of a plurality of entities, a respective valuation for each of a plurality of resource subsets, each resource subset comprising a different combination of one or more resources of a plurality of resources. After receiving the valuation data, the methods can include assigning each resource in the plurality of resources to a respective entity of the plurality of entities based on the valuations and generating, for each particular entity, a respective input representation that is derived from valuations of every other entity in the plurality of entities other than the particular entity. The input representation for each particular entity is processed using a neural network to generate a rule for the particular entity and a payment based on the rule output for the entities.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: February 15, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Andrea Tacchetti, Daniel Joseph Strouse, Marta Garnelo Abellanas, Thore Kurt Hartwig Graepel, Yoram Bachrach
  • Publication number: 20220005079
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for efficiently allocating resources among participants. Methods can include receiving valuation data specifying, for each of a plurality of entities, a respective valuation for each of a plurality of resource subsets, each resource subset comprising a different combination of one or more resources of a plurality of resources. After receiving the valuation data, the methods can include assigning each resource in the plurality of resources to a respective entity of the plurality of entities based on the valuations and generating, for each particular entity, a respective input representation that is derived from valuations of every other entity in the plurality of entities other than the particular entity. The input representation for each particular entity is processed using a neural network to generate a rule for the particular entity and a payment based on the rule output for the entities.
    Type: Application
    Filed: July 1, 2020
    Publication date: January 6, 2022
    Inventors: Andrea Tacchetti, Daniel Joseph Strouse, Marta Garnelo Abellanas, Thore Kurt Hartwig Graepel, Yoram Bachrach
  • Patent number: 10867242
    Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media, for training a value neural network that is configured to receive an observation characterizing a state of an environment being interacted with by an agent and to process the observation in accordance with parameters of the value neural network to generate a value score. One of the systems performs operations that include training a supervised learning policy neural network; initializing initial values of parameters of a reinforcement learning policy neural network having a same architecture as the supervised learning policy network to the trained values of the parameters of the supervised learning policy neural network; training the reinforcement learning policy neural network on second training data; and training the value neural network to generate a value score for the state of the environment that represents a predicted long-term reward resulting from the environment being in the state.
    Type: Grant
    Filed: September 29, 2016
    Date of Patent: December 15, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Thore Kurt Hartwig Graepel, Shih-Chieh Huang, David Silver, Arthur Clement Guez, Laurent Sifre, Ilya Sutskever, Christopher Maddison
  • Patent number: 10685062
    Abstract: New methods of relational database management are described, for example, to enable completion and checking of data in relational databases, including completion of missing foreign key values, to facilitate understanding of data in relational databases, to highlight data that it would be useful to add to a relational database and for other applications. In various embodiments, the schema of a relational database is used to automatically create a probabilistic graphical model that has a structure related to the schema. For example, nodes representing individual rows are linked to rows of other tables according to the database schema. In examples, data in the relational database is used to carry out inference using inference algorithms derived from the probabilistic graphical model. In various examples, inference results, comprising probability distributions each for an individual table cell, are used to fill missing data, highlight errors, and for other purposes.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: June 16, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sameer Singh, Thore Kurt Hartwig Graepel, Lucas Julien Bordeaux, Andrew Donald Gordon
  • Publication number: 20180032863
    Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media, for training a value neural network that is configured to receive an observation characterizing a state of an environment being interacted with by an agent and to process the observation in accordance with parameters of the value neural network to generate a value score. One of the systems performs operations that include training a supervised learning policy neural network; initializing initial values of parameters of a reinforcement learning policy neural network having a same architecture as the supervised learning policy network to the trained values of the parameters of the supervised learning policy neural network; training the reinforcement learning policy neural network on second training data; and training the value neural network to generate a value score for the state of the environment that represents a predicted long-term reward resulting from the environment being in the state.
    Type: Application
    Filed: September 29, 2016
    Publication date: February 1, 2018
    Inventors: Thore Kurt Hartwig Graepel, Shih-Chieh Huang, David Silver, Arthur Clement Guez, Laurent Sifre, Ilya Sutskever, Christopher Maddison
  • Publication number: 20180032864
    Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media, for training a value neural network that is configured to receive an observation characterizing a state of an environment being interacted with by an agent and to process the observation in accordance with parameters of the value neural network to generate a value score. One of the systems performs operations that include training a supervised learning policy neural network; initializing initial values of parameters of a reinforcement learning policy neural network having a same architecture as the supervised learning policy network to the trained values of the parameters of the supervised learning policy neural network; training the reinforcement learning policy neural network on second training data; and training the value neural network to generate a value score for the state of the environment that represents a predicted long-term reward resulting from the environment being in the state.
    Type: Application
    Filed: September 29, 2016
    Publication date: February 1, 2018
    Inventors: Thore Kurt Hartwig Graepel, Shih-Chieh Huang, David Silver, Arthur Clement Guez, Laurent Sifre, Ilya Sutskever, Christopher Maddison
  • Patent number: 9418086
    Abstract: Database access is described, for example, where data in a database is accessed by an inference engine. In various examples, the inference engine executes inference algorithms to access data from the database and carry out inference using the data. In examples the inference algorithms are compiled from a schema of the database which is annotated with expressions of probability distributions over data in the database. In various examples the schema of the database is modified by adding one or more latent columns or latent tables to the schema for storing data to be inferred by the inference engine. In examples the expressions are compositional so, for example, an expression annotating a column of a database table may be used as part of an expression annotating another column of the database.
    Type: Grant
    Filed: August 20, 2013
    Date of Patent: August 16, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Andrew Donald Gordon, Thore Kurt Hartwig Graepel, Nicolas Philippe Marie Rolland, Eric Johannes Borgstrom, Claudio Vittorio Russo
  • Publication number: 20150058337
    Abstract: Database access is described, for example, where data in a database is accessed by an inference engine. In various examples, the inference engine executes inference algorithms to access data from the database and carry out inference using the data. In examples the inference algorithms are compiled from a schema of the database which is annotated with expressions of probability distributions over data in the database. In various examples the schema of the database is modified by adding one or more latent columns or latent tables to the schema for storing data to be inferred by the inference engine. In examples the expressions are compositional so, for example, an expression annotating a column of a database table may be used as part of an expression annotating another column of the database.
    Type: Application
    Filed: August 20, 2013
    Publication date: February 26, 2015
    Applicant: Microsoft Corporation
    Inventors: Andrew Donald Gordon, Thore Kurt Hartwig Graepel, Nicolas Philippe Marie Rolland, Eric Johannes Borgstrom, Claudio Vittorio Russo
  • Patent number: 8904149
    Abstract: Methods, systems, and media are provided for a dynamic batch strategy utilized in parallelization of online learning algorithms. The dynamic batch strategy provides a merge function on the basis of a threshold level difference between the original model state and an updated model state, rather than according to a constant or pre-determined batch size. The merging includes reading a batch of incoming streaming data, retrieving any missing model beliefs from partner processors, and training on the batch of incoming streaming data. The steps of reading, retrieving, and training are repeated until the measured difference in states exceeds a set threshold level. The measured differences which exceed the threshold level are merged for each of the plurality of processors according to attributes. The merged differences which exceed the threshold level are combined with the original partial model states to obtain an updated global model state.
    Type: Grant
    Filed: June 24, 2010
    Date of Patent: December 2, 2014
    Assignee: Microsoft Corporation
    Inventors: Taha Bekir Eren, Oleg Isakov, Weizhu Chen, Jeffrey Scott Dunn, Thomas Ivan Borchert, Joaquin Quinonero Candela, Thore Kurt Hartwig Graepel, Ralf Herbrich
  • Publication number: 20140188928
    Abstract: New methods of relational database management are described, for example, to enable completion and checking of data in relational databases, including completion of missing foreign key values, to facilitate understanding of data in relational databases, to highlight data that it would be useful to add to a relational database and for other applications. In various embodiments, the schema of a relational database is used to automatically create a probabilistic graphical model that has a structure related to the schema. For example, nodes representing individual rows are linked to rows of other tables according to the database schema. In examples, data in the relational database is used to carry out inference using inference algorithms derived from the probabilistic graphical model. In various examples, inference results, comprising probability distributions each for an individual table cell, are used to fill missing data, highlight errors, and for other purposes.
    Type: Application
    Filed: December 31, 2012
    Publication date: July 3, 2014
    Applicant: Microsoft Corporation
    Inventors: Sameer Singh, Thore Kurt Hartwig Graepel, Lucas Julien Bordeaux, Andrew Donald Gordon
  • Publication number: 20120158791
    Abstract: Feature vector construction techniques are described. In one or more implementations, an input is received at a computing device that describes a graph query that specifies one of a plurality of entities to be used to query a knowledge base graph that represents the plurality of entities. A feature vector is constructed, by the computing device, having a number of indicator variables, each of which indicates observance of a sub-graph feature represented by a respective indicator variable in the knowledge base graph.
    Type: Application
    Filed: December 21, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Gjergji Kasneci, David Hector Stern, Thore Kurt Hartwig Graepel, Ralf Herbrich
  • Publication number: 20110320767
    Abstract: Methods, systems, and media are provided for a dynamic batch strategy utilized in parallelization of online learning algorithms. The dynamic batch strategy provides a merge function on the basis of a threshold level difference between the original model state and an updated model state, rather than according to a constant or pre-determined batch size. The merging includes reading a batch of incoming streaming data, retrieving any missing model beliefs from partner processors, and training on the batch of incoming streaming data. The steps of reading, retrieving, and training are repeated until the measured difference in states exceeds a set threshold level. The measured differences which exceed the threshold level are merged for each of the plurality of processors according to attributes. The merged differences which exceed the threshold level are combined with the original partial model states to obtain an updated global model state.
    Type: Application
    Filed: June 24, 2010
    Publication date: December 29, 2011
    Applicant: Microsoft Corporation
    Inventors: Taha Bekir Eren, Oleg Isakov, Weizhu Chen, Jeffrey Scott Dunn, Thomas Ivan Borchert, Joaquin Quinonero Candela, Thore Kurt Hartwig Graepel, Ralf Herbrich
  • Publication number: 20110218946
    Abstract: A user may request a presentation of a content item set, such as a social network comprising a set of status messages or an image database comprising a set of images. However, the volume and diversity of content items of the content item set may reduce the interest of the user in the presented content items. The potential interest of the user in the presented content items may be improved by selecting content items that are associated with one or more topics of potential interest to the user, and having a positive trending popularity among users of the content item set. Moreover, the interaction of the user with a presented content item may be monitored and used to determine the interest of the user in the topics associated with the presented content item and the popularity of the content item.
    Type: Application
    Filed: March 3, 2010
    Publication date: September 8, 2011
    Applicant: Microsoft Corporation
    Inventors: David Stern, Ralf Herbrich, Milad Shokouhi, Thore Kurt Hartwig Graepel
  • Patent number: 7837543
    Abstract: Adaptive agents are driven by rewards they receive based on the outcome of their behavior during actual game play. Accordingly, the adaptive agents are able to learn from experience within the gaming environment. Reward-driven adaptive agents can be trained at either or both of game-time or development time. Computer-controlled agents receive rewards (either positive or negative) at individual action intervals based on the effectiveness of the agents' actions (e.g., compliance with defined goals). The adaptive computer-controlled agent is motivated to perform actions that maximize its positive rewards and minimize its negative rewards.
    Type: Grant
    Filed: April 30, 2004
    Date of Patent: November 23, 2010
    Assignee: Microsoft Corporation
    Inventors: Kurt Hartwig Graepel, Ralf Herbrich, Julian Gold
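
The reward-driven adaptation summarized in the abstract for patent 7837543 can be illustrated with a minimal sketch. This is not the patented method: it is a generic running-average value learner with occasional exploration, and the class name `AdaptiveAgent`, the action names, the `reward_for` function, and all parameter values are hypothetical choices made for illustration only.

```python
import random


class AdaptiveAgent:
    """Hypothetical reward-driven agent: keeps one value estimate per action."""

    def __init__(self, actions, epsilon=0.1, step_size=0.2, seed=0):
        self.values = {a: 0.0 for a in actions}
        self.epsilon = epsilon        # assumed exploration rate
        self.step_size = step_size    # assumed learning rate
        self.rng = random.Random(seed)

    def choose(self):
        # Explore occasionally; otherwise pick the action with the highest estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def learn(self, action, reward):
        # Move the estimate for this action toward the observed reward.
        self.values[action] += self.step_size * (reward - self.values[action])


def reward_for(action):
    # Assumed goal for illustration: "defend" complies with the goal (positive
    # reward); every other action does not (negative reward).
    return 1.0 if action == "defend" else -1.0


agent = AdaptiveAgent(["attack", "defend", "flee"])
for _ in range(200):  # one reward per action interval
    action = agent.choose()
    agent.learn(action, reward_for(action))

best_action = max(agent.values, key=agent.values.get)
print(best_action, agent.values)
```

Because the agent receives a positive or negative reward after each action and shifts its estimates accordingly, its preferred action drifts toward the one that complies with the assumed goal.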