Patents by Inventor Yannick Schroecker

Yannick Schroecker has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10872294
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network. In one aspect, a method comprises: obtaining an expert observation; processing the expert observation using a generative neural network system to generate a given observation-given action pair, wherein the generative neural network system has been trained to be more likely to generate a particular observation-particular action pair if performing the particular action in response to the particular observation is more likely to result in the environment later reaching the state characterized by a target observation; processing the given observation using the action selection policy neural network to generate a given action score for the given action; and adjusting the current values of the action selection policy neural network parameters to increase the given action score for the given action.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: December 22, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Mel Vecerik, Yannick Schroecker, Jonathan Karl Scholz
  • Publication number: 20200104684
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network. In one aspect, a method comprises: obtaining an expert observation; processing the expert observation using a generative neural network system to generate a given observation-given action pair, wherein the generative neural network system has been trained to be more likely to generate a particular observation-particular action pair if performing the particular action in response to the particular observation is more likely to result in the environment later reaching the state characterized by a target observation; processing the given observation using the action selection policy neural network to generate a given action score for the given action; and adjusting the current values of the action selection policy neural network parameters to increase the given action score for the given action.
    Type: Application
    Filed: September 27, 2019
    Publication date: April 2, 2020
    Inventors: Mel Vecerik, Yannick Schroecker, Jonathan Karl Scholz
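
The training step described in both abstracts can be illustrated with a small sketch. It is not the patented implementation: it assumes a linear softmax policy in place of the action selection policy neural network, and the function `sample_predecessor` is a hypothetical placeholder for the trained generative neural network system (here it only perturbs the expert observation and picks a random action). All names, dimensions, and the learning rate are illustrative assumptions.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def action_scores(W, obs):
    """Action scores of a linear softmax policy (assumed stand-in for the policy network)."""
    return softmax(W @ obs)


def sample_predecessor(expert_obs, rng, n_actions):
    """Hypothetical placeholder for the trained generative system.

    A real system would be trained so that the returned (observation, action)
    pair is likely to lead to the state characterized by the target observation.
    This stub just adds noise and picks an action uniformly at random.
    """
    obs = expert_obs + 0.1 * rng.standard_normal(expert_obs.shape)
    action = int(rng.integers(n_actions))
    return obs, action


def update_policy(W, obs, action, lr=0.5):
    """One gradient-ascent step on log(score of `action`) with respect to W."""
    probs = action_scores(W, obs)
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    grad = np.outer(one_hot - probs, obs)  # d log probs[action] / dW
    return W + lr * grad


rng = np.random.default_rng(0)
n_actions, obs_dim = 3, 4                    # illustrative sizes
W = np.zeros((n_actions, obs_dim))           # policy parameters
expert_obs = np.array([1.0, 0.0, -1.0, 0.5])  # made-up expert observation

# 1. Generate an observation-action pair from the expert observation.
obs, action = sample_predecessor(expert_obs, rng, n_actions)
# 2. Score the given action under the current policy.
before = action_scores(W, obs)[action]
# 3. Adjust the parameters to increase that score.
W = update_policy(W, obs, action)
after = action_scores(W, obs)[action]
```

After the update, `after` is larger than `before`: the policy assigns a higher score to the generated action for the generated observation, which is the adjustment the abstracts describe.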