Patents Assigned to Osaro, Inc.
  • Publication number: 20170213150
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning using a partitioned reinforcement learning input state space (RL input state space). One of the methods includes maintaining data defining a plurality of partitions of a space of reinforcement learning (RL) input states, each partition corresponding to a respective supervised learning model; obtaining a current state representation that represents a current state of the environment; for the current state representation and for each action in the set of actions, identifying a respective partition and processing the action and the current state representation using the supervised learning model that corresponds to the respective partition to generate a respective current value function estimate; and selecting an action to be performed by the computer-implemented agent in response to the current state representation using the respective current value function estimates.
    Type: Application
    Filed: July 8, 2016
    Publication date: July 27, 2017
    Applicant: Osaro, Inc.
    Inventors: Itamar Arel, Michael Kahane, Khashayar Rohanimanesh
  • Patent number: 9536191
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning using confidence scores. One of the methods includes receiving a current observation; for each of multiple actions: determining a respective value function estimate that is an estimate of a return resulting from the agent performing the action in response to the current observation, determining a respective confidence score that is a measure of confidence that the respective value function estimate for the action is an accurate estimate of the return that will result from the agent performing the action in response to the current observation, adjusting the respective value function estimate for the action using the respective confidence score for the action to determine a respective adjusted value function estimate; and selecting an action to be performed by the agent in response to the current observation using the respective adjusted value function estimates.
    Type: Grant
    Filed: November 25, 2015
    Date of Patent: January 3, 2017
    Assignee: Osaro, Inc.
    Inventors: Itamar Arel, Michael Kahane, Khashayar Rohanimanesh
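
The action-selection steps described in the two abstracts above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstracts do not say how partitions are identified, how confidence scores are computed, or how an estimate is adjusted. The function names, the greedy (highest-estimate) selection rule, and the multiply-by-confidence adjustment in the usage example are all assumptions made for illustration.

```python
from typing import Callable, Dict, Hashable, Mapping, Sequence, Tuple

State = Tuple[float, ...]
Action = int
ValueModel = Callable[[State, Action], float]


def select_action_partitioned(
    state: State,
    actions: Sequence[Action],
    partition_of: Callable[[State, Action], Hashable],  # hypothetical partition lookup
    models: Mapping[Hashable, ValueModel],              # one supervised model per partition
) -> Tuple[Action, Dict[Action, float]]:
    """Sketch of publication 20170213150: per-partition value estimates."""
    estimates: Dict[Action, float] = {}
    for action in actions:
        # Identify the partition for this (state, action) pair, then query
        # the model that corresponds to that partition.
        partition_id = partition_of(state, action)
        estimates[action] = models[partition_id](state, action)
    # Assumption: pick the action with the highest estimate (greedy).
    best = max(actions, key=lambda a: estimates[a])
    return best, estimates


def select_action_with_confidence(
    observation: State,
    actions: Sequence[Action],
    value_fn: ValueModel,
    confidence_fn: Callable[[State, Action], float],
    adjust: Callable[[float, float], float],  # hypothetical adjustment rule
) -> Tuple[Action, Dict[Action, float]]:
    """Sketch of patent 9536191: confidence-adjusted value estimates."""
    adjusted: Dict[Action, float] = {}
    for action in actions:
        value = value_fn(observation, action)            # estimate of the return
        confidence = confidence_fn(observation, action)  # trust in that estimate
        adjusted[action] = adjust(value, confidence)
    # Assumption: pick the action with the highest adjusted estimate (greedy).
    best = max(actions, key=lambda a: adjusted[a])
    return best, adjusted


# Usage with toy, made-up models and scores.
toy_models = {
    "low": lambda s, a: s[0] + a,   # used when the first state feature is < 0.5
    "high": lambda s, a: s[0] - a,  # used otherwise
}
toy_partition = lambda s, a: "low" if s[0] < 0.5 else "high"
action_p, _ = select_action_partitioned((0.2,), [0, 1], toy_partition, toy_models)

raw_values = {0: 1.0, 1: 2.0}
confidences = {0: 0.9, 1: 0.3}
action_c, _ = select_action_with_confidence(
    (0.0,),
    [0, 1],
    value_fn=lambda o, a: raw_values[a],
    confidence_fn=lambda o, a: confidences[a],
    adjust=lambda v, c: v * c,  # assumed rule: scale the estimate by confidence
)
print(action_p, action_c)
```

In the toy data, action 1 has the higher raw estimate but a low confidence score, so scaling by confidence makes action 0 the preferred choice. Other adjustment rules, or a non-greedy selection policy, would slot into the same structure.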