Patents by Inventor Craig Boutilier

Craig Boutilier has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11915130
    Abstract: Delusional bias can occur in function approximation Q-learning. Techniques for training and/or using a value network to mitigate delusional bias are disclosed herein, where the value network can be used to generate action(s) for an agent (e.g., a robot agent, a software agent, etc.). In various implementations, delusional bias can be mitigated by using a soft-consistency penalty. Additionally or alternatively, delusional bias can be mitigated by using a search framework over multiple Q-functions.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: February 27, 2024
    Assignee: GOOGLE LLC
    Inventors: Tyler Lu, Boon Teik Ooi, Craig Boutilier, Dale Schuurmans, DiJia Su
  • Publication number: 20220101111
    Abstract: Delusional bias can occur in function approximation Q-learning. Techniques for training and/or using a value network to mitigate delusional bias are disclosed herein, where the value network can be used to generate action(s) for an agent (e.g., a robot agent, a software agent, etc.). In various implementations, delusional bias can be mitigated by using a soft-consistency penalty. Additionally or alternatively, delusional bias can be mitigated by using a search framework over multiple Q-functions.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Inventors: Tyler Lu, Boon Teik Ooi, Craig Boutilier, Dale Schuurmans, DiJia Su
  • Publication number: 20080052219
    Abstract: Each bid received via a computer network is an offer for the right to cause at least one advert associated with the bid to be output to at least one device that is part of the computer network or in communication with the computer network in response to the bid being allocated one or more user events. At a time t, at least one rule or decision variable for allocating user events to bids is determined based on bids received before time t and an estimate of bids, user events or user activity occurring after time t. Based on information or data regarding a user event received from one of the devices after time t, the user event is allocated to at least one bid based on the at least one rule or decision variable and the at least one word, term, phrase or string of characters of the bid.
    Type: Application
    Filed: July 27, 2007
    Publication date: February 28, 2008
    Applicant: CombineNet, Inc.
    Inventors: Tuomas Sandholm, David Parkes, Craig Boutilier, William Walsh
  • Publication number: 20060224496
    Abstract: In an on-line ad auction, bids are received via a computer network, each bid being for the right to display at least one advert on a display of a computer of the computer network in response to the bid being allocated a query received from the computer. At a time t, at least one rule or decision variable for allocating queries to bids is determined based on the bids received before time t and at least one of: an estimate of queries to be received after time t; an estimate of events to occur in response to the display of adverts after time t; and an estimate of bids to be received after time t. After time t, a query received from the computer is allocated to at least one of the received bids based on the at least one rule or decision variable.
    Type: Application
    Filed: March 31, 2006
    Publication date: October 5, 2006
    Applicant: CombineNet, Inc.
    Inventors: Tuomas Sandholm, David Parkes, Craig Boutilier
  • Publication number: 20050192865
    Abstract: A desirable allocation of bids in a combinatorial exchange can be selected by determining a first candidate allocation of the bids and a first value of a minimax regret, related to the difference in utility between a first adversarial allocation of the bids and the candidate allocation. Based on the first candidate allocation, a second adversarial allocation of the bids and a first value of a maximum regret, related to the difference in utility between the second adversarial allocation and the candidate allocation, can be determined. When the value of the maximum regret is not greater than the value of the minimax regret, the candidate allocation can be designated as the desirable allocation.
    Type: Application
    Filed: February 24, 2005
    Publication date: September 1, 2005
    Applicant: CombineNet, Inc.
    Inventors: Craig Boutilier, Tuomas Sandholm, Robert Shields
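
The candidate/adversary loop described in the abstract of publication 20050192865 resembles a constraint-generation procedure for minimax regret. The following is a minimal sketch of that general idea, not the method claimed in the application. It assumes a small, explicitly enumerated set of allocations and a finite list of utility scenarios; the names `max_regret`, `minimax_regret`, `allocations`, and `scenarios` are hypothetical and do not come from the filing.

```python
from typing import Dict, List, Tuple

Allocation = str
# Assumption: each scenario maps every allocation to its utility.
Scenario = Dict[Allocation, float]


def max_regret(candidate: Allocation,
               scenarios: List[Scenario]) -> Tuple[float, Allocation, int]:
    """Return the largest regret of `candidate`, with the adversarial
    allocation and the scenario index that achieve it."""
    best = (float("-inf"), candidate, 0)
    for i, utilities in enumerate(scenarios):
        for adversary, value in utilities.items():
            regret = value - utilities[candidate]
            if regret > best[0]:
                best = (regret, adversary, i)
    return best


def minimax_regret(allocations: List[Allocation],
                   scenarios: List[Scenario]) -> Tuple[Allocation, float]:
    """Pick the allocation whose worst-case regret is smallest,
    generating adversarial witnesses only as they are needed."""
    # Start from one arbitrary (adversarial allocation, scenario) witness.
    witnesses = [(allocations[0], 0)]
    while True:
        # Relaxed problem: regret measured only against known witnesses.
        def relaxed(x: Allocation) -> float:
            return max(scenarios[i][adv] - scenarios[i][x]
                       for adv, i in witnesses)

        candidate = min(allocations, key=relaxed)
        lower_bound = relaxed(candidate)  # relaxed minimax regret value

        # Adversary step: true maximum regret of the candidate.
        upper_bound, adversary, index = max_regret(candidate, scenarios)

        # Accept once the maximum regret does not exceed the relaxed value.
        if upper_bound <= lower_bound + 1e-9:
            return candidate, upper_bound
        witnesses.append((adversary, index))


# Illustrative data only: three allocations under two utility scenarios.
allocations = ["A", "B", "C"]
scenarios = [
    {"A": 10.0, "B": 8.0, "C": 5.0},
    {"A": 4.0, "B": 7.0, "C": 9.0},
]
print(*minimax_regret(allocations, scenarios))  # prints: B 2.0
```

In this toy instance, allocations A and C each have a worst-case regret of 5, while B never trails the best allocation by more than 2, so B is returned. A realistic combinatorial exchange would replace the enumeration with an optimization solver, which this sketch does not attempt.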