Patents by Inventor Gerald J. Tesauro

Gerald J. Tesauro has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10599991
    Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
    Type: Grant
    Filed: July 14, 2015
    Date of Patent: March 24, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
  • Patent number: 10592818
    Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
    Type: Grant
    Filed: July 14, 2015
    Date of Patent: March 17, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
  • Patent number: 10592817
    Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
    Type: Grant
    Filed: July 13, 2015
    Date of Patent: March 17, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
  • Patent number: 10572819
    Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
    Type: Grant
    Filed: July 29, 2015
    Date of Patent: February 25, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
  • Patent number: 10373071
    Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: August 6, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
  • Publication number: 20170068905
    Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
    Type: Application
    Filed: November 24, 2015
    Publication date: March 9, 2017
    Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
  • Publication number: 20170032277
    Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
    Type: Application
    Filed: July 29, 2015
    Publication date: February 2, 2017
    Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
  • Publication number: 20170017895
    Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
    Type: Application
    Filed: July 14, 2015
    Publication date: January 19, 2017
    Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
  • Publication number: 20170017732
    Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
    Type: Application
    Filed: July 13, 2015
    Publication date: January 19, 2017
    Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
  • Publication number: 20170017896
    Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
    Type: Application
    Filed: July 14, 2015
    Publication date: January 19, 2017
    Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
  • Patent number: 9298172
    Abstract: The present invention is a method and an apparatus for reward-based learning of policies for managing or controlling a system or plant. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance metric and a distance-based function approximator estimating long-range expected value are then initialized, where the distance metric computes a distance between two (state, action) pairs, and the distance metric and function approximator are adjusted such that a Bellman error measure of the function approximator on the set of exemplars is minimized. A management policy is then derived based on the trained distance metric and function approximator.
    Type: Grant
    Filed: October 11, 2007
    Date of Patent: March 29, 2016
    Assignee: International Business Machines Corporation
    Inventors: Gerald J. Tesauro, Kilian Q. Weinberger
  • Patent number: 9047423
    Abstract: A method, system and computer program product for choosing actions in a state of a planning problem. The system simulates one or more sequences of actions, state transitions and rewards starting from the current state of the planning problem. During the simulation of performing a given action in a given state, a data record is maintained of observed contextual state information, and observed cumulative reward resulting from the action. The system performs a regression fit on the data records, enabling estimation of expected reward as a function of contextual state. The estimations of expected rewards are used to guide the choice of actions during the simulations. Upon completion of all simulations, the top-level action which obtained highest mean reward during the simulations is recommended to be executed in the current state of the planning problem.
    Type: Grant
    Filed: January 12, 2012
    Date of Patent: June 2, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gerald J. Tesauro, Alina Beygelzimer, Richard B. Segal, Mark N. Wegman
  • Patent number: 8545332
    Abstract: A system, method and computer program product for planning actions in a repeated Stackelberg Game, played for a fixed number of rounds, where the payoffs or preferences of the follower are initially unknown to the leader, and a prior probability distribution over follower types is available. In repeated Bayesian Stackelberg games, the objective is to maximize the leader's cumulative expected payoff over the rounds of the game. The optimal plans in such games make intelligent tradeoffs between actions that reveal information regarding the unknown follower preferences, and actions that aim for high immediate payoff. The method solves for such optimal plans according to a Monte Carlo Tree Search method wherein simulation trials draw instances of followers from said prior probability distribution. Some embodiments additionally implement a method for pruning dominated leader strategies.
    Type: Grant
    Filed: February 2, 2012
    Date of Patent: October 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: Janusz Marecki, Richard B. Segal, Gerald J. Tesauro
  • Publication number: 20130204412
    Abstract: A system, method and computer program product for planning actions in a repeated Stackelberg Game, played for a fixed number of rounds, where the payoffs or preferences of the follower are initially unknown to the leader, and a prior probability distribution over follower types is available. In repeated Bayesian Stackelberg games, the objective is to maximize the leader's cumulative expected payoff over the rounds of the game. The optimal plans in such games make intelligent tradeoffs between actions that reveal information regarding the unknown follower preferences, and actions that aim for high immediate payoff. The method solves for such optimal plans according to a Monte Carlo Tree Search method wherein simulation trials draw instances of followers from said prior probability distribution. Some embodiments additionally implement a method for pruning dominated leader strategies.
    Type: Application
    Filed: February 2, 2012
    Publication date: August 8, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Janusz Marecki, Richard B. Segal, Gerald J. Tesauro
  • Publication number: 20130185039
    Abstract: A method, system and computer program product for choosing actions in a state of a planning problem. The system simulates one or more sequences of actions, state transitions and rewards starting from the current state of the planning problem. During the simulation of performing a given action in a given state, a data record is maintained of observed contextual state information, and observed cumulative reward resulting from the action. The system performs a regression fit on the data records, enabling estimation of expected reward as a function of contextual state. The estimations of expected rewards are used to guide the choice of actions during the simulations. Upon completion of all simulations, the top-level action which obtained highest mean reward during the simulations is recommended to be executed in the current state of the planning problem.
    Type: Application
    Filed: January 12, 2012
    Publication date: July 18, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gerald J. Tesauro, Alina Beygelzimer, Richard B. Segal, Mark N. Wegman
  • Patent number: 8060454
    Abstract: The present invention is a method and an apparatus for reward-based learning of management policies. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance measure between pairs of exemplars is used to compute a Non-Linear Dimensionality Reduction (NLDR) mapping of (state, action) pairs into a lower-dimensional representation, thereby producing embedded exemplars, wherein one or more parameters of the NLDR are tuned to minimize a cross-validation Bellman error on a holdout set taken from the set of one or more exemplars. The mapping is then applied to the set of exemplars, and reward-based learning is applied to the embedded exemplars to obtain a learned management policy.
    Type: Grant
    Filed: October 11, 2007
    Date of Patent: November 15, 2011
    Assignee: International Business Machines Corporation
    Inventors: Rajarshi Das, Gerald J. Tesauro, Kilian Q. Weinberger
  • Patent number: 7599898
    Abstract: The present invention is a method and an apparatus for improved regression modeling to address the curse of dimensionality, for example for use in data analysis tasks. In one embodiment, a method for analyzing data includes receiving a set of exemplars, where at least two of the exemplars include an input pattern (i.e., a point in an input space) and at least one of the exemplars includes a target value associated with the input pattern. A function approximator and a distance metric are then initialized, where the distance metric computes a distance between points in the input space, and the distance metric is adjusted such that an accuracy measure of the function approximator on the set of exemplars is improved.
    Type: Grant
    Filed: October 17, 2006
    Date of Patent: October 6, 2009
    Assignee: International Business Machines Corporation
    Inventors: Gerald J. Tesauro, Kilian Q. Weinberger
  • Publication number: 20090099985
    Abstract: The present invention is a method and an apparatus for reward-based learning of policies for managing or controlling a system or plant. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance metric and a distance-based function approximator estimating long-range expected value are then initialized, where the distance metric computes a distance between two (state, action) pairs, and the distance metric and function approximator are adjusted such that a Bellman error measure of the function approximator on the set of exemplars is minimized. A management policy is then derived based on the trained distance metric and function approximator.
    Type: Application
    Filed: October 11, 2007
    Publication date: April 16, 2009
    Inventors: Gerald J. Tesauro, Kilian Q. Weinberger
  • Publication number: 20090098515
    Abstract: The present invention is a method and an apparatus for reward-based learning of policies for managing or controlling a system or plant. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance measure between pairs of exemplars is used to compute a Non-Linear Dimensionality Reduction (NLDR) mapping of (state, action) pairs into a lower-dimensional representation. The mapping is then applied to the set of exemplars, and reward-based learning is applied to the transformed exemplars to obtain a management policy.
    Type: Application
    Filed: October 11, 2007
    Publication date: April 16, 2009
    Inventors: Rajarshi Das, Gerald J. Tesauro, Kilian Q. Weinberger
  • Publication number: 20080154817
    Abstract: The present invention is a method and an apparatus for improved regression modeling to address the curse of dimensionality, for example for use in data analysis tasks. In one embodiment, a method for analyzing data includes receiving a set of exemplars, where at least two of the exemplars include an input pattern (i.e., a point in an input space) and at least one of the exemplars includes a target value associated with the input pattern. A function approximator and a distance metric are then initialized, where the distance metric computes a distance between points in the input space, and the distance metric is adjusted such that an accuracy measure of the function approximator on the set of exemplars is improved.
    Type: Application
    Filed: October 17, 2006
    Publication date: June 26, 2008
    Inventors: Gerald J. Tesauro, Kilian Q. Weinberger
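
Illustrative sketch (not taken from any patent text): several abstracts in this listing (for example, patent number 10572819) describe training each candidate algorithm on subsets of a dataset, estimating a prediction accuracy trend across the subsets, and selecting the algorithm whose trend gives the highest value. The Python below is one hedged reading of that idea. The abstracts do not say how the trend is estimated, so the least-squares fit of accuracy against log(subset size), the extrapolation to the full dataset size, and the names fit_trend and select_best are all assumptions made for illustration.

```python
import math


def fit_trend(sizes, accuracies):
    """Least-squares fit of accuracy ~ a + b * log(size).

    The log-linear form is an assumption; the abstracts only say that
    an accuracy trend is estimated across the subsets.
    """
    xs = [math.log(s) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracies) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def select_best(measurements, full_size):
    """Pick the algorithm with the highest projected accuracy.

    measurements maps a (hypothetical) algorithm name to a pair
    (subset_sizes, accuracies_measured_on_those_subsets).
    """
    projected = {}
    for name, (sizes, accuracies) in measurements.items():
        a, b = fit_trend(sizes, accuracies)
        # Extrapolate the fitted trend to the full dataset size.
        projected[name] = a + b * math.log(full_size)
    best = max(projected, key=projected.get)
    return best, projected


# Made-up accuracy measurements, purely for demonstration.
measurements = {
    "slow_learner": ([100, 200, 400], [0.60, 0.66, 0.72]),
    "fast_plateau": ([100, 200, 400], [0.70, 0.71, 0.72]),
}
best, projected = select_best(measurements, full_size=6400)
print(best, projected)
```

In this made-up data both algorithms reach the same accuracy on the largest subset, but the one with the steeper trend is selected, which is the point of ranking by trend rather than by the most recent measurement. An unbounded log-linear extrapolation can exceed 1.0 for large enough sizes, so a practical version would need a saturating curve or a cap.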