Patents by Inventor Gerald J. Tesauro
Gerald J. Tesauro has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10599991
Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
Type: Grant
Filed: July 14, 2015
Date of Patent: March 24, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
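The idea of blending models separately for each sub-range of a critical parameter can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how blending weights are obtained, so inverse mean-squared-error weighting is an assumption here, as are the single critical parameter and the hypothetical names bin_index, fit_blend_weights and blend.

```python
def bin_index(value, edges):
    """Return the index of the sub-range that contains value.

    edges holds the sorted upper bounds of all sub-ranges except the last.
    """
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def fit_blend_weights(records, edges, num_models):
    """Compute one weight vector per sub-range of the critical parameter.

    records: iterable of (critical_value, model_estimates, true_value).
    Weights are proportional to inverse mean squared error (an assumption).
    """
    n_bins = len(edges) + 1
    sq_err = [[0.0] * num_models for _ in range(n_bins)]
    counts = [0] * n_bins
    for critical, estimates, truth in records:
        b = bin_index(critical, edges)
        counts[b] += 1
        for m, est in enumerate(estimates):
            sq_err[b][m] += (est - truth) ** 2

    weights = []
    for b in range(n_bins):
        if counts[b] == 0:
            # No data in this sub-range: fall back to equal weights.
            weights.append([1.0 / num_models] * num_models)
            continue
        # Small constant avoids division by zero for a perfect model.
        inv = [1.0 / (sq_err[b][m] / counts[b] + 1e-9) for m in range(num_models)]
        total = sum(inv)
        weights.append([v / total for v in inv])
    return weights


def blend(critical, estimates, edges, weights):
    """Blend the model estimates using the weights of the matching sub-range."""
    w = weights[bin_index(critical, edges)]
    return sum(wi * e for wi, e in zip(w, estimates))
```

With two models and one boundary, a model that is accurate only at low values of the critical parameter dominates the blend there, and the other model dominates at high values.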
-
Patent number: 10592818
Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
Type: Grant
Filed: July 14, 2015
Date of Patent: March 17, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
-
Patent number: 10592817
Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
Type: Grant
Filed: July 13, 2015
Date of Patent: March 17, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
-
Patent number: 10572819
Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
Type: Grant
Filed: July 29, 2015
Date of Patent: February 25, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
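Selecting an algorithm by its accuracy trend across subsets can be sketched as follows. The abstract does not state how a trend is estimated or compared, so this sketch assumes a least-squares line over the logarithm of the subset size, extrapolated to a target size. The accuracies are assumed to have been measured already, and fit_line and select_by_trend are hypothetical names.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def select_by_trend(sizes, accuracies, target_size):
    """Pick the algorithm whose extrapolated accuracy is highest.

    sizes: training-subset sizes (at least two distinct values).
    accuracies: dict mapping algorithm name to the accuracy measured
        on each subset, in the same order as sizes.
    target_size: dataset size at which the trends are compared.
    """
    xs = [math.log(s) for s in sizes]
    projected = {}
    for name, accs in accuracies.items():
        slope, intercept = fit_line(xs, accs)
        projected[name] = slope * math.log(target_size) + intercept
    best = max(projected, key=projected.get)
    return best, projected
```

An algorithm that is behind on the largest measured subset can still be selected when its trend rises more steeply, which is the point of comparing trends instead of single accuracy values.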
-
Patent number: 10373071
Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
Type: Grant
Filed: November 24, 2015
Date of Patent: August 6, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
-
Publication number: 20170068905
Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
Type: Application
Filed: November 24, 2015
Publication date: March 9, 2017
Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
-
Publication number: 20170032277
Abstract: A system, method, and computer program product for automatically selecting from a plurality of analytic algorithms a best performing analytic algorithm to apply to a dataset is provided. The automatically selecting from the plurality of analytic algorithms the best performing analytic algorithm to apply to the dataset enables a training a plurality of analytic algorithms on a plurality of subsets of the dataset. Then, a corresponding prediction accuracy trend is estimated across the subsets for each of the plurality of analytic algorithms to produce a plurality of accuracy trends. Next, the best performing analytic algorithm is selected and outputted from the plurality of analytic algorithms based on the corresponding prediction accuracy trend with a highest value from the plurality of accuracy trends.
Type: Application
Filed: July 29, 2015
Publication date: February 2, 2017
Inventors: Tamir Klinger, Chandrasekhara K. Reddy, Ashish Sabharwal, Horst C. Samulowitz, Gerald J. Tesauro, Deepak S. Turaga
-
Publication number: 20170017895
Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
Type: Application
Filed: July 14, 2015
Publication date: January 19, 2017
Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
-
Publication number: 20170017732
Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
Type: Application
Filed: July 13, 2015
Publication date: January 19, 2017
Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
-
Publication number: 20170017896
Abstract: A parameter-based multi-model blending method and system are described. The method includes selecting a parameter of interest among parameters estimated by each of a set of individual models, running the set of individual models with a range of inputs to obtain a range of estimates of the parameters from each of the set of individual models, and identifying, for each of the set of individual models, critical parameters among the parameters estimated, the critical parameters exhibiting a specified correlation with an error in estimation of the parameter of interest. For each subspace of combinations of the critical parameters, obtaining a parameter-based blended model is based on blending the set of individual models in accordance with the subspace of the critical parameters, the subspace defining a sub-range for each of the critical parameters.
Type: Application
Filed: July 14, 2015
Publication date: January 19, 2017
Inventors: Hendrik F. Hamann, Youngdeok Hwang, Levente Klein, Jonathan Lenchner, Siyuan Lu, Fernando J. Marianno, Gerald J. Tesauro, Theodore G. van Kessel
-
Patent number: 9298172
Abstract: The present invention is a method and an apparatus for reward-based learning of policies for managing or controlling a system or plant. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance metric and a distance-based function approximator estimating long-range expected value are then initialized, where the distance metric computes a distance between two (state, action) pairs, and the distance metric and function approximator are adjusted such that a Bellman error measure of the function approximator on the set of exemplars is minimized. A management policy is then derived based on the trained distance metric and function approximator.
Type: Grant
Filed: October 11, 2007
Date of Patent: March 29, 2016
Assignee: International Business Machines Corporation
Inventors: Gerald J. Tesauro, Kilian Q. Weinberger
-
Patent number: 9047423
Abstract: A method, system and computer program product for choosing actions in a state of a planning problem. The system simulates one or more sequences of actions, state transitions and rewards starting from the current state of the planning problem. During the simulation of performing a given action in a given state, a data record is maintained of observed contextual state information, and observed cumulative reward resulting from the action. The system performs a regression fit on the data records, enabling estimation of expected reward as a function of contextual state. The estimations of expected rewards are used to guide the choice of actions during the simulations. Upon completion of all simulations, the top-level action which obtained highest mean reward during the simulations is recommended to be executed in the current state of the planning problem.
Type: Grant
Filed: January 12, 2012
Date of Patent: June 2, 2015
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gerald J. Tesauro, Alina Beygelzimer, Richard B. Segal, Mark N. Wegman
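The final recommendation step alone, picking the top-level action with the highest mean simulated reward, can be sketched as below. This is a deliberately reduced illustration: the regression fit on contextual state that guides action choice during the simulations is omitted, and simulations are simply allocated round-robin, which is an assumption. The names recommend_action and simulate are hypothetical.

```python
import random


def recommend_action(actions, simulate, num_simulations, rng):
    """Return the top-level action with the highest mean simulated reward.

    actions: list of candidate top-level actions.
    simulate: callable (action, rng) -> cumulative reward of one simulated
        sequence that starts with that action (supplied by the caller).
    Simulations are spread round-robin over the actions (an assumption;
    the source describes regression estimates guiding this choice).
    """
    totals = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for i in range(num_simulations):
        action = actions[i % len(actions)]
        totals[action] += simulate(action, rng)
        counts[action] += 1
    # Only actions that were simulated at least once have a mean.
    means = {a: totals[a] / counts[a] for a in actions if counts[a] > 0}
    best = max(means, key=means.get)
    return best, means


def noisy_reward(action, rng):
    """Toy simulator: a fixed base reward per action plus small noise."""
    base = {"wait": 1.0, "act": 2.0}[action]
    return base + rng.uniform(-0.1, 0.1)


best_action, mean_rewards = recommend_action(
    ["wait", "act"], noisy_reward, 200, random.Random(0)
)
```

In a fuller version, the round-robin line is where reward estimates from the regression fit would steer which action is simulated next.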
-
Patent number: 8545332
Abstract: A system, method and computer program product for planning actions in a repeated Stackelberg Game, played for a fixed number of rounds, where the payoffs or preferences of the follower are initially unknown to the leader, and a prior probability distribution over follower types is available. In repeated Bayesian Stackelberg games, the objective is to maximize the leader's cumulative expected payoff over the rounds of the game. The optimal plans in such games make intelligent tradeoffs between actions that reveal information regarding the unknown follower preferences, and actions that aim for high immediate payoff. The method solves for such optimal plans according to a Monte Carlo Tree Search method wherein simulation trials draw instances of followers from said prior probability distribution. Some embodiments additionally implement a method for pruning dominated leader strategies.
Type: Grant
Filed: February 2, 2012
Date of Patent: October 1, 2013
Assignee: International Business Machines Corporation
Inventors: Janusz Marecki, Richard B. Segal, Gerald J. Tesauro
-
Publication number: 20130204412
Abstract: A system, method and computer program product for planning actions in a repeated Stackelberg Game, played for a fixed number of rounds, where the payoffs or preferences of the follower are initially unknown to the leader, and a prior probability distribution over follower types is available. In repeated Bayesian Stackelberg games, the objective is to maximize the leader's cumulative expected payoff over the rounds of the game. The optimal plans in such games make intelligent tradeoffs between actions that reveal information regarding the unknown follower preferences, and actions that aim for high immediate payoff. The method solves for such optimal plans according to a Monte Carlo Tree Search method wherein simulation trials draw instances of followers from said prior probability distribution. Some embodiments additionally implement a method for pruning dominated leader strategies.
Type: Application
Filed: February 2, 2012
Publication date: August 8, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Janusz Marecki, Richard B. Segal, Gerald J. Tesauro
-
Publication number: 20130185039
Abstract: A method, system and computer program product for choosing actions in a state of a planning problem. The system simulates one or more sequences of actions, state transitions and rewards starting from the current state of the planning problem. During the simulation of performing a given action in a given state, a data record is maintained of observed contextual state information, and observed cumulative reward resulting from the action. The system performs a regression fit on the data records, enabling estimation of expected reward as a function of contextual state. The estimations of expected rewards are used to guide the choice of actions during the simulations. Upon completion of all simulations, the top-level action which obtained highest mean reward during the simulations is recommended to be executed in the current state of the planning problem.
Type: Application
Filed: January 12, 2012
Publication date: July 18, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gerald J. Tesauro, Alina Beygelzimer, Richard B. Segal, Mark N. Wegman
-
Patent number: 8060454
Abstract: The present invention is a method and an apparatus for reward-based learning of management policies. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance measure between pairs of exemplars is used to compute a Non-Linear Dimensionality Reduction (NLDR) mapping of (state, action) pairs into a lower-dimensional representation, thereby producing embedded exemplars, wherein one or more parameters of the NLDR are tuned to minimize a cross-validation Bellman error on a holdout set taken from the set of one or more exemplars. The mapping is then applied to the set of exemplars, and reward-based learning is applied to the embedded exemplars to obtain a learned management policy.
Type: Grant
Filed: October 11, 2007
Date of Patent: November 15, 2011
Assignee: International Business Machines Corporation
Inventors: Rajarshi Das, Gerald J. Tesauro, Kilian Q. Weinberger
-
Patent number: 7599898
Abstract: The present invention is a method and an apparatus for improved regression modeling to address the curse of dimensionality, for example for use in data analysis tasks. In one embodiment, a method for analyzing data includes receiving a set of exemplars, where at least two of the exemplars include an input pattern (i.e., a point in an input space) and at least one of the exemplars includes a target value associated with the input pattern. A function approximator and a distance metric are then initialized, where the distance metric computes a distance between points in the input space, and the distance metric is adjusted such that an accuracy measure of the function approximator on the set of exemplars is improved.
Type: Grant
Filed: October 17, 2006
Date of Patent: October 6, 2009
Assignee: International Business Machines Corporation
Inventors: Gerald J. Tesauro, Kilian Q. Weinberger
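Adjusting a distance metric so that a distance-based function approximator becomes more accurate can be sketched as follows. The abstract names neither the approximator nor the adjustment procedure, so everything specific here is an assumption: a Gaussian kernel regressor, a per-dimension weighted squared distance, leave-one-out squared error as the accuracy measure, and a simple coordinate search. The names predict, loo_error and tune_weights are hypothetical.

```python
import math


def predict(x, data, weights, skip=None):
    """Kernel-weighted average of targets under a weighted squared distance.

    data: list of (input_vector, target). skip optionally excludes one
    exemplar by index, which is used for leave-one-out evaluation.
    """
    num = den = 0.0
    for i, (xi, yi) in enumerate(data):
        if i == skip:
            continue
        d2 = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, xi))
        k = math.exp(-d2)
        num += k * yi
        den += k
    return num / den if den > 0 else 0.0


def loo_error(data, weights):
    """Mean squared leave-one-out error of the regressor on the exemplars."""
    total = 0.0
    for i, (x, y) in enumerate(data):
        total += (predict(x, data, weights, skip=i) - y) ** 2
    return total / len(data)


def tune_weights(data, weights, candidates=(0.0, 0.1, 1.0, 10.0), sweeps=2):
    """Coordinate search over per-dimension metric weights.

    The current weight is always among the options tried, so the
    leave-one-out error never increases from one step to the next.
    """
    weights = list(weights)
    for _ in range(sweeps):
        for j in range(len(weights)):
            options = sorted(set(candidates) | {weights[j]})
            weights[j] = min(
                options,
                key=lambda w: loo_error(data, weights[:j] + [w] + weights[j + 1:]),
            )
    return weights
```

On data whose target depends on only one input dimension, the search tends to shrink the weight of the irrelevant dimension, which is one simple way a learned metric can ease the curse of dimensionality.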
-
Publication number: 20090099985
Abstract: The present invention is a method and an apparatus for reward-based learning of policies for managing or controlling a system or plant. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance metric and a distance-based function approximator estimating long-range expected value are then initialized, where the distance metric computes a distance between two (state, action) pairs, and the distance metric and function approximator are adjusted such that a Bellman error measure of the function approximator on the set of exemplars is minimized. A management policy is then derived based on the trained distance metric and function approximator.
Type: Application
Filed: October 11, 2007
Publication date: April 16, 2009
Inventors: Gerald J. Tesauro, Kilian Q. Weinberger
-
Publication number: 20090098515
Abstract: The present invention is a method and an apparatus for reward-based learning of policies for managing or controlling a system or plant. In one embodiment, a method for reward-based learning includes receiving a set of one or more exemplars, where at least two of the exemplars comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance measure between pairs of exemplars is used to compute a Non-Linear Dimensionality Reduction (NLDR) mapping of (state, action) pairs into a lower-dimensional representation. The mapping is then applied to the set of exemplars, and reward-based learning is applied to the transformed exemplars to obtain a management policy.
Type: Application
Filed: October 11, 2007
Publication date: April 16, 2009
Inventors: Rajarshi Das, Gerald J. Tesauro, Kilian Q. Weinberger
-
Publication number: 20080154817
Abstract: The present invention is a method and an apparatus for improved regression modeling to address the curse of dimensionality, for example for use in data analysis tasks. In one embodiment, a method for analyzing data includes receiving a set of exemplars, where at least two of the exemplars include an input pattern (i.e., a point in an input space) and at least one of the exemplars includes a target value associated with the input pattern. A function approximator and a distance metric are then initialized, where the distance metric computes a distance between points in the input space, and the distance metric is adjusted such that an accuracy measure of the function approximator on the set of exemplars is improved.
Type: Application
Filed: October 17, 2006
Publication date: June 26, 2008
Inventors: Gerald J. Tesauro, Kilian Q. Weinberger