Patents by Inventor Gregory BRAMBLE

Gregory BRAMBLE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966340
    Abstract: To automate time series forecasting machine learning pipeline generation, a data allocation size of time series data may be determined based on one or more characteristics of a time series data set. The time series data may be allocated for use by candidate machine learning pipelines based on the data allocation size. Features for the time series data may be determined and cached by the candidate machine learning pipelines. Predictions of each of the candidate machine learning pipelines using at least the one or more features may be evaluated. A ranked list of machine learning pipelines may be automatically generated from the candidate machine learning pipelines for time series forecasting based upon evaluating predictions of each of the one or more candidate machine learning pipelines.
    Type: Grant
    Filed: March 15, 2022
    Date of Patent: April 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Long Vu, Bei Chen, Xuan-Hong Dang, Peter Daniel Kirchner, Syed Yousaf Shah, Dhavalkumar C. Patel, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Gregory Bramble, Horst Cornelius Samulowitz, Saket K. Sathe, Wesley M. Gifford, Petros Zerfos
  • Patent number: 11941541
    Abstract: Methods, computer program products and/or systems are provided that perform the following operations: obtaining a performance matrix representing accuracies obtained by executing a plurality of pipelines on a plurality of training data sets, wherein a pipeline comprises a series of operations performed on a data set; selecting a defined number of top pipelines as potential pipelines for a testing data set based, at least in part, on a similarity between the testing data set and each of the plurality of training data sets represented in the performance matrix; storing results from executing each of the potential pipelines as a new data set; determining a pipeline accuracy for each of the potential pipelines when executed against the testing data set; and providing a recommended pipeline for use with the testing data set based, at least in part, on the pipeline accuracy for each potential pipeline.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: March 26, 2024
    Assignee: International Business Machines Corporation
    Inventors: Saket Sathe, Gregory Bramble, Horst Cornelius Samulowitz, Charu C. Aggarwal
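The selection steps in the abstract for patent 11941541 can be sketched roughly as follows. This is an illustration only, not the patented method: the abstract does not say how similarity between data sets is measured, so the cosine similarity over meta-feature vectors, the similarity-weighted scoring, and all names (`top_pipelines`, `recommend`, `perf`, `train_meta`, `test_meta`, `evaluate`) are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_pipelines(perf, train_meta, test_meta, k):
    """Pick k potential pipelines for a testing data set.

    perf[i][j] is the accuracy of pipeline j on training data set i
    (the performance matrix). train_meta[i] and test_meta are hypothetical
    meta-feature vectors used here to measure data set similarity.
    """
    weights = [cosine(m, test_meta) for m in train_meta]
    scores = []
    for j in range(len(perf[0])):
        # Accuracies on similar training data sets count more.
        score = sum(w * perf[i][j] for i, w in enumerate(weights))
        scores.append((score, j))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


def recommend(candidates, evaluate):
    """Run each potential pipeline on the testing data set and keep the best.

    evaluate(j) is a caller-supplied function returning the accuracy of
    pipeline j on the testing data set.
    """
    results = {j: evaluate(j) for j in candidates}
    best = max(results, key=results.get)
    return best, results
```

In this sketch, `top_pipelines` narrows the field using only stored accuracies, so the costly `evaluate` call runs on just `k` pipelines.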
  • Patent number: 11861469
    Abstract: An embodiment of the invention may include a method, computer program product, and system for creating a data analysis tool. The method may include a computing device that generates an AI pipeline based on an input dataset, wherein the AI pipeline is generated using an Automated Machine Learning program. The method may include converting the AI pipeline to a non-native format of the Automated Machine Learning program. This may enable the AI pipeline to be used outside of the Automated Machine Learning program, thereby increasing the usefulness of the created program by not tying it to the Automated Machine Learning program. Additionally, this may increase the efficiency of running the AI pipeline by eliminating unnecessary computations performed by the Automated Machine Learning program.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: January 2, 2024
    Assignee: International Business Machines Corporation
    Inventors: Peter Daniel Kirchner, Gregory Bramble, Horst Cornelius Samulowitz, Dakuo Wang, Arunima Chaudhary, Gregory Filla
  • Patent number: 11829799
    Abstract: A method, a structure, and a computer system for predicting pipeline training requirements. The exemplary embodiments may include receiving one or more worker node features from one or more worker nodes, extracting one or more pipeline features from one or more pipelines to be trained, and extracting one or more dataset features from one or more datasets used to train the one or more pipelines. The exemplary embodiments may further include predicting an amount of one or more resources required for each of the one or more worker nodes to train the one or more pipelines using the one or more datasets based on one or more models that correlate the one or more worker node features, one or more pipeline features, and one or more dataset features with the one or more resources. Lastly, the exemplary embodiments may include identifying a worker node requiring a least amount of the one or more resources of the one or more worker nodes for training the one or more pipelines.
    Type: Grant
    Filed: October 13, 2020
    Date of Patent: November 28, 2023
    Assignee: International Business Machines Corporation
    Inventors: Saket Sathe, Gregory Bramble, Long Vu, Theodoros Salonidis
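The final step in the abstract for patent 11829799, identifying the worker node that needs the least of a resource, can be sketched as below. The abstract only says that "one or more models" correlate features with resources; the linear model, its `weights`, and the names `predict_resource` and `cheapest_worker` are assumptions made for illustration.

```python
def predict_resource(weights, features):
    """Hypothetical linear model: predicted resource use is a weighted sum of features."""
    return sum(w * f for w, f in zip(weights, features))


def cheapest_worker(workers, pipeline_features, dataset_features, weights):
    """Return the worker node with the lowest predicted resource requirement.

    workers maps a node name to its list of worker node features. The
    pipeline and data set features are shared across all nodes.
    """
    estimates = {}
    for name, node_features in workers.items():
        features = list(node_features) + list(pipeline_features) + list(dataset_features)
        estimates[name] = predict_resource(weights, features)
    best = min(estimates, key=estimates.get)
    return best, estimates
```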
  • Publication number: 20230342627
    Abstract: Predefined pipelines may be created with predefined meta-features. Time series data may be segmented using lookback window parameters. Meta-features may be determined for windowed data. Those of the predefined pipelines having a maximum amount of matching predefined meta-features may be determined. Those of the lookback window parameters that result in the windowed data having the meta-features most similar to the meta-features of one or more of the plurality of predefined pipelines may be identified.
    Type: Application
    Filed: April 22, 2022
    Publication date: October 26, 2023
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Long VU, Saket K SATHE, Peter Daniel KIRCHNER, Gregory BRAMBLE
  • Patent number: 11681931
    Abstract: A system that provides a mathematical formulation for the new problem of model validation and model selection in the presence of test data feedback. The system comprises a memory that stores computer-executable components. A processor, operably coupled to the memory, executes the computer-executable components stored in the memory. A selection component selects a metric of performance evaluation accuracy; and a configuration component configures performance evaluation schemes for machine learning algorithms. A characterization component employs a supervised learning-based approach to characterize the relationship between the configuration of the performance evaluation scheme and the fidelity of performance estimates; and an optimization component optimizes accuracy of the machine learning algorithms as a function of the size of the training data set relative to the size of the validation data set through selection of values associated with the configuration parameters.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: June 20, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bo Zhang, Gregory Bramble, Parikshit Ram, Horst Cornelius Samulowitz
  • Publication number: 20230153634
    Abstract: A domain of an input dataset is identified and one or more archived domain knowledge features corresponding to the identified domain are identified. One or more user feature definitions for one or more user features defined by a user are inputted. The identified archived domain knowledge features and the user features are processed to generate a set of candidate features for presentation to the user. A selection of a subset of the candidate features is obtained from the user and one or more predictive models are generated based on the selected features.
    Type: Application
    Filed: November 14, 2021
    Publication date: May 18, 2023
    Inventors: Dakuo Wang, Udayan Khurana, Chuang Gan, Gregory Bramble, Abel Valente, Arunima Chaudhary, Carolina Maria Spina, Micah Smith
  • Patent number: 11620582
    Abstract: Techniques regarding one or more automated machine learning processes that analyze time series data are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a time series analysis component that selects a machine learning pipeline for meta transfer learning on time series data by sequentially allocating subsets of training data from the time series data amongst a plurality of machine learning pipeline candidates.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: April 4, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bei Chen, Long Vu, Syed Yousaf Shah, Xuan-Hong Dang, Peter Daniel Kirchner, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Dhavalkumar C. Patel, Gregory Bramble, Horst Cornelius Samulowitz, Saket Sathe, Chuang Gan
  • Publication number: 20230061222
    Abstract: Disclosed herein is a method of training an artificial intelligence model that comprises an iterative training loop. Said iterative training loop comprises: receiving a current set of training data; dividing said current set of training data into a predetermined number of training data subsets; sequentially training said artificial intelligence model with each of said predetermined number of training data subsets using a training portion and calculating a performance metric using a validation portion; and comparing performance metrics from a previous iteration of said iterative training loop to said calculated performance metric to determine if an improving performance metric condition is met. Said method further comprises halting said iterative training loop unless said improving performance metric condition is not met at least once within a predetermined number of previous iterations.
    Type: Application
    Filed: August 30, 2021
    Publication date: March 2, 2023
    Inventors: Lukasz G. Cmielowski, Daniel Jakub Ryszka, Wojciech Sobala, Gregory Bramble
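The iterative training loop in the abstract for publication 20230061222 can be sketched as follows. This is a reading of the abstract, not the claimed method: the halting rule is interpreted here as "stop once no iteration among the last `patience` iterations improved on its predecessor", and the names `split_subsets`, `train_with_patience`, `train_step`, `validate`, `patience`, and `val_fraction` are all hypothetical.

```python
def split_subsets(data, n_subsets):
    """Divide data into n_subsets contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), n_subsets)
    subsets, start = [], 0
    for i in range(n_subsets):
        end = start + size + (1 if i < extra else 0)
        subsets.append(data[start:end])
        start = end
    return subsets


def train_with_patience(batches, n_subsets, train_step, validate,
                        patience, val_fraction=0.2):
    """Train on successive sets of data until the metric stops improving.

    train_step(rows) updates the model; validate(rows) returns a metric
    where higher is better. Both are supplied by the caller.
    """
    history, improved = [], []
    for data in batches:
        scores = []
        for subset in split_subsets(data, n_subsets):
            # Split each subset into a training and a validation portion.
            cut = max(1, int(len(subset) * (1 - val_fraction)))
            train_step(subset[:cut])
            scores.append(validate(subset[cut:]))
        metric = sum(scores) / len(scores)
        # The first iteration has nothing to compare against and counts as improving.
        improved.append(not history or metric > history[-1])
        history.append(metric)
        if not any(improved[-patience:]):
            break
    return history
```

Averaging the per-subset metrics into one value per iteration is a simplification chosen for this sketch; the abstract does not say how the metrics are aggregated.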
  • Patent number: 11514361
    Abstract: Embodiments for providing automated machine learning visualization. Machine learning tasks, transformers, and estimators may be received into one or more machine learning composition modules. The machine learning composition modules generate one or more machine learning models. A machine learning model pipeline is a sequence of transformers and estimators, and an ensemble of machine learning pipelines is a collection of such pipelines. A machine learning model pipeline, an ensemble of a plurality of machine learning model pipelines, or a combination thereof, along with corresponding metadata, may be generated using the machine learning composition modules. Metadata may be extracted from the machine learning model pipeline, the ensemble of a plurality of machine learning model pipelines, or combination thereof.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: November 29, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Theodoros Salonidis, John Eversman, Dakuo Wang, Alex Swain, Gregory Bramble, Lin Ju, Nicholas Mazzitelli, Voranouth Supadulya
  • Publication number: 20220327058
    Abstract: To automate time series forecasting machine learning pipeline generation, a data allocation size of time series data may be determined based on one or more characteristics of a time series data set. The time series data may be allocated for use by candidate machine learning pipelines based on the data allocation size. Features for the time series data may be determined and cached by the candidate machine learning pipelines. Predictions of each of the candidate machine learning pipelines using at least the one or more features may be evaluated. A ranked list of machine learning pipelines may be automatically generated from the candidate machine learning pipelines for time series forecasting based upon evaluating predictions of each of the one or more candidate machine learning pipelines.
    Type: Application
    Filed: March 15, 2022
    Publication date: October 13, 2022
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Long VU, Bei CHEN, Xuan-Hong DANG, Peter Daniel KIRCHNER, Syed Yousaf SHAH, Dhavalkumar C. PATEL, Si Er HAN, Ji Hui YANG, Jun WANG, Jing James XU, Dakuo WANG, Gregory BRAMBLE, Horst Cornelius SAMULOWITZ, Saket K. SATHE, Wesley M. GIFFORD, Petros ZERFOS
  • Publication number: 20220261598
    Abstract: To rank time series forecasting in machine learning pipelines, time series data may be incrementally allocated from a time series data set for testing by candidate machine learning pipelines based on seasonality or a degree of temporal dependence of the time series data. Intermediate evaluation scores may be provided by each of the candidate machine learning pipelines following each time series data allocation. One or more machine learning pipelines may be automatically selected from a ranked list of the one or more candidate machine learning pipelines based on a projected learning curve generated from the intermediate evaluation scores.
    Type: Application
    Filed: October 26, 2021
    Publication date: August 18, 2022
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bei CHEN, Long VU, Dhavalkumar C. PATEL, Syed Yousaf SHAH, Gregory BRAMBLE, Peter Daniel KIRCHNER, Horst Cornelius SAMULOWITZ, Xuan-Hong DANG, Petros ZERFOS
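The ranking idea in the abstract for publication 20220261598 can be sketched as below: each candidate pipeline reports an intermediate score after every data allocation, and candidates are ranked by a projected score at the full data size. The abstract does not give the form of the learning curve, so the log-linear fit and the names `project_score` and `rank_pipelines` are assumptions. The seasonality-based choice of allocation sizes is not modeled here.

```python
import math


def project_score(sizes, scores, full_size):
    """Fit score = a + b * log(size) by least squares and evaluate at full_size."""
    xs = [math.log(s) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx if sxx else 0.0
    return mean_y + slope * (math.log(full_size) - mean_x)


def rank_pipelines(intermediate, sizes, full_size):
    """Rank pipelines by projected score, best first.

    intermediate maps a pipeline name to its scores, one per allocation
    size in sizes.
    """
    projected = {name: project_score(sizes, scores, full_size)
                 for name, scores in intermediate.items()}
    return sorted(projected, key=projected.get, reverse=True)
```

A pipeline that trails after the first allocation can still rank first if its scores rise faster, which is the point of projecting the curve rather than comparing the latest scores.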
  • Patent number: 11416469
    Abstract: In an approach to unsupervised feature learning for relational data, a computer trains one or more entity aware autoencoders on one or more tables in a relational database, where each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, and where each of the one or more entity aware autoencoders comprises an encoder and a decoder. A computer transforms each of the one or more tables in the relational database with the encoder of the corresponding trained entity aware autoencoder. A computer joins a first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database to form one or more joined tables. A computer aggregates the one or more joined tables. A computer outputs one or more feature representations.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: August 16, 2022
    Assignee: International Business Machines Corporation
    Inventors: Thanh Lam Hoang, Long Vu, Theodoros Salonidis, Gregory Bramble
  • Publication number: 20220207444
    Abstract: A system and method for assessing Pay-As-You-Go (PAYG) Automatic machine learned (AutoML) model pipeline charge to a user on the basis of performance improvement achieved by configuring a model pipeline with performance enhancements relative to a performance obtained by a base model pipeline. The method performs a ranking of pipelines (customized models) based on a user-specified metric (for example, prediction accuracy, run time, F1 score) or combination of metrics. The price for ranked pipelines is specified based on a “surrogate” model where the surrogate model is fit to the base model price and the maximum price for a model. The base model price relates to use of a current cloud resource utilization-based pricing model. The pricing per model pipeline increments on the basis of performance metric(s) in a linear fashion, e.g., using a linear pricing model, or in an exponential fashion, e.g., using a fixed percentage hike price model.
    Type: Application
    Filed: December 30, 2020
    Publication date: June 30, 2022
    Inventors: Gregory Bramble, Saket Sathe, Long Vu, Theodoros Salonidis, Horst Cornelius Samulowitz, Jean-François Puget
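The two pricing curves named in the abstract for publication 20220207444 can be illustrated with a little arithmetic. This is a sketch under stated assumptions: performance improvement is normalized to a fraction `t` between 0 (base pipeline) and 1 (best pipeline), and the function names are hypothetical. The abstract only says the surrogate model is fit to the base price and the maximum price.

```python
def improvement_fraction(score, base_score, best_score):
    """Normalize a pipeline's score to [0, 1] relative to the base and best pipelines."""
    if best_score == base_score:
        return 0.0
    t = (score - base_score) / (best_score - base_score)
    return min(1.0, max(0.0, t))


def linear_price(t, base_price, max_price):
    """Linear pricing: price rises by a fixed amount per unit of improvement."""
    return base_price + t * (max_price - base_price)


def percentage_hike_price(t, base_price, max_price):
    """Exponential pricing: price rises by a fixed percentage per unit of improvement."""
    return base_price * (max_price / base_price) ** t
```

Both curves agree at the endpoints (`t = 0` gives the base price, `t = 1` gives the maximum price). In between, the percentage-hike curve stays below the linear one, so mid-ranked pipelines are priced lower under it.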
  • Publication number: 20220164698
    Abstract: A method to automatically assess data quality of data input into a machine learning model and remediate the data includes receiving input data for an automated machine learning model. Selections for multiple data quality metrics are displayed. A selection for data quality metrics is received. The data quality metrics are determined according to the selection. Selections for data remediation strategies based on the selection of the data quality metrics are displayed. A selection for remediation recommendation strategies is received. The selected data remediation strategies are performed on the input data. Learning from the selection of the data quality metrics and the selection for the remediation strategies is performed. A new customized machine learning model is generated based on the learning.
    Type: Application
    Filed: November 25, 2020
    Publication date: May 26, 2022
    Inventors: Arunima Chaudhary, Dakuo Wang, Abel Valente, Carolina Maria Spina, Hima Patel, Nitin Gupta, Gregory Bramble, Horst Cornelius Samulowitz, Sameep Mehta, Theodoros Salonidis, Daniel M. Gruen, Chuang Gan
  • Publication number: 20220164332
    Abstract: In an approach to unsupervised feature learning for relational data, a computer trains one or more entity aware autoencoders on one or more tables in a relational database, where each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, and where each of the one or more entity aware autoencoders comprises an encoder and a decoder. A computer transforms each of the one or more tables in the relational database with the encoder of the corresponding trained entity aware autoencoder. A computer joins a first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database to form one or more joined tables. A computer aggregates the one or more joined tables. A computer outputs one or more feature representations.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Thanh Lam Hoang, Long Vu, Theodoros Salonidis, Gregory Bramble
  • Publication number: 20220114019
    Abstract: A method, a structure, and a computer system for predicting pipeline training requirements. The exemplary embodiments may include receiving one or more worker node features from one or more worker nodes, extracting one or more pipeline features from one or more pipelines to be trained, and extracting one or more dataset features from one or more datasets used to train the one or more pipelines. The exemplary embodiments may further include predicting an amount of one or more resources required for each of the one or more worker nodes to train the one or more pipelines using the one or more datasets based on one or more models that correlate the one or more worker node features, one or more pipeline features, and one or more dataset features with the one or more resources. Lastly, the exemplary embodiments may include identifying a worker node requiring a least amount of the one or more resources of the one or more worker nodes for training the one or more pipelines.
    Type: Application
    Filed: October 13, 2020
    Publication date: April 14, 2022
    Inventors: Saket Sathe, Gregory Bramble, Long Vu, Theodoros Salonidis
  • Publication number: 20220083881
    Abstract: An automated analytic tool (AAT) apparatus analyzes a machine learning system (MLS). The tool comprises a processor configured to receive experiment parameters associated with an experiment performed on the MLS, and captures information from a plurality of stages of the experiment. The information comprises information regarding MLS results and choices made during the experiment. The tool automatically revises the captured information into revised information utilizing a knowledge base comprising information from prior experiments. The tool then presents the revised information to a user.
    Type: Application
    Filed: September 14, 2020
    Publication date: March 17, 2022
    Inventors: Arunima Chaudhary, Dakuo Wang, David John Piorkowski, Daniel M. Gruen, Chuang Gan, Peter Daniel Kirchner, Gregory Bramble, Bei Chen, Abel Valente, Carolina Maria Spina, John Thomas Richards, Abhishek Bhandwaldar
  • Publication number: 20220076144
    Abstract: The exemplary embodiments disclose a method, a computer program product, and a computer system for determining that one or more model pipelines satisfy one or more constraints. The exemplary embodiments may include detecting a user uploading data, one or more constraints, and one or more model pipelines, collecting the data, the one or more constraints, and the one or more model pipelines, and determining that one or more of the model pipelines satisfies all of the one or more constraints based on applying one or more algorithms to the collected data, constraints, and model pipelines.
    Type: Application
    Filed: September 9, 2020
    Publication date: March 10, 2022
    Inventors: Parikshit Ram, Dakuo Wang, Deepak Vijaykeerthy, Vaibhav Saxena, Sijia Liu, Arunima Chaudhary, Gregory Bramble, Horst Cornelius Samulowitz, Alexander Gray
  • Publication number: 20220051049
    Abstract: A computer automatically selects a machine learning model pipeline using a meta-learning machine learning model. The computer receives ground truth data and pipeline preference metadata. The computer determines a group of pipelines appropriate for the ground truth data, and each of the pipelines includes an algorithm. The pipelines may include data preprocessing routines. The computer generates hyperparameter sets for the pipelines. The computer applies preprocessing routines to ground truth data to generate a group of preprocessed sets of said ground truth data and ranks hyperparameter set performance for each pipeline to establish a preferred set of hyperparameters for each pipeline. The computer selects favored data features and applies each of the pipelines, with associated sets of preferred hyperparameters, to score the favored data features of the preprocessed ground truth data. The computer ranks pipeline performance and selects a candidate pipeline according to the ranking.
    Type: Application
    Filed: August 11, 2020
    Publication date: February 17, 2022
    Inventors: Dakuo Wang, Chuang Gan, Gregory Bramble, Lisa Amini, Horst Cornelius Samulowitz, Kiran A. Kate, Bei Chen, Martin Wistuba, Alexandre Evfimievski, Ioannis Katsis, Yunyao Li, Adelmo Cristiano Innocenza Malossi, Andrea Bartezzaghi, Ban Kawas, Sairam Gurajada, Lucian Popa, Tejaswini Pedapati, Alexander Gray