Patents by Inventor Horst Cornelius Samulowitz

Horst Cornelius Samulowitz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

QUALITY ASSESSMENT OF EXTRACTED FEATURES FROM HIGH-DIMENSIONAL MACHINE LEARNING DATASETS

Publication number: 20220292107

Abstract: A quality determination method, system, and computer program product that includes performing a dimensionality reduction on a high-dimensional dataset to form a dimensional-reduced dataset and determining, using a machine learning tool executed on a computing device, a quality of the dimensional-reduced dataset via a review of an extracted feature extracted from the dimensional-reduced dataset.

Type: Application

Filed: February 26, 2021

Publication date: September 15, 2022

Inventors: Petr Novotny, Aindrila Basak, Shaikh Shahriar Quader, Horst Cornelius Samulowitz, Chad Marston

AUTOMATED TIME SERIES FORECASTING PIPELINE RANKING

Publication number: 20220261598

Abstract: To rank time series forecasting in machine learning pipelines, time series data may be incrementally allocated from a time series data set for testing by candidate machine learning pipelines based on seasonality or a degree of temporal dependence of the time series data. Intermediate evaluation scores may be provided by each of the candidate machine learning pipelines following each time series data allocation. One or more machine learning pipelines may be automatically selected from a ranked list of the one or more candidate machine learning pipelines based on a projected learning curve generated from the intermediate evaluation scores.

Type: Application

Filed: October 26, 2021

Publication date: August 18, 2022

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bei CHEN, Long VU, Dhavalkumar C. PATEL, Syed Yousaf SHAH, Gregory BRAMBLE, Peter Daniel KIRCHNER, Horst Cornelius SAMULOWITZ, Xuan-Hong DANG, Petros ZERFOS
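
The ranking idea in the abstract above (intermediate scores after each data allocation, then a projected learning curve) can be illustrated with a minimal sketch. The abstract does not say how the curve is projected, so the log-linear fit below is an assumption, as are the names `project_final_score` and `rank_pipelines` and the example scores. Allocation by seasonality or temporal dependence is not modeled here.

```python
import numpy as np

def project_final_score(alloc_sizes, scores, full_size):
    """Fit score = a + b * log(n) to the intermediate scores (an assumed
    curve shape) and extrapolate to the full data set size."""
    slope, intercept = np.polyfit(np.log(alloc_sizes), scores, deg=1)
    return float(intercept + slope * np.log(full_size))

def rank_pipelines(intermediate_scores, alloc_sizes, full_size):
    """Return pipeline names ordered by projected score, best first."""
    projected = {
        name: project_final_score(alloc_sizes, scores, full_size)
        for name, scores in intermediate_scores.items()
    }
    return sorted(projected, key=projected.get, reverse=True)

# Hypothetical intermediate evaluation scores after three allocations.
alloc_sizes = [100, 200, 400]
intermediate_scores = {
    "pipeline_a": [0.60, 0.65, 0.70],  # gains 0.05 per doubling of data
    "pipeline_b": [0.68, 0.69, 0.70],  # gains 0.01 per doubling of data
}
ranking = rank_pipelines(intermediate_scores, alloc_sizes, full_size=1600)
print(ranking)
```

Both hypothetical pipelines reach the same score at the last allocation, but the projection favors the one that is still improving, which is the kind of decision a projected learning curve makes possible before all the data has been used.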

IMPLEMENTING PAY-AS-YOU-GO (PAYG) AUTOMATED MACHINE LEARNING AND AI

Publication number: 20220207444

Abstract: A system and method for assessing Pay-As-You-Go (PAYG) Automatic machine learned (AutoML) model pipeline charge to a user on the basis of performance improvement achieved by configuring a model pipeline with performance enhancements relative to a performance obtained by a base model pipeline. The method performs a ranking of pipelines (customized models) based on a user-specified metric (for example, prediction accuracy, run time, F1 score) or combination of metrics. The price for ranked pipelines is specified based on a “surrogate” model where the surrogate model is fit to the base model price and the maximum price for a model. The base model price relates to use of a current cloud resource utilization-based pricing model. The pricing per model pipeline increments on the basis of performance metric(s) in a linear fashion, e.g., using a linear pricing model, or in an exponential fashion, e.g., using a fixed percentage hike price model.

Type: Application

Filed: December 30, 2020

Publication date: June 30, 2022

Inventors: Gregory Bramble, Saket Sathe, Long Vu, Theodoros Salonidis, Horst Cornelius Samulowitz, Jean-François Puget
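
The two pricing shapes named in the abstract above, linear and fixed-percentage hike, can be compared with a small worked sketch. It assumes that prices are assigned per rank step between a base price and a maximum price. The function names and the example figures are hypothetical, and the abstract's surrogate model is reduced here to these two closed-form schedules.

```python
def linear_prices(base_price, max_price, n_steps):
    """Prices rise by a constant amount from base_price to max_price."""
    if n_steps < 2:
        raise ValueError("need at least a base step and a top step")
    step = (max_price - base_price) / (n_steps - 1)
    return [base_price + k * step for k in range(n_steps)]

def percentage_hike_prices(base_price, max_price, n_steps):
    """Prices rise by a constant percentage from base_price to max_price."""
    if n_steps < 2:
        raise ValueError("need at least a base step and a top step")
    rate = (max_price / base_price) ** (1 / (n_steps - 1))
    return [base_price * rate ** k for k in range(n_steps)]

# Hypothetical figures: base pipeline costs 10, best pipeline costs 40.
linear = linear_prices(10.0, 40.0, 3)
hiked = percentage_hike_prices(10.0, 40.0, 3)
print(linear)
print(hiked)
```

With the same endpoints, the percentage schedule charges less for the middle step than the linear one, so the choice of schedule decides how much of the price increase falls on mid-ranked pipelines.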

AUTOMATED DATA QUALITY INSPECTION AND IMPROVEMENT FOR AUTOMATED MACHINE LEARNING

Publication number: 20220164698

Abstract: A method to automatically assess data quality of data input into a machine learning model and remediate the data includes receiving input data for an automated machine learning model. Selections for multiple data quality metrics are displayed. A selection for data quality metrics is received. The data quality metrics are determined according to the selection. Selections for data remediation strategies based on the selection of the data quality metrics are displayed. A selection for remediation recommendation strategies is received. The selected data remediation strategies are performed on the input data. Learning from the selection of the data quality metrics and the selection for the remediation strategies is performed. A new customized machine learning model is generated based on the learning.

Type: Application

Filed: November 25, 2020

Publication date: May 26, 2022

Inventors: Arunima Chaudhary, Dakuo Wang, Abel Valente, Carolina Maria Spina, Hima Patel, Nitin Gupta, Gregory Bramble, Horst Cornelius Samulowitz, Sameep Mehta, Theodoros Salonidis, Daniel M. Gruen, Chaung Gan

Random feature transformation forests for automatic feature engineering

Patent number: 11275974

Abstract: Embodiments for automated feature engineering by one or more processors are described. One or more selected transformations may be applied to a set of features in a dataset to create a set of transform features using random feature transformation forest (RFTF) classifiers. A transform feature may be selected from the set of transform features having a highest discriminative power as compared to other features of the set of transform features. At each node in a decision tree, the selected feature, a split value, and the one or more selected transformations for the transform feature are stored.

Type: Grant

Filed: September 17, 2018

Date of Patent: March 15, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Saket Sathe, Deepak S. Turaga, Horst Cornelius Samulowitz, Charu C. Aggarwal

MACHINE LEARNING WITH MULTIPLE CONSTRAINTS

Publication number: 20220076144

Abstract: The exemplary embodiments disclose a method, a computer program product, and a computer system for determining that one or more model pipelines satisfy one or more constraints. The exemplary embodiments may include detecting a user uploading data, one or more constraints, and one or more model pipelines, collecting the data, the one or more constraints, and the one or more model pipelines, and determining that one or more of the model pipelines satisfies all of the one or more constraints based on applying one or more algorithms to the collected data, constraints, and model pipelines.

Type: Application

Filed: September 9, 2020

Publication date: March 10, 2022

Inventors: Parikshit Ram, Dakuo Wang, Deepak Vijaykeerthy, Vaibhav Saxena, Sijia Liu, Arunima Chaudhary, Gregory Bramble, Horst Cornelius Samulowitz, Alexander Gray

USING META-LEARNING TO OPTIMIZE AUTOMATIC SELECTION OF MACHINE LEARNING PIPELINES

Publication number: 20220051049

Abstract: A computer automatically selects a machine learning model pipeline using a meta-learning machine learning model. The computer receives ground truth data and pipeline preference metadata. The computer determines a group of pipelines appropriate for the ground truth data, and each of the pipelines includes an algorithm. The pipelines may include data preprocessing routines. The computer generates hyperparameter sets for the pipelines. The computer applies preprocessing routines to ground truth data to generate a group of preprocessed sets of said ground truth data and ranks hyperparameter set performance for each pipeline to establish a preferred set of hyperparameters for each pipeline. The computer selects favored data features and applies each of the pipelines, with associated sets of preferred hyperparameters, to score the favored data features of the preprocessed ground truth data. The computer ranks pipeline performance and selects a candidate pipeline according to the ranking.

Type: Application

Filed: August 11, 2020

Publication date: February 17, 2022

Inventors: Dakuo Wang, Chuang Gan, Gregory Bramble, Lisa Amini, Horst Cornelius Samulowitz, Kiran A. Kate, Bei Chen, Martin Wistuba, Alexandre Evfimievski, Ioannis Katsis, Yunyao Li, Adelmo Cristiano Innocenza Malossi, Andrea Bartezzaghi, Ban Kawas, Sairam Gurajada, Lucian Popa, Tejaswini Pedapati, Alexander Gray

AUTOMATED MACHINE LEARNING USING NEAREST NEIGHBOR RECOMMENDER SYSTEMS

Publication number: 20220044078

Abstract: Methods, computer program products and/or systems are provided that perform the following operations: obtaining a performance matrix representing accuracies obtained by executing a plurality of pipelines on a plurality of training data sets, wherein a pipeline comprises a series of operations performed on a data set; selecting a defined number of top pipelines as potential pipelines for a testing data set based, at least in part, on a similarity between the testing data set and each of the plurality of training data sets represented in the performance matrix; storing results from executing each of the potential pipelines as a new data set; determining a pipeline accuracy for each of the potential pipelines when executed against the testing data set; and providing a recommended pipeline for use with the testing data set based, at least in part, on the pipeline accuracy for each potential pipeline.

Type: Application

Filed: August 10, 2020

Publication date: February 10, 2022

Inventors: Saket Sathe, Gregory Bramble, Horst Cornelius Samulowitz, Charu C. Aggarwal
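
The operations listed in the abstract above can be sketched in a few lines: look up the training data sets most similar to the testing data set, take the top pipelines from their rows of the performance matrix, then evaluate only those candidates. The abstract does not define the similarity measure, so the Euclidean distance over meta-feature vectors used here is an assumption, as are the names `candidate_pipelines` and `recommend` and all of the numbers.

```python
import numpy as np

def candidate_pipelines(perf, train_meta, test_meta, n_neighbors, top_k):
    """Pick the top_k pipelines by mean accuracy on the n_neighbors
    training data sets closest to the testing data set."""
    dists = np.linalg.norm(train_meta - test_meta, axis=1)
    nearest = np.argsort(dists)[:n_neighbors]
    mean_acc = perf[nearest].mean(axis=0)
    return [int(i) for i in np.argsort(-mean_acc)[:top_k]]

def recommend(candidates, evaluate):
    """Run each candidate on the testing data set and return the best one."""
    scores = {p: evaluate(p) for p in candidates}
    return max(scores, key=scores.get), scores

# Hypothetical performance matrix: rows are data sets, columns are pipelines.
perf = np.array([
    [0.9, 0.5, 0.7, 0.6],
    [0.8, 0.6, 0.9, 0.5],
    [0.2, 0.9, 0.3, 0.8],
])
train_meta = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
test_meta = np.array([0.4, 0.0])

candidates = candidate_pipelines(perf, train_meta, test_meta, n_neighbors=2, top_k=2)

# Stand-in for actually executing a pipeline on the testing data set.
measured = {0: 0.78, 2: 0.83}
best, scores = recommend(candidates, measured.get)
print(candidates, best)
```

Only two of the four pipelines are executed on the testing data set, which is the saving the nearest-neighbor lookup is meant to provide. The measured accuracies could then be appended to `perf` as a new row, in the spirit of the abstract's step of storing results as a new data set.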

AUTOMATED MACHINE LEARNING PIPELINE GENERATION

Publication number: 20220036246

Abstract: Techniques regarding one or more automated machine learning processes that analyze time series data are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a time series analysis component that selects a machine learning pipeline for meta transfer learning on time series data by sequentially allocating subsets of training data from the time series data amongst a plurality of machine learning pipeline candidates.

Type: Application

Filed: July 29, 2020

Publication date: February 3, 2022

Inventors: Bei Chen, Long Vu, Syed Yousaf Shah, Xuan-Hong Dang, Peter Daniel Kirchner, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Dhavalkumar C. Patel, Gregory Bramble, Horst Cornelius Samulowitz, Saket Sathe, Chuang Gan

CODE GENERATION FOR AUTO-AI

Publication number: 20220004914

Abstract: An embodiment of the invention may include a method, computer program product, and system for creating a data analysis tool. The method may include a computing device that generates an AI pipeline based on an input dataset, wherein the AI pipeline is generated using an Automated Machine Learning program. The method may include converting the AI pipeline to a non-native format of the Automated Machine Learning program. This may enable the AI pipeline to be used outside of the Automated Machine Learning program, thereby increasing the usefulness of the created program by not tying it to the Automated Machine Learning program. Additionally, this may increase the efficiency of running the AI pipeline by eliminating unnecessary computations performed by the Automated Machine Learning program.

Type: Application

Filed: July 2, 2020

Publication date: January 6, 2022

Inventors: Peter Daniel Kirchner, Gregory Bramble, Horst Cornelius Samulowitz, Dakuo Wang, Arunima Chaudhary, Gregory Filla

Knowledge Aided Feature Engineering

Publication number: 20210216904

Abstract: Embodiments relate to a system, program product, and method for employing feature engineering to improve classifier performance. A first machine learning (ML) model with a first learning program is selected. The first selected ML model is operatively associated with a first structured dataset. First features in the first dataset directed at performance of the selected ML model are identified. A second structured dataset is assessed with respect to the identified features in the first dataset, and new features in the second dataset are identified, where the new feature is semantically related to the identified features in the first dataset. The first dataset is dynamically augmented with the identified new features in the second dataset. The dynamically augmented first dataset is applied to the selected ML model to subject an embedded learning algorithm of the selected ML model to training using the augmented first dataset.

Type: Application

Filed: January 13, 2020

Publication date: July 15, 2021

Applicant: International Business Machines Corporation

Inventors: Udayan Khurana, Sainyam Galhotra, Oktie Hassanzadeh, Kavitha Srinivas, Horst Cornelius Samulowitz

Methods and systems for feature engineering

Patent number: 11048718

Abstract: Embodiments for feature engineering by one or more processors are described. A plurality of transformations are applied to a set of features in each of a plurality of datasets. An output of each of the plurality of transformations is a score. For each of the sets of features, selecting those of the plurality of transformations for which said score is above a predetermined threshold. A signal representative of said selection is generated.

Type: Grant

Filed: August 10, 2017

Date of Patent: June 29, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Elias Khalil, Udayan Khurana, Fatemeh Nargesian, Horst Cornelius Samulowitz, Deepak S. Turaga
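
The selection step in the abstract above (score each transformation, keep those above a predetermined threshold) can be sketched for a single feature. The abstract does not say how the score is computed, so the absolute Pearson correlation with a target used here is an assumption. The transformation set, the name `select_transforms`, and the data are likewise hypothetical.

```python
import numpy as np

# Hypothetical candidate transformations for a non-negative numeric feature.
TRANSFORMS = {"log1p": np.log1p, "square": np.square, "sqrt": np.sqrt}

def score(transformed, target):
    """Assumed score: absolute Pearson correlation with the target."""
    return float(abs(np.corrcoef(transformed, target)[0, 1]))

def select_transforms(feature, target, threshold):
    """Return the transformations whose score exceeds the threshold,
    together with every score that was computed."""
    scores = {name: score(fn(feature), target) for name, fn in TRANSFORMS.items()}
    selected = [name for name, s in scores.items() if s > threshold]
    return selected, scores

feature = np.arange(1, 11, dtype=float)
target = feature ** 2  # the target depends on the square of the feature

selected, scores = select_transforms(feature, target, threshold=0.99)
print(selected)
```

In this toy case only the squaring transformation clears the threshold, because it reproduces the relationship between feature and target exactly.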

METHODS FOR AUTOMATICALLY CONFIGURING PERFORMANCE EVALUATION SCHEMES FOR MACHINE LEARNING ALGORITHMS

Publication number: 20210089937

Abstract: A system provides a mathematical formulation for a new problem of model validation and model selection in the presence of test data feedback. The system comprises a memory that stores computer-executable components. A processor, operably coupled to the memory, executes the computer-executable components stored in the memory. A selection component selects a metric of performance evaluation accuracy; and a configuration component configures performance evaluation schemes for machine learning algorithms. A characterization component employs a supervised learning-based approach to characterize the relationship between the configuration of the performance evaluation scheme and the fidelity of performance estimates; and an optimization component optimizes the accuracy of the machine learning algorithms as a function of the size of the training data set relative to the size of the validation data set through selection of values associated with the configuration parameters.

Type: Application

Filed: September 24, 2019

Publication date: March 25, 2021

Inventors: Bo Zhang, Gregory Bramble, Parikshit Ram, Horst Cornelius Samulowitz

CREATING OPTIMIZED MACHINE-LEARNING MODELS

Publication number: 20200184380

Abstract: A machine-learning model generation method, system, and computer program product deciding, via a first algorithm, a machine-learning algorithm that is best for customer data, invoking the machine-learning algorithm to train a neural network model with the customer data, analyzing the neural network model produced by the training for an accuracy, and improving the accuracy by iteratively repeating the training of the neural network model until a customer-defined constraint is met, as determined by the first algorithm.

Type: Application

Filed: December 11, 2018

Publication date: June 11, 2020

Inventors: Gegi Thomas, Adelmo Cristiano Innocenza Malossi, Tejaswini Pedapati, Ganesh Venkataraman, Roxana Istrate, Martin Wistuba, Florian Michael Scheidegger, Chao Xue, Rong Yan, Horst Cornelius Samulowitz, Benjamin Herta, Debashish Saha, Hendrik Strobelt

RANDOM FEATURE TRANSFORMATION FORESTS FOR AUTOMATIC FEATURE ENGINEERING

Publication number: 20200090010

Abstract: Embodiments for automated feature engineering by one or more processors are described. One or more selected transformations may be applied to a set of features in a dataset to create a set of transform features using random feature transformation forest (RFTF) classifiers. A transform feature may be selected from the set of transform features having a highest discriminative power as compared to other features of the set of transform features. At each node in a decision tree, the selected feature, a split value, and the one or more selected transformations for the transform feature are stored.

Type: Application

Filed: September 17, 2018

Publication date: March 19, 2020

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Saket SATHE, Deepak S. TURAGA, Horst Cornelius SAMULOWITZ, Charu C. AGGARWAL

METHODS AND SYSTEMS FOR FEATURE ENGINEERING

Publication number: 20190050465

Abstract: Embodiments for feature engineering by one or more processors are described. A plurality of transformations are applied to a set of features in each of a plurality of datasets. An output of each of the plurality of transformations is a score. For each of the sets of features, selecting those of the plurality of transformations for which said score is above a predetermined threshold. A signal representative of said selection is generated.

Type: Application

Filed: August 10, 2017

Publication date: February 14, 2019

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Elias KHALIL, Udayan KHURANA, Fatemeh NARGESIAN, Horst Cornelius SAMULOWITZ, Deepak S. TURAGA

Managing a portfolio of experts

Patent number: 8433660

Abstract: Managing a portfolio of experts is described where the experts may be, for example, automated experts or human experts. In an embodiment a selection engine selects an expert from a portfolio of experts and assigns the expert to a specified task. For example, the selection engine has a Bayesian machine learning system which is iteratively updated each time an expert's performance on a task is observed. For example, sparsely active binary task and expert feature vectors are input to the selection engine which maps those feature vectors to a multi-dimensional trait space using a mapping learnt by the machine learning system. In examples, an inner product of the mapped vectors gives an estimate of a probability distribution over expert performance. In an embodiment the experts are automated problem solvers and the task is a hard combinatorial problem such as a constraint satisfaction problem or combinatorial auction.

Type: Grant

Filed: December 1, 2009

Date of Patent: April 30, 2013

Assignee: Microsoft Corporation

Inventors: David Stern, Horst Cornelius Samulowitz, Ralf Herbrich, Thore Graepel
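
The mapping described in the abstract above, from binary task and expert feature vectors to a shared trait space followed by an inner product, can be illustrated with a simplified sketch. It uses fixed matrices as stand-ins for the learnt mapping and returns a single point score per expert, whereas the abstract describes a Bayesian system that yields a probability distribution and is updated after each observation. All names and values below are hypothetical.

```python
import numpy as np

def trait_vector(mapping, binary_features):
    """Map a binary feature vector into the trait space."""
    return mapping @ binary_features

def select_expert(task_features, expert_features, task_map, expert_map):
    """Score each expert by the inner product of the mapped task and
    expert vectors, and return the index of the highest-scoring expert."""
    task_traits = trait_vector(task_map, task_features)
    scores = [
        float(task_traits @ trait_vector(expert_map, x))
        for x in expert_features
    ]
    return int(np.argmax(scores)), scores

# Stand-ins for learnt mappings into a 2-dimensional trait space.
task_map = np.array([[1.0, 0.0, 0.5],
                     [0.0, 1.0, 0.5]])
expert_map = np.array([[1.0, 0.0],
                       [0.0, 1.0]])

task_features = np.array([1.0, 0.0, 1.0])   # sparsely active binary task features
expert_features = [np.array([1.0, 0.0]),    # expert 0
                   np.array([0.0, 1.0])]    # expert 1

chosen, scores = select_expert(task_features, expert_features, task_map, expert_map)
print(chosen, scores)
```

Here the task maps to the trait vector (1.5, 0.5), so the expert aligned with the first trait scores higher and is selected.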

Managing a Portfolio of Experts

Publication number: 20110131163

Abstract: Managing a portfolio of experts is described where the experts may be, for example, automated experts or human experts. In an embodiment a selection engine selects an expert from a portfolio of experts and assigns the expert to a specified task. For example, the selection engine has a Bayesian machine learning system which is iteratively updated each time an expert's performance on a task is observed. For example, sparsely active binary task and expert feature vectors are input to the selection engine which maps those feature vectors to a multi-dimensional trait space using a mapping learnt by the machine learning system. In examples, an inner product of the mapped vectors gives an estimate of a probability distribution over expert performance. In an embodiment the experts are automated problem solvers and the task is a hard combinatorial problem such as a constraint satisfaction problem or combinatorial auction.

Type: Application

Filed: December 1, 2009

Publication date: June 2, 2011

Applicant: Microsoft Corporation

Inventors: David Stern, Horst Cornelius Samulowitz, Ralf Herbrich, Thore Graepel
