Abstract: A computer-implemented method comprising: receiving, with a computer, first and second datasets; performing, with the computer, column discovery on the first and second datasets using a first trained machine-learning model to produce a column map that indexes one or more columns in the first dataset to one or more columns in the second dataset; performing, with the computer, row discovery on the first and second datasets using a second trained machine-learning model, a trained approximate nearest neighbor index, and the column discovery to produce a row map that indexes one or more rows in the first dataset to one or more rows in the second dataset; combining, with the computer, the first and second datasets using the column map and the row map to form a combined dataset; and performing one or more actions with the combined dataset.
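The flow recited above (column discovery, then row discovery, then combination) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the function names `column_map`, `embed`, `cosine`, `row_map`, and `combine` are hypothetical, Jaccard overlap of column values stands in for the first trained machine-learning model, a character-trigram count vector stands in for the second trained model, and an exhaustive cosine search stands in for the trained approximate nearest neighbor index. Datasets are assumed to be non-empty lists of dictionaries.

```python
from collections import Counter
from math import sqrt


def column_map(left, right, threshold=0.3):
    """Index each left column to the right column whose values overlap most.

    Stand-in for the first trained model: Jaccard overlap of value sets.
    Assumes both datasets are non-empty lists of dicts.
    """
    mapping = {}
    for lcol in left[0]:
        lvals = {str(row[lcol]).lower() for row in left}
        best, best_score = None, threshold
        for rcol in right[0]:
            rvals = {str(row[rcol]).lower() for row in right}
            score = len(lvals & rvals) / len(lvals | rvals)
            if score > best_score:
                best, best_score = rcol, score
        if best is not None:
            mapping[lcol] = best
    return mapping


def embed(text):
    """Stand-in for the second trained model: character-trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def row_map(left, right, cols, min_sim=0.5):
    """Index each left row to its most similar right row.

    Only mapped columns are compared, so row discovery depends on the
    column map. The exhaustive search here stands in for an approximate
    nearest neighbor index, which would avoid scanning every row.
    """
    right_vecs = [
        embed(" ".join(str(row[cols[lcol]]) for lcol in cols)) for row in right
    ]
    mapping = {}
    for i, row in enumerate(left):
        query = embed(" ".join(str(row[lcol]) for lcol in cols))
        sims = [cosine(query, vec) for vec in right_vecs]
        j = max(range(len(sims)), key=sims.__getitem__)
        if sims[j] >= min_sim:
            mapping[i] = j
    return mapping


def combine(left, right, cols, rows):
    """Merge matched rows, adding right-only columns to each left row."""
    mapped_right = set(cols.values())
    combined = []
    for i, j in sorted(rows.items()):
        merged = dict(left[i])
        for rcol, value in right[j].items():
            if rcol not in mapped_right:
                merged[rcol] = value
        combined.append(merged)
    return combined


# Made-up example data, for illustration only.
first = [
    {"name": "Alice Smith", "city": "Boston"},
    {"name": "Bob Jones", "city": "Denver"},
    {"name": "Carol White", "city": "Austin"},
]
second = [
    {"full_name": "bob jones", "town": "Denver", "age": 41},
    {"full_name": "alice smith", "town": "Boston", "age": 35},
    {"full_name": "carol whyte", "town": "Austin", "age": 52},
]

cols = column_map(first, second)
rows = row_map(first, second, cols)
combined = combine(first, second, cols, rows)
```

In this sketch the misspelled "carol whyte" still pairs with "Carol White" because the trigram vectors of the two rows remain close. A production system would presumably replace the exhaustive search with a real approximate nearest neighbor index and the hand-written similarity measures with trained models, as the abstract describes.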
Abstract: A computer-implemented method for automatically determining data relationships includes generating a graphical user interface (GUI) that allows a user to intuitively form a customized model of data from different data sources. The GUI includes icons that represent data sources, data variable selection, data modeling, and data prediction. The icons can be logically arranged to form a customized model without any additional user input or knowledge of data modeling. A prediction GUI allows the user to set customized weights of data variables in the model to form predictive controls for data prediction such as in what-if scenarios.