Abstract: A method comprises receiving data points from a spreadsheet, mapping the data points to a reference space, generating a cover of the reference space, clustering the data points mapped to the reference space to determine each node of a graph, each node including at least one data point, generating a visualization depicting the nodes, the visualization including an edge between every two nodes that share at least one data point, generating a translation data structure indicating the location of the data points in the spreadsheet as well as the membership of each node, detecting a selection of at least one node, determining, using the translation data structure, the location in the spreadsheet of a first set of data points that are members of the selected node(s), and providing a first command to a spreadsheet application to provide a first visual identification of the first set of data points in the spreadsheet.
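The translation data structure and the shared-member edge rule described in this abstract can be illustrated with a minimal Python sketch. The class name `TranslationTable`, its fields, and the (row, column) cell layout are hypothetical choices made for illustration; the abstract does not specify any of them, and no real spreadsheet API is called here.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple

Cell = Tuple[int, int]  # hypothetical (row, column) address in the spreadsheet


@dataclass
class TranslationTable:
    # data point id -> spreadsheet cell where that point's record starts
    location: Dict[int, Cell] = field(default_factory=dict)
    # node id -> ids of the data points that are members of the node
    members: Dict[str, Set[int]] = field(default_factory=dict)

    def cells_for_nodes(self, node_ids: Iterable[str]) -> List[Cell]:
        """Spreadsheet locations of all points in the selected node(s)."""
        points: Set[int] = set()
        for node_id in node_ids:
            points |= self.members.get(node_id, set())
        return sorted(self.location[p] for p in points)


def shared_member_edges(members: Dict[str, Set[int]]) -> List[Tuple[str, str]]:
    """One edge for every pair of nodes sharing at least one data point."""
    ids = sorted(members)
    return [(a, b) for i, a in enumerate(ids) for b in ids[i + 1:]
            if members[a] & members[b]]
```

The cells returned by `cells_for_nodes` are what a caller would pass to the spreadsheet application's highlight command; that command depends on the application and is left out of the sketch.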
Abstract: An example method comprises receiving a multidimensional data set, receiving a predetermined number of features for a set of landmark features, when a current number of features of the set of landmark features is less than the predetermined number: for each landmark feature of the set of landmark features, calculating a distance between that particular landmark feature and each non-selected feature that is not within the set and identifying a closest non-selected feature to that particular landmark feature, identifying a particular closest non-selected feature related to a largest distance among the distances, and adding the particular closest non-selected feature to the set of landmark features, and, if the current number of features of the set of landmark features is equal to or greater than the predetermined number of features for the set of landmark features, providing identification of at least a subset of features of the set of landmark features.
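The selection loop in this abstract can be sketched in Python as follows. This is a literal reading of the abstract's wording, not a confirmed implementation: the Euclidean distance, the seed landmark `seed_index`, and the function name `select_landmarks` are assumptions, since the abstract does not say how the set is initialized or which distance is used.

```python
import numpy as np


def select_landmarks(features: np.ndarray, k: int, seed_index: int = 0) -> list:
    """features has one row per feature; returns indices of landmark features.

    Assumptions: Euclidean distance, and a single seed landmark to start from.
    """
    n = features.shape[0]
    landmarks = [seed_index]
    while len(landmarks) < min(k, n):
        remaining = [i for i in range(n) if i not in landmarks]
        best_dist, best_idx = -1.0, None
        for lm in landmarks:
            # distance from this landmark to every non-selected feature
            d = np.linalg.norm(features[remaining] - features[lm], axis=1)
            j = int(np.argmin(d))  # closest non-selected feature to this landmark
            # keep the closest-feature candidate with the largest distance
            if d[j] > best_dist:
                best_dist, best_idx = float(d[j]), remaining[j]
        landmarks.append(best_idx)
    return landmarks
```

Read this way, each landmark nominates its nearest non-selected feature, and the nominee with the largest distance is added. The more common max-min (farthest-point) sampling instead scores each non-selected feature by its distance to the nearest landmark, so the two procedures can select different features.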
Abstract: An example method comprises receiving first data associated with data points, receiving a lens function selection, a metric function selection, and a resolution function, the metric function identified by the metric function selection being capable of performing functions on data as matrix functions, mapping second data based on the first data to a reference space by utilizing matrix-vector multiplication to apply the selected lens function to the second data, generating a cover of the reference space including the second data, clustering the second data in the cover based on the selected metric function to determine each node of a plurality of nodes, each of the nodes comprising members representative of at least one subset of the data points, and generating a visualization comprising the plurality of nodes and a plurality of edges, wherein each of the edges connects nodes with shared members.
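The pipeline in this abstract (lens by matrix-vector multiplication, cover, per-cover-set clustering, edges between nodes with shared members) can be sketched in Python under several assumptions. The linear lens `X @ w`, the overlapping-interval cover controlled by `n_intervals` and `overlap` (standing in for the resolution), the Euclidean metric, and the single-linkage threshold `eps` are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def linear_lens(X: np.ndarray, w: np.ndarray) -> np.ndarray:
    # lens applied as a matrix-vector product: one reference value per point
    return X @ w


def interval_cover(values, n_intervals, overlap):
    # equal-length intervals over the lens range, each padded so neighbors overlap
    lo, hi = float(values.min()), float(values.max())
    length = (hi - lo) / n_intervals
    pad = length * overlap / 2
    return [(lo + i * length - pad, lo + (i + 1) * length + pad)
            for i in range(n_intervals)]


def cluster(X, idx, eps):
    # single-linkage clusters: connected components of the eps-neighbor graph
    unvisited = {int(i) for i in idx}
    clusters = []
    while unvisited:
        start = unvisited.pop()
        comp, stack = {start}, [start]
        while stack:
            p = stack.pop()
            near = [q for q in unvisited if np.linalg.norm(X[p] - X[q]) <= eps]
            for q in near:
                unvisited.remove(q)
                comp.add(q)
                stack.append(q)
        clusters.append(frozenset(comp))
    return clusters


def mapper(X, w, n_intervals=3, overlap=0.5, eps=1.5):
    values = linear_lens(X, w)
    nodes = []
    for a, b in interval_cover(values, n_intervals, overlap):
        idx = np.flatnonzero((values >= a) & (values <= b))
        nodes.extend(cluster(X, idx, eps))  # each cluster becomes a node
    # connect every pair of nodes that share at least one member
    edges = [(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))
             if nodes[i] & nodes[j]]
    return nodes, edges
```

Because neighboring cover intervals overlap, a point that falls in the overlap is clustered twice and ends up as a member of two nodes, which is what produces the edges.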
Abstract: An example method includes receiving analysis data and an output indicator, mapping data points from a transposition of the analysis data to a reference space, generating a cover of the reference space, clustering the data points mapped to the reference space using the cover and a metric function to determine each node of a plurality of nodes, for each node, identifying the data points that are members of that node to identify similar features, grouping features as being similar to each other based on the node(s), for each feature, determining a correlation with at least some data associated with the output indicator and generating a correlation score, displaying at least the groupings of similar features and the correlation scores, receiving a selection of features, generating a set of models based on the selection, determining a fit of each generated model to output data and generating a model score, and generating a model recommendation report.
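The correlation-scoring and model-scoring steps of this abstract can be illustrated with a short Python sketch. The specific choices here are assumptions, not taken from the abstract: absolute Pearson correlation as the correlation score, ordinary least-squares linear models over every non-empty subset of the selected features as the set of models, and R² as the model score. The function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def correlation_scores(X: np.ndarray, y: np.ndarray) -> list:
    # absolute Pearson correlation of each feature column with the output
    return [abs(float(np.corrcoef(X[:, j], y)[0, 1])) for j in range(X.shape[1])]


def r_squared(X: np.ndarray, y: np.ndarray, cols) -> float:
    # least-squares linear fit with an intercept, scored by R^2
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    total = float(((y - y.mean()) ** 2).sum())
    return 1.0 - float(resid @ resid) / total


def recommend(X: np.ndarray, y: np.ndarray, selected) -> list:
    # one candidate model per non-empty subset of the selected features,
    # returned best-scoring first as a simple recommendation report
    report = [(cols, r_squared(X, y, cols))
              for r in range(1, len(selected) + 1)
              for cols in combinations(selected, r)]
    return sorted(report, key=lambda entry: entry[1], reverse=True)
```

Enumerating all subsets grows exponentially with the number of selected features, so this sketch only suits a small selection, such as one representative feature per group of similar features.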