Abstract: A method for rapid data analysis includes receiving and interpreting a first query operating on a first dataset partitioned into shards by a first field; collecting a first data sample from a first set of data shards; calculating a first result to the first query based on analysis of the first data sample; and partitioning a second dataset into shards by a second field based on the first result.
Type:
Grant
Filed:
May 26, 2022
Date of Patent:
May 7, 2024
Assignee:
Scuba Analytics, Inc.
Inventors:
Robert Johnson, Lior Abraham, Ann Johnson, Boris Dimitrov, Don Fossgreen
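For illustration only, the flow in this abstract (sample a subset of shards, answer the query approximately, then use that answer to choose the sharding field for a second dataset) might be sketched as follows. The field names, the CRC32-based shard assignment, and the rule of picking the field with the most distinct sampled values are assumptions made for this sketch; the abstract does not specify any of them.

```python
import random
import zlib


def shard_index(value, n_shards):
    """Map a field value to a shard number with a stable hash (assumed scheme)."""
    return zlib.crc32(str(value).encode("utf-8")) % n_shards


def partition(rows, field, n_shards):
    """Partition rows into n_shards lists keyed by the given field."""
    shards = [[] for _ in range(n_shards)]
    for row in rows:
        shards[shard_index(row[field], n_shards)].append(row)
    return shards


def sample_from_shards(shards, shard_ids, per_shard, rng):
    """Collect a data sample from a chosen set of shards only."""
    sample = []
    for sid in shard_ids:
        shard = shards[sid]
        sample.extend(rng.sample(shard, min(per_shard, len(shard))))
    return sample


def estimate_cardinality(sample, field):
    """Approximate query result: distinct values of a field seen in the sample."""
    return len({row[field] for row in sample})


def choose_shard_field(sample, candidates):
    """Hypothetical rule: shard the next dataset by the most varied field."""
    return max(candidates, key=lambda f: estimate_cardinality(sample, f))


if __name__ == "__main__":
    rng = random.Random(0)
    countries = ["US", "DE", "JP", "BR"]
    first = [{"user_id": i, "country": countries[i % 4]} for i in range(1000)]
    shards = partition(first, "country", 4)

    # Read only the first two non-empty shards instead of the whole dataset.
    shard_ids = [i for i, s in enumerate(shards) if s][:2]
    sample = sample_from_shards(shards, shard_ids, 50, rng)

    field = choose_shard_field(sample, ["country", "user_id"])
    second = partition(first[:100], field, 4)
    print(field, [len(s) for s in second])
```

The point of the sketch is that the query touches only a sample drawn from a few shards, and the layout of later data is adjusted from what that sample reveals.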
Abstract: A method for stratified-sampling-based query execution includes: receiving a query; collecting a first data sample from the first dataset using a non-stratified sampling technique; performing statistical analysis on the first data sample; identifying a stratum classifier from the statistical analysis; generating a stratum classification by calculating strata boundaries for the stratum classifier; and calculating a result to the query based on analysis of the second data sample.
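As a rough sketch of the steps listed in this abstract, the code below draws a uniform pilot sample, derives strata boundaries from it, and then computes a weighted estimate from a stratified second sample. The choice of quantile cut points as boundaries, the mean as the query result, and all function and field names are assumptions for illustration; the abstract does not say how the classifier or boundaries are chosen.

```python
import bisect
import random
import statistics


def pilot_sample(rows, size, rng):
    """First data sample: plain uniform (non-stratified) sampling."""
    return rng.sample(rows, min(size, len(rows)))


def strata_boundaries(sample, field, n_strata):
    """Assumed rule: cut the classifier field at quantiles of the pilot sample."""
    values = [row[field] for row in sample]
    return statistics.quantiles(values, n=n_strata)  # n_strata - 1 cut points


def stratum_of(row, field, boundaries):
    """Index of the stratum that a row falls into."""
    return bisect.bisect_right(boundaries, row[field])


def stratified_mean(rows, field, target, boundaries, per_stratum, rng):
    """Estimate mean(target) from a second sample drawn per stratum.

    This sketch scans all rows to learn the stratum sizes; a real system
    would presumably read those sizes from an index or from metadata.
    """
    strata = {}
    for row in rows:
        strata.setdefault(stratum_of(row, field, boundaries), []).append(row)
    estimate = 0.0
    for members in strata.values():
        picked = rng.sample(members, min(per_stratum, len(members)))
        mean = statistics.fmean([r[target] for r in picked])
        estimate += mean * len(members) / len(rows)  # weight by stratum size
    return estimate


if __name__ == "__main__":
    rng = random.Random(1)
    rows = [{"spend": float(i), "events": 2 * i} for i in range(1000)]
    cuts = strata_boundaries(pilot_sample(rows, 200, rng), "spend", 4)
    print(stratified_mean(rows, "spend", "events", cuts, 50, rng))
```

Weighting each stratum mean by the stratum's share of the dataset keeps the estimate unbiased even when every stratum contributes the same number of sampled rows.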
Abstract: A method for rapid data analysis includes receiving and interpreting a first query operating on a first dataset partitioned into shards by a first field; collecting a first data sample from a first set of data shards; calculating a first result to the first query based on analysis of the first data sample; and partitioning a second dataset into shards by a second field based on the first result.
Type:
Grant
Filed:
June 8, 2020
Date of Patent:
June 28, 2022
Assignee:
Scuba Analytics, Inc.
Inventors:
Robert Johnson, Lior Abraham, Ann Johnson, Boris Dimitrov, Don Fossgreen
Abstract: A method for enhancing rapid data analysis includes receiving a set of data; storing the set of data in a first set of data shards sharded by a first field; and identifying anomalous data from the set of data by monitoring a range of shard indices associated with a first shard of the first set of data shards, detecting that the range of shard indices is smaller than an expected range by a threshold value, and identifying data of the first shard as anomalous data.
Type:
Grant
Filed:
July 9, 2020
Date of Patent:
March 1, 2022
Assignee:
Scuba Analytics, Inc.
Inventors:
Robert Johnson, Oleksandr Barykin, Alex Suhan, Lior Abraham, Don Fossgreen
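One possible reading of the anomaly check in this abstract is sketched below: each shard reports the lowest and highest shard index it holds, and a shard whose observed range falls short of the expected range by at least a threshold is flagged. The `ShardStats` record, its fields, and the exact comparison are hypothetical; the abstract does not define how the range or the threshold is measured.

```python
from dataclasses import dataclass


@dataclass
class ShardStats:
    """Hypothetical per-shard summary of the shard indices observed so far."""
    shard_id: int
    min_index: int
    max_index: int

    @property
    def index_range(self):
        return self.max_index - self.min_index


def find_anomalous_shards(stats, expected_range, threshold):
    """Return ids of shards whose index range is too narrow.

    A shard is flagged when its range is smaller than expected_range
    by at least threshold (assumed interpretation of the abstract).
    """
    return [
        s.shard_id
        for s in stats
        if expected_range - s.index_range >= threshold
    ]


if __name__ == "__main__":
    stats = [
        ShardStats(shard_id=0, min_index=0, max_index=980),
        ShardStats(shard_id=1, min_index=1000, max_index=1040),  # very narrow
        ShardStats(shard_id=2, min_index=2000, max_index=2900),
    ]
    print(find_anomalous_shards(stats, expected_range=1000, threshold=500))
```

A narrow range suggests that many records in the shard share nearly the same key, which is the kind of data such a check would mark as anomalous under this reading.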
Abstract: A method for stratified-sampling-based query execution includes: receiving a query; collecting a first data sample from the first dataset using a non-stratified sampling technique; performing statistical analysis on the first data sample; identifying a stratum classifier from the statistical analysis; generating a stratum classification by calculating strata boundaries for the stratum classifier; and calculating a result to the query based on analysis of the second data sample.