Patents by Inventor Suratna Budalakoti
Suratna Budalakoti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11907263
Abstract: A computer measures for each column in many rows, a respective frequency of statements that filter the column in a workload of database statements, a respective count of distinct values used for filtration on the column in each statement individually, a respective frequency of each of the counts of distinct values used for filtration on the column across all of the database statements, and a respective value range of the column for each of many storage zones. A respective efficiency is measured for each of many distinct interleaved sorts. Each interleaved sort uses a respective distinct subset of the columns. Each interleaved sort is based on portions of each of the values for each row in a sampled subset of rows in each column of the subset of the columns of the interleaved sort. Efficiency measurement is based on frequencies of statements, value ranges of columns for each storage zone, and frequencies of counts of distinct values.
Type: Grant
Filed: October 11, 2022
Date of Patent: February 20, 2024
Assignee: Oracle International Corporation
Inventor: Suratna Budalakoti
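The abstract does not say how the "portions" of each value are combined. One common reading is bit interleaving (a Z-order or Morton key), and the minimal sketch below assumes that reading. The function names `interleave_key` and `interleaved_sort` and the fixed bit width are hypothetical and are not taken from the patent.

```python
def interleave_key(values, bits=8):
    """Build a sort key by taking one bit at a time from each column value.

    Assumption: values are non-negative integers that fit in `bits` bits.
    """
    key = 0
    for bit in range(bits - 1, -1, -1):      # most significant bit first
        for v in values:                      # round-robin over the columns
            key = (key << 1) | ((v >> bit) & 1)
    return key


def interleaved_sort(rows, columns, bits=8):
    """Sort rows (tuples) by the interleaved key of the chosen column subset."""
    return sorted(rows, key=lambda r: interleave_key([r[c] for c in columns], bits))


rows = [(3, 0), (0, 3), (1, 1), (2, 2)]
ordered = interleaved_sort(rows, columns=[0, 1], bits=2)
```

Rows whose values are close in every chosen column tend to land near each other in this order, which is what lets per-zone value ranges stay narrow for several columns at once.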
-
Patent number: 11868346
Abstract: Techniques to create zone maps automatically and efficiently for database query processing are disclosed. The techniques comprise creating a sample dataset to represent an original dataset, building a query workload modeler to characterize a full workload of queries, constructing a clustering quality evaluator to evaluate query performance on a dataset with a certain clustering on the columns, finding a clustering solution by evaluating different applications of the workload to the sample dataset corresponding to different clusterings, and determining which columns of the clustering solution could benefit from zone maps.
Type: Grant
Filed: December 30, 2020
Date of Patent: January 9, 2024
Assignee: Oracle International Corporation
Inventor: Suratna Budalakoti
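To make the idea of evaluating a clustering against a workload concrete, here is a minimal sketch that assumes a zone map stores only a per-zone minimum and maximum for one column, and that clustering quality is scored by how many zones a range query must scan. The names `build_zone_map` and `zones_scanned` are hypothetical; the patent's actual evaluator is not described at this level of detail.

```python
def build_zone_map(values, zone_size):
    """Return one (min, max) pair per fixed-size zone of the column."""
    zones = (values[i:i + zone_size] for i in range(0, len(values), zone_size))
    return [(min(z), max(z)) for z in zones]


def zones_scanned(zone_map, low, high):
    """Count zones whose [min, max] range overlaps the query range [low, high]."""
    return sum(1 for zmin, zmax in zone_map if zmax >= low and zmin <= high)


sample = [5, 1, 9, 3, 7, 2, 8, 4]           # sampled column values, storage order
unclustered = build_zone_map(sample, zone_size=2)
clustered = build_zone_map(sorted(sample), zone_size=2)

# The same query touches fewer zones once the data is clustered on the column.
before = zones_scanned(unclustered, 1, 2)
after = zones_scanned(clustered, 1, 2)
```

Repeating this comparison for each candidate clustering and each query in a modeled workload gives a rough score for which columns benefit from zone maps.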
-
Patent number: 11537594
Abstract: Herein are quantitative analytics to increase the accuracy of cardinality estimation without increasing sample size. In an embodiment, a computer selects a few sample values from a multiset. A high-frequency exact count of distinct values that have at least a threshold amount of occurrences in the sample values is counted. A low-frequency exact count of distinct values in the sample that do not have at least the threshold amount of occurrences in the sample is counted. Based on multiple binomial probabilities, an upper bound of a count of missing distinct values in the multiset that are not in the sample is calculated. A total count of distinct values (NDV) in the multiset is estimated based on: a) the high-frequency exact count of distinct values, b) the low-frequency exact count of distinct values, and c) the upper bound of the count of missing distinct values in the multiset that are not in the sample.
Type: Grant
Filed: February 5, 2021
Date of Patent: December 27, 2022
Assignee: Oracle International Corporation
Inventor: Suratna Budalakoti
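The two exact counts in the abstract can be sketched as follows. This is only an illustration: the binomial upper bound is not specified in the abstract, so it is passed in as the hypothetical parameter `missing_upper_bound`, and combining the three terms by simple addition is an assumption rather than the patented formula.

```python
from collections import Counter


def split_sample_counts(sample, threshold):
    """Return (high, low): distinct values seen at least / fewer than `threshold` times."""
    freq = Counter(sample)
    high = sum(1 for count in freq.values() if count >= threshold)
    low = len(freq) - high
    return high, low


def estimate_ndv(sample, threshold, missing_upper_bound):
    """Assumed combination: sum of the two exact counts and the supplied bound."""
    high, low = split_sample_counts(sample, threshold)
    return high + low + missing_upper_bound


sample = [1, 1, 1, 2, 2, 3]
high, low = split_sample_counts(sample, threshold=2)
ndv = estimate_ndv(sample, threshold=2, missing_upper_bound=4)
```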
-
Patent number: 11520834
Abstract: Techniques are described for generating an approximate frequency histogram using a series of Bloom filters (BF). For example, to estimate the f1 and f2 cardinalities in a dataset, an ordered chain of three BFs is established (“BF1”, “BF2”, and “BF3”). An insertion operation is performed for each datum in the dataset, whereby the BFs are tested in order (starting at BF1) for the datum. If the datum is represented in a currently-tested BF, the subsequent BF in the chain is tested for the datum. If the datum is not represented in the currently-tested BF, the datum is added to the BF, a counter for the BF is incremented, and the insertion operation for the current datum ends. To estimate the cardinality of f1-values in the dataset, the BF2-counter is subtracted from the BF1-counter. Similarly, to estimate the cardinality of f2-values in the dataset, the BF3-counter is subtracted from the BF2-counter.
Type: Grant
Filed: July 28, 2021
Date of Patent: December 6, 2022
Assignee: Oracle International Corporation
Inventors: Tomas Karnagel, Suratna Budalakoti, Onur Kocberber, Nipun Agarwal, Alan Wood
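The insertion procedure in this abstract can be sketched directly. The `BloomFilter` class below is a small illustrative implementation using SHA-256 with a seed prefix; the filter size, hash count, and all names are assumptions, not details from the patent.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; sizes are illustrative defaults, not tuned values."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)


def estimate_f1_f2(data):
    """Estimate how many distinct values occur exactly once (f1) and twice (f2)."""
    filters = [BloomFilter() for _ in range(3)]   # BF1, BF2, BF3
    counters = [0, 0, 0]
    for datum in data:
        for i, bf in enumerate(filters):
            if datum not in bf:
                bf.add(datum)        # first time this filter sees the datum
                counters[i] += 1
                break                # insertion for this datum ends here
            # already represented: move on to the next filter in the chain
    return counters[0] - counters[1], counters[1] - counters[2]


f1, f2 = estimate_f1_f2(["a", "b", "b", "c", "c", "c", "d"])
```

In this input, "a" and "d" occur once and "b" occurs twice, so the counters end at 4, 2, and 1. Bloom filter false positives can skew the counters on large inputs, which is why the result is an estimate.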
-
Publication number: 20220253425
Abstract: Herein are quantitative analytics to increase the accuracy of cardinality estimation without increasing sample size. In an embodiment, a computer selects a few sample values from a multiset. A high-frequency exact count of distinct values that have at least a threshold amount of occurrences in the sample values is counted. A low-frequency exact count of distinct values in the sample that do not have at least the threshold amount of occurrences in the sample is counted. Based on multiple binomial probabilities, an upper bound of a count of missing distinct values in the multiset that are not in the sample is calculated. A total count of distinct values (NDV) in the multiset is estimated based on: a) the high-frequency exact count of distinct values, b) the low-frequency exact count of distinct values, and c) the upper bound of the count of missing distinct values in the multiset that are not in the sample.
Type: Application
Filed: February 5, 2021
Publication date: August 11, 2022
Inventor: Suratna Budalakoti
-
Publication number: 20220207032
Abstract: Techniques to create zone maps automatically and efficiently for database query processing are disclosed. The techniques comprise creating a sample dataset to represent an original dataset, building a query workload modeler to characterize a full workload of queries, constructing a clustering quality evaluator to evaluate query performance on a dataset with a certain clustering on the columns, finding a clustering solution by evaluating different applications of the workload to the sample dataset corresponding to different clusterings, and determining which columns of the clustering solution could benefit from zone maps.
Type: Application
Filed: December 30, 2020
Publication date: June 30, 2022
Inventor: Suratna Budalakoti
-
Patent number: 11113282
Abstract: Techniques are provided for merging (a) statistics associated with data added to a table in a bulk load operation with (b) statistics associated with data that existed in the table before the bulk load operation. The statistics associated with the bulk load data are generated on-the-fly during the bulk load, and are merged with the pre-existing statistics as part of the same transaction that is used to perform the bulk load operation. Consequently, execution plans for queries that are assigned snapshot times after the commit time of the bulk load transaction will be selected based on the new statistics, while execution plans for queries that are assigned snapshot times before the commit time of the bulk load transaction will be selected based on the pre-existing statistics.
Type: Grant
Filed: September 28, 2018
Date of Patent: September 7, 2021
Assignee: Oracle International Corporation
Inventors: Sunil P. Chakkappen, Hong Su, Mohamed Zait, Suratna Budalakoti
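A minimal sketch of the merge-and-versioning behavior, assuming the statistics are just a row count and a column minimum and maximum, and that each committed version is tagged with its commit time. The `ColumnStats` and `StatsHistory` names are hypothetical, and treating a snapshot time equal to the commit time as "after" is an assumption the abstract does not settle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnStats:
    row_count: int
    min_value: float
    max_value: float


def merge(existing, loaded):
    """Combine pre-existing statistics with those gathered during the bulk load."""
    return ColumnStats(
        existing.row_count + loaded.row_count,
        min(existing.min_value, loaded.min_value),
        max(existing.max_value, loaded.max_value),
    )


class StatsHistory:
    """Keeps one statistics version per commit so each query sees a consistent view."""

    def __init__(self, initial):
        self.versions = [(0, initial)]            # (commit_time, stats), ascending

    def commit_bulk_load(self, commit_time, loaded):
        # The merged statistics become visible at the load's commit time.
        self.versions.append((commit_time, merge(self.versions[-1][1], loaded)))

    def stats_for_snapshot(self, snapshot_time):
        chosen = self.versions[0][1]
        for commit_time, stats in self.versions:
            if commit_time <= snapshot_time:      # assumption: ties count as "after"
                chosen = stats
        return chosen


history = StatsHistory(ColumnStats(100, 1, 50))
history.commit_bulk_load(commit_time=10, loaded=ColumnStats(20, 40, 90))
```

A query with snapshot time 5 would be planned with the original statistics, while one with snapshot time 15 would see the merged version.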
-
Patent number: 11074516
Abstract: Dynamic generation and implementation of assignment mappings of data items in large data files to distributed processors to achieve objectives such as reduced overall processing time or the like. Any appropriate key (e.g., character string) can be identified or obtained for each data item in a data file and the file can be segmented into sequential data blocks, where each data block includes a set of data items. The data items in each of a first plurality of the blocks (e.g., sampled block set) may be initially sorted into one of a plurality of key ranges of a search space (each corresponding to a different respective processor) and analyses conducted on the data item totals in each key range. The key range boundaries can be adjusted by accounting for uncertainty in the sample estimates to more evenly distribute data items from all blocks sent to each processor and thereby achieve the objective.
Type: Grant
Filed: January 26, 2018
Date of Patent: July 27, 2021
Assignee: Oracle International Corporation
Inventors: Randall Smith, Suratna Budalakoti, Alan Wood
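As a simplified illustration of mapping keys to processors through key ranges, the sketch below picks range boundaries at evenly spaced quantiles of the sampled keys. It deliberately omits the uncertainty-based boundary adjustment the abstract mentions, and the function names `choose_boundaries` and `assign_processor` are hypothetical.

```python
import bisect


def choose_boundaries(sampled_keys, num_processors):
    """Pick num_processors - 1 boundary keys at evenly spaced sample quantiles."""
    ordered = sorted(sampled_keys)
    n = len(ordered)
    return [ordered[(i * n) // num_processors] for i in range(1, num_processors)]


def assign_processor(key, boundaries):
    """Return the index of the key range (and processor) that the key falls into."""
    return bisect.bisect_right(boundaries, key)


sampled = ["a", "b", "c", "d", "e", "f", "g", "h"]   # keys from the sampled blocks
boundaries = choose_boundaries(sampled, num_processors=4)
targets = [assign_processor(k, boundaries) for k in ["a", "c", "f", "z"]]
```

Keys from the remaining, unsampled blocks would be routed with the same `assign_processor` call once the boundaries are fixed.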
-
Publication number: 20190236474
Abstract: Dynamic generation and implementation of assignment mappings of data items in large data files to distributed processors to achieve objectives such as reduced overall processing time or the like. Any appropriate key (e.g., character string) can be identified or obtained for each data item in a data file and the file can be segmented into sequential data blocks, where each data block includes a set of data items. The data items in each of a first plurality of the blocks (e.g., sampled block set) may be initially sorted into one of a plurality of key ranges of a search space (each corresponding to a different respective processor) and analyses conducted on the data item totals in each key range. The key range boundaries can be adjusted by accounting for uncertainty in the sample estimates to more evenly distribute data items from all blocks sent to each processor and thereby achieve the objective.
Type: Application
Filed: January 26, 2018
Publication date: August 1, 2019
Inventors: Randall Smith, Suratna Budalakoti, Alan Wood
-
Patent number: 10353900
Abstract: The present invention provides a re-partitioning-based sampling system and method which provides for generating a synopsis from large database tables such that an aggregation query performed on the synopsis provides an approximate answer to the aggregation query which is in prescribed error bounds relative to a query on the full database. The system includes a partition function generator, a synopsis vector calculator, and a synopsis constructor. The synopsis constructed by the system is sufficiently small to be held in memory to allow quick and resource efficient satisficing of aggregation queries.
Type: Grant
Filed: July 24, 2015
Date of Patent: July 16, 2019
Assignee: Oracle International Corporation
Inventors: Suratna Budalakoti, Alan Wood, Garret Swart, Smriti Ramakrishnan
-
Publication number: 20190102427
Abstract: Techniques are provided for merging (a) statistics associated with data added to a table in a bulk load operation with (b) statistics associated with data that existed in the table before the bulk load operation. The statistics associated with the bulk load data are generated on-the-fly during the bulk load, and are merged with the pre-existing statistics as part of the same transaction that is used to perform the bulk load operation. Consequently, execution plans for queries that are assigned snapshot times after the commit time of the bulk load transaction will be selected based on the new statistics, while execution plans for queries that are assigned snapshot times before the commit time of the bulk load transaction will be selected based on the pre-existing statistics.
Type: Application
Filed: September 28, 2018
Publication date: April 4, 2019
Inventors: Sunil P. Chakkappen, Hong Su, Mohamed Zait, Suratna Budalakoti
-
Publication number: 20170024452
Abstract: The present invention provides a re-partitioning-based sampling system and method which provides for generating a synopsis from large database tables such that an aggregation query performed on the synopsis provides an approximate answer to the aggregation query which is in prescribed error bounds relative to a query on the full database. The system includes a partition function generator, a synopsis vector calculator, and a synopsis constructor. The synopsis constructed by the system is sufficiently small to be held in memory to allow quick and resource efficient satisficing of aggregation queries.
Type: Application
Filed: July 24, 2015
Publication date: January 26, 2017
Inventors: Suratna Budalakoti, Alan Wood, Garret Swart, Smriti Ramakrishnan