Patents by Inventor Suratna Budalakoti

Suratna Budalakoti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Automated interleaved clustering recommendation for database zone maps

Patent number: 11907263

Abstract: A computer measures for each column in many rows, a respective frequency of statements that filter the column in a workload of database statements, a respective count of distinct values used for filtration on the column in each statement individually, a respective frequency of each of the counts of distinct values used for filtration on the column across all of the database statements, and a respective value range of the column for each of many storage zones. A respective efficiency is measured for each of many distinct interleaved sorts. Each interleaved sort uses a respective distinct subset of the columns. Each interleaved sort is based on portions of each of the values for each row in a sampled subset of rows in each column of the subset of the columns of the interleaved sort. Efficiency measurement is based on frequencies of statements, value ranges of columns for each storage zone, and frequencies of counts of distinct values.
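
As a rough illustration of why an interleaved sort can help zone maps, the sketch below builds a bit-interleaved (Z-order) key from two integer columns, sorts rows by it, and counts how many zones an equality filter must scan. It is not the patented method: the function names, the choice of bit interleaving as the "portions of each of the values", the fixed zone size, and the zones-scanned cost are all assumptions made for this example.

```python
def interleave_bits(values, bits):
    """Build a Z-order key by taking one bit per column, most significant first."""
    key = 0
    for bit in range(bits - 1, -1, -1):
        for v in values:
            key = (key << 1) | ((v >> bit) & 1)
    return key


def build_zone_map(rows, zone_size):
    """For each zone of consecutive rows, record (min, max) per column."""
    zone_map = []
    for start in range(0, len(rows), zone_size):
        zone = rows[start:start + zone_size]
        zone_map.append([(min(col), max(col)) for col in zip(*zone)])
    return zone_map


def zones_scanned(zone_map, column, value):
    """Count zones whose value range for `column` could contain `value`."""
    return sum(
        1 for ranges in zone_map
        if ranges[column][0] <= value <= ranges[column][1]
    )


# Hypothetical table: every (x, y) pair with 4-bit values.
rows = [(x, y) for x in range(16) for y in range(16)]
linear = build_zone_map(sorted(rows), zone_size=16)
interleaved = build_zone_map(
    sorted(rows, key=lambda r: interleave_bits(r, bits=4)), zone_size=16
)
```

On this toy table the linear sort prunes well for a filter on the first column but not at all for the second, while the interleaved sort prunes moderately for both. Weighting such counts by how often each column is filtered in a workload would give one possible efficiency score.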

Type: Grant

Filed: October 11, 2022

Date of Patent: February 20, 2024

Assignee: Oracle International Corporation

Inventor: Suratna Budalakoti

Automated linear clustering recommendation for database zone maps

Patent number: 11868346

Abstract: Techniques to create zone maps automatically and efficiently for database query processing are disclosed. The techniques comprise creating a sample dataset to represent an original dataset, building a query workload modeler to characterize a full workload of queries, constructing a clustering quality evaluator to evaluate query performance on a dataset with a certain clustering on the columns, finding a clustering solution by evaluating different applications of the workload to the sample dataset corresponding to different clusterings, and determining which columns of the clustering solution could benefit from zone maps.
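
The evaluation loop described in this abstract can be pictured with a small sketch, given here under stated assumptions rather than as the patented technique. It tries each candidate column ordering on a sample, sorts the sample by it, and scores the ordering by the frequency-weighted number of zones a workload of equality filters would scan. The names `zones_scanned` and `best_linear_clustering`, the workload format `(column, value, frequency)`, and the exhaustive search over orderings are all hypothetical.

```python
from itertools import permutations


def zones_scanned(rows, zone_size, column, value):
    """Count zones whose min/max range for `column` could contain `value`."""
    count = 0
    for start in range(0, len(rows), zone_size):
        col = [r[column] for r in rows[start:start + zone_size]]
        if min(col) <= value <= max(col):
            count += 1
    return count


def best_linear_clustering(sample, workload, zone_size):
    """Return (column_order, cost) with the lowest weighted zones-scanned cost.

    workload is a list of (column_index, filter_value, frequency) tuples.
    """
    num_cols = len(sample[0])
    best = None
    for k in range(1, num_cols + 1):
        for order in permutations(range(num_cols), k):
            ordered = sorted(sample, key=lambda r: tuple(r[c] for c in order))
            cost = sum(
                freq * zones_scanned(ordered, zone_size, col, val)
                for col, val, freq in workload
            )
            if best is None or cost < best[1]:
                best = (order, cost)
    return best


# Hypothetical sample and workload: column 0 is filtered far more often.
sample = [(i % 4, i // 4) for i in range(16)]
workload = [(0, 2, 10), (1, 1, 1)]
result = best_linear_clustering(sample, workload, zone_size=4)
```

Exhaustive search is only practical for a handful of columns; a real system would need to prune the candidate orderings.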

Type: Grant

Filed: December 30, 2020

Date of Patent: January 9, 2024

Assignee: Oracle International Corporation

Inventor: Suratna Budalakoti

Approximate estimation of number of distinct keys in a multiset using a sample

Patent number: 11537594

Abstract: Herein are quantitative analytics to increase the accuracy of cardinality estimation without increasing sample size. In an embodiment, a computer selects a few sample values from a multiset. A high-frequency exact count of distinct values that have at least a threshold amount of occurrences in the sample values is counted. A low-frequency exact count of distinct values in the sample that do not have at least the threshold amount of occurrences in the sample is counted. Based on multiple binomial probabilities, an upper bound of a count of missing distinct values in the multiset that are not in the sample is calculated. A total count of distinct values (NDV) in the multiset is estimated based on: a) the high-frequency exact count of distinct values, b) the low-frequency exact count of distinct values, and c) the upper bound of the count of missing distinct values in the multiset that are not in the sample.
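
The three ingredients named in this abstract can be laid out in a short sketch. The abstract does not say how the upper bound on missing values is derived from the binomial probabilities, nor how the three terms are combined, so the sketch takes the bound as a caller-supplied number and uses a plain sum; both choices, and the function names, are assumptions for illustration only.

```python
from collections import Counter


def split_distinct_counts(sample, threshold):
    """Count distinct sample values at or above, and below, the threshold."""
    frequencies = Counter(sample)
    high = sum(1 for count in frequencies.values() if count >= threshold)
    low = len(frequencies) - high
    return high, low


def estimate_ndv(sample, threshold, missing_upper_bound):
    """Combine the two exact counts with a supplied bound on unseen values.

    The plain sum is an assumed combination, not taken from the patent.
    """
    high, low = split_distinct_counts(sample, threshold)
    return high + low + missing_upper_bound


# Hypothetical sample: "a" and "b" are high-frequency, "c" is low-frequency.
sample = ["a", "a", "a", "b", "b", "c"]
counts = split_distinct_counts(sample, threshold=2)
```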

Type: Grant

Filed: February 5, 2021

Date of Patent: December 27, 2022

Assignee: Oracle International Corporation

Inventor: Suratna Budalakoti

Chaining bloom filters to estimate the number of keys with low frequencies in a dataset

Patent number: 11520834

Abstract: Techniques are described for generating an approximate frequency histogram using a series of Bloom filters (BF). For example, to estimate the f1 and f2 cardinalities in a dataset, an ordered chain of three BFs is established (“BF1”, “BF2”, and “BF3”). An insertion operation is performed for each datum in the dataset, whereby the BFs are tested in order (starting at BF1) for the datum. If the datum is represented in a currently-tested BF, the subsequent BF in the chain is tested for the datum. If the datum is not represented in the currently-tested BF, the datum is added to the BF, a counter for the BF is incremented, and the insertion operation for the current datum ends. To estimate the cardinality of f1-values in the dataset, the BF2-counter is subtracted from the BF1-counter. Similarly, to estimate the cardinality of f2-values in the dataset, the BF3-counter is subtracted from the BF2-counter.
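
The insertion procedure in this abstract maps directly onto a few lines of code. The sketch below is a minimal illustration, assuming a small hand-rolled Bloom filter; the class, the filter size, the number of hash functions, and the use of SHA-256 are choices made for this example and do not come from the patent.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; size and hash count are illustrative choices."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def __contains__(self, item):
        return all(
            self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item)
        )

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)


def estimate_low_frequencies(data, chain_length=3):
    """Return estimates [f1, f2, ...] from a chain of Bloom filters.

    Each datum is added to the first filter in the chain that does not
    already represent it, and that filter's counter is incremented.
    """
    filters = [BloomFilter() for _ in range(chain_length)]
    counters = [0] * chain_length
    for datum in data:
        for i, bf in enumerate(filters):
            if datum not in bf:
                bf.add(datum)
                counters[i] += 1
                break
    # f_k is estimated as counter k minus counter k+1.
    return [counters[i] - counters[i + 1] for i in range(chain_length - 1)]


# "a" and "d" occur once, "b" twice, "c" three times.
estimates = estimate_low_frequencies(["a", "b", "b", "c", "c", "c", "d"])
```

Because Bloom filters can report false positives, the counters can undercount on large datasets, so the results are estimates rather than exact frequencies.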

Type: Grant

Filed: July 28, 2021

Date of Patent: December 6, 2022

Assignee: Oracle International Corporation

Inventors: Tomas Karnagel, Suratna Budalakoti, Onur Kocberber, Nipun Agarwal, Alan Wood

Approximate estimation of number of distinct keys in a multiset using a sample

Publication number: 20220253425

Abstract: Herein are quantitative analytics to increase the accuracy of cardinality estimation without increasing sample size. In an embodiment, a computer selects a few sample values from a multiset. A high-frequency exact count of distinct values that have at least a threshold amount of occurrences in the sample values is counted. A low-frequency exact count of distinct values in the sample that do not have at least the threshold amount of occurrences in the sample is counted. Based on multiple binomial probabilities, an upper bound of a count of missing distinct values in the multiset that are not in the sample is calculated. A total count of distinct values (NDV) in the multiset is estimated based on: a) the high-frequency exact count of distinct values, b) the low-frequency exact count of distinct values, and c) the upper bound of the count of missing distinct values in the multiset that are not in the sample.

Type: Application

Filed: February 5, 2021

Publication date: August 11, 2022

Inventor: Suratna Budalakoti

Automated linear clustering recommendation for database zone maps

Publication number: 20220207032

Abstract: Techniques to create zone maps automatically and efficiently for database query processing are disclosed. The techniques comprise creating a sample dataset to represent an original dataset, building a query workload modeler to characterize a full workload of queries, constructing a clustering quality evaluator to evaluate query performance on a dataset with a certain clustering on the columns, finding a clustering solution by evaluating different applications of the workload to the sample dataset corresponding to different clusterings, and determining which columns of the clustering solution could benefit from zone maps.

Type: Application

Filed: December 30, 2020

Publication date: June 30, 2022

Inventor: Suratna Budalakoti

Online optimizer statistics maintenance during load

Patent number: 11113282

Abstract: Techniques are provided for merging (a) statistics associated with data added to a table in a bulk load operation with (b) statistics associated with data that existed in the table before the bulk load operation. The statistics associated with the bulk load data are generated on-the-fly during the bulk load, and are merged with the pre-existing statistics as part of the same transaction that is used to perform the bulk load operation. Consequently, execution plans for queries that are assigned snapshot times after the commit time of the bulk load transaction will be selected based on the new statistics, while execution plans for queries that are assigned snapshot times before the commit time of the bulk load transaction will be selected based on the pre-existing statistics.
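
A minimal sketch of the two ideas in this abstract follows: merging statistics gathered during a load with pre-existing ones, and choosing which version a query sees from its snapshot time. The `ColumnStats` fields, the function names, and the list of `(commit_time, stats)` versions are hypothetical; only statistics that merge trivially (counts, minimum, maximum) are shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnStats:
    row_count: int
    null_count: int
    min_value: Optional[int]
    max_value: Optional[int]


def stats_of(values):
    """Gather statistics in one pass, as could be done while rows are loaded."""
    present = [v for v in values if v is not None]
    return ColumnStats(
        row_count=len(values),
        null_count=len(values) - len(present),
        min_value=min(present) if present else None,
        max_value=max(present) if present else None,
    )


def _combine(a, b, fn):
    """Apply fn to two optional values, treating None as 'no data'."""
    if a is None:
        return b
    if b is None:
        return a
    return fn(a, b)


def merge(existing, loaded):
    """Merge pre-existing statistics with statistics from the bulk load."""
    return ColumnStats(
        row_count=existing.row_count + loaded.row_count,
        null_count=existing.null_count + loaded.null_count,
        min_value=_combine(existing.min_value, loaded.min_value, min),
        max_value=_combine(existing.max_value, loaded.max_value, max),
    )


def stats_for_snapshot(versions, snapshot_time):
    """Pick the newest statistics committed before the query's snapshot time.

    versions is a list of (commit_time, ColumnStats) sorted by commit_time.
    """
    chosen = None
    for commit_time, stats in versions:
        if commit_time < snapshot_time:
            chosen = stats
    return chosen


old = stats_of([3, None, 7])
merged = merge(old, stats_of([1, 9, None, None]))
versions = [(0, old), (100, merged)]  # bulk load commits at time 100
```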

Type: Grant

Filed: September 28, 2018

Date of Patent: September 7, 2021

Assignee: Oracle International Corporation

Inventors: Sunil P. Chakkappen, Hong Su, Mohamed Zait, Suratna Budalakoti

Load balancing for distributed processing of deterministically assigned data using statistical analysis of block data

Patent number: 11074516

Abstract: Dynamic generation and implementation of assignment mappings of data items in large data files to distributed processors to achieve objectives such as reduced overall processing time. Any appropriate key (e.g., character string) can be identified or obtained for each data item in a data file and the file can be segmented into sequential data blocks, where each data block includes a set of data items. The data items in each of a first plurality of the blocks (e.g., sampled block set) may be initially sorted into one of a plurality of key ranges of a search space (each corresponding to a different respective processor) and analyses conducted on the data item totals in each key range. The key range boundaries can be adjusted by accounting for uncertainty in the sample estimates to more evenly distribute data items from all blocks sent to each processor and thereby achieve the objective.
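
The general idea of deriving key-range boundaries from sampled blocks can be sketched as follows. This is a simplified stand-in, not the patented procedure: it places boundaries at equal-count quantiles of the sampled keys and does not model the adjustment for uncertainty in the sample estimates. The function names and the sample data are hypothetical.

```python
from bisect import bisect_right


def key_range_boundaries(sampled_blocks, num_processors):
    """Choose boundaries that split the sampled keys into equal-count ranges."""
    keys = sorted(key for block in sampled_blocks for key in block)
    return [
        keys[len(keys) * i // num_processors]
        for i in range(1, num_processors)
    ]


def assign_processor(key, boundaries):
    """Map a key to a processor index; the same key always maps the same way."""
    return bisect_right(boundaries, key)


# Hypothetical sampled block set with string keys.
sampled_blocks = [["apple", "banana", "cherry"], ["date", "elder", "fig"]]
boundaries = key_range_boundaries(sampled_blocks, num_processors=3)
assignments = {
    key: assign_processor(key, boundaries)
    for block in sampled_blocks
    for key in block
}
```

Because assignment depends only on the key and the boundaries, every block can be routed independently once the boundaries are fixed.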

Type: Grant

Filed: January 26, 2018

Date of Patent: July 27, 2021

Assignee: Oracle International Corporation

Inventors: Randall Smith, Suratna Budalakoti, Alan Wood

Load balancing for distributed processing of deterministically assigned data using statistical analysis of block data

Publication number: 20190236474

Abstract: Dynamic generation and implementation of assignment mappings of data items in large data files to distributed processors to achieve objectives such as reduced overall processing time. Any appropriate key (e.g., character string) can be identified or obtained for each data item in a data file and the file can be segmented into sequential data blocks, where each data block includes a set of data items. The data items in each of a first plurality of the blocks (e.g., sampled block set) may be initially sorted into one of a plurality of key ranges of a search space (each corresponding to a different respective processor) and analyses conducted on the data item totals in each key range. The key range boundaries can be adjusted by accounting for uncertainty in the sample estimates to more evenly distribute data items from all blocks sent to each processor and thereby achieve the objective.

Type: Application

Filed: January 26, 2018

Publication date: August 1, 2019

Inventors: Randall Smith, Suratna Budalakoti, Alan Wood

System and method for creating an intelligent synopsis of a database using re-partitioning based sampling

Patent number: 10353900

Abstract: The present invention provides a re-partitioning-based sampling system and method which provides for generating a synopsis from large database tables such that an aggregation query performed on the synopsis provides an approximate answer to the aggregation query which is in prescribed error bounds relative to a query on the full database. The system includes a partition function generator, a synopsis vector calculator, and a synopsis constructor. The synopsis constructed by the system is sufficiently small to be held in memory to allow quick and resource efficient satisficing of aggregation queries.

Type: Grant

Filed: July 24, 2015

Date of Patent: July 16, 2019

Assignee: Oracle International Corporation

Inventors: Suratna Budalakoti, Alan Wood, Garret Swart, Smriti Ramakrishnan

Online optimizer statistics maintenance during load

Publication number: 20190102427

Abstract: Techniques are provided for merging (a) statistics associated with data added to a table in a bulk load operation with (b) statistics associated with data that existed in the table before the bulk load operation. The statistics associated with the bulk load data are generated on-the-fly during the bulk load, and are merged with the pre-existing statistics as part of the same transaction that is used to perform the bulk load operation. Consequently, execution plans for queries that are assigned snapshot times after the commit time of the bulk load transaction will be selected based on the new statistics, while execution plans for queries that are assigned snapshot times before the commit time of the bulk load transaction will be selected based on the pre-existing statistics.

Type: Application

Filed: September 28, 2018

Publication date: April 4, 2019

Inventors: Sunil P. Chakkappen, Hong Su, Mohamed Zait, Suratna Budalakoti

System and method for creating an intelligent synopsis of a database using re-partitioning based sampling

Publication number: 20170024452

Abstract: The present invention provides a re-partitioning-based sampling system and method which provides for generating a synopsis from large database tables such that an aggregation query performed on the synopsis provides an approximate answer to the aggregation query which is in prescribed error bounds relative to a query on the full database. The system includes a partition function generator, a synopsis vector calculator, and a synopsis constructor. The synopsis constructed by the system is sufficiently small to be held in memory to allow quick and resource efficient satisficing of aggregation queries.

Type: Application

Filed: July 24, 2015

Publication date: January 26, 2017

Inventors: Suratna Budalakoti, Alan Wood, Garret Swart, Smriti Ramakrishnan