Patents by Inventor Calisto Zuzarte
Calisto Zuzarte has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250013641
Abstract: Aspects of the invention include techniques for robust query execution plan selection. A non-limiting example method includes training a model to estimate, for an input including a query and one or more plans, an execution time for each plan and a respective uncertainty of the execution time. A new query and a search space including a plurality of candidate plans are input to the model. Given an estimated distribution for the execution time of each candidate plan, a suboptimality risk is computed for each candidate plan. A plan of the plurality of candidate plans is selected according to a plan selection policy. The plan selection policy includes at least one of: selecting a plan by assuming that plans have higher costs proportional to an estimated standard deviation of their respective uncertainty; and selecting a plan with minimum risk using model uncertainty, data uncertainty, or a total uncertainty.
Type: Application
Filed: August 10, 2023
Publication date: January 9, 2025
Inventors: Seyed Mohammad Amin Kamali, Calisto Zuzarte, Vincent Corvinelli, Brandon Lewis Frendo, Vasiliki Kantere, Ning Wang
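The first selection policy named in the abstract above (treating each plan as costlier in proportion to the standard deviation of its estimate) can be illustrated with a minimal Python sketch. The `PlanEstimate` type, the `risk_weight` parameter, the plan names, and the numbers are all hypothetical and are not taken from the application; the estimates would come from the trained model, which is not shown here.

```python
from dataclasses import dataclass


@dataclass
class PlanEstimate:
    # Hypothetical container for one candidate plan's model output.
    name: str
    mean_time: float  # estimated execution time
    std_dev: float    # estimated uncertainty (standard deviation)


def select_plan(estimates, risk_weight=1.0):
    """Return the plan with the lowest uncertainty-penalized cost.

    The penalized cost is mean_time + risk_weight * std_dev, so a plan
    with a slightly higher mean can win if its estimate is more certain.
    risk_weight is an assumed tuning knob, not a value from the source.
    """
    return min(estimates, key=lambda p: p.mean_time + risk_weight * p.std_dev)


candidates = [
    PlanEstimate("hash_join_first", mean_time=10.0, std_dev=8.0),
    PlanEstimate("index_nested_loop", mean_time=12.0, std_dev=1.0),
]

# Penalized costs are 18.0 and 13.0, so the more certain plan is chosen.
chosen = select_plan(candidates)
print(chosen.name)
```

With `risk_weight=0.0` the policy reduces to picking the lowest mean estimate, which makes the effect of the penalty easy to compare.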
-
Publication number: 20240427768
Abstract: Aspects of the invention include techniques for providing a learned join cardinality estimation using a join graph representation. A non-limiting example method includes building a join cardinality estimation model. The model can be built by generating a training query having a known join cardinality, generating an adjacency matrix encoding a join graph of the training query, encoding one side of a diagonal axis of the adjacency matrix, and training the join cardinality estimation model using the encoded adjacency matrix and the known join cardinality. The method includes performing an inference using the join cardinality estimation model. The inference includes a predicted join cardinality for a query. The method includes executing a query execution plan for the query using the predicted join cardinality.
Type: Application
Filed: June 22, 2023
Publication date: December 26, 2024
Inventors: Seyed Mohammad Amin Kamali, Vincent Corvinelli, Calisto Zuzarte
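As a rough illustration of the encoding step described above, the following Python sketch builds an adjacency matrix for a join graph and keeps only the entries on one side of the diagonal. It assumes an undirected join graph with 0/1 entries; the function names and table names are hypothetical, and the actual feature encoding used in the application may differ.

```python
def join_graph_adjacency(tables, joins):
    """Build a symmetric 0/1 adjacency matrix for an undirected join graph."""
    index = {table: i for i, table in enumerate(tables)}
    n = len(tables)
    matrix = [[0] * n for _ in range(n)]
    for left, right in joins:
        i, j = index[left], index[right]
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


def encode_upper_triangle(matrix):
    """Flatten the entries strictly above the diagonal, row by row.

    The matrix is symmetric, so one side of the diagonal carries all of
    the join information in n * (n - 1) / 2 values.
    """
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


# Hypothetical three-table query: orders joins customers and lineitem.
tables = ["orders", "customers", "lineitem"]
joins = [("orders", "customers"), ("orders", "lineitem")]

matrix = join_graph_adjacency(tables, joins)
features = encode_upper_triangle(matrix)
print(features)  # [1, 1, 0]
```

The resulting vector would be paired with the known join cardinality of the training query to form one training example.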
-
Publication number: 20240086405
Abstract: Examples described herein provide a computer-implemented method that includes training a machine learning model. The model is trained by generating a set of training queries using at least one of a query workload and relationships between tables in a database, building a query graph for each of the set of training queries, computing, for each training query of the set of training queries, a selectivity based at least in part on the query graph, and building, based at least in part on the set of training queries, an initial join result distribution as a collection of query graphs.
Type: Application
Filed: September 14, 2022
Publication date: March 14, 2024
Inventors: Mohammed Fahd Alhamid, Vincent Corvinelli, Calisto Zuzarte
-
Patent number: 11921719
Abstract: Examples described herein provide a computer-implemented method that includes training a machine learning model. The model is trained by generating a set of training queries using at least one of a query workload and relationships between tables in a database, building a query graph for each of the set of training queries, computing, for each training query of the set of training queries, a selectivity based at least in part on the query graph, and building, based at least in part on the set of training queries, an initial join result distribution as a collection of query graphs.
Type: Grant
Filed: September 14, 2022
Date of Patent: March 5, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Mohammed Fahd Alhamid, Vincent Corvinelli, Calisto Zuzarte
-
Patent number: 11720565
Abstract: A method, a computer system, and a computer program product for cardinality estimation is provided. Embodiments of the present invention include accessing database relations. The database relations are utilized to collect a random sample from each of the database relations. Training data is then generated from the random sample. The training data is used to build a cumulative frequency function (CFF) model. The cumulative frequency function (CFF) model then provides a cardinality estimation for an output for SQL operators.
Type: Grant
Filed: August 27, 2020
Date of Patent: August 8, 2023
Assignee: International Business Machines Corporation
Inventors: Mohamad F. Kalil, Calisto Zuzarte, Mustafa Dawoud, Mohammed Fahd Alhamid, Vincent Corvinelli, Wai Keat Tan, Ronghao Yang
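To make the idea of a cumulative frequency function concrete, here is a small Python sketch that estimates the cardinality of a range predicate from a random sample. It is only a stand-in: it uses the empirical cumulative frequency of the sample directly rather than a trained model, and the function names, column values, and sample size are assumptions made for illustration.

```python
import bisect
import random


def build_cff(sample):
    """Sort the sample so cumulative counts can be read with a binary search."""
    return sorted(sample)


def estimate_range_cardinality(sorted_sample, table_rows, low, high):
    """Estimate how many rows satisfy low < value <= high.

    The number of sample values in the range is scaled up by the ratio
    of table rows to sample size.
    """
    in_range = (bisect.bisect_right(sorted_sample, high)
                - bisect.bisect_right(sorted_sample, low))
    return in_range * table_rows / len(sorted_sample)


# Hypothetical column holding the values 1..10000, sampled at 2 percent.
column = list(range(1, 10001))
sample = random.Random(0).sample(column, 200)
cff = build_cff(sample)

# Rough estimate for: WHERE value > 2500 AND value <= 5000
print(estimate_range_cardinality(cff, len(column), 2500, 5000))
```

The estimate varies with the sample drawn; a learned model, as described in the abstract, would be fitted to training data derived from such samples instead of reading the counts directly.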
-
Patent number: 11593372
Abstract: In an approach to improve query optimization in a database management system, embodiments identify opportunities for improvement in a cardinality estimate using a workload feedback process using a query feedback performed during query compilation. Embodiments identify correlations and relationships based on the structure of the query feedback and the runtime feedback performed, and collect data from the execution of a query to identify errors in estimates of the query optimizer. Further, embodiments submit the query feedback and the runtime feedback to a machine learning engine to update a set of models. Additionally, embodiments update a set of models based on the submitted query feedback and runtime feedback, and output a new, updated, or re-trained model based on collected data from the execution of the query to identify the errors in estimates of the query optimizer, the submitted query feedback and the runtime feedback, or a trained generated mode.
Type: Grant
Filed: July 1, 2020
Date of Patent: February 28, 2023
Assignee: International Business Machines Corporation
Inventors: Vincent Corvinelli, Mohammed Fahd Alhamid, Mohamad F. Kalil, Calisto Zuzarte
-
Publication number: 20220365931
Abstract: Approaches presented herein enable dynamic optimization of a degree to which a query is parallelized for execution. More specifically, a priority associated with an obtained user query for execution is identified. A real-time metric indicating availability of one or more runtime resources is checked. An optimal degree of parallelism is calculated based on the priority associated with the obtained user query and the real-time availability metric. A plan is generated for executing the query using the calculated optimal degree of parallelism.
Type: Application
Filed: May 14, 2021
Publication date: November 17, 2022
Inventors: Gaurav Mehrotra, Calisto Zuzarte, Bhavesh Rathore, Abhishek Iyer
-
Publication number: 20220067045
Abstract: A method, a computer system, and a computer program product for cardinality estimation is provided. Embodiments of the present invention include accessing database relations. The database relations are utilized to collect a random sample from each of the database relations. Training data is then generated from the random sample. The training data is used to build a cumulative frequency function (CFF) model. The cumulative frequency function (CFF) model then provides a cardinality estimation for an output for SQL operators.
Type: Application
Filed: August 27, 2020
Publication date: March 3, 2022
Inventors: Mohamad F. Kalil, Calisto Zuzarte, Mustafa Dawoud, Mohammed Fahd Alhamid, Vincent Corvinelli, Wai Keat Tan, Ronghao Yang
-
Publication number: 20220004553
Abstract: In an approach to improve query optimization in a database management system, embodiments identify opportunities for improvement in a cardinality estimate using a workload feedback process using a query feedback performed during query compilation. Embodiments identify correlations and relationships based on the structure of the query feedback and the runtime feedback performed, and collect data from the execution of a query to identify errors in estimates of the query optimizer. Further, embodiments submit the query feedback and the runtime feedback to a machine learning engine to update a set of models. Additionally, embodiments update a set of models based on the submitted query feedback and runtime feedback, and output a new, updated, or re-trained model based on collected data from the execution of the query to identify the errors in estimates of the query optimizer, the submitted query feedback and the runtime feedback, or a trained generated mode.
Type: Application
Filed: July 1, 2020
Publication date: January 6, 2022
Inventors: Vincent Corvinelli, Mohammed Fahd Alhamid, Mohamad F. Kalil, Calisto Zuzarte
-
Patent number: 11210290
Abstract: A maintenance subsystem of a database-management system (DBMS) receives a database query that requests access to data stored in a database column. The subsystem retrieves or infers frequent-value statistics for that column, each of which specifies the number of times one distinct value is stored in the column. The statistics are partitioned into Keep and Discard clusters and, using statistical or other computational methods based on the column's data distribution, the subsystem determines an optimal number of the statistics that should be kept by the DBMS in order to minimize cost, errors, or other parameters desired by an implementer. The subsystem then directly or indirectly directs a query-optimizer component of the DBMS to consider the optimal number of frequent-value statistics when selecting an optimal data-access plan. The selected plan is then used by the DBMS's storage-manager component to access the column when servicing the received query.
Type: Grant
Filed: January 6, 2020
Date of Patent: December 28, 2021
Assignee: International Business Machines Corporation
Inventors: Mohamad F. Kalil, Vincent Corvinelli, Calisto Zuzarte, Petrus Chan
-
Publication number: 20210209110
Abstract: A maintenance subsystem of a database-management system (DBMS) receives a database query that requests access to data stored in a database column. The subsystem retrieves or infers frequent-value statistics for that column, each of which specifies the number of times one distinct value is stored in the column. The statistics are partitioned into Keep and Discard clusters and, using statistical or other computational methods based on the column's data distribution, the subsystem determines an optimal number of the statistics that should be kept by the DBMS in order to minimize cost, errors, or other parameters desired by an implementer. The subsystem then directly or indirectly directs a query-optimizer component of the DBMS to consider the optimal number of frequent-value statistics when selecting an optimal data-access plan. The selected plan is then used by the DBMS's storage-manager component to access the column when servicing the received query.
Type: Application
Filed: January 6, 2020
Publication date: July 8, 2021
Inventors: Mohamad F. Kalil, Vincent Corvinelli, Calisto Zuzarte, Petrus Chan
-
Patent number: 11036737
Abstract: A computer-implemented method for a partitioned bloom filter merge is provided. A non-limiting example of the computer-implemented method includes partitioning, by a processing device, a bloom filter into N equal size filter partitions. The method further includes distributing, by the processing device, each of the filter partitions to an associated node. The method further includes merging, by the processing device, the filter partitions in each of the associated nodes. The method further includes redistributing, by the processing device, the merged filter partitions to each of the N nodes. The method further includes joining, by the processing device, the merged filter partitions in each of the N nodes to assemble a complete merged bloom filter.
Type: Grant
Filed: May 9, 2019
Date of Patent: June 15, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Naresh K. Chainani, Kiran K. Chinta, Ian R. Finlay, David Kalmuk, Timothy R. Malkemus, Calisto Zuzarte
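The partition, distribute, merge, redistribute, and join steps described above can be simulated in a single process with a short Python sketch. This is one reading of the abstract, not the patented implementation: it assumes each of the N nodes starts with a local filter of the same length, that partition i is sent to node i, and that merging is a bitwise OR. The function names and the list-of-bits representation are hypothetical.

```python
def partition(bits, n):
    """Split a bit list into n equal-size contiguous partitions."""
    size = len(bits) // n
    return [bits[i * size:(i + 1) * size] for i in range(n)]


def merge_partitions(parts):
    """Bitwise-OR several equal-length partitions into one."""
    return [int(any(column)) for column in zip(*parts)]


def partitioned_merge(local_filters):
    """Simulate the merge across N nodes, one local filter per node.

    Returns the complete merged filter as held by each node afterwards.
    """
    n = len(local_filters)
    length = len(local_filters[0])
    if any(len(f) != length for f in local_filters) or length % n:
        raise ValueError("filters must share a length divisible by N")

    # Partition: every node splits its local filter into N pieces.
    pieces = [partition(f, n) for f in local_filters]

    # Distribute and merge: node i receives piece i from every node
    # and ORs those pieces together.
    merged = [merge_partitions([pieces[src][i] for src in range(n)])
              for i in range(n)]

    # Redistribute and join: every node receives all merged pieces
    # and concatenates them in order.
    complete = [bit for piece in merged for bit in piece]
    return [list(complete) for _ in range(n)]


# Two nodes, each with a 4-bit local filter.
result = partitioned_merge([[1, 0, 0, 0], [0, 0, 1, 0]])
print(result)  # [[1, 0, 1, 0], [1, 0, 1, 0]]
```

In this arrangement each node ORs only 1/N of the bits, so the merge work is spread across nodes instead of being concentrated on a single coordinator.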
-
Patent number: 11023464
Abstract: A method, system and computer program product includes receiving and parsing an SQL query, identifying at least one common sub expression, or sub-query, or combination thereof, used multiple times within the SQL query, constructing for the at least one common sub expression, or sub-query, or combination thereof, a query execution plan that maintains as part of an initial result set a bit vector for a fact table, storing a result bit vector and an indicator that tracks a last valid tuple processed to produce the result bit vector when a TEMP operation is indicated in the query execution plan, reassessing a TEMP result in other portions of the query execution plan, priming a list of tuples using the TEMP result, and retrieving respective columns for further processing in the query execution plan.
Type: Grant
Filed: September 3, 2019
Date of Patent: June 1, 2021
Assignee: International Business Machines Corporation
Inventors: Ian Richard Finlay, Calisto Zuzarte, John Frederick Hornibrook
-
Publication number: 20210064617
Abstract: A method, system and computer program product includes receiving and parsing an SQL query, identifying at least one common sub expression, or sub-query, or combination thereof, used multiple times within the SQL query, constructing for the at least one common sub expression, or sub-query, or combination thereof, a query execution plan that maintains as part of an initial result set a bit vector for a fact table, storing a result bit vector and an indicator that tracks a last valid tuple processed to produce the result bit vector when a TEMP operation is indicated in the query execution plan, reassessing a TEMP result in other portions of the query execution plan, priming a list of tuples using the TEMP result, and retrieving respective columns for further processing in the query execution plan.
Type: Application
Filed: September 3, 2019
Publication date: March 4, 2021
Inventors: Ian Richard Finlay, Calisto Zuzarte, John Frederick Hornibrook
-
Publication number: 20200409948
Abstract: A computer implemented method for processing database queries includes receiving a query and a set of runtime metrics corresponding to the query, wherein the query includes a set of elements, generating a set of encoded elements corresponding to the set of elements, processing the set of encoded elements and the set of runtime metrics to identify one or more possible query classifications, determining a query execution plan according to the identified one or more possible query classifications, and executing the query according to the determined query execution plan. The computer implemented method may additionally include providing one or more runtime metrics corresponding to the executed query. A computer program product and a computer system corresponding to the method are also disclosed.
Type: Application
Filed: June 26, 2019
Publication date: December 31, 2020
Inventors: Vincent Corvinelli, Calisto Zuzarte, Vinith Suriyakumar, Joel Raymond Scarfone, Diana Koval
-
Patent number: 10831843
Abstract: Embodiments described herein provide a solution for optimizing a generating of query search results. A filtering search term (e.g., a search term that is used in a query to perform filtering aggregations of the query search results) is identified. A filtering bitmap that has a plurality of mapped locations corresponding to data values for the filtering search term is created. As a data value in the filtering search term is encountered during a scan of the query search results, the corresponding mapped location is updated. Each mapped location in the filtering bitmap is read to determine whether the value corresponding to the mapped location satisfies the filtering aggregation. The filtering aggregation can then be performed (e.g., prior to any grouping aggregation) by removing any of the query search results determined, based on the filtering bitmap, as having data values for which the filtering aggregation is not satisfied.
Type: Grant
Filed: November 1, 2017
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Wenbin Ma, Liping Zhang, Calisto Zuzarte
-
Patent number: 10719512
Abstract: A computer-implemented method for a partitioned bloom filter merge is provided. A non-limiting example of the computer-implemented method includes partitioning, by a processing device, a bloom filter into N equal size filter partitions. The method further includes distributing, by the processing device, each of the filter partitions to an associated node. The method further includes merging, by the processing device, the filter partitions in each of the associated nodes. The method further includes redistributing, by the processing device, the merged filter partitions to each of the N nodes. The method further includes joining, by the processing device, the merged filter partitions in each of the N nodes to assemble a complete merged bloom filter.
Type: Grant
Filed: October 23, 2017
Date of Patent: July 21, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Naresh K. Chainani, Kiran K. Chinta, Ian R. Finlay, David Kalmuk, Timothy R. Malkemus, Calisto Zuzarte
-
Publication number: 20190266159
Abstract: A computer-implemented method for a partitioned bloom filter merge is provided. A non-limiting example of the computer-implemented method includes partitioning, by a processing device, a bloom filter into N equal size filter partitions. The method further includes distributing, by the processing device, each of the filter partitions to an associated node. The method further includes merging, by the processing device, the filter partitions in each of the associated nodes. The method further includes redistributing, by the processing device, the merged filter partitions to each of the N nodes. The method further includes joining, by the processing device, the merged filter partitions in each of the N nodes to assemble a complete merged bloom filter.
Type: Application
Filed: May 9, 2019
Publication date: August 29, 2019
Inventors: Naresh K. Chainani, Kiran K. Chinta, Ian R. Finlay, David Kalmuk, Timothy R. Malkemus, Calisto Zuzarte
-
Publication number: 20190130040
Abstract: Embodiments described herein provide a solution for optimizing a generating of query search results. A filtering search term (e.g., a search term that is used in a query to perform filtering aggregations of the query search results) is identified. A filtering bitmap that has a plurality of mapped locations corresponding to data values for the filtering search term is created. As a data value in the filtering search term is encountered during a scan of the query search results, the corresponding mapped location is updated. Each mapped location in the filtering bitmap is read to determine whether the value corresponding to the mapped location satisfies the filtering aggregation. The filtering aggregation can then be performed (e.g., prior to any grouping aggregation) by removing any of the query search results determined, based on the filtering bitmap, as having data values for which the filtering aggregation is not satisfied.
Type: Application
Filed: November 1, 2017
Publication date: May 2, 2019
Inventors: Wenbin Ma, Liping Zhang, Calisto Zuzarte
-
Publication number: 20190121890
Abstract: A computer-implemented method for a partitioned bloom filter merge is provided. A non-limiting example of the computer-implemented method includes partitioning, by a processing device, a bloom filter into N equal size filter partitions. The method further includes distributing, by the processing device, each of the filter partitions to an associated node. The method further includes merging, by the processing device, the filter partitions in each of the associated nodes. The method further includes redistributing, by the processing device, the merged filter partitions to each of the N nodes. The method further includes joining, by the processing device, the merged filter partitions in each of the N nodes to assemble a complete merged bloom filter.
Type: Application
Filed: October 23, 2017
Publication date: April 25, 2019
Inventors: Naresh K. Chainani, Kiran K. Chinta, Ian R. Finlay, David Kalmuk, Timothy R. Malkemus, Calisto Zuzarte