Patents by Inventor Ilker Ender

Ilker Ender has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Workload generation for optimal stress testing of big data management systems

Patent number: 12124362

Abstract: A computer-implemented method, system and computer program product for optimally performing stress testing against big data management systems. A set of random test queries is generated and compiled to determine the data points of the features (e.g., table type being queried) of the set of random test queries. A distance (e.g., Mahalanobis distance) is then measured between the data points of the features and the mean of a distribution of data points corresponding to each same feature of an extracted feature set. Each random test query whose distance exceeds a threshold distance is then ranked. The ranked random test queries are then executed in order of rank. Those executed random test queries which resulted in an error (e.g., system failure) are added to a log, which is used to identify those queries to perform a stress test against the big data management system.
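The ranking step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the feature vectors, query identifiers, threshold value and the `execute` callback are all hypothetical, and the features are assumed to be uncorrelated, in which case the Mahalanobis distance reduces to the Euclidean distance between standardized values.

```python
import math
from statistics import mean, pstdev


def mahalanobis_diag(point, means, stdevs):
    """Mahalanobis distance under an assumed diagonal covariance matrix."""
    return math.sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(point, means, stdevs)))


def rank_outlier_queries(baseline, candidates, threshold):
    """Return (query_id, distance) pairs above the threshold, farthest first."""
    columns = list(zip(*baseline))
    means = [mean(c) for c in columns]
    # Guard against zero variance so the division above is always defined.
    stdevs = [pstdev(c) or 1.0 for c in columns]
    scored = [(qid, mahalanobis_diag(f, means, stdevs)) for qid, f in candidates.items()]
    outliers = [pair for pair in scored if pair[1] > threshold]
    return sorted(outliers, key=lambda pair: pair[1], reverse=True)


def run_and_log(ranked, execute):
    """Execute queries in rank order and log those that raise an error."""
    failures = []
    for qid, _ in ranked:
        try:
            execute(qid)
        except Exception as exc:
            failures.append((qid, repr(exc)))
    return failures


# Hypothetical feature vectors: (number of joins, number of predicates).
baseline = [(1, 2), (2, 3), (3, 4), (2, 3)]
candidates = {"q1": (2, 3), "q2": (9, 3), "q3": (2, 12)}
ranked = rank_outlier_queries(baseline, candidates, threshold=3.0)


def fake_execute(qid):
    # Stand-in for running a query against the system under test.
    if qid == "q2":
        raise RuntimeError("simulated system failure")


failure_log = run_and_log(ranked, fake_execute)
```

Here `q1` matches the baseline mean and is dropped, while `q3` and `q2` are kept and run in that order; the resulting `failure_log` stands in for the log the abstract uses to select stress-test queries.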

Type: Grant

Filed: June 21, 2023

Date of Patent: October 22, 2024

Assignee: International Business Machines Corporation

Inventors: Ilker Ender, Austin Clifford, Pedro Miguel Barbas, Mara Elisa de Paiva Fernandes Matias, Hemant Asandas Bhatia

Automatic vertical partitioning of fact tables in a distributed query engine

Patent number: 11841878

Abstract: In an approach for automatic vertical partitioning of fact tables in a distributed query engine, a processor analyzes a sample end-user workload of queries to extract filter predicates associated with each of multiple fact tables relating to a big data store. A processor, for each fact table, and for each column in the fact table to which a filter predicate is applied and where coarsification is required, generates a candidate partitioning expression incorporating an adjustment to a coarsification function based on a data distribution of values in the column. A processor scores the candidate partitioning expressions for each fact table based on cost data relating to the sample end-user workload and selects one or more candidate partitioning expressions to optimize partitioning of each fact table, with each partition's data being placed in a separate directory in a distributed file system.
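As a rough illustration of coarsification, the sketch below widens an integer bucket until a column yields no more than a target number of partitions, then maps a value to a partition directory. The abstract does not specify the coarsification function, so the power-of-two bucket widths, the `max_partitions` limit, the directory naming and the function names are all assumptions, and the workload-based scoring step is omitted.

```python
def candidate_bucket_width(values, max_partitions):
    """Smallest power-of-two width giving at most max_partitions buckets.

    Assumes non-negative integer column values (a simplification).
    """
    if max_partitions < 1:
        raise ValueError("max_partitions must be at least 1")
    if any(v < 0 for v in values):
        raise ValueError("this sketch only handles non-negative integers")
    width = 1
    # Adjust the coarsification to the observed data distribution.
    while len({v // width for v in values}) > max_partitions:
        width *= 2
    return width


def partition_directory(table, column, value, width):
    """Hypothetical directory layout: one directory per bucket."""
    return f"{table}/{column}_bucket={value // width}"


# Hypothetical column with 1000 distinct values, limited to 100 partitions.
order_ids = list(range(1000))
width = candidate_bucket_width(order_ids, max_partitions=100)
directory = partition_directory("sales", "order_id", 35, width)
```

With these inputs the width settles at 16, giving 63 buckets, and the value 35 lands in `sales/order_id_bucket=2`.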

Type: Grant

Filed: August 30, 2021

Date of Patent: December 12, 2023

Assignee: International Business Machines Corporation

Inventors: Austin Clifford, Hemant Asandas Bhatia, Ilker Ender, Mara Elisa de Paiva Fernandes Matias

Auto-scaling a query engine for enterprise-level big data workloads

Patent number: 11809424

Abstract: Aspects of the present invention disclose a method, computer program product, and system for auto-scaling a query engine. The method includes one or more processors monitoring query traffic at the query engine. The method further includes one or more processors classifying queries by a plurality of service classes based on a level of complexity of a query. The method further includes one or more processors comparing query traffic for each service class with a concurrency threshold of a maximum number of queries of the service class allowed to be concurrently processed. The method further includes one or more processors instructing auto-scaling of a cluster of worker nodes to change a number of worker nodes available in the cluster based on the comparison, over a defined period of time, of the query traffic relative to a defined upscaling threshold and a defined downscaling threshold.
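The comparison described in this abstract can be sketched as a small decision function. This is only an illustration under stated assumptions: the `ServiceClass` type, the ratio-based thresholds (0.9 and 0.3) and the rule that a threshold must be crossed for every sample in the monitoring window are hypothetical choices, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class ServiceClass:
    name: str
    concurrency_limit: int  # max queries of this class processed concurrently


def scaling_decision(samples, service_class, up_ratio=0.9, down_ratio=0.3):
    """Decide on scaling from concurrent-query counts sampled over a window.

    samples: observed concurrent query counts for one service class.
    Returns "scale_up", "scale_down" or "hold".
    """
    if not samples:
        return "hold"
    ratios = [n / service_class.concurrency_limit for n in samples]
    # Assumed rule: act only if the threshold holds for the entire window.
    if all(r >= up_ratio for r in ratios):
        return "scale_up"
    if all(r <= down_ratio for r in ratios):
        return "scale_down"
    return "hold"


# Hypothetical service class for complex queries.
complex_queries = ServiceClass(name="complex", concurrency_limit=10)
busy = scaling_decision([9, 10, 10], complex_queries)
quiet = scaling_decision([1, 2, 2], complex_queries)
mixed = scaling_decision([5, 9, 10], complex_queries)
```

Requiring the condition to hold across the whole window is one simple way to avoid reacting to short spikes; an average or a percentile over the window would be equally plausible readings of "over a defined period of time".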

Type: Grant

Filed: October 23, 2020

Date of Patent: November 7, 2023

Assignee: International Business Machines Corporation

Inventors: Austin Clifford, Ilker Ender, Mara Matias

Workload generation for optimal stress testing of big data management systems

Publication number: 20230333971

Abstract: A computer-implemented method, system and computer program product for optimally performing stress testing against big data management systems. A set of random test queries is generated and compiled to determine the data points of the features (e.g., table type being queried) of the set of random test queries. A distance (e.g., Mahalanobis distance) is then measured between the data points of the features and the mean of a distribution of data points corresponding to each same feature of an extracted feature set. Each random test query whose distance exceeds a threshold distance is then ranked. The ranked random test queries are then executed in order of rank. Those executed random test queries which resulted in an error (e.g., system failure) are added to a log, which is used to identify those queries to perform a stress test against the big data management system.

Type: Application

Filed: June 21, 2023

Publication date: October 19, 2023

Inventors: Ilker Ender, Austin Clifford, Pedro Miguel Barbas, Mara Elisa de Paiva Fernandes Matias, Hemant Asandas Bhatia

Workload generation for optimal stress testing of big data management systems

Patent number: 11741001

Abstract: A computer-implemented method, system and computer program product for optimally performing stress testing against big data management systems. A set of random test queries is generated and compiled to determine the data points of the features (e.g., table type being queried) of the set of random test queries. A distance (e.g., Mahalanobis distance) is then measured between the data points of the features and the mean of a distribution of data points corresponding to each same feature of an extracted feature set. Each random test query whose distance exceeds a threshold distance is then ranked. The ranked random test queries are then executed in order of rank. Those executed random test queries which resulted in an error (e.g., system failure) are added to a log, which is used to identify those queries to perform a stress test against the big data management system.

Type: Grant

Filed: October 1, 2021

Date of Patent: August 29, 2023

Assignee: International Business Machines Corporation

Inventors: Ilker Ender, Austin Clifford, Pedro Miguel Barbas, Mara Elisa de Paiva Fernandes Matias, Hemant Asandas Bhatia

Workload generation for optimal stress testing of big data management systems

Publication number: 20230103856

Abstract: A computer-implemented method, system and computer program product for optimally performing stress testing against big data management systems. A set of random test queries is generated and compiled to determine the data points of the features (e.g., table type being queried) of the set of random test queries. A distance (e.g., Mahalanobis distance) is then measured between the data points of the features and the mean of a distribution of data points corresponding to each same feature of an extracted feature set. Each random test query whose distance exceeds a threshold distance is then ranked. The ranked random test queries are then executed in order of rank. Those executed random test queries which resulted in an error (e.g., system failure) are added to a log, which is used to identify those queries to perform a stress test against the big data management system.

Type: Application

Filed: October 1, 2021

Publication date: April 6, 2023

Inventors: Ilker Ender, Austin Clifford, Pedro Miguel Barbas, Mara Elisa de Paiva Fernandes Matias, Hemant Asandas Bhatia

Automatic vertical partitioning of fact tables in a distributed query engine

Publication number: 20230082010

Abstract: In an approach for automatic vertical partitioning of fact tables in a distributed query engine, a processor analyzes a sample end-user workload of queries to extract filter predicates associated with each of multiple fact tables relating to a big data store. A processor, for each fact table, and for each column in the fact table to which a filter predicate is applied and where coarsification is required, generates a candidate partitioning expression incorporating an adjustment to a coarsification function based on a data distribution of values in the column. A processor scores the candidate partitioning expressions for each fact table based on cost data relating to the sample end-user workload and selects one or more candidate partitioning expressions to optimize partitioning of each fact table, with each partition's data being placed in a separate directory in a distributed file system.

Type: Application

Filed: August 30, 2021

Publication date: March 16, 2023

Inventors: Austin Clifford, Hemant Asandas Bhatia, Ilker Ender, Mara Elisa de Paiva Fernandes Matias

Temporary data storage in data node of distributed file system

Patent number: 11416180

Abstract: Proposed are concepts for providing resilience (i.e., fault tolerance) for the temporary data needs of a distributed file system. Such concepts may, for instance, provide a virtual storage layer in a data node of a distributed file system. The virtual storage layer may provide resilience for the temporary data needs of a Massively Parallel Processing (MPP) SQL on Hadoop engine.

Type: Grant

Filed: November 5, 2020

Date of Patent: August 16, 2022

Assignee: International Business Machines Corporation

Inventors: Austin Clifford, Mara Matias, Ilker Ender

Temporary data storage in data node of distributed file system

Publication number: 20220137884

Abstract: Proposed are concepts for providing resilience (i.e., fault tolerance) for the temporary data needs of a distributed file system. Such concepts may, for instance, provide a virtual storage layer in a data node of a distributed file system. The virtual storage layer may provide resilience for the temporary data needs of a Massively Parallel Processing (MPP) SQL on Hadoop engine.

Type: Application

Filed: November 5, 2020

Publication date: May 5, 2022

Inventors: Austin Clifford, Mara Matias, Ilker Ender

Auto-scaling a query engine for enterprise-level big data workloads

Publication number: 20220129460

Abstract: Aspects of the present invention disclose a method, computer program product, and system for auto-scaling a query engine. The method includes one or more processors monitoring query traffic at the query engine. The method further includes one or more processors classifying queries by a plurality of service classes based on a level of complexity of a query. The method further includes one or more processors comparing query traffic for each service class with a concurrency threshold of a maximum number of queries of the service class allowed to be concurrently processed. The method further includes one or more processors instructing auto-scaling of a cluster of worker nodes to change a number of worker nodes available in the cluster based on the comparison, over a defined period of time, of the query traffic relative to a defined upscaling threshold and a defined downscaling threshold.

Type: Application

Filed: October 23, 2020

Publication date: April 28, 2022

Inventors: Austin Clifford, Ilker Ender, Mara Matias