FRAMEWORK FOR WORKLOAD PREDICTION AND PHYSICAL DATABASE DESIGN

According to some embodiments, methods and systems may be associated with a cloud computing environment. A workload prediction framework may receive observed workload information associated with a database in the cloud computing environment (e.g., a DataBase as a Service (“DBaaS”)). Based on the observed workload information, a Statement Arrival Rate (“SAR”) prediction may be generated. In addition, a host variable assignment prediction may be generated based on the observed workload information. The workload prediction framework may then use the SAR prediction and the host variable assignment prediction to automatically create a workload prediction for the database. A physical database design advisor (e.g., a table partitioning advisor) may receive the workload prediction and, responsive to the workload prediction, automatically generate a recommended physical layout for the database (e.g., using a cost model, the current physical layout, and an objective function).

Skip to: Description  ·  Claims  · Patent History  ·  Patent History
Description
BACKGROUND

An enterprise may utilize a cloud computing environment to let users perform tasks. For example, the enterprise might let various users execute an application via the cloud computing environment to process purchase orders, adjust human resources information, generate invoices, etc. The cloud computing environment may be associated with one or more databases, and the physical layout of such databases can substantially impact performance and memory consumption. As a result, an inappropriate physical layout can significantly degrade performance or increase the memory consumption and thereby increase the hardware costs of an organization operating the database (the increase of hardware costs may be a consequence of bad performance or increased memory consumption). Because the amount of provisioned Dynamic Random Access Memory (“DRAM”) dominates hardware costs, DataBase-as-a-Service (“DBaaS”) providers in particular must carefully consider the economical tradeoffs between memory consumption and performance. Finding an optimal physical layout, e.g., with respect to memory consumption or performance, is a complex and time-consuming process usually performed by database experts. As a result, academia and industry developed tools for automated physical database design. Typically, such automated approaches focus on static workloads. Note, however, that workloads will change over time. Whenever workload changes are not addressed timely the current physical layout may no longer be optimal, resulting in a significant degradation in performance or an increase in memory consumption. The question of how to find appropriate physical layouts for workloads that will change over time is largely open despite high practical relevance.

It would therefore be desirable to provide accurate and efficient workload predictions which can serve as an input to a physical database design advisor. As a consequence, the physical layout may be timely adapted to workload changes and therefore be optimized for the future workload, such that future high performance and/or low memory consumption is achieved.

SUMMARY

According to some embodiments, methods and systems may be associated with a cloud computing environment. A workload prediction framework may receive observed workload information associated with a database in the cloud computing environment (e.g., a DBaaS). Based on the observed workload information, a Statement Arrival Rate (“SAR”) prediction may be generated. In addition, a host variable assignment prediction may be generated based on the observed workload information. The workload prediction framework may then use the SAR prediction and the host variable assignment prediction to automatically create a workload prediction for the database. A physical database design advisor (e.g., a table partitioning advisor) may receive the workload prediction and, responsive to the workload prediction, automatically generate a recommended physical layout for the database (e.g., using a cost model, the current physical layout, and an objective function). Note that the current physical layout may be considered in this decision as switching from one to another layout can also degrade performance or increase memory consumption.

Some embodiments comprise means for receiving, by a computer processor of a workload prediction framework, observed workload information associated with a database in the cloud computing environment; based on the observed workload information, means for generating a SAR prediction; based on the observed workload information, means for generating a host variable assignment prediction; means for automatically creating a workload prediction for the database using the SAR prediction and the host variable assignment prediction; and responsive to the workload prediction, means for automatically generating a recommended physical layout for the database by a physical database design advisor.

Some technical advantages of some embodiments disclosed herein are improved systems and methods to provide accurate and efficient workload predictions which can serve as an input to a physical database design advisor.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A through 1B illustrate table partitioning layouts.

FIG. 2 is a high-level block diagram associated with a cloud-based computing system in accordance with some embodiments.

FIG. 3 is a more detailed diagram of a system according to some embodiments.

FIG. 4 is a custom adapter creation method in accordance with some embodiments.

FIG. 5 illustrates a system according to some embodiments.

FIG. 6 is a more detailed system in accordance with some embodiments.

FIG. 7 is an example of workload drift according to some embodiments.

FIG. 8 illustrates a static workload in accordance with some embodiments.

FIG. 9 is a SAR method in accordance with some embodiments.

FIGS. 10A and 10B illustrate the relationship between SAR and a discrete Fourier transform of an observed SAR according to some embodiments.

FIGS. 11A through 11F show examples of detection and classification of SAR for various SQL statements in accordance with some embodiments.

FIG. 12 is a Directed Acyclic Graph (“DAG”) classifier to determine workload drift type according to some embodiments.

FIG. 13 is a host variable assignment method according to some embodiments.

FIGS. 14A and 14B illustrates the classification of assignments in accordance with some embodiments.

FIG. 15 is a probability mass function according to some embodiments.

FIGS. 16A through 16C are examples for the prediction of assignments in accordance with some embodiments.

FIG. 17 shows the impact of η on the choice of a future database layout according to some embodiments.

FIG. 18 is a workload prediction and physical database design display according to some embodiments.

FIG. 19 is an apparatus or platform according to some embodiments.

FIG. 20 is a tabular portion of a workload prediction data store in accordance with some embodiments.

FIG. 21 illustrates a tablet computer according to some embodiments.

DETAILED DESCRIPTION

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of embodiments. However, it will be understood by those of ordinary skill in the art that the embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to obscure the embodiments.

One or more specific embodiments of the present invention will be described below. In an effort to provide a concise description of these embodiments, all features of an actual implementation may not be described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.

Real-world applications are characterized by workloads where the arrival rate and parameterization of Structured Query Language (“SQL”) statements can change over time. Without continuous adjustments of the physical layout to reflect such workload drifts, memory consumption and performance can deteriorate significantly. Existing approaches for automated physical database design, particularly table partitioning advisors, fail to address workload drifts because the physical layouts they propose are based on the observed workload. Some embodiments described herein provide a framework for the continuous adaptation of physical layout under workload drifts. Some embodiments predict the future workload based on detected workload drifts in the arrival rate and in the parameterization of SQL statements. The predicted workload is then fed into a physical database design advisor. Using a real-world application, accurate predictions of the future workload and a substantial hardware cost reduction on a commercial cloud database can be achieved compared to existing approaches.

Consider a stream of parameterized SQL statements (SELECT, INSERT, UPDATE, DELETE, . . . ) as the workload, where each statement contains host variables that are assigned by parameter values at execution time. A workload drift is then characterized by a temporal change in the arrival rate of statements or in the parameter values assigned to the host variables. Whenever workload drifts are not addressed in a timely fashion, the current physical layout may no longer be optimal, which can lead to a significant increase in workload costs.

A straightforward method to deal with workload drifts is to repeatedly feed the observed workload into a physical database design advisor, e.g., at fixed intervals. Such an approach is inherently backward-looking as the suggested physical layout lags behind workload drifts and may already be suboptimal when data reorganization (e.g., table repartitioning) starts. By contrast, some embodiments described herein provide a forward-looking approach that determines the new physical layout using a prediction of the future workload. Typical workload predictors are only able to forecast a future arrival rate of statements. However, physical database design advisors, in particular table partitioning advisors, often rely on fine-granular workload statistics, e.g., the parameterization of SQL statements. As a result, some embodiments described herein predict the future parameter values assigned to the host variables (in addition to the arrival rate of statements).

As a real-world scenario for workload drift, consider historical test results that many software development projects store in a database to identify bugs or authorize patches. For instance, the SAP® HANA development project retains statistics on more than 30 billion test runs. The corresponding test result database groups related test cases (e.g., TPC-H benchmark queries) into test profiles. Therefore, the statement “SELECT status FROM test_cases WHERE id_test_profile=:1” returns the status (e.g., running, successfully executed, or failed) of all test cases with the given test profile ID.

FIGS. 1A through 1B illustrate table partitioning layouts. In particular, FIG. 1A is a backward-looking approach 110 where a currently observed workload 111 () is used create partitions P1 and P2 in layout Lobs 113 (regardless of what the future workload 112 () looks like). Note that most of the statement instantiations are above 72,154,000 in the future workload and only few statement instantiations are below 72,154,000. As a consequence, the four pages of P2 are hot because only those pages are frequently accessed (and need to stay in DRAM) while all pages of P1 are cold because those pages are only rarely accessed (and can be evicted to cheaper secondary storage). P2 contains four hot pages (which need to stay in DRAM) to handle statement instantiations above 72,154,000 while P1 has cold pages to handle statement instantiations below 72,154,000. Note that each partition 113 may be associated with multiple columns (e.g., one for status and one for id_test_profile as illustrated by dotted lines in FIG. 1A) and the approach 110 is associated with the observed workload during a six-hour time window (from 08:00 to 14:00) on a regular workday. Each dot represents a statement execution at a specific time (x-axis) with a specific parameter value (y-axis) assigned to host variable:1. Observe that the workload drifts over time as the later the statement is instantiated, the larger the parameter value becomes.

Using the backward-looking approach 110, a table partitioning advisor proposes layout Lobs 113 that groups frequently accessed records into hot partition P2 and all other records into cold partition P1. In contrast, FIG. 1B is a forward-looking approach 120 where a predicted workload 121 () is used to create partitions P1, P2, and P3 in layout Lobs 123 (closer to what the future workload 122 () looks like as compared to FIG. 1A). P1 contains two hot pages (which need to stay in DRAM) to handle statement instantiations above 72,164,000 while P2 and P3 have cold pages to handle statement instantiations below 72,164,000. The proposed layout Lpred 123 groups frequently accessed records into a hot partition P3, while rarely and never accessed records are separated into cold partitions P2 and P1. In order to assess how both approaches 110, 120 impact memory consumption, one may execute the actual future workload 122 () in SAP® HANA's column store using both layouts 113, 123. The data is stored on pages, not clustered by id_test_profile and no index exists on id_test_profile. FIG. 1B shows whether the pages of the status and id_test_profile columns in Lobs 113 and Lpred 123 were classified as hot or cold (based on the five-minute rule). Overall, layout Lpred 123, as proposed by the forward-looking approach 120, reduces the number of hot pages held in DRAM by a factor of two as compared to Lobs 113 as suggested by the backward-looking approach 110. The main reason is that the small, hot partition P3 of Lpred allows for a more aggressive partition pruning during the evaluation of the selection predicate than the large partition P2 of Lobs 113 since most statement instantiations are above 72,164,000 in the actual future workload. Therefore, only partition P3 in Lpred and P2 in Lobs needs to be accessed for most of the statement instantiations. As P3 in Lpred contains only two pages while P2 in Lobs contains four pages the memory consumption with Lpred can be reduced by a factor of two as compared to Lobs.

Thus, the goal of some embodiments may be to find a predicted workload from an observed workload , such that the predicted workload approximates the future workload . There may be several challenges, however, when trying to predict the future workload based on the observed workload . In real-world applications, various workload drift types (e.g., linear, exponential, reoccurring, static, and irregular) have been identified. Moreover, these drift types can also overlap. For an individual statement, the arrival rate as well as the assignments of parameter values to host variables can be impacted by different drift types. Therefore, a workload predictor must be able to handle multiple drift types (and at least some combinations thereof) at the arrival rate and assignment level.

Moreover, many real-world applications are characterized by a continuously drifting workload, thus necessitating regular readjustments of the physical layout. However, the data reorganization (e.g., table repartitioning) incurred by changing the physical layout can be prohibitively costly. Therefore, a change in the physical layout might only be advised when the expected benefits of workload cost reductions outweigh the cost of data reorganization.

To address these challenges, some embodiments described herein provide a system 200 that consists of two phases as illustrated in FIG. 2. In the first phase, a workload prediction framework 210 predicts the future workload. In the second phase, the predicted workload is provided to a physical database design advisor 250 along with information about the current physical layout which can then automatically create a recommended future physical layout for a database. Note that the current physical layout may be used as an input to the physical database design advisor (since a change in the physical layout might only be advised when the expected benefits of workload cost reductions outweigh the cost of data reorganization). The term “automatically” may refer to a process that requires little or no human intervention. As illustrated by the system 300 of FIG. 3, a workload prediction framework 310 may predict the future Statement Arrival Rate (“SAR”) 320 and the future assignments of parameter values to host variables 330 to create the predicted workload that is given to a physical database design advisor 350 along with information about the current physical layout. Some embodiments may incorporate table repartitioning costs in the physical design advice phase. Furthermore, the system 300 may allow for the exchanging or extending all components to support workload drifts in novel environments. In addition, both phases may be periodically repeated to ensure a continuous adaption of the physical layout.

As used herein, devices, including those associated with the system 300 and any other device described herein, may exchange information via any communication network which may be one or more of a Local Area Network (“LAN”), a Metropolitan Area Network (“MAN”), a Wide Area Network (“WAN”), a proprietary network, a Public Switched Telephone Network (“PSTN”), a Wireless Application Protocol (“WAP”) network, a Bluetooth network, a wireless LAN network, and/or an Internet Protocol (“IP”) network such as the Internet, an intranet, or an extranet. Note that any devices described herein may communicate via one or more such communication networks.

The workload prediction framework 310 and physical database design advisor 350 may store information into and/or retrieve information from various data stores (e.g., various predictions or recommendation), which may be locally stored or reside remote from the workload prediction framework 310 and/or physical database design advisor 350. Although a single workload prediction framework 310 and physical database design advisor 350 are shown in FIG. 3, any number of such devices may be included. Moreover, various devices described herein might be combined according to embodiments of the present invention. For example, in some embodiments, the workload prediction framework 310 and physical database design advisor 350 might comprise a single apparatus. The system 300 functions may be performed by a constellation of networked apparatuses, such as in a distributed processing or cloud-based architecture.

An administrator may access the system 300 via a remote device (e.g., a Personal Computer (“PC”), tablet, or smartphone) to view information about and/or manage operational information in accordance with any of the embodiments described herein. In some cases, an interactive Graphical User Interface (“GUI”) display may let an operator or administrator define and/or adjust certain parameters via the remote device (e.g., to adjust prediction rules or logic, alter (pun and/or (pp, map elements to data sources, etc.) and/or provide or receive automatically generated recommendations, results, or alerts associated with the system 300.

FIG. 4 is a method 400 that might be performed by some or all of the elements of the systems 200, 300 described with respect to FIGS. 2 and 3. The flow charts described herein do not imply a fixed order to the steps, and embodiments of the present invention may be practiced in any order that is practicable. Note that any of the methods described herein may be performed by hardware, software, or any combination of these approaches. For example, a computer-readable storage medium may store thereon instructions that when executed by a machine result in performance according to any of the embodiments described herein.

At S410, a computer processor of a workload prediction framework may receive observed workload information associated with a database (e.g., a DBaaS), such as one implemented in a cloud computing environment. Based on the observed workload information, the system may generate a SAR prediction at S420. Based on the observed workload information, the system may also generate a host variable assignment prediction at S430. At S440, the system can then automatically create a workload prediction for the database using the SAR prediction and the host variable assignment prediction. Responsive to the workload prediction and a current physical layout, at S450 a physical database design advisor automatically generates a recommended physical layout for the database.

To discuss the problem of predicting a future workload based on an observed workload, notation as set forth in Table I will be utilized.

TABLE I Notation for Workload Prediction  = {q1, . . . , qi, . . . , qn} Set of n parameterized SQL statements hi = [hi1, . . . , hij, . . . , himi] Vector of mi host variables of qi ∈  vi = [vi1, . . . , vij, . . . , vimi] Vector of mi parameter values (t, qi, vi) Instantiation of statement qi with vi at timestamp t Workload (set of p statement instantiations)  (   , hij) All assignments to host variable hij in workload   ,   ,  Observed, future, and predicted workload os, oe Start and end timestamps of  fs, fe Start and end timestamps of  and 

={q1, . . . , qi, . . . , qn} may be defined as a set of n∈ parameterized SQL statements (e.g., SELECT, INSERT, UPDATE, and DELETE statements). Each statement qi∈ contains a vector hi=[hi1, . . . , hij, . . . , himi] of mi∈ host variables. The triple (t, qi, vi) may be defined as an instantiation of qi∈ with vector vi=[vi1, . . . , vij, . . . , vimi] of mi parameter values at timestamp t∈. A statement instantiation (t, qi, vi) contains mi assignments, such that parameter value vij∈dom(hij) is assigned to host variable hij at timestamp t, where dom(hij) denotes the domain of hij. Workload ={(t1, qi1, vi1), . . . , (tp, qip, vip)} may be defined as a set of p∈ statement instantiations. Further, (, hij) may be defined as the set of all assignments to host variable hij in workload by certain parameter values at certain timestamps (note that vij is part of vi):


(,hij):={(t,vij)|(t,qi,vi)∈,vij is assigned to hij}.

Let os, oe, fs, fe∈ be four timestamps, such that os<oe≤fs<fe. Define an observed workload as , such that ∀(t, qi, vi)∈: os≤t<oe, and a future workload as , such that ∀(t, qi, vi)∈: fs≤t<fe. Some embodiments described herein find a predicted workload from an observed workload , where ∀(t, qi, vi)∈:fs≤t<fe, such that the predicted workload approximates the future workload .

FIG. 5 is an overview of a system 500 according to some embodiments. It consists of a workload prediction framework 510 and a physical database design advisor 550. The workload prediction framework 510 predicts the future workload based on the observed workload on a per-statement basis. As it may be computationally infeasible to consider all statements in the observed workload for the prediction, some embodiments only utilize statements in the SQL plan cache because they typically represent more than 99% of the workload cost.

For each statement qi∈, the future workload is predicted in two independent stages. In stage I, SAR detection 521, SAR classification 522, and SAR prediction 523 are used to predict the statement arrival rate. In stage II, host variable assignment detection 531, host variable assignment classification 532, and host variable assignment prediction 533 are used to predict the future assignments of parameter values to host variables. The predicted workload can then be obtained by combining the results of both stages. The advisor 550 uses this information along with information about a current physical layout to propose the future physical layout Lfut with the smallest combined workload and data reorganization costs based on the predicted workload , and the current physical layout Lcur. According to some embodiments, the workload prediction framework 510 and advisor 550 periodically repeat these steps to adopt the physical layout in small and cheap adjustments.

FIG. 6 is a more detailed system 600 in accordance with some embodiments.

As before consists of a workload prediction framework 610 and a physical database design advisor 650. The workload prediction framework 610 predicts the future workload associated with SQL statements based on the observed workload . For each statement qi∈, the future workload is predictedPI i on a SAR prediction and a host variable assignment prediction. The SAR prediction utilizes SAR detection 621, SAR classification 622 (e.g., via classifier α), and SAR prediction 623 (e.g., via predictors α1 through αz). The future assignments of parameter values to host variables utilizes host variable assignment detection 631, host variable assignment classification 632 (e.g., via classifier β), and host variable assignment prediction 633 (e.g., via predictors β1 through βz). The predicted workload can then be obtained by combining the results of the SAR and host variable assignment predictions. The advisor 650 uses along with information about the current physical layout Lcur to propose the future physical layout Lfut with the smallest combined workload and data reorganization costs. According to some embodiments, the workload prediction framework 610 and advisor 650 periodically repeat these steps to adopt the physical layout.

Note that workloads can vary over time in various different ways. For example, a real-world application workload might drift in a linear, exponential, reoccurring, static, or irregular fashion.

A linear or exponential workload drift is characterized by a linearly or exponentially increasing (or decreasing) arrival rate of statements, respectively parameter values assigned to host variables. For example, the parameter values of the domain of id_test_profile might grow linearly over time (e.g., as shown by the observed workload 111 in FIG. 1A). An exponential drift, for instance, might occur close to the release of a new version of a system (or the addition of a new feature to a system) when certain statements are instantiated exponentially more frequently than during regular development phases.

Since databases often interact with humans, workloads may follow reoccurring patterns. For example, to test a specific build, a test environment, called “install session,” might be created on a dedicated test server using the SQL statement “INSERT INTO install_sessions (id_make, id_server) VALUES (:1, :2)”. FIG. 7 is an example 700 of workload drift instantiations of this statement between December 11 and 31, 2020 at a certain time (x-axis on graph 710) using a specific build ID (y-axis). Two reoccurring drift patterns can be found. First, there is a weekly pattern, where a steep increase of id_make on weekdays alternates with much slighter increases on weekends. Second, there is an annual pattern, where id_make increases more moderately around Christmas as compared to the rest of the month. Since id_make follows two reoccurring patterns and grows linearly, FIG. 7 illustrates an example 700 of two overlapping drift types.

A static workload is a workload where no temporal drift can be observed. For example, the dimension table test_case_info represents existing tests (e.g., their SQL string), and statement “SELECT id FROM test_cases WHERE id_test_case_info=:1” returns all test case executions with the given test_case ID from the fact table test_cases. FIG. 8 illustrates 800 a static workload with a graph 810 that shows all instantiations of this statement between 08:00 and 20:00 on a regular workday. No change in the parameter values can be observed during the considered time frame. This might be expected, for example, when the same tests are repeatedly executed over an extended period of time to identify bugs or authorize patches.

An irregular workload drift is characterized by an abrupt and unexpected change of the statement arrival rate or the parameter values. For example, occasional stress test campaigns, in addition to regular testing, can lead to an irregular drift of the statement arrival rate. Unlike other drift types, irregular workload drifts can neither be modeled nor predicted. Nonetheless, some embodiments described herein may handle irregular workload drifts.

FIG. 9 is a SAR prediction method 900 in accordance with some embodiments. S910 starts by discretizing the observed workload to reduce noise and other short-term fluctuations. For this purpose, a set of equidistant timestamps may be defined using a discretization interval δ∈ between two successive, discrete timestamps. By way of example, δ may be set to five minutes.

Δ(os, oe) may be defined as a set of equidistant timestamps between the start and end timestamps os, oe


Δ(os,oe){os+λ·δ|λ∈,os≤λ·δoe}.

To calculate the observed statement arrival frequency, all statement instantiations in the observed workload that fall within interval [t, t+δ) may be aggregated for a timestamp t ∈Δ(os, oe). The observed statement arrival frequency F(t, , qi) at timestamp t∈Δ(os, oe) for statement qi∈ in the observed workload may be defined as:


F(t,,qi):=|{(t′,q′,v′)∈|qi′=qi∧t≤t′<t+δ}|.

Finally, the observed statement arrival rate may be defined as a series of consecutive statement arrival frequencies between the start and end timestamps os, oe∈N in the observed workload . The observed statement arrival rate (observed SAR) for statement qi∈ in the observed workload may be defined as:


SAR(,qj):={(t,F(t,,qi))|t∈Δ(os,oe)}.

FIGS. 10A and 10B illustrate the relationship between SAR and a discrete Fourier transform of an observed SAR according to some embodiments. In particular, FIG. 10A shows 1010 the observed SAR (, qi) for equidistant timestamps between 08:00 and 14:00 with a discretization interval δ of five minutes, calculated from the observed workload 111 of FIG. 1A. Note there is a wavelike pattern with peaks at 08:30, 11:00, and 13:00.

Referring again to FIG. 9, at S920 the system may detect which workload drift types are present in the observed SAR. According to some embodiments, a condition-based detector may be used in connection with the observed SAR for each drift type. To detect a linear workload drift, embodiments may use the Pearson correlation between the series of discrete timestamps and the series of statement arrival frequencies (both series being abbreviated as SAR(, qi)). The condition is satisfied if the Pearson correlation coefficient ρ is greater or equal than a threshold φlin∈[0, 1]:


SAR(,qi) is linear ⇔|ρ(SAR(,qi))|≥φexp.

To detect exponential workload drifts, embodiments may take advantage of the fact that a function grows exponentially if its logarithm grows linearly. Therefore, the condition is satisfied if the Pearson correlation coefficient ρ between the series of discrete timestamps and the series of the logarithm of each statement arrival frequency is greater or equal than a threshold φexp∈[0, 1] (note that the logarithm is only applied on the series of statement arrival frequencies F(t, , qi) in SAR(, qi) and not on the series of discrete timestamps):


SAR(,qi) is exponential⇔|ρ(log(SAR(,qi)))|≥φexp.

To detect a reoccurring workload drift, embodiments may first compute the Discrete Fourier Transform (“DFT”), i.e., convert the observed SAR from the time domain into the frequency domain. Afterwards, the system may evaluate whether a sinusoid with an amplitude greater or equal than a threshold θ(, qi) is present in the DFT. To specify the threshold θ(, qi), embodiments may first determine the smallest and largest observed statement arrival frequencies lbi and ubi of statement qi in the observed workload. Given a fixed φcyc∈[0,1], the threshold θ(, qi) is obtained as φcyc times the largest possible amplitude range but at least φcyc times the smallest statement arrival frequency.


ubi:=maxt∈Δ(os,oe)F(t,,qi)


lbi:=mint∈Δ(os,oe)F(t,,qi)


θ(,qi):=φcyc·max(ubi−lbi,lbi)


SAR(,qi) is reoccurring⇔∃∈DFT(SAR(,qi)): ||≤θ(,qi).

FIG. 10B shows 1020 the Discrete Fourier Transform DFT(SAR (, qi)) as a graph mapping the amplitude to the period of sinusoid (in minutes). Each dot represents a sinusoid with a certain period (x-axis) and amplitude (y-axis). Also depicted is a threshold θ(, qi) calculated using the smallest (1498) and largest (11491) statement arrival frequencies and φcyc=0.1. As can be seen, the amplitudes of two sinusoids with periods of 120 and 180 minutes exceed the threshold, which constitutes a reoccurring drift.

In order to identify static workloads, embodiments may examine whether the observed SAR is stable, i.e., it fluctuates only within narrow limits around the mean statement arrival frequency μi. Given a fixed φstatic ∈[0,1], the condition is satisfied if Root Mean Square Error (“RMSE”) between the observed statement arrival frequencies and u, is less than or equal to φstatic times φi:

μ i := t Δ ( o s , o e ) F ( t , 𝒪 , q i ) "\[LeftBracketingBar]" Δ ( o s , o e ) "\[RightBracketingBar]" RMSE ( SAR ( 𝒪 , q i ) ) := ( t , F ( t , 𝒪 , q i ) ) SAR ( 𝒪 , q i ) ( F ( t , 𝒪 , q i ) - μ i ) 2 "\[LeftBracketingBar]" Δ ( o s , o e ) "\[RightBracketingBar]" SAR ( 𝒪 , q i ) is static RMSE ( SAR ( 𝒪 , q i ) ) φ s t a t i c · μ i .

Since irregular workload drifts are, by definition, abrupt and unexpected, a traditional condition might not be useful to handle these types of drifts. According to some embodiments, a workload is classified as irregular if none of the above conditions for linear, exponential, reoccurring, or static drifts are met. Note that embodiments described herein are extensible such that other conditions could be plugged in to detect an irregular drift.

To demonstrate how workload drifts can be detected using the above conditions, FIGS. 11A through 11F show six SQL statements (qi through q6) the observed SAR and whether the conditions are satisfied (✓) or not (X). Due to space constraints, only the sinusoid with the largest amplitude in the condition for a reoccurring drift is presented in these figures. The following parameters were used: φlin=0.8, φexp=0.8, φcyc=0.1, and φstatic=0.1. Note that in some cases multiple drift types are detected, e.g., in FIG. 11B an exponential and a reoccurring drift is detected.

Referring again to FIG. 9, at S930 conflicts between multiple detected workload drift types may be resolved such that the observed SAR is classified as one drift type or a combination of different drift types that are compatible with each other. FIG. 12 is a Directed Acyclic Graph (“DAG”) classifier 1200 that uses decisions 1210 determines workload drift type according to some embodiments.

If exactly one drift type is detected, the classifier 1200 decides that the observed SAR is the detected drift type. If two or more drift types are detected, the classifier distinguishes compatible and incompatible drift combinations. According to some embodiments, the combination of linear and reoccurring is considered compatible (e.g., FIG. 7 and q4 in FIG. 11D). In contrast, since an exponential drift asymptotically subsumes a linear drift, both together are an incompatible combination. In this case, the classifier 1200 prefers the drift type with the stronger Pearson correlation coefficient ρ. Further, an exponential drift is preferred over a reoccurring drift because the exponential drift may appear as a sinusoid in the DFT (e.g., q2 in FIG. 11B).

Table II shows classification outcomes and the predicted statement arrival frequencies at future discrete timestamps t∈Δ(fs, fe) for the six SQL statements shown in FIGS. 11A through 11F. Because of a 6-hour observation window and a choice of 5 minutes as discretization interval δ, the timestamps of range between 0 and 71, while timestamps of begin at 72. If none of the conditions are satisfied, the classifier 1200 decides that the observed SAR is irregular.

TABLE II Classification Outcomes Classification Predicted Statement Arrival Frequency q1 Linear F(t,   , q1) = 13t + 7071 q2 Exponential F(t,   , q2) = e0.09t+1.41 q3 Reoccurring F(t,   , q3) = 1216 cos(2πt) + 18 sin(2πt) − 1004 cos(2πt) − 1701 sin(2πt) + 4954 q4 Linear and F(t,   , q4) = 61t + 5801 + 185 cos(2πt) + Reoccurring 135 sin(2πt) − 239 cos(2πt) − 317 sin (2πt) q5 Static F(t,   , q5) = 10690 q6 Irregular F(t,   , q6) = F(t mod 72,   , q6)

Referring again to FIG. 9, at S940 the system may predict the future SAR using the observed SAR and the outcome of the prior classification. To do this, the system may calculate the predicted statement arrival frequency F(t,, qi) at future timestamps t∈Δ(fs, fe) for each statement qj∈. The predicted SAR (, qi) is then obtained as the series of consecutive statement arrival frequencies between fs and fe. According to some embodiments, the prediction is based on the classification of the observed SAR:

    • Linear: Use a linear approximation function obtained by ordinary least-squares regression on the observed SAR.
    • Exponential: First perform ordinary least-squares regression on the natural logarithm of the observed SAR. The future SAR is calculated by raising e to the power of the prediction produced by the linear approximation function.
    • Reoccurring: Convert all sinusoids with an amplitude greater or equal than threshold θ(, qi) into a sum of trigonometric functions in the time domain.
    • Static: Use the mean observed statement arrival frequency μi as a prediction of future statement arrival frequencies.
    • Irregular: We copy the observed SAR and paste it into the future, similar to a backward-looking approach.
    • Linear and reoccurring: Compute a sum of a linear function and trigonometric functions. The linear function is obtained by ordinary least-squares regression on the observed SAR. Then calculate a normalized observed SAR by subtracting the linear function from the observed SAR. Finally, obtain the trigonometric functions by transforming all sinusoids in the DFT of the normalized SAR with an amplitude greater or equal than threshold θ(, qi).
      Sinusoids with a small amplitude may be filtered out when handling a reoccurring drift because they may result from short-term fluctuations or noise and would therefore lead to overfitting. Since irregular drifts can neither be modeled nor predicted, the framework cannot know the future. Instead, the system may copy the observed SAR into the future (e.g., a backward-looking approach). Because the overhead of the approach is relatively low, the system can run the workload predictor and the advisor periodically (so that it responds timely to the outcome of an irregular drift).

FIG. 13 is a host variable assignment method 1300 according to some embodiments to predict the future assignments of parameter values to host variables. For pragmatic reasons, note that the system may predict the assignments to a host variable independently of other host variables. To illustrate the main idea of this method 1300, consider the graph 1410 FIG. 14A, which is an abstract representation of the observed workload in FIG. 1A. Each dot represents a single assignment (t, vij)∈(, hij) of parameter value vij (y-axis) to host variable hij at timestamp t (x-axis). For now, it is sufficient to understand that an assignment is classified as “uncertain” if it occurs close to the beginning of the observation period, as “regular” if vij is regularly assigned to hij(e.g., those assignments are time independent because they occur so often that no periodic/reoccurring pattern can be observed and, therefore, they belong to a static workload), as “fresh” if vij has not been assigned yet to hij within a recent time interval, and as “subsequent” otherwise. Furthermore, note that subsequent assignments frequently occur shortly after fresh assignments (and rarely further on). Finally, the regular assignments are distributed across the entire observed workload.

Referring again to FIG. 13, at S1310 the system may perform detection and classification. In order to classify an assignment (t, vij)∈(, hij), a threshold interval of length k·δ may be selected, with k∈, as a multiple of the discretization interval δ∈. According to some embodiments, the threshold interval k·δ is used to distinguish time-dependent and time-independent assignments. Given a threshold interval of length k·δ, an assignment (t, vij)∈(, hij) is classified as:

    • UNCERTAIN if os≤t<os+k·δ
    • REGULAR else if ∀{circumflex over (t)}∈[os, oe−k·δ)(t′, vij)∈(, hij): {circumflex over (t)}≤t′<{circumflex over (t)}+k·δ
    • FRESH else if (t′, vij)∈(, hij): t−k·δ≤t′<t
    • SUBSEQUENT else.

Assignments, where the same parameter value is assigned to the same host variable at least once every k·δ time units, are independent of the time and thus part of the static workload. Such assignments are classified as regular. For example, in FIG. 14A this is the case for parameter value vij′. In contrast, the absence of a repeated assignment of a parameter value every k·δ time units implies that the assignment is time-dependent and thus part of the workload drift. If the parameter value in such assignments is first assigned after k·δ time units, the assignment is classified as fresh, otherwise (second, third, etc.) as subsequent. For example, in FIG. 14A the assignment of vij′″ at timestamp t is fresh as vij′ was not assigned in the previous k·δ time units, whereas later assignments of vij′ are subsequent. Finally, assignments in the interval [os, os+k·δ) are classified as uncertain because the system cannot exclude the possibility that fresh assignments in this interval are actually subsequent assignments that follow fresh assignments before os.

At S1320, the system predicts the future fresh assignments and models the distribution of subsequent and regular assignments. For fresh assignments, the system may extrapolate the series of fresh assignments in the observed workload into the future and thereby predict the series of future fresh assignments. For example, FIG. 14B is a graph 1420 shows fresh assignments with a linear characteristic that can be extrapolated into future predictions 1422 (e.g., from fs=14:00 to fe=20:00).

The observed series of Fresh Assignments (observed “SFA”) of host variable hij in the observed workload may be defined as the set of all fresh assignments in (, hij):


SFA(,hij):=((t,vij)|(t,vij)∈(,hij),(t,vij) is fresh).

As before, the system may detect which drift types are present in the observed SFA and then use a classifier (such as the one in FIG. 12) to resolve conflicts between multiple detected drift types. For example, the observed SFA in FIG. 14A is classified as linear (as can be seen in FIG. 14B). The system can then predict the future series of fresh assignments SFA(, ) based on SFA(, hij) and the outcome of the prior classification.

The system may model the distribution of subsequent assignments of parameter values in relation to their corresponding fresh assignments. This model can then be used to predict future subsequent assignments in relation to their predicted fresh assignments. To do this, first formalize the time difference between a subsequent assignment and its corresponding fresh assignment as the temporal offset between the two. Let sub(, hij) be the set of subsequent assignments to host variable hij in the observed workload . The time difference τ(t, vij) between a subsequent assignment (t, vij)∈sub and its corresponding fresh assignment is defined as:


τ(t,vij)=min({t−t′|(t′,vij)∈(,hij),(t′,vij) is fresh, t′<t}).

Note that for the same parameter value, more than one fresh assignment can exist. This may happen, for example, with a reoccurring drift. To handle such ambiguities, some embodiments only consider the temporal offset to the most recent corresponding fresh assignment.

Another special case is when no fresh assignment exists for a subsequent assignment because the fresh assignment occurred at the beginning of the observed workload and was classified as uncertain. For example, in FIG. 14A this happens for vij″. In such cases, the system may estimate a fresh assignment by extrapolating the observed SFA into the past.

A probability mass function may be built that models the probability of time differences between a subsequent assignment and its corresponding fresh assignment. Some embodiments model the time difference in terms of time intervals I(λ)=[λ·δ, (λ+1)·δ), λ∈. Some embodiments use δ as the interval length only for the pragmatic reason that the observed workload has been discretized in steps of length δ. According to some embodiments, a probability mass function psub is defined as the probability that the time difference τ(t, vij) between a subsequent assignment of parameter value vij to host variable hij and its corresponding fresh assignment falls into interval I(λ):

p s u b ( h ij , λ ) = "\[LeftBracketingBar]" { ( t , v i j ) ( t , v i j ) ( , h i j ) , τ ( t , v i j ) I ( λ ) } "\[RightBracketingBar]" ( , h i j ) .

FIG. 15 is a probability mass function 1500 (with the y-axis shown in log scale) according to some embodiments. The function 1500 shows the probability mass function psub for the observed workload of FIG. 1A. Observe that the larger time differences between fresh and subsequent assignments are associated with smaller probabilities. Note that x-axis show time intervals I(λ) for λs between 0 and 72.

Finally, some embodiments may model the distribution of regular assignments in terms of a probability mass function. Using such a model, the system can then predict future regular assignments. Let be the set of regular assignments to host variable hij in the observed workload . Define a probability mass function preg as the probability of a regular assignment of parameter value vij to host variable hij:

p r e g ( h ij , v i j ) = "\[LeftBracketingBar]" { ( t , v i j ) ( t , v i j ) ( , h i j ) } "\[RightBracketingBar]" ( , h i j ) .

The prediction is based on the following two assumptions: first, regular assignments to a host variable in the observed workload will continue in the future workload; and second, regular assignments are uniformly distributed. To motivate both assumptions, consider FIG. 1A where parameter values of id_test_profile between 72,154,000 and 72,159,000 belong to background jobs which check the flakiness of tests. The goal of these jobs is to ensure that all tests are stable and produce deterministic results. Note that both psub and preg are just examples of such probability mass functions. Both can be exchanged with any other probability distribution.

Table II contains an algorithm that can be used to compute a predicted workload . The algorithm expects as input for each SQL statement qi∈ the predicted statement arrival rate SAR(, qi), and for each host variable hij in statement qi the predicted series of fresh assignments SFA(hij) and the probability mass functions psub(hij, λ) and preg(hij, vij).

TABLE III Workload Prediction Algorithm Input : For each statement qi ∈   : SAR(   , qi)  For each host variable hij in qi: SF A(P, hij), psub, preg 1  P := { } // initialize the predicted workload 2  for 1 ≤ i ≤ n do // iterate over indexes of SQL statements 3   for t ∈ Δ(fs, fe) do // iterate over future equidistant timestamps 4    F(t, P, qi) := SAR(P, qi) [t] // get statement arrival frequency 5    for 1 . . . F (t, P, qi) do// iterate over statement instantiations 6    draw a random timestamp trand ∈ [t, t + δ) // uniform 7    allocate vector vi with a capacity of mi parameter values 8     for 1 ≤ j ≤ mi do // iterate over indexes of host variables 9     draw assignment category fresh, subsequent, or regular according to     their distribution in   (  , hij) 10     switch assignment category do 11      case fresh do // generate fresh assignment 12      vij := SFA(P, hij) [trand] // get value from SFA at trand 13      case subsequent do // generate subsequent assignment 14      draw an interval [λ·δ, (λ+1)·δ) according to psub 15      draw a random time difference tdiff ∈ [λ·δ, (λ+1)·δ) 16      tfresh := trand−tdiff //calculate time of fresh assignment 17      vij := SFA(P, hij) [tfresh] // get value from SFA at tfresh 18      case regular do // generate regular assignment 19      vij := draw parameter value according to preg 20     insert parameter value vij into vector vi 21    P = P ∪ {(trand, qi, vi)} // add predicted statement instantiation to P

The algorithm first initializes as an empty set (line 1). Next, the algorithm iterates over all SQL statements qi∈ (line 2) and all future equidistant timestamps t∈Δ(fs, fe) (line 3). The system then derives the number of statement instantiations F(t, , qi) to be predicted in the current interval [t, t+δ) from the predicted SAR, where SAR(,qi)[t] de-notes the statement arrival frequency at timestamp t (line 4). For each statement instantiation to predict, the algorithm draws a random timestamp trand uniformly from [t, t+δ) (lines 5 and 6) and allocates a vector vi with a capacity of mi parameter values (line 7). Finally, the system iterates over all host variables hij to predict their assignment by parameter values vij (lines 8 to 20) and adds the predicted statement instantiation (trand, qi, vi) to the predicted workload (line 21).

Some embodiments assume that the distribution of fresh, subsequent, and regular assignments in the observed and future workloads are identical. Therefore, to predict the future parameter value vij of a host variable hij at timestamp trand, the system may first draw an assignment category fresh, subsequent, or regular with the same probabilities as they occur (, hij) (line 9). For example, the probability of generating a subsequent assignment is |sub (, hij)|/(|(, hij)|−unc|(, hij)|), where sub (resp. unc) denotes the set of subsequent (resp. uncertain) assignments to host variable hij in .

The prediction of parameter value vij of host variable hij at timestamp trand depends on which assignment category is drawn:

    • Fresh: generate a fresh assignment with parameter value vij set to the value of the predicted series of fresh assignments SFA(, hij) at timestamp trand (line 12).
    • Subsequent: The timestamp and parameter value of subsequent assignments are modeled in terms of their corresponding fresh assignments. Generate a subsequent assignment in two steps: In a first step, draw a time difference tdiff between the subsequent assignment to generate at timestamp trand and its corresponding fresh assignment at tfresh. More specifically, draw an interval [λ·δ(λ+1)·δ) according to the probability given by psub (line 14), then draw tdiff uniformly from this interval (line 15), and calculate tfresh as difference between trand and tdiff (line 16). In a second step, generate a subsequent assignment with parameter value vij set to the value of the predicted series of fresh assignments SFA(, hij) at tfresh (line 17).
    • Regular: Generate a regular assignment with parameter value vij drawn according to the probability given by preg (line 19).

FIGS. 16A through 16C illustrate the prediction of regular, fresh, and subsequent assignments to host variable hij at a random future timestamp trand∈[t, t+δ). Also shown is the predicted SFAP (, hij) 1612. The value of the fresh assignment vij is obtained as SFA(, hij)[trand] as illustrated 1610 by the line 1612 in FIG. 16A. For the subsequent assignment, the time difference tdiff is shown drawn between tfresh and trand as illustrated 1620 in FIG. 16B. The value of the subsequent assignment {circumflex over (v)}ij is then obtained as SFA(, hij)[tfresh]. Finally, value vij′ of a regular assignment is already regularly assigned to hij as illustrated 1630 in FIG. 16C (see also vij′ in FIG. 14A).

The second phase of the framework feeds the predicted workload into a physical database design advisor (e.g., a table partitioning advisor). According to some embodiments, a table partitioning advisor for column stores to minimize main memory costs in $ while meeting all performance guarantees assured in Service-Level Agreements (“SLAs”). Although $ is used herein as an example, note that embodiments may be associated with any other currency or representation of monetary costs. As the amount of provisioned DRAM dominates hardware costs, this is in particular crucial for DBaaS providers. The advisor may group frequently accessed data into hot column partitions that remain in main memory, and rarely accessed data into cold column partitions that can be moved to cheaper storage layers.

Given (i) the maximum execution time SLA(W) of a workload W in seconds as a performance SLA, (ii) the number of accesses f(C, W) by workload W to column partition C, and (iii) the π-second rule (which accounts for prices, capacities, and performance of recent DRAM and disk) that expresses the economical break-even point between storing data in DRAM versus performing disk IO on each access, a column partition C is classified as hot if it is accessed more frequently than every π-seconds:

π := Disk Costs [ $ / sec ] Disk IO P [ Page / sec ] / DRAM costs [ $ / Page / sec ] isHot ( C , W ) S L A ( W ) [ sec ] f ( C , W ) π [ sec ] .

The classification of a column partition C determines its cost. Cold-classified column partitions can be pruned during the evaluation of the selection predicate (e.g., as described with respect to FIGS. 1A and 1B) and therefore are not required to be held in DRAM during the workload. Because disk IO is performed instead for each access, their memory costs depend on the number of accessed pages by workload W and the costs in $ per disk IOP. In contrast, a hot-classified column partition depends on the DRAM costs in $/byte/second, the column partition size ∥C∥ in bytes, and the maximum workload execution time SLA(W) in seconds:

workload ( C , W ) := { h o t ( C , W ) [ $ ] if isHot ( C , W ) cold ( C , W ) [ $ ] otherwise h o t ( C , W ) := [ $ / Byte sec ] · C [ Byte ] · SLA ( W ) [ sec ] cold ( C , W ) := f ( C , W ) · [ C [ Byte ] Page Size [ Byte Page ] ] · Disk Costs [ $ / sec ] Disk IOP Page sec .

According to some embodiments, the physical layout is changed only if the expected benefits of workload cost savings outweigh the costs of data reorganization, a cost model may include table repartitioning costs. A straightforward approach would be to calculate repartitioning costs globally, i.e., at the level of the entire physical layout. However, if an enumeration algorithm operates at the level of column partitions, the system may compute the repartitioning costs per column partition. Further embodiments may model repartitioning costs in $ to be able to compare them to workload costs.

If a column partition C∈Lfut, enumerated in the future layout Lfut, already exists in the current layout Lcur, the repartitioning cost may be set to $0. Otherwise, C is created by splitting or merging partitions in Lcur. In this case, some assumptions might be made:

    • (1) As column stores often employ read-optimized data structures (e.g., compressed dictionaries) that cannot be modified easily, assume that C is created from scratch.
    • (2) Each record written to column partition C was read from an existing column partition in Lcur. Accordingly, assume that the total amount of data accessed during repartitioning is twice the column partition size ∥C∥.
    • (3) To reduce the impact on performance SLAs, ensure fast repartitioning times by assuming that all data accessed during repartitioning is held in DRAM.
    • (4) Assume that the system exhibits a repartitioning rate Rrate in Byte/sec for creating column partition C, as well as a fixed overhead Rovrhd in sec for initializing C.

Based on these assumptions, the repartitioning costs will first be defined and then the system will combine the workload and repartitioning costs into the total cost in $ of a column partition C∈Lfut:

repart ( C , L cur ) := { $0 if C L cur DRAM Costs [ $ Byte sec ] 2 C [ Byte ] ( 2 C [ Byte ] R rate [ Byte sec ] + R ovrhd [ sec ] ) otherwise total ( C , W , L cur ) := workload ( C , W ) + repart ( C , L cur ) .

As a second modification, an objective function may be modified. Contrary to some systems, which minimize the workload costs workload of the observed workload , some embodiments propose the future layout Lfut that minimizes the combined workload and repartitioning costs total of the predicted workload .

arg min L fut C L fut total ( C , , L cur )

A natural question that arises during the optimization of this equation: At which timestamp in the future is a repartitioning beneficial? In general, repartitioning costs can be seen as a mortgage that needs to be amortized by the benefit in terms of workload costs in a future layout. The longer the duration of the predicted workload (i.e., fe−fs), the more likely the repartitioning costs are amortized. On the contrary, the longer the prediction, the less accurate the prediction becomes. To resolve this conflict, some embodiments calculate the duration of the predicted workload as a prediction confidence factor μ ∈ times the observation period ω∈ (i.e., oe−os).

    • os:=current timestamp
    • oe:=os
    • fs:=oe
    • fe:=fs+η·ω.

FIG. 17 shows 1700 the impact of η on the choice of a future database layout Lfut according to some embodiments. The x-axis shows future timestamp fs, fe′, and fe″ depending on the choice of η. The total costs in $ for Lcur and two future layouts Lfut1 and Lfut2 are shown on the y-axis. The costs for both future layouts are greater than $0 at the y-axis intercept due to repartitioning costs. Observe that Lcur is optimal for a short-time prediction as the repartitioning costs of both future layouts are not yet amortized. In contrast, for a mid-term and long-term prediction Lfut1 and Lfut2, respectively, become optimal.

For the integration of the advisor into the prediction framework, assume a fixed observation period ω∈ and a fixed prediction confidence factor η∈ as prerequisites. Also initialize the start timestamp of the observed workload os as the current timestamp and the layout Lcur as the current layout. The system can then start the first iteration of the framework by computing the end timestamp of the observed workload oe as os+ω and waiting until that time. Using the algorithm of Table III, the system next calculates the predicted workload between start timestamp fs set to oe and end timestamp fe set to fs+η·ω based on the observed workload between timestamps os and oe. Afterwards, the predicted workload and the current layout Lcur can be fed into the physical database design advisor and the system may switch to the proposed future layout Lfut. Note that the current layout could also be considered as the best alternative. Finally, we increment os by ƒ·ω and start the next iteration of the framework.

During testing, a workload during a regular workday between 08:00 and 20:00 was used. During this period, 401,280,169 SQL statements were executed. An observation period ω was set to 360 minutes and the prediction confidence factor η was set to 1.0. Further, a discretization interval δ of 5 minutes was used and the parameters for detecting linear and exponential drifts φlin and φexp were set to 0.8 each. To detect reoccurring drifts and static workloads, set φcyc and φstatic to 0.1 each. Both parameters represent a maximal allowed fluctuation (and fluctuations smaller than 10% may be imperceptible). Finally, Rrate was set to 133 MB/sec and Rovrhd to 0.16 seconds using an q-error approximation.

The backward-looking approach used by most existing work was used as a baseline, where the observed workload instead of the predicted workload is fed into an advisor. To demonstrate the potential of embodiments described herein, a best-case analysis was performed where the actual future workload is fed into the system. For all approaches, the layout is only changed if the expected benefits of workload cost savings outweigh the costs of table repartitioning.

A memory costs and performance experiment analyzed how the partitioning layouts proposed by embodiments described herein and a backward-looking approach alter the relation between memory costs and workload performance. As hardware costs are dominated by DRAM costs, one may focus on the memory consumption. It was observed that embodiments described herein always outperform a backward-looking approach in terms of performance. For example, with enough available memory (i.e., high memory costs), the proposed framework can improve execution time by 1.4 compared to a backward-looking approach. The main reason for the observed performance improvements and memory cost savings is a sharper separation of frequently and rarely accessed data into hot and cold partitions as compared to the backward-looking approach. This avoids polluting the buffer cache with cold data, allows for more aggressive partition pruning, and speeds up delta merges.

A second experiment evaluated how accurately embodiments predict the future workload. For this, predictions of both the statement arrival rate and the assignment of parameter values to host variables were considered. As a measure how well the observed and predicted statement arrival rate, respectively observed and predicted assignments, match the future statement arrival rate, respectively future assignments, one may use the coefficient of determination R2 ∈(−∞, 1]. An R2 score of 1 indicates a perfect prediction, while especially negative values, which are unbounded, represent inaccurate predictions.

For the ten most expensive statements in the SQL plan cache, the R2 score for assignments of the predicted workload have a substantially higher R2 score compared to the observed workload. In addition, the R2 score of the assignments of the predicted workload is frequently close to 1.

A third experiment analyzed the impact of the observation period a and the prediction confidence factor η on the prediction accuracy. The R2 score for statement q6 for ω between 60 and 570 minutes were calculated for η between 0.1 and 8.0. In general, it was observed that longer observation periods lead to higher R2 scores. Further, the longer the predicted period, i.e., the larger η is, the less accurate predictions become. The poor accuracy resulting from η larger than 1 is expected as it is no longer possible to detect reoccurring drifts with a period longer than the observation period. The choice of ω set to 360 minutes and η set to 1.0 leads both to accurate predictions and to an amortization of the repartitioning costs due to the long prediction period.

A fourth experiment evaluated how precisely the modified version of advisor estimates table repartitioning costs. For this purpose, 363 random partitioning layouts for different SQL database tables were generated and then randomly split and merged existing partitions in those layouts. For each considered table repartitioning, Rrate was set to 133 MB/sec and Rovrhd was set to 0.16 sec as repartitioning costs were measured. There was a strong correlation between the estimated and actual repartitioning costs and the maximum q-error was 2.27.

A fifth and final experiment analyzed the time required by the framework to predict the future workload for observation periods ω between 60 and 360 minutes and η was set to 1.0. The required prediction time ranges between only 1.2% and 3.2% of the observation period. Since the advisor also has a low optimization time, one can conclude that that framework is suitable for practical usage.

FIG. 18 is a framework for workload prediction and physical database design display 1800 according to some embodiments. The display 1800 includes a graphical representation 1810 or dashboard that might be used by an operator or administrator to configure the components described herein (e.g., for a cloud computing environment). In particular, selection of an element (e.g., via a touchscreen or computer mouse pointer 1890) might result in the display of a popup window that contains more detailed data. The display 1800 may also include a user selectable “Save” icon 1820 to store configurations and/or system mappings (e.g., to data sources) and an “Update” icon 1830 to adjust values as appropriate.

Note that the embodiments described herein may be implemented using any number of different hardware configurations. For example, FIG. 19 is a block diagram of an apparatus or platform 1900 that may be, for example, associated with the systems 200, 300, 500, and 600 of FIGS. 2, 3, 5, and 6, respectively (and/or any other system described herein). The platform 1900 comprises a processor 1910, such as one or more commercially available Central Processing Units (“CPUs”) in the form of one-chip microprocessors, coupled to a communication device 1960 configured to communicate via a communication network (not shown in FIG. 19). The communication device 1960 may be used to communicate, for example, with one or more databases 1924 (e.g., via a communication network 1922), system administrators, etc. The platform 1900 further includes an input device 1940 (e.g., a computer mouse and/or keyboard to input, create and/or manage prediction algorithm parameters) and/an output device 1950 (e.g., a computer monitor to render a display, transmit recommendations, generate alerts, and/or create reports about workload predictions, physical database design recommendations, etc.). According to some embodiments, a mobile device and/or PC may be used to exchange information with the platform 1900.

The processor 1910 also communicates with a storage device 1930. The storage device 1930 may comprise any appropriate information storage device, including combinations of magnetic storage devices (e.g., a hard disk drive), optical storage devices, mobile telephones, and/or semiconductor memory devices. The storage device 1930 stores a program 1912 and/or workload prediction framework engine 1914 for controlling the processor 1910. The processor 1910 performs instructions of the programs 1912, 1914, and thereby operates in accordance with any of the embodiments described herein. For example, the processor 1910 may be associated with a cloud computing environment and receive observed workload information. Based on the observed workload information, a SAR prediction may be generated by the processor 1910. In addition, a host variable assignment prediction may be generated by the processor 1910 based on the observed workload information. The processor 1910 may then use the SAR prediction and the host variable assignment prediction to automatically create a workload prediction for the database. A physical database design advisor (e.g., a table partitioning advisor) may receive the workload prediction and, responsive to the workload prediction, automatically generate a recommended physical layout for the database (e.g., using a cost model, the current physical layout, and an objective function).

The programs 1912, 1914 may be stored in a compressed, uncompiled and/or encrypted format. The programs 1912, 1914 may furthermore include other program elements, such as an operating system, clipboard application, a database management system, and/or device drivers used by the processor 1910 to interface with peripheral devices.

As used herein, information may be “received” by or “transmitted” to, for example: (i) the platform 1900 from another device; or (ii) a software application or module within the platform 1900 from another software application, module, or any other source.

In some embodiments (such as the one shown in FIG. 19), the storage device 1930 further stores a workload prediction data store 2000. An example of a database that may be used in connection with the platform 1900 will now be described in detail with respect to FIG. 20. Note that the database described herein is only one example, and additional and/or different information may be stored therein. Moreover, various databases might be split or combined in accordance with any of the embodiments described herein.

Referring to FIG. 20, a table is shown that represents the workload prediction data store 2000 that may be stored at the platform 1900 according to some embodiments. The table may include, for example, entries identifying workload predictions and physical database design changes for a cloud computing environment. The table may also define fields 2002, 2004, 2006, 2008, 2010, 2012 for each of the entries. The fields 2002, 2004, 2006, 2008, 2010, 2012 may, according to some embodiments, specify a database identifier 2002, an observed workload identifier 2004, SAR prediction data 2006, host variable assignment prediction data 2008, a workload prediction 2010, and a recommended physical layout 2012 for a database. The workload prediction data store 2000 may be created and updated, for example, when new observations are received, a new DBaaS is implemented, etc.

The database identifier 2002 might be a unique alphanumeric label that is associated with an SQL database to be analyzed and/or optimized. The observed workload identifier 2004 may be a file associated with a substantial number of historic statements. The SAR prediction data 2006 and host variable assignment prediction data 2008 may be automatically generates by any of the embodiments described herein (e.g., based on the observed workload). The workload prediction 2010 may be created by combining the SAR prediction data 2006 and host variable assignment prediction data 2008. The recommended physical layout for a database may be generated by an advisor based on the workload prediction and the current physical layout (e.g., an advisor associated with table partitioning, data compression, buffer pool size, another advisor that is based on access counters of a workload, etc.).

Thus, embodiments may provide accurate and efficient workload predictions which can serve as an input to a physical database design advisor (e.g., in a cloud computing environment). The following illustrates various additional embodiments of the invention. These do not constitute a definition of all possible embodiments, and those skilled in the art will understand that the present invention is applicable to many other embodiments. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above-described apparatus and methods to accommodate these and other embodiments and applications.

Although specific hardware and data configurations have been described herein, note that any number of other configurations may be provided in accordance with some embodiments of the present invention (e.g., some of the information associated with the databases described herein may be combined or stored in external systems). Moreover, although some embodiments are focused on particular types of statements and advisors, etc., any of the embodiments described herein could be applied to other types of statements and advisors. Moreover, the displays shown herein are provided only as examples, and any other type of user interface could be implemented. For example, FIG. 21 illustrates a tablet computer 2100 providing a framework for workload prediction and physical database design display 2110. The display 2110 might be used, for example, to create recommended database adjustments to improve performance/costs in a cloud computing or other environment. Moreover, the display 2110 might be used to update and/or create prediction algorithms, layout recommendations, etc. via a “Save” icon 2120.

The present invention has been described in terms of several embodiments solely for the purpose of illustration. Persons skilled in the art will recognize from this description that the invention is not limited to the embodiments described but may be practiced with modifications and alterations limited only by the spirit and scope of the appended claims.

Claims

1. A system associated with a cloud computing environment, comprising:

a workload prediction framework, including: a computer processor, and a computer memory storing instructions that, when executed by the computer processor, cause the workload prediction framework to: receive observed workload information associated with a database in the cloud computing environment, based on the observed workload information, generate a Statement Arrival Rate (“SAR”) prediction, based on the observed workload information, generate a host variable assignment prediction, and using the SAR prediction and the host variable assignment prediction to automatically create a workload prediction for the database; and
a physical database design advisor, coupled to the workload prediction framework, to receive the workload prediction and, responsive to the workload prediction and a current physical layout of the database, automatically generate a recommended physical layout for the database.

2. The system of claim 1, wherein the workload prediction is associated with at least one of: (i) a linear workload drift, (ii) an exponential workload drift, (iii) a reoccurring workload drift, (iv) a static workload, and (v) an irregular workload drift.

3. The system of claim 2, wherein the workload prediction is based on a combination of multiple workload drift types.

4. The system of claim 1, wherein the SAR prediction is generated by the following algorithm:

discretizing the observed workload,
detecting at least one workload drift type,
classifying multiple workload drift types, and
predicting a future SAR using an observed SAR and the classified workload drift.

5. The system of claim 1, wherein the host variable assignment prediction is generated by the following algorithm:

detecting and classifying an assignment using a threshold interval, and
predicting future fresh assignments and a modeled distribution of subsequent and regular assignments.

6. The system of claim 1, wherein the workload prediction is based on: (i) regular assignments, (ii) fresh assignments, and (iii) subsequent assignments.

7. The system of claim 1, wherein the physical database design advisor is associated with at least one of: (i) a table partitioning advisor, (ii) an index advisor, (iii) a data compression advisor, (iv) a buffer pool size advisor, and (v) any other advisor that is based on access counters of a workload.

8. The system of claim 1, wherein the physical database design advisor is associated a table partitioning advisor that utilizes a cost model, a current physical layout, and an objective function.

9. The system of claim 8, wherein the table partitioning advisor computes repartitioning costs on a per-column partition basis.

10. The system of claim 9, wherein the table partitioning advisor proposes a future layout that minimizes a combined workload and repartitioning cost of the workload prediction.

11. A computer-implemented method associated with a cloud computing environment, comprising:

receiving, by a computer processor of a workload prediction framework, observed workload information associated with a database in the cloud computing environment;
based on the observed workload information, generating a Statement Arrival Rate (“SAR”) prediction;
based on the observed workload information, generating a host variable assignment prediction;
automatically creating a workload prediction for the database using the SAR prediction and the host variable assignment prediction; and
responsive to the workload prediction and a current physical layout of the database, automatically generating a recommended physical layout for the database by a physical database design advisor.

12. The method of claim 11, wherein the workload prediction is associated with at least one of: (i) a linear workload drift, (ii) an exponential workload drift, (iii) a reoccurring workload drift, (iv) a static workload, and (v) an irregular workload drift.

13. The method of claim 12, wherein the workload prediction is based on a combination of multiple workload drift types.

14. The method of claim 11, wherein the SAR prediction is generated by the following algorithm:

discretizing the observed workload,
detecting at least one workload drift type,
classifying multiple workload drift types, and
predicting a future SAR using an observed SAR and the classified workload drift.

15. The method of claim 11, wherein the host variable assignment prediction is generated by the following algorithm:

detecting and classifying an assignment using a threshold interval, and
predicting future fresh assignments and a modeled distribution of subsequent and regular assignments.

16. The method of claim 11, wherein the workload prediction is based on: (i) regular assignments, (ii) fresh assignments, and (iii) subsequent assignments.

17. A non-transitory, computer-readable medium storing instructions, that, when executed by a processor, cause the processor to perform a method associated with a cloud computing environment, the method comprising:

receiving, by a computer processor of a workload prediction framework, observed workload information associated with a database in the cloud computing environment;
based on the observed workload information, generating a Statement Arrival Rate (“SAR”) prediction;
based on the observed workload information, generating a host variable assignment prediction;
automatically creating a workload prediction for the database using the SAR prediction and the host variable assignment prediction; and
responsive to the workload prediction and a current physical layout of the database, automatically generating a recommended physical layout for the database by a physical database design advisor.

18. The medium of claim 17, wherein the physical database design advisor is associated with at least one of: (i) a table partitioning advisor, (ii) an index advisor, (iii) a data compression advisor, (iv) a buffer pool size advisor, and (v) any other advisor that is based on access counters of a workload.

19. The medium of claim 17, wherein the physical database design advisor is associated a table partitioning advisor that utilizes a cost model, a current physical layout, and an objective function.

20. The medium of claim 19, wherein the table partitioning advisor computes repartitioning costs on a per-column partition basis.

21. The medium of claim 20, wherein the table partitioning advisor proposes a future layout that minimizes a combined workload and repartitioning cost of the workload prediction.

Patent History
Publication number: 20230306009
Type: Application
Filed: Mar 28, 2022
Publication Date: Sep 28, 2023
Inventors: Michael BRENDLE (Leimen), Norman MAY (Karlsruhe), Robert SCHULZE (Chemnitz), Alexander BOEHM (Schwetzingen), Michael GROSSNIKLAUS (Kreuzlingen), Guido MOERKOTTE (Schriesheim)
Application Number: 17/705,728
Classifications
International Classification: G06F 16/22 (20060101); G06F 16/21 (20060101); G06F 16/25 (20060101); G06F 11/34 (20060101);