NETWORK DEVICE AND METHOD FOR MONITORING OF BACKEND TRANSACTIONS IN DATA CENTERS

Info

Publication number: 20090125496
Type: Application
Filed: Apr 17, 2008
Publication Date: May 14, 2009
Applicant: B-HIVE NETWORKS, INC (San Mateo, CA)
Inventors: Asaf Wexler (Raanana), Mayan Weiss (Alfei Manashe), Or Kroyzer (Tel Aviv), Ronen Heled (Kiryat-Ono)
Application Number: 12/105,092

Abstract

A network device and method for learning and monitoring transactions executed by back-end systems in data centers. Specifically, it allows learning and monitoring of at least structured query language (SQL) transactions sent from an application server hosting a web application to a database server and executed thereon. Monitoring of SQL transactions allows measuring performance parameters with regard to databases, databases' tables, operations, and queries that are part of the transactions. Furthermore, the measurement of performance parameters with respect to HTTP requests of the respective SQL transactions is provided.

Description

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. provisional application No. 60/987,743, filed on Nov. 13, 2007, which is hereby incorporated by reference for all that it contains.

TECHNICAL FIELD

The present invention relates generally to controlling and managing the performance of web applications in data centers, and more specifically to the monitoring of transactions executed by back-end systems.

BACKGROUND OF THE INVENTION

Enterprises and organizations expose their business information and functionality on the web through software applications, usually referred to as “web applications.” Web applications provide great opportunities for an organization and rely on Internet technologies and infrastructures. These applications are generally event-driven software programs that react to hypertext transfer protocol (HTTP) requests from the client. The applications are generally executed on application servers coupled to back-end systems.

FIG. 1 shows an exemplary data center 100 that is utilized for executing web applications. Clients 110 submit requests (e.g., HTTP requests) to web servers 120 through a network 170. A load balancer 160 distributes the requests between the servers 120 to balance the load. A web server 120 dynamically generates presentation, for example, using servlets, or extensible markup language (XML), extensible style-sheet language (XSL), and the like. Application servers 130 are often responsible for deploying and running the business logic layer and for interacting with and integrating various enterprise-wide resources, such as web servers 120 and back-end systems 150. The back-end systems 150 may include, for example, a database server and a legacy system. Typically, the back-end systems 150 operate and respond to requests sent from the clients 110 and forwarded by the application servers 130.

As an example, the web application executed by the data center 100 is a finance application (such as one used to access a bank account) through which a user of a client 110 requests to view the account's balance. Typically, the client 110 generates an HTTP request that triggers a SQL query with input values of at least the account number of the user. In that case, a URL field in the HTTP request includes the account number and the requested action. The SQL query generated based on that input in the URL's field may be:

select balance from Accounts where Account_Number=<input account number>

An application server 130 processes the incoming HTTP request and forwards the SQL query to one of the back-end systems 150 (e.g., a database server) that maintains the account's balance of the user. That is, the back-end system executes the SQL query generated in response to the HTTP request from the client, and thereafter replies with the balance value, which is presented to the user by a web server 120.

In the related art there are many tools to monitor the operation and performance of data centers in order to prevent situations of an unpredictable level of service and uncontrolled user experience. Typically, such monitoring tools provide the function of fault management or performance management. Fault management pertains to whether a component (a device or application) is operating or not. Performance management pertains to a measure of how well a component is working and to historical and future trends.

Existing monitoring tools typically measure performance parameters, such as latency, number of errors, and throughput, that may be influenced by different systems in the path. For example, a latency measure encompasses the entire duration between the sending of an HTTP request and the receipt of the full reply. This information is not always sufficient to help a system administrator determine the source of problems. That is, a problem can be at the network 170, the web servers 120, the application servers 130, or the back-end systems 150. Furthermore, existing tools typically monitor the performance of back-end systems as stand-alone systems and not as part of the data center. As a result, such tools cannot correlate HTTP requests with transactions (e.g., SQL queries) executed by back-end systems, and thus cannot help a system administrator in identifying the root cause of performance problems.

Therefore, it would be advantageous to provide a solution for monitoring transactions executed in back-end systems of a data center.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1—is a non-limiting data center utilized for executing web applications;

FIG. 2—is a diagram of a network system used to describe the various embodiments in accordance with the present invention;

FIG. 3—is a non-limiting block diagram of the network device disclosed in accordance with an embodiment of the present invention;

FIG. 4—is a diagram of a transaction tree;

FIG. 5—is a flowchart describing the operation of a network device disclosed in accordance with an exemplary embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

To overcome the shortcomings of prior art monitoring tools, a network device and method for learning and monitoring transactions executed by back-end systems in data centers are provided. Specifically, the present invention allows learning and monitoring of at least structured query language (SQL) transactions sent from an application server hosting a web application to a database server and executed thereon. Monitoring of SQL transactions allows the measuring of performance parameters with regard to databases, databases' tables, operations, and queries that are all part of the transactions. Furthermore, the invention allows measuring the performance parameters with respect to HTTP requests of respective SQL transactions.

FIG. 2 shows a non-limiting and exemplary diagram of a data center 200 used to describe the principles of the present invention. The system 200 includes clients 210-1 through 210-N, web servers 220-1 through 220-M, application servers 230-1 through 230-Q connected to back-end systems 250, a load balancer 260, and a network 270. The system 200 further includes a network device 280 connected between the application servers 230 and the back-end systems 250. The connection may be through a SPAN port of a switch (not shown) or a TAP device (not shown) that sniffs traffic sent from the application servers 230.

The web servers 220 process requests sent from the clients 210 and respond with the processing result. The application servers 230 execute the business logic of the web applications and communicate with the back-end systems 250, which implement the data layer of the applications. The load balancer 260 mainly distributes incoming requests to servers 220 and 230 that run the web applications to which the requests are targeted.

The back-end systems 250 may include database servers, legacy systems, and the like. A database server may be, but is not limited to, Oracle® Database Server, Microsoft® SQL server, DB2, Sybase, and so on. A database server may include any type of non-volatile storage that is directly coupled to the server. In some configurations, a web server and a web application may act as a single entity, e.g., a server 230-Q.

The network device 280 analyzes traffic directed to the back-end systems 250. As depicted in FIG. 2, the network device 280 is configured to operate in the line of traffic, i.e., traffic passes directly through the network device 280 to the back-end systems 250. The network device 280 may also operate as a passive sniffing device coupled between the application servers 230 and the back-end systems 250.

In accordance with an embodiment of the present invention, the network device 280 identifies, learns and monitors transactions, such as SQL transactions executed by at least a database server of a back-end system 250. A SQL transaction typically includes a SQL query represented by the use of a proprietary protocol of a database server. The identification task includes recognition of SQL queries in traffic flows from the application servers 230 to the database server 250. The learning task includes generating a transaction tree and classifying incoming queries into mutually exclusive groups, where each group contains a set of SQL queries that represent a logical action. The monitoring task measures performance parameters, such as latency, throughput, response time, and number of errors. These tasks are described in greater detail below.

FIG. 3 shows an exemplary and non-limiting block diagram of the network device 280 implemented in accordance with an embodiment of the present invention. The network device 280 comprises a traffic processor 310, a transaction learner 320, and a transaction monitor 330, connected to a common bus 340. The network device 280 further includes databases 360 and 370 coupled to the transaction learner 320 and a database 380 that is further coupled to the transaction monitor 330. In other embodiments, the network device 280 includes a single database commonly coupled to the transaction learner 320 and monitor 330. The transaction learner 320 further comprises classifier and collector modules (not shown).

The traffic processor 310 captures HTTP requests submitted by clients 210 and directed to the back-end systems 250. When a session is established with a back-end system 250 the traffic processor 310 tries to identify at least a SQL query in the HTTP request. A query may be in the following format:

- SELECT a, c FROM t WHERE w=z
  where ‘a’ and ‘c’ are the columns to be retrieved from the rows of a table ‘t’ that satisfy the condition ‘w=z’.

In order to determine if the traffic includes a query, the traffic processor 310 sniffs designated ports of the back-end systems 250 and buffers data packets sent through those ports. The buffered data typically includes the SQL query wrapped in the database server's protocol, and thus the data is parsed in order to extract the query. Thereafter, identified SQL queries are parsed by the traffic processor 310 to generate a “SQL skeleton”. A SQL skeleton is a logical division of the query that allows the efficient clustering of queries. As a non-limiting example, two types of SQL skeletons are defined: one that includes only the columns and tables that appear in the query, and another that includes the query's columns, tables and condition-parameter names (without the parameters' values). As an example, for the query shown above the two skeletons are:

- SELECT a, c FROM t
- SELECT a, c FROM t WHERE w

It should be apparent to a person skilled in the art that the types of SQL skeletons described herein are merely examples, and other types may be defined.
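By way of a non-limiting illustration, the following Python sketch shows one possible way to derive the two exemplary skeleton types from a simple single-table SELECT query. The regular expression, the function names `condition_names` and `make_skeletons`, and the joining of condition names with AND are assumptions made for this illustration only; they are not part of the disclosed traffic processor 310, and a practical parser would need to handle far more of the SQL grammar.

```python
import re

# Hypothetical pattern: covers only simple "SELECT cols FROM table [WHERE cond]".
_SELECT_RE = re.compile(
    r"^\s*select\s+(?P<cols>.+?)\s+from\s+(?P<table>\w+)"
    r"(?:\s+where\s+(?P<cond>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def condition_names(cond):
    """Return the parameter names of a WHERE clause, dropping their values."""
    names = []
    for term in re.split(r"\s+(?:and|or)\s+", cond, flags=re.IGNORECASE):
        # Keep only the text to the left of the comparison operator.
        parts = re.split(r"\s*(?:<=|>=|<>|!=|=|<|>)\s*", term.strip(), maxsplit=1)
        names.append(parts[0])
    return names


def make_skeletons(query):
    """Return (columns-and-tables skeleton, skeleton with condition names)."""
    m = _SELECT_RE.match(query)
    if m is None:
        raise ValueError("unsupported query: %r" % query)
    cols = ", ".join(c.strip() for c in m.group("cols").split(","))
    base = "SELECT %s FROM %s" % (cols, m.group("table"))
    if m.group("cond") is None:
        return base, base
    # Simplification: the AND/OR structure of the condition is not preserved.
    names = condition_names(m.group("cond"))
    return base, "%s WHERE %s" % (base, " AND ".join(names))
```

Under these assumptions, calling `make_skeletons("SELECT a, c FROM t WHERE w=z")` returns the pair `("SELECT a, c FROM t", "SELECT a, c FROM t WHERE w")`.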

In accordance with another embodiment of the present invention, SQL skeletons can be derived from database commands. Typically, such commands are embedded in the database server protocol and are not part of the queries. As an example, “use db” is a database command that cannot be translated to an SQL query, but is marked in a specific command in the database protocol. The SQL skeletons derived for database commands are used to cluster traffic which does not include SQL queries into mutually exclusive clusters.

The use of SQL skeletons allows the clustering of SQL queries into mutually exclusive groups, where each group contains a set of queries that represent one logical action. For example, the following queries are clustered into a single group defined by the SQL skeleton select a, b, c from t:

- select a, b, c from t
- select a, b, c from t where x=y
- select a, b, c from t where w=z
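The grouping above can be sketched in Python as follows. This is a minimal illustration that assumes the columns-and-tables skeleton type and obtains it simply by dropping the WHERE clause; the helper names `columns_tables_skeleton` and `cluster_by_skeleton` are hypothetical.

```python
import re
from collections import defaultdict


def columns_tables_skeleton(query):
    """Hypothetical helper: drop the WHERE clause and normalize whitespace."""
    head = re.split(r"\s+where\s+", query.strip(), maxsplit=1, flags=re.IGNORECASE)[0]
    return " ".join(head.split())


def cluster_by_skeleton(queries):
    """Group queries so that each query lands in exactly one skeleton group."""
    groups = defaultdict(list)
    for query in queries:
        groups[columns_tables_skeleton(query)].append(query)
    return dict(groups)
```

Because every query maps to exactly one skeleton string, the resulting groups are mutually exclusive by construction.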

The transaction learner 320 receives SQL skeletons generated by the traffic processor 310 and tries, using its classifier, to match each skeleton to already identified SQL transactions. Specifically, the classifier checks if an entry with the skeleton exists in a classification table. If so, a transaction identification (ID) number is retrieved. This means that the corresponding SQL transaction is already identified and does not need to be learnt. Otherwise, if the skeleton is not included in the table, then the skeleton, together with its respective request, is saved in the database 360 in a table that includes unclassified skeletons. In accordance with the present invention, the classification table may be a hash table or any other type of data structure that enables an association between SQL skeletons and transaction IDs.

The unclassified skeletons table further includes an appearance counter (not shown) that counts the number of appearances of each unclassified skeleton. The transaction learner 320 processes data stored in this table and attempts to discover new SQL transactions and generate a transaction tree. Specifically, a skeleton having its appearance counter above a predefined threshold is considered a classified skeleton. The transaction learner 320 may be invoked every predefined period of time or whenever the number of collected queries is above a predefined threshold. Alternatively, the transaction learner 320 may be always active.
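The classification and learning steps described above can be illustrated by the following Python sketch. It assumes an in-memory dictionary as the classification table, a counter as the unclassified skeletons table, and sequential integers as transaction IDs; the class name `TransactionLearner`, its methods, and the default threshold are hypothetical, and the storage of the respective requests in database 360 is omitted for brevity.

```python
from collections import Counter
from itertools import count


class TransactionLearner:
    """Illustrative stand-in for the classifier and collector modules."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # assumed value of the predefined threshold
        self.classified = {}            # classification table: skeleton -> transaction ID
        self.unclassified = Counter()   # unclassified table: skeleton -> appearance counter
        self._next_id = count(1)        # assumed ID scheme: 1, 2, 3, ...

    def classify(self, skeleton):
        """Return the transaction ID of a known skeleton, otherwise None."""
        tx_id = self.classified.get(skeleton)
        if tx_id is None:
            # Unknown skeleton: record one more appearance for later learning.
            self.unclassified[skeleton] += 1
        return tx_id

    def learn(self):
        """Promote skeletons whose appearance counter is above the threshold."""
        promoted = [s for s, n in self.unclassified.items() if n > self.threshold]
        for skeleton in promoted:
            self.classified[skeleton] = next(self._next_id)
            del self.unclassified[skeleton]
        return promoted
```

In this sketch `learn` would be called periodically or once enough queries have been collected, mirroring the invocation options described above.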

The transaction learner 320 identifies the structure and content of SQL transactions and registers the learnt information in a classified skeleton table (CST). The CST includes a list of identified database servers, for each database a list of its tables, and for each table a list of operations and SQL skeletons that construct the queries. The CST also includes a transaction ID associated with each classified skeleton. The transaction learner 320 further generates, for display purposes, a transaction tree. The transaction tree and the CST are also saved in database 370. An example of a transaction tree generated from the following transaction code is shown in FIG. 4.

- use db
- select a from b where c=‘d’
- select w from e
- update e set done=1
- use db1
- select a from b

A database ‘db’ includes the tables ‘b’ and ‘e’ and a database ‘db1’ includes table ‘a’. The web application uses both databases ‘db’ and ‘db1’. The operations of these queries are ‘select’ and ‘update’. The SQL skeletons are ‘select a from b where c=’; ‘select w from e’; ‘update e set done=’; and ‘select a from b’. The CST maintains at least the information presented in a transaction tree.
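As a non-limiting illustration of how such a hierarchy may be assembled, the Python sketch below builds a nested database, table, operation, and skeleton structure from a sequence of statements, treating each `use` command as selecting the current database. The function names, the regular expressions, and the nested-dictionary layout are assumptions of this sketch; the table name is taken literally from the text of each statement, and the actual format of the CST and of the tree of FIG. 4 is not specified here.

```python
import re


def skeleton_of(statement):
    """Hypothetical skeleton: keep parameter names, drop the values after '='."""
    return re.sub(r"=\s*('[^']*'|\S+)", "=", statement.strip())


def table_of(statement):
    """Find the table after FROM (select) or UPDATE; simplified on purpose."""
    m = re.search(r"\bfrom\s+(\w+)|^\s*update\s+(\w+)", statement, re.IGNORECASE)
    return (m.group(1) or m.group(2)) if m else None


def build_tree(statements):
    """Return {database: {table: {operation: [skeletons]}}}."""
    tree = {}
    current_db = None
    for stmt in statements:
        words = stmt.split()
        if not words:
            continue
        op = words[0].lower()
        if op == "use" and len(words) > 1:
            # A database command switches the database for later statements.
            current_db = words[1]
            tree.setdefault(current_db, {})
            continue
        table = table_of(stmt)
        if current_db is None or table is None:
            continue  # cannot place this statement in the tree
        skeletons = tree[current_db].setdefault(table, {}).setdefault(op, [])
        skeleton = skeleton_of(stmt)
        if skeleton not in skeletons:
            skeletons.append(skeleton)
    return tree
```

A nested mapping is used here only because it mirrors the database, table, operation, and skeleton levels described for the CST; any structure preserving that hierarchy would serve equally well.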

The transaction monitor 330 executes all activities related to the generation of statistics respective of the operation of the back-end systems 250 and their respective transactions. The statistics are measured for performance parameters including, but not limited to, throughput, response time, number of errors, latency, and so on. The statistics are measured for SQL skeletons and kept in database 380 on a per skeleton basis, on a per database basis, on a per table basis, and on a per database, table, and skeleton combination basis. In accordance with an embodiment of the disclosed invention, a plurality of reports are produced based on the gathered statistics. These reports can be presented by means of a graphical user interface (GUI), sent to a system administrator by email, and/or printed in any tangible form. The statistics computed for the performance parameters are checked to determine whether they are within the allowed range, and if not, corrective actions are taken.
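The bookkeeping on the several bases described above may be sketched in Python as follows. The class name `TransactionStats`, the key layout, and the choice of hit count, error count, and mean latency as the stored statistics are assumptions made for this illustration; the sketch also assumes that a table is identified together with its database.

```python
from collections import defaultdict


class TransactionStats:
    """Illustrative per-skeleton, per-database, per-table and combined counters."""

    def __init__(self):
        # key -> [hits, errors, total latency in seconds]
        self._rows = defaultdict(lambda: [0, 0, 0.0])

    def record(self, database, table, skeleton, latency_s, error=False):
        """Account one executed query under every aggregation basis."""
        keys = [
            ("skeleton", skeleton),
            ("database", database),
            ("table", database, table),
            ("combination", database, table, skeleton),
        ]
        for key in keys:
            row = self._rows[key]
            row[0] += 1
            row[1] += int(error)
            row[2] += latency_s

    def summary(self, *key):
        """Return hits, errors and mean latency for a key, or None if unseen."""
        row = self._rows.get(key)
        if row is None:
            return None
        hits, errors, total = row
        return {"hits": hits, "errors": errors, "mean_latency_s": total / hits}
```

For example, `stats.summary("table", "db", "e")` would return the aggregate over all skeletons observed on table ‘e’ of database ‘db’, while `stats.summary("skeleton", ...)` isolates a single skeleton.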

It should be apparent to a person skilled in the art that the monitoring of SQL skeletons allows users to identify the root cause of low performance problems. For example, if a high latency is measured for all skeletons included in the transaction tree 400, the problem is probably in the access to tables ‘b’, ‘e’ and ‘a’ in databases ‘db’ and ‘db1’. If the high latency is measured only for “select” skeletons, then the root cause is the execution of the select operation. If high latency is measured only on specific skeletons, the problem is the execution of the SQL queries related to those skeletons. In accordance with an embodiment of the invention, the entity that causes the degradation in performance is highlighted in the transaction tree 400.

In accordance with an embodiment of the present invention, statistics of the performance parameters are also measured with respect to HTTP requests of the respective transactions. To this end, the HTTP requests need to be correlated to transactions. The correlation is performed by collecting SQL transactions received during a time frame. The time frame may be the time between the reception of the HTTP request and its response. Then, it is checked whether the query's parameters also exist in the HTTP request. All SQL transactions that have matched parameters are correlated with the HTTP request.
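The correlation described above may be sketched in Python as follows. The sketch assumes that the HTTP parameters are carried in the URL query string, that query parameters are the literal values following an equals sign, and that a single shared value is sufficient for a match; the function names, the dictionary layout of the request, and the example host name used in any invocation are hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlsplit


def http_parameter_values(url):
    """Collect the parameter values carried in the URL query string."""
    return {value for _, value in parse_qsl(urlsplit(url).query)}


def query_parameter_values(query):
    """Collect literal values on the right-hand side of '=' (simplified)."""
    found = re.findall(r"=\s*('[^']*'|[^\s,;)]+)", query)
    return {value.strip("'") for value in found}


def correlate(http_request, sql_transactions):
    """Return the queries seen inside the request's time frame that share a value.

    http_request: dict with 'url', 'start' and 'end' (timestamps in seconds).
    sql_transactions: iterable of (timestamp, query) pairs.
    """
    wanted = http_parameter_values(http_request["url"])
    matched = []
    for timestamp, query in sql_transactions:
        in_window = http_request["start"] <= timestamp <= http_request["end"]
        if in_window and wanted & query_parameter_values(query):
            matched.append(query)
    return matched
```

Matching on any shared value is a deliberately loose criterion chosen for brevity; a stricter variant could require every query parameter to appear in the HTTP request.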

By correlating an HTTP request with SQL transactions, detailed measures of performance parameters can be computed. For example, the transaction monitor 330 can measure the latency in various places along the path, i.e., application latency, network latency, and back-end latency. The application latency is the time it takes to wait for a web application to respond. The network latency is the time that it takes for packets to go through the network. The back-end latency is the time required for a back-end system to execute a SQL transaction and respond to the application server 230. This is opposed to prior art approaches that measure only the time between sending a request and receiving a full response from a server. Therefore, the monitoring tasks, executed by the network device 280, produce information that allows a system administrator to easily detect the root cause of at least latency related problems.

FIG. 5 shows a non-limiting and exemplary flowchart 500 describing the operation of the network device 280, in accordance with one embodiment of the present invention. At S510, a request that may include a SQL query is sent from an application server 230 and is received at the network device 280. At S520, if the request is identified as potentially including a SQL query, the request is parsed to extract the query. At S525, the query is parsed to generate a SQL skeleton that represents that query. The type of SQL skeleton to be used is predefined. At S530, the generated SQL skeleton is classified to determine whether the query belongs to a known or unknown transaction. This is performed by matching the skeleton against a table that includes pairs of skeletons and transaction IDs. If S530 results in a valid transaction ID, the incoming request belongs to a known (learnt) transaction. At S540, it is determined whether a transaction ID was detected, and if so, execution continues with S560; otherwise, execution proceeds to S545, where the skeleton and its respective request are saved in database 360. Subsequently, the request is relayed to a back-end system 250 by either the application server or the network device. At S550, the transaction learner 320 discovers the databases, tables, operations and queries (represented by skeletons) that are part of transactions executed over the back-end system 250. The learnt information is kept, at S555, in a CST format in database 370.

At S560, statistics respective of the transactions are gathered. That is, at least for each classified skeleton, the performance parameters (throughput, response time, hits per second, latency and number of returned errors) are measured. The measured statistics are saved to database 380. It should be noted that the measured performance parameters can be compared to predefined thresholds, and if they are not within the allowed range, one or more corrective actions can be performed in order to ensure a service level according to a service level agreement (SLA).
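The threshold comparison mentioned above may be sketched in Python as follows. The representation of the allowed range as a pair of lower and upper bounds per parameter, the function name `check_thresholds`, and the optional callback used to stand in for a corrective action are assumptions of this sketch; the nature of the corrective actions themselves is not specified here.

```python
def check_thresholds(measured, allowed, on_violation=None):
    """Return the parameters whose measured value is outside its allowed range.

    measured: dict of parameter name -> measured value.
    allowed: dict of parameter name -> (low, high) inclusive bounds (assumed form).
    on_violation: optional callable(name, value) standing in for a corrective action.
    """
    violations = {}
    for name, value in measured.items():
        if name not in allowed:
            continue  # no threshold was defined for this parameter
        low, high = allowed[name]
        if not (low <= value <= high):
            violations[name] = value
            if on_violation is not None:
                on_violation(name, value)
    return violations
```

A caller could, for instance, pass a function that raises an alert for the system administrator as `on_violation`; that choice is illustrative only.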

In another embodiment of the present invention the method and network device 280 described herein can be utilized to monitor performance of databases by analyzing transactions which are not generated by web applications.

In an embodiment of the present invention, some or all of the method components are implemented as computer executable code. Such computer executable code contains a plurality of computer instructions that, when performed in a predefined order, result in the execution of the tasks disclosed herein. The computer executable code may be uploaded to, and executed by, a machine comprising any suitable architecture. Such computer executable code may be available as source code or in object code, and may be further comprised as part of, for example, a portable memory device or downloaded from the Internet, or embodied on a program storage unit or computer readable medium. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPU”), a random access memory (“RAM”), and input/output interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU or distributed across multiple CPUs or computer platforms, whether or not such computer or processor is explicitly shown. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.

The principles of the present invention may be implemented as a combination of hardware and software and because some of the constituent system components and methods depicted in the accompanying drawings may be implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the present invention is programmed.

The foregoing detailed description has set forth a few of the many forms that the present invention can take. It is intended that the foregoing detailed description be understood as an illustration of selected forms that the invention can take and not as a limitation to the definition of the invention. It is only the claims, including all equivalents, that are intended to define the scope of this invention.

Claims

1. A method for monitoring of back-end transactions in data centers, comprising:

identifying a backend query in a request sent to a backend system of a data center;

parsing the backend query to generate a backend skeleton;

classifying the backend skeleton to a known backend transaction; and

measuring performance parameters for a classified backend skeleton.

2. The method of claim 1, wherein the request is generated by at least one of: an application server and a web server.

3. The method of claim 2, wherein the backend transaction is at least a standard query language (SQL), and wherein the backend query is at least a SQL query.

4. The method of claim 3, wherein the back-end system is at least a database server, and wherein the database server is at least one of: an Oracle Database Server, a Microsoft SQL server, a DB2, and a Sybase.

5. The method of claim 1, wherein the backend skeleton is a logical division of the backend query and is utilized for clustering purposes.

6. The method of claim 5, wherein the backend skeleton is predefined.

7. The method of claim 3, wherein the performance parameters include at least one of: throughput, response time, hits per second, latency and number of returned errors.

8. The method of claim 7, wherein measures of the performance parameters are compared to predefined thresholds to determine if the backend transaction meets a service level according to a service level agreement (SLA).

9. The method of claim 1, further comprising:

processing unclassified SQL skeletons together with their respective requests and queries to discover transactions related data.

10. The method of claim 9, wherein the transactions related data comprises at least databases, tables, operations and queries that are part of transactions executed over the backend system.

11. A computer-readable medium having stored thereon computer executable code when executed by a computer for monitoring of backend transactions in data centers, the computer executable code comprising:

identifying a backend query in a request sent to a back-end system of a data center;

parsing the backend query to generate a backend skeleton;

classifying the backend skeleton to a known backend transaction; and

measuring performance parameters for a classified backend skeleton.

12. The computer-readable medium of claim 11, wherein the request is generated by at least one of: an application server and a web server.

13. The computer-readable medium of claim 11, wherein the backend transaction is at least a standard query language (SQL), and wherein the backend query is at least a SQL query.

14. The computer-readable medium of claim 11, wherein the back-end system is at least a database server, and wherein the database server is at least one of: an Oracle Database Server, a Microsoft SQL server, a DB2, and a Sybase.

15. The computer-readable medium of claim 11, wherein the backend skeleton is a logical division of the backend query and is utilized for clustering of the backend query.

16. The computer-readable medium of claim 15, wherein the backend skeleton is predefined.

17. The computer-readable medium of claim 13, wherein the performance parameters include at least one of: throughput, response time, hits per second, latency and number of returned errors.

18. The computer-readable medium of claim 17, wherein measures of the performance parameters are compared to predefined thresholds to determine if the backend transaction meets a service level according to a service level agreement (SLA).

19. The computer-readable medium of claim 11, further comprising:

processing unclassified backend skeletons together with their respective requests and queries to discover transactions related data.

20. The computer-readable medium of claim 17, wherein the transactions related data comprises at least databases, tables, operations and queries that are part of transactions executed over the backend system.

21. A network device connected in a data center and capable of learning and monitoring transactions executed by backend systems, comprises:

a traffic processor for detecting backend queries in data sent to the backend systems, and wherein the traffic processor is further capable of generating backend skeletons from backend transactions; and

a transaction learner for classifying backend skeletons to identified backend transactions; and

a transaction monitor for measuring performance parameters on backend transactions.