SERVER FARM AND METHOD FOR OPERATING THE SAME
A method for operating a server farm with a plurality of servers operably connected with each other includes: receiving a job request of a computational task to be handled by the server farm; determining, from the plurality of servers, one or more servers operable to accept the job request; determining a respective effective energy efficiency value associated with at least the one or more servers; and assigning the computational task to a server with the highest effective energy efficiency value. The effective energy efficiency value is defined by a service rate of the respective server divided by a difference between an energy consumption rate value when the respective server is busy and an energy consumption rate value when the respective server is idle. The present invention also relates to a server farm operated by the method.
The present invention relates to a system and method for operating a server farm, and particularly, although not exclusively, to an asymptotically optimal job assignment method for operating energy-efficient processor-sharing server farms.
BACKGROUND
Data centers with server farms are essential to the functioning of computer systems in different applications and sectors in the modern economy. Generally, server farms in data centers include a large number of servers that consume power during operation to process and handle jobs or computational tasks. These servers account for the major portion of energy consumption of data centers.
Since excessive power consumption in server farms may increase operation cost and cause environmental concerns, various approaches have been proposed to optimize energy utilization in server farms. In one example, speed scaling is applied to control server speed. In another example, right-sizing of server farms is applied by powering servers on/off according to traffic load.
Rapid improvements in computer hardware have resulted in frequent upgrades of parts of the server farms, and this has led to server farms with different computer resources (heterogeneous servers) being deployed. The heterogeneity of servers in a server farm significantly complicates the optimization of energy utilization. Therefore, there remains a need for server farm designers and/or operators to devise an optimal strategy in operating and managing server farms so as to conserve energy and maximize the effective energy efficiency of server farms.
SUMMARY OF THE INVENTION
In accordance with a first aspect of the present invention, there is provided a method for operating a server farm with a plurality of servers operably connected with each other, the method comprising the steps of: receiving a job request of a computational task to be handled by the server farm; determining, from the plurality of servers, one or more servers operable to accept the job request; determining a respective effective energy efficiency value associated with at least the one or more servers; and assigning the computational task to a server with the highest effective energy efficiency value; wherein the effective energy efficiency value is defined by: a service rate of the respective server divided by a difference between an energy consumption rate value when the respective server is busy (performing computational tasks) and an energy consumption rate value when the respective server is idle (not performing computational tasks). The method steps need not be performed in the order listed, provided that the resulting order is logical. For example, the job request could be received after the one or more servers operable to accept the job request are determined. Optionally, the respective effective energy efficiency values associated with all of the servers, instead of only those operable to accept the job request, are determined.
In one embodiment of the first aspect, the method further comprises sorting the one or more servers according to the respective determined effective energy efficiency values. The sorting could be in ascending or descending order.
In one embodiment of the first aspect, the step of determining from the plurality of servers one or more servers operable to accept the job request comprises determining, from the plurality of servers, all servers operable to accept the job request.
In one embodiment of the first aspect, the plurality of servers cannot be powered off during operation of the server farm.
In one embodiment of the first aspect, assignment of computation tasks in the server farm is substantially independent of an arrival rate of computation tasks at the server farm.
In one embodiment of the first aspect, assignment of computation tasks in the server farm is substantially independent of a respective size of the computation tasks received at the server farm.
In one embodiment of the first aspect, the plurality of servers each includes a finite buffer for queuing job requests.
In one embodiment of the first aspect, the one or more servers operable to accept the job request each has at least one vacancy in their respective buffer.
In one embodiment of the first aspect, the server farm is heterogeneous in that some or all of the plurality of servers can have different server speeds, energy consumption rates, and/or buffer sizes.
In one embodiment of the first aspect, the server farm is a non-jockeying server farm in which a computational task being handled by one of the plurality of servers cannot be reassigned to other servers.
In accordance with a second aspect of the present invention, there is provided a system for operating a server farm with a plurality of servers operably connected with each other, the system comprising one or more processors arranged to: receive a job request of a computational task to be handled by the server farm; determine, from the plurality of servers, one or more servers operable to accept the job request; determine a respective effective energy efficiency value associated with at least the one or more servers; and assign the computational task to a server with the highest effective energy efficiency value; wherein the effective energy efficiency value is defined by: a service rate of the respective server divided by a difference between an energy consumption rate value when the respective server is busy (performing computational tasks) and an energy consumption rate value when the respective server is idle (not performing computational tasks).
In one embodiment of the second aspect, the one or more processors may be incorporated in one or more servers in the server farm. In another embodiment, the one or more processors may be arranged external to the server farm, but are operably connected with the servers in the server farm.
In accordance with a third aspect of the present invention, there is provided a server farm comprising: a plurality of servers operably connected with each other; one or more processors operably connected with the plurality of servers, the one or more processors being arranged to: receive a job request of a computational task to be handled by the server farm; determine, from the plurality of servers, one or more servers operable to accept the job request; determine a respective effective energy efficiency value associated with at least the one or more servers; and assign the computational task to a server with the highest effective energy efficiency value; wherein the effective energy efficiency value is defined by: a service rate of the respective server divided by a difference between an energy consumption rate value when the respective server is busy (performing computational tasks) and an energy consumption rate value when the respective server is idle (not performing computational tasks).
In one embodiment of the third aspect, the one or more processors are further operable to sort the one or more servers according to the respective determined effective energy efficiency values.
In one embodiment of the third aspect, the one or more processors are further operable to: determine, from the plurality of servers, all servers operable to accept the job request.
In one embodiment of the third aspect, the plurality of servers cannot be powered off during operation of the server farm.
In one embodiment of the third aspect, the one or more processors are arranged such that assignment of computation tasks in the server farm is substantially independent of an arrival rate of computation tasks at the server farm.
In one embodiment of the third aspect, the one or more processors are arranged such that assignment of computation tasks in the server farm is substantially independent of a respective size of the computation tasks received at the server farm.
In one embodiment of the third aspect, the plurality of servers each includes a finite buffer for queuing job requests; and wherein the one or more servers operable to accept the job request each has at least one vacancy in their respective buffer.
In one embodiment of the third aspect, the server farm is heterogeneous in that the plurality of servers can have different server speeds, energy consumption rates, and/or buffer sizes.
In one embodiment of the third aspect, the server farm is a non-jockeying server farm in which a computational task being handled by one of the plurality of servers cannot be reassigned to other servers.
In one embodiment of the third aspect, the one or more processors are incorporated in at least one of the plurality of servers.
In accordance with a fourth aspect of the present invention, there is provided a non-transient computer readable medium for storing computer instructions that, when executed by one or more processors, cause the one or more processors to perform a method for operating a server farm with a plurality of servers operably connected with each other, the method comprising the steps of: receiving a job request of a computational task to be handled by the server farm; determining, from the plurality of servers, one or more servers operable to accept the job request; determining a respective effective energy efficiency value associated with at least the one or more servers; and assigning the computational task to a server with the highest effective energy efficiency value; wherein the effective energy efficiency value is defined by: a service rate of the respective server divided by a difference between an energy consumption rate value when the respective server is busy (performing computational tasks) and an energy consumption rate value when the respective server is idle (not performing computational tasks).
It is an object of the present invention to address the above needs, to overcome or substantially ameliorate the above disadvantages or, more generally, to provide an improved method for assigning jobs in a large-scale server farm by taking into account the power consumed by the servers when idle.
Embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings in which:
The environment 100 in
Referring to
A person skilled in the art would appreciate that the steps 302, 304, 306, 308 in method 300 need not be performed in the order as listed, but can be in any other order as long as it is logical. For example, steps 304 and 306 can be performed before step 302.
In a preferred embodiment, the server farm is heterogeneous in that the servers can have different server speeds, energy consumption rates, and/or buffer sizes. All servers in the farm, even the idle ones, have a non-negligible energy consumption rate. Preferably, the server farm is a non-jockeying server farm in which a computational task being handled by one of the servers cannot be reassigned to other servers. In one embodiment, the plurality of servers cannot be powered off during operation of the server farm. In practice, this could refer to periods of operation during which no powering off of the servers takes place. In one embodiment, method 300 may be combined with a right-sizing technique by powering off idle servers, although frequent powering off/on increases wear and tear and the need for costly replacement and maintenance.
In a preferred embodiment, the processor sharing (PS) discipline is imposed on each queue of the servers, so that all jobs on the same queue share the processing capacity and are served at the same rate. This arrangement avoids unfair delays for those jobs that are preceded by extremely large jobs, making it an appropriate model for web server farms, where job-size distributions are highly variable. The finite buffer size queuing model with PS discipline can be applied in situations where a minimum service rate is required for processing a job in the system.
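As a minimal illustration of the PS discipline, the following Python sketch computes the completion times of a batch of jobs that are all present on one server at time zero, with no further arrivals. The function name `ps_completion_times` and the batch setting are assumptions made for illustration only and are not part of the described embodiment.

```python
def ps_completion_times(sizes, service_rate):
    """Completion time of each job under processor sharing (PS).

    Assumes all jobs are present at time 0 and no new jobs arrive.
    With n jobs present, each job is served at service_rate / n.
    """
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    # Jobs leave in order of increasing size.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    finish = [0.0] * len(sizes)
    now = 0.0         # current time
    done_work = 0.0   # work already received by every unfinished job
    remaining = len(sizes)
    for i in order:
        # Each unfinished job needs (sizes[i] - done_work) more work before
        # job i leaves; capacity is split equally among `remaining` jobs.
        now += (sizes[i] - done_work) * remaining / service_rate
        done_work = sizes[i]
        finish[i] = now
        remaining -= 1
    return finish


# A job of size 1 sharing the server with larger jobs of size 3 and 2.
print(ps_completion_times([3.0, 1.0, 2.0], service_rate=1.0))
```

In this sketch the smallest job leaves first even though a larger job is listed ahead of it, which is the fairness property described above.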
In one embodiment, the server farm is a large-scale, realistically dimensioned server farm that cannot reject a job if it has buffer space available. Although rejecting some jobs might save energy in situations where a server farm has some inefficient servers, such rejection is not permitted in some embodiments of the present invention.
An objective function of the optimization of the present invention is the energy efficiency of a server farm, defined as the ratio of the long-run expected throughput to the expected energy consumption rate. This objective function represents the amount of useful work (e.g., data rate, throughput, processes per second) per watt, and is well-accepted as a performance measure in ICT applications.
I—System Model
The following table (Table I) defines some of the symbols used in the following description.
The present embodiment considers a heterogeneous server farm modeled as a multi-queue system in which reassignment of incomplete jobs (e.g., jobs being processed) is disallowed. In this embodiment, the server farm has $K \ge 2$ servers, forming the set $\mathcal{K} = \{1, 2, \ldots, K\}$. These servers are characterized by their service rates, energy consumption rates, and buffer sizes. For $j \in \mathcal{K}$, the service rate of server j is denoted by $\mu_j$. The energy consumption rate of server j is $\varepsilon_j$ when it is busy and $\varepsilon_j^0$ when it is idle, where $\varepsilon_j > \varepsilon_j^0 \ge 0$. In the present invention, the ratio $\mu_j/(\varepsilon_j - \varepsilon_j^0)$ is referred to as the effective energy efficiency of server j. In one embodiment, the buffer size of server j is denoted by $B_j \ge 2$.
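The server parameters and the effective energy efficiency ratio can be captured in a small data structure. The sketch below is illustrative only: the class name `Server`, its field names, and the numeric values in the usage line are hypothetical and do not come from the described embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    service_rate: float  # mu_j, jobs per time unit when busy
    busy_power: float    # epsilon_j, energy consumption rate when busy
    idle_power: float    # epsilon_j^0, energy consumption rate when idle
    buffer_size: int     # B_j, maximum number of jobs held by the server

    def __post_init__(self):
        # Enforce mu_j > 0 and epsilon_j > epsilon_j^0 >= 0 as in the model.
        if self.service_rate <= 0:
            raise ValueError("service_rate must be positive")
        if not self.busy_power > self.idle_power >= 0:
            raise ValueError("require busy_power > idle_power >= 0")

    def effective_energy_efficiency(self) -> float:
        # mu_j / (epsilon_j - epsilon_j^0)
        return self.service_rate / (self.busy_power - self.idle_power)


# Hypothetical values chosen only to exercise the formula.
print(Server(4.0, 300.0, 100.0, 8).effective_energy_efficiency())
```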
Preferably, job arrivals follow a Poisson process with rate λ, indicating the average number of arrivals per time unit. An arriving job is assigned to one of the servers with at least one vacant slot in its buffer, subject to the control of an assignment policy ϕ. In one embodiment, if all buffers are full, the arriving job is lost.
In the present embodiment, it is assumed that job sizes are independent and identically distributed. The average size of jobs is normalized, without loss of generality, to one. Preferably, each server j serves its jobs at a total rate of μj using the PS service discipline.
The following consideration is limited to realistic cases by assuming that the ratio of the arrival rate to the total service rate, $\rho = \lambda / \sum_{j=1}^{K} \mu_j$, is sufficiently large to be economically justifiable but not so large as to violate the required quality of service (QoS). In the following, $\rho$ is referred to as the normalized offered traffic.
The job throughput of the system under policy ϕ, which is equivalent to the long-run average job departure rate, is denoted by ϕ. The power consumption of the system under policy ϕ, which is equivalent to the long-run average energy consumption rate, is denoted by εϕ. By definition, ϕ/εϕ is the energy efficiency of the system under policy ϕ.
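The throughput, power consumption, and energy efficiency defined above can be estimated numerically. The following sketch is one possible event-driven simulation, under the additional assumption that job sizes are exponentially distributed with mean one, so that a busy server j completes jobs at total rate $\mu_j$ regardless of how many jobs share it. The function name `simulate_maip`, the tuple layout, and the parameter values are hypothetical; the assignment rule used is the highest-effective-energy-efficiency rule of the present embodiment.

```python
import random


def simulate_maip(servers, arrival_rate, horizon, seed=0):
    """Estimate (throughput, mean power, energy efficiency).

    servers: list of (service_rate, busy_power, idle_power, buffer_size).
    Assumes Poisson arrivals and exponential job sizes with mean 1.
    """
    rng = random.Random(seed)
    eff = [mu / (busy - idle) for mu, busy, idle, _ in servers]
    # Highest effective energy efficiency first; ties go to the lower index.
    priority = sorted(range(len(servers)), key=lambda j: -eff[j])
    jobs = [0] * len(servers)
    t = 0.0
    energy = 0.0
    departures = 0
    while True:
        busy_ids = [j for j, n in enumerate(jobs) if n > 0]
        rates = [arrival_rate] + [servers[j][0] for j in busy_ids]
        power = sum(s[1] if jobs[j] > 0 else s[2]
                    for j, s in enumerate(servers))
        dt = rng.expovariate(sum(rates))  # time to the next event
        if t + dt >= horizon:
            energy += power * (horizon - t)
            break
        energy += power * dt
        t += dt
        event = rng.choices([None] + busy_ids, weights=rates)[0]
        if event is None:
            # Arrival: use the best server with a vacancy, else the job is lost.
            for j in priority:
                if jobs[j] < servers[j][3]:
                    jobs[j] += 1
                    break
        else:
            jobs[event] -= 1  # departure from server `event`
            departures += 1
    throughput = departures / horizon
    mean_power = energy / horizon
    return throughput, mean_power, throughput / mean_power


farm = [(1.0, 2.0, 1.0, 2), (1.0, 2.5, 2.0, 2)]  # hypothetical two-server farm
print(simulate_maip(farm, arrival_rate=1.0, horizon=2000.0, seed=1))
```

The estimates depend on the random seed and the horizon, so they should be read as approximations of the long-run averages rather than exact values.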
II—MAIP Job Assignment Method
In the present embodiment, the server farm managing module makes decisions at arrival events to assign a new job to one of the servers (queues) in the server farm (queuing system). A server selected to accept new jobs is called a tagged server, while all other servers are untagged. If all of the servers are full, i.e., have no capacity to accept new job requests, then no server is tagged at that time and new arrivals are blocked until completion of some job in the system.
Preferably, MAIP is obtained by considering the effective energy efficiency of servers, taking into account the effect of idle power, i.e., energy consumption rate when the server is idle. Preferably, the method in the present embodiment always selects a server with the highest effective energy efficiency among all servers that are not full. Such a server is regarded as the most energy-efficient server available to accept new jobs.
A simple explanation of MAIP of the present embodiment is as follows. Consider a system with two servers only, where $\mu_1=\mu_2=1$, $\varepsilon_1=2$, $\varepsilon_1^0=1$, $\varepsilon_2=2.5$, and $\varepsilon_2^0=2$. It is clear that in this example $\varepsilon_1<\varepsilon_2$ and $\varepsilon_1^0<\varepsilon_2^0$. If a job arrives when both servers are idle, the scheduler has two choices:
- (1) Assigning the job to server 1 makes server 1 busy, and the energy consumption rate of the whole system becomes $\varepsilon_1+\varepsilon_2^0=4$.
- (2) Assigning the job to server 2 makes server 2 busy, and the energy consumption rate of the whole system becomes $\varepsilon_2+\varepsilon_1^0=3.5$.
Since $(\varepsilon_1+\varepsilon_2^0)>(\varepsilon_2+\varepsilon_1^0)$, which is equivalent to $(\varepsilon_1-\varepsilon_1^0)>(\varepsilon_2-\varepsilon_2^0)$, and since both servers have the same service rate, choosing server 2 for serving the job in this particular example turns out to be better in terms of the energy efficiency of the system, despite the fact that server 2 consumes more power when busy than server 1 does.
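The arithmetic of this two-server example can be checked directly. The sketch below uses the parameter values given in the example; the dictionary layout and the helper name `system_power` are illustrative assumptions.

```python
# Parameters of the two-server example.
mu = {1: 1.0, 2: 1.0}      # service rates
busy = {1: 2.0, 2: 2.5}    # energy consumption rates when busy
idle = {1: 1.0, 2: 2.0}    # energy consumption rates when idle


def system_power(busy_server):
    """Total energy consumption rate when only `busy_server` is busy."""
    return sum(busy[j] if j == busy_server else idle[j] for j in mu)


for j in mu:
    effective = mu[j] / (busy[j] - idle[j])
    print(f"job on server {j}: system power {system_power(j)}, "
          f"effective energy efficiency {effective}")
```

Server 2 yields the lower system power for the same service rate, and it is also the server with the higher effective energy efficiency, which is the quantity the assignment rule ranks by.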
In examples where power consumption of idle servers in a system is not necessarily negligible, the energy used by the system can be categorized into two parts, a productive part and an unproductive part. The productive part contributes to job throughput, whereas the unproductive part is a waste of energy. For a server j, when it is idle, the service rate is 0 accompanied by an energy consumption rate of εj0; when it is busy, the service rate becomes μj and the energy consumption rate increases to εj. The additional service rate is considered as a reward at the cost of an additional energy consumption rate εj−εj0. In other words, if jobs are assigned to server j, the productive power used to support the service rate μj is effectively εj−εj0. In the design of MAIP in one embodiment of the present invention, productive power is the main consideration.
Since MAIP in the present embodiment aims for energy-efficient job assignment, in the following description, the servers are labeled according to their effective energy efficiency. In particular, in the context of MAIP, server i is defined to be more energy-efficient than server j if and only if $\mu_i/(\varepsilon_i-\varepsilon_i^0) > \mu_j/(\varepsilon_j-\varepsilon_j^0)$. The servers are labeled in non-increasing order of effective energy efficiency; that is, for any pair of servers i and j, if $i<j$, then $\mu_i/(\varepsilon_i-\varepsilon_i^0) \ge \mu_j/(\varepsilon_j-\varepsilon_j^0)$. MAIP in the present embodiment operates by always selecting a server with the highest effective energy efficiency among all servers that contain at least one vacant slot in their buffers, where ties are broken arbitrarily. Advantageously, MAIP in the present embodiment is a simple approach that requires only binary state information (i.e., available or unavailable) from each server for its implementation.
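The selection rule described above can be sketched as a small dispatcher that ranks the servers once and then needs only to know whether each server has a vacancy. The class name `MaipDispatcher`, its methods, and the tuple layout are hypothetical and are given for illustration; ties are broken here by lowest index, which is one of the arbitrary choices the rule permits.

```python
class MaipDispatcher:
    """Assign each arriving job to the available server with the
    highest effective energy efficiency."""

    def __init__(self, servers):
        # servers: list of (service_rate, busy_power, idle_power, buffer_size)
        self.buffer_sizes = [s[3] for s in servers]
        self.jobs = [0] * len(servers)
        eff = [s[0] / (s[1] - s[2]) for s in servers]
        # Rank once; sorted() is stable, so ties keep the lower index first.
        self.priority = sorted(range(len(servers)), key=lambda j: -eff[j])

    def assign(self):
        """Return the index of the tagged server, or None if all are full."""
        for j in self.priority:
            if self.jobs[j] < self.buffer_sizes[j]:  # binary availability check
                self.jobs[j] += 1
                return j
        return None

    def complete(self, j):
        """Record that a job on server j has finished."""
        if self.jobs[j] == 0:
            raise ValueError("server has no job to complete")
        self.jobs[j] -= 1


dispatcher = MaipDispatcher([(1.0, 2.0, 1.0, 2), (1.0, 2.5, 2.0, 2)])
print([dispatcher.assign() for _ in range(5)])
```

Because the ranking depends only on fixed server parameters, it does not need to be recomputed when the arrival rate or the job sizes change.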
III—Analysis
A. Stochastic Process
Let $\mathcal{N}_j$ denote the set of all states of server j, where the state $n_j$ is the number of jobs queuing or being served at server j. Thus, $\mathcal{N}_j = \{0, 1, \ldots, B_j\}$, where $B_j \ge 2$ is the buffer size for server j. For server j, states $0, 1, \ldots, B_j-1$ are called controllable, and the state $B_j$ is called uncontrollable. The set of controllable states for server j, in which the server is available to be tagged, is denoted by $\mathcal{N}_j^{\{0,1\}} = \{0, 1, \ldots, B_j-1\}$ while, for the uncontrollable state in the set $\mathcal{N}_j^{\{0\}} = \{B_j\}$, the server is forced to be untagged because it cannot accept jobs.
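The vector $n = (n_1, n_2, \ldots, n_K)$ represents the state of the multi-queue system, where $n_j \in \mathcal{N}_j$, $j \in \mathcal{K}$. The set of all such states n is denoted by $\mathcal{N}$; the sets of uncontrollable and controllable states in $\mathcal{N}$ are, respectively,
$\mathcal{N}^{\{0\}} = \{ n \in \mathcal{N} \mid n_j \in \mathcal{N}_j^{\{0\}}, \ \forall j \in \mathcal{K} \}$,
$\mathcal{N}^{\{0,1\}} = \{ n \in \mathcal{N} \mid n \notin \mathcal{N}^{\{0\}} \}$. (1)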
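For a small farm, the state sets in equation (1) can be enumerated explicitly. The sketch below assumes the reading that a system state is uncontrollable exactly when every server is full; the function name `state_sets` is hypothetical.

```python
from itertools import product


def state_sets(buffer_sizes):
    """Split all system states into controllable and uncontrollable sets.

    A system state is a tuple (n_1, ..., n_K) with 0 <= n_j <= B_j.
    It is uncontrollable only if every server j is full (n_j == B_j).
    """
    all_states = product(*(range(b + 1) for b in buffer_sizes))
    controllable, uncontrollable = [], []
    for n in all_states:
        if all(n_j == b for n_j, b in zip(n, buffer_sizes)):
            uncontrollable.append(n)
        else:
            controllable.append(n)
    return controllable, uncontrollable


controllable, uncontrollable = state_sets((2, 2))
print(len(controllable), uncontrollable)
```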
Define $X^\phi(t) = (X_1^\phi(t), X_2^\phi(t), \ldots, X_K^\phi(t))$ to be a vector of random variables representing the state at time t under policy $\phi$ of the stochastic process of the multi-queue system. Without loss of generality, set the initial state $X^\phi(0) = x(0)$, $x(0) \in \mathcal{N}$.
Decisions made on job arrivals rely on the values of $X^\phi(t)$ just before an arrival occurs. Use $a_j^\phi(t)$, $j \in \mathcal{K}$, as an indicator of activity at time t under policy $\phi$, so that $a_j^\phi(t)=1$ if server j is tagged, and $a_j^\phi(t)=0$ otherwise. Then $\sum_{j=1}^{K} a_j^\phi(t) \le 1$ for all $t>0$. All job assignment policies considered in the present embodiment are stationary, and so $a_j^\phi(n)$, $n \in \mathcal{N}$, is used to represent the action to be taken on the stochastic process when the system is in state n. A policy $\phi$ comprises the actions $a^\phi(n) = (a_1^\phi(n), a_2^\phi(n), \ldots, a_K^\phi(n))$ for all $n \in \mathcal{N}$.
Define a mapping Rj: j→R, where Rj(nj)(nj ∈ j) is the reward rate of server j in state nj. Let j be the set of all such mappings Rj. Then, for a given vector of mappings R=(R1, R2, . . . , RK), the long-run average reward under policy ϕ is defined to be
R is referred to as the reward rate function. Along similar lines, consider μj(nj) and εj(nj), the service rate and energy consumption rate of server j in state nj, respectively, as rewards; that is μj, εj ∈ j. As previously defined, μj(nj)=μj, εj(nj)=εj for nj>0, μj(0)=0 and εj(0)=εj0, where μj>0, εj>εj0≥0, j ∈ . For the vectors μ=(μ1, μ2, . . . , μK) and ε=(ε1, ε2, . . . , εK), the long-run average job service rate of the entire system is, then, γϕ(μ) and the long-run average energy consumption rate of the system is γϕ(ε). For simplicity, long-run average job service rate and the long-run average energy consumption rate are referred to, in this description, as the job throughput and energy consumption rate, respectively. Since the energy efficiency of the system is the ratio of job throughput to energy consumption rate, the problem of maximizing energy efficiency is encapsulated in
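Written out under the definitions above, the long-run average reward and the resulting energy-efficiency objective can be sketched as follows. This is a hedged reconstruction that assumes the usual time-average form of a long-run average reward; the exact typeset form and numbering of the original displays may differ.

```latex
\gamma^{\phi}(R) = \lim_{t \to \infty} \frac{1}{t}\,
  \mathbb{E}\!\left[ \int_{0}^{t} \sum_{j=1}^{K}
  R_j\!\left(X_j^{\phi}(u)\right) du \right],
\qquad
\max_{\phi} \; \frac{\gamma^{\phi}(\mu)}{\gamma^{\phi}(\varepsilon)} .
```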
Based on the definition given above, MAIP can be formally defined as follows.
A well-known index theorem for SFABP was published in 1974 in J. C. Gittins and D. M. Jones, “A dynamic allocation index for the sequential design of experiments,” in Progress in Statistics, J. Gani, Ed. Amsterdam, NL: North-Holland, 1974, pp. 241-266. The optimal solution for the general multi-armed bandit problem (MABP) was published in 1979 in J. C. Gittins, “Bandit processes and dynamic allocation indices,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 148-177,1979. Relaxing the constraint that only one machine (project/bandit/process) is played at a time, and only the played machine changes state, Whittle, in P. Whittle, “Restless bandits: Activity allocation in a changing world,” J. Appl. Probab., vol. 25, pp. 287-298, 1988, published a more general model, the restless multi-armed bandit (RMAB) and proposed as an index the so-called Whittle's index as an approximation for optimality.
The general definition of Whittle's index for the problem of the present embodiment is given here; a closed-form expression will be provided in Section C for the case when job sizes are exponentially distributed.
Based on Theorem 1 provided in Z. Rosberg, Y. Peng, J. Fu, J. Guo, E. W. M. Wong, and M. Zukerman, “Insensitive job assignment with throughput and energy criteria for processor-sharing server farms,” IEEE/ACM Trans. Netw., vol. 22, no. 4, pp. 1257-1270, August 2014, there exists a value e*>0, given by
the optimization problem in equation (4) can be written as
where the reward rate function R=(R1, R2, . . . , RK), Rj ∈ j, Rj(nj)=μj(nj)−ε*εj(nj), j ∈ .
Following the Whittle's index approach, the problem in equation (6) can be relaxed as
Under this relaxation, the ajϕ(t) become random variables, so that more than one server may sometimes be tagged simultaneously. This is unrealistic and is not preferable in the present invention.
The linear constraint in equation (8) is covered by the introduction of a Lagrange multiplier v.
For a given v, equation (8) can be decomposed into K sub-problems:
where ajϕ(u)=0 when Xjϕ(u) ∈j(0), for 0<u<t, j ∈.
In P. Whittle, “Restless bandits: Activity allocation in a changing world, ” J. Appl. Probab., vol. 25, pp. 287-298, 1988, Whittle defined a v-subsidy policy for a project (server) as an optimal solution for equation (9), which provides the set of states where the given project will be passive (untagged), and introduced the following definition.
Definition 1. Let D(v) be the set of passive states of a project under a v-subsidy policy. The project is indexable if D(v) increases monotonically from ∅ to the set of all possible states for the project as v increases from −∞ to +∞.
In particular, if a project (server) j is indexable and there is a v* satisfying nj∉ D(v) for v≤v* and nj ∈ D(v) otherwise then this v* is the value of Whittle's index for project (server) j at state nj. Whittle's index policy for the multi-queue system chooses a controllable server (a server in controllable states) with highest Whittle's index to be tagged (with others untagged) at each decision making epoch.
C. Indexability
The closed form of the optimal solution for equation (9) is given here; it is equivalent to the Whittle's index policy for the case with exponentially distributed job sizes. The method of the present embodiment uses the theory of semi-Markov decision processes and the Hamilton-Jacobi-Bellman equation. Formulation in this way requires the exponential job size assumption, but in some embodiments the method of the present invention is not limited to such job size distribution.
Let Vjϕ
Now, let jH, j ∈ represent a process for server j that starts from state 0 until it reaches state 0 again, where ϕj is constrained to those policies satisfying ajϕ
Now an application of the g-revised criterion in S. M. Ross, Applied probability models with optimization applications. Dover Publications (New York), 1992 yields the following corollary to these two theorems.
Corollary 1. For a server j and a given v<+∞, let Rj(nj)=μj(nj)−ε*εj(nj)<+∞, there exists a real g, with Rjg(nj)−g such that if policy ϕ*j∈ ΦjH maximizes Vjϕ
In other words, by comparing the maximized average reward of process jH under policy ϕ*j and policy ϕj0 with ajϕ
The first step involves finding ϕ*j. Let Vjv(nj, Rjg)=supϕ
where τj1(nj) and τj0(nj), are the expected sojourn time in state nj for ajϕ
For equation (10), there is a specific v, referred to as v*j(nj, Rjg), satisfying
For an indexable server j, a policy can be defined as follows:
- if v<v*j(nj, Rjg), j will be tagged
- if v>v*j(nj, Rjg), j will be untagged, and
- if v=v*j(nj, Rjg), j can be either tagged or untagged. (12)
The $v_j^*(n_j, R_j^g)$, $n_j \in \mathcal{N}_j$, $j \in \mathcal{K}$, constitute Whittle's index in this context, and equation (12) defines the optimal solution for equation/problem (9). According to equation (11), although the value of $v_j^*(n_j, R_j^g)$ may appear to rely on v, it can be shown that in the present embodiment, the value of $v_j^*(n_j, R_j^g)$ can be expressed in closed form and is independent of v, and that the server farm in the present embodiment is indexable according to the definition in P. Whittle, “Restless bandits: Activity allocation in a changing world,” J. Appl. Probab., vol. 25, pp. 287-298, 1988.
Proposition 1. For the system of the present embodiment defined in Section I, j ∈ ,
The optimal policy, denoted by ϕ*j, that maximizes vjϕ
Proposition 2. For the system of the present embodiment defined in Section I, j ∈ ,
The following Proposition 3 is a consequence of Propositions 1 and 2.
Proposition 3. For the system defined in Section I, if job-sizes are exponentially distributed then the Whittle's index of server j at state nj is:
Evidently then, the system is indexable.
It is clear that the Whittle's index policy, which prioritizes servers with the highest index value at each decision making epoch, is similar to the MAIP method of the present embodiment defined in equation (4), when job sizes are exponentially distributed.
D. Asymptotic Optimality
This section serves to prove the asymptotic optimality of MAIP for the server farm of the present embodiment comprising multiple groups of identical servers, as the numbers of servers in these groups become large and when the job sizes are exponentially distributed (the number of servers is scaled under appropriate and reasonable conditions for large server farms).
The proof methodology disclosed in R. R. Weber and G. Weiss, “On an index policy for restless bandits,” J. Appl. Probab., no. 3, pp. 637-648, September 1990 for the asymptotic optimality of index policies is applied to the problem of the present embodiment. However, this proof cannot be directly applied to the present problem because of the presence of uncontrollable states (since buffering spill-over creates dependencies between servers) in the server farm of the present embodiment. In the following, an additional server is defined, designated as server K+1, to handle the blocking case when all original servers are full; this server has only one state (server K+1 never changes state) with zero reward rate. In a preferred embodiment, this is a virtual server that is used only in the proof of the asymptotic optimality in this section. In particular, $|\mathcal{N}_{K+1}^{\{0,1\}}| = 1$ and $\mathcal{N}_{K+1}^{\{0\}} = \emptyset$. Also, define $\mathcal{K}^+ = \mathcal{K} \cup \{K+1\}$ as the set of servers including this added zero-reward server. The set of controllable states of these K+1 servers is defined as $\tilde{\mathcal{N}}^{\{0,1\}} = \bigcup_{j \in \mathcal{K}^+} \mathcal{N}_j^{\{0,1\}}$ and the set of uncontrollable states is $\tilde{\mathcal{N}}^{\{0\}} = \bigcup_{j \in \mathcal{K}^+} \mathcal{N}_j^{\{0\}}$.
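In this section, servers with identical buffer size, service rate, and energy consumption rate are grouped as a server group, and these server groups are labeled as server groups $1, 2, \ldots, \tilde{K}$. For servers i, j of the same server group, $\mathcal{N}_i^{\{0,1\}} = \mathcal{N}_j^{\{0,1\}}$ and $\mathcal{N}_i^{\{0\}} = \mathcal{N}_j^{\{0\}}$. For clarity of presentation, define $\tilde{\mathcal{N}}_i^{\{0,1\}}$ and $\tilde{\mathcal{N}}_i^{\{0\}}$, $i = 1, 2, \ldots, \tilde{K}$, as, respectively, the sets of controllable and uncontrollable states of servers in server group i. States for different server groups are regarded as different states, that is, $\tilde{\mathcal{N}}_i^{\{0,1\}} \cap \tilde{\mathcal{N}}_j^{\{0,1\}} = \emptyset$ and $\tilde{\mathcal{N}}_i^{\{0\}} \cap \tilde{\mathcal{N}}_j^{\{0\}} = \emptyset$ for different server groups i and j, $i, j = 1, 2, \ldots, \tilde{K}$. Let $Z_i^\phi(t)$ be the random variable representing the proportion of servers in state $i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$ at time t under policy $\phi$. Again, states $i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$ are labeled as $1, 2, \ldots, I$, where $I = |\tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}|$, and $Z^\phi(t)$ is used to denote the random vector $(Z_1^\phi(t), Z_2^\phi(t), \ldots, Z_I^\phi(t))$. Correspondingly, actions $a_j^\phi(n_j)$, $n_j \in \mathcal{N}_j$, $j \in \mathcal{K}^+$, correspond to actions $a^\phi(i)$, $i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$.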
Let $z, z' \in \mathbb{R}^I$ be possible values of $Z^\phi(t)$, $t>0$, $\phi \in \Phi$. Transitions of the random vector $Z^\phi(t)$ from z to z′ can be written as $z' = z + e_{i,i'}$, where $e_{i,i'}$ is a vector of which the ith element is
the i'th element is
and otherwise is zero, with $i, i' \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$. In particular, for the server farm of the present embodiment defined in Section I, server j only appears in states $i \in \mathcal{N}_j$; that is, the transition from z to $z' = z + e_{i,i'}$ with $i \in \tilde{\mathcal{N}}_j^{\{0,1\}} \cup \tilde{\mathcal{N}}_j^{\{0\}}$, $i' \in \tilde{\mathcal{N}}_{j'}^{\{0,1\}} \cup \tilde{\mathcal{N}}_{j'}^{\{0\}}$, $j, j' = 1, 2, \ldots, \tilde{K}$, $j \ne j'$, never occurs. In order to address such impossible transitions, the corresponding transition probabilities are set to zero. Then, order the states $i \in \tilde{\mathcal{N}}^{\{0,1\}}$ according to descending index values, where all states $i \in \tilde{\mathcal{N}}^{\{0\}}$ come after the controllable states, with $a^\phi(i)=0$ for $i \in \tilde{\mathcal{N}}^{\{0\}}$. Next, set the state $i \in \mathcal{N}_{K+1}^{\{0,1\}}$ of the zero-reward server, which is also a controllable state, to come after all the other controllable states but to precede the uncontrollable states. Because of the existence of the zero-reward server K+1, the number of servers in controllable states can always meet the constraint in equation (7). Note here that the state of server K+1 and the uncontrollable states are manually moved to certain positions without following their indices, which are zero. It can be shown that such movements will not affect the long-run average performance of Whittle's index policy, which exists and is equivalent to MAIP in the present embodiment. The position of a state in the ordering $i = 1, 2, \ldots, I$ is also defined as its label.
Let γOR(ϕ) be the long-run average reward of the original problem in equation (6) under policy ϕ, and γLR(ϕ) be the long-run average reward of the relaxed problem in equation (7) under policy ϕ. In addition, let
the maximal long-run average reward of the original problem, and
the maximal long-run average reward of the relaxed problem. From the definition of the system of the present embodiment, γLR(ϕ)/K, γOR(ϕ)/K≤maxj∈a+,n
To demonstrate the asymptotic optimality, the following describes the stationary policies, including Whittle's index policy, in another way. Let uiϕ(z) ∈ [0, 1], z ∈ RI, i=1, 2, . . . , I, be the probability for a server in state i ∈ {0, 1}∪ {0} to be tagged (aϕ(i)=1) when Zϕ(t)=z. Then, 1−uiϕ(z) is the probability for a server in state i to be untagged (aϕ(i)=0).
Define i+, i ∈ {0, 1}∪ {0} as the set of states that precede state i in the ordering. Then, for Whittle's index policy, obtain
The multi-queue system of the present embodiment is stable, since any stationary policy will lead to an irreducible Markov chain for the associated process and the number of states is finite. Then, for a policy ϕ ∈ Φ, the vector Xϕ(t) converges as t→∞ in distribution to a random vector Xϕ. In the equilibrium region, let πjϕ be the steady state distribution of Xjϕ for server j, j ∈ +, under ϕ ∈ Φ, where πjϕ(i), i ∈ j, is the steady state probability of state i. For clarity of presentation, extend vector πjϕ to a vector of length I, written πjϕ, of which the ith element is πjϕ(i) if i ∈ j, and 0 otherwise. The long-run expected value of Zϕ(t) is Σ_{j=1}^{K+1} πjϕ/(K+1), where πjϕ here denotes the extended vector of length I. In the server farm embodiment defined in Section I, the long-run expected value of Zϕ(t) should be a member of the set
Define q1(z, zi, zi′), and q0(z, zi, zi′), z ∈ Z, i ∈ {0, 1}∪ {0}, as the average transition rate of the ith element in vector z from zi to zi′, under tagged and untagged action, respectively. Then, the average transition rate of the ith element of z under policy ϕ is given by
q^ϕ(z, z_i, z_i′) = u_i^ϕ(z) q^1(z, z_i, z_i′) + (1 − u_i^ϕ(z)) q^0(z, z_i, z_i′). (18)
Consider the following differential equation for a stochastic process, denoted by:
Because of the global balance at an equilibrium point of limt→+∞∫0tzϕ(u)du/t, if it exists, denoted by zϕ, dzϕ(t)/dt|z
For a small δ>0, define
is an upper bound of the absolute value of the reward rate divided by K. Then,
The server farm is decomposed into {tilde over (K)} server groups, with the number of servers in the ith group denoted by Ki, i=1, 2, . . . , {tilde over (K)}. Then K is the total over all server groups, K=Σi Ki for i=1, 2, . . . , {tilde over (K)}.
Based on the proof provided in R. R. Weber and G. Weiss, “On an index policy for restless bandits,” J. Appl. Probab., no. 3, pp. 637-648, September 1990, for any Ki=Ki0n, Ki0=1, 2, . . . , {tilde over (K)}, n=1, 2, . . . , δ>0 and ϕ is set to be either index or OPT,
Then, as n→+∞, the existence of an equilibrium point of limt→+∞∫0tZϕ(u)du/t leads to the existence of zϕ=limt→+∞∫0tzϕ(u)du/t (using the Lipschitz continuity of the right side of Equation (17) as a function of zϕ(t)). The following is obtained:
Finally, γOR(index)/K−γOR/K→0 as n→+∞, that is, MAIP (Whittle's index policy) approaches the optimal solution in terms of energy efficiency as the number of servers in each server group tends to infinity at the appropriate rate.
IV—Numerical Results
In this section, the performance of the MAIP method of the present embodiment is evaluated by extensive numerical results obtained by simulation. All results in the following are presented in the form of an observed mean from multiple independent runs of the corresponding experiment. The confidence intervals at the 95% level based on the Student's t-distribution are maintained within ±5% of the observed mean. For convenience of describing the results, given two numerical quantities x>0 and y>0, the relative difference of x to y is defined as (x−y)/y.
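The stopping rule and the relative difference described above can be sketched as follows. This is a minimal illustration rather than the simulation code used for the reported results: the function names are hypothetical, and the table holds standard two-sided 95% Student's t critical values for up to 10 degrees of freedom only.

```python
import math
import statistics

# Two-sided 95% Student's t critical values, keyed by degrees of freedom.
# Standard table values; only 1 to 10 degrees of freedom are tabulated here.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def relative_difference(x, y):
    """Relative difference of x to y, (x - y) / y, for x > 0 and y > 0."""
    return (x - y) / y


def ci_half_width(samples):
    """Half-width of the 95% confidence interval for the mean of the runs."""
    n = len(samples)
    dof = n - 1
    if dof not in T_95:
        raise ValueError("this sketch tabulates 1 to 10 degrees of freedom only")
    return T_95[dof] * statistics.stdev(samples) / math.sqrt(n)


def within_tolerance(samples, tolerance=0.05):
    """True if the 95% interval lies within +/- tolerance of the observed mean."""
    mean = statistics.mean(samples)
    return ci_half_width(samples) <= tolerance * abs(mean)
```

In use, independent runs would be added until `within_tolerance` returns True for the metric of interest, and the observed mean would then be reported.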
In all experiments performed, a system of servers divided into three server groups was used. Servers in each server group i, i=1, 2, 3, have the same buffer size, service rate, and energy consumption rate, denoted by
To demonstrate the effect of idle power on job assignment, the following compares the MAIP method with a baseline method. The baseline method used is the “Most energy-efficient available server first Neglecting Idle Power” (MNIP) job assignment method. As its name suggests, the MNIP method neglects idle power and hence treats εj0=0 for all j ∈ K in the process of selecting servers for job assignment. The following compares MAIP with MNIP in terms of energy efficiency, job throughput, and energy consumption rate under various system parameters.
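The difference between the two selection rules can be illustrated with a short sketch. It assumes the effective energy efficiency defined in the claims (service rate divided by busy power minus idle power) for MAIP, and service rate divided by busy power for MNIP, which follows from setting the idle power to zero. The class, the function names and the two example servers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    service_rate: float  # mu_j
    busy_power: float    # epsilon_j, energy consumption rate when busy
    idle_power: float    # epsilon_j^0, energy consumption rate when idle
    buffer_size: int     # B_j
    jobs: int = 0        # jobs currently held by the server

    def is_full(self):
        return self.jobs >= self.buffer_size


def maip_index(server):
    """Effective energy efficiency: idle power is subtracted from busy power."""
    return server.service_rate / (server.busy_power - server.idle_power)


def mnip_index(server):
    """Baseline index: idle power is treated as zero."""
    return server.service_rate / server.busy_power


def pick(servers, index):
    """Return the non-full server with the highest index, or None if all are full."""
    candidates = [s for s in servers if not s.is_full()]
    if not candidates:
        return None
    return max(candidates, key=index)


# Hypothetical values chosen so that the two rules disagree.
servers = [
    Server("a", service_rate=1.0, busy_power=1.0, idle_power=0.1, buffer_size=2),
    Server("b", service_rate=0.9, busy_power=1.0, idle_power=0.5, buffer_size=2),
]
```

With these values MNIP prefers server a (1.0 against 0.9), whereas MAIP prefers server b (1.8 against about 1.11), because server b draws much of its busy power even when idle, so giving it work adds comparatively little energy.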
For the set of experiments in
The same settings as that for obtaining the results in
For the set of experiments for
The results of
MAIP in the present embodiment is designed as a non-jockeying policy, which is more appropriate than jockeying policies for job assignment in a large-scale server farm. In general, jockeying policies suit a small server farm where the cost associated with jockeying is negligible. In large-scale systems, the cost associated with jockeying can be significant and may have a snowball effect on the system performance. The following demonstrates the benefits of MAIP in a server farm where jockeying costs are high, by comparing it with a jockeying policy known as Most Energy Efficient Server First (MEESF) proposed in J. Fu, J. Guo, E. W. M. Wong, and M. Zukerman, “Energy-efficient heuristics for insensitive job assignment in processor sharing server farms,” IEEE J. Sel. Areas Commun., vol. 33, no. 12, pp. 2878-2891, December 2015.
The settings of servers in each of the three server groups are based on the benchmark results of Dell PowerEdge rack servers R610 (August 2010), R620 (May 2012) and R630 (April 2015). Specifically, μ3 and ε3 are normalized to one, and the following settings are applied: μ1/μ3=3.5, ε1/ε3=1.2, ε10/ε1=0.2, μ2/μ3=1.4, ε2/ε2=0.2, and ε30/ε3=0.3. Also set Bi=10 for i=1, 2, 3, and ρ=0.6. The number of servers K is varied from 3 to 270, where K is increased by increasing the number of servers in each of the three server groups.
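As a worked illustration of these settings, the effective energy efficiency of server groups 1 and 3 can be computed directly from the normalized ratios. Server group 2 is left out of this sketch because its busy power relative to ε3 cannot be read reliably from the settings as listed. The function name is hypothetical.

```python
def effective_energy_efficiency(service_rate, busy_power, idle_power):
    """Service rate divided by (busy power minus idle power)."""
    return service_rate / (busy_power - idle_power)


# Normalization: mu_3 = epsilon_3 = 1.
# Group 1: mu_1 = 3.5, epsilon_1 = 1.2, idle power = 0.2 * epsilon_1.
group_1 = effective_energy_efficiency(3.5, 1.2, 0.2 * 1.2)
# Group 3: mu_3 = 1, epsilon_3 = 1, idle power = 0.3 * epsilon_3.
group_3 = effective_energy_efficiency(1.0, 1.0, 0.3 * 1.0)

print(round(group_1, 4), round(group_3, 4))  # prints 3.6458 1.4286
```

Under these values a server of group 1 is preferred to a server of group 3 whenever both have a vacancy.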
In the present example, assume that each jockeying action incurs a (constant) delay Δ. That is, when a job is reassigned from server i to server j, it will be suspended for a period Δ before being resumed on server j. Clearly, when Δ>0, this is equivalent to increasing the size of the job and hence its service requirement. Accordingly, for a given system, a non-zero cost per jockeying action effectively increases the traffic load. In the present example, three different values of Δ are considered. The case where Δ=0 is for zero jockeying cost, the case where Δ=0.0005 indicates a relatively small cost per jockeying action, and the case where Δ=0.01 represents a large cost per jockeying action. The results are presented in
For the case where Δ=0, it can be observed that
For the case where Δ=0.0005, it can be observed from
The effect is more pronounced when Δ is increased to 0.01. In this case, as shown in
The workload characterizations of many computer science applications, such as Web file sizes, IP flow durations, and the lifetimes of supercomputing jobs, are shown to exhibit heavy-tailed Pareto distributions. To determine whether the performance of MAIP is sensitive to the job-size distribution, three different distributions, in addition to the exponential distribution, are considered in the following. These distributions are deterministic, Pareto with the shape parameter set to 2.001 (Pareto-1 for short), and Pareto with the shape parameter set to 1.98 (Pareto-2 for short). In all cases, the mean was set to be one.
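One way to draw job sizes with unit mean from the four distributions is sketched below. It relies on the fact that a Pareto distribution with shape α>1 and scale x_m has mean αx_m/(α−1), so x_m=(α−1)/α gives a mean of one. The function names and the distribution labels used as dictionary keys are hypothetical, and the sketch is not the generator used for the reported experiments.

```python
import random

# Shape parameters quoted in the text for the two Pareto cases.
PARETO_SHAPES = {"pareto-1": 2.001, "pareto-2": 1.98}


def pareto_scale_for_unit_mean(shape):
    """Scale x_m for which a Pareto(shape, x_m) distribution has mean one."""
    if shape <= 1:
        raise ValueError("the mean is infinite for shape <= 1")
    return (shape - 1) / shape


def sample_job_size(distribution, rng):
    """Draw one job size with mean one from the named distribution."""
    if distribution == "exponential":
        return rng.expovariate(1.0)
    if distribution == "deterministic":
        return 1.0
    shape = PARETO_SHAPES[distribution]
    # random.paretovariate(shape) samples a Pareto variable with scale 1.
    return pareto_scale_for_unit_mean(shape) * rng.paretovariate(shape)


rng = random.Random(2016)  # fixed seed so that a run can be repeated
sizes = [sample_job_size("pareto-2", rng) for _ in range(1000)]
```

Note that a Pareto distribution has finite variance only for shape values above 2, so Pareto-1 (2.001) has a finite but very large variance while Pareto-2 (1.98) has infinite variance. Sample means from the Pareto-2 case therefore converge slowly.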
The same settings as that for obtaining the results in
The embodiments of the MAIP job assignment method in the present invention as broadly described above address the job assignment problem in a server farm comprising multiple processor sharing servers with different service rates, energy consumption rates and buffer sizes. The MAIP method in embodiments of the present invention takes idle power into account, and can maximize the energy efficiency of the entire system, defined as the ratio of the long-run average throughput to the long-run average energy consumption rate, by effectively assigning jobs/requests to these servers.
Advantageously, the MAIP method only requires information on the full/non-full state of each server, and can be implemented by using a binary variable for each server. Also, this method does not require any estimation or prediction of the average arrival rate. MAIP has been proven to approach optimality as the numbers of servers in server groups tend to infinity and when job sizes are exponentially distributed. This asymptotic property is particularly appropriate to a large-scale server farm that is likely to purchase and upgrade a large number of servers with the same style and attributes at the same time. Also, the MAIP method is highly energy efficient in cases of exponential and Pareto job-size distributions, and so it is suitable for a server farm with highly varying job sizes. MAIP is also more appropriate than MEESF for a server farm with non-zero jockeying cost, and it is useful for a real large-scale system in which job reassignment incurs significant cost. Various other advantages of the methods of the present invention can be determined by a person skilled in the art upon considering the above description and the referenced drawings.
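The binary-variable implementation mentioned above might look like the following sketch. It assumes that the servers are ranked once by effective energy efficiency and that each server reports when it becomes full or regains a vacancy. The class and method names are hypothetical, and no arrival-rate estimate is used anywhere.

```python
class MaipDispatcher:
    """Assigns each arriving job to the best-ranked server that is not full."""

    def __init__(self, servers):
        # servers: iterable of (name, service_rate, busy_power, idle_power).
        # Rank once, in descending order of service_rate / (busy - idle).
        ranked = sorted(servers, key=lambda s: s[1] / (s[2] - s[3]), reverse=True)
        self.order = [s[0] for s in ranked]
        # One binary variable per server: True means the buffer is full.
        self.is_full = {name: False for name in self.order}

    def set_full(self, name, full):
        """Called when a server becomes full or regains a vacancy."""
        self.is_full[name] = full

    def assign(self):
        """Return the chosen server name, or None if every server is full."""
        for name in self.order:
            if not self.is_full[name]:
                return name
        return None


# Hypothetical two-server farm.
dispatcher = MaipDispatcher([("old", 1.0, 1.0, 0.3), ("new", 3.5, 1.2, 0.24)])
```

Because the ranking depends only on fixed server parameters, it is computed once. Each assignment then only reads the binary flags, which keeps the per-job work small in this sketch.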
Although not required, the embodiments described with reference to the Figures can be implemented as an application programming interface (API) or as a series of libraries for use by a developer or can be included within another software application, such as a terminal or personal computer operating system or a portable computing device operating system. Generally, as program modules include routines, programs, objects, components and data files assisting in the performance of particular functions, the skilled person will understand that the functionality of the software application may be distributed across a number of routines, objects or components to achieve the same functionality desired herein.
It will also be appreciated that where the methods and systems of the present invention are either wholly implemented by a computing system or partly implemented by computing systems, then any appropriate computing system architecture may be utilized. This will include stand-alone computers, network computers and dedicated hardware devices. Where the terms “computing system” and “computing device” are used, these terms are intended to cover any appropriate arrangement of computer hardware capable of implementing the function described.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
Any reference to prior art contained herein is not to be taken as an admission that the information is common general knowledge, unless otherwise indicated.
Claims
1. A method for operating a server farm with a plurality of servers operably connected with each other, the method comprising the steps of:
- receiving a job request of a computational task to be handled by the server farm;
- determining, from the plurality of servers, one or more servers operable to accept the job request;
- determining a respective effective energy efficiency value associated with at least the one or more servers; and
- assigning the computational task to a server with the highest effective energy efficiency value;
- wherein the effective energy efficiency value is defined by: a service rate of the respective server divided by a difference between an energy consumption rate value when the respective server is busy and an energy consumption rate value when the respective server is idle.
2. The method in accordance with claim 1, further comprising the step of:
- sorting the one or more servers according to the respective determined effective energy efficiency values.
3. The method in accordance with claim 1, wherein the step of determining from the plurality of servers one or more servers operable to accept the job request comprises determining, from the plurality of servers, all servers operable to accept the job request.
4. The method in accordance with claim 1, wherein the plurality of servers cannot be powered off during operation of the server farm.
5. The method in accordance with claim 1, wherein assignment of computation tasks in the server farm is substantially independent of an arrival rate of computation tasks at the server farm.
6. The method in accordance with claim 1, wherein assignment of computation tasks in the server farm is substantially independent of a respective size of the computation tasks received at the server farm.
7. The method in accordance with claim 1, wherein the plurality of servers each includes a finite buffer for queuing job requests.
8. The method in accordance with claim 7, wherein the one or more servers operable to accept the job request each has at least one vacancy in their respective buffer.
9. The method in accordance with claim 1, wherein the server farm is heterogeneous in that the plurality of servers can have different server speeds, energy consumption rates, and/or buffer sizes.
10. The method in accordance with claim 1, wherein the server farm is a non-jockeying server farm in which a computational task being handled by one of the plurality of servers cannot be reassigned to other servers.
11. A server farm comprising:
- a plurality of servers operably connected with each other;
- one or more processors operably connected with the plurality of servers, the one or more processors being arranged to:
- receive a job request of a computational task to be handled by the server farm;
- determine, from the plurality of servers, one or more servers operable to accept the job request;
- determine a respective effective energy efficiency value associated with at least the one or more servers; and
- assign the computational task to a server with the highest effective energy efficiency value;
- wherein the effective energy efficiency value is defined by: a service rate of the respective server divided by a difference between an energy consumption rate value when the respective server is busy and an energy consumption rate value when the respective server is idle.
12. The server farm in accordance with claim 11, wherein the one or more processors are further operable to:
- sort the one or more servers according to the respective determined effective energy efficiency values.
13. The server farm in accordance with claim 11, wherein the one or more processors are further operable to: determine, from the plurality of servers, all servers operable to accept the job request.
14. The server farm in accordance with claim 11, wherein the plurality of servers cannot be powered off during operation of the server farm.
15. The server farm in accordance with claim 11, wherein the one or more processors are arranged such that assignment of computation tasks in the server farm is substantially independent of an arrival rate of computation tasks at the server farm.
16. The server farm in accordance with claim 11, wherein the one or more processors are arranged such that assignment of computation tasks in the server farm is substantially independent of a respective size of the computation tasks received at the server farm.
17. The server farm in accordance with claim 11, wherein the plurality of servers each includes a finite buffer for queuing job requests; and wherein the one or more servers operable to accept the job request each has at least one vacancy in their respective buffer.
18. The server farm in accordance with claim 11, wherein the server farm is heterogeneous in that the plurality of servers can have different server speeds, energy consumption rates, and/or buffer sizes.
19. The server farm in accordance with claim 11, wherein the server farm is a non-jockeying server farm in which a computational task being handled by one of the plurality of servers cannot be reassigned to other servers.
20. The server farm in accordance with claim 11, wherein the one or more processors are incorporated in at least one of the plurality of servers.
Type: Application
Filed: Oct 11, 2016
Publication Date: Apr 12, 2018
Inventors: Jing Fu (New Territories), William Morgan (Balwyn), Jun Guo (New Territories), Moshe Zukerman (New Territories), Wing Ming Eric Wong (Kowloon)
Application Number: 15/290,106