Dynamic energy-aware capacity provisioning for cloud computing environments

Dynamic Energy-Aware Capacity Provisioning for Cloud
Mohamed Faten Zhani University of Waterloo University of Waterloo National University of Defense Waterloo, ON, Canada Waterloo, ON, Canada Changsha, Hunan, China Joseph L. Hellerstein University of Illinois at University of Waterloo Google, Inc.
Urbana-Champaign, USA Waterloo, ON, Canada Seattle, Washington, USA Pohang University of Science and Technology (POSTECH) Pohang 790-784, Korea Control Theory; I.6.3 [Simulation and Modeling]: Ap- Data centers have recently gained significant popularity as a cost-effective platform for hosting large-scale service appli-cations. While large data centers enjoy economies of scale by amortizing initial capital investment over large number of Management, Performance, Experimentation machines, they also incur tremendous energy cost in termsof power distribution and cooling. An effective approach forsaving energy in data centers is to adjust dynamically the data center capacity by turning off unused machines. How- Cloud Computing; Resource Management; Energy Manage- ever, this dynamic capacity provisioning problem is known ment; Model Predictive Control to be challenging as it requires a careful understanding of theresource demand characteristics as well as considerations to various cost factors, including task scheduling delay, ma-chine reconfiguration cost and electricity price fluctuation.
Data centers today are home to a vast number and a vari- In this paper, we provide a control-theoretic solution to ety of applications with diverse resource demands and per- the dynamic capacity provisioning problem that minimizes formance objectives. Typically, a cloud application can be the total energy cost while meeting the performance objec- divided into one or more tasks executed in one or more con- tive in terms of task scheduling delay. Specifically, we model tainers (e.g., virtual machines (VMs)). At run time, sched- this problem as a constrained discrete-time optimal control ulers are responsible for assigning tasks to machines. In to- problem, and use Model Predictive Control (MPC) to find day's reality, production data centers such as Google's cloud the optimal control policy. Through extensive analysis and backend often execute tremendous number (e.g., millions) simulation using real workload traces from Google's compute of tasks on a daily basis [27]. Such extremely large-scale clusters, we show that our proposed framework can achieve workload hosted by data centers not only consumes signifi- significant reduction in energy cost, while maintaining an cant storage and computing power, but also huge amounts acceptable average scheduling delay for individual tasks.
of energy. In practice, the operational expenditure on en-ergy not only comes from running physical machines, butalso from cooling down the entire data center. It has been Categories and Subject Descriptors
reported that energy consumption accounts for more than12% of monthly operational expenditures of a typical data C.4 [Performance of Systems]: Modeling techniques; I.2.8 center [5]. For large companies like Google, a 3% reduction [Problem Solving, Control Methods, and Search]: in energy cost can translate into over a million dollars incost savings [19]. On the other hand, governmental agenciescontinue to implement standards and regulations to promoteenergy-efficient (i.e., "Green") computing [1]. Motivated by Permission to make digital or hard copies of all or part of this work for these observations, cutting down electricity cost has become personal or classroom use is granted without fee provided that copies are a primary concern of today's data center operators.
not made or distributed for profit or commercial advantage and that copies In the research literature, a large body of recent work tries bear this notice and the full citation on the first page. To copy otherwise, to to improve energy efficiency of data centers. A plethora of republish, to post on servers or to redistribute to lists, requires prior specific techniques have been proposed to tackle different aspects permission and/or a fee.
of the problem, including the control of power distribution ICAC'12, September 18–20, 2012, San Jose, California, USA.
Copyright 2012 ACM 978-1-4503-1520-3/12/09 .$15.00.
systems [20], cooling systems [8], computer hardware [25], software components such as virtualization [24] and load- traces for one of Google's production compute clusters and balancing algorithms [11, 15]. It is known that one of the illustrate the benefits of our approach. Section 4 describes most effective approach for reducing energy cost is to dy- the architecture of our proposed system. In Section 5, we namically adjust the data center capacity by turning off present our demand prediction model and control algorithm.
unused machines, or to set them to a power-saving (e.g., In Section 6, we provide our detailed formulation for the op- "sleep") state. This is supported by the evidence that an timal control problem and present our solution based on the idle machine can consume as much as 60% of the power MPC framework. In Section 7, we evaluate our proposed when the machine is fully utilized [11, 15, 17]. Unsurpris- system using Google workload traces, and demonstrate the ingly, a number of efforts are trying to leverage this fact to benefits under various parameter settings. Finally, we draw save energy using techniques such as VM consolidation [24] our conclusions in Section 8.
and migration [23]. However, these studies have mainly fo-cused on improving the utilization of clusters by improving RELATED WORK
the "bin-packing" algorithm for VM scheduling. In a pro- Much effort has been made to achieve energy savings in duction data center where resource requests for tasks can data centers. Dynamic capacity provisioning is one of the arrive dynamically over time, deciding the number of ma- most promising solutions to reduce energy cost that consists chines to be switched off is not only affected by the effi- of dynamically turning on and off data center servers. For ciency of the scheduling algorithm, but also time-dependent instance, motivated by the time-dependent variation of the characteristics of resource demand. While over-provisioning number of users and TCP connections in Windows live mes- the data center capacity can lead to sub-optimal energy sav- senger login servers, Chen et al. [11] have derived a frame- ings, under-provisioning the data center capacity can cause work for dynamic server provisioning and load dispatching.
significant performance penalty in terms of scheduling delay, They have proposed a technique to evaluate the number of which is the time a task has to wait before it is scheduled needed servers based on the predicted load in terms of users' on a machine. A high scheduling delay can significantly login rate and active TCP connections. The load dispatch- hurt the performance of some services that must be sched- ing algorithm ensures that incoming requests are distributed uled as soon as possible to satisfy end user requests (e.g., among the servers. However, their framework does not con- user-facing applications). On the other hand, tasks in pro- sider the cost of switching on and off machines. Guenter duction data centers often desire multiple types of resources, et al. [16] have proposed another automated server provi- such as CPU, memory, disk and network bandwidth. In this sioning and load dispatching system based on the predicted context, devising a dynamic capacity provisioning mecha- demand while considering the cost of transitioning servers nism that considers demand for multiple types of resources between different power states (e.g., "on", "off", "hibernate").
becomes a challenging problem. Furthermore, there are re- This cost depends on the transition time, the energy cost configuration costs associated with switching on and off ma- and the long-term reliability of the server. Different from chines. In particular, turning on and off a machine often our work, they analyze the number of requests that can be consumes large amount of energy due to saving and load- satisfied instead of request scheduling delay. Furthermore, ing system states to memory and disk [18]. When turning the multi-dimensional aspect of resource demand and fluctu- off a machine with running tasks, it is necessary to consider ations of electricity prices are not considered in their work.
the performance penalty due to migrating (or terminating) Kusic et al. [18] have proposed a dynamic resource provi- the tasks on the machine. Therefore, the reconfiguration sioning framework for virtualized server environments based cost due to server switching should be considered as well.
on lookahead control. The framework minimizes power con- Finally, another aspect often neglected in the existing lit- sumption by adjusting the number of physical and virtual erature is the electricity price. For example, it is known machines. It also estimates the CPU share and the work- that in many regions of the U.S., the price of electricity can load directed to every virtual machine. In addition, their change depending on the time of the day. Electricity price is controller manages to maximize the number of transactions thus another factor that should be considered when making that satisfy Service Level Agreement (SLA) in terms of av- capacity adjustment decisions.
erage response time while taking into account the cost of In this paper, we present a solution to the dynamic capac- turning on and off the machines. However, they mainly con- ity provisioning problem with the goal of minimizing total sider the performance of application servers rather than the energy cost while maintaining an acceptable task schedul- scheduling of VMs. Furthermore, time-dependent variations ing delay. Different from existing works on server capacity of electricity prices are not considered in their framework.
provisioning problem, we formulate the problem as a con- Abbasi et al. [7] have proposed a thermal-aware server pro- vex optimization problem that considers multiple resource visioning technique and a workload distribution algorithm.
types and fluctuating electricity prices. We then analyze In this approach, active servers are selected using heuristics the optimality condition of this problem and design a Model in a way that minimizes cooling and computing energy cost.
Predictive Control (MPC) algorithm that adjusts the num- The workload distribution algorithm ensures that servers' ber of servers to track the optimality condition while taking utilizations do not exceed a threshold in order to satisfy SLA into account switching costs of machines. Through analy- defined in terms of average response time. However, their sis and simulation using real workload traces from Google's approach does not consider switching cost of machines.
compute clusters, we show our proposed solution is capa- There is also a large body of work that applies control the- ble of achieving significant energy savings while minimizing ory to achieve energy savings in data centers. Fu et al. [15] SLA violations in terms of task scheduling delay.
have proposed a control-theoretic thermal balancing that The remainder of the paper is organized as follows. Sec- reduces temperature differences among servers. Hence, the tion 2 presents a survey of related work in the research liter- controller acts on the utilization of each processor in order to ature. In Section 3, we present an analysis of real workload reduce or increase its temperature. Model predictive control Figure 1: Total CPU demand and usage (29 days). Figure 2: Total memory demand and usage (29 days).
demand and usage for CPU, memory and disk 1. The usage Number of used machines Number of available machines of each type of resource is reported at 5 minute intervals.
Our current analysis mainly focuses on CPU and memory, as they are typically scarce compared to disk. However, we believe it is straightforward to extend our approach to con- sider other resources such as disk space.
Number of machines In addition to resource demand, the user can also specify Average scheduling delay (sec) a scheduling class, a priority and placement constraints for each task. The scheduling class captures the type of the task (e.g., user-facing or batch). The priority determines the im-portance of each task. The task placement constraints spec- Figure 3: Number of ma- ify additional scheduling constraints concerning the machine chines available and used scheduling delay vs. Re- configurations, such as processor architecture of the physical in the cluster.
source utilization.
machine [21]. To simplify, we do not consider the schedulingclass, priority and task placement constraints in our model.
is used by Wang et al. [26] to reduce the total power con- The analysis of these factors is left for future work.
sumption of a cluster by tuning CPU frequency level for the We first plot the total demand and usage for both CPU processor of each server. However, most of previous work and memory over the entire duration. The results are shown has focused on capacity provisioning from a service provider in Figure 1 and 2 respectively. The total usage at a given perspective, i.e., provisioning server capacities (e.g., number time is computed by summing up the resource usage of all of web servers) to accommodate end user requests. Existing the running tasks at that time. On the other hand, the to- solutions to this problem rely on queuing-theoretic models tal demand at a given time is determined by total resource that consider only a single type of resource (mainly CPU).
requirement by all the tasks in the system, including the In contrast, our approach investigates the problem from the tasks that are waiting to be scheduled. From Figure 1 and cloud provider's perspective, where resource demand and us- 2, it can be observed that both usage and demand for each age are multi-dimensional. Our solution considers resource type of resource can fluctuate significantly over time. Fig- usage and capacity for multiple resource types, such as CPU ure 3 shows the number of machines available and used in and memory. Furthermore, none of the existing work has the cluster. Specifically, a machine is available if it can be considered additional factors such as the fluctuating elec- turned on to execute tasks. A machine is used if there is tricity prices. Our approach is also lightweight and indepen- at least one task running on it. Figure 3 shows that the dent of the scheduling algorithm, making it more suitable for capacity of this cluster is not adjusted based on resource demand, as the number of used machines is almost equalto the number of available machines. Combining the obser-vations from Figure 1, 2 and 3, it is evident that a large number of machines can be turned off to save energy. For To motivate the problem and justify our solution approach, instance, we estimated that a perfect energy saving schedule we have conducted an analysis of workload traces for a pro- where the provisioned capacity exactly matches the current duction compute cluster at Google [4] consisting of approxi- demand can achieve about 22% and 17% percent resource mately 12, 000 machines. The dataset was released on Novem- reduction for CPU and memory, respectively, compared to ber 29, 2011. The workload traces contain scheduling events provisioning capacity according to the peak demand. This as well as resource demand and usage records for a total of indicates that there is great potential for energy savings in 672, 003 jobs and 25, 462, 157 tasks over a time span of 29 this compute cluster using dynamic capacity provisioning.
days. Specifically, a job is an application that consists of one However, while turning off active machines can reduce or more tasks. Each task is scheduled on a single physical total energy consumption, turning off too many machines machine. When a job is submitted, the user can specify the can also hurt task performance in terms of scheduling delay.
maximum allowed resource demand for each task in terms of Classic queuing theory indicates that task scheduling delay required CPU, memory and disk size. At run time, the us-age of a task measures the actual consumption of each type 1Note that the values reported in Google cluster traces were of resources. The current Google cluster traces provide task normalized between 0 and 1.
Number of tasks in the queue Number of machines Predicted CPU Usage Predicted Memory Usage Figure 5: System architecture.
grows exponentially with resource utilization. To quantify • The prediction module receives statistics about the us- this effect, we analyzed the relationship between scheduling age of all resources (CPU and memory) in the cluster delay experienced by each task and the average utilization of and predicts the future usage for all of them.
the bottleneck resource (e.g., CPU) while the task is waitingto be scheduled. We then plotted the average task schedul- • The controller implements a MPC algorithm that con- ing delay as a function of the utilization of the bottleneck trols the number of machines based on the predicted resource, as shown in Figure 4. The error bar in this dia- usage of the cluster and taking into account the recon- gram represents the standard deviation of task scheduling figuration cost.
delay with average utilization at each given value. Indeed,we found that there is a direct relationship between task • The capacity provisioning module gathers the status scheduling delay and resource utilization. We also modeled information of machines from the controller, and de- the relationship through curve fitting. It seems that both cides which machines in particular should be added or a linear function (i.e., d = a · U + b) or a delay function removed. It then provides the scheduler with the list for M/M/1 queuing model (i.e., d = a · U of active machines.
the curve well. Similar observations have been reported inrecent work [27][21].
It is worth mentioning that different schedulers may adopt The above observations suggest that while the benefits different resource allocation schemes. For example, in a pub- of dynamic capacity provisioning is apparent for produc- lic cloud environment such as Amazon EC2, it is necessary tion data center environments, designing an effective dy- to schedule tasks according to their resource demand (e.g., namic capacity provisioning scheme is challenging, as it in- VM size). However, since the actual resource usage of each volves finding an optimal tradeoff between energy savings task can be much lower than the demand, many advanced and scheduling delay. Furthermore, turning off active ma- schedulers adjust dynamically resource allocation based on chines may require killing or migrating tasks running on task usage [22]. Even though our framework is applicable these machines, which will introduce an additional perfor- to both scenarios, in this work, we use the latter case for mance penalty. The goal of this paper is to provide a so- illustration, not only because it is more general, but also lution to this dynamic capacity provisioning problem that because it reflects the behavior of Google cluster schedulers.
finds the optimal trade-off between energy savings and the In particular, Google's schedulers intentionally over-commit cost of reconfigurations, including cost of turning on and off resources on each machine [3]. Finally, as an initial effort to- machines and killing/migrating tasks.
wards solving this problem, we currently consider that all themachines in the cluster are homogenous and with identicalresource capacities. It is part of our future work to extend our model to consider machine heterogeneity (e.g., multiple Our proposed system architecture is depicted in Figure 5.
generations of machines [21]).
It consists of the following components: In the following sections, we will describe the design of each component in details.
• The scheduler is responsible for assigning incoming tasks to active machines in the cluster. It also reports the average number of tasks in the queue during eachcontrol period to help the controller make informed In this section, we describe our model for predicting usage of each resource type. We used the Auto-Regressive Inte-grated Moving Average (ARIMA) model [9] to predict the • The monitoring module is responsible for collecting time series Grk which represents the usage of resource type r CPU and memory usage statistics of every machine in in all the machines at time k. For convenience, we drop the the cluster. The monitoring is performed periodically.
superscript r and we write simply Gk in this section.
Table 1: Example of multi-step prediction (n = 3).
Knowing the last n observations of Gk, i.e., Gk−n+1,.
Gk, we want to predict Gk+1, which is the expected usageat time k + 1 predicted at time k. The time series G Inputs of the model an ARMA(n,q) model if it is stationary and if for every k: k−2 k ,Gk−1 k ,Gk k φ0Gk + . + φn−1Gk−n+1 k−1 k ,Gk k ,Gk+1 k k+1 + θ0ǫk + . + θq−1ǫk+1−q , Gk k,Gk+1 k,Gk+2 k k+1 k ,Gk+2 k ,Gk+3 k i and θj are constants estimated from available data. The terms ǫk are error terms which are assumed to beindependent, identically distributed samples from a normaldistribution with zero mean and finite variance σ2. The resource types. Denote by Cr ∈ R+ the capacity for resource parameters n and q are the number of lags used by the model type r ∈ R of a single machine.
(i.e., the number of last measured values of the usage) and To model the system dynamics, we divide time into inter- the number of error terms respectively. Equation (1) can vals of equal duration. We assume reconfiguration happens also be written in a concise form as: at the beginning of each time interval. At interval k ∈ N+,the measured usage for resource type r in the cluster is de- k . Let xk denote the number of active machines.
φiLiGk + ǫk+1 + ( Denote by uk ∈ R the change in the number of active ma- chines. A positive value of uk means more machines will where L is the backward shift operator defined as follows: be turned on, whereas a negative value of uk means some LiGk = Gk−i. We point out that AR and MA models are active machines will be powered off. Therefore, we have the special cases of the ARMA model when q = 0 or n = 0.
following simple state equation that calculates the number The ARMA model fitting procedure assumes that the data of active machines at time k + 1: are stationary. If the time series exhibits variations, we usedifferencing operation in order to make it stationary. It is xk+1 =xk + uk.
Our objective is to control the number of machines in or- der to reduce the total operational cost in terms of energy k = Gk − Gk−1.
consumption and penalty due to violating the SLA, while It can be shown that a polynomial trend of degree k is taking into consideration the cost of dynamic reconfigura- reduced to a constant by differencing k times, that is, by tion. In what follows, we describe how to model each of the applying the operator (1−L)ky(t). An ARIMA(n,d,q) model cost factors in details.
is an ARMA(n,q) model that has been differenced d times.
Thus, the ARIMA(n,d,q) can be given by: Modeling SLA penalty cost
In our model, the SLA is expressed in terms of an upper d on the average task scheduling delay. Thus, in order to meet the SLA, the average task scheduling delaydk at time k should not exceed ¯ d. As suggested in Section 3, the average task scheduling delay is correlated with the In our model, we aim to predict future resource usage over cluster resources' utilization, and more particularly with the a time window H ∈ N+. This requires predicting resource utilization of the bottleneck resource. Therefore, we define usage h ∈ N+ steps ahead from an end-of-sample Gk for all the bottleneck resource b ∈ R at time k ∈ N+ as the resource that has the highest utilization. In our model, the utilization k+h k denote the hth step prediction of Gk knowing the last n observations, i.e., Gk−n+1,. Gk. Thus, of resource r ∈ R in the cluster at time k ∈ N+ is given by: we aim to predict Gk+1 k,Gk+2 k.,Gk+h k. The multi-step prediction is obtained by iterating the one-step ahead pre- diction. The hth step prediction G k+h k is given by: Therefore, the utilization of the bottleneck resource b can Gk+h k = f (Gk+h−n k, ., Gk+h−i k, ., Gk+h−1 k), be calculated as: where Gk−i k = Gk−i ∀i ∈ [0, n], the function f is the pre- U bk = max {U rk} .
diction model, n is the number of lags used by the model and h is the prediction step. Table 1 illustrates how one-step Then the average scheduling delay at time k ∈ N can be prediction is iterated to obtain multi-step predictions.
We formally describe the dynamic capacity provisioning where qb(U bk) denotes the average latency given current uti- problem in this section. We assume the cluster consists of lization U bk for the bottleneck resource b. The function qb(·) Mk ∈ N+ homogeneous machines at time k. The number of can be obtained using various techniques, such as queue the- machines in the cluster can change due to machine failure oretic models, or directly from empirical measurements as and recovery. We assume each machine has d ∈ N types of described in Section 3.
resources. For example, a physical machine provides CPU, We adopt a simple penalty cost model for SLA violation.
memory and disk. Let R = {1, 2, ., d} denote the set of Specifically, if the delay bound is violated, then there will be a SLA penalty cost P SLA proportional to the degree of In order to solve this optimization problem, we use the violation. Therefore, the penalty function P SLA Karush-Kuhn-Tucker (KKT) approach [10]. The Lagrangian (U bk) = NkpSLA(q(U bk) − ¯ L(xk, γ) =NkpSLA q where pSLA represents the unit penalty cost for violating the delay upperbound, and N ) + µ (0 − xk) .
k is the weight factor representing the severity of the violation at time k (e.g., Nk can be thenumber of requests in the scheduling queue at time k).
The KKT conditions are: Modeling the total energy cost
idle − Nk pSLAwk In the research literature, it is known that the total energy consumption of a physical machine can be estimated by a linear function of CPU, memory and disk usage [16, 7, 11, 14]. Thus, the energy consumption of a machine at time k can be expressed as: µ, γ ≥ 0.
We need to consider three cases: (1) γ > 0, (2) µ > 0, Let ppower denote the electricity price at time k. Then, for a and (3) γ = 0 and µ = 0. The first two cases correspond given number of machines xk, the total energy cost P power to boundary conditions whether x∗ at time k can be expressed as the third case, assuming q(·) is convex and differentiable, wecan solve x∗ k using the first condition. For instance, q(U ) = a · U + b (which is the case for M/M/1 queuing model), Formulation of the optimization problem
As mentioned previously, our objective is to control the The above equation reveals many insights. First, the op- number of servers so as to minimize the total operational timal number of servers x∗k depends mainly on the cluster cost, which is the sum of SLA penalty cost P SLA utilization, which is captured by the variable wk. Second, energy cost P power(x k ). At the same time, we need to en- x∗k is also dependent on the electricity price and the SLA sure that the number of active machines in the cluster must violations. In particular, it increases either when the elec- not exceed Mk, the total number of physical machines in the tricity price drops down or when the SLA penalty cost rises.
cluster. This can be formulated by the following optimiza- Therefore, it can be seen that equation (17) tries to strike a balance between saving electricity cost and SLA penalty cost in a dynamic manner.
Capacity provisioning module
The capacity provisioning module takes as input the num- ber of machines that should be added or removed from the cluster, and determines which machine should be turned on s.t. 0 ≤ xk ≤ Mk, or off. The decision of switching on a particular machine can where (x)+ = max (x, 0). Notice that ppowerx be made based on different criteria such as its usage and its does not depend on x location in the cluster. However, choosing which machine k at time k, thus it can be omit- ted in the optimization formulation.
In addition, define to power off is more complicated since some tasks could berunning on it. Thus, more criteria should be considered such wk = maxr{ Grk }, we can further simplify the problem to: as the number of running tasks on the machine, their priori- ties, the cost of migrating or killing those tasks as well as the resource usage in the machine. For simplicity, define c the cost for migrating (or terminating) the task t, depending Assuming that q( wk ) is a decreasing function of x on the scheduling policy applied to task t. For example, if the average queuing delay decreases as the number of ma- task t is an interactive task such as a web server, it is better chines increases), we can see that the optimal solution x∗ to migrate the server to another machine to minimize the of this problem satisfies x∗ service down time. On the other hand, if the task t belongs d)+ = 0, in this case we can decrease x∗ to a MapReduce job, it is more cost-effective to simply ter- minate the task and restart it on a different machine [12].
to reduce energy cost while maintaining (q( wk ) − ¯ We define the cost of powering off a particular machine i, Therefore, we can further simplify the problem to: 1 ≤ i ≤ Mk at time k as d) + ppowerxkEidle Algorithm 1 MPC Algorithm for DCP 91: Provide initial state xk, k ← 0 93: At the end of control period k: Predict Nk+h k, wk+h k, ppower for time h ∈ {1, H} Solve tracking problem to obtain u(k + h k) for hori- Electricity Price ($/MWh) 15 zons h ∈ {0, · · · , H} Perform the reconfiguration using the capacity provi- sioning module according to u(k k) Figure 6: Electricity price of Houston, TX over 24 values of Q and R is to normalize both terms. In our case, where Sik denotes the set of tasks running on the machine i we normalize them by converting both terms into monetary at time k ∈ N. It is clear that cik increases with the number r = 1 (cavg,on + cavg,off ) where cavg,on and of tasks and their costs. Consequently, the capacity provi- cavg,off are the average cost for turning on and off servers, sioning module turns off machines having the lowest cost cik.
respectively. Similarly, we define ¯ over is average cost introduced per machine due to provi- Designing the MPC controller
sioning, and cavg is average cost introduced per machine The goal of our controller is to adjust the number of ma- due to underprovisining, respectively. Even though it is pos- chines to minimize the violation of KKT conditions, while sible to compute analytically the values of cavg,on, cavg,off , taking into consideration the reconfiguration cost. As N , it is more practical to estimate their values through empirical measurement. Notice that we set ¯ k , ppower can change over time, we adopt the well-known MPC framework to design an online controller for this prob- to the average penalty cost of both positive and negative er- lem. The MPC algorithm is illustrated by Algorithm 1. It rors, because in practice, the number of occurrences of both can be intuitively described as follows: At time k, the pre- positive and negative errors will likely to be the same, if the diction module is responsible for predicting the future values capacity provisioned by our controller only fluctuates within a fixed range. Finally, although we can set (Q, R) = (¯ k , wk , ppower for a prediction window H . The controller will then solve an optimal control problem that will deter- to ensure both terms are in the unit of dollar2, to simplify mine the optimal decisions for the entire window H. As only our model, we set (Q, R) = (1, ¯r2 ) in our experiment so that the first step is required, the controller will only carry out we only need to control R to achieve different trade-offs be- the first control decision. The procedure will repeat at the tween solution optimality and reconfiguration cost.
beginning of every time interval k, k + 1, and so on.
More formally, we can define Nk+h k,wk+h k, ppower as the values of Nk, wk, ppower predicted for time k + h, given the historical values up to time k. We also define We have implemented our system shown in Figure 5 and evaluated the quality of our solution using trace-driven sim- ek+h k = xk+h k − x∗k+h k, ulations. In our experiment, we set the CPU and memory as the tracking error at time k, the objective of the controller capacity of each machine to 1. This represents a majority of is to solve the following program: machines in the Google cluster2. In our simulation we imple-mented a greedy First-Fit (FF) scheduling algorithm, which is used by many cloud computing platforms such as Euca- Q(ek+h k)2 + R(uk+h k)2 lyptus [2]. In our simulations, we set E idle to 200 Watts, αr to 121 and 0 for CPU and memory, respectively, similar s. t. xk+h+1 k = xk+h k + uk+h k, ∀0 ≤ h ≤ H − 1 to the values used in [18]. For electricity price, we used the electricity prices for the city of Houston, Texas, obtained k+h k = xk+h k − x∗ from a publicly available source [6]. Figure 6 shows the fluc- 0 ≤ xk+h k ≤ N, tuation of electricity price over a duration of 24 hours. It where H is the horizon of interest. The first term represents can be observed that the electricity price is generally higher the tracking error, the second term represents the control during day time. The fluctuation sometimes can be as large penalty. The tracking error aims to reduce the error be- as 20% compared to the average electricity price over the 24 tween the actual and the optimal number of machines. The hours. Finally, we set ¯ d to 10 seconds as an upperbound on second term is the control penalty which takes into account task scheduling delay.
the cost of adding or removing machines. Thus, Jk can beinterpreted as a plan of action for next H time intervals. Qand R are weight factors that will control the stability and convergence rate of the controller. If Q is much bigger than In our first experiment, we assess the performance of the R, then the controller will place a higher weight on maximiz- multi-step prediction for resource usages. We first describe ing power savings and adjust number of servers aggressively.
the prediction procedure and the performance criteria. Then On the other hand, if R is large compared to Q, then thecontroller will adjust the capacity less aggressively to mini- 2The values of CPU and memory capacity reported in Google mize reconfiguration cost. A standard way to determine the traces were normalized to the configuration of the largest machine.
50 100 150 200 250 300 350 400 Figure 7: Prediction of resource usage in the Google Performance of multi-step prediction cluster - one-step prediction - ARIMA(2, 1, 1).
- ARIMA(2, 1, 1).
Figure 7 shows the one-step prediction of the CPU and memory usage compared to the real usage. The graph shows that the predicted values are always close to the real ones even during peaks. The prediction relative squared error RSE1 is close to zero which proves that the ARIMA(2,1,1) provides an accurate prediction of the usage either for CPU (RSE1 ≈ 0.062) or memory (RSE1 ≈ 0.086).
Figure 8 depicts the effect of increasing the number of lags used in the ARIMA model to predict CPU and memoryusage. Regarding CPU usage prediction (Figure 8(a)), it Figure 8: Effect of the number of lags on usage pre- is apparent from the results that starting from the second diction - One-step prediction (h = 1).
lag (n = 2), the prediction error becomes stable aroundRSE1 ≈ 0.062. If we now turn to memory usage prediction,Figure 8(b) shows the prediction error remains almost stable we study the effect of the number of lags and the prediction regardless of the number of lags used for the ARIMA model horizon on the prediction accuracy.
(RSE1 ≈ 0.086). Consequently, there is no improvement of To evaluate the quality of our prediction technique, the the prediction performance beyond two lags (n = 2). This available traces (e.g., measured CPU or memory usage) are result is interesting since a small number of lags reduces the divided into a training data set and a validation data set.
ARIMA model complexity and allows to implement it online Training data are used to identify the model parameters n, with minimal overhead and high accuracy.
d, q and the coefficient φi and θj. For given n and q, the We also conducted more experiments to examine the im- coefficients φi, i ≤ n and θj, j ≤ q are estimated using pact of the horizon h on the prediction performance. Since the RPS toolkit [13]. The validation data set is then used using more than two lags does not reduce the prediction er- to assess the accuracy of the prediction model. The per- ror, we only considered two lags as input for the ARIMA formance metric used to evaluate the prediction accuracy is model (i.e., n = 2). As expected, when we increase the pre- the relative squared error (RSE). It is calculated for every diction horizon, the prediction error grows for both CPU prediction step h as: and memory usage (Figure 9). What is interesting in theseresults is that the error remains small (RSE h ≤ 1) for mul- tiple prediction steps. In particular, the prediction error h remains below 1 for 400 steps ahead (≈ 33 hour) for CPU usage and for 50 steps ahead (≈ 250 min) for memory where T is the size of the validation data set and µ is the usage. We also mention that increasing the number of er- average of Gk over the T time intervals. The advantage of ror terms (q) for the ARIMA model does not improve the the RSEh is that it neither depends on the used scale nor prediction performance. In summary, these results suggest on the size of data. Having the RSEh lower than 1 means that we can apply ARIMA(2,1,1) using two lags to predict that the predictor is outperforming the use of the average of 12 steps ahead (equivalent to one hour), and this ensures the data as prediction for Gk (Gk+h h = µ). In addition, the that the prediction error does not exceed 0.3 and 0.5 for smaller is the RSEh , the more accurate is the prediction.
CPU and memory, respectively (Figure 9).
The RSE can also be seen as the ratio of the mean squarederror divided by the variance of validation data set.
Since our model exploits the predicted usage of the cluster in terms of CPU and memory to proactively add and remove We conducted several experiments to evaluate the perfor- servers, we assess the prediction model accuracy. We applied mance of our controller. In our experiment, we set the con- the ARIMA model to the real data collected at the Google trol frequency to once every 5 minutes to match Google's cluster and we evaluated the effect of the number of lags Cluster measurements frequency [4]. Typically, a high con- used as input for the prediction model (n) and the effect trol frequency implies fast response to demand fluctuations.
of the prediction horizon (h) on the multi-step squared er- However, it also incurs a high computational overhead. How- ror (RSEh). Memory and CPU usage are measured every ever, we found the computational overhead of both demand five minutes. Hence, a one-step prediction is equivalent to prediction algorithm and controller to be almost negligible, predict the cluster usage in the next five minutes.
thus, once every 5 minutes is a reasonable control frequency.
Energy Comsumption (kWh) 200 Average Scheduling Delay (second) Figure 13: Average Figure 14: Energy Con- Queuing Delay.
Figure 10: Capacity vs. Usage over 24 hours(R=0.1).
Average cost per hour ($) Number of active machines Scheduling delay (seconds) Figure 15: Average cost per hour.
Figure 11: Number of Figure 12: Average machines over 24 hours scheduling delay over 24 in energy costs, which implies 20% reduction in energy cost, hours (R=0.1).
while achieving the desired scheduling delay (for R = 0.225).
Furthermore, depending on the desired average schedulingdelay (Figure 13), our proposed approach can reduce total Lastly, we set Q = 1 in our experiments. Thus, the recon- operational cost by about 18.5 − 50% (Figure 15).
figuration cost can be controlled by properly adjusting thevalue of R.
In our experiments, we first evaluated the response of our system to usage fluctuation. The number of active servers Data centers have become a cost-effective infrastructure provisioned over the 24-hour duration is shown in Figure 11 for data storage and hosting large-scale service applications.
(for R = 0.1). Figure 10 show the capacity and the usage However, large data centers today consume significant a- of the cluster. It can be observed that the controller ad- mounts of energy. This not only rises the operational ex- justs the number of servers dynamically in reaction to usage penses of cloud providers, but also raises environmental con- fluctuation, while avoiding rapid change in the number of cerns with regard to minimizing carbon footprint. In this active machines. The cumulative distribution function of paper, we mitigate this concern by designing a dynamic ca- task scheduling delay is shown in Figure 12. It can be seen pacity provisioning system that controls the number of ac- that more than 60% of tasks are scheduled immediately.
tive servers in the data center according to (1) demand fluc- We performed several other experiments for comparison tuation, (2) variability in energy prices and (3) the cost of purpose. In the first experiment, the number of machines dynamic capacity reconfiguration. Our solution is based on is provisioned statically according to peak usage (i.e., 4100 the well-established Model Predictive Control framework, machines). In the remaining experiments, we applied our and aims to find a good trade-off between energy savings controller using different values of R. Figures 13 and 14 and capacity reconfiguration cost. Simulations using real show the corresponding average scheduling delay and en- traces obtained from a production Google compute clusters ergy consumption for different values of R compared to the demonstrate our approach achieves considerable amount of static provisioning. It can be observed that the static pro- reduction in energy cost. As such, we believe our approach visioning achieves the lowest scheduling delay since it sig- represents an initial step towards building a full-fledged ca- nificantly overprovisions the cluster capacity. On the other pacity management framework for cloud data centers.
hand, dynamic provisioning with R = 0.5 causes a signifi- There are several promising directions we can pursue in cant scheduling delay although it allows to reduce the energy the future. First, our current approach assumes that ma- consumption (up to 50%). Furthermore, setting R to a small chines are homogenous. While this is applicable to many value (e.g., 0.02) does not achieve significant energy reduc- situations (cloud providers often buy large quantities of iden- tion. Through experiments, we found that setting R = 0.225 tical machines in bulk), recent literature suggests that pro- achieves our desired SLA objective of keeping the average duction data centers often consists of multiple types (some- scheduling delay around 10 seconds while reducing the en- times multiple generations) of machines. Extending our cur- ergy consumption by 18.5%. Figure 15 shows the actual rent solution to handle machine heterogeneity requires care- energy cost per hour, taking into consideration the fluctua- ful consideration of scheduling capability of each type of tion of the electricity price. It can be seen that our dynamic machine. Another interesting problem is to understand the capacity provisioning mechanism reduces 7 dollars per hour interplay between the scheduler and the capacity controller.
We believe it is possible to further reduce the cost of energy Proceedings of the annual international symposium on consumption and reconfiguration (i.e., task preemption and Computer architecture (ISCA), 2007.
migration cost) if the scheduler and the capacity controller [15] Y. Fu, C. Lu, and H. Wang. Robust control-theoretic can cooperate tightly at a fine-grained level (e.g., interac- thermal balancing for server clusters. In IEEE tion of server consolidation algorithms with our capacity International Symposium on Parallel Distributed Processing (IPDPS), April 2010.
[16] B. Guenter, N. Jain, and C. Williams. Managing cost, performance, and reliability tradeoffs for energy-awareserver provisioning. In IEEE INFOCOM, April 2011.
This work was supported in part by a Google ResearchAward, by the Natural Science and Engineering Council [17] A. Gulati, A. Holler, M. Ji, G. Shanmuganathan, of Canada (NSERC) SAVI Research Network, and by the C. Waldspurger, and X. Zhu. VMware distributed World Class University (WCU) Program under the Korea resource management: Design, implementation, and Science and Engineering Foundation funded by the Ministry lessons learned. In VMware Technical Journal, 2012.
of Education, Science and Technology (Project No. R31- [18] D. Kusic, J. O. Kephart, J. E. Hanson, N. Kandasamy, and G. Jiang. Power and performance management ofvirtualized computing environments via lookaheadcontrol. In Proceedings of the International Conference on Autonomic Computing (ICAC), 2008.
[1] Energy star computers specification - feb. 14, 2012.
[19] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs. Cutting the electric bill for Internet-scale systems. In ACM SIGCOMM Computer Draft 1 Version 6.0 Specification.pdf.
Communication Review, volume 39, 2009.
[2] Eucalyptus community. http://open.eucalyptus.com/.
[20] R. Raghavendra, P. Ranganathan, V. Talwar, [3] Google cluster-usage traces: format + schema.
Z. Wang, and X. Zhu. No power struggles: Coordinated multi-level power management for the data center. In ACM SIGARCH Computer Architecture News, volume 36. ACM, 2008.
[4] Googleclusterdata - traces of google workloads.
[21] B. Sharma, V. Chudnovsky, J. Hellerstein, R. Rifaat, and C. Das. Modeling and synthesizing task placement [5] Technology research - Gartner Inc.
constraints in google compute clusters. In Proceedings of ACM Symposium on Cloud Computing, 2011.
[6] U.S. Energy Information Administration (EIA).
[22] Z. Shen, S. Subbiah, X. Gu, and J. Wilkes.
Cloudscale: Elastic resource scaling for multi-tenant [7] Z. Abbasi, G. Varsamopoulos, and S. K. S. Gupta.
cloud systems. In Proceedings of the ACM Symposium Thermal aware server provisioning and workload on Cloud Computing, 2011.
distribution for Internet data centers. In Proceedings [23] A. Verma, P. Ahuja, and A. Neogi. pMapper: power of the ACM International Symposium on High and migration cost aware application placement in Performance Distributed Computing (HPDC), 2010.
virtualized systems. In ACM/IFIP/USENIX [8] C. Bash, C. Patel, and R. Sharma. Dynamic thermal Middleware, 2008.
management of air cooled data centers. In IEEE [24] A. Verma, G. Dasgupta, T. Nayak, P. De, and Intersociety Conference on the Thermal and R. Kothari. Server workload analysis for power Thermomechanical Phenomena in Electronics Systems minimization using consolidation. In Proceedings of (ITHERM), 2006.
the conference on USENIX Annual technical [9] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel. Time conference. USENIX Association, 2009.
Series Analysis, Forecasting, and Control.
[25] G. von Laszewski, L. Wang, A. Younge, and X. He.
Prentice-Hall, third edition, 1994.
Power-aware scheduling of virtual machines in [10] S. Boyd and L. Vandenberghe. Convex Optimization.
DVFS-enabled clusters. In IEEE International Cambridge University Press, New York, USA, 2004.
Conference on Cluster Computing and Workshops [11] G. Chen, W. He, J. Liu, S. Nath, L. Rigas, L. Xiao, (CLUSTER), 2009.
and F. Zhao. Energy-aware server provisioning and [26] X. Wang and M. Chen. Cluster-level feedback power load dispatching for connection-intensive Internet control for performance optimization. In IEEE services. In USENIX Symposium on Networked International Symposium on High Performance Systems Design and Implementation (NSDI), 2008.
Computer Architecture (HPCA), February 2008.
[12] J. Dean and S. Ghemawat. MapReduce: Simplified [27] Q. Zhang, J. Hellerstein, and R. Boutaba.
data processing on large clusters. Communications of Characterizing task usage shapes in Google's compute the ACM, 51(1), 2008.
clusters. In Workshop on Large Scale Distributed [13] P. A. Dinda. Design, implementation, and Systems and Middleware (LADIS), 2011.
performance of an extensible toolkit for resourceprediction in distributed systems. IEEE Trans.
Parallel Distrib. Syst., 17, February 2006.
[14] X. Fan, W.-D. Weber, and L. A. Barroso. Power provisioning for a warehouse-sized computer. In

Source: http://www.savinetwork.ca/wp-content/uploads/savifunded/Raouf%20Boutaba_UWaterloo_Dynamic%20Energy-Aware%20Capacity%20Provisioning%20for%20Cloud%20Computing%20Environments_Proceedings%20of%20the%20IEEE%20ACM%20Intl.%20Conference%20on%20Autonomic%20Computing_Sep.12.pdf


The Infant Feeding Strategy 2008 – 2012 & Action Plans NHS Greater Glasgow & Clyde Section 1. The Introduction and Overview 1.1 Strategic Aims 1.2 Reducing Health Inequalities 1.3 Philosophy and Principles of Change 1.4 Engaging with Communities 1.5 Partnership Working 1.6 Responsibility for Achieving the Strategy Aims and Objectives


Undergraduate Research Symposium The Auburn Montgomery School of Sciences Table of Contents Schedule of Events . 5 Poster Session I . 6 Oral Session I . 7 Poster Session II. 8 Abstracts . 9-28 Phagocytic Activity in Bufo marinus . 10 Antibiogram of Coliform Bacteria Isolated from River Water . 11 Anuran Immunology and the Effect of Corticosterone on Basophil Proliferation . 12