Dynamic Energy-Aware Capacity Provisioning for Cloud Computing Environments

Qi Zhang, Mohamed Faten Zhani, Raouf Boutaba
University of Waterloo, Waterloo, ON, Canada

Shuo Zhang
National University of Defense Technology, Changsha, Hunan, China

Quanyan Zhu
University of Illinois at Urbana-Champaign, USA

Joseph L. Hellerstein
Google, Inc., Seattle, Washington, USA

Raouf Boutaba is also with the Pohang University of Science and Technology (POSTECH), Pohang 790-784, Korea.
ABSTRACT
Data centers have recently gained significant popularity as a cost-effective platform for hosting large-scale service applications. While large data centers enjoy economies of scale by amortizing initial capital investment over large numbers of machines, they also incur tremendous energy cost in terms of power distribution and cooling. An effective approach for saving energy in data centers is to adjust dynamically the data center capacity by turning off unused machines. However, this dynamic capacity provisioning problem is known to be challenging, as it requires a careful understanding of the resource demand characteristics as well as consideration of various cost factors, including task scheduling delay, machine reconfiguration cost and electricity price fluctuation.

In this paper, we provide a control-theoretic solution to the dynamic capacity provisioning problem that minimizes the total energy cost while meeting the performance objective in terms of task scheduling delay. Specifically, we model this problem as a constrained discrete-time optimal control problem, and use Model Predictive Control (MPC) to find the optimal control policy. Through extensive analysis and simulation using real workload traces from Google's compute clusters, we show that our proposed framework can achieve significant reduction in energy cost, while maintaining an acceptable average scheduling delay for individual tasks.

Categories and Subject Descriptors
C.4 [Performance of Systems]: Modeling techniques; I.2.8 [Problem Solving, Control Methods, and Search]: Control Theory; I.6.3 [Simulation and Modeling]: Applications

General Terms
Management, Performance, Experimentation

Keywords
Cloud Computing; Resource Management; Energy Management; Model Predictive Control

1. INTRODUCTION
Data centers today are home to a vast number and a variety of applications with diverse resource demands and performance objectives. Typically, a cloud application can be divided into one or more tasks executed in one or more containers (e.g., virtual machines (VMs)). At run time, schedulers are responsible for assigning tasks to machines. In today's reality, production data centers such as Google's cloud backend often execute a tremendous number (e.g., millions) of tasks on a daily basis [27]. Such an extremely large-scale workload hosted by data centers not only consumes significant storage and computing power, but also huge amounts of energy. In practice, the operational expenditure on energy not only comes from running physical machines, but also from cooling down the entire data center. It has been reported that energy consumption accounts for more than 12% of monthly operational expenditures of a typical data center [5]. For large companies like Google, a 3% reduction in energy cost can translate into over a million dollars in cost savings [19]. On the other hand, governmental agencies continue to implement standards and regulations to promote energy-efficient (i.e., "Green") computing [1]. Motivated by these observations, cutting down electricity cost has become a primary concern of today's data center operators.

In the research literature, a large body of recent work tries to improve the energy efficiency of data centers. A plethora of techniques have been proposed to tackle different aspects of the problem, including the control of power distribution systems [20], cooling systems [8], computer hardware [25], and software components such as virtualization [24] and load-balancing algorithms [11, 15]. It is known that one of the most effective approaches for reducing energy cost is to dynamically adjust the data center capacity by turning off unused machines, or by setting them to a power-saving (e.g., "sleep") state. This is supported by the evidence that an idle machine can consume as much as 60% of the power drawn when the machine is fully utilized [11, 15, 17]. Unsurprisingly, a number of efforts are trying to leverage this fact to save energy using techniques such as VM consolidation [24] and migration [23].
However, these studies have mainly focused on improving the utilization of clusters by improving the "bin-packing" algorithm for VM scheduling. In a production data center where resource requests for tasks can arrive dynamically over time, deciding the number of machines to be switched off is affected not only by the efficiency of the scheduling algorithm, but also by the time-dependent characteristics of resource demand. While over-provisioning the data center capacity can lead to sub-optimal energy savings, under-provisioning the data center capacity can cause a significant performance penalty in terms of scheduling delay, which is the time a task has to wait before it is scheduled on a machine. A high scheduling delay can significantly hurt the performance of some services that must be scheduled as soon as possible to satisfy end user requests (e.g., user-facing applications). On the other hand, tasks in production data centers often desire multiple types of resources, such as CPU, memory, disk and network bandwidth. In this context, devising a dynamic capacity provisioning mechanism that considers demand for multiple types of resources becomes a challenging problem. Furthermore, there are reconfiguration costs associated with switching machines on and off. In particular, turning a machine on and off often consumes a large amount of energy due to saving and loading system states to memory and disk [18]. When turning off a machine with running tasks, it is necessary to consider the performance penalty due to migrating (or terminating) the tasks on the machine. Therefore, the reconfiguration cost due to server switching should be considered as well. Finally, another aspect often neglected in the existing literature is the electricity price. For example, it is known that in many regions of the U.S., the price of electricity can change depending on the time of the day. Electricity price is thus another factor that should be considered when making capacity adjustment decisions.

In this paper, we present a solution to the dynamic capacity provisioning problem with the goal of minimizing total energy cost while maintaining an acceptable task scheduling delay. Different from existing works on the server capacity provisioning problem, we formulate the problem as a convex optimization problem that considers multiple resource types and fluctuating electricity prices. We then analyze the optimality condition of this problem and design a Model Predictive Control (MPC) algorithm that adjusts the number of servers to track the optimality condition while taking into account the switching costs of machines. Through analysis and simulation using real workload traces from Google's compute clusters, we show our proposed solution is capable of achieving significant energy savings while minimizing SLA violations in terms of task scheduling delay.

The remainder of the paper is organized as follows. Section 2 presents a survey of related work in the research literature. In Section 3, we present an analysis of real workload traces for one of Google's production compute clusters and illustrate the benefits of our approach. Section 4 describes the architecture of our proposed system. In Section 5, we present our demand prediction model and control algorithm. In Section 6, we provide our detailed formulation for the optimal control problem and present our solution based on the MPC framework. In Section 7, we evaluate our proposed system using Google workload traces, and demonstrate the benefits under various parameter settings. Finally, we draw our conclusions in Section 8.

2. RELATED WORK
Much effort has been made to achieve energy savings in data centers. Dynamic capacity provisioning, which consists of dynamically turning data center servers on and off, is one of the most promising solutions to reduce energy cost. For instance, motivated by the time-dependent variation of the number of users and TCP connections in Windows Live Messenger login servers, Chen et al. [11] have derived a framework for dynamic server provisioning and load dispatching. They have proposed a technique to evaluate the number of needed servers based on the predicted load in terms of users' login rate and active TCP connections. The load dispatching algorithm ensures that incoming requests are distributed among the servers. However, their framework does not consider the cost of switching machines on and off. Guenter et al. [16] have proposed another automated server provisioning and load dispatching system based on the predicted demand while considering the cost of transitioning servers between different power states (e.g., "on", "off", "hibernate"). This cost depends on the transition time, the energy cost and the long-term reliability of the server. Different from our work, they analyze the number of requests that can be satisfied instead of the request scheduling delay. Furthermore, the multi-dimensional aspect of resource demand and the fluctuations of electricity prices are not considered in their work.

Kusic et al. [18] have proposed a dynamic resource provisioning framework for virtualized server environments based on lookahead control. The framework minimizes power consumption by adjusting the number of physical and virtual machines. It also estimates the CPU share and the workload directed to every virtual machine. In addition, their controller manages to maximize the number of transactions that satisfy the Service Level Agreement (SLA) in terms of average response time while taking into account the cost of turning machines on and off. However, they mainly consider the performance of application servers rather than the scheduling of VMs. Furthermore, time-dependent variations of electricity prices are not considered in their framework. Abbasi et al. [7] have proposed a thermal-aware server provisioning technique and a workload distribution algorithm. In this approach, active servers are selected using heuristics in a way that minimizes cooling and computing energy cost. The workload distribution algorithm ensures that servers' utilizations do not exceed a threshold in order to satisfy an SLA defined in terms of average response time. However, their approach does not consider the switching cost of machines.

There is also a large body of work that applies control theory to achieve energy savings in data centers. Fu et al. [15] have proposed a control-theoretic thermal balancing approach that reduces temperature differences among servers. Hence, the controller acts on the utilization of each processor in order to reduce or increase its temperature.
Model predictive control is used by Wang et al. [26] to reduce the total power consumption of a cluster by tuning the CPU frequency level of each server's processor. However, most previous work has focused on capacity provisioning from a service provider perspective, i.e., provisioning server capacities (e.g., number of web servers) to accommodate end user requests. Existing solutions to this problem rely on queueing-theoretic models that consider only a single type of resource (mainly CPU). In contrast, our approach investigates the problem from the cloud provider's perspective, where resource demand and usage are multi-dimensional. Our solution considers resource usage and capacity for multiple resource types, such as CPU and memory. Furthermore, none of the existing work has considered additional factors such as fluctuating electricity prices. Our approach is also lightweight and independent of the scheduling algorithm, making it more suitable for production data center environments.

3. WORKLOAD ANALYSIS
To motivate the problem and justify our solution approach, we have conducted an analysis of workload traces for a production compute cluster at Google [4] consisting of approximately 12,000 machines. The dataset was released on November 29, 2011. The workload traces contain scheduling events as well as resource demand and usage records for a total of 672,003 jobs and 25,462,157 tasks over a time span of 29 days. Specifically, a job is an application that consists of one or more tasks. Each task is scheduled on a single physical machine. When a job is submitted, the user can specify the maximum allowed resource demand for each task in terms of required CPU, memory and disk size. At run time, the usage of a task measures the actual consumption of each type of resource. The current Google cluster traces provide task demand and usage for CPU, memory and disk (note that the values reported in the Google cluster traces were normalized between 0 and 1). The usage of each type of resource is reported at 5-minute intervals. Our current analysis mainly focuses on CPU and memory, as they are typically scarce compared to disk. However, we believe it is straightforward to extend our approach to consider other resources such as disk space.

In addition to resource demand, the user can also specify a scheduling class, a priority and placement constraints for each task. The scheduling class captures the type of the task (e.g., user-facing or batch). The priority determines the importance of each task. The task placement constraints specify additional scheduling constraints concerning the machine configurations, such as the processor architecture of the physical machine [21]. To simplify, we do not consider the scheduling class, priority and task placement constraints in our model. The analysis of these factors is left for future work.

We first plot the total demand and usage for both CPU and memory over the entire duration. The results are shown in Figures 1 and 2, respectively.

Figure 1: Total CPU demand and usage (29 days).
Figure 2: Total memory demand and usage (29 days).

The total usage at a given time is computed by summing up the resource usage of all the running tasks at that time. On the other hand, the total demand at a given time is determined by the total resource requirement of all the tasks in the system, including the tasks that are waiting to be scheduled. From Figures 1 and 2, it can be observed that both usage and demand for each type of resource can fluctuate significantly over time. Figure 3 shows the number of machines available and used in the cluster. Specifically, a machine is available if it can be turned on to execute tasks. A machine is used if there is at least one task running on it.

Figure 3: Number of machines available and used in the cluster.

Figure 3 shows that the capacity of this cluster is not adjusted based on resource demand, as the number of used machines is almost equal to the number of available machines. Combining the observations from Figures 1, 2 and 3, it is evident that a large number of machines can be turned off to save energy. For instance, we estimated that a perfect energy saving schedule, where the provisioned capacity exactly matches the current demand, can achieve about 22% and 17% resource reduction for CPU and memory, respectively, compared to provisioning capacity according to the peak demand. This indicates that there is great potential for energy savings in this compute cluster using dynamic capacity provisioning.

However, while turning off active machines can reduce total energy consumption, turning off too many machines can also hurt task performance in terms of scheduling delay. Classic queueing theory indicates that task scheduling delay grows exponentially with resource utilization. To quantify this effect, we analyzed the relationship between the scheduling delay experienced by each task and the average utilization of the bottleneck resource (e.g., CPU) while the task is waiting to be scheduled. We then plotted the average task scheduling delay as a function of the utilization of the bottleneck resource, as shown in Figure 4. The error bar in this diagram represents the standard deviation of the task scheduling delay at each given value of average utilization.

Figure 4: Average scheduling delay vs. resource utilization.

Indeed, we found that there is a direct relationship between task scheduling delay and resource utilization. We also modeled the relationship through curve fitting. It seems that both a linear function (i.e., d = a·U + b) and a delay function for the M/M/1 queueing model (i.e., d = a·U/(1−U) + b) can fit the curve well. Similar observations have been reported in recent work [27, 21].
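To make the curve-fitting step concrete, the following is a minimal sketch that fits both candidate models with SciPy; the (utilization, delay) samples and the fitted constants are illustrative placeholders, not values from the traces.

    import numpy as np
    from scipy.optimize import curve_fit

    # Hypothetical (utilization, average scheduling delay) samples, e.g.
    # binned from trace data as in Figure 4; replace with measurements.
    U = np.array([0.10, 0.25, 0.40, 0.55, 0.70, 0.80, 0.90])
    d = np.array([0.5, 1.1, 2.0, 3.4, 6.0, 9.5, 18.0])

    def linear(u, a, b):
        # d = a * U + b
        return a * u + b

    def mm1_like(u, a, b):
        # d = a * U / (1 - U) + b, the M/M/1-style delay curve
        return a * u / (1.0 - u) + b

    p_lin, _ = curve_fit(linear, U, d)
    p_mm1, _ = curve_fit(mm1_like, U, d)

    for name, f, p in [("linear", linear, p_lin), ("M/M/1", mm1_like, p_mm1)]:
        rss = np.sum((f(U, *p) - d) ** 2)  # residual sum of squares
        print(f"{name}: a={p[0]:.3f}, b={p[1]:.3f}, RSS={rss:.3f}")

Comparing the residual sums of squares of the two fits indicates which functional form better matches a given trace.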
The above observations suggest that while the benefits of dynamic capacity provisioning are apparent for production data center environments, designing an effective dynamic capacity provisioning scheme is challenging, as it involves finding an optimal tradeoff between energy savings and scheduling delay. Furthermore, turning off active machines may require killing or migrating the tasks running on these machines, which introduces an additional performance penalty. The goal of this paper is to provide a solution to this dynamic capacity provisioning problem that finds the optimal trade-off between energy savings and the cost of reconfigurations, including the cost of turning machines on and off and of killing/migrating tasks.
4. SYSTEM ARCHITECTURE
Our proposed system architecture is depicted in Figure 5. It consists of the following components:

Figure 5: System architecture.

• The scheduler is responsible for assigning incoming tasks to active machines in the cluster. It also reports the average number of tasks in the queue during each control period to help the controller make informed decisions.

• The monitoring module is responsible for collecting CPU and memory usage statistics of every machine in the cluster. The monitoring is performed periodically.

• The prediction module receives statistics about the usage of all resources (CPU and memory) in the cluster and predicts the future usage for all of them.

• The controller implements an MPC algorithm that controls the number of machines based on the predicted usage of the cluster, taking into account the reconfiguration cost.

• The capacity provisioning module gathers the status information of machines from the controller, and decides which machines in particular should be added or removed. It then provides the scheduler with the list of active machines.

It is worth mentioning that different schedulers may adopt different resource allocation schemes. For example, in a public cloud environment such as Amazon EC2, it is necessary to schedule tasks according to their resource demand (e.g., VM size). However, since the actual resource usage of each task can be much lower than the demand, many advanced schedulers dynamically adjust resource allocation based on task usage [22]. Even though our framework is applicable to both scenarios, in this work we use the latter case for illustration, not only because it is more general, but also because it reflects the behavior of Google cluster schedulers. In particular, Google's schedulers intentionally over-commit resources on each machine [3]. Finally, as an initial effort towards solving this problem, we currently consider that all the machines in the cluster are homogeneous and have identical resource capacities. It is part of our future work to extend our model to consider machine heterogeneity (e.g., multiple generations of machines [21]).

In the following sections, we describe the design of each component in detail. A sketch of how these components could be wired together in a control loop is given below.
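As an illustration only, the following Python skeleton shows one way the five components could interact during a control period; all class and method names are hypothetical, not part of the paper's implementation.

    # Hypothetical skeleton of the control loop wiring the components;
    # all names below are illustrative assumptions.
    class Monitor:
        def collect(self):
            """Return per-resource cluster usage, e.g. {'cpu': ..., 'mem': ...}."""
            raise NotImplementedError

    class Predictor:
        def forecast(self, history, horizon):
            """Return predicted usage for steps h = 1..horizon."""
            raise NotImplementedError

    class Controller:
        def decide(self, predicted_usage, x_active):
            """Return u_k: the change in the number of active machines."""
            raise NotImplementedError

    class CapacityProvisioner:
        def apply(self, u_k):
            """Pick which machines to turn on/off; return active machine list."""
            raise NotImplementedError

    def control_period(monitor, predictor, controller, provisioner,
                       scheduler, history, x_active, horizon=12):
        # One iteration: monitor -> predict -> decide -> reconfigure.
        history.append(monitor.collect())
        pred = predictor.forecast(history, horizon)
        u_k = controller.decide(pred, x_active)
        active_machines = provisioner.apply(u_k)
        scheduler.set_active_machines(active_machines)
        return x_active + u_k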
5. RESOURCE USAGE PREDICTION
In this section, we describe our model for predicting the usage of each resource type. We use the Auto-Regressive Integrated Moving Average (ARIMA) model [9] to predict the time series G^r_k, which represents the usage of resource type r in all the machines at time k. For convenience, we drop the superscript r and write simply G_k in this section.
Knowing the last n observations of G_k, i.e., G_{k−n+1}, ..., G_k, we want to predict G_{k+1}, which is the expected usage at time k + 1 predicted at time k. The time series G_k follows an ARMA(n,q) model if it is stationary and if for every k:

    G_{k+1} = φ_0 G_k + ... + φ_{n−1} G_{k−n+1} + ε_{k+1} + θ_0 ε_k + ... + θ_{q−1} ε_{k+1−q},    (1)

where φ_i and θ_j are constants estimated from available data. The terms ε_k are error terms which are assumed to be independent, identically distributed samples from a normal distribution with zero mean and finite variance σ².
The parameters n and q are, respectively, the number of lags used by the model (i.e., the number of last measured values of the usage) and the number of error terms. Equation (1) can also be written in a concise form as:

    G_{k+1} = Σ_{i=0}^{n−1} φ_i L^i G_k + ε_{k+1} + Σ_{j=0}^{q−1} θ_j L^j ε_k,

where L is the backward shift operator defined as L^i G_k = G_{k−i}. We point out that AR and MA models are special cases of the ARMA model when q = 0 or n = 0, respectively.

The ARMA model fitting procedure assumes that the data are stationary. If the time series exhibits variations, we use a differencing operation in order to make it stationary. It is defined as:

    G'_k = G_k − G_{k−1}.

It can be shown that a polynomial trend of degree k is reduced to a constant by differencing k times, that is, by applying the operator (1−L)^k y(t). An ARIMA(n,d,q) model is an ARMA(n,q) model that has been differenced d times. Thus, the ARIMA(n,d,q) model can be written as:

    (1−L)^d G_{k+1} = Σ_{i=0}^{n−1} φ_i L^i (1−L)^d G_k + ε_{k+1} + Σ_{j=0}^{q−1} θ_j L^j ε_k.
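As a minimal sketch of fitting such a model, the snippet below uses the statsmodels library (a stand-in for the RPS toolkit [13] used later in the paper) to fit an ARIMA(2,1,1) model to a synthetic usage series and produce a one-step prediction; the series and parameters are illustrative.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    # 'usage' stands in for the 5-minute cluster usage series G_k.
    rng = np.random.default_rng(0)
    usage = np.cumsum(rng.normal(0.0, 0.01, size=500)) + 5.0  # synthetic G_k

    model = ARIMA(usage, order=(2, 1, 1))   # (n, d, q) as in the text
    result = model.fit()
    g_next = result.forecast(steps=1)       # one-step prediction G_{k+1|k}
    print("one-step prediction:", g_next[0])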
Multi-step prediction
In our model, we aim to predict future resource usage over a time window H ∈ N+. This requires predicting resource usage h ∈ N+ steps ahead from an end-of-sample G_k for all 1 ≤ h ≤ H. Let G_{k+h|k} denote the h-th step prediction of G_{k+h} knowing the last n observations, i.e., G_{k−n+1}, ..., G_k. Thus, we aim to predict G_{k+1|k}, G_{k+2|k}, ..., G_{k+H|k}. The multi-step prediction is obtained by iterating the one-step-ahead prediction. The h-th step prediction G_{k+h|k} is given by:

    G_{k+h|k} = f(G_{k+h−n|k}, ..., G_{k+h−i|k}, ..., G_{k+h−1|k}),

where G_{k−i|k} = G_{k−i} ∀i ∈ [0, n], the function f is the prediction model, n is the number of lags used by the model, and h is the prediction step. Table 1 illustrates how the one-step prediction is iterated to obtain multi-step predictions.

Table 1: Example of multi-step prediction (n = 3).

    Prediction   | Inputs of the model
    G_{k+1|k}    | G_{k−2|k}, G_{k−1|k}, G_{k|k}
    G_{k+2|k}    | G_{k−1|k}, G_{k|k}, G_{k+1|k}
    G_{k+3|k}    | G_{k|k}, G_{k+1|k}, G_{k+2|k}
    G_{k+4|k}    | G_{k+1|k}, G_{k+2|k}, G_{k+3|k}
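The iteration scheme of Table 1 can be expressed in a few lines. The sketch below assumes a generic one-step predictor f and feeds each prediction back as an input, exactly as described above; the toy AR(2)-style coefficients are illustrative.

    def multi_step_predict(f, history, H, n):
        """Iterate a one-step predictor f over horizon H (Table 1 scheme).

        f       : one-step model, maps the last n values to the next value
        history : list of observed values [..., G_{k-1}, G_k]
        n       : number of lags used by the model
        """
        window = list(history[-n:])          # G_{k-n+1}, ..., G_k
        preds = []
        for _ in range(H):
            g_next = f(window)               # next step from last n values
            preds.append(g_next)
            window = window[1:] + [g_next]   # feed the prediction back in
        return preds

    # Example with a toy AR(2)-style predictor (coefficients illustrative).
    f = lambda w: 1.2 * w[-1] - 0.3 * w[-2]
    print(multi_step_predict(f, [1.0, 1.1, 1.15], H=3, n=2))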
6. PROBLEM FORMULATION
We formally describe the dynamic capacity provisioning problem in this section. We assume the cluster consists of M_k ∈ N+ homogeneous machines at time k. The number of machines in the cluster can change due to machine failure and recovery. We assume each machine has d ∈ N types of resources. For example, a physical machine provides CPU, memory and disk. Let R = {1, 2, ..., d} denote the set of resource types. Denote by C^r ∈ R+ the capacity for resource type r ∈ R of a single machine.

To model the system dynamics, we divide time into intervals of equal duration. We assume reconfiguration happens at the beginning of each time interval. At interval k ∈ N+, the measured usage for resource type r in the cluster is denoted by G^r_k. Let x_k denote the number of active machines. Denote by u_k ∈ R the change in the number of active machines. A positive value of u_k means more machines will be turned on, whereas a negative value of u_k means some active machines will be powered off. Therefore, we have the following simple state equation that calculates the number of active machines at time k + 1:

    x_{k+1} = x_k + u_k.

Our objective is to control the number of machines in order to reduce the total operational cost in terms of energy consumption and the penalty due to violating the SLA, while taking into consideration the cost of dynamic reconfiguration. In what follows, we describe how to model each of the cost factors in detail.

Modeling SLA penalty cost
In our model, the SLA is expressed in terms of an upper bound d̄ on the average task scheduling delay. Thus, in order to meet the SLA, the average task scheduling delay d_k at time k should not exceed d̄. As suggested in Section 3, the average task scheduling delay is correlated with the utilization of the cluster's resources, and more particularly with the utilization of the bottleneck resource. Therefore, we define the bottleneck resource b ∈ R at time k ∈ N+ as the resource that has the highest utilization. In our model, the utilization of resource r ∈ R in the cluster at time k ∈ N+ is given by:

    U^r_k = G^r_k / (C^r · x_k).

Therefore, the utilization of the bottleneck resource b can be calculated as:

    U^b_k = max_{r ∈ R} {U^r_k}.

Then the average scheduling delay at time k ∈ N can be estimated as d_k = q^b(U^b_k), where q^b(U^b_k) denotes the average latency given current utilization U^b_k for the bottleneck resource b. The function q^b(·) can be obtained using various techniques, such as queueing-theoretic models, or directly from empirical measurements as described in Section 3.

We adopt a simple penalty cost model for SLA violation. Specifically, if the delay bound is violated, then there will be an SLA penalty cost P^{SLA}_k proportional to the degree of violation:

    P^{SLA}_k(U^b_k) = N_k p^{SLA} (q(U^b_k) − d̄)^+,

where (x)^+ = max(x, 0), p^{SLA} represents the unit penalty cost for violating the delay upper bound, and N_k is a weight factor representing the severity of the violation at time k (e.g., N_k can be the number of requests in the scheduling queue at time k).
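A small sketch of this cost model follows, computing per-resource utilization, the bottleneck, and the penalty; the numbers and the delay curve q(·) are illustrative placeholders.

    def sla_penalty(G, C, x_k, N_k, p_sla, d_bar, q):
        """SLA penalty P_k^SLA = N_k * p_sla * (q(U_k^b) - d_bar)^+ .

        G: usage per resource, e.g. {'cpu': ..., 'mem': ...};
        C: per-machine capacity; x_k: active machines;
        q: delay model, e.g. the fitted M/M/1-style curve of Section 3.
        """
        util = {r: G[r] / (C[r] * x_k) for r in G}   # U_k^r = G_k^r / (C^r x_k)
        u_b = max(util.values())                     # bottleneck utilization
        return N_k * p_sla * max(q(u_b) - d_bar, 0.0)

    # Illustrative numbers (not from the paper):
    q = lambda u: 2.0 * u / (1.0 - u) + 0.5
    print(sla_penalty({'cpu': 800.0, 'mem': 600.0}, {'cpu': 1.0, 'mem': 1.0},
                      x_k=1000, N_k=50, p_sla=0.01, d_bar=10.0, q=q))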
Modeling the total energy cost
In the research literature, it is known that the total energy consumption of a physical machine can be estimated by a linear function of CPU, memory and disk usage [16, 7, 11, 14]. Thus, the energy consumption of a machine at time k can be expressed as:

    E_k = E_{idle} + Σ_{r ∈ R} α_r U^r_k,

where E_{idle} is the energy consumed by an idle machine and α_r captures the contribution of resource type r. Let p^{power}_k denote the electricity price at time k. Then, for a given number of machines x_k, the total energy cost P^{power}_k at time k can be expressed as:

    P^{power}_k(x_k) = p^{power}_k (x_k E_{idle} + Σ_{r ∈ R} α_r G^r_k / C^r).
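The following sketch instantiates this energy cost model; all constants are illustrative (the evaluation in Section 7 uses E_idle = 200 W and α = 121 and 0 for CPU and memory). Note that the usage-dependent term does not involve x_k, a fact exploited in the next subsection.

    def energy_cost(x_k, G, C, p_power, E_idle, alpha):
        """Total energy cost P_k^power under the linear machine power model.

        Per machine: E = E_idle + sum_r alpha[r] * U^r; aggregated over x_k
        active machines, the usage-dependent part depends only on G^r.
        """
        usage_part = sum(alpha[r] * G[r] / C[r] for r in G)
        return p_power * (x_k * E_idle + usage_part)

    # Illustrative values only:
    print(energy_cost(x_k=1000, G={'cpu': 800.0, 'mem': 600.0},
                      C={'cpu': 1.0, 'mem': 1.0}, p_power=0.05,
                      E_idle=200.0, alpha={'cpu': 121.0, 'mem': 0.0}))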
Formulation of the optimization problem
As mentioned previously, our objective is to control the number of servers so as to minimize the total operational cost, which is the sum of the SLA penalty cost P^{SLA}_k and the energy cost P^{power}_k(x_k). At the same time, we need to ensure that the number of active machines in the cluster does not exceed M_k, the total number of physical machines in the cluster. This can be formulated by the following optimization problem:

    minimize    N_k p^{SLA} (q(U^b_k) − d̄)^+ + P^{power}_k(x_k)
    s.t.        0 ≤ x_k ≤ M_k.

Notice that the term p^{power}_k Σ_{r ∈ R} α_r G^r_k / C^r does not depend on x_k at time k, thus it can be omitted in the optimization formulation. In addition, define w_k = max_r {G^r_k / C^r}, so that U^b_k = w_k / x_k. Assuming that q(w_k / x_k) is a decreasing function of x_k (i.e., the average queuing delay decreases as the number of machines increases), whenever (q(w_k / x*_k) − d̄)^+ = 0 at the optimal solution x*_k, we can decrease x*_k to reduce energy cost while maintaining (q(w_k / x*_k) − d̄)^+ = 0. Therefore, we can further simplify the problem to:

    minimize    N_k p^{SLA} (q(w_k / x_k) − d̄) + p^{power}_k x_k E_{idle}
    s.t.        0 ≤ x_k ≤ M_k.

In order to solve this optimization problem, we use the Karush-Kuhn-Tucker (KKT) approach [10]. The Lagrangian is:

    L(x_k, γ, µ) = N_k p^{SLA} (q(w_k / x_k) − d̄) + p^{power}_k x_k E_{idle} + γ (x_k − M_k) + µ (0 − x_k).

The KKT conditions are:

    p^{power}_k E_{idle} − N_k p^{SLA} (w_k / x_k²) q'(w_k / x_k) + γ − µ = 0,
    γ (x_k − M_k) = 0,
    µ x_k = 0,
    µ, γ ≥ 0.

We need to consider three cases: (1) γ > 0, (2) µ > 0, and (3) γ = 0 and µ = 0. The first two cases correspond to the boundary conditions x*_k = M_k and x*_k = 0. In the third case, assuming q(·) is convex and differentiable, we can solve for x*_k using the first condition. For instance, for q(U) = a·U/(1−U) + b (which is the case for the M/M/1 queueing model), the first condition yields:

    x*_k = w_k + sqrt(N_k p^{SLA} a w_k / (p^{power}_k E_{idle})).    (17)

The above equation reveals many insights. First, the optimal number of servers x*_k depends mainly on the cluster utilization, which is captured by the variable w_k. Second, x*_k is also dependent on the electricity price and the SLA violations. In particular, it increases either when the electricity price drops or when the SLA penalty cost rises. Therefore, it can be seen that equation (17) strikes a balance between saving electricity cost and SLA penalty cost in a dynamic manner.
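A sketch of this computation is shown below, solving the first-order condition under the M/M/1-style delay model and clipping to the boundary cases; the parameter values are illustrative.

    import math

    def optimal_machines(w_k, N_k, p_sla, a, p_power, E_idle, M_k):
        """Solve the first-order condition for x_k* when q(U) = a*U/(1-U) + b.

        Setting d/dx [N_k p_sla q(w/x) + p_power x E_idle] = 0 gives
        x* = w + sqrt(N_k p_sla a w / (p_power E_idle)); the result is
        then clipped to the boundary cases x* = 0 and x* = M_k.
        """
        x_star = w_k + math.sqrt(N_k * p_sla * a * w_k / (p_power * E_idle))
        return min(max(x_star, 0.0), M_k)

    # Illustrative values only:
    print(optimal_machines(w_k=800.0, N_k=50, p_sla=0.01, a=2.0,
                           p_power=0.05, E_idle=200.0, M_k=12000))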
Capacity provisioning module
The capacity provisioning module takes as input the number of machines that should be added to or removed from the cluster, and determines which machines should be turned on or off. The decision to switch on a particular machine can be made based on different criteria, such as its usage and its location in the cluster. However, choosing which machine to power off is more complicated, since some tasks could be running on it. Thus, more criteria should be considered, such as the number of running tasks on the machine, their priorities, the cost of migrating or killing those tasks, as well as the resource usage of the machine. For simplicity, define c_t as the cost for migrating (or terminating) the task t, depending on the scheduling policy applied to task t. For example, if task t is an interactive task such as a web server, it is better to migrate the server to another machine to minimize the service down time. On the other hand, if the task t belongs to a MapReduce job, it is more cost-effective to simply terminate the task and restart it on a different machine [12]. We define the cost of powering off a particular machine i, 1 ≤ i ≤ M_k, at time k as:

    c^i_k = Σ_{t ∈ S^i_k} c_t,

where S^i_k denotes the set of tasks running on the machine i at time k ∈ N. It is clear that c^i_k increases with the number of tasks and their costs. Consequently, the capacity provisioning module turns off the machines having the lowest cost c^i_k.
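A minimal sketch of this selection rule follows; the task costs c_t are assumed given (e.g., derived from the scheduling policy as discussed above), and the example values are illustrative.

    def machines_to_power_off(running_tasks, task_cost, n_off):
        """Rank machines by power-off cost c_k^i = sum of task costs on i.

        running_tasks: {machine_id: [task_id, ...]};
        task_cost: c_t per task. Returns the n_off machines with the
        lowest total cost, as the capacity provisioning module does.
        """
        cost = {i: sum(task_cost[t] for t in tasks)
                for i, tasks in running_tasks.items()}
        return sorted(cost, key=cost.get)[:n_off]

    # Illustrative example: machine 2 is idle, so it is switched off first.
    tasks = {1: ['a', 'b'], 2: [], 3: ['c']}
    print(machines_to_power_off(tasks, {'a': 3.0, 'b': 1.0, 'c': 0.5}, n_off=2))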
Designing the MPC controller
The goal of our controller is to adjust the number of machines to minimize the violation of the KKT conditions, while taking into consideration the reconfiguration cost. As N_k, w_k and p^{power}_k can change over time, we adopt the well-known MPC framework to design an online controller for this problem. The MPC algorithm is illustrated by Algorithm 1. It can be intuitively described as follows: at time k, the prediction module is responsible for predicting the future values of N_k, w_k and p^{power}_k for a prediction window H. The controller then solves an optimal control problem that determines the optimal decisions for the entire window H. As only the first step is required, the controller only carries out the first control decision. The procedure repeats at the beginning of every time interval k, k + 1, and so on.

Algorithm 1 MPC Algorithm for DCP
1: Provide initial state x_k, k ← 0
2: At the end of control period k:
3:   Predict N_{k+h|k}, w_{k+h|k}, p^{power}_{k+h|k} for time h ∈ {1, ..., H}
4:   Solve the tracking problem to obtain u(k + h | k) for horizons h ∈ {0, ..., H}
5:   Perform the reconfiguration using the capacity provisioning module according to u(k | k)

More formally, we define N_{k+h|k}, w_{k+h|k} and p^{power}_{k+h|k} as the values of N_k, w_k and p^{power}_k predicted for time k + h, given the historical values up to time k. Defining

    e_{k+h|k} = x_{k+h|k} − x*_{k+h|k}

as the tracking error at time k, the objective of the controller is to solve the following program:
    minimize  J_k = Σ_{h=0}^{H} [ Q (e_{k+h|k})² + R (u_{k+h|k})² ]
    s.t.      x_{k+h+1|k} = x_{k+h|k} + u_{k+h|k},   ∀ 0 ≤ h ≤ H − 1,
              e_{k+h|k} = x_{k+h|k} − x*_{k+h|k},    ∀ 0 ≤ h ≤ H,
              0 ≤ x_{k+h|k} ≤ N,

where H is the horizon of interest. The first term represents the tracking error, and the second term represents the control penalty. The tracking error aims to reduce the error between the actual and the optimal number of machines. The control penalty takes into account the cost of adding or removing machines. Thus, J_k can be interpreted as a plan of action for the next H time intervals. Q and R are weight factors that control the stability and convergence rate of the controller. If Q is much bigger than R, then the controller places a higher weight on maximizing power savings and adjusts the number of servers aggressively. On the other hand, if R is large compared to Q, then the controller adjusts the capacity less aggressively to minimize the reconfiguration cost.

A standard way to determine the values of Q and R is to normalize both terms. In our case, we normalize them by converting both terms into monetary units. We define r̄ = ½ (c^{avg,on} + c^{avg,off}), where c^{avg,on} and c^{avg,off} are the average costs for turning servers on and off, respectively. Similarly, we define q̄ = ½ (c^{avg,over} + c^{avg,under}), where c^{avg,over} is the average cost introduced per machine due to overprovisioning, and c^{avg,under} is the average cost introduced per machine due to underprovisioning, respectively. Even though it is possible to compute analytically the values of c^{avg,on}, c^{avg,off}, c^{avg,over} and c^{avg,under}, it is more practical to estimate their values through empirical measurement. Notice that we set q̄ to the average penalty cost of both positive and negative errors because, in practice, the numbers of occurrences of positive and negative errors will likely be the same if the capacity provisioned by our controller only fluctuates within a fixed range. Finally, although we can set (Q, R) = (q̄², r̄²) to ensure both terms are in units of dollar², to simplify our model we set (Q, R) = (1, r̄²/q̄²) in our experiments, so that we only need to control R to achieve different trade-offs between solution optimality and reconfiguration cost.
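The tracking problem above is a small convex quadratic program. The sketch below solves it with the cvxpy modeling library (an assumption; the paper does not state which solver is used) and returns the first control move u(k|k), as in Algorithm 1. The predicted x* values are illustrative.

    import cvxpy as cp
    import numpy as np

    def mpc_step(x0, x_star, Q, R, H, N_max):
        """Solve the finite-horizon tracking problem; return u(k|k).

        x_star: predicted optimal machine counts x*_{k+h|k}, h = 0..H.
        Only the first move is applied; the rest of the plan is discarded.
        """
        x = cp.Variable(H + 1)
        u = cp.Variable(H)
        cost = cp.sum(Q * cp.square(x - x_star)) + cp.sum(R * cp.square(u))
        constraints = [x[0] == x0, x >= 0, x <= N_max]
        constraints += [x[h + 1] == x[h] + u[h] for h in range(H)]
        cp.Problem(cp.Minimize(cost), constraints).solve()
        return float(u.value[0])

    x_star = np.array([3000, 3050, 3100, 3100, 3050, 3000], dtype=float)
    print(mpc_step(x0=2900.0, x_star=x_star, Q=1.0, R=0.1, H=5, N_max=12000))

Increasing R in this sketch visibly damps the first move, mirroring the Q/R trade-off discussed above.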
7. EVALUATION
We have implemented our system shown in Figure 5 and evaluated the quality of our solution using trace-driven simulations. In our experiments, we set the CPU and memory capacity of each machine to 1. This represents the majority of machines in the Google cluster (the values of CPU and memory capacity reported in the Google traces were normalized to the configuration of the largest machine). In our simulation, we implemented a greedy First-Fit (FF) scheduling algorithm, which is used by many cloud computing platforms such as Eucalyptus [2]. We set E_{idle} to 200 Watts, and α_r to 121 and 0 for CPU and memory, respectively, similar to the values used in [18]. For the electricity price, we used the electricity prices for the city of Houston, Texas, obtained from a publicly available source [6]. Figure 6 shows the fluctuation of the electricity price over a duration of 24 hours. It can be observed that the electricity price is generally higher during day time. The fluctuation can sometimes be as large as 20% compared to the average electricity price over the 24 hours. Finally, we set d̄ to 10 seconds as an upper bound on the task scheduling delay.

Figure 6: Electricity price of Houston, TX over 24 hours.

Performance of the prediction model
In our first experiment, we assess the performance of the multi-step prediction of resource usage. We first describe the prediction procedure and the performance criteria. Then we study the effect of the number of lags and of the prediction horizon on the prediction accuracy.

To evaluate the quality of our prediction technique, the available traces (e.g., measured CPU or memory usage) are divided into a training data set and a validation data set. Training data are used to identify the model parameters n, d, q and the coefficients φ_i and θ_j. For given n and q, the coefficients φ_i, i ≤ n and θ_j, j ≤ q are estimated using the RPS toolkit [13]. The validation data set is then used to assess the accuracy of the prediction model. The performance metric used to evaluate the prediction accuracy is the relative squared error (RSE). It is calculated for every prediction step h as:

    RSE_h = Σ_{k=1}^{T} (G_{k+h|k} − G_{k+h})² / Σ_{k=1}^{T} (G_{k+h} − µ)²,

where T is the size of the validation data set and µ is the average of G_k over the T time intervals. The advantage of the RSE_h is that it depends neither on the scale used nor on the size of the data. Having an RSE_h lower than 1 means that the predictor outperforms using the average of the data as the prediction for G_k (G_{k+h|k} = µ). In addition, the smaller the RSE_h, the more accurate the prediction. The RSE can also be seen as the ratio of the mean squared error to the variance of the validation data set.
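A direct transcription of this metric is given below; the example arrays are illustrative.

    import numpy as np

    def rse(pred, actual):
        """Relative squared error for one prediction step h.

        pred[k] is G_{k+h|k}, actual[k] is G_{k+h} over the validation set;
        RSE_h < 1 means the model beats predicting the validation mean.
        """
        pred, actual = np.asarray(pred), np.asarray(actual)
        mu = actual.mean()
        return np.sum((pred - actual) ** 2) / np.sum((actual - mu) ** 2)

    print(rse([1.0, 1.2, 1.1, 1.4], [1.05, 1.15, 1.2, 1.35]))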
Since our model exploits the predicted usage of the cluster in terms of CPU and memory to proactively add and remove servers, we assess the prediction model's accuracy. We applied the ARIMA model to the real data collected at the Google cluster and evaluated the effect of the number of lags used as input for the prediction model (n) and the effect of the prediction horizon (h) on the multi-step squared error (RSE_h). Memory and CPU usage are measured every five minutes. Hence, a one-step prediction is equivalent to predicting the cluster usage in the next five minutes.

Figure 7 shows the one-step prediction of the CPU and memory usage compared to the real usage. The graph shows that the predicted values are always close to the real ones, even during peaks. The prediction relative squared error RSE_1 is close to zero, which shows that the ARIMA(2,1,1) model provides an accurate prediction of the usage for both CPU (RSE_1 ≈ 0.062) and memory (RSE_1 ≈ 0.086).

Figure 7: Prediction of resource usage in the Google cluster - one-step prediction - ARIMA(2,1,1).

Figure 8 depicts the effect of increasing the number of lags used in the ARIMA model to predict CPU and memory usage. Regarding CPU usage prediction (Figure 8(a)), it is apparent from the results that starting from the second lag (n = 2), the prediction error becomes stable around RSE_1 ≈ 0.062. If we now turn to memory usage prediction, Figure 8(b) shows that the prediction error remains almost stable regardless of the number of lags used for the ARIMA model (RSE_1 ≈ 0.086). Consequently, there is no improvement in the prediction performance beyond two lags (n = 2). This result is interesting, since a small number of lags reduces the ARIMA model's complexity and allows it to be implemented online with minimal overhead and high accuracy.

Figure 8: Effect of the number of lags on usage prediction - one-step prediction (h = 1).

We also conducted more experiments to examine the impact of the horizon h on the prediction performance. Since using more than two lags does not reduce the prediction error, we only considered two lags as input for the ARIMA model (i.e., n = 2). As expected, when we increase the prediction horizon, the prediction error grows for both CPU and memory usage (Figure 9). What is interesting in these results is that the error remains small (RSE_h ≤ 1) for multiple prediction steps. In particular, the prediction error RSE_h remains below 1 for 400 steps ahead (≈ 33 hours) for CPU usage and for 50 steps ahead (≈ 250 min) for memory usage. We also note that increasing the number of error terms (q) for the ARIMA model does not improve the prediction performance. In summary, these results suggest that we can apply ARIMA(2,1,1) using two lags to predict 12 steps ahead (equivalent to one hour), and this ensures that the prediction error does not exceed 0.3 and 0.5 for CPU and memory, respectively (Figure 9).

Figure 9: Performance of multi-step prediction - ARIMA(2,1,1).

Performance of the controller
We conducted several experiments to evaluate the performance of our controller. In our experiments, we set the control frequency to once every 5 minutes to match the Google cluster's measurement frequency [4]. Typically, a high control frequency implies a fast response to demand fluctuations. However, it also incurs a high computational overhead. That said, we found the computational overhead of both the demand prediction algorithm and the controller to be almost negligible; thus, once every 5 minutes is a reasonable control frequency.
Lastly, we set Q = 1 in our experiments. Thus, the reconfiguration cost can be controlled by properly adjusting the value of R.

In our experiments, we first evaluated the response of our system to usage fluctuation. The number of active servers
provisioned over the 24-hour duration is shown in Figure 11 (for R = 0.1). Figure 10 shows the capacity and the usage of the cluster. It can be observed that the controller adjusts the number of servers dynamically in reaction to usage fluctuation, while avoiding rapid changes in the number of active machines. The cumulative distribution function of the task scheduling delay is shown in Figure 12. It can be seen that more than 60% of tasks are scheduled immediately.

Figure 10: Capacity vs. usage over 24 hours (R = 0.1).
Figure 11: Number of machines over 24 hours (R = 0.1).
Figure 12: Average scheduling delay over 24 hours (R = 0.1).

We performed several other experiments for comparison purposes. In the first experiment, the number of machines is provisioned statically according to peak usage (i.e., 4100 machines). In the remaining experiments, we applied our controller using different values of R. Figures 13 and 14 show the corresponding average scheduling delay and energy consumption for different values of R compared to the static provisioning. It can be observed that the static provisioning achieves the lowest scheduling delay, since it significantly overprovisions the cluster capacity. On the other hand, dynamic provisioning with R = 0.5 causes a significant scheduling delay, although it allows the energy consumption to be reduced by up to 50%. Furthermore, setting R to a small value (e.g., 0.02) does not achieve a significant energy reduction. Through experiments, we found that setting R = 0.225 achieves our desired SLA objective of keeping the average scheduling delay around 10 seconds while reducing the energy consumption by 18.5%. Figure 15 shows the actual energy cost per hour, taking into consideration the fluctuation of the electricity price. It can be seen that our dynamic capacity provisioning mechanism saves 7 dollars per hour in energy costs, which implies a 20% reduction in energy cost, while achieving the desired scheduling delay (for R = 0.225). Furthermore, depending on the desired average scheduling delay (Figure 13), our proposed approach can reduce the total operational cost by about 18.5-50% (Figure 15).

Figure 13: Average queuing delay.
Figure 14: Energy consumption.
Figure 15: Average cost per hour.

8. CONCLUSION
Data centers have become a cost-effective infrastructure for data storage and for hosting large-scale service applications. However, large data centers today consume significant amounts of energy. This not only raises the operational expenses of cloud providers, but also raises environmental concerns with regard to minimizing the carbon footprint. In this paper, we mitigate this concern by designing a dynamic capacity provisioning system that controls the number of active servers in the data center according to (1) demand fluctuation, (2) variability in energy prices and (3) the cost of dynamic capacity reconfiguration. Our solution is based on the well-established Model Predictive Control framework, and aims to find a good trade-off between energy savings and capacity reconfiguration cost. Simulations using real traces obtained from a production Google compute cluster demonstrate that our approach achieves a considerable reduction in energy cost. As such, we believe our approach represents an initial step towards building a full-fledged capacity management framework for cloud data centers.

There are several promising directions we can pursue in the future. First, our current approach assumes that machines are homogeneous. While this is applicable to many situations (cloud providers often buy large quantities of identical machines in bulk), recent literature suggests that production data centers often consist of multiple types (sometimes multiple generations) of machines. Extending our current solution to handle machine heterogeneity requires careful consideration of the scheduling capability of each type of machine. Another interesting problem is to understand the interplay between the scheduler and the capacity controller. We believe it is possible to further reduce the cost of energy consumption and reconfiguration (i.e., task preemption and migration cost) if the scheduler and the capacity controller can cooperate tightly at a fine-grained level (e.g., interaction of server consolidation algorithms with our capacity controller).
ACKNOWLEDGEMENTS
This work was supported in part by a Google Research Award, by the Natural Science and Engineering Council of Canada (NSERC) SAVI Research Network, and by the World Class University (WCU) Program under the Korea Science and Engineering Foundation funded by the Ministry of Education, Science and Technology (Project No. R31-2008-000-10100-0).
REFERENCES
[1] Energy Star computers specification, Feb. 14, 2012. Draft 1 Version 6.0 Specification.pdf.
[2] Eucalyptus community. http://open.eucalyptus.com/.
[3] Google cluster-usage traces: format + schema.
[4] Googleclusterdata - traces of Google workloads.
[5] Technology research - Gartner Inc.
[6] U.S. Energy Information Administration (EIA).
[7] Z. Abbasi, G. Varsamopoulos, and S. K. S. Gupta. Thermal aware server provisioning and workload distribution for Internet data centers. In Proceedings of the ACM International Symposium on High Performance Distributed Computing (HPDC), 2010.
[8] C. Bash, C. Patel, and R. Sharma. Dynamic thermal management of air cooled data centers. In IEEE Intersociety Conference on the Thermal and Thermomechanical Phenomena in Electronics Systems (ITHERM), 2006.
[9] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel. Time Series Analysis, Forecasting, and Control. Prentice-Hall, third edition, 1994.
[10] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, USA, 2004.
[11] G. Chen, W. He, J. Liu, S. Nath, L. Rigas, L. Xiao, and F. Zhao. Energy-aware server provisioning and load dispatching for connection-intensive Internet services. In USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2008.
[12] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 51(1), 2008.
[13] P. A. Dinda. Design, implementation, and performance of an extensible toolkit for resource prediction in distributed systems. IEEE Trans. Parallel Distrib. Syst., 17, February 2006.
[14] X. Fan, W.-D. Weber, and L. A. Barroso. Power provisioning for a warehouse-sized computer. In Proceedings of the annual international symposium on Computer architecture (ISCA), 2007.
[15] Y. Fu, C. Lu, and H. Wang. Robust control-theoretic thermal balancing for server clusters. In IEEE International Symposium on Parallel Distributed Processing (IPDPS), April 2010.
[16] B. Guenter, N. Jain, and C. Williams. Managing cost, performance, and reliability tradeoffs for energy-aware server provisioning. In IEEE INFOCOM, April 2011.
[17] A. Gulati, A. Holler, M. Ji, G. Shanmuganathan, C. Waldspurger, and X. Zhu. VMware distributed resource management: Design, implementation, and lessons learned. In VMware Technical Journal, 2012.
[18] D. Kusic, J. O. Kephart, J. E. Hanson, N. Kandasamy, and G. Jiang. Power and performance management of virtualized computing environments via lookahead control. In Proceedings of the International Conference on Autonomic Computing (ICAC), 2008.
[19] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs. Cutting the electric bill for Internet-scale systems. In ACM SIGCOMM Computer Communication Review, volume 39, 2009.
[20] R. Raghavendra, P. Ranganathan, V. Talwar, Z. Wang, and X. Zhu. No power struggles: Coordinated multi-level power management for the data center. In ACM SIGARCH Computer Architecture News, volume 36. ACM, 2008.
[21] B. Sharma, V. Chudnovsky, J. Hellerstein, R. Rifaat, and C. Das. Modeling and synthesizing task placement constraints in Google compute clusters. In Proceedings of ACM Symposium on Cloud Computing, 2011.
[22] Z. Shen, S. Subbiah, X. Gu, and J. Wilkes. Cloudscale: Elastic resource scaling for multi-tenant cloud systems. In Proceedings of the ACM Symposium on Cloud Computing, 2011.
[23] A. Verma, P. Ahuja, and A. Neogi. pMapper: power and migration cost aware application placement in virtualized systems. In ACM/IFIP/USENIX Middleware, 2008.
[24] A. Verma, G. Dasgupta, T. Nayak, P. De, and R. Kothari. Server workload analysis for power minimization using consolidation. In Proceedings of the USENIX Annual Technical Conference. USENIX Association, 2009.
[25] G. von Laszewski, L. Wang, A. Younge, and X. He. Power-aware scheduling of virtual machines in DVFS-enabled clusters. In IEEE International Conference on Cluster Computing and Workshops (CLUSTER), 2009.
[26] X. Wang and M. Chen. Cluster-level feedback power control for performance optimization. In IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2008.
[27] Q. Zhang, J. Hellerstein, and R. Boutaba. Characterizing task usage shapes in Google's compute clusters. In Workshop on Large Scale Distributed Systems and Middleware (LADIS), 2011.