1 Introduction

Cloud computing is a large-scale distributed computing paradigm supported by state-of-the-art data centers, in which a pool of computing resources is available to users via the Internet [1]. In recent years, increasing demand for computational resources has led to significant growth in the number of physical servers, along with a near doubling of the energy consumed by these servers and the cooling infrastructures that support them [2]. Their share of power consumption is approximately between 1.1% and 1.5% of the total electricity used worldwide and is projected to rise even further [3]. High energy consumption gives rise to large operational costs, which can exceed the construction cost of servers and infrastructure within a short period [4]. Data center providers are going to great lengths to minimize their cooling costs, with some providers locating their data halls in cold climates [5, 6]. This indicates the need for cloud service providers to adopt energy-efficient resource management approaches to ensure that their profit margins are not dramatically reduced by high energy costs [7].

Virtualization technology, which is widely adopted in modern cloud data centers (CDCs), allows cloud providers to create multiple VMs on a single physical server (PS) and provides performance isolation between applications running on different VMs [8]. By using live migration [9, 10], VMs distributed over multiple low-utilization servers can be dynamically consolidated onto fewer PSs, and the idle servers can be switched to low-power modes to reduce energy consumption. This approach is known as VM consolidation and is seen as an efficient way to cut the electricity cost of CDCs [11, 12]. However, VM migration may degrade the performance of the migrated VM, or even temporarily interrupt its service [13]. At the same time, this process also increases data-transmission overhead and results in additional energy consumption in a CDC. All these negative influences constitute the migration cost (MC) caused by VM migration.

Note that different types of VMs occupy different amounts of physical resources (such as compute, memory, storage and bandwidth) and execute different applications, so the migration costs caused by heterogeneous VMs differ widely [14]. Although the MC of a single migrated VM is relatively low, the overall MC of a modern CDC is considerable, because frequent VM migrations may be needed in its daily management [15]. Sometimes, the MC is even higher than the energy savings obtained by VM consolidation. Therefore, the migration cost caused by VM consolidation has become an important cost factor in CDCs and can no longer be ignored.

In addition, another important factor, the remaining runtime of the VMs to be migrated, also affects the efficiency of VM consolidation. For example, if a VM with a short remaining runtime is selected for migration, the MC of migrating the VM could be larger than the cost saving obtained by consolidating it.

From the above discussion, it can be seen that migration cost and the remaining runtime of migrated VMs are two important aspects that can significantly influence the efficiency of VM consolidation. Both aspects should be considered and properly addressed so that a more comprehensive VM consolidation algorithm for CDCs can be developed, which would help decrease the energy consumption and electricity cost of cloud providers and lead to a sustainable and green cloud system [15].

Many efforts have been devoted to VM consolidation in CDCs [2, 3, 10, 11, 12, 13] (details are given in Sect. 2). In the literature, the most common methods select the VMs to be migrated according to the number of migrations [2, 16], the resources occupied by VMs [4, 17], or load balancing [17, 18]. Migration cost and remaining runtime, two important restraining aspects of VM consolidation, are still not well addressed. To the best of our knowledge, none of the existing VM consolidation algorithms takes the remaining runtime into account, and only a few research works on VM consolidation consider MC, which deserves further research. Therefore, in this paper, we investigate the problem of MC-aware VM consolidation in CDCs by considering the influence of these two important aspects. The studied problem is formulated as a multi-constraint optimization model. Then, a multi-step heuristic algorithm, called the MC-aware VM consolidation (MVC) algorithm, is developed. Finally, extensive experimental studies have been conducted based on a real-world cloud trace. Experimental results show that, compared with some popular algorithms (i.e., the algorithms presented in [17, 19]), the proposed MVC algorithm can effectively decrease the migration cost and, at the same time, keep the energy consumption at a low level, at the cost of a slight increase of 0.12 percentage points in SLA violation.

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 presents the details of the MC-aware VM consolidation problem and its formulation. The developed MVC algorithm is described in Sect. 4. Simulations, results and analyses are given in Sect. 5. Finally, we conclude the paper and outline future work in Sect. 6.

2 Related Work

Extensive attention has been paid to energy-aware resource management in cloud computing environments. VM consolidation is seen as an efficient way to improve resource utilization and reduce the electricity cost of CDCs. Recently, many studies have explored VM consolidation in CDCs and several VM consolidation methods have been proposed. For example, Beloglazov and Buyya [7] introduced a modified best fit decreasing (MBFD) algorithm, which first sorts the VMs in decreasing order and the PSs in increasing order of CPU processing capacity and then dynamically migrates VMs to PSs with suitable resources using a first fit decreasing strategy. Sharma et al. [20] proposed a hybrid approach, HGAPSO, to minimize energy consumption by reducing resource wastage in a CDC. Perumal and Subbiah [21] proposed a worst fit heuristic VM placement algorithm to place VMs on PSs and a VM consolidation algorithm to improve power conservation, while Mastroianni et al. [22] developed a probability-based VM consolidation algorithm, ecoCloud, for self-organizing clouds. Marotta et al. [23] developed a simulated annealing based algorithm to solve the VM consolidation problem by evaluating the attractiveness of possible VM migrations. Cui et al. [24] studied the policy-based VM consolidation problem in a multi-tier tree-topology data center network and proposed a synergistic scheme to jointly consolidate network policies and VMs.

Guo et al. [13] and Zhao et al. [25] further took the association between VMs into account. In [13], the authors proposed a VM migration algorithm based on a group selection method, which relies on the degree of connectivity of the partitioned VM groups. Zhao et al. [25] developed a power-aware and performance-guaranteed VM placement algorithm to balance energy savings against VM performance degradation in CDCs. Fioccola et al. [26] introduced an energy-aware resource orchestration framework and a green-migration-based VM consolidation algorithm to minimize the overall power consumption while satisfying users' QoS requirements.

Moreover, Gutierrez-Garcia et al. [17] and Xu et al. [18] further took load balancing into account. In [17], a load-balancing and energy-aware consolidation protocol was proposed to consolidate heterogeneous VMs in a distributed manner, while in [18] the authors developed a greedy-based load-balancing consolidation algorithm to minimize the number of active PSs and, at the same time, balance the loads among them. Although these research works have shown their effectiveness in saving the energy consumption of CDCs, the algorithms presented above mostly ignore the impact of VM migrations.

Some studies aim to reduce energy consumption while keeping the number of migrations as small as possible. For example, Beloglazov et al. [7] investigated the problem of energy-aware resource allocation in cloud data centers and proposed several heuristic algorithms to reduce energy consumption via dynamic allocation of VMs. Ye et al. [27] tried to eliminate the performance degradation caused by VM co-location and migration. They proposed a profiling-based VM consolidation framework and a polynomial-time algorithm to minimize both the number of used PSs and the number of VM migrations. Wolke and Pfeiffer [28] adopted several well-known vector bin-packing heuristics (such as first fit and best fit) to address this issue. Tao et al. [29] further took the communications between VMs into account, formulated the dynamic VM migration problem as a triple-objective optimization model and designed a binary graph matching-based bucket-code learning algorithm. Mann [30] studied VM consolidation in multi-core cloud environments by also taking three aspects into account: the number of used PSs, the number of migrations and the number of overloaded CPUs. The author adopted a constraint programming method to optimize a weighted cost function of the three aspects. However, these methods estimate the migration cost only in terms of the number of VM migrations, without considering the inherent heterogeneity of VM migration costs [9, 14].

Only a few works consider migration cost in VM consolidation. For example, Wu et al. [19] investigated how to minimize energy cost at low migration cost by designing a consolidation score function, in which the two conflicting objectives are normalized and summed with different preference weights. They proposed an improved grouping genetic algorithm (IGGA) to maximize the consolidation score through dynamic VM consolidation.

From the above review of related work, it can be seen that many efforts have been devoted to the VM consolidation problem, and effective approaches have been proposed based on random selection migration (RSM), minimization of the number of migrations (MUoM), minimum migration time (MMT), resource-based migration (RbM), balance-based migration (BbM) and evolution-based algorithms [31, 32]. However, the impact of migration cost, an important concern in CDCs, has not been well addressed, and none of the existing approaches considers the impact of the remaining runtimes of migrated VMs [33]. To overcome these drawbacks, this paper studies the MC-aware VM consolidation problem and formulates it as a multi-constraint optimization model by considering the impact of migration cost and the remaining runtimes of migrated VMs.

3 MC-Aware VM Consolidation Problem

3.1 Problem Description

In a normal operation period (T) of a CDC, VM consolidation dynamically adjusts the mapping of VMs onto suitable PSs according to the dynamic nature of the CDC, such as the arrival of new VM requests, the completion of VMs and changes in the occupied resources of running VMs. Before presenting the details of the studied MC-aware VM consolidation problem, some definitions need to be introduced.

Definition 1

Time Slot (TS). Without loss of generality, assume that the whole normal operation period of a cloud data center is T. As shown in Fig. 1, the whole period of time T can be divided into |T| equal time units. The length of each time unit is t. One time unit is called a time slot, which can be seen as the essential process unit of time, and thus T can be represented by T = {1, 2, …, |T|}.

Fig. 1 The division of the whole normal operation period of a cloud data center, T

Definition 2

Cloud Data Center (CDC). Without loss of generality, suppose that a CDC consists of M heterogeneous PSs, denoted by PS = {PS1, PS2, …, PSj, …, PSM}, in which PSj (1 ≤ j ≤ M) represents the j-th PS. Denote by Ωj and Γj the number of CPU cores and the memory size provided by physical server PSj, respectively. In a certain time slot t (1 ≤ t ≤ |T|), if there is at least one VM running on a PS, we call that server an active server. All the active servers in time slot t constitute an active server set, represented by AS(t). Denote by m(t) the number of active servers in AS(t). Thus, AS(t) and m(t) satisfy the following relations: AS(t) \( \subseteq \) PS and m(t) ≤ M.

Definition 3

Virtual Machine (VM). All the VM requests arriving at the CDC during the whole period T constitute a VM set V. Denote by N = |V| the number of VM requests in V. All the VMs running in time slot t constitute a running VM set V(t). Denote by n(t) the number of VMs in V(t). Thus, V(t) and n(t) satisfy the following relations: V(t) \( \subseteq \) V and n(t) ≤ N. Let Vi represent the i-th VM in V(t), so Vi ∈ V(t) and \( 1 \le i \le n(t) \). Denote by ωit and γit the number of CPU cores and the amount of memory required by VM request Vi during time slot t, respectively. According to some real-life cloud environments [34], the occupied resources of VMs are strongly dynamic. Therefore, in order to improve applicability, this paper deals with the scenario in which the occupied resource capacities of VM requests vary across time slots, which means that the values of ωit and γit change over time. Denote by at(Vi) and rt(Vi) the arrival time and runtime of VM request Vi, respectively; the running VM set V(t) in time slot t can then be formally described as:

$$ {\mathbf{V}}(t) = \left\{ {V_{i} \mid at(V_{i} ) < t\;{\text{and}}\;at(V_{i} ) + rt(V_{i} ) > t} \right\}. $$
(1)

where expression at(Vi) < t means that VM request Vi is submitted to the CDC by a certain cloud user before time slot t and expression at(Vi) + rt(Vi) > t means that the execution of VM request Vi is not completed before time slot t.
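To make the definition concrete, the following minimal Python sketch builds V(t) from Eq. (1); the class and attribute names (VMRequest, arrival_time, runtime) are illustrative assumptions, not identifiers from the paper.

```python
from dataclasses import dataclass

@dataclass
class VMRequest:
    vm_id: int
    arrival_time: int   # at(V_i), measured in time slots
    runtime: int        # rt(V_i), measured in time slots

def running_vms(all_requests, t):
    """Return V(t): requests submitted before slot t and not yet finished (Eq. (1))."""
    return [v for v in all_requests
            if v.arrival_time < t and v.arrival_time + v.runtime > t]
```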

Definition 4

Hot Servers (HSs). For a certain PS, if the sum of the occupied resources of all the VMs running on it exceeds its capacity for any kind of resource (such as CPU cores, memory, etc.), we call that PS a hot server (HS). Once a physical server becomes an HS, it inevitably violates the service level agreement (SLA), in which the cloud provider guarantees the negotiated quality of service (QoS) to cloud users. This situation results in a loss of profit for the cloud provider. To limit the duration of SLA violation, one or more VMs running on the HS should be migrated to other PSs with adequate resources.

Definition 5

VM Mapping Policy (VMP). For any time slot t (1 ≤ t ≤ |T|), a mapping of the VMs in V(t) to the PSs in PS(t) is a VMP in time slot t, which can be denoted by a matrix X(t) = (xij)n(t)×m(t). If VM request Vi is running on physical server PSj, then xij = 1; otherwise, xij = 0. n(t) and m(t) are the numbers of running VMs and active PSs in the CDC during time slot t, respectively.

Definition 6

States of PSs. Suppose that each PS in the CDC can only be in a normal-power state or a low-power state. For any physical server PSj \( \in \) PS, if there is at least one VM running on the server, then PSj is an active server and is in the normal-power state; otherwise, PSj is in the low-power state. Denote by sj(t) (1 ≤ j ≤ M) the state of physical server PSj during time slot t. The states of all the PSs in the CDC during time slot t constitute an M-dimensional state vector S(t) = (s1(t), s2(t), …, sM(t)), where sj(t) = 1 indicates that physical server PSj is in the normal-power state and sj(t) = 0 indicates that it is in the low-power state. S(t) can be calculated from the obtained VM mapping policy X(t), so we have

$$ s_{j} (t) = \left\{ {\begin{array}{*{20}c} {1,} & {\sum\nolimits_{i = 1}^{n(t)} {x_{ij} \ge 1;} } \\ {0,} & {{\text{otherwise}}.} \\ \end{array} } \right. $$
(2)
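As an illustration of Eq. (2), the sketch below derives the state vector S(t) from a mapping matrix X(t) represented as a list of rows (one row per VM); this is a hypothetical helper, not the authors' code.

```python
def server_states(x, num_servers):
    """s_j(t) = 1 if at least one VM is mapped to PS_j, else 0 (low-power state)."""
    return [1 if any(row[j] == 1 for row in x) else 0 for j in range(num_servers)]

# Example: two VMs both placed on PS_0 and none on PS_1 gives S(t) = [1, 0].
# server_states([[1, 0], [1, 0]], 2)
```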

From the above definitions, the MC-aware VM consolidation (MVC) problem can be formally described as follows: for any time slot t (1 ≤ t ≤ |T|), all the running VMs in the CDC constitute a set V(t) with n(t) = |V(t)|; all the active PSs constitute a set PS(t) with m(t) = |PS(t)|; the current VMP is X(t) = (xij)n(t)×m(t). The MVC problem is then how to dynamically adjust the mapping of the VMs in V(t) so as to minimize the number of PSs in the normal-power state, and thereby the energy consumption of the CDC, under the given constraints and according to the dynamic characteristics of the CDC. These dynamic characteristics include the arrival of new VM requests, changes in the resources required by VMs and the completion of VM requests.

3.2 Optimization Model

VM consolidation is generally used to optimize resource usage and reduce the energy consumption of CDCs. Reducing the number of active PSs in a cloud data center while serving the same set of VM requests with similar performance is highly attractive to cloud providers. Similar to other research on VM consolidation [2, 11, 16, 17, 18, 19], this paper uses the number of active PSs as the main metric to measure the energy consumption of a cloud data center. Different from existing studies, we investigate the MVC problem over the whole normal operation period of a CDC, by conducting VM consolidation in each time slot according to the dynamic nature of the data center. The studied problem is therefore formulated as follows:

$$ {\text{Min}}\;\sum\limits_{j = 1}^{M} {s_{j} (t)} \quad \forall t \in \{ 1,2, \ldots ,|T|\} . $$
(3)

The optimization objective is to minimize the number of active physical servers in each time slot. Besides energy optimization, we also consider the other two important influencing factors (i.e., migration cost and remaining runtimes) in MVC and the proposed optimization model is subject to following constraints:

C1). Range of variables: the values of decision variables xij and sj can only be 0 or 1. Formally:

$$ \forall V_{i} \in {\mathbf{V}}(t),\quad \forall PS_{j} \in {\mathbf{PS}}(t)\,:\,x_{ij} \in \{ 0,1\} \;{\text{and}}\;s_{j} (t) \in \{ 0,1\} . $$
(4)

C2). VM mapping constraint: each VM request can only be allocated to exactly one PS to execute. Formally:

$$ \forall V_{i} \in {\mathbf{V}}(t):\quad \sum\limits_{j = 1}^{m(t)} {x_{ij} = 1} . $$
(5)

C3). Resource capacity constraint: the total CPU/memory requirements of VMs allocated on a physical server should not exceed its CPU/memory capacity. Formally:

$$ \forall PS_{j} \in {\mathbf{PS}}(t):\quad \sum\limits_{i = 1}^{n(t)} {x_{ij} \cdot \omega_{i} (t) \le \varOmega_{j} } \;{\text{and}}\;\sum\limits_{i = 1}^{n(t)} {x_{ij} \cdot \gamma_{i} (t) \le \varGamma_{j} } . $$
(6)

C4). SLA violation percentage (SVP) constraint: the duration for which a PS is allowed to violate the SLA must not exceed one time slot. Because the occupied resources vary over time, the total resource requirements of the VMs allocated to a physical server may exceed its resource capacity and violate the SLA. To reduce the SLA violation time, one or more VMs running on the hot server need to be migrated to other physical servers with adequate resources. Denote by SLAj(t) whether physical server PSj violates the SLA during time slot t; it is given by:

$$ SLA_{j} (t) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {\sum\nolimits_{i = 1}^{n(t)} {x_{ij} \cdot \omega_{i} (t) > \varOmega_{j} } \;{\text{or}}\;\sum\nolimits_{i = 1}^{n(t)} {x_{ij} \cdot \gamma_{i} (t) > \varGamma_{j} } ;} \hfill \\ {0,} \hfill & {{\text{otherwise}} .} \hfill \\ \end{array} } \right. $$
(7)

To improve the QoS, the duration of SLA violation must be kept as short as possible during the whole operation period T. Therefore, the proposed optimization model requires that the sustained SLA violation time of a PS must not be longer than one time slot.

C5). Remaining runtime constraint: as analyzed in Sect. 1, the MC of a VM with a short remaining runtime could be larger than the cost saving obtained by consolidating it. To avoid unnecessary VM migrations, the proposed optimization model requires that VMs whose remaining runtimes are less than one time slot are not migrated.

Overall, the studied MC-aware VM consolidation problem is formulated as an optimization model with multiple constraints, i.e., minimizing the number of active PSs in each time slot (Eq. (3)) subject to constraints C1)–C5). From the first two constraints, it can be seen that this problem is a variant of the combinatorial optimization problem. Due to the NP-hardness of the studied problem, approximation algorithms that find a near-optimal solution are more promising. Therefore, we propose a heuristic algorithm, which is presented in the next section.
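For illustration, the following sketch checks constraints C2) and C3) for a candidate mapping matrix; the argument names (cpu_req, mem_req, cpu_cap, mem_cap) are assumptions introduced here, and the check is not the authors' implementation.

```python
def is_feasible(x, cpu_req, mem_req, cpu_cap, mem_cap):
    """Return True iff mapping x satisfies constraints C2) and C3)."""
    n = len(x)
    m = len(x[0]) if x else 0
    # C2: every VM is mapped to exactly one physical server.
    if any(sum(x[i]) != 1 for i in range(n)):
        return False
    # C3: aggregate CPU/memory demand on each server stays within capacity.
    for j in range(m):
        used_cpu = sum(x[i][j] * cpu_req[i] for i in range(n))
        used_mem = sum(x[i][j] * mem_req[i] for i in range(n))
        if used_cpu > cpu_cap[j] or used_mem > mem_cap[j]:
            return False
    return True
```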

4 MC-Aware VM Consolidation (MVC) Algorithm

In this section, we present the details of the developed MVC algorithm, which consists of four steps, i.e., pre-processing step, hot server removing step, VM placement step and VM consolidation step.

4.1 Pre-Processing (PP) Step

In this step, the MVC algorithm mainly addresses two issues: first, for each VM that was running in the previous time slot, it determines whether the VM has completed; if so, the algorithm releases the VM's occupied physical resources. Second, it adjusts the occupied resources of the VMs running in the current time slot t according to their real-time resource requirements. These two tasks are realized by a pre-processing (PP) function whose pseudo-code is shown in Algorithm 1. The inputs of the PP function are the current time slot t and the set of VMs running in time slot (t − 1), V(t − 1). The output is the set of VMs running in time slot t, V(t). The detailed process of the PP function is as follows:

Algorithm 1. Pseudo-code of the pre-processing (PP) function

First, before the start of each time slot t, the PP function frees the occupied resources of the VMs completed in time slot (t − 1) and shuts down the completed VMs (lines 2–7, Algorithm 1). The condition in line 3, at(Vi) + rt(Vi) ≥ (t − 1) and at(Vi) + rt(Vi) < t, means that VM Vi was still running at the start of time slot (t − 1) and finished before the start of time slot t; that is, VM request Vi was completed during time slot (t − 1).

Second, every 5 time slots, the PP function updates the occupied resources of each running VM according to its current resource requirements (lines 8–12, Algorithm 1). The purpose of this step is to accommodate the dynamic nature of real-life cloud scenarios. In the Google cluster-usage traces (version 2), one of the popular real-life cloud traces reported by Google Inc. in 2014, the occupied resources of each VM change every 5 min. Therefore, in the developed algorithm, we also dynamically change the required resources of the running VMs every five time slots.
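A minimal Python sketch of the PP step is given below; the helpers vm.host.release() and current_demand_of() are hypothetical stand-ins for the bookkeeping done in Algorithm 1.

```python
REFRESH_PERIOD = 5  # slots, mirroring the 5-minute update period of the trace

def pre_process(t, running_vms, current_demand_of):
    """Free VMs completed in slot t-1 and refresh demands every 5 slots (Algorithm 1 sketch)."""
    still_running = []
    for vm in running_vms:
        finish = vm.arrival_time + vm.runtime
        if (t - 1) <= finish < t:        # completed during time slot (t - 1)
            vm.host.release(vm)          # hypothetical helper: free CPU/memory on the host
        else:
            still_running.append(vm)
    if t % REFRESH_PERIOD == 0:          # periodic demand update
        for vm in still_running:
            vm.cpu, vm.mem = current_demand_of(vm, t)
    return still_running                 # V(t)
```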

4.2 Hot Server Removing (HS-Removing) Step

After the PP step, the MVC algorithm enters the HS-Removing step, whose main task is to reduce the resource utilization of hot servers by migrating one or more VMs away from them. This task is realized by the HS-Removing function (shown in Algorithm 2). The inputs of the HS-Removing function are: the current time slot t, the set of running VMs V(t), the set of active physical servers PS(t) in time slot t, and the current VM mapping policy X(t). The output is a new VM mapping policy X′(t) after removing the hot servers in time slot t.

In the HS-Removing function, two important factors, MC and remaining runtime, are taken into account when choosing the VMs to migrate. The MC of VM Vi includes four parts, i.e., migration time Tmig(Vi), downtime Tdown(Vi), migration energy Emig(Vi) and the performance degradation caused by migrating VM Vi [2, 9, 13, 14]. The fourth part is modeled as an increase of the migrated VM's runtime by 10% of its migration time, because the average performance degradation during migration can be estimated as approximately 10% of the CPU utilization [2]. The other three parts differ in both the unit and the scale of their values. Thus, in order to find the VM with the minimal MC on the hot server, the HS-Removing function defines a cost factor, given in Definition 7.

Definition 7

Cost Factor (CF). The cost factor of migrating a VM is a weighted normalization value of migration time, downtime and migration energy. If physical server PSj violates SLA in time slot t, denote by Vj(t) the VMs running on PSj. For each VM Vi in Vj(t), the cost factor of migrating VM Vi, denoted as CFmig(Vi), can be given by:

$$ CF_{mig} (V_{i} ) = \alpha \cdot \frac{{T_{mig} (V_{i} )}}{{\mathop {\hbox{max} }\limits_{{V_{i} \in {\mathbf{V}}_{j} (t)}} \{ T_{mig} (V_{i} )\} }} + \beta \frac{{T_{down} (V_{i} )}}{{\mathop {\hbox{max} }\limits_{{V_{i} \in {\mathbf{V}}_{j} (t)}} \{ T_{down} (V_{i} )\} }} + \gamma \frac{{E_{mig} (V_{i} )}}{{\mathop {\hbox{max} }\limits_{{V_{i} \in {\mathbf{V}}_{j} (t)}} \{ E_{mig} (V_{i} )\} }}. $$
(8)

In Eq. (8), the terms \( \mathop {\hbox{max} }\limits_{{V_{i} \in {\mathbf{V}}_{j} (t)}} \{ T_{mig} (V_{i} )\} \), \( \mathop {\hbox{max} }\limits_{{V_{i} \in {\mathbf{V}}_{j} (t)}} \{ T_{down} (V_{i} )\} \) and \( \mathop {\hbox{max} }\limits_{{V_{i} \in {\mathbf{V}}_{j} (t)}} \{ E_{mig} (V_{i} )\} \) represent the maximum migration time, downtime and migration energy over all the VMs in Vj(t), respectively. α, β and γ are weighting coefficients of the three aspects and α + β + γ = 1.
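The cost factor of Eq. (8) can be sketched as follows for the VMs on a hot server; the default weights match the values used in Sect. 5, and the sketch assumes all per-server maxima are positive.

```python
def cost_factors(t_mig, t_down, e_mig, alpha=0.3, beta=0.3, gamma=0.4):
    """Weighted, normalized migration cost CF_mig for each VM on a hot server (Eq. (8))."""
    max_t, max_d, max_e = max(t_mig), max(t_down), max(e_mig)  # per-server maxima
    return [alpha * tm / max_t + beta * td / max_d + gamma * em / max_e
            for tm, td, em in zip(t_mig, t_down, e_mig)]

# The VM with the smallest cost factor is the preferred migration candidate.
```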

Algorithm 2. Pseudo-code of the HS-Removing function

The detailed processes of the HS-Removing function are as follows:

Initially, the values of elements in X′(t) are set as the values of corresponding elements in X(t).

Then, for each physical server PSj in PS(t), the HS-Removing function checks whether PSj is a hot server according to Eq. (9). If so, it executes the following procedure (lines 3–21, Algorithm 2): first, among all the VMs running on PSj, the HS-Removing function finds the VMs that will complete during time slot t and checks whether PSj can resolve the SLA violation once these VMs free their occupied resources; if so, the HS-Removing function can satisfy the SVP constraint without any VM migration and continues with the next PS (lines 4–10, Algorithm 2); otherwise, the HS-Removing function selects the VM that has the minimum MC and whose migration satisfies the SLA violation constraint, and migrates it to another PS with adequate resource capacity (lines 11–20, Algorithm 2).

$$ \sum\limits_{i = 1}^{n(t)} {x_{ij} \cdot \omega_{i} (t) > \varOmega_{j} } \quad {\text{or}}\quad \sum\limits_{i = 1}^{n(t)} {x_{ij} \cdot \gamma_{i} (t) > \varGamma_{j} } . $$
(9)

In line 18 of Algorithm 2, the PS_Selection(Vk) function is used to select a destination PS for VM Vk, which needs to be migrated. The pseudo-code of the function is shown in Algorithm 3; its inputs are VM Vk and PS(t), and its output is the selected destination server PSj. The detailed process of the PS_Selection function is as follows: the function first tries to select the destination server for VM Vk from the active server set PS(t) (lines 1–6, Algorithm 3). If no suitable PS exists in PS(t) to accommodate Vk, the function starts a new PS (lines 7–11, Algorithm 3).

Algorithm 3. Pseudo-code of the PS_Selection function
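The idea of PS_Selection can be sketched as below; the helpers residual_cpu(), residual_mem() and start_new_server() are assumed names, not the paper's pseudo-code.

```python
def select_destination(vm, active_servers, start_new_server):
    """Pick a destination PS for a migrating VM: reuse an active server if possible."""
    for ps in active_servers:
        if ps.residual_cpu() >= vm.cpu and ps.residual_mem() >= vm.mem:
            return ps                    # first active server with enough headroom
    ps = start_new_server()              # no active server fits: power on a new one
    active_servers.append(ps)
    return ps
```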

4.3 VM Placement Step

Denote by Vin(t) the set of VM requests submitted to the CDC by users during time slot t. For each VM request Vi in Vin(t), its arrival time satisfies t ≤ at(Vi) < (t + 1). The VM placement step is invoked at the start of time slot t and its main task is to deploy all the new VM requests submitted during time slot (t − 1) to suitable PSs. The algorithm of this step is the VM placement (VMP) function, whose pseudo-code is shown in Algorithm 4. The inputs are the current time slot t, Vin(t − 1), PS(t) and X(t). The output is a new VM mapping policy X′(t) after deploying the VM requests in Vin(t − 1) onto the PSs in PS(t). The detailed process of the VMP function is as follows:

Initially, the values of the elements in X′(t) are set to the values of the corresponding elements in X(t) (line 1, Algorithm 4).

Then, VMP function sorts the active PSs in ascending order on the basis of the sum of residual CPU and memory capacity, such as PS1, PS2, …, PSm(t) (line 2, Algorithm 4).

Third, for each new VM request Vk, the VMP function first checks whether there is a suitable PS in the active server set PS(t) to host Vk (line 4, Algorithm 4). If so, the VMP function deploys VM request Vk on the selected physical server PSj and adds Vk to the running VM set V(t) (lines 5–10, Algorithm 4). Otherwise, the VMP function starts a new physical server and deploys VM request Vk on the newly started server (lines 12–18, Algorithm 4).

Algorithm 4. Pseudo-code of the VM placement (VMP) function
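A sketch of the placement logic is shown below; sorting the active servers by ascending residual capacity makes the first fitting server the fullest feasible one, i.e. a best-fit-style choice. The helper names are assumptions introduced here.

```python
def place_new_vms(new_vms, active_servers, start_new_server):
    """Deploy VM requests arrived in slot (t - 1) onto active servers (Algorithm 4 sketch)."""
    # Ascending residual CPU + memory: the fullest feasible server is tried first.
    active_servers.sort(key=lambda ps: ps.residual_cpu() + ps.residual_mem())
    for vm in new_vms:
        host = next((ps for ps in active_servers
                     if ps.residual_cpu() >= vm.cpu and ps.residual_mem() >= vm.mem),
                    None)
        if host is None:                 # no active server can host the VM
            host = start_new_server()
            active_servers.append(host)
        host.add(vm)
```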

4.4 VM Consolidation (VMC) Step

In the VMC step, the proposed MVC algorithm adopts a threshold-based approach, which has been proven effective [7, 20, 35]. The main task of this step is to consolidate the VMs running on under-utilized physical servers. Denote by SUMTmig(PSj) the overall migration time of physical server PSj, which equals the sum of the migration times of all the VMs running on PSj. Thus, we have:

$$ SUMT_{mig} (PS_{j} ) = \sum\limits_{{V_{i} \in {\mathbf{V}}_{j} (t)}} {T_{mig} (V_{i} )} . $$
(10)

The inputs of the VMC function include the defined lower utilization threshold ULower, PS(t) and X(t). The output is a new VM mapping policy X′(t) after VM consolidation in time slot t. The pseudo-code of the VMC function is shown in Algorithm 5. When there exists a physical server PSj in PS(t) whose sum of CPU and memory utilization is lower than ULower, the VMC function executes the following procedure (lines 2–16, Algorithm 5): first, it calculates the overall migration time SUMTmig(PSj) according to Eq. (10). Second, for each VM Vi running on PSj, the VMC function checks whether the remaining runtime of Vi is shorter than SUMTmig(PSj). If so, Vi will complete within SUMTmig(PSj) and there is no need to migrate it; the function therefore leaves Vi running on physical server PSj until it completes (lines 5–7, Algorithm 5). Otherwise, the VMC function migrates Vi to a suitable physical server (lines 8–11, Algorithm 5). Finally, after processing all the VMs running on PSj, the VMC function switches physical server PSj to the low-power state if no VM is running on it (lines 14 and 15, Algorithm 5).

These processes continue until the resource utilizations of all the physical servers in PS(t) are higher than ULower.

Algorithm 5. Pseudo-code of the VM consolidation (VMC) function
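The consolidation decision for one under-utilized server can be sketched as follows; the server and VM methods are hypothetical, and the sum of CPU and memory utilization is compared against U_Lower as described above.

```python
def consolidate(ps, t, select_destination, u_lower=0.2):
    """Empty an under-utilized server, keeping VMs that finish within SUMT_mig (Algorithm 5 sketch)."""
    if ps.cpu_utilization() + ps.mem_utilization() >= u_lower:
        return                            # server is not under-utilized
    sum_t_mig = sum(vm.migration_time() for vm in ps.vms())   # Eq. (10)
    for vm in list(ps.vms()):
        remaining = vm.arrival_time + vm.runtime - t
        if remaining >= sum_t_mig:        # migrating is worthwhile
            dest = select_destination(vm)
            ps.remove(vm)
            dest.add(vm)
        # otherwise the VM finishes soon enough and stays on ps until completion
    if not ps.vms():                      # no VM left on the server
        ps.switch_to_low_power()
```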

We have now introduced the four steps of the developed MVC algorithm in detail. The overall flow of the MVC algorithm is as follows: at the start of each time slot t (1 ≤ t ≤ |T|), the MVC algorithm first executes the pre-processing function to free the occupied resources of the VMs completed in time slot (t − 1), and dynamically updates the occupied resources of the running VMs according to their current requirements in each time slot t for which t%5 = 0. After this update, some physical servers may become hot servers and violate the SLA. Once a physical server violates the SLA, the MVC algorithm calls the HS-Removing function to migrate one or more VMs from the hot server so as not to violate the SVP constraint of the proposed optimization model. Then the MVC algorithm invokes the VMP function to deploy the new VM requests, submitted by users during time slot (t − 1), on suitable physical servers. Finally, the MVC algorithm executes the VMC function to consolidate the VMs running on low-utilization servers and switches the idle ones to the low-power state to cut down the energy consumption of the cloud data center.
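Putting the four steps together, a per-slot driver loop might look like the following; state is a hypothetical container bundling the running VMs, active servers and helper callbacks, and remove_hot_server stands in for Algorithm 2.

```python
def mvc_step(t, state):
    """One time-slot iteration of the MVC algorithm (sketch of the overall flow)."""
    # Step 1: pre-processing - drop finished VMs, refresh demands every 5 slots.
    state.running = pre_process(t, state.running, state.current_demand_of)
    # Step 2: hot server removing - resolve SLA violations with cheap migrations.
    for ps in list(state.active_servers):
        if ps.is_hot():
            remove_hot_server(ps, t, state)      # assumed stand-in for Algorithm 2
    # Step 3: place the VM requests that arrived during slot (t - 1).
    place_new_vms(state.arrived_in(t - 1), state.active_servers,
                  state.start_new_server)
    # Step 4: consolidate under-utilized servers and switch idle ones off.
    for ps in list(state.active_servers):
        consolidate(ps, t, state.select_destination)
```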

5 Performance Evaluation

A series of experimental studies has been conducted to verify the feasibility of the proposed optimization model and the performance of the proposed MVC algorithm. The experimental configuration and the analysis of the results are presented in the following sections.

5.1 Experimental Configurations

Our experiments are conducted with the CloudSim toolkit [36], a modern and extensible cloud simulation framework widely adopted in related research. The cloud infrastructure and VM requests used in the experiments are extracted from a real-life cloud scenario, the Google cluster-usage traces [34]. The traces contain the records of a cluster of about 12.5 k machines over roughly a month. From these, we choose 29,944 task records from the task event tables generated in the first 12 h and 1046 physical machine records from a machine event table. Each task record, which can be seen as a VM request, contains the attributes arrival time, runtime, and dynamic CPU and memory requests that change every 5 min. The last two attributes are normalized to values between 0 and 1, where 1 represents the maximum CPU/memory request in the task event table. Each machine record can be seen as a physical server and contains the attributes machine ID, state, and CPU and memory capacity, which are also normalized to values between 0 and 1. These normalized values are used directly in the experiments because the CPU/memory capacities of the physical servers are about 4–10 times as large as the CPU/memory requests. The total simulation time is 12 h and the time slot defined in Definition 1 is set to 1 min. Thus the whole normal operation period (12 h) of the simulated cloud data center is divided into 720 time slots. The three weighting coefficients in Eq. (8), i.e., α, β and γ, are set to 0.3, 0.3 and 0.4, respectively. The lower utilization thresholds of CPU and memory are both set to 0.2.

5.2 Analyses of Results

To evaluate the performance of the developed MVC algorithm, we compare it with two recent and representative VM consolidation algorithms, i.e., the energy-aware server consolidation protocol (ESCP) [17] and the improved grouping genetic algorithm (IGGA) [19], on three categories of metrics: energy consumption, VM migration and SLA violation percentage (SVP). To reduce the influence of random factors, each experiment has been run 10 times and the results presented in this paper are the mean values over the 10 runs.

5.2.1 Energy Consumption

In this aspect, we mainly compare two energy-consumption-related metrics, i.e., the number of active physical servers in each time slot and the total number of active physical servers over the whole experimental period of 12 h.

Active physical servers are the physical servers used in each time slot. The smaller the number of active physical servers, the lower the energy consumption of the cloud data center. As shown in Fig. 2, to execute the same set of VM requests submitted by users, the number of physical servers used by the proposed MVC algorithm is smaller than that of the ESCP algorithm and slightly larger than that of the IGGA algorithm. More specifically, Table 1 shows the total number of physical servers used during the whole experimental period. The total number of physical servers used by the MVC algorithm is 7.5% smaller than that of the ESCP algorithm and 2.7% larger than that of the IGGA algorithm. This means that the IGGA algorithm and the proposed MVC algorithm are more effective at saving energy than the ESCP algorithm in the long run. The reason is as follows: for a VM to be allocated, the MVC algorithm always chooses the physical server with the minimal residual resource capacity that can still satisfy the VM's resource requirements, whereas the ESCP algorithm prefers the physical server with the lowest resource usage, which has a negative impact on the subsequent VM consolidation step. Neither of these two algorithms tolerates SLA violation, and both invoke the hot server removing function when a physical server becomes a hot server. However, IGGA allows physical servers to execute workloads beyond their processing capacities and thus needs the fewest physical servers to process the same set of VM requests.

Fig. 2 Number of used physical servers in each time slot

Table 1 Total no. of used physical servers during the whole experimental period (12 h)

In addition, another phenomenon shown in Fig. 2 is that the results obtained by the three compared algorithms fluctuate heavily across time slots. This is caused by the trend of new VM requests arriving at the data center. The number of VM requests submitted to the data center in each time slot is presented in Fig. 3. It can easily be seen that the number of active physical servers in each time slot follows almost the same trend as the newly arrived VM requests.

Fig. 3 Number of newly arrived VM requests in each time slot

5.2.2 VM Migration

The VM migration-related metrics are the total number of migrated VMs (TMV) over the whole experimental period, the total migration cost and the average migration cost. The total number of migrated VMs over the whole period equals the sum of the numbers of migrated VMs in each time slot, and likewise for the total migration cost. The average migration cost is the total migration cost divided by the total number of migrated VMs. These three metrics are mainly used to analyze the negative effects of the VM migrations performed by the different algorithms.

Table 2 shows the results of the three metrics obtained by the three compared algorithms during the whole experimental period. It can be seen that the values obtained by the MVC algorithm and the IGGA algorithm are nearly the same and that both algorithms perform well on all three metrics. Specifically, the TMV, total MC and average MC obtained by the MVC algorithm are 13.6%, 41.7% and 32.7% smaller, respectively, than those of the ESCP algorithm, which means that the proposed MVC algorithm keeps the impact on the performance of the cloud data center relatively low during VM consolidation. The reasons are as follows: first, when choosing VMs to be migrated in the HS-Removing step, the proposed MVC algorithm takes the remaining runtime into account; if the remaining runtime of a selected VM is less than one time slot, the MVC algorithm does not migrate it, which avoids some unnecessary migrations. Second, when a VM does have to be migrated, the proposed MVC algorithm chooses the VM with the lowest migration cost among the eligible VMs, whereas the ESCP algorithm selects the VM with the maximum occupied CPU/memory resources to avoid frequent SLA violations; as a result, the average MC obtained by the MVC algorithm is 44.6% lower than that obtained by the ESCP algorithm. Therefore, considering the remaining runtime and the migration cost when choosing the VMs to be migrated can effectively avoid unnecessary migrations and reduce the additional negative influences caused by VM migrations during consolidation.

Table 2 Results obtained by different algorithms during the whole period (12 h)

5.2.3 SLA Violation Percentage (SVP)

SVP is the percentage of SLA violation time in the whole period; it is calculated as the total SLA violation time of all hot physical servers divided by the total running time of all active physical servers over the whole period T, as given by Eq. (11). The smaller the value, the less time physical servers spend violating the SLA.

$$ SVP = \frac{{\sum\nolimits_{t = 1}^{|T|} {\sum\nolimits_{j = 1}^{m(t)} {SLA_{j} (t)} } }}{{\sum\nolimits_{t = 1}^{|T|} {m(t)} }} $$
(11)

In the conducted experiments, the whole period T is divided into |T| time slots of equal length. Thus, in Eq. (11), the SLA violation time of each physical server equals the number of time slots in which that server violates the SLA.
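As a simple illustration of Eq. (11), assuming sla[t][j] holds SLA_j(t) for the active servers of slot t and m[t] holds |PS(t)|:

```python
def svp(sla, m):
    """SLA violation percentage: violated server-slots over active server-slots (Eq. (11))."""
    violated = sum(sum(slot) for slot in sla)   # numerator of Eq. (11)
    active = sum(m)                             # total active server-slots
    return violated / active if active else 0.0
```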

Table 3 shows the SVP results obtained by the compared algorithms during the whole experimental period. It can be seen that the IGGA algorithm violates the SLA more frequently than the other two algorithms, with an SLA violation percentage of up to 5.62%. This is because the IGGA algorithm does not consider the service level agreement and allows a hot server to execute workloads beyond its processing capacity, which leads to a higher probability of violating the SLA. In contrast, the MVC and ESCP algorithms invoke VM migration immediately once a physical server becomes a hot server and violates the SLA. Moreover, in the MVC algorithm, physical servers are permitted to violate the SLA within a constrained range: the sustained SLA violation time of each physical server must not be longer than one time slot (1 min in the experiments). With this constraint, once a physical server violates the SLA, the MVC algorithm first checks whether there are one or more VMs whose remaining runtimes are shorter than one time slot. If so, these VMs will complete within a time slot and the hot server will no longer violate the SLA. In this situation, the hot server satisfies the SVP restriction (constraint C4 of the optimization model in Sect. 3.2) and there is no need to migrate any VM. That is why the SVP of the proposed MVC algorithm is slightly higher than that of the ESCP algorithm.

Table 3 SLA violation percentage of different algorithms during the whole period (12 h)

6 Conclusions

Energy consumption has become one of the major concerns of CDC owners. VM consolidation is seen as an effective approach to improve the energy efficiency of CDCs, but it may degrade the performance of migrated VMs, or even temporarily interrupt their services, during the migration period. As a result, the negative influences caused by VM migrations should not be ignored. Some studies focus on minimizing energy consumption while keeping the number of migrations as small as possible. However, these methods estimate the migration cost only in terms of the number of migrations and fail to take into account the inherent heterogeneity of the costs caused by migrating different VMs. To reduce the negative influences of VM migrations, this paper develops an MC-aware VM consolidation (MVC) algorithm that considers both the remaining runtime and the migration cost when choosing the VMs to be migrated. Simulation results show that, compared with some popular algorithms, the MVC algorithm decreases the migration cost and, at the same time, keeps the energy consumption at a relatively low level.