1 Introduction

Recently, workflows have become a popular paradigm for modeling the execution process of big data applications in distributed environments such as clouds and clusters [1]. Since then, research on workflow scheduling has received much attention, with the objective of producing schedules with optimal execution time. It is well known that workflow scheduling in a heterogeneous environment is NP-complete [2]. Traditionally, optimizing the overall execution time (makespan) of the workflow has been an important and common objective of workflow scheduling, and many works in the literature have designed heuristic algorithms to obtain schedules with minimized makespan. However, as cloud computing gains popularity, makespan is no longer the only objective to be optimized during workflow scheduling. Many other objectives that can be regarded as equally important have arisen, such as cost, energy, reliability, and utilization. These objectives need to be taken into consideration together with makespan during workflow scheduling. Therefore, modern cloud workflow scheduling algorithms must be able to optimize more than one objective at the same time.

Generally, the main concern of cloud computing customers when selecting virtual machine (VM) instances to execute their workflows is the monetary cost. The renting price of cloud VM instances is charged based on their computation capacity, which is reflected directly in the CPU frequency settings; for example, the pricing models adopted by CloudSigma [3] and Elastichosts [4] charge customers based on the CPU frequencies assigned to the VM instances during the execution of each workflow task. These CPU frequency parameters can be set between a minimum and a maximum value with a given variation step. Customers are likely to choose VM instances with lower prices, but a schedule with near-optimal makespan remains a critical requirement that cannot be neglected. In the case of single-objective optimization, VM instances running at a high CPU frequency are a better choice for makespan optimization, while VM instances running at a low CPU frequency may be a better choice for cost optimization. However, it is not easy to make an appropriate CPU frequency selection if both objectives must be considered. In addition, executing all tasks on VM instances running only at high or only at low CPU frequencies is not a wise strategy, because it cannot handle the differences in execution time among workflow tasks and/or the complexity caused by the data dependencies among them; it may also affect the execution of each individual task, which is directly reflected in the total makespan and the total monetary cost. Therefore, with such a pricing scheme, an interesting challenge for customers is how to properly select VM instances and tune their CPU frequencies for each task so that both the makespan and the cost of executing their application are minimized. The objective of this paper is to provide a solution to this problem.

The makespan, defined as the overall completion time of the whole workflow, can be divided into two parts: the sum of the execution times of the workflow tasks on the critical path, and the data transfer time that arises from task dependencies, especially when two adjacent tasks are scheduled on different VM instances. The latter depends on the bandwidth of the transmission link, while the former depends on the CPU frequency allocated to the VM instance during the execution of each task. In this paper, we only focus on the variability of the CPU frequency under fixed bandwidth. The renting cost of VM instances depends on the execution time of each task and on the charge rate per time unit, which is normally decided by the selected CPU frequency according to a certain pricing model such as linear, superlinear, or sublinear (more information about these pricing models can be found in [5]). Selecting a high CPU frequency results in a smaller execution time for each task; however, the cost reduction achieved by this smaller execution time may not offset the cost increase caused by selecting the higher CPU frequency. That is to say, using VM instances with high processing capacity (CPU frequency) may ensure a minimal workflow completion time (makespan) at a high cost, while using VM instances with low processing capacity (CPU frequency) may ensure a minimal monetary cost at a high completion time. As a result, tuning the VMs' CPU frequencies to improve the total cost cannot be achieved without deteriorating the makespan. In this context, cost and makespan are therefore conflicting objectives, and no single solution can optimize both objectives at the same time. In this case, decision-makers have to select the final preferred solution from the Pareto optimal objective vectors. Therefore, approximating the set of all Pareto optimal objective vectors is the appropriate way to deal with this multi-objective optimization problem.

It is well known that, under mild conditions, a Pareto optimal solution of a multi-objective optimization problem can be an optimal solution of a scalar optimization problem whose objective is an aggregation of all the objectives [6]. Therefore, the approximation of the Pareto Front (PF) can be decomposed into a number of scalar objective optimization sub-problems. To the best of our knowledge, none of the current state-of-the-art multi-objective workflow scheduling algorithms [7,8,9,10,11] considered decomposition.

Therefore, with cost and makespan minimization in mind, in our previous work [12] we proposed a Workflow Scheduling Algorithm Based on Decomposition (WSABD). Given a scientific workflow with a deterministic model of execution time and communication time, and a set of resources with variable CPU frequencies and costs, WSABD starts with an initialization step using CFMax [14], then updates the functional values, and finally checks whether the stopping criterion is satisfied. The update step is executed cyclically until the stopping criterion is satisfied, after which the algorithm returns the set of Pareto Front solutions. Different from evolutionary algorithms, WSABD reduces the runtime needed to generate the final schedule by using a search operation rather than overlapping mutations. The main contribution of that paper [12] is a novel workflow scheduling algorithm with three variants, which incorporates decomposition approaches in workflow scheduling and uses a search operation rather than mutations. In this paper, we significantly extend our previous work [12] by:

  • Studying the performance of the algorithm when different decomposition approaches are considered.

  • Studying the performance of the algorithm when different DAG structures are used.

  • Studying the performance and the runtime of the proposed algorithm when different pricing models are used.

  • Using the Hyper-volume indicator to investigate the quality of the set of PF solutions generated by the proposed algorithm.

  • Studying the performance of the proposed algorithm when different settings such as population size, iteration number, and the number of VMs are used.

  • Studying the time complexity and the time overhead of the proposed algorithm.

The rest of this paper is organized as follows: Sect. 2 presents the related works, Sect. 3 describes the system and application model, Sect. 4 formulates the problem to be solved, and Sect. 5 describes the proposed algorithm. Evaluation settings and findings are presented in Sect. 6. Finally, Sect. 7 concludes this paper and summarizes future work.

Fig. 1 DAG example with communication time

Table 1 An example of execution time
Table 2 An example of CPU frequency settings for 3 VMs

2 Related works

Workflow scheduling and resource provisioning have become fundamental research topics in cloud computing platforms. A remarkable number of works have addressed the related optimization problems, whether single-objective, bi-objective, or multi-objective. Among those focused on makespan as a single objective, HEFT [15] is a lightweight workflow scheduling heuristic for heterogeneous environments such as clouds. Given a set of VM instances, HEFT ranks tasks according to their priority values and then schedules them one after another onto these VMs, aiming at minimizing the overall execution time of the whole workflow while taking the data transfer time among tasks into account. Because of its low complexity, HEFT has been employed by other researchers to build new workflow scheduling algorithms [5, 14, 17,18,19]. With the objective of mapping all the workflow tasks to the available VMs so that makespan and cost are minimized, [20] proposed a bi-objective algorithm which is a hybrid of HEFT and GSA (Gravitational Search Algorithm).

As cloud computing emerges, modern workflow scheduling algorithms have to be able to optimize more than one objective, and different studies have been carried out in response to this trend [8,9,10,11, 21, 22]. Mostly, multi-objective workflow scheduling algorithms rely on finding the Pareto set and then extracting the non-dominated solutions from it. Different from the usual objectives, the work presented in [9] designed a new systematic method that considers both task security demands and task interactions when placing tasks in the cloud; it proposed a heuristic algorithm based on task completion time and security requirements. Most multi-objective workflow scheduling algorithms consider two or three objectives at once. [8] focused on the optimization of task scheduling using a novel approach based on dynamic dispatch queues (TSDQ) combined with hybrid meta-heuristic algorithms, and proposed two hybrid meta-heuristics: one based on Fuzzy Logic with Particle Swarm Optimization (TSDQ-FLPSO) and the other based on Simulated Annealing with Particle Swarm Optimization (TSDQ-SAPSO).

The proposed algorithm approximates the optimal solution by considering user-specified constraints on the objectives in a dual strategy: maximizing the distance to the user's constraints for dominant solutions and minimizing it otherwise. Evolutionary algorithms are an excellent way to solve multi-objective optimization problems; however, they are designed for unconstrained problems. With the aim of finding the task-VM mapping that minimizes the total financial cost and the degree of imbalance under deadline constraints, the algorithm proposed in [23] modifies NSGA-II (Non-dominated Sorting Genetic Algorithm-II) so that it accepts constraints, and the modified version is used to solve the considered optimization problem. Inspired by the hybrid chemical reaction optimization algorithm, [24] proposed an energy-efficient workflow scheduling algorithm. Even though that algorithm targets energy reduction, it also minimizes the makespan of the schedule; the study came up with a novel measure for determining the amount of energy to be saved in a DVS-enabled environment. Decomposition is a traditional multi-objective optimization strategy that decomposes a multi-objective optimization problem into a number of scalar optimization problems and optimizes them simultaneously. [6] presented a multi-objective evolutionary algorithm based on decomposition techniques; however, this work was not designed for workflow scheduling purposes.

Different from the works presented above, this paper proposes a workflow scheduling algorithm based on decomposition. Our algorithm uses a search operation, rather than overlapping mutations, to obtain new solutions, and it employs CFMax [14] to generate the initial population.

3 System and application model

3.1 Application model

We assume the presence of cloud computing VM instances that are charged on a pay-as-you-go basis according to the CPU frequency used to execute each task of the workflow. Each allocated VM instance is provisioned from the start of a task's execution until its completion. Information about the data transferred between tasks and about the execution time of each task when the VM instances run at their maximum CPU frequency is known in advance, as illustrated in Fig. 1 and Table 1, respectively. We consider a workflow application modeled as a Directed Acyclic Graph (DAG) \({G=(T,D)}\), where T represents a set of interdependent tasks \({T=\{t_{1},t_{2},..,t_{n}\}}\) and D represents the set of intermediate data to be transferred between adjacent tasks, \({D=\{d_{ij}\}}\) (see Fig. 1 for an illustration). We use \({pred(t_{i})}\) to denote the set of predecessors of task \({t_{i}}\) and \({Succ(t_{i})}\) to denote the set of its successors. If task \({t_{i}}\) is adjacent to task \({t_{j}}\), task \({t_{i}}\) is a parent of task \({t_{j}}\) and \({t_{j}}\) is a child of task \({t_{i}}\). Task \({t_{j}}\) cannot start its execution before all its parents have completed and transmitted all the required data \({d_{ij}}\) to it. If a task is executed on a VM instance using a CPU frequency lower than the maximum, its execution time can be calculated by:

$${\begin{aligned} ET_{(t,f)}=\left( \beta \cdot (\frac{f_{max}}{f}-1)+1\right) \cdot ET_{(t,f_{max})} \end{aligned}}$$
(1)

where \({ET_{(t,f_{max})}}\) is the execution time of task \({t}\) when it runs at the maximum CPU frequency and the parameter \({\beta \in [0,1]}\) indicates the impact of the CPU frequency on the task execution time. In this paper, we set \({\beta =0.4}\) by default.
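As a concrete illustration of Eq. (1), the following sketch (in Java, the language of our simulator; the class name, method name and numerical values are illustrative only) computes the execution time of a task at a reduced CPU frequency:

```java
public class ExecutionTimeModel {

    /**
     * Eq. (1): ET(t,f) = (beta * (fMax / f - 1) + 1) * ET(t,fMax),
     * where etAtFMax is the known execution time at the maximum CPU frequency.
     */
    public static double executionTime(double etAtFMax, double f, double fMax, double beta) {
        return (beta * (fMax / f - 1.0) + 1.0) * etAtFMax;
    }

    public static void main(String[] args) {
        double beta = 0.4;          // default value used in this paper
        double etAtFMax = 10.0;     // e.g., 10 time units at f_max (illustrative)
        double fMax = 3.0, f = 1.5; // GHz (illustrative values)
        // Halving the frequency inflates the execution time by beta*(2-1)+1 = 1.4x.
        System.out.println(executionTime(etAtFMax, f, fMax, beta)); // prints 14.0
    }
}
```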

3.2 System model

The heterogeneous VM instances operate at CPU frequencies that vary between a minimum and a maximum value (\({f_{min}, f_{max}}\)) with a step \({f_{step}}\) that determines the variability level, as illustrated in Table 2. Each VM instance is charged according to the CPU frequency allocated to each task. We adopted the three pricing models presented in [19]. Let \({C_{(m,f)}}\) represent the price charged per time unit for a VM instance m running at CPU frequency f, \({C_{(m,f_{min})}}\) the price charged per time unit for VM instance m running at the minimum CPU frequency \({f_{min}}\), and \({\delta }\) the coefficient that tunes the charging rate of the price according to f. With the linear pricing model, \({C_{(m,f)}}\) is calculated as follows:

$${\begin{aligned} C_{(m,f)} =C_{(m,f_{min})}+\delta _{m}\cdot \frac{f-f_{min}}{f_{min}} \end{aligned}}$$
(2)

With the superlinear pricing model, \({C_{(m,f)}}\) is calculated as follows:

$${\begin{aligned} C_{(m,f)} =C_{(m,f_{min})}+\delta _{m}\cdot \left( \left( 1+\frac{f-f_{min}}{f_{min}}\right) \cdot \log \left( 1+\frac{f-f_{min}}{f_{min}}\right) \right) \end{aligned}}$$
(3)

With the sublinear pricing model, \({C_{(m,f)}}\) is calculated as follows:

$${\begin{aligned} C_{(m,f)} =C_{(m,f_{min})}+\delta _{m}\cdot \log \left( 1+\frac{f-f_{min}}{f_{min}}\right) \end{aligned}}$$
(4)

Let \({EC_{(t,m,f)}}\) also denote the cost of executing task t on VM instance m running at frequency f. \({EC_{(t,m,f)}}\) is calculated as:

$${\begin{aligned} EC_{(t,m,f)}=ET_{(t,f)}\cdot C_{(m,f)} \end{aligned}}$$
(5)

The total cost of executing all the workflow tasks is calculated as:

$${\begin{aligned} TC = \sum _{\forall (t,m)\in S}EC_{(t,m,f)} \end{aligned}}$$
(6)

where S is the schedule, which describes the task-VM mapping and the operating CPU frequency of each VM instance.
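To make the cost model concrete, the following sketch (illustrative only; the class/method names and parameter values are made up) evaluates the per-time-unit price under the three pricing models of Eqs. (2)-(4) and the task execution cost of Eq. (5). For the superlinear model it assumes the logarithm is applied to 1 + (f - f_min)/f_min, mirroring the sublinear model, and a natural logarithm is used; both are assumptions, not details taken from [19].

```java
public class PricingModel {

    /** Eq. (2): linear pricing, with x = (f - fMin) / fMin. */
    static double linear(double cMin, double delta, double f, double fMin) {
        return cMin + delta * (f - fMin) / fMin;
    }

    /** Eq. (3): superlinear pricing (assuming log(1 + x), natural log). */
    static double superlinear(double cMin, double delta, double f, double fMin) {
        double x = (f - fMin) / fMin;
        return cMin + delta * (1.0 + x) * Math.log(1.0 + x);
    }

    /** Eq. (4): sublinear pricing. */
    static double sublinear(double cMin, double delta, double f, double fMin) {
        double x = (f - fMin) / fMin;
        return cMin + delta * Math.log(1.0 + x);
    }

    /** Eq. (5): cost of one task; Eq. (6) sums this over all (task, VM) pairs of a schedule. */
    static double executionCost(double executionTime, double pricePerTimeUnit) {
        return executionTime * pricePerTimeUnit;
    }

    public static void main(String[] args) {
        double cMin = 0.05, delta = 0.02, fMin = 1.0, f = 2.0; // illustrative values
        // Prices satisfy sublinear < linear < superlinear for the same f, as expected.
        System.out.printf("linear=%.4f super=%.4f sub=%.4f%n",
                linear(cMin, delta, f, fMin),
                superlinear(cMin, delta, f, fMin),
                sublinear(cMin, delta, f, fMin));
    }
}
```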

4 Problem formulation and basics of decomposition algorithm

In this paper, we consider the problem of minimizing cost and makespan as a multi-objective optimization problem (MOP), which can be written as follows:

$${\begin{aligned} \begin{aligned} minimize ~~ F(x)= (f_{1}(x),...,f_{m}(x))^{T}\\ Subject ~~ to ~~ x \in ~~ X \end{aligned} \end{aligned}}$$
(7)

where \({{X}}\) is the decision space (variable space) and F: \({{X}}\) \({\rightarrow }\) \({R^{m}}\) consists of m objective functions, with \({R^{m}}\) the objective space. The set \({\{F(x) \mid x\in {X}\}}\) is called the attainable objective set. Mostly, the objectives in Eq. 7 contradict each other, and the only way to balance them is to find the trade-off among them, which can be characterized by Pareto optimality.

The aim of multi-objective optimization algorithms is to find the trade-off between contradicting objectives. During multi-objective workflow scheduling, there may be a large or even infinite number of solutions; however, only non-dominated solutions can be offered to decision-makers for the selection of the final preferred solution. A solution \({S_{a}}\) is said to dominate a solution \({S_{b}}\) if and only if \({S_{a}}\) is no worse than \({S_{b}}\) in every objective and strictly better in at least one. \({F(x^{'})}\) is said to be Pareto optimal if there is no solution x such that F(x) dominates \({F(x^{'})}\). This means that any change to a Pareto optimal solution that improves one objective must deteriorate at least one other objective. The set of all Pareto optimal solutions is called the Pareto Set (PS), and the set of all Pareto optimal objective vectors is called the Pareto Front (PF). Some mathematical models have been developed to approximate the PF. It is well known that, under mild conditions, a Pareto optimal solution of a multi-objective problem can be the optimal solution of a scalar optimization problem whose objective is a weighted aggregation of all the objectives [6]. Hence, the approximation of the PF can be decomposed into a number of scalar objective optimization sub-problems. In this paper, we adopt five decomposition approaches: Weighted Sum (WS), Tchebycheff (TE), Penalty-based Boundary Intersection (PBI), Modified Tchebycheff (MTE) and NIMBUS, to decompose the problem of approximating the PF into a number of scalar optimization problems.
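The dominance test and the extraction of non-dominated solutions used throughout the paper can be sketched as follows (a minimal illustration; the class name, method names and objective vectors are made up):

```java
import java.util.ArrayList;
import java.util.List;

public class ParetoUtil {

    /** a dominates b if a is no worse in every objective and strictly better in at least one (minimization). */
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) return false;
            if (a[i] < b[i]) strictlyBetter = true;
        }
        return strictlyBetter;
    }

    /** Keeps only the non-dominated objective vectors, e.g., (cost, makespan) pairs. */
    static List<double[]> nonDominated(List<double[]> points) {
        List<double[]> front = new ArrayList<>();
        for (double[] p : points) {
            boolean dominated = false;
            for (double[] q : points) {
                if (q != p && dominates(q, p)) { dominated = true; break; }
            }
            if (!dominated) front.add(p);
        }
        return front;
    }

    public static void main(String[] args) {
        List<double[]> points = List.of(
                new double[]{10.0, 200.0},   // (cost, makespan), illustrative
                new double[]{12.0, 150.0},
                new double[]{13.0, 210.0});  // dominated by the first point
        System.out.println(nonDominated(points).size()); // prints 2
    }
}
```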

4.1 Weighted sum approach

This approach considers a convex combination of the different objectives. Let \({\lambda =(\lambda _{1},...,\lambda _{m})^{T}}\) be a weight vector, i.e., \({\lambda _{i}\ge 0}\) for all \({i=1,..,m}\) and \({\sum _{i=1}^{m} \lambda _{i} = 1}\). A Pareto optimal point of Eq. 7 can then be obtained by solving:

$${\begin{aligned} \begin{aligned}&minimize\, g^{ws}(x\vert {\lambda })=\sum \limits _{i=1}^{m}\lambda _{i}f_{i}(x)\\&\quad Subject\,to \, x \in \, X \end{aligned} \end{aligned}}$$
(8)

where \({g^{ws}(x\vert {\lambda })}\) indicates that \({\lambda }\) is a coefficient vector of the objective function, x represents the decision variables to be optimized (determining cost and makespan in our case), and \({\lambda }\) acts as a weight vector that allows the approach to generate a set of different Pareto optimal vectors. This approach works well as long as the PF is convex (concave in the case of maximization). However, in real-world problems not all Pareto Fronts are convex or concave, and in the non-convex case this approach cannot reach every Pareto optimal vector. To overcome this deficiency, some efforts have been made to incorporate other techniques, such as the \({\epsilon }\)-constraint method, into this approach.
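As an illustrative numerical example (with made-up objective vectors), consider three mutually non-dominated candidates A=(1, 3), B=(3, 1) and C=(2.2, 2.2) for a two-objective minimization problem, and any weight vector \({\lambda =(\lambda _{1}, 1-\lambda _{1})}\):

$${\begin{aligned} g^{ws}(A\vert \lambda )&=\lambda _{1}+3(1-\lambda _{1})=3-2\lambda _{1},\\ g^{ws}(B\vert \lambda )&=3\lambda _{1}+(1-\lambda _{1})=1+2\lambda _{1},\\ g^{ws}(C\vert \lambda )&=2.2, \quad \text {while} \quad \min \{g^{ws}(A\vert \lambda ),g^{ws}(B\vert \lambda )\}\le 2 \quad \text {for all } \lambda _{1}\in [0,1]. \end{aligned}}$$

Hence C, which lies in a non-convex region of the front, is never the minimizer of Eq. 8 for any weight vector, although it can still be reached with the Tchebycheff approach described next.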

4.2 Tchebycheff approach

The scalar optimization problem can be represented as follows:

$${\begin{aligned} \begin{aligned}&minimize\,g^{te}(x\vert {\lambda }, z) = \max _{1 \le i \le m} \{\lambda _{i} \vert f_{i}(x)-z_{i}\vert \}\\&\quad Subject \,to \, x \in \, X \end{aligned} \end{aligned}}$$
(9)

where \({z^{*}=(z^{*}_{1},...,z^{*}_{m})^{T}}\) is the reference point, i.e., \({z^{*}_{i}=\min \{f_{i}(x)\mid x\in {X}\}}\) for each \({i=1,...,m}\). For each Pareto optimal point there exists a weight vector \({\lambda }\) under which it is an optimal solution of Eq. 9. Therefore, by altering the weight vector, this approach can generate multiple different Pareto optimal solutions.

4.3 Modified Tchebycheff approach

The scalar optimization problem can be represented as follows:

$${\begin{aligned} \begin{aligned}&minimize \,g^{mte}(x\vert {\lambda }, z) = \max _{1 \le i \le m} \{\frac{1}{\lambda _{i}} \vert f_{i}(x)-z_{i}\vert \}\\&\quad Subject\, to \, x \in \, X \end{aligned} \end{aligned}}$$
(10)

where \({z^{*}=(z^{*}_{1},...,z^{*}_{m})^{T}}\) is the reference point, i.e., \({z^{*}_{i}=\min \{f_{i}(x)\mid x\in {X}\}}\) for each \({i=1,...,m}\). As with the TE approach, for each Pareto optimal point there exists a weight vector \({\lambda }\) under which it is an optimal solution of Eq. 10, so altering the weight vector causes the MTE approach to generate multiple different Pareto optimal solutions (more information about this approach can be found in [25]).

4.4 Penalty-based boundary intersection

The scalar optimization problem can be represented as follows:

$${\begin{aligned} \begin{aligned} minimize\,g^{pbi}(x\vert {\lambda }, z) = d_{1}+\theta d_{2}\\Subject\, to \, x \in \, X\\ \\ \mathrm{where }\, d_{1}= {{ \Vert (F(x)-z)^{T}\lambda \Vert }\over { \Vert \lambda \Vert }} \\ \mathrm{and} \, d_{2}=\Vert F(x)-(z+d_{1} \lambda ) \Vert \end{aligned} \end{aligned}}$$
(11)

where x is the vector of decision variables affecting both objectives to be optimized (in this paper, cost and makespan), \({\lambda }\) is a weight vector, z is the reference point corresponding to the minimal values of the considered objectives (cost and makespan), and \({\theta > 0}\) is a penalty parameter. \({d_{1}}\) is the distance between z and the projection of F(x) onto the line L that passes through z in the direction \({\lambda }\), and \({d_{2}}\) is the perpendicular distance between F(x) and L. It is worth mentioning that PBI and TE use the same set of evenly distributed weight vectors when the number of considered objectives is two. However, PBI has advantages over TE:

  • The resultant optimal solutions generated by PBI are much more uniformly distributed than those generated by TE, especially when the number of the weight vectors is not large.

  • For TE, when x dominates y it is still possible that \({g^{te}(x\vert \lambda ,z^{*})=g^{te}(y\vert \lambda ,z^{*})}\). However, this is a rare occurrence for PBI.

To achieve these advantages, the penalty factor has to be set appropriately: a value that is too large or too small will degrade the performance of this approach.

4.5 NIMBUS approach

Miettinen et al. [26] describe NIMBUS as an interactive, classification-based multi-objective optimization approach. It uses the same principle as the other decomposition approaches. The scalar optimization problem can be represented as follows:

$${\begin{aligned} \begin{aligned}&minimize\,g^{nbs}(x\vert {\lambda }, z) = \max _{1 \le i \le m} \{(\lambda _{i}( f_{i}(x)-z_{i})),\\&\quad (\lambda _{j} (f_{i}(x)-z^{*}_{j})) \}+p \sum _{i=1}^{m}\lambda _{i} f_{i}(x)\\&\quad \quad Subject\, to \, x \in \, X \end{aligned} \end{aligned}}$$
(12)

where z is the ideal objective vector, \({z^{*}}\) contains the aspiration levels for the objective functions, \({p>0}\) is a relatively small scalar bounding the trade-offs, and \({\lambda }\) is a weight vector used to scale the values of the considered objectives up or down. More details about the WS, TE and PBI approaches can be found in [6], and more details about NIMBUS can be found in [27].
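For reference, the following sketch (in Java, the language of our implementation; class and method names are illustrative) shows how the scalarizing functions of Eqs. (8)-(11) can be evaluated for a given objective vector. NIMBUS (Eq. 12) is omitted for brevity, and, following Eq. (11) literally, d2 is computed without normalizing \({\lambda }\), which implicitly assumes weight vectors of (near) unit norm.

```java
public class Scalarizers {

    /** Eq. (8): weighted sum. */
    static double ws(double[] f, double[] lambda) {
        double s = 0.0;
        for (int i = 0; i < f.length; i++) s += lambda[i] * f[i];
        return s;
    }

    /** Eq. (9): Tchebycheff, with reference point z. */
    static double te(double[] f, double[] lambda, double[] z) {
        double m = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < f.length; i++) m = Math.max(m, lambda[i] * Math.abs(f[i] - z[i]));
        return m;
    }

    /** Eq. (10): modified Tchebycheff (weights inverted). */
    static double mte(double[] f, double[] lambda, double[] z) {
        double m = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < f.length; i++) m = Math.max(m, Math.abs(f[i] - z[i]) / lambda[i]);
        return m;
    }

    /** Eq. (11): penalty-based boundary intersection with penalty theta. */
    static double pbi(double[] f, double[] lambda, double[] z, double theta) {
        double dot = 0.0, normSq = 0.0;
        for (int i = 0; i < f.length; i++) {
            dot += (f[i] - z[i]) * lambda[i];
            normSq += lambda[i] * lambda[i];
        }
        double d1 = Math.abs(dot) / Math.sqrt(normSq);     // distance along the direction lambda
        double d2Sq = 0.0;
        for (int i = 0; i < f.length; i++) {
            double diff = f[i] - (z[i] + d1 * lambda[i]);  // as written in Eq. (11)
            d2Sq += diff * diff;
        }
        return d1 + theta * Math.sqrt(d2Sq);
    }

    public static void main(String[] args) {
        double[] f = {12.0, 140.0};           // (cost, makespan) of a candidate schedule, illustrative
        double[] lambda = {0.5, 0.5};
        double[] z = {10.0, 100.0};
        System.out.println(te(f, lambda, z)); // prints 20.0
    }
}
```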

Based on the models and assumptions above, we present the multi-objective workflow scheduling algorithm, which generates schedules and properly tunes the CPU frequency for each task so that makespan and total cost of the submitted workflow are minimized.

5 The proposed algorithm

This section describes the Workflow Scheduling Algorithm Based on Decomposition (WSABD), a multi-objective algorithm proposed to solve the problem described in Sect. 4.

5.1 Algorithm description

As presented in Algorithm 2, WSABD takes seven elements (W, VMs, SC, N, WV, T, A) as input. These inputs are described as follows: W (Workflow) is a set of tasks with known execution times and communication times; VMs is a set of resources with CPU frequencies and associated prices; SC is a fixed number of iterations used as the stopping criterion; N is the number of sub-problems considered; WV is a set of N uniformly distributed weight vectors \({\lambda ^1,...,\lambda ^N}\), each with one component per objective; T is the number of weight vectors in the neighborhood of each weight vector; and A is a decomposition approach (selected from the set of decomposition approaches) that is used to compute and compare new solutions. In our case, A can be one of the five approaches WS, TE, MTE, PBI and NIMBUS. WSABD returns EP as its output, where EP is a set of non-dominated solutions. The proposed algorithm consists of three main steps: initialization, update, and checking of the stopping criterion.

$${\begin{aligned} RW_{(t,m,f)}=TC_{(t,m^{'},f^{'})}-TC_{(t,m,f)} \end{aligned}}$$
(13)

In our previous study [28], we proposed a workflow scheduling algorithm with two variants (CFMax and CFMin). Like other studies that focused on optimizing workflow scheduling objectives under a user's deadline [13, 16, 31], the purpose of that study [28] was to minimize the user's monetary expenditure for the submitted workflow application under a given deadline, and its experimental results showed that CFMax performs better than CFMin. To satisfy the user's deadline regardless of the total cost, CFMax starts with a makespan-aware scheduling algorithm (such as HEFT, MIN-MIN, MCT, or MAXMIN, as in [14]) and schedules each task to the appropriate VM instance using the maximum CPU frequency. To guarantee cost reduction, a reduction-weight (RW) table is then created to measure the cost reduction obtained by task reassignment and CPU frequency re-allocation. The values are inserted into the RW table according to Eq. (13), where \({TC_{(t,m',f')}}\) represents the cost of executing the task on its current VM instance at its current CPU frequency, and \({TC_{(t,m,f)}}\) represents the task's new cost after changing the VM instance and/or CPU frequency. To take a reassignment decision, the combination of VM instance and CPU frequency that produces the maximum value in the RW table is selected as the winner.
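The reassignment decision can be sketched as follows, interpreting the reduction weight of Eq. (13) as the cost under the current assignment minus the cost under a candidate assignment, so that the maximum RW corresponds to the largest cost reduction. The data structures, names and values below are illustrative, not the actual implementation of [28].

```java
import java.util.List;

public class ReductionWeight {

    /** A candidate assignment: VM index, CPU frequency, and the resulting task cost (illustrative container). */
    record Candidate(int vm, double frequency, double cost) {}

    /** Eq. (13): RW = cost under the current assignment minus cost under the candidate assignment. */
    static double rw(double currentCost, double candidateCost) {
        return currentCost - candidateCost;
    }

    /** Picks the candidate with the maximum reduction weight; null means no candidate reduces the cost. */
    static Candidate winner(double currentCost, List<Candidate> candidates) {
        Candidate best = null;
        double bestRw = 0.0;
        for (Candidate c : candidates) {
            double r = rw(currentCost, c.cost());
            if (r > bestRw) { bestRw = r; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Current cost of the task under its present (VM, frequency) assignment: 0.90 (illustrative).
        List<Candidate> candidates = List.of(
                new Candidate(0, 2.0, 0.80),
                new Candidate(1, 1.5, 0.65),   // largest reduction: RW = 0.25
                new Candidate(2, 2.5, 0.95));  // RW < 0, would increase the cost
        System.out.println(winner(0.90, candidates)); // Candidate[vm=1, frequency=1.5, cost=0.65]
    }
}
```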

In the first step of WSABD, we initialize the inputs; CFMax is used to generate the initial population (line 4 of Algorithm 2). In the second step, we update the initialized variables by iteratively changing the vector variables and iteration settings: new solutions are computed according to the updated settings and accepted or rejected according to the decision of the designated decomposition approach. In the third step, we check whether the stopping criterion is satisfied; if so, we return the non-dominated solutions (namely, EP), otherwise we go back to step two.

In detail, step one (lines 1 to 6) consists of initializing the input variables such as the number of weight vectors in the neighborhood (T), the neighborhood indexes B, the VM instance information, and the DAG information. After the population is initialized using CFMax, the cost and makespan of each individual are calculated as its objective function values, denoted FV in line 4 of Algorithm 2; the minimum value per objective is selected as the initial reference point z. Note that T plays an important role in limiting the search operation to a certain extent. The second step (lines 7 to 26) consists of two sub-steps. In the first sub-step (lines 7 to 15), we update the individual (\({y'}\)) by searching for either the minimum cost or the minimum makespan, depending on the position index of the individual in the population: if the index is even, we obtain the new individual by minimizing cost, otherwise by minimizing makespan. The new cost and makespan are then calculated from the new individual and denoted \({FV(y')}\), and the reference point is updated according to \({FV(y')}\). In the second sub-step (lines 16 to 26), the solutions of the individual's neighbors are updated. As shown in Algorithm 1, at this stage we use one of the approaches defined in Eqs. (8)-(12). For the selected approach, we calculate \({g^{*}(y'\vert \lambda ^{j},z)}\) and \({g^{*}(x^{j}\vert \lambda ^{j},z)}\) and compare them before resetting the current cost-makespan solution (\({x^j}\)) to the new individual \({y'}\): if the new value \({g^{*}(y'\vert \lambda ^{j},z)}\) is less than or equal to the current value \({g^{*}(x^{j}\vert \lambda ^{j},z)}\), the update is performed, otherwise the current value is kept and the algorithm continues. EP is also updated according to the new values of \({FV(y')}\). The second step (the update step) is repeated until the stopping criterion is satisfied, and then the final EP is returned.
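The second sub-step can be sketched as follows (a simplified illustration, not the actual implementation): for brevity, each sub-problem stores only its objective vector (cost, makespan) instead of the full task-VM-frequency assignment, and the Tchebycheff scalarizer of Eq. (9) stands in for whichever approach A is selected.

```java
import java.util.List;

public class NeighborhoodUpdate {

    /** Tchebycheff scalarizer, Eq. (9); the other approaches of Sect. 4 can be plugged in instead. */
    static double g(double[] f, double[] lambda, double[] z) {
        return Math.max(lambda[0] * Math.abs(f[0] - z[0]), lambda[1] * Math.abs(f[1] - z[1]));
    }

    /**
     * Second sub-step of WSABD (lines 16-26 of Algorithm 2, paraphrased): after a new individual y'
     * with objective values fvY = (cost, makespan) has been produced, every neighbor j of the current
     * sub-problem is revisited; its stored solution is replaced whenever y' scores no worse under the
     * neighbor's own weight vector.
     */
    static void updateNeighbors(List<Integer> neighborhood, double[][] fvX, double[][] lambda,
                                double[] z, double[] fvY) {
        // Reference point update: keep the best (smallest) value seen so far per objective.
        for (int i = 0; i < z.length; i++) z[i] = Math.min(z[i], fvY[i]);

        for (int j : neighborhood) {
            if (g(fvY, lambda[j], z) <= g(fvX[j], lambda[j], z)) {
                fvX[j] = fvY.clone();   // neighbor j adopts the new solution's objective values
            }
        }
    }

    public static void main(String[] args) {
        double[][] fvX = { {15.0, 120.0}, {12.0, 140.0}, {18.0, 100.0} };  // (cost, makespan), illustrative
        double[][] lambda = { {1.0, 0.0}, {0.5, 0.5}, {0.0, 1.0} };
        double[] z = {12.0, 100.0};
        double[] fvY = {11.0, 125.0};        // new individual obtained by the search operation
        updateNeighbors(List.of(0, 1), fvX, lambda, z, fvY);
        System.out.println(fvX[0][0] + " " + fvX[1][0]); // sub-problems 0 and 1 now hold the new values
    }
}
```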

Table 3 \({c_{min}}\) and \({\delta }\) values

5.2 WSABD complexity

The WSABD algorithm has three main parameters: the population size (PopSize), the number of neighbors (T) and the number of iterations (iterNum). PopSize corresponds to the number of single-objective sub-problems into which the algorithm decomposes the multi-objective problem, and indicates the search breadth of the algorithm. T is the number of adjacent sub-problems considered for each sub-problem. iterNum represents the search depth of the algorithm. Suppose that there is a workflow with N tasks and VMs available resources, each offering C CPU frequency selections on average. According to Algorithm 2, the update step is executed iterNum times; each time, the PopSize individuals in the population are updated in turn, and in the process of updating each individual, the solutions of its T neighbors are recalculated by Algorithm 1. Therefore, the time complexity of WSABD is \({iterNum \times PopSize \times \max (T,N\times VMs\times C)}\), where \({N\times VMs\times C}\) is the number of computations required to construct the RW table.
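As a rough instantiation with the settings later used in Sect. 6 (iterNum = 2000, PopSize = 20, T = 5, a 1000-task workflow and 15 VMs) and assuming, for illustration only, about C = 10 selectable frequency levels per VM, the bound becomes:

$${\begin{aligned} 2000 \times 20 \times \max (5,\, 1000\times 15\times 10) = 2000 \times 20 \times 1.5\times 10^{5} = 6\times 10^{9}, \end{aligned}}$$

so the per-iteration effort is dominated by the construction of the RW table rather than by the neighborhood size T.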

6 Evaluation

In order to evaluate the performance of the proposed algorithm, we perform two types of experiments using the parameter settings presented in Sect. 6.1. The first experiment studies the variability of the cost and makespan values under fixed parameter settings (the number of VMs, the number of iterations, the population size, and the number of neighbors), and the second studies their variability under changing parameter settings. We focus on three types of evaluation: optimization results, runtime results, and HyperVolume (HV) results. For the first experiment we present all three types of results, while for the second experiment we focus on the HV results only. The results and their discussion are presented in the subsections below.

6.1 Evaluation settings

Fig. 2 Structure of the considered workflows (Source: [29])

The algorithm presented in Sect. 5 and its simulation tools are implemented in Java. A PC with a 4-core Intel i5-7300HQ @2.5 GHz CPU and 8 GB RAM is used as the experimental environment. Each simulated resource operates at CPU frequencies between a minimum and a maximum with a variation step, selected randomly as illustrated in Table 2. We considered the three pricing models described in Sect. 3.2, with the \({c_{min}}\) and \({\delta }\) values presented in Table 3, as in our previous work [14]. We also considered three real workflows (Montage, Inspiral, and Epigenomics) which differ in structure. [30] describes Montage as a workflow created by the NASA/IPAC Infrared Science Archive as an open toolkit that can be used to generate custom mosaics of the sky from input images in the Flexible Image Transport System (FITS) format. [30] describes the LIGO Inspiral Analysis workflow (also known as LIGO or Inspiral for short) as a workflow created to analyze data from the coalescence of compact binary systems such as binary neutron stars and black holes. The same work describes Epigenomics as a workflow created by the USC Epigenome Center and the Pegasus Team to automate various operations in genome sequence processing. The considered workflows are defined by DAX files which follow the DAX XML specification. We downloaded the DAX files describing a 1000-node Montage workflow, a 1000-node Inspiral workflow and a 997-node Epigenomics workflow from the website of the Pegasus Workflow Generator [29] and use them as the inputs of the scheduling algorithm in our simulator. For the optimization experiments, we set the input parameters, namely the population size, the number of neighbors, the number of iterations and the number of VMs, to 20, 5, 2000 and 15, respectively. WSABD is applied to each workflow with the parameters defined above.

6.2 Performance results

To evaluate the performance of the proposed algorithm, we considered the Hyper-Volume (HV) indicator, a performance indicator that provides a single score measuring the quality of a set of PF solutions obtained by the proposed algorithm under different settings. For 3D data, this score is equal to the volume enclosed by the PF and a selected reference point R; in the 2D case (shown in Fig. 3), it is the area enclosed by the PF set and the selected reference point R. The HV enclosed by the PF and the reference point R is calculated as follows:

$${\begin{aligned} HV(PF,R)=\cup _{v \in PF}volume(v) \end{aligned}}$$
(14)

where volume(v) is the area bounded by the solution v in the PF and R. For a minimization problem, the larger the HV is, the closer the solution set obtained by the algorithm is to the lower left corner of the coordinate axes, and the better the convergence of the algorithm and the distribution of the solution set are. When the HV values become stable, the solution set obtained by the algorithm no longer changes and the algorithm has converged.
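For the two-objective case, the HV of Eq. (14) reduces to a sum of rectangle areas once the non-dominated points are sorted by one objective. The following sketch (illustrative values; the reference point and class name are assumptions) computes it for a minimization problem:

```java
import java.util.ArrayList;
import java.util.List;

public class HyperVolume2D {

    /**
     * Eq. (14) for two minimized objectives: area enclosed between the mutually non-dominated
     * PF points and the reference point r (r must be dominated by every point of the front).
     */
    static double hv(List<double[]> front, double[] r) {
        List<double[]> pts = new ArrayList<>(front);
        pts.sort((a, b) -> Double.compare(a[0], b[0]));    // ascending in the first objective
        double area = 0.0;
        double prevX = r[0];                               // sweep from the reference point leftwards
        for (int i = pts.size() - 1; i >= 0; i--) {
            double[] p = pts.get(i);
            area += (prevX - p[0]) * (r[1] - p[1]);        // rectangle contributed by point p
            prevX = p[0];
        }
        return area;
    }

    public static void main(String[] args) {
        List<double[]> front = List.of(new double[]{1.0, 3.0}, new double[]{2.0, 2.0}, new double[]{3.0, 1.0});
        double[] r = {4.0, 4.0};                           // illustrative reference point
        System.out.println(hv(front, r));                  // prints 6.0
    }
}
```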

Fig. 3 Calculation of HV results

6.3 Fixed parameter settings

Table 4 HV results

6.3.1 Optimization results

In this section we present and discuss the optimization results of the proposed algorithm. Given the parameters and settings described above, CFMax is executed first and its cost and makespan results are collected; these results also form the initial population for WSABD. WSABD approximates the optimal solutions (the set of PF solutions) in a single run. Note that WSABD has five variants, one per decomposition approach. As long as the iteration number has not been reached and not all possibilities have been tried, the algorithm continues to generate new solutions. The experimental results are shown in Figs. 4, 5, 6, 7, 8, 9, 10, 11 and 12.

Fig. 4 Montage optimization results (Linear model)
Fig. 5 Montage optimization results (Superlinear model)
Fig. 6 Montage optimization results (Sublinear model)
Fig. 7 Inspiral optimization results (Linear model)
Fig. 8 Inspiral optimization results (Superlinear model)
Fig. 9 Inspiral optimization results (Sublinear model)
Fig. 10 Epigenomics optimization results (Linear model)
Fig. 11 Epigenomics optimization results (Superlinear model)
Fig. 12 Epigenomics optimization results (Sublinear model)

Note that users can determine the solutions that suit their needs based on some constraints such as budget and/or deadline. Based on the optimization results presented in these figures, the key findings of this subsection can be summarized as follows:

  • The overall optimization results show that Montage produces small cost and makespan values, while Epigenomics produces high values.

  • When the considered parameter settings are applied to Montage and Inspiral under the linear and superlinear pricing models, WSABD-TE achieves a higher makespan than the other variants.

  • When the pricing model is sublinear, no two variants achieve the same results.

  • In the case of the sublinear pricing model, the cost and makespan results generated by our algorithm start higher than those generated by CFMAX.

6.3.2 Runtime results

To evaluate the runtime of WSABD, the algorithm is run 100 times with each decomposition approach under the aforementioned environment and settings, and the average elapsed time is reported as the runtime result. The runtime results are shown in Figs. 13, 14 and 15.

Fig. 13 Runtime results (Montage)
Fig. 14 Runtime results (Inspiral)
Fig. 15 Runtime results (Epigenomics)

Based on the runtime results presented in those figures, the key findings of this subsection can be summarized as follows:

  • The overall runtime results show that when the DAGs are Montage and Inspiral, WSABD-TE's runtime is higher than that of the other decomposition approaches.

  • For Epigenomics, the runtime of WSABD-PBI is the highest under the linear and superlinear pricing models, but becomes the second lowest under the sublinear pricing model.

  • For Montage, lower runtime can be achieved by WSABD-WS for the linear pricing model, WSABD-MTE for the superlinear pricing model and WSABD-PBI for the sublinear pricing model.

  • For Inspiral, lower runtime can be achieved by WSABD-NIMBUS for all three pricing models.

  • For Epigenomics, lower runtime can be achieved by WSABD-MTE for the linear and sublinear pricing models, and by WSABD-WS for the superlinear pricing model.

Table 5 Variation of the number of VMs
Table 6 Variation of population size
Table 7 Variation of the number of neighbors
Table 8 Variation of the number of iterations

6.3.3 HV results

With the same parameter settings as used for the optimization and runtime experiments, we also evaluated the performance of the proposed algorithm based on the HV results of each variant. It is worth mentioning that the CFMAX results mostly lie in the lower left corner of the coordinate axes compared to the results of the proposed algorithm, which allows CFMAX to dominate in many cases. When the results of CFMAX are the best, we also report the best results excluding CFMAX. All the results obtained are tabulated in Table 4, with the best results in bold.

  • The results generated by CFMAX dominate the results generated by our algorithm in 7/9 cases.

  • When the results generated by CFMAX are excluded, the results generated by WSABD-TE dominate those generated by the other variants in 4 cases (two cases when the DAG is Montage and two cases when the DAG is Inspiral), followed by WSABD-WS (3 cases, when the DAG is Epigenomics). WSABD-MTE and WSABD-PBI achieve the best result in one case each (WSABD-MTE when the DAG is Montage and WSABD-PBI when the DAG is Inspiral). WSABD-NIMBUS did not achieve the best solution in any case.

6.4 Variation of other parameter settings

In the second experiment, we start from the same parameter settings; however, only one parameter is changed at a time while all the others remain at the default values used in the previous experiment. We let the number of VMs vary from 5 to 35 with a step of 5, the population size from 5 to 30 with a step of 5, the number of neighbors from 2 to 7 with a step of 1, the deadline ratio from 1.5 to 4 with a step of 0.5, and the number of iterations from 500 to 3000 with a step of 500. The HV results for the variation of the number of VMs, the population size, the number of neighbors and the number of iterations are presented in Tables 5, 6, 7 and 8, respectively. In each table, we omit the HV results generated by the parameter settings that correspond to those used in the previous experiment, which can be found in Table 4. A global ranking of the HV results cannot be used to make a final decision about the best variant, because the performance of the algorithm depends on the parameter settings of the environment and on the structure of the processed DAG. Therefore, we also analyze the HV results of each individual variant and re-rank them based on the DAG structure.

6.4.1 Variation of the number of VMs

  • The overall HV results for the variation of the number of VMs show that each variant achieves the best result in at least some cases.

  • The general results collected for each decomposition approach are ranked as follows: WSABD-TE first with 21/54 cases, WSABD-MTE second with 16/54 cases, WSABD-PBI third with 7/54 cases, WSABD-WS fourth with 6/54 cases and lastly WSABD-NIMBUS with 4/54 cases. Among the 54 cases, CFMAX achieves the best results in 35 cases (11/18 from Montage, 13/18 from Inspiral and 11/18 from Epigenomics).

  • For Montage, the variants are ranked as follows: WSABD-TE first with 8/18 cases, WSABD-PBI second with 4/18 cases, WSABD-MTE third with 3/18 cases, WSABD-WS fourth with 2/18 cases and finally WSABD-NIMBUS last with 1/18 cases.

  • For Inspiral, the variants are ranked as follows: WSABD-MTE first with 7/18 cases, WSABD-WS, WSABD-TE and WSABD-NIMBUS second with 3/18 cases each and finally WSABD-PBI last with 2/18 cases.

  • For Epigenomics, the variants are ranked as follows: WSABD-TE first with 10/18 cases, WSABD-MTE second with 6/18 cases, WSABD-WS and WSABD-PBI third with 1/18 case each and finally WSABD-NIMBUS last with 0/18 cases.

6.4.2 Variation of population size

  • Based on the general results, the variants are ranked as follows: WSABD-WS first with 15/36 cases, WSABD-TE second with 9/36 cases, WSABD-MTE third with 7/36 cases, WSABD-PBI fourth with 5/36 cases and lastly WSABD-NIMBUS with 0/36 cases. Among the 36 cases, CFMAX achieves the best results in 28 cases (8/12 from Montage, 12/12 from Inspiral and 8/12 from Epigenomics).

  • For Montage, the variants are ranked as follows: WSABD-WS, WSABD-TE, WSABD-PBI and WSABD-MTE first with 3/12 cases each, and finally WSABD-NIMBUS last with 0/12 cases.

  • For Inspiral, the variants are ranked as follows: WSABD-MTE first with 4/12 cases, WSABD-WS and WSABD-TE second with 3/12 cases each, WSABD-PBI third with 2/12 cases and finally WSABD-NIMBUS last with 0/12 cases.

  • For Epigenomics, the variants are ranked as follows: WSABD-WS first with 9/12 cases, WSABD-TE second with 3/12 cases, and finally WSABD-PBI, WSABD-MTE and WSABD-NIMBUS last with 0/12 cases each.

6.4.3 Variation of the number of neighbors

  • Based on the general results, the variants are ranked as follows: WSABD-MTE first with 16/45 cases, WSABD-WS second with 15/45 cases, WSABD-TE third with 9/45 cases, WSABD-PBI fourth with 6/45 cases and lastly WSABD-NIMBUS with 0/45 cases. Note that WSABD-WS and WSABD-PBI both achieve the best result in one case (when the number of neighbours is 3 for Epigenomics with the linear pricing model). Among the 45 cases, CFMAX achieves the best result in 36 cases (9/15 from Montage, 15/15 from Inspiral and 12/15 from Epigenomics).

  • For Montage, the variants are ranked as follows: WSABD-MTE first with 8/15 cases, WSABD-TE second with 3/15 cases, WSABD-WS and WSABD-PBI third with 2/15 cases each, and finally WSABD-NIMBUS last with 0/15 cases.

  • For Inspiral, the variants are ranked as follows: WSABD-WS, WSABD-TE and WSABD-MTE first with 4/15 cases each, WSABD-PBI second with 3/15 cases and finally WSABD-NIMBUS last with 0/15 cases.

  • For Epigenomics, the variants are ranked as follows: WSABD-WS first with 9/15 cases, WSABD-MTE second with 4/15 cases, WSABD-TE third with 2/15 cases, WSABD-PBI fourth with 1/15 case and finally WSABD-NIMBUS last with 0/15 cases. Note that WSABD-WS and WSABD-PBI both achieve the best result when the number of neighbours is 3 under the linear pricing model.

6.4.4 Variation of the number of iterations

  • Based on the general results, the variants are ranked as follows: WSABD-WS and WSABD-PBI first with 16/45 cases each, WSABD-MTE third with 12/45 cases, WSABD-TE fourth with 2/45 cases and lastly WSABD-NIMBUS with 0/45 cases. Note that WSABD-WS and WSABD-MTE both achieve the best result in one case (when the number of iterations is 2500 for Epigenomics with the linear pricing model). Among 36 cases, CFMAX achieves the best result in 28 cases (8/12 from Montage, 12/12 from Inspiral and 8/12 from Epigenomics).

  • For Montage, the variants are ranked as follows: WSABD-PBI first with 8/15 cases, WSABD-MTE second with 5/15 cases, WSABD-TE third with 2/15 cases, and finally WSABD-WS and WSABD-NIMBUS last with 0/15 cases each.

  • For Inspiral, the variants are ranked as follows: WSABD-PBI first with 8/15 cases, WSABD-MTE second with 6/15 cases, WSABD-WS third with 1/15 case, and finally WSABD-TE and WSABD-NIMBUS last with 0/15 cases each.

  • For Epigenomics, the variants are ranked as follows: WSABD-WS first with 15/15 cases, WSABD-MTE second with 1/15 case and finally WSABD-TE, WSABD-PBI and WSABD-NIMBUS last with 0/15 cases each. Note that WSABD-WS and WSABD-MTE both achieve the best result when the number of iterations is 2500 under the linear pricing model.

7 Conclusion

In this paper, the problem of minimizing the makespan and the monetary cost of a submitted workflow is modeled as a multi-objective optimization problem, and a novel workflow scheduling algorithm based on decomposition is proposed to tune the CPU frequency for each task so that both makespan and cost are minimized. The optimization results show that, under different conditions, every variant of the proposed algorithm performs well in at least some cases, and the runtime results show that different parameter settings cause runtime variability in all the tested cases. However, the proposed algorithm still has room for improvement. The use of cloud and/or cluster computing may require the optimization of more than two objectives at the same time: on the one hand, more objectives should be considered to further test the capability of the proposed algorithm; on the other hand, the algorithm's complexity should be lowered to provide better scalability. Future work could consider algorithms other than CFMAX to initialize the population of the proposed algorithm, and the efficiency of the proposed algorithm could be tested on a real cloud platform.