A cost and makespan aware scheduling algorithm for dynamic multi-workflow in cloud environment

Xia, Yuanqing; Zhan, Yufeng; Dai, Li; Chen, Yuehong

doi:10.1007/s11227-022-04681-3

A cost and makespan aware scheduling algorithm for dynamic multi-workflow in cloud environment

Published: 02 August 2022

Volume 79, pages 1814–1833, (2023)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

The Journal of Supercomputing Aims and scope Submit manuscript

A cost and makespan aware scheduling algorithm for dynamic multi-workflow in cloud environment

Download PDF

Yuanqing Xia¹,
Yufeng Zhan^1,2,
Li Dai¹ &
…
Yuehong Chen¹

465 Accesses
6 Citations
Explore all metrics

Abstract

With the development of cloud computing, a growing number of workflows are deployed on cloud platform that can dynamically provides cloud resources on demand for users. In clouds, one basic problem is how to schedule workflow for minimizing the execution cost and the workflow completion time. Aiming at the problem that the maximum completion time and cost of multiple workflows are too high, this paper proposes a model of dynamic multi-workflow scheduling in cloud environment and a new scheduling algorithm which is named as MT (multi-workflow scheduling technology). In MT, the heterogeneity of resources is considered when calculating the priority of tasks. Then, the technique for order of preference by similarity to ideal solution (TOPSIS) method is used to rank the resources when selecting resources for tasks. Finally, MT takes the estimated minimum completion time of the workflow and the cost of the task as two attribute indexes in TOPSIS decision matrix. Also, it uses a fixed reference point instead of calculating ideal solution, which ensures the uniqueness of the evaluation criteria when there is a change in the number of resources. Simulation experiments are illustrated to verify the effectiveness of the proposed algorithm in reducing the maximum completion time and cost of multiple workflows. Compared with the state-of-the-art methods, the maximum completion time and cost can be reduced by at most 17 and $9\%$, respectively.

Time- and Cost-Aware Scheduling Method for Workflows in Cloud Computing Systems

A novel strategy for deterministic workflow scheduling with load balancing using modified min-min heuristic in cloud computing environment

Article 15 March 2024

Unified Multi-constraint and Multi-objective Workflow Scheduling for Cloud System

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

With the rapid development of cloud computing technology, the cloud resource provides an efficient computing service model and provided great convenience for human life [1, 2] and is widely used in complex applications with the powerful computing capability [3], such as transportation, medical care, education and e-commerce industries [2]. In clouds, the cloud model consists of a great number of servers which are equipped with adequate cloud resources, such as CPU cores and memory and multiple virtual machines (VMs) instances are running simultaneously on these servers. In this way, many workflows applications are executed in cloud environment. As a result, there are many challenges for cloud service providers how to effectively schedule applications with cloud resources [4].

Workflow scheduling in cloud computing refers to obtaining the corresponding time and space mapping relationship between tasks and resources [5] and allocating tasks to proper resources according to different scheduling objectives, which not only plays a decisive role in the whole cloud workflow system, but also greatly affects QoS requirements of users [6]. Many heuristics or meta-heuristic algorithms [7,8,9] have been proposed for workflow scheduling to optimize a single objective, such as the scheduling length or execution cost. However, more than one objective need to be taken into consideration. Time and cost are the two most important but conflict QoS parameters, which increases the difficulty of workflow scheduling. In addition, current approaches mainly focuses on single workflow scheduling. As for multiple workflows, they will schedule the workflows sequentially, which cannot extract all the features of workflows so as to give the optimal scheduling strategy. In the cloud, the resources with the best performance usually have the most expensive prices. Therefore, how to balance these two parameters in scheduling when multiple workflow dynamically arrive is a challenge.

In this paper, we design a scheduling and optimization algorithm for dynamic scheduling multiple workflows in clouds. Different from current approaches, the proposed approach can schedule multiple workflows simultaneously, and the objective of this paper is to minimize the maximum completion time and the total cost for executing the dynamic multiple workflows. The main contributions are as follows:

We consider the heterogeneity of resources when calculating the priority of tasks. And the TOPSIS^{Footnote 1} method is adopted to select resources for tasks.
A new algorithm called MT is proposed, which can minimize the maximum completion time and the total cost of the multiple worklflows. Considering the influence of resource selection on the completion time of the child tasks, the estimated minimum completion time of the workflow is applied as one attribute index, and the sum of the tasks’ execution cost and data transmission cost is taken as the other attribute index, which can further reduce the execution time and cost of the workflow. In addition, it adopts a fixed reference point instead of calculating ideal solution, which ensures the uniqueness of the evaluation criteria when there has a change in the number of resources.
Simulation experiments are carried out to verify the effectiveness of the proposed algorithm. Experimental results validate that the proposed MT algorithm can generated less maximum completion time and total cost of multiple workflows than the state-of-the-art algorithms.

The remainder of this paper is organized as follows. Section 2 presents the related work. System model and the scheduling problem are given in Sect. 3. The proposed MT algorithm is introduced in Sect. 4. In Sect. 5, experiments are carried out to evaluate the algorithm’s efficiency. Finally, Sect. 6 concludes the paper.

2 Related work

There are many researches on workflow scheduling. And they can be divided into several categories according to different optimization objectives, such as cost or scheduling length. The heterogeneous earliest-finish-time (HEFT) algorithm has been proposed in [7], which is a heuristic algorithm to minimize the scheduling length of the workflow in the heterogeneous environment. Based on this, Lin et al. [11] proposed scalable heterogeneous earliest-finish-time (SHEFT) algorithm, which successfully realized the elastic scaling of resources in the process of scheduling, and effectively reduced the execution time of tasks. Besides the heuristic algorithms, some scholars use meta-heurisitc algorithms to slove these problems. Buyya [12] proposed particle swarm optimization (PSO) to generate the schedule plan. And [13] adopted ant colony optimization (ACO) algorithm to schedule workflows. For optimizing the execution cost and time, Pareto optimal scheduling heuristic (POSH) as a multi-objective heuristic optimization algorithm has been proposed by Su [14] in 2013, which aims to achieve the balance between the execution cost and the maximum completion time of the workflow. It selects tasks in the same way as described in HEFT. In the stage of resource selection, the task execution cost and execution time are weighted respectively and added to obtain a new objective function. Based on the new objective function, the optimal resource for the task can be selected.

Recently, Li et al. considered a multiobjective workflow scheduling problem and proposed a scoring and dynamic hierarchy-based NSGA-II (nondominated sorting genetic algorithm II), to minimize both workflow makespan and cost [15]. Chen et al. [16] adopted co-evolutionary multiple populations to design a novel multiobjective workflow scheduling algorithm with the ant colony system to minimize both workflow execution time and cost. Zhu et al. [17] proposed an evolutionary multiobjective optimization (EMO)-based algorithm with novel schemes for fitness evaluation, genetic operator. Garg and Singh [18] designed a multiobjective workflow scheduling optimization approach to optimize makespan and monetary cost simultaneously for the whole workflow execution with designed genetic operations in hybrid clouds.

However, previous studies are only focused at the single workflows. Aiming at the multiple workflows, the online workflow management (OWM) algorithm designed by Hsu et al. [19] to solve the scheduling problem of multiple and hybrid parallel workflows. However, because only idle resources are considered, the task may be delayed, which causes the increase of the completion time of the workflow. Different from current works, the proposed approach in this paper focuses on the workflow scheduling which has multi-objectives. In addition, our approach can schedule the multiple workflows simultaneously, which will match the optimal resources for each task, so as to make full use of the resources.

3 System model

The dynamic multi-worklfow model is illustrated in Fig. 1. After accepting the application requests, the system abstract them into the directed acyclic graph (DAG) and store them in the workflow repository. Ready task pool refers to tasks that are fully prepared and can be scheduled at any time in the workflow repository according to certain selection rules, such as round robin or first come first schedule. Various scheduling algorithms are embedded in the scheduler, which can intelligently select the optimal scheduling algorithms to assign tasks to appropriate resources according to the users’ personalized needs, and obtain the corresponding mapping relationship between tasks and resources.The task waiting queue mainly stores the corresponding waiting tasks on each resource. Finally, the specific scheduling is implemented in the cloud environment according to the scheme given in the scheduler and the order of tasks in the task waiting queue. The usage status of related cloud resources in the cloud also needs to be feed back to the scheduler in real time to provide real-time information for schedule subsequent tasks. In this paper, when a resource in the cloud resource pool executes a task, it can only process one task at a time, in other words, the execution of the task can’t be preempted.

Cloud service provider such as Amazon cloud can provide a platform which comprises a large number of virtual machines (VMs) or containers with different types. The VMs and containers in the cloud environment are collectively referred to as resources in the paper. The cloud environment is heterogeneous, that is, there are differences in computing and storage capacities among the cloud resources. And let $P=\{p_1,p_2,\ldots ,p_{|P|}\}$ denote the cloud resources provided, and |P| is the size of the resources. Traditionally, the applications submitted by different users could be modeled as workflows and denoted as $GS=\{G_1,G_2,\ldots ,G_{|M|}\}$, where |M| is the number of the workflows. Each workflow can be decomposed into many subtasks with precedence constraints. In this paper, the workflow is abstracted as a DAG and denoted as G(T, E, W, C), where $T=\{t_1,t_2,\ldots ,t_{|N|}\}$ is the set of |N| tasks, and E is the set of communication edges, where $E=\{e_{ij}\Vert i,j=1,\ldots |N| \}$ represents the data dependency constraints set for tasks of workflow. The $c_{ij} \in C$ is the data transmission time between $t_i$ and $t_j$. W is the $|N|\times |P|$ computation matrix, and |N| is the number of tasks in the DAG and |P| is the number of resources provided. $w_{i,k} \in W$ represents the time of task $t_i$ processed on resource $p_k$. The workflow model based on DAG is shown in Fig. 2 and the corresponding computation matrix is shown in Fig. 3.

Next, some definitions of the DAG model are described as follows:

$pred(t_i)$: The set of the parent tasks of task $t_i$. In the Fig. 2, task $t_1$ is the parent task of task $t_2$, $t_3$, $t_4$, $t_5$ and $t_6$. The task set composed of task $t_2$, $t_4$ and $t_6$ is the parent task set of task $t_8$, and the set composed of task $t_2$, $t_4$ and $t_5$ is the parent task set of task $t_9$. If a task has no parent task, the task is called the entry task of the DAG. If a DAG has more than one entry task, for convenience, a dummy entry task is added in the beginning of the DAG, which has zero execution time and zero–weight with the actual entry task.
$succ(t_i)$: The set of the child tasks of task $t_i$. In the Fig. 2, the set composed of task $t_2$, $t_3$, $t_4$, $t_5$ and $t_6$ is the child task set of $t_1$, and the task set consist of task $t_8$ and $t_9$ is the child task set of task $t_2$. If a task’s child task set is empty, the task is called the exit task of the DAG. If an DAG has more than one exit task, for convenience, a dummy exit task is added in the ending of the DAG, which has zero execution time and zero–weight with the actual exit task of the DAG.
$\mathrm{EST}(t_i,p_k)/\mathrm{EFT}(t_i,p_k)$: The earliest start/finish time of task $t_i$ in resource $p_k$.
$\mathrm{AST}(t_i)/\mathrm{AFT}(t_i)$: The actual start/finish time of task $t_i$ based on the actual scheduling.
$\mathrm{Makespan}(G_m)$: The finish time for the workflow $G_m$. The finish time of $G_m$ can be calculated by:
$$\begin{aligned} \mathrm{Makespan}(G_m)= \mathrm{AFT}(G_m.t_\mathrm{exit})-\mathrm{AST}(G_m.t_\mathrm{entry}), \end{aligned}$$
(1)
where $\mathrm{AFT}(G_m.t_\mathrm{exit})$ indicates the actual finish time of the exit task of $G_m$ and $\mathrm{AST}(G_m.t_\mathrm{entry})$ is the actual start time of the entry task of $G_m$.

3.1 Cost model

Cloud provides a strategy of pay-as-you-go pricing, where users are charged according to the used time of the resources. Owning to the various capacities of the resources, their prices are also different. Resources with rapid processing speed are expensive, and cheaper resources always have slow processing capacities. In the paper, task’s cost consists of computation cost and transmission cost. We use $\mathrm{price}(p_k)$ to denote the price per unit time associated with processor $p_k$ and $\mathrm{price}(p_{t})$ to represent the price of data transmission in unit time. The computation and transmission cost of task $t_i$ is

$$\begin{aligned} C(t_i,p_k) = \mathrm{price}(p_k) * w_{i,k}+\sum \limits _{t_m \in pred(t_i)}c_{mi}*\mathrm{price}(p_{t}), \end{aligned}$$

(2)

The cost of workflow $G_m$ is the sum of actual cost of all its tasks

$$\begin{aligned} C(G_m)=\sum \limits _{t_i\in G_m} C(t_i,p_k). \end{aligned}$$

(3)

3.2 Problem formulation

Based on the system scheduling model and the cost model, the scheduling problem in this paper is to search the proper map from tasks to resources to minimize the maximum completion time and cost in the multi-workflow system when multiple workflows arrive dynamically, which is

$$\begin{aligned} \begin{aligned} \mathrm{minimize}~~~&\mathrm{Makespan}(\mathrm{GS}),\\ \mathrm{minimize}~~~&C(\mathrm{GS}), \end{aligned} \end{aligned}$$

(4)

where $\mathrm{Makespan}(\mathrm{GS})$ represents the maximum completion time of all workflows in $\mathrm{GS}$, i.e., the total time from the beginning of the first task to the completion of last task in $\mathrm{GS}$. $C(\mathrm{GS})$ represents the total cost of all workflow. $\mathrm{Makespan}(\mathrm{GS})$ and $C(\mathrm{GS})$ are calculated via

$$\begin{aligned} \begin{aligned} \mathrm{Makespan}(\mathrm{GS})=&\max _{G_m \in \mathrm{GS}}\max _{t_\mathrm{exit}\in G_m}\mathrm{AFT}(t_\mathrm{exit})\\ C(\mathrm{GS})=&\sum \limits _{G_m \in \mathrm{GS}} C(G_m) \end{aligned} \end{aligned}$$

(5)

4 Minimize the maximum completion time and cost of multi-workflows

In this section, an algorithm named MT based on TOPSIS method is proposed for minimizing the maximum completion time and cost for multi-workflows. The algorithm mainly contains two phases, which are the task selection and the processor selection based on TOPSIS method. The multi-workflow scheduling process by MT algorithm is shown in Fig. 4. These two phases are described in detail as follows.

4.1 Task selection

The traditional method of calculating tasks’ priorities, such as the average execution time of tasks on resources when calculating $rank_u$ in HEFT [7] algorithm, eliminates the heterogeneity among resources’ performance. In this section, we consider the heterogeneity of resources in the cloud computing when calculating the priorities of tasks and iteratively calculate the rank value of tasks on each resource in turn

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \mathrm{Rank}_n(t_i,p_k)=w_{i,k}+\max \limits _{t_j \in succ(t_i)} \{c_{ij}+\mathrm{Rank}_n(t_j,p_k)\},\\ \displaystyle \mathrm{Rank}_n(t_\mathrm{exit},p_k)=w_\mathrm{exit,k}, \end{array} \right. \end{aligned}$$

(6)

where $\mathrm{Rank}_n(t_i,p_k)$ represents the longest distance between task $t_i$ and the exit task when resource $p_k$ is selected by task $t_i$. $w_{i,k}$ and $w_\mathrm{exit,k}$ indicate the execution time of task $t_i$ and $t_\mathrm{exit}$ on resource $p_k$, respectively. $c_{ij}$ is the time of data transmission between task $t_i$ and its child task $t_j$.

According to equation (6), the priority of task $t_i$ can be determined by

$$\begin{aligned} \mathrm{Rank}_n(t_i)=\sum \limits _{p_k \in P} \mathrm{Rank}_n(t_i,p_k)/|P|. \end{aligned}$$

(7)

After obtaining the value of $\mathrm{Rank}_n$ for all tasks in each workflow, place the tasks in the $init\_queue$ of the corresponding workflow in the order of decreasing $\mathrm{Rank}_n$. In order to ensure fairness, for the unscheduled workflows, the tasks with the largest $\mathrm{Rank}_n$ in every workflow’s $init\_queue$ are submitted to the $ready\_pool $, and then the tasks in the $ready\_pool $ are reordered according to the following formula

$$ {\text{Rank}}_{r} (G_{m} \cdot t_{i} ) = \frac{1}{{{\text{PRT}}(G_{m} )}} + \frac{1}{{{\text{CPL}}(G_{m} )}}, $$

(8)

where $\mathrm{PRT}(G_m)$ represents the percentage of the remaining unscheduled tasks in the workflow $G_m$, and $\mathrm{CPL}(G_m)$ is the critical length of $G_m$. The equation indicates that if there are two workflows with the same number of tasks, the task of the workflow with the less unscheduled tasks and the smaller $\mathrm{CPL}$ will gets a high priority.

After calculating the $\mathrm{Rank}_r$ of all tasks in the $ready\_pool$, selecting the task $t_\mathrm{curr}$ with the largest $\mathrm{Rank}_r$ value, and determining the appropriate resources for the selected task according to TOPSIS method. Before introducing the strategy of resource selection, next we first introduce TOPSIS method for multi-attribute decision-making.

4.2 Overview for TOPSIS method

TOPSIS [20] method for multi-attribute decision-making is simple in calculation and rigorous in logic, which can intuitively show the gap among the various schemes. The main steps of the TOPSIS method is

(1) Based on the evaluation index of the problem and the given multiple optional schemes, the original decision matrix X of the problem is determined via

$$\begin{aligned}&X= \begin{bmatrix} x_{11} &{} x_{12} &{} \cdots &{} x_{1n} \\ x_{21} &{} x_{22} &{} \cdots &{} x_{2n} \\ \vdots &{} \vdots &{} \vdots &{} \vdots \\ x_{m1} &{} x_{m2} &{} \cdots &{} x_{mn} \\ \end{bmatrix},\nonumber \\&\quad (i=1,2,\ldots ,m;j=1,2,\ldots ,n), \end{aligned}$$

(9)

where $x_{ij}$ is the value of the jth evaluation index on the ith scheme, m is the number of given optional schemes, and n is the number of attribute indexes.

(2) Because the attribute indexes to be evaluated are different from each other, the dimensions of data in decision matrix X are also quite different. In order to eliminate the influence of data dimension, this section uses vector method to standardize the data of X as follows

$$\begin{aligned} q_{ij}=\frac{x_{ij}}{\sqrt{\sum \limits _{i=1}^m{x_{ij}}^2}}\quad (i=1,2,\ldots ,m;\,\,\,j=1,2,\ldots ,n). \end{aligned}$$

(10)

After the standardization of X, the dimensionless matrix Q is

$$\begin{aligned}&Q= \begin{bmatrix} q_{11} &{} q_{12} &{} \cdots &{} q_{1n} \\ q_{21} &{} q_{22} &{} \cdots &{} q_{2n} \\ \vdots &{} \vdots &{} \vdots &{} \vdots \\ q_{m1} &{} q_{m2} &{} \cdots &{} q_{mn} \end{bmatrix},\nonumber \\&\quad (i=1,2,\ldots ,m;\,\,\,j=1,2,\ldots ,n). \end{aligned}$$

(11)

(3) In the actual problem, the significance of different attribute indexes are various. Thus, it is necessary to determine the weight of attribute indexes in advance according to the actual demand, and to multiply the determined index weight by the corresponding value in Q to obtain the weighted matrix V as

$$\begin{aligned} v_{ij}=q_{ij}*w_j ,\nonumber \\ \sum \limits _{j=1}^n{w_j}=1, \nonumber \\ (i=1,2,\ldots ,m;\,\,\,j=1,2,\ldots ,n). \end{aligned}$$

(12)

(4) Determining the positive ideal solution $Z^+$ and negative ideal solution $Z^-$ in matrix V, and for different types of indexes, $Z_j^+$ and $Z_j^-$ can be expressed as follows, respectively,

$$\begin{aligned}&\begin{aligned} Z_j^+=\left\{ \begin{array}{l} min(v_{ij}) , ~~~\text {index of minimization}, \\ max(v_{ij}) , ~~~\text {index of maximization}, \end{array} \right. \\ (i=1,2,\ldots ,m;\,\,\,j=1,2,\ldots ,n), \end{aligned} \end{aligned}$$

(13)

$$\begin{aligned}&\quad \begin{aligned} Z_j^-=\left\{ \begin{array}{l} max(v_{ij}) , ~~~\text {index of minimization}, \\ min(v_{ij}) , ~~~\text {index of maximization}, \end{array} \right. \\ (i=1,2,\ldots ,m;\,\,\,j=1,2,\ldots ,n). \end{aligned} \end{aligned}$$

(14)

(5) Determining the Euclidean distance between scheme i and positive ideal solution $Z^+$ and negative ideal solution $Z^-$ as

$$\begin{aligned} \begin{aligned}&\begin{array}{l} d_i^+=\sqrt{\sum \limits _{j=1}^n(Z_j^+-v_{ij})^2},\\ d_i^-=\sqrt{\sum \limits _{j=1}^n(Z_j^--v_{ij})^2},\\ \end{array}\\&\quad (i=1,2,\ldots ,m;\,\,\,j=1,2,\ldots ,n). \end{aligned} \end{aligned}$$

(15)

(6) The relative closeness $R_i$ between the scheme i and the ideal solution $Z^+$ is

$$\begin{aligned} R_i=\frac{d_i^-}{d_i^++d_i^-},\quad (i=1,2,\ldots ,m). \end{aligned}$$

(16)

It can be seen from equation (16) that the closer $R_i$ is to 1, the closer the scheme is to the positive ideal solution $Z^+$, in other words, the better the scheme i is compared with other schemes. Each candidate scheme is sorted according to the principle of the relative closeness decreasing.

(7) According to the practical problems and the sorting results of all schemes, the best scheme can be selected.

4.3 Resource selection based on TOPSIS method

Combined with the actual scheduling problem of this paper, this section uses TOPSIS method to select th most appropriate resource for the task $t_\mathrm{curr}$. The main steps is:

Step1: Deterimine the original decision matrix $X_{m*n}$. The number m of $X_{m*n}$ is the size |P| of the set of provided resources. The objective of the paper is to minimize the completion time and cost of the workflows. Thus, the number n is 2. When the task $t_\mathrm{curr}$ is executed on the resource $p_i$, the $\mathrm{EFT}(t_\mathrm{curr}, p_i)$ of $t_\mathrm{curr}$ on the resource $p_i$ is firstly calculated by using formula (17) and the resource insertion strategy is

$$\begin{aligned} \begin{array}{l} \mathrm{EST}(t_\mathrm{curr},p_i)=\max \{T_{avail}(p_i), \max \limits _{t_m\in pred(t_\mathrm{curr})}(\mathrm{AFT}(t_m)+c_{m(\mathrm{curr})})\}, \\ \mathrm{EFT}(t_\mathrm{curr},p_i)=\mathrm{EST}(t_\mathrm{curr},p_i)+w_{\mathrm{curr},i}, \end{array} \end{aligned}$$

(17)

where $T_\mathrm{avail}(p_i)$ is the available time of resource $p_i$, and the inner max of $\mathrm{EST}(t_\mathrm{curr},p_i)$ indicates that only if all parent tasks of $t_\mathrm{curr}$ have been finished and the data required by $t_\mathrm{curr}$ have arrived at $p_k$, then $t_\mathrm{curr}$ can be ready for processing.

When optimizing the completion time of the workflow, we should not only consider the impact of resource selection on the current task’s completion time, but also take the influence on the task completion time of the child task. Therefore, calculating the maximum value of the shortest path from all child tasks of $t_\mathrm{curr}$ to the exit task is

$$\begin{aligned} \begin{aligned} \mathrm{Min}(t_\mathrm{curr},p_i)=\max \limits _{t_s\in \mathrm{succ}(t_\mathrm{curr})} \{\min \limits _{p_o\in P}\{w_{s,o}+c_{(\mathrm{curr})s}+\mathrm{Min}(t_s,p_o) \} \}, \end{aligned} \end{aligned}$$

(18)

where $w_{s,o}+c_{(\mathrm{curr})s}$ represents the sum of the computation time of the child task $t_s$ on resource $p_o$ and the data transmission time between $t_s$ and $t_\mathrm{curr}$ when $t_\mathrm{curr}$ selects $p_i$. Note that, when $t_\mathrm{curr}$ and $t_s$ select the same resource, the data transmission time is $c_{(\mathrm{curr})s}=0$.

Adding $\mathrm{Min}(t_\mathrm{curr},p_k)$ to the calculated $\mathrm{EFT}(t_\mathrm{curr}, p_i)$ to estimate the minimum completion time of the workflow $\mathrm{Min}G(t_\mathrm{curr},p_i)=\mathrm{EFT}(t_\mathrm{curr},p_i)+\mathrm{Min}(t_\mathrm{curr},p_i)$. Making $\mathrm{Min}G(t_\mathrm{curr},p_i)$ as the first attribute index of the decision matrix, namely, $x_{i1}=\mathrm{Min}G(t_{\mathrm{curr},p_i})$.

The second attribute index of the decision matrix is the sum of the computation cost and the data transmission cost of $t_\mathrm{curr}$ is defined as $x_{i2}=C(t_{\mathrm{curr},p_i}).$

step2: The dimensionless matrix $Q_{m*n}$ is obtained by standardizing $X_{m*n}$ according to the equation (10).

step3: According to the equation (12) and the weight determined for the two attribute indexes, we can get the weighted matrix $V_{m*n}$. In this paper, we assume that the completion time is more significant than the cost, so we set the weight of time and cost to 0.9 and 0.1, respectively.

step4: When the traditional TOPSIS method matches resources for tasks, if the size of the given resource increases or decreases, it is likely that the ideal solution will change, which may cause that the evaluation results of the same two resources will reverse. In order to solve this problem, we use a fixed reference point instead of determining ideal solution (Sect. 4.2 (4)), which can ensure the uniqueness of evaluation criteria and the robustness of the proposed algorithm under different size of resources.

Since the two attribute indexes in this section belong to the indexes of minimization and , We use 0 as the positive ideal solution and 1 as the negative ideal solution as follows

$$\begin{aligned} \begin{aligned} AZ_j^+&=0,\\ AZ_j^-&=1,\\ (j&=1,2,\ldots ,n). \end{aligned} \end{aligned}$$

(19)

Based on the above analysis, $AZ^+= [0, 0]$ , $AZ^- = [1, 1]$ in this section.

step5: Replace $Z^+$ and $Z^-$ with $AZ^+$ and $AZ^-$ in (15). Calculate the euclidean distance between each resource and $AZ^+$ according to (15).

step6: Obtain the relative closeness between each resource scheme and the absolute ideal solution $AZ^+$ according to the formula (16), and rank all resources in descending order of the relative closeness.

step7: Select the resource $p_\mathrm{sel}$ with the highest value of the relative closeness for task $t_\mathrm{curr}$, and assign $t_\mathrm{curr}$ to resource $p_\mathrm{sel}$ for execution.

The process of our algorithm for dynamic multi-workflow scheduling is shown in Algorithm 1.

Table 1 Parameters for experiments

Full size table

5 Experiments

5.1 Test environments and metrics

The experiment is implemented in the heterogeneous experimental environment (2.11 ghz, 8GB RAM) using java language. The heterogeneous experimental environment consists of several VMs which are with different service unit prices and computing capacities. The parameters for experiments are shown in Table 1. The unit price per unit time of resources is set as $0.3 \$/h \le \mathrm{price}(p_k)\le 0.7\$/h$. The unit price of data transmission between resources is set as $\mathrm{price}(p_\mathrm{comm})=0.1 \$/h$. The task execution time is denoted by $10 us \le w_{j,k} \le 100 us$ which means the different computing capacities of VMs. The Larger $w_{j,k}$ means lower computing capacity for a VM.

In addition, the five types of workflows used in the experiment are shown in Fig. 5. They are linear algebra, Gaussian elimination, Diamond graph, Complete binary tree, and Fast Fourier transform, respectively. To illustrate the number of tasks of each workflow, we introduce a parameter $\rho $.

(a) Linear algebra: The total number of tasks is $|N|=\rho (\rho +1)/2$. And the Fig. 5a is the worklfow with $\rho =5$.

(b) Gaussian elimination: The total number of tasks is $|N|=\frac{\rho ^2+\rho -2}{2}$. And the Fig. 5b is the worklfow with $\rho =5$.

(c) Diamond graph: The total number of tasks is $|N|=\rho ^2$. And the Fig. 5c is the worklfow with $\rho =4$.

(d) Complete binary tree: The total number of tasks is $|N|=2^\rho -1$. And the Fig. 5d is the worklfow with $\rho =5$.

(e) Fast Fourier transform: The total number of tasks is $|N|=2\times \rho -1+\rho \times log_2\rho $, where $\rho =2^y$. And the Fig. 5e is the worklfow with $\rho =4$.

There are more than one entry task in the workflow of Linear algebra, so we add a virtual entry task and all the actual entry tasks are set as the immediate successor tasks of the virtual entry task. There are more than one exit task in the complete binary tree and Fast Fourier transform workflows. Therefore, a virtual exit task should be added and all the actual exit tasks should be set as the immediate predecessor tasks of the virtual exit task. Note that, the data transmission time between the added virtual tasks and the actual tasks is zero.

Table 2 Workflow types for experiments

Full size table

Furthermore, the number of tasks in linear algebra, Guassian elimination, Diamond graph, Complete binary and Fast Fourier transform workflows are shown in Table 2. The proportion of the five types of workflows is the same under different number of workflows, that is, when the number of multiple workflows is 10, the number of workflows of each type is 2, and when the number of multiple workflows is 20, the number of workflows of each type is 4. Moreover, the workflow parameters are set as $10us \le w_{i,k}\le 100us$, $10us \le c_{ij}\le 100us$ in Table 1.

The performance metrics of the algorithm in this paper are the maximum completion time of multi-workflow $\mathrm{Makespan}(\mathrm{GS})$ and the total cost of the system $C(\mathrm{GS})$. All of the results for each experiment are average values after 20 times of the dynamic and multiple workflows’ execution.

5.2 Experiments results

In this subsection, the effectiveness of MT algorithm is validated. And we show the effectiveness of the algorithm from three aspects: the number of workflows, the arrival time interval of workflow and the number of resources.

5.2.1 Varying number of workflows

The compared algorithms are OWM, FDWS and MPOSH, where MPOSH is the extension of POSH to the problems of multiple workflow by simply adding the function of receiving multiple workflows. This part evaluate the performance of four algorithms by different workflow numbers.

In Fig. 6, the workflow arrival interval is 30, and the number of workflows is set in the range of 10, 20, 30, 40, 50. The number of resources provided is 100. According to Fig. 6, the makespan of four algorithms are increasing with the increase of the workflow number. The reason is that the set of given resources is fixed, and the increase of the amount of tasks to be processed will inevitably lead to the increase of time consumption and cost. In addition, it can be seen that the makespan and cost of MT algorithm are lower than those of the other three algorithms under different workflow numbers. This is mainly because the MT algorithm considers the heterogeneity of VMs for sorting tasks and adopts the TOPSIS method to rank the resources for executing tasks, which can further reduce the execution time and cost of the workflow. Furthermore, to fully demonstrate the benefits of the proposed algorithm, we set the workflow arrival interval as 10 and the number of resources as 100. The number of workflows is set in the range of 20, 40, 60, 80 which is used to evaluated the makespan and cost of four algorithms for workflows with different workflow numbers in Fig. 7. From Fig. 7, the makespan of four algorithms are increasing with the increase of the workflow number. In addition, it can be seen that the makespan and cost of MT algorithm are lower than those of the other three algorithms under different workflow numbers. Hence, the experimental results demonstrate that MT outperforms three state-of-the-art algorithms in terms of makespan and cost with different workflow numbers.

5.2.2 Varying the arrival time interval of workflow

In this subsection, the performance of FDWS algorithm, OWM algorithm, MPOSH algorithm and MT algorithm are compared from the perspectives of maximum completion time and total system cost of multiple workflows by different arrival time interval of multiple workflows. The number of workflows in this part is 30, and the number of resources is 100. The experimental results are shown in Figs. 8 and 9.

In Fig. 8, the arrival time interval of workflow is set to 10, 20, 30, 40, 50 respectively. Figure 8 indicates that under the same number of workflows, the makespan of the four algorithms gradually increases with the increase of the arrival time interval of the workflows. Moreover, compared with the other three algorithms, the makespan and cost of multiple workflows obtained by MT algorithm are the smallest. And the performance of MPOSH algorithm is worse than FDWS and OWM algorithms. Furthermore, the arrival time interval of workflow is set to 60, 70, 80, 90 and 100 in Fig. 9. From Fig. 9, the makespan of all algorithms are increasing with the increase of the arrival time interval. In addition, it can be seen that the makespan and cost of MT algorithm are lower than those of the other three algorithms under different arrival time interval. Hence, We can observe that the performance of MT is better compared with the FDWS, OWN and MPOSH in these experiments.

5.2.3 Varying number of resources

We evaluate the performance of four algorithms with different number of resources in this part. The experimental results of the four algorithms when the number of resources changes as shown in Figs. 10 and 11.

In Fig. 10, the number of resources is set to 10, 20, 30, 40 and 50, respectively. The number of workflows set in this part is 10, and the arrival time interval of workflow is 10. As depicted in Fig. 10, with the change of the number of resources, the MT algorithm can also achieve the best performance in the maespan and the cost compared with the other three algorithms. It shows that the MT algorithm has a good robustness when there has a change in the number of resources provided. This is mainly because that MT algorithm adopts absolute ideal solution of 0-1 type instead of relative ideal solution to ensure the uniqueness of evaluation criteria and avoid the reverse order of the same two resources when it uses the TOPSIS method to select resources for tasks. To further demonstrate the effectiveness of MT, the number of workflows is set as 30, and the arrival time interval of workflow is 30. The number of resources is set to 100, 120, 140, 160, 180 and 200 in the experiments, respectively. The experimental results of the four algorithms when the number of resources changes as shown in Fig. 11. It can be seen from Fig. 11 that the makespan and cost of MT algorithm are lower than those of the other three algorithms under different resource numbers. Hence, We can observe that the performance of MT is better compared with the FDWS, OWN and MPOSH in these experiments.

6 Conclusion

In this paper, we focus on the optimization of maximum completion time and total cost for the dynamic multi-workflow. Firstly, the system model and framework of dynamic multi workflow scheduling are proposed. Secondly, the optimization algorithm of dynamic multi workflow scheduling based on TOPSIS is designed, and the task priority calculation method, selection process and resource selection based on TOPSIS in heterogeneous computing environment are given. Finally, experiments hava been carried out in five types of real workflows, which fully prove the superiority and effectiveness of the scheduling algorithm proposed in this paper in reducing the maximum completion time and saving the total cost of the multiple workflows. In the future, we intend to extend MT to other workflow scheduling problems such as energy-aware and privacy-aware workflows scheduling in clouds.

Notes

The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is a multi-criteria decision analysis method, it is based on the concept that the chosen alternative should have the shortest geometric distance from the positive ideal solution and the longest geometric distance from the negative ideal solution [10].

References

Xia Y (2015) Cloud control systems. IEEE/CAA J Autom Sinica 2(2):134–142
Article MathSciNet Google Scholar
Xia Y, Yan C, Wang X (2019) Intelligent transportation cyber-physical cloud control systems. Acta Autom Sinica 45(1):132–142
MATH Google Scholar
Atmaca T, Begin T, Brandwajn A, Casteltaleb H (2016) Performance evaluation of cloud computing centers with general arrivals and service. IEEE Trans Parallel Distrib Syst 27(8):2341–2348
Article Google Scholar
Xie G, Chen Y, Li R, Li K (2018) Hardware cost design optimization for functional safety-critical parallel applications on heterogeneous distributed embedded systems. IEEE Trans Industr Inf 14(6):2418–2431
Article Google Scholar
Yassir S, Mostapha Z, Claude T (2017) Workflow scheduling issues and techniques in cloud computing: a systematic literature review. In: Proceedings of the International Conference of Cloud Computing Technologies and Applications, pp. 241– 263 . Springer
Bardsiri AK, Hashemi SM (2012) A review of workflow scheduling in cloud computing environment. Int J Comput Sci Manag Res 1(3):348–351
Google Scholar
Topcuoglu H, Hariri S, Wu M-y (2002) Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans Parallel Distrib Syst 13(3):260–274
Article Google Scholar
Braun TD, Siegel HJ, Beck N, Bölöni LL, Maheswaran M, Reuther AI, Robertson JP, Theys MD, Yao B, Hensgen D et al (2001) A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. J Parallel Distrib Comput 61(6):810–837
Article MATH Google Scholar
Sakellariou R, Zhao H, Tsiakkouri E, Dikaiakos MD (2007) Scheduling workflows with budget constraints, 189–202
Assari A, Mahesh T, Assari E (2012) Role of public participation in sustainability of historical city: usage of topsis method. Indian J Sci Technol 5(3):2289–2294
Article Google Scholar
Lin C, Lu S ( 2011) Scheduling scientific workflows elastically for cloud computing. In: Proceedings of the 4th IEEE International Conference on Cloud Computing, pp. 746– 747 . IEEE
Rodriguez MA, Buyya R (2014) Deadline based resource provisioningand scheduling algorithm for scientific workflows on clouds. IEEE Trans Cloud Comput 2(2):222–235
Article Google Scholar
Chen Z-G, Zhan Z-H, Li H-H, Du K-J, Zhong J-H, Foo YW, Li Y, Zhang J (2015) Deadline constrained cloud computing resources scheduling through an ant colony system approach. In: 2015 International Conference on Cloud Computing Research and Innovation, pp. 112– 119 . IEEE
Su S, Li J, Huang Q, Huang X, Shuang K, Wang J (2013) Cost-efficient task scheduling for executing large programs in the cloud. Parallel Comput 39(4–5):177–188
Article Google Scholar
Li H, Wang B, Yuan Y, Zhou M, Fang Y, Xia Y (2021) Scoring and dynamic hierarchy-based nsga-ii for multiobjective workflow scheduling in the cloud. IEEE Trans Autom Sci Eng. https://doi.org/10.1109/TASE.2021.3054501
Article Google Scholar
Chen Z-G, Zhan Z-H, Lin Y, Gong Y-J, Gu T-L, Zhao F, Yuan H-Q, Zhang J (2019) Multiobjective cloud workflow scheduling: a multiple populations ant colony system approach. IEEE Trans Cybern 49(8):2912–2926
Article Google Scholar
Zhu Z, Zhang G, Li M, Liu X (2016) Evolutionary multi-objective workflow scheduling in cloud. IEEE Trans Parallel Distrib Syst 27(5):1344–1357
Article Google Scholar
Garg R, Singh AK ( 2011) Multi-objective workflow grid scheduling based on discrete particle swarm optimization. In: International Conference on Swarm, Evolutionary, and Memetic Computing, pp. 183– 190 . Springer
Hsu C-C, Huang K-C, Wang F-J (2011) Online scheduling of workflow applications in grid environments. Futur Gener Comput Syst 27(6):860–870
Article Google Scholar
Lai Y-J, Liu T-Y, Hwang C-L (1994) Topsis for modm. Eur J Oper Res 76(3):486–500
Article MATH Google Scholar
Zhan, Y (2022) https://github.com/Ray-ZHAN/workflow-scheduling. [Online; accessed 06-Jun.-2022]

Download references

Acknowledgements

All codes and data are published on GitHub [21]. There is no conflict of interest in this paper.

Author information

Authors and Affiliations

School of Automation, Beijing Institute of Technology, Beijing, 100081, China
Yuanqing Xia, Yufeng Zhan, Li Dai & Yuehong Chen
Yangtze Delta Region Academy of Beijing Institute of Technology, Jiaxing, 314000, China
Yufeng Zhan

Authors

Yuanqing Xia
View author publications
You can also search for this author in PubMed Google Scholar
Yufeng Zhan
View author publications
You can also search for this author in PubMed Google Scholar
Li Dai
View author publications
You can also search for this author in PubMed Google Scholar
Yuehong Chen
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yufeng Zhan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Xia, Y., Zhan, Y., Dai, L. et al. A cost and makespan aware scheduling algorithm for dynamic multi-workflow in cloud environment. J Supercomput 79, 1814–1833 (2023). https://doi.org/10.1007/s11227-022-04681-3

Download citation

Accepted: 20 June 2022
Published: 02 August 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s11227-022-04681-3

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A cost and makespan aware scheduling algorithm for dynamic multi-workflow in cloud environment

Abstract

Similar content being viewed by others

Time- and Cost-Aware Scheduling Method for Workflows in Cloud Computing Systems

A novel strategy for deterministic workflow scheduling with load balancing using modified min-min heuristic in cloud computing environment

Unified Multi-constraint and Multi-objective Workflow Scheduling for Cloud System

1 Introduction

2 Related work