1 Introduction

Over the last few years, cloud computing (CC) has become an emerging research area and is considered the main model of distributed computing. It offers elastically scalable and highly available resources as a subscription-based service, in the manner of utility computing [1], for executing scientific workflows (SWfs). A SWf (such as Montage, CyberShake, Epigenomics, LIGO or SIPHT) is a transposition of the general notion of workflow to the experimental context, that is, it relates only to computational processes connected by complex data flows and control dependencies, while automating their execution on appropriate resources [2].

Successful execution of a SWf requires optimal use of resources. The work must therefore focus on finding the best strategy for allocating workflow tasks to the available computing resources. This is called workflow scheduling (WS). WS aims at mapping and managing the execution of interdependent tasks while considering precedence constraints on shared resources [3]. This problem is known to be \({\mathcal {NP}}\)-Complete [3] due to its combinatorial nature, which has prompted researchers to seek near-optimal solutions.

Workflows must be well defined and managed before they can be executed; this is the role of an efficient workflow management system (WMS).

As shown in Fig. 1, to schedule and map workflow tasks to the available resources in a cloud environment, the system needs a workflow scheduler acting as a bridge.

Fig. 1 Scientific workflow execution architecture in a cloud environment

In this work, we extend the solution proposed in our previous work [4] by addressing SWf scheduling in the cloud environment. We focus on the efficient use of resources in order to run a SWf composed of interdependent tasks.

To this end, we develop a new hybrid genetic algorithm (GA) for SWf scheduling in CC. The main idea of the proposed solution is to match SWf tasks to appropriate resources in order to minimize the computation cost and the execution time while meeting the deadline and budget.

Task scheduling in CC is among the most important problems for the various stakeholders in this environment, especially when it comes to scheduling SWfs of different levels of difficulty and complexity.

In the scientific field, we find workflows that are sensitive to large volumes of data, others that are sensitive to complex processing, and some that are sensitive to several criteria at the same time.

This important subject has prompted several researchers to propose solutions aimed at optimizing the processing of these SWfs, in particular by seeking a compromise between two contradictory quality of service (QoS) parameters, time and cost. In this context, it should be noted that QoS determines the level of satisfaction of a user with a given service. QoS is generally measured by metrics such as computational time, computational cost and reliability. To process workflows very quickly, one must generally rent powerful resources, which are costly; conversely, cheaper resources can be slow to complete a job.

This contradiction led us to design a solution aiming to minimize the processing time while also minimizing its cost as much as possible, subject to the deadline and budget constraints.

To achieve this dual objective, we looked in the literature for the most widely used approaches that can meet it with good results. We chose to combine the power and simplicity of the Heterogeneous Earliest Finish Time (HEFT) heuristic with an evolutionary algorithm, the GA.

This combination produces a hybrid solution aimed at optimizing the scheduling of SWfs. The performance of the GA can be enhanced by incorporating the solution generated by HEFT into the set of randomly generated solutions making up the initial population. HEFT is chosen because it offers very good scheduling of SWfs. This approach tends to speed up the convergence of the scheduling process toward the optimum, minimizing computational time and cost.

The proposed approach is hybrid because a heuristic, HEFT, intervenes in the generation of the initial population. This HEFT-GA combination aims to obtain a set of near-optimal solutions.

Based on the empirical results obtained from our simulations, we demonstrate that the proposed algorithm performs better than other state-of-the-art strategies in solving the WS problem in cloud environments.

The remainder of this paper is organized as follows. We discuss related work in Sect. 2. Section 3 formulates the scientific workflow scheduling problem in CC, which aims to minimize cost and time while meeting deadline and budget. Section 4 presents the proposed hybrid approach based on GA and HEFT. Section 5 describes the simulation environment, Sect. 6 reports the experiments and results, and Sect. 7 concludes the paper.

2 Related work

Several research studies conducted in recent years have analyzed the SWf scheduling problem in the field of CC. In the literature, several approaches have been applied in order to optimize one or several objectives. Table 1 lists the criteria, constraints, methods, SWf applications and implementation environments chosen in a number of research works. This section is organized into four parts, each related to a different type of optimization problem.

Table 1 State of the art related to scientific workflow scheduling

2.1 Unconstrained SWf scheduling problems

The objective of [5] is to optimize scientific workflow execution in clouds by exploiting multicore systems to parallelize bottleneck tasks.

In [23], the authors proposed a solution to optimize the resource usage of a workflow schedule.

Sahni et al. [24] propose a workflow- and platform-aware task clustering technique that aims to achieve the maximum possible parallelism among tasks.

In [25], Zhang et al. apply an ordinal optimization method iteratively in order to execute scientific workflows on elastic cloud compute nodes under dynamic workloads.

In [13], the authors proposed a solution that optimizes both makespan and cost.

Xiang et al. [6] presented a novel WS algorithm to minimize makespan in heterogeneous environments.

In [14], the authors optimized the makespan and monetary cost in cloud environment using GA.

In [7], Chirkin et al. proposed a method for optimizing the workflow makespan.

Fard et al. [26] proposed a solution based on a list scheduling heuristic to optimize four objectives: makespan, cost, energy consumption and reliability.

In [27], the authors proposed a novel approach based on GA that uses the reliability-driven reputation to optimize makespan and reliability.

Zhang et al. [15] proposed a vectorized ordinal optimization approach to optimize makespan and cost.

2.2 SWf scheduling problems with deadline constraint

In [16], the authors minimized the execution cost of the workflow while meeting the deadline in CC environment.

Vinay et al. [28] scaled resources vertically in order to maximize the utilization of the resources required to execute a scientific workflow while meeting its deadline.

The minimization of the execution cost of a workflow in clouds under a deadline constraint was the main objective of [17].

In [8], the authors proposed a heuristic algorithm for scheduling workflows in deadline-constrained clouds, which is used to initialize their proposed GA.

In [18], Rodriguez et al. presented a resource provisioning and scheduling strategy for scientific workflows on IaaS clouds.

Li and Cai [11] divide the workflow deadline into task deadlines in order to minimize resource renting costs.

The objective addressed in [12] is to develop a scheduling system that minimizes the expected monetary cost under user-specified probabilistic deadline guarantees.

In [19], the authors proposed a WS strategy in order to minimize the makespan and cost while meeting the deadline.

Visheratin et al. [9] proposed a co-evolutional GA for scheduling a series of workflows in order to minimize the makespan while meeting the deadline.

2.3 Optimization problems with deadline and budget constraints

Maximizing the number of completed workflows under both budget and deadline constraints is the optimization problem treated in [29].

In [10], Calheiros and Buyya replicated workflow tasks using idle time to mitigate the effects of performance variation of resources.

In [20], the authors proposed an algorithm to solve the WS problem in order to minimize the time and cost while meeting deadline and budget.

Shishido et al. [21] used a security- and cost-aware WS algorithm that applies a GA to optimize the combinatorial scheduling scheme.

2.4 Optimization problem with other constraints

The main objective of [30] was to partition a workflow application into sub-workflows in order to minimize data dependencies, and then to execute each sub-workflow so as to minimize its partial makespan.

Liu et al. [22] executed a SWf in a multisite cloud while reducing makespan and cost.

As shown in Fig. 2, the majority of the works focused on:

  • Performance metrics: makespan and cost.

  • QoS constraints: deadline and, to a lesser extent, budget.

  • SWf applications: CyberShake, LIGO, Montage, Epigenomics and SIPHT.

Fig. 2 Comparison between (1) performance metrics, (2) QoS constraints and (3) SWf applications

From this study, we can deduce that makespan and cost are the most important QoS parameters in task scheduling problems in the cloud environment, subject to QoS constraints such as deadline and budget: workflows must be executed before the scheduled deadline without exceeding the allocated budget. Hence, the makespan, which represents the completion time of the last task of the workflow, must be lower than the deadline, for a total cost not exceeding the budget. These constraints are defined in Eqs. (6) and (7).

3 Problem formulation

We address SWf scheduling in CC in order to minimize the computational cost and the execution time (makespan) while meeting two constraints, the deadline and the budget.

The problem consists in responding properly to client demands to run a workflow, that is, a set of interdependent tasks.

3.1 Structural representation

The WS representation requires an adequate modeling technique. We choose a directed graph without circuits, known as a directed acyclic graph (DAG) [10, 13, 16], which is the most popular model, as shown in Table 1. The nodes of the DAG are the tasks, and the arcs represent the dependency relationships between the tasks. Figure 3 shows a general workflow DAG scheme containing 8 tasks. Other SWf DAG schemes are shown in Fig. 10.

Fig. 3 Workflow DAG scheme sample

A DAG is represented by a pair of sets of vertices (V) and edges (E). It is usually denoted by \(W=\lbrace V,E\rbrace\) where

  • \(V=\lbrace t_1 \ldots t_n\rbrace\) is the set of n tasks of a SWf;

  • \(E=\lbrace t_i \rightarrow t_j \vert t_i,t_j \in V\) and \(t_i\) is parent of \(t_j\quad i,j = 1, \ldots , n \rbrace\) is a set of directed edges (control, data dependencies) that connect the tasks (vertices).

  • \(t_1\) is the start task denoted \(t_{start}\).

  • \(t_n\) is the end task denoted \(t_{end}\).

The DAG structure refers to parent–child relationships. We say that task \(Child_i\) is a successor of task \(Parent_i\) if there is an edge from \(Parent_i\) to \(Child_i\) in the DAG. Under the task precedence constraint, \(Child_i\) can start its execution only once its predecessor \(Parent_i\) has finished its execution and sent a message to \(Child_i\).

In the cloud environment, tasks must be mapped to a set of resources, provisioned as a set of virtual machines (VMs), in order to be executed. This set is denoted \({VM} =\lbrace vm_1 \ldots vm_m\rbrace\). Each VM has its own capacity (CPU, memory, BW).

A DAG is usually represented by a binary square matrix \(M[n \times n]\):

$$M[i,j]= \left\{ \begin{array}{ll} 1 &\quad \hbox {when }t_i \in Parent_j\\ 0 &\quad \hbox {otherwise} \end{array} \right.$$

Figure 4 shows the matrix representation of the workflow sample shown schematically in Fig. 3.

Fig. 4 \(8 \times 8\) matrix of a workflow example
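For illustration, the following minimal Java sketch builds such a binary parent matrix for a small hypothetical four-task DAG (t1 → t2, t1 → t3, t2 → t4, t3 → t4); the tasks, edges and class name are our own and do not reproduce the workflow of Fig. 3.

// Hypothetical 4-task DAG: t1 -> t2, t1 -> t3, t2 -> t4, t3 -> t4 (0-based indices below)
public class DagMatrix {
    public static void main(String[] args) {
        int n = 4;
        int[][] m = new int[n][n];                        // m[i][j] = 1 when t_i is a parent of t_j
        int[][] edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}};
        for (int[] e : edges) m[e[0]][e[1]] = 1;
        for (int[] row : m)                               // print the matrix row by row
            System.out.println(java.util.Arrays.toString(row));
    }
}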

3.2 Problem statement

Each task mapped to a particular VM has a computational cost, computed over a time interval that serves as the billing unit, since we assume in this paper that resources are billed per unit of time of use. This time interval is called the “quantum.” Logically, faster VMs are more costly, so we must find a trade-off between computation time and cost. The notation used in our problem modeling is described in Table 2.

Table 2 Parameter summary

The computational time \(CT^{k}_{i}\) of \(t_i\) executed in \(vm_k\) is defined in Eq. (1) as:

$$\begin{aligned} CT^{k}_{i} = \dfrac{Size_i}{CompC_k} \quad i = 1, \ldots , n \quad k = 1, \ldots , m \end{aligned}$$
(1)

The communication between two tasks \(t_i\) and \(t_j\) needs a transfer time from parent \(t_i\) to child \(t_j\). Transfer time denoted \(TrT_{ij}\) is calculated using the formula of Eq. (2):

$$\begin{aligned} {TrT_{ij}} = \dfrac{Data_{ij}}{BW} \quad i,j = 1, \ldots , n \end{aligned}$$
(2)

It should be noted that \(TrT_{ij}\) is zero if \(t_i\) and \(t_j\) run on the same VM; internal data transfer is thus free of charge, which is the case in most cloud data centers.

The makespan expresses the total time spent completing the user's job. The partial makespan of \(vm_k\) is defined in Eq. (3) as:

$$\begin{aligned} Makespan_k = Max \lbrace CT^{k}_{i} \rbrace \quad i = 1, \ldots , n \quad k = 1, \ldots , m \end{aligned}$$
(3)

The objective functions of the proposed model are defined in Eqs. (4) and (5) as:

$$\begin{aligned}&\textit{Min }Cost = \sum _{k=1}^{m}{\frac{FCT_{k}-SCT_{k}}{Quantum}\times UCC_{k}} \end{aligned}$$
(4)
$$\begin{aligned}&\textit{Min }Makespan = Max \lbrace Makespan_{k} \rbrace \quad k = 1, \ldots , m \end{aligned}$$
(5)

subject to:

$$\begin{aligned}&Makespan \le Deadline \end{aligned}$$
(6)
$$\begin{aligned}&Cost \le Budget \end{aligned}$$
(7)
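To make the model concrete, here is a minimal Java sketch that evaluates Eqs. (1), (2), (4) and (5) and checks the constraints of Eqs. (6) and (7). All numerical values are illustrative assumptions, and reading each VM's finish time as its partial makespan is our interpretation of Eq. (3).

public class CostModel {
    // Eq. (1): computation time of task i on VM k
    static double compTime(double sizeI, double compCk) { return sizeI / compCk; }

    // Eq. (2): transfer time between two tasks, zero when both run on the same VM
    static double transferTime(double dataIJ, double bw, boolean sameVm) {
        return sameVm ? 0.0 : dataIJ / bw;
    }

    // Eq. (4): cost of VM k, billed per quantum between its start (SCT) and finish (FCT) times
    static double vmCost(double fct, double sct, double quantum, double ucc) {
        return (fct - sct) / quantum * ucc;
    }

    public static void main(String[] args) {
        double deadline = 100.0, budget = 50.0;           // assumed QoS constraints
        double[] fct = {60.0, 80.0}, sct = {0.0, 10.0};   // assumed per-VM start/finish times
        double[] ucc = {0.5, 0.9};                        // assumed unit computation costs
        double quantum = 10.0;

        System.out.printf("CT=%.2f TrT=%.2f%n",
                compTime(200.0, 10.0), transferTime(50.0, 25.0, false)); // Eqs. (1)-(2)

        double makespan = 0.0, cost = 0.0;
        for (int k = 0; k < fct.length; k++) {
            makespan = Math.max(makespan, fct[k]);           // Eq. (5)
            cost += vmCost(fct[k], sct[k], quantum, ucc[k]); // Eq. (4)
        }
        System.out.printf("makespan=%.1f cost=%.1f feasible=%b%n",
                makespan, cost, makespan <= deadline && cost <= budget); // Eqs. (6)-(7)
    }
}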

4 Proposed approach

In this paper, we propose a hybrid approach in which the initial population of the GA is seeded with the solution produced by HEFT. We look for a solution achieving the best time/cost trade-off while meeting the deadline and budget constraints.

4.1 Implementation of the approach

  1. Encoding

    In the literature, different types of encoding representations have been proposed [4, 13, 27]. In this paper, we use a direct representation, which is the best adapted to our information encoding. We encode a chromosome as an n-sized vector; each cell i of the vector contains the identifier of the VM used by \(t_i\). The encoding is represented as shown in Fig. 5. In the proposed example, two major pieces of information can be identified: the indexes of the vector depict the tasks to be scheduled, and the number in each cell identifies the VM instance to which the task is allocated.
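A minimal Java sketch of this direct encoding; the task count and VM assignments below are arbitrary illustration values.

public class Chromosome {
    final int[] genes;                        // genes[i] = identifier of the VM assigned to t_i

    Chromosome(int[] genes) { this.genes = genes.clone(); }

    public static void main(String[] args) {
        // Hypothetical mapping of 8 tasks onto 3 VMs (vm0..vm2)
        Chromosome c = new Chromosome(new int[]{0, 2, 1, 1, 0, 2, 2, 0});
        for (int i = 0; i < c.genes.length; i++)
            System.out.println("t" + i + " -> vm" + c.genes[i]);
    }
}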

  2. Initialization

    Algorithm 1 presents the different steps taken to generate the initial population of our GA-based solution, into which the schedule produced by HEFT is injected.

Algorithm 1 Generation of the initial population

Fig. 5 Chromosome encoding
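Algorithm 1 itself is not reproduced here; the sketch below only illustrates the seeding idea it describes, assuming the HEFT mapping is already available (the heftMapping argument is a hypothetical input, not an implementation of HEFT).

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class InitialPopulation {
    // Build a population of random task-to-VM mappings and inject the HEFT solution.
    static List<int[]> init(int popSize, int nTasks, int nVms, int[] heftMapping, Random rnd) {
        List<int[]> pop = new ArrayList<>();
        pop.add(heftMapping.clone());                     // seed with the HEFT schedule
        while (pop.size() < popSize) {
            int[] chrom = new int[nTasks];
            for (int i = 0; i < nTasks; i++)
                chrom[i] = rnd.nextInt(nVms);             // random VM for each task
            pop.add(chrom);
        }
        return pop;
    }

    public static void main(String[] args) {
        int[] heft = {0, 1, 0, 2};                        // hypothetical HEFT mapping for 4 tasks
        System.out.println("population size: " + init(6, 4, 3, heft, new Random(42)).size());
    }
}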

  3. Fitness function

    To solve our problem using the GA, we have two goals to achieve: minimizing the computational time and the computational cost of executing the workflow. For this, we must clearly define a fitness function that meets this dual objective and generates intuitive results: the reader should be able to understand how the fitness score is calculated, and the best solutions must obtain the best scores while the worst solutions obtain the worst scores. First, we normalize the factors that make up our fitness function so that it can be defined as in Eq. (8). Our function is composed of two factors defined as follows:

    $$\begin{aligned} f1(s) = \frac{Makespan(s)}{Deadline} \quad f2(s) = \frac{Cost(s)}{Budget} \end{aligned}$$

    f1 normalizes the makespan by the deadline to ensure that the makespan does not exceed the deadline; f2 normalizes the cost by the budget to ensure that the cost does not exceed the allocated budget. Both ratios should remain at most 1 for a feasible solution. The fitness function is defined as below (a code sketch follows the parameter definitions):

    $$\begin{aligned} &{ f(s) = w \times f1(s) + (1-w) \times f2(s)} \nonumber \\&\qquad \text {subject to}\quad w \in [0,1] \end{aligned}$$
    (8)

    where

    • \(s \in Population\)

    • w: the weight of time in the fitness function.

    • \(1-w\): the weight of cost in the fitness function.
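A minimal Java sketch of Eq. (8); the sample makespan, cost, deadline, budget and weight values are assumptions for illustration, and lower scores are better under this normalization.

public class Fitness {
    // Eq. (8): weighted sum of normalized makespan and cost
    static double fitness(double makespan, double cost,
                          double deadline, double budget, double w) {
        double f1 = makespan / deadline;                  // <= 1 when the deadline is met
        double f2 = cost / budget;                        // <= 1 when the budget is met
        return w * f1 + (1 - w) * f2;
    }

    public static void main(String[] args) {
        System.out.println(fitness(90.0, 40.0, 100.0, 50.0, 0.5)); // assumed values; about 0.85
    }
}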

  4. Selection operation

    The tournament selection strategy [16, 19], the most popular selection technique for GAs, is used to select chromosomes. Algorithm 2 presents the different steps of this strategy. Figure 6 shows a tournament selection example applied to a sample of chromosomes.

Algorithm 2 Tournament selection

Fig. 6 Tournament selection for the GA
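Algorithm 2 is not reproduced here; the sketch below shows a standard tournament selection compatible with the fitness of Eq. (8). The tournament size and the convention that lower scores win are our assumptions.

import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class Tournament {
    // Pick k random candidates and return the one with the lowest fitness score.
    static int[] select(List<int[]> pop, double[] scores, int k, Random rnd) {
        int best = rnd.nextInt(pop.size());
        for (int i = 1; i < k; i++) {
            int challenger = rnd.nextInt(pop.size());
            if (scores[challenger] < scores[best]) best = challenger;
        }
        return pop.get(best);
    }

    public static void main(String[] args) {
        List<int[]> pop = Arrays.asList(new int[]{0, 1}, new int[]{1, 0}, new int[]{1, 1});
        double[] scores = {0.9, 0.4, 0.7};                // assumed fitness values
        System.out.println(Arrays.toString(select(pop, scores, 2, new Random(1))));
    }
}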

  5. Crossover

    Using the tournament selection method, two chromosomes are selected for a two-point crossover [4, 13] operation, in which alternating segments are swapped to obtain new offspring. In other words, the two selected chromosomes give rise to two offspring after crossing. Figure 7 shows a two-point crossover example.

Fig. 7 An example of the two-point crossover
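A minimal Java sketch of the two-point crossover on this integer encoding; the crossover points are drawn at random, and the parent values are illustrative only.

import java.util.Arrays;
import java.util.Random;

public class Crossover {
    // Two-point crossover: swap the segment [lo, hi) between the two parents.
    static int[][] twoPoint(int[] a, int[] b, Random rnd) {
        int p1 = rnd.nextInt(a.length), p2 = rnd.nextInt(a.length);
        int lo = Math.min(p1, p2), hi = Math.max(p1, p2);
        int[] c1 = a.clone(), c2 = b.clone();
        for (int i = lo; i < hi; i++) { c1[i] = b[i]; c2[i] = a[i]; }
        return new int[][]{c1, c2};
    }

    public static void main(String[] args) {
        int[][] kids = twoPoint(new int[]{0, 0, 0, 0, 0, 0},
                                new int[]{1, 1, 1, 1, 1, 1}, new Random(7));
        System.out.println(Arrays.toString(kids[0]) + " " + Arrays.toString(kids[1]));
    }
}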

  6. Mutation

    We use an integer representation. A randomly chosen task from the set of workflow tasks is assigned to a randomly chosen VM [4], as shown in Fig. 8.

Fig. 8 The mutation operator
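A minimal Java sketch of this mutation operator; the chromosome values and VM count are assumptions for illustration.

import java.util.Arrays;
import java.util.Random;

public class Mutation {
    // Reassign one randomly chosen task to a randomly chosen VM.
    static void mutate(int[] chromosome, int nVms, Random rnd) {
        chromosome[rnd.nextInt(chromosome.length)] = rnd.nextInt(nVms);
    }

    public static void main(String[] args) {
        int[] c = {0, 2, 1, 1, 0};
        mutate(c, 3, new Random());
        System.out.println(Arrays.toString(c));
    }
}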

5 Simulation environment

5.1 Simulation setup

We use synthetic workflow data obtained from the Pegasus workflow repository [31]. To study the effectiveness of the proposed SWf scheduling algorithm, we applied the simulation parameters summarized in Table 3.

Table 3 Workflow scheduling simulation parameters

5.2 Framework simulation environment

To evaluate our proposed solution, we chose the WorkflowSim framework, based on CloudSim [18, 29]. Figure 9 shows the different steps, from the introduction of the DAX file generated from the Pegasus workflow repository to the matching of workflow tasks to VMs before they are run.

Fig. 9 WorkflowSim framework components

A workflow planner generates a list of tasks, introduced first in raw form as an XML file. The workflow parser module then prepares this task list. If necessary, tasks can be grouped into a set of jobs by the clustering engine. The workflow engine then orders these tasks while taking the dependency constraints into account. At this level, the workflow scheduler matches the ordered tasks to available VMs before they are run by the processors.

6 Implementation and results

Following an in-depth study of the related work, we identified several application areas related to the scheduling problem. We examine five families of SWf applications, Montage, CyberShake, Epigenomics, LIGO and SIPHT, which are abstractions of dataflows used in real applications. These SWf applications were chosen because they represent a wide range of application domains and a variety of resource requirements. On this basis, Table 4 compares these SWf applications in terms of system intensiveness, while their DAGs are represented in Fig. 10.

Table 4 Comparison between the existing SWf applications
Fig. 10 Real-world scientific workflow DAGs

We conducted a series of experiments using existing heuristic algorithms such as:

  • FCFS [19]

  • MinMin [7]

  • MaxMin [20, 27]

  • RoundRobin [30]

  • HEFT

We managed all jobs with the Pegasus WMS, which transforms high-level descriptions of workflows into specific sequences of operations and identifies the computing resources required for execution. The results are as follows:

  • The results of a series of experiments applied to the Montage workflow to calculate the makespan are described in Table 5.

Table 5 Experimental results for Montage datasets of 100 and 1000 tasks

  • The results of a series of experiments applied to the CyberShake workflow, with 100 and 1000 nodes, respectively, to calculate the makespan are described in Table 6.

Table 6 Experimental results for CyberShake datasets of 100 and 1000 tasks

  • The results of a series of experiments applied to the Epigenomics workflow to calculate the makespan are described in Table 7.

Table 7 Experimental results for Epigenomics datasets of 100 and 997 tasks

  • The results of a series of experiments applied to the LIGO workflow to calculate the makespan are shown in Table 8.

Table 8 Experimental results for LIGO datasets of 100 and 1000 tasks

  • The results of a series of experiments applied to the SIPHT workflow to calculate the makespan are shown in Table 9.

Table 9 Experimental results for the SIPHT datasets of 100 and 1000 tasks

Figures 11 and 12 show the experimental results of the makespan, while Figs. 13 and 14 show the experimental results of the cost.

Fig. 11 Simulation results plot of the makespan for 100 tasks and 20 VMs

Fig. 12 Simulation results plot of the makespan for 1000 tasks and 20 VMs

Fig. 13 Simulation results plot of the cost for 100 tasks and 20 VMs

Fig. 14 Simulation results plot of the cost for 1000 tasks and 20 VMs

The proposed HEFTGA approach is based on a population of 20 chromosomes evolved over 100 generations, a number sufficient to achieve a good convergence rate.

The proposed algorithm was developed in NetBeans 8.1 using the Java programming language.

All experiments were performed on a computer with a 2.4 GHz Intel Core i5 CPU and 8 GB of 1333 MHz RAM.

Experimental results presented in Figs. 11, 12, 13 and 14 show that HEFTGA outperforms the other state-of-the-art WS strategies in most cases.

  • Regarding Montage, our solution gives better results in the different cases. For a workflow consisting of 100 tasks (respectively, 1000), HEFTGA completes the execution after 103.12 s (respectively, 912.02 s) for a cost of 3449.21 (respectively, 36,260.11), whereas HEFT executes the same workflows at a higher cost and in more time.

  • For CyberShake, the simulation results are less striking than those obtained with Montage, but they are still better than those of HEFT. The workflow composed of 100 tasks is performed at a lower cost than with the other heuristics.

  • Unlike CyberShake, Epigenomics, like Montage, gives very good results for completing the execution of the different tasks. The trade-off between time and cost is realized successfully.

  • Regarding LIGO, the two workflows composed of 100 and 1000 tasks are completed at the cheapest costs compared with the results obtained with the other heuristics. A very good compromise between makespan and cost is achieved for the second workflow, consisting of 1000 tasks, while the time is moderately acceptable for 100 tasks.

  • The results obtained with SIPHT are very interesting for the makespan of the workflow composed of 100 tasks and for the completion cost of 1000 tasks, while the makespan for 1000 tasks and the cost for 100 tasks are merely acceptable.

To conclude, we can state that HEFTGA is well suited to the Montage and Epigenomics applications, since a successful compromise between time and cost is reached, whereas for the CyberShake workflow the gain is not very large. In addition, HEFTGA provides better results than HEFT in all cases and for all workflow applications. This is due to the integration of the solution generated by HEFT into the initial population of our HEFTGA approach, which yields a hybrid solution that guarantees the HEFT results in the worst case and much more efficient results otherwise.

7 Conclusion

In this paper, a hybrid evolutionary approach for deadline- and budget-constrained real-world scientific workflow scheduling in the cloud is proposed. The proposed algorithm integrates the HEFT solution into the initial population used by our approach to achieve an optimal execution time and a minimal execution cost. We have expressed the different objectives as a multiple-QoS function including time and cost. Experiments on real-world SWfs such as Montage, CyberShake, Epigenomics, LIGO and SIPHT show that our HEFTGA approach outperforms several state-of-the-art algorithms, such as FCFS, MinMin, MaxMin and RoundRobin, previously used to solve the WS problem.

In future work, we plan to deal with the problem of data center power consumption when scheduling workflows in cloud environments. Another avenue is to simulate SWfs in globally distributed heterogeneous cloud environments, hence the importance of taking into account the time and cost of data transfer between different data centers.