1 Introduction

Cloud computing is a versatile, internet-based computing environment that provides on-demand services and working platforms to fulfill the computational needs of different users [1]. Infrastructure as a Service (IaaS) is one of the most widely recognized cloud service models; it provides proficient and adaptable computational resources to users. These models deliver cloud infrastructure as virtual machines (VMs) [2]. Clients can access a practically infinite amount of resources, at low ownership cost, for application execution. Large-scale workloads can be run on the VMs hosted by the cloud infrastructure [3]. Users may consume services at the rates fixed by their Cloud Service Provider (CSP) for specific requirements, and these services can be delivered at any time [4].

A workflow logically describes a data-intensive application hosted on the cloud infrastructure. The tasks and the data dependencies among them are captured in the workflow model [5]. Normally, the workflow arranges the procedure as a Directed Acyclic Graph (DAG), wherein every node denotes a constituent task and the edges denote inter-task dependencies [6]. Workflows are composed of numerous interdependent, data-intensive computational tasks, and they require high-performance computing resources for efficient execution. Hence, scheduling and mapping these tasks to suitable resources is significant in managing workflow executions [7]. The greater part of the scheduling algorithms used in the cloud model disregards the security and resource-performance issues of workflow execution in the cloud [8].

A cloud security system comprises a set of policies, controls, methods, and technologies that cooperate to secure information in cloud-based frameworks and infrastructure [9]. Typically, cloud security is assumed to be a shared responsibility between the solution provider and the owner. Cloud security methods are intended to support compliance, protect data, and safeguard client privacy. Leakage of data and alteration of sensitive information by malicious attacks are considered the two major threats in the cloud computing environment [10]. Due to the shared infrastructure, security is considered one of the main complexities in scheduling workflows in the cloud [11]. Recently, more attention has been paid to the safety of the cloud. In a data center, the VMs running on physical machines (PMs) are considered essential elements of the cloud. Nowadays, diverse security components are available in VMs to enforce these controls. Implementing security procedures in cloud computing reduces security risks and threats [12].

In cloud computing, an objective function can be minimized or maximized to fulfill clients' QoS requirements by allocating tasks to appropriate resources. QoS-based workflow scheduling currently considers time, cost, security, load, success rate, etc. [13]. However, much of the research in the field is confined to single- or bi-objective optimization of QoS parameters. This substantiates the study of multi-objective optimization, as more QoS objectives are to be optimized. The major QoS objectives in typical optimizations are time and cost. Load balancing is a serious issue that leads to performance degradation in the system [14]; it is recognized as a parameter for multi-objective optimization in this work. User requirements also play a significant role in deciding the QoS objectives [15]. Distinct optimization algorithms such as PSO, simulated annealing (SA), and the genetic algorithm (GA) are used in workflow applications for the effective optimization of QoS objectives. These algorithms yield enhanced results; however, issues such as the overhead of time-consuming procedures may arise [16]. Much research has used only heuristic methods for workflow scheduling. The scope of meta-heuristic methods for workflow optimization is still not completely explored, and extensive research is continuing in the field. Each meta-heuristic algorithm has its own advantages and disadvantages; therefore, hybrid methods are emerging as a solution. The vast majority of workflow scheduling studies concentrate on cost and makespan. Security is one of the QoS constraints that has not been properly addressed in past works [17]. In these contexts, the main contributions of our proposed methodology are as follows:

  • A meta-heuristic solution to execute workflows in cloud computing systems with multi-objective optimization.

  • A scheduling framework based on a hybrid ALO-PSO procedure, developed to perform efficient workflow scheduling in the cloud computing model with data security.

  • A low-complexity scheduling algorithm based on a hybrid meta-heuristic technique, with strong results in terms of load, cost, and makespan.

Towards this, the efficient particle swarm algorithm is hybridized with the recent ALO, and the outcomes are compared with state-of-the-art algorithms to distinguish the impact of the proposed strategy. The security algorithm in this work shields the data from alteration and leakage. The remainder of the paper is organized as follows: Sect. 2 describes related work. The problem formulation and framework models are presented in Sect. 3. Section 4 contains the simulation strategies and anticipated outcomes. Section 5 deals with test outcomes and evaluation. Finally, Sect. 6 provides the conclusion and recommendations for future work.

2 Literature survey

Some of the previous research related to the present study is discussed in the sections below.

Due to its wide business and scientific applications, cloud-based workflow scheduling has gained major attention. Researchers have used various heuristic and meta-heuristic strategies for scheduling, considering issues such as energy conservation, cost, and makespan. Choudhary et al. [18] presented an algorithm that deals with the reduction of cost and makespan. To schedule workflow applications, a hybrid of the heterogeneous earliest finish time (HEFT) algorithm and the gravitational search algorithm (GSA) was used. A further consideration, the cost-time equivalence, made the objective enhancement more sensible. Financial cost ratio and schedule lengths were taken as the performance measurements for comparison with existing strategies. The results showed that the strategy performs better.

Manasrah and Hanan [19] developed a hybrid GA-PSO procedure for workflow scheduling in the cloud framework. This hybrid method effectively allocates tasks to resources. Moreover, it considered parameters such as cost, makespan, and load balancing rate as the objective function for task allocation in the cloud environment. The hybrid algorithm combines the properties of both GA and PSO, and the proposed methodology was evaluated with workflow applications of different sizes. Multi-objective task scheduling based on a hybrid EDA-GA algorithm, where EDA stands for Estimation of Distribution Algorithm, was developed by Pang et al. [20]. Initially, the sampling method and probability model of EDA were utilized to create a specified number of feasible solutions. Then, the search range of solutions was generated to initiate the crossover and mutation operations. Based on these hybrid algorithm functions, the final task allocation to VMs was achieved.

Workflow scheduling in the cloud is very challenging due to the versatility and heterogeneity of cloud resources. Optimization of execution cost and execution time are two prominent basic issues for scheduling resources in the cloud. Chen et al. [21] modeled cloud workflow scheduling as a multi-objective (MO) optimization problem with the objectives of minimizing execution time and execution cost. The proposed method was an ant colony system (ACS) using multiple populations for multiple objectives; the scheduling was based on two colonies, one for each of the time and cost objectives.

Apart from the customary objectives of budget, deadline, and makespan, Rehman et al. [22] also considered energy consumption in their proposed multi-objective GA (MOGA). To provide energy-efficient solutions during optimization, the dynamic voltage frequency scaling method was utilized in the study, along with a gap-search algorithm that finds the gaps between continuous engagement periods of VMs for optimal use of cloud resources. The results of MOGA were compared with multi-objective PSO (MOPSO) under the same objective constraints. MOGA gave better results when contrasted with other GA-based algorithms that considered objectives such as makespan, cost, and deadline individually.

Nowadays, meta-heuristic algorithms have emerged as a prime choice for workflow scheduling in the cloud because of its NP-hard nature. Shishido et al. [23] examined the efficiency of meta-heuristic methods for scheduling workflows in the cloud. The study assessed the impacts of both PSO and GA on workflow scheduling optimization. A cost-aware workflow scheduling problem was adopted to measure the competency of the meta-heuristic methodology, and experiments were conducted with PSO, GA, and multi-population GA meta-heuristics, evaluated on the objectives of minimizing cost and response time. These algorithms returned better schedules that decrease the expense within a sensible timeframe.

Angela et al. [24] suggested a secured cost-prediction-based scheduling (SCPS) method, which emphasizes security while minimizing makespan in workflow scheduling. Even though cloud services can be successfully utilized to execute big, data- and computation-intensive scientific workflow requests, security remains a major concern. A cost prediction matrix (CPM) for expense estimation and a fuzzy-based model for choosing appropriate VMs based on security concerns are the basic elements of the presented methodology.

For scientific workflow scheduling, Sharma and Rashid [25] developed a hybrid PSO algorithm in the cloud computing model. This research hybridized the PSO algorithm with the Predict Earliest Finish Time (PEFT) technique to improve the scheduling process; the initial population was generated by the PSO algorithm. The performance of the proposed methodology was estimated in terms of cost and makespan. In cloud computing, the Deadline-aware and Cost-effective Hybrid Genetic Task Scheduling (DCHG-TS) method was developed by Iranmanesh and Naji [26] for scientific workflow scheduling. In this GA, new and modified genetic operators were used to enhance the load-balancing routine, and a load-balancing method is applied at execution time to exploit the resources.

For the multi-objective task scheduling problem in the cloud environment, Abualigah and Diabat [27] developed a new hybrid ant lion optimization (ALO) procedure named MALO, in which an elite-based differential evolution method was hybridized with ALO to resolve the multi-objective task scheduling difficulty. The multi-objective problem was derived to maximize resource utilization as well as to minimize makespan; moreover, the exploitation ability of the ALO algorithm was improved by the elite-based differential evolution technique. A moth-flame optimization algorithm (MFO) [28] was developed to perform efficient task scheduling (TS) for cyber-physical system (CPS) applications in fog computing. This algorithm considered the minimization of transfer time and task execution time as the fitness function of the optimization algorithm.

Garg et al. [29] developed reliable and energy-efficient workflow scheduling in the cloud model. The energy consumption of the whole system was reduced by an efficient dynamic voltage and frequency scaling (DVFS) method. However, during application execution, a negative effect of DVFS is an increase in transient faults, which lowers system reliability; the paper developed a new scheduling algorithm to overcome this problem. The scheduling algorithm has four stages, namely priority estimation, task clustering, target time distribution, and assignment of clusters to processing components with proper frequency or voltage levels. An enhanced GA was developed by Keshanchi et al. [30] to perform task scheduling in the cloud environment. The accuracy of the proposed algorithm was checked by a behavioral modeling method; the expected stipulations of the proposed method were extracted in the form of linear temporal logic (LTL) formulas, and the Labeled Transition System (LTS) method was also utilized to confirm the performance of the suggested technique. A comparative analysis of existing methods is listed in Table 1.

Table 1 Comparative analysis of existing methods

Workflow applications have a variety of different tasks and a composite structure. Every single task is processed by entering information and accessing software, processing, or storage functions. It is difficult to achieve a good trade-off solution between cost and execution time; these are the reasons workflow scheduling is an NP-hard problem. For the selection of proper resources, it is essential to develop efficient algorithms for workflow execution. Past investigations have put a lot of effort into scheduling workflows properly. From the literature review, it is evident that almost all experimentation was done with single-objective or bi-objective scheduling only. No single investigation has incorporated all the parameters into one algorithm, as this study proposes.

Moreover, they have some limitations, such as time, maximum cost, QoS constraints, and computational complexity. Security practices in workflow scheduling are mentioned very rarely in past research, whereas the proposed methodology utilizes the DES algorithm for security in workflow applications. Moreover, the present investigation shows the impact of a hybrid meta-heuristic algorithm on the multi-objective optimization of workflows. The optimization of cost, makespan, and load are the objectives, and the approach additionally provides security to the data during scheduling. A few algorithms are weak in local search, and others are weak in global search; hence, the hybridization of diverse meta-heuristic methods seems to provide better results for workflow scheduling problems. In the hybridization, the first optimization procedure is applied to elect the ideal weight parameter. In that algorithm, the presence of a random parameter may increase the number of iterations; to reduce this, we hybridize it with another optimization algorithm that shows less computation time. Implementation of a hybrid system will remedy the deficiencies of previous approaches. ALO is a recent meta-heuristic method that exhibits a high exploration and convergence rate [36]. ALO and PSO are claimed to be dominant over GA; hence, the hybridization of ALO and PSO is expected to provide better optimization results.

3 System model

This section describes the cloud resource and workflow model utilized in this experimentation.

3.1 Model of workflow

A DAG is commonly used to represent the structure of a workflow. It is denoted as \(W = (T,D)\), in which \(T = \{ T_{0} ,T_{1} ,T_{2} , \ldots ,T_{n} \}\) signifies the collection of n tasks in a workflow, and \(D = \{ (T_{i} ,T_{j} )|T_{i} ,T_{j} \in T\}\) denotes the set of data-flow dependencies among the tasks. Each pair \((T_{i} ,T_{j} )\) expresses a precedence constraint between Ti and Tj. Tasks may have various predecessors and successors; the immediate predecessor and successor task sets of Ti are specified by Pred(Ti) and Succ(Ti), respectively.

$$ \Pr ed\left( {T_{j} } \right) = \left\{ {T_{i} |\left( {T_{i} ,T_{j} } \right) \in D} \right\} $$
(1)
$$ Succ\left( {T_{i} } \right) = \left\{ {T_{j} |\left( {T_{i} ,T_{j} } \right) \in D} \right\} $$
(2)

The task with no predecessor is termed the entry task, Tentry, which is indicated by,

$$ Pred(T_{entry} ) = \varphi $$
(3)

while a task with no successors is named the exit task, Texit, and shown by,

$$ Succ(T_{exit} ) = \varphi $$
(4)

Generally, workflow scheduling requires a DAG with a single \(T_{exit}\) and \(T_{entry}\). This can be accomplished effectively by adding pseudo \(T_{entry}\) and \(T_{exit}\) tasks of zero weight to the DAG. The same assumption is followed for every workflow, so each has only one \(T_{entry}\) and one \(T_{exit}\).
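The DAG model above, Eqs. (1)-(4), and the pseudo entry/exit construction can be sketched as below; the task names and dependency pairs are made-up illustrative values, not taken from a real workflow trace.

```python
# Workflow DAG model W = (T, D) with predecessor/successor sets.

def pred(task, D):
    """Pred(Tj) = { Ti | (Ti, Tj) in D }  -- Eq. (1)."""
    return {ti for (ti, tj) in D if tj == task}

def succ(task, D):
    """Succ(Ti) = { Tj | (Ti, Tj) in D }  -- Eq. (2)."""
    return {tj for (ti, tj) in D if ti == task}

def add_pseudo_entry_exit(T, D):
    """Add zero-weight pseudo Tentry/Texit so the DAG has a single entry and exit."""
    entries = [t for t in T if not pred(t, D)]   # Pred = empty set, Eq. (3)
    exits = [t for t in T if not succ(t, D)]     # Succ = empty set, Eq. (4)
    D2 = set(D) | {("Tentry", t) for t in entries} | {(t, "Texit") for t in exits}
    return ["Tentry"] + list(T) + ["Texit"], D2

# Toy workflow: T0 and T1 feed T2, which feeds T3.
T = ["T0", "T1", "T2", "T3"]
D = {("T0", "T2"), ("T1", "T2"), ("T2", "T3")}
T2, D2 = add_pseudo_entry_exit(T, D)
```

After the construction, the pseudo entry has no predecessors and the pseudo exit has no successors, matching Eqs. (3) and (4).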

3.2 Cloud resource model

The computational resources in an IaaS platform are offered as VMs. Instances are the running VMs, and different types of instances are provided by IaaS. These instance types have varying combinations of computing capacity, memory, and bandwidth. For the different instance types, the CPU capacity decides the execution time of a task, and the bandwidth influences the data communication time between tasks. A user can employ an unbounded number of instances, indicated by the infinite set \(Is = \{ Is_{0} ,Is_{1} , \ldots Is_{a} \}\). The instance types present in an IaaS platform are indicated by the set of types \(Ps = \{ Ps_{0} ,Ps_{1} , \ldots Ps_{n} \}\), where n is the number of types. An instance can run only one task at a time.

The compute unit is utilized to express the CPU capability of the different instance types. Let \(Compute(Ps_{i} )\) be the compute unit of instance type \(Ps_{i}\). The actual running time of a task is determined as,

$$ AR_{time} (T_{i} ) = \frac{{refer\,time(T_{i} )}}{{Compute(Ps_{j} )}} $$
(5)

where \(AR_{time} (T_{i} )\) is the actual running time of task Ti, and \(refer\,time(T_{i} )\) is the reference execution time of Ti, defined as the time taken to execute the task on an instance whose compute unit equals one [31]. The total bandwidth (BW) utilization is defined as the percentage of utilized bandwidth out of the total available bandwidth. The expression below is applied to calculate the bandwidth utilization of each VM.

$$ BW_{utilization} = \frac{{(actual\,used\,BW\,in\,VM_{i} )}}{{(Total\,BW\,capacity\,of\,VM_{i} )}} \times 100 $$
(6)

Memory is assumed to be one of the main resources in cloud computing; it is mainly utilized to satisfy service requests in the cloud. The memory capacity is calculated by the expression below.

$$ Memory_{capacity} = \frac{{utilized\,memory\,of\,VM_{i} }}{{Total\,memory\,of\,VM_{i} }} $$
(7)
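The instance metrics of Eqs. (5)-(7) can be sketched as simple helper functions; the compute-unit, bandwidth, and memory figures used in the example call are made-up values.

```python
# Instance/VM resource metrics from Eqs. (5)-(7).

def actual_running_time(refer_time, compute_units):
    """AR_time(Ti) = refer_time(Ti) / Compute(Psj)  -- Eq. (5)."""
    return refer_time / compute_units

def bw_utilization(used_bw, total_bw):
    """Percentage of bandwidth used on VMi  -- Eq. (6)."""
    return used_bw / total_bw * 100.0

def memory_capacity(used_mem, total_mem):
    """Fraction of VMi memory in use  -- Eq. (7)."""
    return used_mem / total_mem

# Example: a task with reference time 10 s on a 2-compute-unit instance.
ar = actual_running_time(10.0, 2.0)
```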

3.3 Energy and system reliability model

Energy consumption in a cloud data center arises mainly from the usage of CPUs, memory, network interfaces, and disk storage, with the CPU consuming more power than any other resource. The energy consumption of a single task ti is expressed below.

$$ E_{i} (f_{r,op} ) = (P_{{ind_{i} }} + C_{eff} f_{r,op}^{3} )\frac{{et_{i} }}{{f_{r,op} }} = P_{{ind_{i} }} \frac{{et_{i} }}{{f_{r,op} }} + C_{eff} f_{r,op}^{2} et_{i} $$
(8)

Here, the execution time of task \(t_{i}\) at the maximum operating frequency is denoted \(et_{i}\), \(P_{{ind_{i} }}\) denotes the frequency-independent power consumption, the operating frequency level of the processor is denoted \(f_{r,op}\), and the effective load capacitance is denoted \(C_{eff}\). The overall energy consumption is then calculated by the expression below [29].

$$ E_{total} = \sum\limits_{i = 1}^{n} {E_{i} (f_{r,op} )} $$
(9)

For computation-intensive applications, reliability is considered an important factor. The reliability of the system is defined as the probability of executing a task without any failure. This paper considers a Poisson-distributed fault rate for the reliability calculation. The reliability of task \(t_{i}\) with execution time \(et_{i}\) and operating frequency \(f_{r,op}\) is given by the expression below.

$$ {\text{Re}} l_{{t_{i} }} (f_{r,op} ) = e^{{ - \lambda (f_{r,op} ).\frac{{et_{i} }}{{f_{r,op} }}}} $$
(10)

The reliability of an application involving n tasks is given as the product of the reliabilities of all its tasks, as described below [29].

$$ {\text{Re}} l_{G} = \prod\nolimits_{i = 1}^{n} {{\text{Re}} l_{{t_{i} }} (f_{r,op} )} $$
(11)
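Equations (8)-(11) can be sketched as below; the power constants, capacitance, frequencies, and the fault-rate value in the example are illustrative placeholders, not measured parameters.

```python
import math

# Energy and reliability model of Eqs. (8)-(11).

def task_energy(p_ind, c_eff, f_op, et):
    """E_i(f_op) = P_ind * et/f_op + C_eff * f_op^2 * et  -- Eq. (8)."""
    return p_ind * et / f_op + c_eff * f_op ** 2 * et

def total_energy(tasks):
    """E_total = sum over tasks of E_i(f_op)  -- Eq. (9).
    Each task is a (p_ind, c_eff, f_op, et) tuple."""
    return sum(task_energy(*t) for t in tasks)

def task_reliability(lam, f_op, et):
    """Rel_ti(f_op) = exp(-lambda(f_op) * et / f_op)  -- Eq. (10)."""
    return math.exp(-lam * et / f_op)

def app_reliability(tasks):
    """Rel_G = product of the task reliabilities  -- Eq. (11).
    Each task is a (lam, f_op, et) tuple."""
    rel = 1.0
    for lam, f_op, et in tasks:
        rel *= task_reliability(lam, f_op, et)
    return rel
```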

3.4 Problem definition of workflow scheduling

An overview of the multi-objective workflow scheduling problem is given in this section; this is the problem the proposed methodology solves. In this experimentation, metrics such as makespan, load, and cost are considered the main issues, while the security efforts aim to ensure data security during scheduling. The calculations of cost, makespan, and load are given first for the problem definition.

Let \(S_{t} (T_{i} )\) and \(F_{t} (T_{i} )\) indicate the start and finish times of \(T_{i}\). The start time of a task depends on the finish times of all its predecessors \(\Pr ed(T_{i} )\), the communication time \(Com_{t} (T_{j} ,T_{i} )\) between its predecessors and itself, and the completion time \(F_{t} (T_{j} )\) of the previous task executed on the same instance. The finish time \(F_{t} (T_{i} )\) of task \(T_{i}\) is estimated as,

$$ F_{t} (T_{i} ) = S_{t} (T_{i} ) + AR_{time} (T_{i} ) = \max \left\{ {Avail(insta(T_{i} )),\mathop {\max }\limits_{{T_{j} \in \Pr ed(T_{i} )}} (F_{t} (T_{j} ) + Com_{t} (T_{j} ,T_{i} ))} \right\} + AR_{time} (T_{i} ) $$
(12)

The available time of the instance on which task \(T_{i}\) executes is given as \(Avail(insta(T_{i} ))\); this varies during scheduling. The start time of the entry task \(T_{entry}\) is zero, so the finish time of \(T_{entry}\) is also zero. The completion time of the exit task is identified as the makespan and expressed as

$$ Makespan = F_{t} (T_{exit} ) $$
(13)
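The finish-time recursion of Eq. (12) and the makespan of Eq. (13) can be computed on a toy DAG as below; the task-to-instance mapping, runtimes, and communication times are made-up example values, and the tasks are assumed to be supplied in topological order.

```python
# Finish-time recursion of Eq. (12): a task starts when its instance is free
# and all predecessor results (plus communication) have arrived.

def schedule(order, preds, ar_time, comm, insta):
    finish, avail = {}, {}
    for t in order:
        # max over predecessors of F_t(Tj) + Com_t(Tj, Ti); 0 for entry tasks
        ready = max((finish[p] + comm.get((p, t), 0) for p in preds.get(t, [])),
                    default=0)
        start = max(avail.get(insta[t], 0), ready)   # Avail(insta(Ti)) term
        finish[t] = start + ar_time[t]               # + AR_time(Ti)
        avail[insta[t]] = finish[t]
    return finish

order = ["T0", "T1", "T2"]
preds = {"T2": ["T0", "T1"]}
ar_time = {"T0": 3, "T1": 2, "T2": 4}
comm = {("T0", "T2"): 1, ("T1", "T2"): 2}
insta = {"T0": "vm0", "T1": "vm1", "T2": "vm0"}
makespan = schedule(order, preds, ar_time, comm, insta)["T2"]   # Eq. (13)
```

Here T2 becomes ready at max(3+1, 2+2) = 4 and runs for 4 time units, so the makespan is 8.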

For evaluating cost, let \(Cst = \{ Cst_{1} ,Cst_{2} , \ldots ,Cst_{H} \}\) be the set of pricing schemes for consuming services, and let \(Monetary_{\cos t} (Cst_{h} ,Type,Run_{insta} )\) be the cost of running instance \(Run_{insta}\) with instance type Type under the corresponding pricing scheme \(Cst_{h}\). The total monetary cost for executing all tasks in the workflow is given as,

$$ {\text{Cos}} t = \sum {Monetary_{\cos t} \left( {Cst_{h} ,Type\left( {Run_{Insta} } \right),Run_{Insta} } \right)} $$
(14)

The next problem is load balancing. Load can be characterized by the length of the tasks, the capacity of cloud resources, and task dependency [41]. Proper load balancing can enhance the usage of resources and thereby form effective schedules. The sum of all loads on the VMs can be expressed as,

$$ Ld = \sum\nolimits_{i = 1}^{k} {load_{i} } $$
(15)

where k denotes the number of VMs in the data center and \(load_{i}\) the load on the ith VM. The expression below gives the load per unit capacity (Lpuc).

$$ L_{puc} = \frac{Ld}{{\sum\nolimits_{i = 1}^{m} {Cap_{i} } }} $$
(16)
$$ {\text{Threshold}}\;{\text{value}},\quad Tresh_{i} = L_{puc} *Cap_{i} $$
(17)

where the capability of a node is denoted \(Cap_{i}\). The load imbalance status of a specific VM is determined as

$$ VM_{capacity} \left\{ \begin{gathered} < \left| {Tresh_{i} - \sum\nolimits_{i = 1}^{k} {load_{i} } } \right|\quad Underloaded \hfill \\ > \left| {Tresh_{i} - \sum\nolimits_{i = 1}^{k} {load_{i} } } \right|\quad Overloaded \hfill \\ = \left| {Tresh_{i} - \sum\nolimits_{i = 1}^{k} {load_{i} } } \right|\quad Balanced \hfill \\ \end{gathered} \right. $$
(18)

The load status of a VM can be identified using the above formula: if the load of a VM is less than its threshold value, it is identified as underloaded. An underloaded VM accepts load from an overloaded VM until it becomes balanced. A flowchart of the load balancing strategy is displayed in Fig. 1.
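The threshold test of Eqs. (15)-(18) can be sketched as below; the VM loads and capacities are illustrative values, and the imbalance check is interpreted per VM (each VM's load against its own threshold \(Tresh_i\)).

```python
# Load-balance classification following Eqs. (15)-(17) and the per-VM
# interpretation of the imbalance test in Eq. (18).

def classify(loads, caps):
    ld = sum(loads)                      # total load Ld       -- Eq. (15)
    l_puc = ld / sum(caps)               # load per unit cap.  -- Eq. (16)
    status = []
    for load, cap in zip(loads, caps):
        thresh = l_puc * cap             # Tresh_i = Lpuc*Cap_i -- Eq. (17)
        if load < thresh:
            status.append("underloaded")
        elif load > thresh:
            status.append("overloaded")
        else:
            status.append("balanced")
    return status

# Three equal-capacity VMs carrying unequal loads.
result = classify([2.0, 6.0, 4.0], [4.0, 4.0, 4.0])
```

The underloaded VM would then accept work from the overloaded one until both report "balanced".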

Fig. 1
figure 1

Flowchart model for load balancing technique

A directed acyclic graph \(W = (T,D)\) represents the workflow, and the scheduling scheme is denoted by \(Schedule = (R,M,{\text{Cos}} t,Time,Load)\), where Load denotes the size of the tasks, M the resource-task mapping, Time the execution time, and R the resources. The problem is to design workflow schedules that determine the task scheduling order. The task scheduling order must comply with the dependency requirements between tasks: a task cannot be scheduled before each of its predecessors is scheduled [32, 33]. The problem can be characterized as finding a schedule with the least execution cost and execution time, and minimal task migration for load maintenance during scheduling. The security of data transferred during scheduling is a significant requirement, and security measures are extended in the proposed work.

3.5 Multi-Objective Optimization Problem (MOP)

An MOP has various conflicting objectives that should be optimized concurrently. In that view, workflow scheduling is also a typical MOP, where conflicting objectives such as makespan, cost, and load are to be optimized. A multi-objective optimization problem with many decision variables and objectives can be formally defined as,

$$ Fitness\,Function\,F(x) = \min (\omega_{1} f_{1} (x) + \omega_{2} f_{2} (x) + \omega_{3} f_{3} (x)) $$
(19)

Here, makespan, cost, and load balancing rate are the metrics taken into the optimization process; in our work, these three metrics are combined into a single objective function to be minimized. The first objective, makespan, is expressed below:

$$ f_{1} (x) = Makespan $$
(20)

The equation for makespan is given in Eq. (13). The second objective is cost minimization, expressed below.

$$ f_{2} (x) = {\text{Cos}} t $$
(21)

The cost function is given in Eq. (14). Finally, the last objective is the load balancing rate, expressed below.

$$ f_{3} (x) = Ld $$
(22)

The mathematical term for the load balancing rate is given in Eq. (15).
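The weighted-sum fitness of Eq. (19) can be sketched as below; the weights and the candidate schedules' metric values are made-up examples (in practice the three metrics would come from Eqs. (13)-(15) for a concrete schedule).

```python
# Weighted-sum scalarization of the three objectives -- Eq. (19).

def fitness(makespan, cost, load, w=(0.4, 0.3, 0.3)):
    """F(x) = w1*f1(x) + w2*f2(x) + w3*f3(x), to be minimized."""
    return w[0] * makespan + w[1] * cost + w[2] * load

# Two candidate schedules as (makespan, cost, load) triples.
candidates = [(10.0, 5.0, 2.0), (8.0, 9.0, 2.0)]
best = min(candidates, key=lambda c: fitness(*c))
```

With these example weights the first candidate scores 0.4*10 + 0.3*5 + 0.3*2 = 6.1 against 6.5 for the second, so it is selected.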

4 Proposed methodology

The issues indicated in Sect. 3 are NP-hard, and a meta-heuristic approach can only provide approximate solutions. In this paper, a novel multi-objective hybrid algorithm is proposed to improve workflow scheduling. Such algorithms can perform multi-objective optimization and address NP-hard issues. The workflow schedule is improved by utilizing the multi-objective hybrid ALO-PSO; the improvement is in terms of cost, makespan, and load. Security-based workflow scheduling is utilized to secure the schedule, which counteracts the alteration and loss of information. The datasets are created manually as DAGs, and the execution time matrix as well as the communication matrix are considered for the experimentation. The input data are fed into the hybrid algorithm, which provides the optimized results. The security procedures are implemented by utilizing the Data Encryption Standard (DES) mechanism.

The process flow of the proposed methodology is illustrated in Fig. 2. The experimental simulation is carried out in the CloudSim toolkit, and the outcomes are contrasted with existing algorithms. The proposed multi-objective optimization procedure and security measure give a successful trade-off between the objectives through improved outcomes regarding cost, makespan, and load balancing.

Fig. 2
figure 2

Scheduling model of proposed system

4.1 Security strategy

Security in the cloud is one of the fundamental concerns when handling scheduling and other cloud-related procedures. Data alteration and attempts at information capture are the sorts of attacks that may happen during scheduling. In this experimentation, the DES technique [34] is utilized to secure workflow scheduling. DES uses the same key to encrypt and decrypt, and it depends on two essential properties of cryptography: substitution and transposition.

DES comprises 16 steps, each of which is called a round. The DES algorithm secures the data, and the procedure is demonstrated in Fig. 3. The data are encrypted and moved while scheduling is carried out, which counteracts data leakage and gives better security.

Fig. 3
figure 3

DES in workflow scheduling

The security features implemented during workflow scheduling avoid data leakage and improve data confidentiality. The DES algorithm also decrypts the data while task scheduling is carried out [35]. Algorithm 1 represents the DES algorithm.
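To illustrate only the round structure described above, the following toy sketch mimics DES's 16-round Feistel scheme with the same keys used for encryption and decryption; the XOR round function and the subkey schedule are invented stand-ins (there are no S-boxes, permutations, or real key schedule here), so this is not actual DES.

```python
# Toy 16-round Feistel network on a 64-bit block, DES-like in shape only.

def feistel(block64, subkeys, decrypt=False):
    left, right = block64 >> 32, block64 & 0xFFFFFFFF
    # Decryption is the same walk with the subkeys in reverse order.
    for k in (reversed(subkeys) if decrypt else subkeys):
        # L' = R ; R' = L xor F(R, k), with a placeholder F(R, k) = R xor k.
        left, right = right, left ^ ((right ^ k) & 0xFFFFFFFF)
    return (right << 32) | left          # final half-swap, as in DES

# Made-up subkey schedule: 16 distinct 32-bit round keys.
subkeys = [(0x9E3779B9 * (i + 1)) & 0xFFFFFFFF for i in range(16)]
cipher = feistel(0x0123456789ABCDEF, subkeys)
plain = feistel(cipher, subkeys, decrypt=True)
```

Running decryption over the ciphertext with the reversed key order recovers the original block, which is the symmetric-key property the scheduling framework relies on.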

4.2 Ant-lion optimization

ALO is a novel nature-inspired method that imitates the foraging procedure of ant-lions. An ant's travel is random when searching for food; to model such movement, the random walk is determined as,

$$ X(t) = [0,cumulsum(2R(t_{1} ) - 1),cumulsum(2R(t_{2} ) - 1), \ldots ,cumulsum(2R(t_{n} ) - 1)] $$
(23)

Here, n is the maximum number of iterations, t represents the iteration, and R(t) is the stochastic function defined below,

$$ R\left( t \right) = \left\{ {\begin{array}{*{20}c} {1,\quad if\;rand > 0.5} \\ {0,\quad if\;rand \le 0.5} \\ \end{array} } \right. $$
(24)

where rand is a random number generated in the interval \([0,1]\). The ants' positions are stored in a matrix for optimization.

$$ Matrix_{{Ant}} = \left[ {\begin{array}{*{20}c} {Ant_{{1,1}} } & {Ant_{{1,2}} } & \cdots & {Ant_{{1,d}} } \\ \vdots & \vdots & \ddots & \vdots \\ {Ant_{{n,1}} } & {Ant_{{n,2}} } & \cdots & {Ant_{{n,d}} } \\ \end{array} } \right] $$
(25)

where \(Matrix_{Ant}\) is the matrix holding the position of each ant, n is the number of ants, and d is the number of variables. The position of an ant represents the parameters of a particular solution. A fitness function is utilized in the optimization procedure to evaluate each ant, and the fitness values obtained are stored in a matrix as follows,

$$ Matrix_{Fitness} = \left[ {\begin{array}{*{20}c} {f\left( {Ant_{1,1} ,Ant_{1,2} , \ldots ,Ant_{1,d} } \right)} \\ \vdots \\ {f\left( {Ant_{n,1} ,Ant_{n,2} , \ldots ,Ant_{n,d} } \right)} \\ \end{array} } \right] $$
(26)

As in the case of the ants, the positions and fitness values of the ant-lions are stored as \(Matrix_{antlion}\) and \(Matrix_{fital}\), respectively.

4.2.1 Ant’s random walks

The random walks follow Eq. (23), and the position is updated with a random walk in every optimization step. The expression below is utilized to confine the ant's random walk to the search space.

$$ X_{i}^{t} = \frac{{(X_{i}^{t} - A_{i} ) \times (D_{i}^{t} - C_{i}^{t} )}}{{(B_{i} - A_{i} )}} + C_{i}^{t} $$
(27)

In Eq. (27), \(A_{i}\) and \(B_{i}\) are the minimum and maximum of the random walk of the ith variable, \(C_{i}^{t}\) is the minimum of the ith variable at the tth iteration, and \(D_{i}^{t}\) is the maximum of the ith variable at the tth iteration.
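The walk of Eqs. (23)-(24) and its min-max rescaling into the variable bounds can be sketched as below; the target interval used in the example is an arbitrary illustrative choice.

```python
import random

# Ant's random walk (Eqs. (23)-(24)) and normalization into bounds (Eq. (27)).

def random_walk(n_iter, rng=random.random):
    """Cumulative sum of 2R(t) - 1 steps, starting from 0."""
    walk, pos = [0.0], 0.0
    for _ in range(n_iter):
        pos += 1.0 if rng() > 0.5 else -1.0   # 2R(t) - 1 is +1 or -1
        walk.append(pos)                       # cumulative sum
    return walk

def normalize(walk, c, d):
    """Rescale the walk from [min(walk), max(walk)] into [c, d]."""
    a, b = min(walk), max(walk)
    span = (b - a) or 1.0
    return [(x - a) * (d - c) / span + c for x in walk]

random.seed(0)
walk = random_walk(50)
bounded = normalize(walk, -1.0, 1.0)   # keeps the walk inside the search space
```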

4.2.2 Entrapment of ants in pits

An ant's random walk is influenced by the ant-lions' traps. This condition is modeled mathematically by the equations below,

$$ C_{i}^{t} = C^{t} + Antlion_{j}^{t} $$
(28)
$$ D_{i}^{t} = D^{t} + Antlion_{j}^{t} $$
(29)

where \(C^{t}\) and \(D^{t}\) are the minimum and maximum of all variables at the tth iteration, and \(Antlion_{j}^{t}\) is the position of the jth ant-lion at the tth iteration. The roulette-wheel concept is applied for an ant-lion to trap an ant: during the optimization procedure, the roulette wheel selects the fittest ant-lion for trapping.

4.2.3 Hunting prey and re-creating pits

The ant-lion catches the ant when the fitness of the ant becomes higher than that of the related ant-lion. The ant-lion then updates its location to the current location of the hunted ant, which increases the possibility of hunting another ant.

$$ Antlion_{j}^{t} = Ant_{i}^{t} \,if\;f(Ant_{i}^{t} ) > f(Antlion_{j}^{t} ) $$
(30)

4.2.4 Elitism

Elitism maintains the best solution obtained during optimization. The best ant-lion attained so far in every iteration is stored and termed the elite. The position of each ant is updated through a random walk around an ant-lion chosen by the roulette wheel and a random walk around the elite, as given below:

$$ Ant_{i}^{t} = \frac{{RW_{A}^{t} + RW_{E}^{t} }}{2} $$
(31)

where \(Ant_{i}^{t}\) is the position of the ith ant at the tth iteration, \(RW_{A}^{t}\) is the random walk around the ant-lion selected by the roulette wheel at the tth iteration, and \(RW_{E}^{t}\) is the random walk around the elite at the tth iteration [36].
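Eq. (31) reduces to a component-wise average of the two walks, which can be sketched as a one-liner (illustrative, assuming both walks are given as position vectors):

```python
def elitist_update(walk_around_antlion, walk_around_elite):
    """Eq. (31): the ant's new position is the average of a random walk
    around the roulette-selected ant-lion and one around the elite."""
    return [(a + e) / 2.0
            for a, e in zip(walk_around_antlion, walk_around_elite)]
```

For example, averaging the walk endpoints `[1.0, 3.0]` and `[3.0, 5.0]` yields `[2.0, 4.0]`.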

4.3 Particle swarm optimization

PSO is a stochastic optimization technique based on the behavior of animal swarms. Here, particles are the basic elements that traverse the problem space, and their movement forms the result of the optimization problem. Position and velocity are the particle attributes. The best position achieved by each particle, \(pbest_{i}\), and the global best position attained by any particle of the whole swarm, \(gbest\), are used to calculate the velocity of a particle in every iteration of the algorithm. A fitness function evaluates the goodness of a particle's position at each step. Depending on the fitness function, the velocity of every particle is updated toward the personal best and global best positions. The velocity and position of the particles are updated according to Eqs. (32) and (33), respectively [37]. The algorithm continues until a stopping criterion is satisfied. The movement of particles toward the personal best and global best positions is controlled by random numbers.

$$ vel_{i} (t + 1) = IW(t)vel_{i} (t) + \phi_{1} r_{1} (pbest_{i} - X_{i} (t)) + \phi_{2} r_{2} (gbest - X_{i} (t)) $$
(32)

The velocity equation contains several parameters that influence the convergence of the algorithm, namely \(IW(t)\), \(\varphi_{1}\) and \(\varphi_{2}\). In the equation, t is the iteration number and \(IW(t)\) is the inertia weight at cycle t. \(pbest_{i}\) and \(gbest\) are the best location of particle i and the global best location over all particles, and r1 and r2 are two random numbers associated with \(\varphi_{1}\) and \(\varphi_{2}\). Here \(\varphi_{1}\) is the learning (cognitive) factor and \(\varphi_{2}\) is the social factor, and they fulfil the condition \(\varphi_{1} + \varphi_{2} \ge 4\) [38]. Using the velocity equation, the position of the particle is updated:

$$ X_{i} (t + 1) = X_{i} (t) + vel_{i} (t + 1) $$
(33)
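One iteration of Eqs. (32)–(33) can be sketched as below; the default values of `iw`, `phi1`, and `phi2` are illustrative (chosen so that \(\varphi_{1} + \varphi_{2} \ge 4\) holds, as the text requires), not the paper's tuned settings:

```python
import random

def pso_step(pos, vel, pbest, gbest, iw=0.7, phi1=2.05, phi2=2.05):
    """One PSO iteration per Eqs. (32)-(33). `pos`, `vel`, `pbest`,
    `gbest` are vectors over the problem dimensions. Illustrative."""
    new_vel = []
    for x, v, pb, gb in zip(pos, vel, pbest, gbest):
        r1, r2 = random.random(), random.random()   # per-dimension r1, r2
        new_vel.append(iw * v
                       + phi1 * r1 * (pb - x)       # cognitive pull
                       + phi2 * r2 * (gb - x))      # social pull
    new_pos = [x + nv for x, nv in zip(pos, new_vel)]  # Eq. (33)
    return new_pos, new_vel
```

Note that a particle sitting exactly at both its personal best and the global best, with zero velocity, stays put, since both pull terms vanish.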

4.4 Workflow scheduling using hybrid ALO and PSO

Finding a technique to decide the type and number of VMs is the initial step in solving the problem. On-demand VMs are costlier than pre-allocated ones, so the algorithm prefers to utilize allocated VMs. Here, predefined sets of VMs are configured to prepare the scheduling information. Different initial sets of resources can be used to find various solutions that achieve the scheduling objective. Only a single pricing method is used for a single schedule; hence, the pricing standards of cloud suppliers do not affect the cost. The scheduler assigns the tasks in the workflow to cloud VMs, records the execution time, and refreshes the task-history database. The step-by-step procedure of the scheduler is displayed in Algorithm 2.

[Algorithm 2]

The scheduler manages the optimization algorithm, controls the processing, and is responsible for terminating workflows. In cloud-based frameworks, different strategies and methodologies are used for the workflow scheduling process [17]. Various types of algorithms were used in past research works. The present work utilizes a hybrid of ALO and PSO for a powerful scheduling framework. PSO is a stochastic method based on group cooperation that simulates the foraging behavior of birds, while the ant-lion population is one of the vital parts of the ALO methodology. The proposed hybrid algorithm is presented in Algorithm 3.

[Algorithm 3]

The proposed system has characteristics of both the ALO and PSO algorithms. Ant-lions with better communication capability and memory can advance faster toward the ideal solution. In the hybrid algorithm, the search qualities of ALO are kept, and the communication advantage of PSO is embedded. During workflow scheduling, the parameters cost, makespan, and load are optimized. The ALO strategy forms the scheduling framework, and the PSO calculations help to search for better ant-lions within the ALO algorithm. This combination improves the overall optimization mechanism.
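A toy sketch of such a hybrid loop is given below, under the simplifying assumption that the ALO part is reduced to elitism plus a bounded random walk and the PSO part to a velocity pull toward the elite; all names and parameters are illustrative, not the authors' implementation:

```python
import random

def hybrid_alpso(fitness, dim, n=10, iters=50, lo=-5.0, hi=5.0):
    """Toy hybrid: ALO-style elitism keeps the best ant-lion found so
    far, while a PSO-style velocity term pulls every candidate toward
    it. Minimises `fitness`; illustrative sketch only."""
    ants = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    elite = min(ants, key=fitness)[:]               # ALO elitism (copy!)
    for _ in range(iters):
        for i, ant in enumerate(ants):
            for d in range(dim):
                walk = ant[d] + random.uniform(-0.5, 0.5)  # ALO random walk
                vel[i][d] = (0.7 * vel[i][d]
                             + 2.0 * random.random() * (elite[d] - ant[d]))
                pso_pos = ant[d] + vel[i][d]               # PSO-style move
                # Eq. (31)-style average of the two proposals, clamped
                ant[d] = min(hi, max(lo, (walk + pso_pos) / 2.0))
            if fitness(ant) < fitness(elite):       # replace elite if better
                elite = ant[:]
    return elite
```

The actual scheduler optimizes cost, makespan, and load over task-to-VM assignments rather than a continuous toy objective, but the division of labour (ALO structure, PSO refinement) is the same.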

5 Simulation outcomes and discussions

In this section, the simulation outcomes and execution details of the proposed system are discussed. The CloudSim toolkit is utilized for cloud simulation; it is an extensible simulation framework that enables consistent modelling and simulation of cloud resources and applications. Workflow scheduling is optimized in terms of cost, makespan, and load, and security is also provided. The simulations operate on a dataset created with 100 tasks, and the communication and execution times of the tasks are considered for the experimentation. The tests were run on an Intel i7 quad-core processor with 12 GB of DDR4 RAM. This section presents the test arrangement and examines the test results. The scheduling scheme is enhanced with the proposed hybrid algorithm, and data security is guaranteed by using the DES algorithm.

5.1 Task representation

The best way to represent the tasks of a workflow application is as a DAG, i.e., \(G = (T,D)\), where T denotes the set of n vertices or tasks, i.e., \(T = \{ T_{0} ,T_{1} , \ldots T_{n} \}\), and D denotes the set of edges. An edge measures the transfer delay time between two tasks Ti and Tj and also represents the dependency constraint among tasks: for example, Tj cannot start unless task Ti completes its execution. Tasks without any predecessor or successor are termed entry \((T_{entry} )\) and exit \((T_{exit} )\) tasks, respectively. If the DAG contains more than one \((T_{entry} )\) or \((T_{exit} )\), a new pseudo entry or exit task is appended with zero transfer delay and computation time, and all \((T_{entry} )\) and \((T_{exit} )\) tasks are connected to the newly created pseudo task [16]. A sample DAG with 10 tasks is shown in Fig. 4.

Fig. 4 Sample DAG structure for 10 tasks

In this experimentation, the processing is carried out with 100 tasks. The task properties and the relations between the tasks are taken for the evaluation.
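The DAG construction described above, including the pseudo entry/exit tasks, can be sketched as follows; the data layout (`tasks` as a computation-time map, `edges` as a transfer-delay map) is an assumption for illustration:

```python
def add_pseudo_tasks(edges, tasks):
    """If the DAG has more than one entry or exit task, append a pseudo
    entry/exit task with zero computation time, connected with zero
    transfer delay. `tasks` maps task -> computation time and `edges`
    maps (src, dst) -> transfer delay. Illustrative sketch."""
    preds = {t: set() for t in tasks}
    succs = {t: set() for t in tasks}
    for (u, v) in edges:
        preds[v].add(u)
        succs[u].add(v)
    entries = [t for t in tasks if not preds[t]]   # tasks with no predecessor
    exits = [t for t in tasks if not succs[t]]     # tasks with no successor
    new_tasks, new_edges = dict(tasks), dict(edges)
    if len(entries) > 1:
        new_tasks["T_entry"] = 0.0
        for t in entries:
            new_edges[("T_entry", t)] = 0.0
    if len(exits) > 1:
        new_tasks["T_exit"] = 0.0
        for t in exits:
            new_edges[(t, "T_exit")] = 0.0
    return new_tasks, new_edges
```

For instance, a DAG in which T0 and T1 both feed T2 has two entry tasks, so a single zero-cost `T_entry` is prepended to both.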

The characteristics of the VMs utilized in the workflow scheduling process are indicated in Table 2. Five VMs with 512 MB memory each are considered for execution. The use of VMs brings down the amount of hardware required and limits processing costs.

Table 2 Characteristics of VM

The properties of the tasks used in the execution of workflows are indicated in Table 3. In this work, 100 tasks with various lengths are taken; the file size, output size, and required CPU of only 30 tasks are indicated in the table.

Table 3 Task properties

5.2 Comparison of proposed algorithm with existing methods

To distinguish the viability of the proposed system, five different algorithms are examined and their outcomes compared. The tests were conducted on the same test bed, so the comparison is consistent. Round-robin (RR) scheduling, ALO, PSO, hybrid GA-PSO, and GA [19] are taken for contrast with the proposed ALO-PSO method. In this experimentation, the values of the optimization objectives makespan, cost, and load are first obtained and compared with the above-mentioned existing methods. Proper load balancing with less task migration and reduced makespan enhances the overall performance; to ascertain this, energy consumption and system reliability values are also examined. For the calculation of these values, the energy and reliability models offered in the literature [29], discussed in Sect. 3.4, are followed. The existing scheduling methods are implemented as per the algorithms available in the literature. The ant-lion and particle swarm optimization algorithms were examined in the section above. The remaining procedure utilized for scheduling is round-robin: RR scheduling depends on time-sharing, giving each task a scheduled timeslot, and the scheduler chooses the task in the ready queue to execute; this is considered a pre-emptive technique. The comparative results of the proposed and existing optimization procedures are demonstrated by means of cost, makespan, system reliability, load balancing rate, and energy consumption.

5.2.1 Cost evaluation

The execution costs of the optimization algorithms for varying numbers of tasks are indicated in Table 4. The cost is calculated by considering the costs of the data centre, computing capacity, memory, bandwidth, and storage of each VM instance type. A particular cost for each type is taken in the simulation, and the total cost is calculated by summing the processing cost of each task. The results show that the proposed ALPSO achieves the lowest cost; this is attributed to the minimal utilization of resources by the hybrid algorithm through proper allocation of tasks. It also selects the appropriate VMs that yield the least implementation cost for executing all the tasks.
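The per-task cost summation described above can be sketched as below; the assignment layout and the pricing map are illustrative assumptions, not the paper's cost model verbatim:

```python
def total_cost(assignments, unit_cost):
    """Total schedule cost as the sum of each task's processing cost:
    execution time on the assigned VM times that VM type's unit price.
    `assignments` is a list of (vm_type, exec_time) pairs and
    `unit_cost` maps vm_type -> price per unit time. Illustrative."""
    return sum(exec_time * unit_cost[vm_type]
               for vm_type, exec_time in assignments)
```

For example, a task running 2.0 units on a "small" VM priced at 1.5 and a task running 1.0 unit on a "large" VM priced at 4.0 give a total cost of 7.0.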

Table 4 Cost evaluation

The cost analysis in Fig. 5 points out the lowest execution cost of the proposed ALPSO. When a large number of VMs is available, the probability of picking appropriate VMs is high, owing to the extensive search ability of the proposed hybrid algorithm. An upsurge in the number of tasks increases the cost of scheduling, and the existing algorithms incur more cost than the proposed hybrid algorithm. From this graph, it can be seen that the proposed algorithm optimizes the overall cost. Further observations from the cost values are that the deterministic algorithm RR consumes more cost than the considered stochastic algorithms, and that hybrid algorithms give better results than pure ones. For example, PSO and ALO give cost values of 245 and 256, respectively, for 100 tasks, while the hybrids GAPSO and ALPSO incur only 235.58 and 179, respectively.

Fig. 5 Cost analysis of existing and proposed algorithms

5.2.2 Makespan evaluation

The total length of the schedule until all tasks are finished is termed the makespan. It is usually determined by assessing the time difference between the start and end of a schedule [39]. The makespan of a workflow is calculated using Eqs. (12) and (13).
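Under this definition, the makespan of a finished schedule can be computed with a minimal sketch, where `schedule` is assumed to be a list of (start, finish) pairs:

```python
def makespan(schedule):
    """Makespan = latest finish time minus earliest start time across
    all scheduled tasks. `schedule` is a list of (start, finish) pairs;
    illustrative sketch only."""
    starts = [s for s, _ in schedule]
    finishes = [f for _, f in schedule]
    return max(finishes) - min(starts)
```

So three tasks spanning (0, 4), (2, 9), and (1, 7) yield a makespan of 9.0.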

The makespan values of the optimization algorithms for varying numbers of tasks are given in Table 5. The results show that the proposed ALPSO achieves a lower makespan than the other compared algorithms: ALPSO presents improvements of 8%, 10%, 20%, 35%, and 45% in comparison with GAPSO, PSO, ALO, RR, and GA, respectively. This improvement is due to the better search ability and convergence capacity of the proposed algorithm.

Table 5 Makespan

Figure 6 demonstrates the performance of the proposed ALPSO compared to the other algorithms in the makespan analysis. The deterministic algorithm RR, and GA among the metaheuristic algorithms, take more time to execute all tasks. The proposed algorithm shows consistent performance for all numbers of tasks. It is also worth noting that hybridization significantly improves the makespan: the ant-lion and particle swarm algorithms, taken alone, show higher makespan values than the proposed one.

Fig. 6 Comparison of makespan

5.2.3 Load balance evaluation

Load balancing procedures move tasks to under-loaded VMs for efficient workflow scheduling, and distinct load balancing techniques exist in cloud computing. The efficiency of an optimization algorithm lies in its ability to place tasks on the most suitable VMs at the initial phase itself. This simulation pursues the task migration process when overloading occurs in VMs: the overloaded tasks are moved to another VM. Our proposed algorithm tries to minimize task migration for better scheduling, which can be achieved by limiting the load on the VMs. Load balancing maximizes throughput and minimizes the reaction time [40].
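A simplified sketch of this migration step, assuming unit-sized load quanta and a fixed per-VM capacity (both illustrative assumptions, not the paper's model), might look like:

```python
def migrate_overloaded(vm_loads, capacity):
    """Move excess load units from over-loaded VMs to the currently
    least-loaded VM, counting migrations. Illustrative sketch."""
    migrations = 0
    loads = dict(vm_loads)
    for vm in list(loads):
        while loads[vm] > capacity:
            target = min(loads, key=loads.get)   # least-loaded VM
            if target == vm:
                break                            # nowhere to move the load
            loads[vm] -= 1
            loads[target] += 1
            migrations += 1
    return loads, migrations
```

Starting from loads {vm1: 6, vm2: 2} with capacity 4, two unit migrations balance the system at {vm1: 4, vm2: 4}; an algorithm that places tasks well initially would need zero such migrations.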

The load balancing values of the proposed and existing optimization algorithms are indicated in Table 6. An efficient optimization algorithm causes less overloading. By using the PSO component, the ALPSO method converges to the solution in a better way, avoiding unnecessary diversity that may worsen the solution quality, whereas a non-efficient algorithm puts more load on the VMs [42].

Table 6 Load balancing in scheduling

The performance comparison of load balancing with varying numbers of tasks is indicated in Fig. 7. As the number of tasks increases, the overhead on the VMs naturally increases, and the number of task migrations also increases, which is clear from Fig. 7. The overloaded tasks are migrated to a new VM and the loads are balanced. From the results, it is clear that better load optimization is possible with the proposed hybrid method. The GA scheduling algorithm shows the highest possibility of VM overloading and hence more migrations, but for larger numbers of tasks GA shows a comparatively good load balancing rate: as shown in Table 6, the load balancing rate of GA for 20 tasks is 52.8, whereas it is only 49.2 for 100 tasks. Our proposed algorithm, however, exhibits consistent performance for all numbers of tasks.

Fig. 7 Evaluation of load balancing

5.2.4 Energy consumption evaluation

The total energy consumed by the computing resources that execute all tasks of the workflow application is termed the energy consumption. Table 7 provides the comparison of energy consumption values for the existing algorithms.

Table 7 Comparison of energy consumption

The evaluation of energy consumption is displayed in Fig. 8. The performance is compared with previous methods, namely reliability and energy efficient workflow scheduling (REEWS), HEFT, Power Aware List-based Scheduling (PALS), and RHEFT [29]. The REEWS and PALS algorithms, which focus on the energy parameter, consume less energy than HEFT and RHEFT. The energy consumption rate of the proposed method is as good as that of REEWS, and it outperforms PALS.

Fig. 8 Evaluation of energy consumption

5.2.5 Reliability evaluation

Table 8 provides the comparison of system reliability values for the existing methods. The probability of executing all tasks without any failure is defined as the reliability of the system.

Table 8 Comparison of system reliability

Figure 9 displays the evaluation of system reliability. The results show the good reliability performance of the solution achieved by the proposed method. The REEWS algorithm, which is specifically designed for reliability and energy, exhibits high reliability. In the case of PALS, however, reliability is low because it focuses on optimizing execution time and energy consumption. The reliability of both HEFT and RHEFT is high, as they select the normal frequency of the processor. From the above analysis, it is evident that the reliability and energy consumption rate of the schedules created by the proposed method are comparable with those of algorithms specifically designed for energy and reliability.

Fig. 9 Evaluation of system reliability

6 Conclusion

In past years, researchers focused on cloud workflow scheduling with a single objective; more recently, some research has attempted to solve multi-objective problems. This research is predominantly intended to lessen the cost, makespan, and load of workflow applications on IaaS clouds in the context of workflow execution. The situation was modelled as a multi-objective optimization problem, and a hybrid multi-objective ALO and PSO algorithm is proposed as the solution. In the present work, the adequacy of the hybrid algorithm is examined for multi-objective workflow scheduling in the cloud model. The workflow datasets are executed and assessed using the CloudSim tool. The novel hybrid algorithm returned better scheduling plans that reduce cost, makespan, and load, and the security procedure includes a data encryption model that offers better security during scheduling. Simulations are conducted using randomly generated workflows with varying numbers of tasks, and the results prove the superiority of hybrid meta-heuristic algorithms in scheduling workflows. The proposed hybrid algorithm outperforms previous meta-heuristic algorithms such as GA-PSO, PSO, ALO, and the deterministic algorithm RR. Simulation values show that the proposed method reduces the cost by 9.8% compared to GA-PSO, 10% compared to PSO, 20% compared to ALO, 30% compared to RR, and 12% compared to GA. The load balancing and makespan objectives of the proposed method improve by 8% over GA-PSO, 10% over PSO, 20% over ALO, 35% over RR, and 45% over GA. The energy consumption and reliability performance of the solution generated by the proposed method is also promising. The high convergence rate and search ability of the ALO algorithm, as well as the communication capability of PSO, contribute significantly to the performance of the proposed method. These strategies can be applied to resolve diverse multi-objective optimization issues in the workflow scheduling scenario.
In the future, the present work can be extended by utilizing diverse QoS requirements, such as success rate and trust management, to perform task scheduling efficiently. It would also be good to consider more than one pricing scheme for VM leasing in future enhancements. Options for executing and extending these strategies in multi-cloud environments are also suggested; a multi-cloud environment integrates multiple clouds to offer a unified service in a collaborative manner.