Introduction

Grid computing is a huge accumulation of heterogeneous autonomous systems, which is globally distributed over different locations and connected by heterogeneous networks. The major challenge faced by the computational grid is to share the resources. Foster et al. [1, 2] had introduced the grid computing, at the time itself the researchers paid attention over the problem of resource allocation due to its flexible behavior for solving the complex problems in the fields like scientific simulation, bio-informatics, drug discovery etc. Unlike compared with the scheduling issues in traditional distributed systems, it seems much more complexes to solve the scheduling issue in grid computing, due to its dynamic behavior and heterogeneous jobs and resources. The major problem for the grid environment is to minimize the makespan [3] and cost [4] of the model.

This paper deals with a hybrid mechanism called as genetic algorithm and ant colony optimization (GAACO) for scheduling of workflows in grid environment, which is a combination of genetic algorithm (GA) and ant colony optimization (ACO). The model has been tested by using the grid simulator and the results of the proposed approach is compared with stand alone GA and ACO algorithms. For the evaluation of GAACO approach, the model considered different sizes of grids. Based on the market-driven structure of global computation grids which is explained in [5, 6], the architectural design of workflow management (WfM) can be illustrated in Fig. 1.

Fig. 1
figure 1

The architecture of workflow management in grid [7]

Problem Formulation

Here the problem by using direct acyclic graph (DAG) is being modelled; where G = (V, A). Assume that the number of tasks we consider as n, set of nodes in a graph V = {T1, T2, …, Tn} related to the tasks of the abstract workflow and relation between tasks is denoted by using a set of arcs A.

Let {Ti, Tj} represents the structure of the arc, where the parent task is represented by Ti and the child task is represented by Tj. Here the condition is, the child tasks executes after the parent tasks completes its execution. P (Ti) is denoted as set of parents and C (Tj) is denoted as set of child tasks.

Every task Ti has an associated domain \({\text{D}}_{\text{i}} = \left\{ {{\text{d}}_{\text{i}}^{ 1} ,{\text{d}}_{\text{i}}^{ 2} , \, \ldots ,{\text{ d}}_{\text{i}}^{\text{m}} } \right\}\) where \({\text{d}}_{\text{i}}^{\text{j}} \left( {1 \le {\text{j}} \le {\text{m}}} \right)\) denotes a service request carried by the grid service provider and m represents the total number service requests available for the task Ti. The \({\text{d}}_{\text{i}}^{\text{j}} \cdot {\text{t}}\) and \({\text{d}}_{\text{i}}^{\text{j}} \cdot {\text{c}}\) represents the two properties called as the time and cost of a related domain in the grid service provider.

This model deals with the precedence constraints identified as a functioning mechanism in DAG. The tasks order of execution is specified by the precedence constraints to achieve the optimal solution. In this paper two kinds of constraints have been considered.

Time/Makespan Constraints

The makespan value of the scheduling workflow will not exceed when compared to the user specified deadline. Otherwise, it satisfies the following condition.

$$\text {S} \cdot {\text{makespan}} \le \text {DL}$$
(1)

where S makespan denotes the execution time of S and DL denotes the deadline which is specified by the user.

Cost Constraints

For a given schedule S (S1, S2… Sn), it satisfies the following condition.

$$\text{S}\cos \text{t} = \sum\limits_{\text{i} = 1}^{\text{n}} {\text{d}_{\text{i}}^{{\text{s}_{\text{i}} }} } \text{c} \le \text{BD}$$
(2)

where S cost denotes the total cost of S; and BD represents the variable budget which is specified by the user. The total cost is less than the user specified budget.

A Hybrid GAACO Algorithm for Scheduling Problem

This section deals with the hybridization of GAACO which is hybridizing of two meta-heuristics approaches. The GAACO approach is a loosely coupled method which runs GA as the first heuristic and then followed by the ACO which is shown in Fig. 2. GA initializes the population and generates the new population by selection, crossover and mutation process. Then GA is activated, which produces an optimal scheduling output [7] as a solution upon reaching the convergence criteria. The optimal solution which is generated by GA is carried as input to the ACO approach. The GAACO approach iterates this procedure until the algorithm reaches to the convergence criteria.

Fig. 2
figure 2

Flow diagram for hybrid GAACO algorithm

Genetic Algorithm for Scheduling Mechanism in Grid

Genetic algorithm is treated as a class of adaptive, evolutionary and stochastic algorithms which is involved in both searching and optimization. The idea behind the GA is evolved from the natural process in order to create simulated process for an effective algorithm to find the solution for the scheduling mechanism in grid computing [810].

Makespan

The main objective of the GA is to reduce the makespan and to calculate the maximum completion time of a task among the overall scheduling workflows.

$${\text{Smakespan}} = \frac{{\text{L}_{\text{i}} }}{{\text{SP}_{\text{j}} }}$$
(3)

where Li denotes the size of the current task i; and SPj denotes the processing speed of the scheduling workflow j. Based on this, the equation is formulated in Eq. (3).

Minimum Cost

The next objective of GA is to reduce the total cost of the scheduling process. The unit price of the scheduling workflow is denoted by Pj. Therefore, the scheduling cost is calculated by using the Eq. (4).

$$\text{S}\cos \text{t}(\text{i},\text{j}) = \text{S}{\text{makespan}}(\text{i},\text{j})\text{P}_{\text{j}}$$
(4)

The total cost of the chromosome is calculated by Eq. (5).

$$\text{S}\cos \text{t}(\upalpha ) = \sum\limits_{\text{j} = 1}^{\text{m}} {\text{S}\cos \text{t}(\text{j})} ; \, \{ 1 \le \text{j} \le \text{m}\}$$
(5)

where Scos t(α) represents the overall chromosome cost resulting in the population.

Implementing a Fitness Function

While introducing the tasks, the concentration of users are at the completion time and costs of executing jobs [11]. For this reason the weight factor is calculated by the user for both time and cost (Wt and Wc). The value of weights lie in the range of [0 and 1] and the summation of weights is equal to 1.

$$\text{F} =\text{W}_{\text{t}} \times \frac{\text{S}\cos \text{t}}{\text{BD}} + \text{W}_{\text{c}} \times \frac{{\text{S}{\text{ makespan}}}}{\text{DL}}$$
(6)

For example, user is concentrated on completion time as 60 and 40 % for financial cost. Wt and Wc will give more freedom to the user to place their jobs into the grid. Therefore, the Eq. (6) specifies the fitness function of a chromosome [12, 13].

Ant Colony Optimization for Scheduling Mechanism in Grid

This paper combines GA with ACO to find the best optimal solution. The ACO is treated as best algorithm for finding the solution for the scheduling problem.

Representation of Pheromone and Heuristic Function

Pheromone and heuristic function is treated as the major factor in ACO algorithm.In general, pheromone is a mechanism which uses the past searching behaviour and predicts the future search. On the other side, the heuristic information contains the details about the searching behaviour of ants. Let us consider, \({\text{d}}_{\text{i}}^{\text{j}}\) represents the service instance which is mapped by the pheromone to task Ti as μij where 1 ≤ i ≤ n, 1 ≤ j ≤ m and the mapping value of heuristic function is given as ηij.

Initially all pheromone values are set to µ0

$$\upmu_{\text{ij}} = \,\upmu_{0} ,$$
(7)
  1. (i)

    Time Heuristic (TH)

Time heuristic preferences the scheduling instances which having the minimum execution time and allocate these scheduling instances to the ants. Consider that, the heuristic type of ant is TH and the value of heuristic mapping function ηij is calculated as follows.

$$\upeta_{\text{ij}} = \text{TH}_{\text{ij}} = \frac{{\hbox{max} \_\text{t}_{\text{i}} - \text{d}_{\text{i}}^{\text{j}} \cdot \text{t} + 1}}{{\hbox{max} \_\text{t}_{\text{i}} - \hbox{min} \_\text{t}_{\text{i}} + 1}}$$
(8)

where \({ \hbox{max} }\_\text{t}_{\text{i}} = { \hbox{max} } _{1 \le \text{j} \le \text{m}} \left\{ {\text{d}_{\text{i}}^{\text{j}} \text{t}} \right\}\) and \({ \hbox{min} }\_\text{t}_{\text{i}} = { \hbox{min} } _{1 \le \text{j} \le \text{m}} \left\{ {\text{d}_{\text{i}}^{\text{j}} \text{t}} \right\}\). By evaluating the Eq. (8), TH contains the greater heuristic value and ηij €{0,1}.

  1. (ii)

    Cost Heuristic (CH)

Cost heuristic preferences the scheduling instances which having the minimum cost and allocate these scheduling instances to the ants. Consider that, the heuristic type of ant is CH and the value of heuristic mapping function ηij is calculated as follows.

$$\upeta_{\text{ij}} = \text{CH}_{\text{ij}} = \frac{{\hbox{max} \_\text{c}_{\text{i}} - \text{d}_{\text{i}}^{\text{j}} \text{t} + 1}}{{\hbox{max} \_\text{c}_{\text{i}} - \hbox{min} \_\text{c}_{\text{i}} + 1}}$$
(9)

where \({ \hbox{max} }\_\text{c}_{\text{i}} = { \hbox{max} } _{1 \le \text{j} \le \text{m}} \left\{ {\text{d}_{\text{i}}^{\text{j}} \text{c}} \right\}\) and \({ \hbox{min} }\_\text{c}_{\text{i}} = { \hbox{min} } _{1 \le \text{j} \le \text{m}} \left\{ {\text{d}_{\text{i}}^{\text{j}} \text{c}} \right\}\).

By evaluating the Eq. (9), CH contains the greater heuristic value and ηij €{0,1}.

  1. (iii)

    Suggested Deadline Heuristic (SDH)

The Suggested Deadline Heuristic preferences the newly generated service instances to the ants. The model allocates SDH to each task based on the user defined deadline. The following mechanism illustrates how to compute SDH for each task.

  • Task Ti Earliest Start Time (ESTi) and Backward Earliest Start Time (BESTi)

To compute earliest start time, the service instances with minimum execution time is mapped with the task Ti. Then the model rotates the DAG into reverse system by considering the beginning node as the finishing node and the other way around, and treating the all directed routing arcs. For each task Ti, the ESTi in reverse system is treated as BESTi. BY considering both mechanisms the model can estimate the avg_min_ti of task Ti from both ahead navigate and in reverse navigate which is given in Eq. (10).

$${\text{avg}}\_\hbox{min} \_\text{t}_{\text{i}} = \frac{{(\min_{{\forall \mathop \text{T}\nolimits_{\text{j}} \in {\text{Pred}}(\mathop \text{T}\nolimits_{\text{i}} )}} \{ {\text{BEST}}_{\text{j}} \} - {\text{BEST}}_{\text{i}} ) + (\min_{{\forall \mathop \text{T}\nolimits_{\text{j}} \in \text{suc}(\mathop \text{T}\nolimits_{\text{i}} )}} \{ \text{EST}_{\text{j}} \} - \text{EST}_{\text{i}} )}}{2}$$
(10)

On a scale of deadline the avg_min_t is expanded with min_makespan, then the model calculates the value of SDHi as

$$\text{SDH}_{\text{i}} = {\text{avg}}\_\hbox{min} \_\text{t}_{\text{i}} \frac{\text{DL}}{{\hbox{min} \_{\text{makespan}}}}$$
(11)

Consider that, the heuristic type of ant is SDH and the value of heuristic mapping function ηij is calculated as follows.

$$\upeta_{\text{ij}} = \frac{{\hbox{max} \{ |\hbox{max} \_\text{t}_{\text{i}} - \text{SDH}_{\text{i}} |,|\text{SDH}_{\text{i}} - \hbox{min} \_\text{t}_{\text{i}} |\} - |\text{d}_{\text{i}}^{\text{j}} .\text{t} - \text{SDH}_{\text{i}} | + 1}}{{\hbox{max} \{ |\hbox{max} \_\text{t}_{\text{i}} - \text{SDH}_{\text{i}} |,|\text{SDH}_{\text{i}} - \hbox{min} \_\text{t}_{\text{i}} |\} + 1}}$$
(12)

Based on the Eq. (12), the service instance of a task and the execution time is almost equal to the SDHi, which is related with a greater heuristic value and ηij € {0,1}.

  1. (iv)

    Budget Heuristic (BH)

It is similar to SDH, the BH preferences the newly generated service instances to the ants. The model allocates suggested budget to each task based on the user defined budget cost. The following mechanism illustrates how to compute BH for each task. one can obtain the min_c of the whole work by evaluating Eq. (13).

$$\text{BH}_{\text{i}} = \hbox{min} \_\text{c}_{\text{i}} \frac{\text{BD}}{\hbox{min} \_\text{c}}$$
(13)

Consider that, the heuristic type of ant is BH and the value of heuristic mapping function ηij is calculated as follows.

$$\upeta_{\text{ij}} = \frac{{\hbox{max} \{ |\hbox{max} \_\text{c}_{\text{i}} - \text{BH}_{\text{i}} |,|\text{BH}_{\text{i}} - \hbox{min} \_\text{c}_{\text{i}} |\} - |\text{d}_{\text{i}}^{\text{j}} .\text{c} - \text{BH}_{\text{i}} | + 1}}{{\hbox{max} \{ |\hbox{max} \_\text{c}_{\text{i}} - \text{BH}_{\text{i}} |,| \text{BH}_{\text{i}} - \hbox{min} \_\text{c}_{\text{i}} |\} }}$$
(14)

Based on the Eq. (14), the service instance of a task and the budget cost is almost equal to the BHi, which is related with a greater heuristic value and ηij € {0,1}.

  1. (v)

    Time/Cost Heuristic (TCH)

The TCH combines the efficiency of both time and cost heuristics of service instances. Consider that, the heuristic type of ant is TCH and the value of heuristic mapping function ηij is calculated as follows.

$$\upeta_{\text{ij}} = \frac{1}{2}(\text{TH}_{\text{ij}} + \text{CH}_{\text{ij}} )$$
(15)

Based on the Eq. (15), the service instance of a task with minimum execution time and cost, which is related with a greater heuristic value and ηij € {0,1}.

Construction of Solution

In this method, the number of ants is directly proportional to the number of solutions. Each ant manages their construction process and simultaneously all ants would construct their solutions. Firstly,based on their pheromone values ants selects their heuristic information. The roulette wheel selection scheme is used for selecting the heuristics. After selection, ants will start finding solution schedules to the problem. In each step, ants chooses the service instance according to the heuristic information which is selected and the pheromone values is mapped to unmapped tasks. The Eq. (16) explains about the selection rule.

$$\text{d}_{\text{i}}^{\text{j}} { = }\left\{ {\text{apply the Roulette wheel scheme}} \right\}$$
(16)

In every iteration, each service instance is mapped with a task. This process will continue until it reaches to convergence criteria.

Management of Pheromone

  1. (i)

    Initialization of Pheromone

As per the ACO algorithm the pheromone values is set to μ0, μ0 is the least value of the pheromone [14, 15]. In Eq. (17), the μ0 is calculated by mapping the tasks to service instances.

$$\upmu_{0} = \left\{ {\frac{{\frac{{\hbox{min} \_{\text{makespan}}}}{{\hbox{max} \_{\text{makespan}}}}}}{{\frac{\hbox{min} \_\cos t}{\hbox{max} \_\cos t}}}} \right.$$
(17)
  1. (ii)

    Local Pheromone

In the ACO algorithm, the immediate process after mapping the tasks to the service instance is local pheromone updating. The local pheromone updating is given as

$$\upmu_{\text{ij}} = (1 - {\uprho} )\upmu_{\text{ij}} + {\uprho}\upmu_{0}$$
(18)

In Eq. (18), ρ € (0, 1) is a parameter, the main behaviour of the local updating pheromone rule is to minimize the µij to extend diversity of the approach.

  1. (iii)

    Global Pheromone

The global pheromone update takes place when all ants find their solutions. First the solutions in the iteration is compared by the algorithm and the solution schedule S is computed by the following Eqs. (19) and (20).

In Eq. (19), the S score in the case of optimising makespan is given as

$$\text{S}{\text{score}} = \left\{ {\frac{\text{BD}}{\begin{aligned} \text{S}\ \text{cos t} \hfill \\ 1 + \frac{{\hbox{min} \_{\text{makespan}}}}{{\text{S} {\text{makespan}}}},\quad {\text{if }}\text{S}\ \text{cos t} \le \text{BD} \hfill \\ \end{aligned} }} \right. + \frac{{\hbox{min} \_{\text{makespan}}}}{{\hbox{max} \_{\text{makespan}}}},\quad {\text{if }}\text{S}\ \text{cos t} \le \text{BD}$$
(19)

In Eq. (20), the S score in the case of optimising cost is given as

$$\text{S}{\text{score = }}\left\{ {\frac{\text{DL}}{\begin{aligned} \text{S} {\text{makespan}} \hfill \\ 1 + \frac{\hbox{min} \_\cos \text{t}}{\text{S}.\cos \text{t}},\quad {\text{if }}\text{S}{\text{makespan}} \le \text{DL} \hfill \\ \end{aligned} }} \right. + \frac{\hbox{min} \_\cos \text{t}}{\hbox{max} \_\cos \text{t}},\quad {\text{if }}\text{S}{\text{makespan}} \le \text{DL} $$
(20)

The components which has the best solution so for, there only we can apply this global pheromone updating rule. The solution has the highest score value. Let us consider S (S1, S2, …, Sn) is the best solution so for, (S1, S2, …, Sn) means Ti task is associated with the service instance \({\text{d}}_{\text{i}}^{\text{j}}\), the updating global pheromone rule is given in Eq. (21).

$${\upmu_{\text{i}} \text{S}_{\text{i}}} = (1 - \uprho )\upmu_{\text{i}} \text {S}_{\text{i}} + \text{S} {\text{score}};\quad {\text{where i}} = 1,2, \ldots, {\text{n}}$$
(21)

The results of the global pheromone update rule improves the pheromone values which is related with the best solutions obtained so for. So that, it can increase the efficiency of mapping in later iterations.

  1. (iv)

    Experimental Analysis

  2. (a)

    Simulation Setup

The GridSim toolkit is used to evaluate the performance of proposed GAACO approach. Three grid environments with different specifications are considered, small grid with 32 hosts/521 machines, medium grid with 64 hosts/1024 machines, and large grid with 128 hosts/2048 machines respectively.

  1. (b)

    Parameters of the GAACO Algorithm

GA is evaluated by using the parameters which are denoted as generation size, crossover rate, muataion rate, maximum iteration and convergence criteria. Generation size represents how many chromosomes are presented in the population, the value is taken as 250, that is, 250 chromosomes are present in the generation. Crossover rate represents how frequently the crossover is applied, if there is no crossover, the children are exacty copy of parents. If there is a crossover, some parts of the parent chromosome is taken for child chromosome generation. Here the crossover rate is taken as 0.8, that is, 80 % of the child is made up of crossover. The mutation rate represents how frequently the parts of the chromosomes is mutate, here the value is taken as 0.2, that is, 20 % of the chromosome is mutated. Maximum iteration restricts the count of the crossover and mutation process which is applied to the generation, here the value is taken as 500 and finally the convergence criteria which is also called as stopping criteria, the value is taken as 20, that is, after 20 generations the genetic algorithm will automatically quits. The brief representation of parameters is shown in Table 1.

Table 1 Parameter values of GA

The ACO is evaluate by using the parameters which are given as a pheromone update rule, selection method, random selection proportion rule, budget, deadline. The value for the pheromone update rule is given as 0.1, which is an optimal value, that is, one ant performs the global update after all ants finishes their tour to the problem. The random selection proportion rule is denoted by q and β, initially all ants will be initiated at the same node. Therfore, the q value is taken as 0 and after that ants converges to the global solution β which is given as 1. The model allocates the suggested budget to each task based on the user defined budget cost. The value for the user define budget is taken as 25,000. The model allocate suggested deadline to each task in based on the user defined deadline. The value for the user define deadline is taken as 1600. The brief representation of parameters is discussed in Table 2.

Table 2 Parameter values of ACO
  1. (c)

    Results Evaluation of Static Grid

The computational results for makespan and cost are averaged for each scenario which is iterated for 30 times. The results of makespan and cost is reported in Figs. 3 and 4, respectively. The results for min–min, GA, ACO and hybrid GAACO algorithm are been compared. The hybrid GAACO runs the ACO as a main algorithm and receives inputs from GA for optimizing the solution.

Fig. 3
figure 3

Average makespan values of small, medium and large grids in static scenario

Fig. 4
figure 4

Average scheduling cost in static scenario

  1. (d)

    Results Evaluation for Dynamic Grid

The computational results for makespan and cost are averaged for each scenario which is iterated for 30 times. The results of makespan and cost is reported in Figs. 5 and 6, respectively. it is observed that GAACO perform better in cost utilization compared to min–min, GA and ACO for small size instances.

Fig. 5
figure 5

Average makespan values of the dynamic scenario

Fig. 6
figure 6

Average scheduling cost in dynamic scenario

The proposed GAACO approach provides computational economy for organizations because of the ideal allocation of the resources based on the budget and deadline constraints provided by organizations. This method allows resource providers to earn money by allocating the resources to users for solving their problems. This method also considers the time and cost heuristics for the efficient usage of resources, it would ultimately help the users to choose the appropriate resources.

Conclusion

The GAACO algorithm is proposed for efficient scheduling mechanism in computational grids. This algorithm improves the heuristic behaviour of scheduling mechanism. Different parameters are considered in designing of algorithm like time and cost heuristics.The quality of schedule is improved by permitting the user to set the constraints. Different heuristics based on the user constraints are proposed. The algorithm in small, medium and large grids were examined for both environments.