1 Introduction

Distributed high performance computing systems, like grid and cloud computing, have provided infrastructure for resource sharing. Large scale grids like EGEE [1], TeraGrid [2] and PlanetLab [3] bring together sites of resources to facilitate e-science and e-business issues. Grid middleware like Globus [4] and Condor [5] use scheduling algorithms to map jobs submitted by users to resources. Efficient job scheduling mechanisms are required to perfectly utilize the resources. An accurate job Quality of Service (QoS) prediction method enhances the efficiency of scheduling algorithms [6].

Job QoS estimation is a challenging problem due to the availability and performance variation of grid resources. For example, resources may be shut down or restarted to save power for a software update or even due to a hardware problem. During resource uptime, the performance may degrade because of a huge workload on the resource obliged by the owner of the resource or job grids. All these phenomena are obstacles to properly estimating the job QoS metrics.

Most of the current researches estimate the execution time of atomic jobs on resources [612]. However, many scientific and business processes are modeled via composite jobs or workflow. A workflow consists of several activities supposed to be executed according to a predefined order. Estimation of QoS parameters in a workflow is much more challenging than for an independent job since its execution involves multiple resources and activities. Two groups of researchers have considered the workflow QoS estimation: those working in the area of web service composition [1316], and those in the area of workflow scheduling [1720].

A composite web service invokes some web services to obtain the user requirements. The interactions among web services are modeled by a workflow. Studies have been done in the area of QoS estimation of composite web services [1316] which cannot be applied for grid workflow for two reasons. First, in these works, the QoS estimation has been done independently of the mapping of activities to web services. In a grid, the resources are heterogeneous and different mappings of one workflow will have different QoS values. Second, these works have assumed a deterministic or semi-deterministic behavior for web services. For example, each web service has a fixed predefined response time or at most a predefined probability distribution function for response time. In a grid, resources are so dynamic that it is impractical to define a distribution or fixed quantity to model their response time, availability or cost.

During grid workflow scheduling, an estimation of QoS parameters (mostly makespan) is done to find a suitable mapping of activities to resources. Most of these algorithms like in [1720] use the static built-in speed of resources to estimate the QoS parameters. The dynamicity in performance and availability of grid resources makes these estimations error prone.

In this paper, based on resources availability prediction, we propose a Workflow QoS Estimation algorithm called WQE. Figure 1 shows the input and output of the system. The workflow, the mapping of activities to resourcesFootnote 1 and the log of availability changes in resources are the input to the estimation system. The output is the QoS metrics consisting of response time, reliability and cost of workflow execution. Response time is the mean time in which the execution of the workflow completes. Reliability is the probability of the workflow execution completed without failure, and finally, cost is the budget user must pay for consuming resources. WQE consists of two main steps: resource monitoring and analysis and workflow QoS computation.

Fig. 1
figure 1

Input and output of a workflow QoS estimation system

A multi-state availability model has been used for resources. The states define both the workload and availability of resources. In this paper, we use four availability states: Available, User present, CPU threshold exceeded, and Unavailable as used in the Grid failure trace archive [21]. A simple monitoring system monitors each resource and records the changes in the availability state within time. When a workflow is submitted, a window with a specified length is assumed on the availability trace of each resource. The availability information within this window is analyzed to predict the state of each resource. In this paper, two stochastic prediction methods are proposed: PE and PW. In PE, all historical information within the window has the same value while in PW; the information nearer to the prediction point gets more weight. The probability of being in a state is computed based on the percentage of time the state has occupied within the window. Simulation results indicate that PW performs better than PE in state prediction.

To compute the QoS of a workflow, the input workflow graph is converted to a tree structure. In the tree, leaf nodes are activities. Middle nodes are basic structures with their children from left to right indicating the operands they must be applied on. The QoS metrics are computed by aggregating the QoS of nodes, in a bottom-to-top manner according to the structure of the tree. To enable computation, at first the QoS of each activity must be computed. This computation is based on the prediction of the availability states of the host. Then, the QoS metrics of activities are aggregated to form the QoS of basic structures. At the end, these values are composed based on tree structure to generate the QoS of the workflow.

The method has been evaluated for random and real world workflows. We compare our work with Heterogeneous Earliest Finish Time (HEFT) QoS estimation [18]. To make this comparison, we get the output scheduling of HEFT as the input of our algorithm and estimate the QoS parameters. Then our estimation and HEFT estimation are compared with the actual values. These values are achieved by simulating workflow execution on the grid of Notre Dame University (NDU) according to Condor trace at 2007 [21]. For 25620 random workflows in simulation, in comparison with HEFT, WQE has gained 52 %, 41.3 % and 48.2 % improvement in reliability, response time and cost estimation, respectively.

In the rest of this paper, the related works are mentioned in Sect. 2. In Sect. 3, the problem statement and preliminaries are described. The details about our work will be discussed in Sect. 4. In Sect. 5, the complexity of the proposed algorithm will be analyzed. The results will be presented in Sect. 6, and finally, Sect. 7 gives the conclusion and directions for future work.

2 Related works

We categorize the related works into two groups: (i) resource state prediction, (ii) job QoS estimation. Each group is explained separately.

2.1 Resource state prediction

Some prediction methods for load on resources have been presented before. The Network Weather Service (NWS) uses a mixture of experts based on a linear model to choose the best prediction among experts [22]. Linear predictors have been proposed based on averaging the states during last N intervals [23]. Hu et al. proposed two methods of load prediction based on support vector regression and neural network. They emphasized using the least amount of features in prediction to reduce the cost of monitoring [24].

Byun et al. defined a Markov chain with three states for a resource: idle, in use and stopped. The resources are monitored every 30 minutes to estimate the rate of transitions among the states. The resource with the highest availability probability is employed for a submitted job [11]. Jiong et al. assumed two states, idle and busy, for each resource. With the assumption of static and predefined transition rates, they tried to boost job scheduling [25]. Lili and Shoubao specified CPU usage, network usage and failure of resources as Markov state variables [26]. They supposed a value for each state and aggregated these values to compute a rank for each resource. Resources with higher ranks were used during job scheduling. Ren et al. used CPU load, memory thrashing and unavailability of resources to form a Markov chain with five states. They emphasized on computing the probability of transitioning from available states to other states [27]. Rood and Lewis used the same states introduced in this paper but they performed a discrete transitional analysis to compute the probability of resource states [6]. They also took advantage of this analysis for independent job scheduling. Ramakrishnan and Reed suggested a birth-and-death process with six availability states to model resource state changes. Based on static transition rates, they predicted the probability of the resource being in each state [12].

2.2 Job QoS estimation

Most of the current researches predict the execution time of atomic jobs on resources. They perform this prediction directly [710] or based on resource state prediction [6, 11, 12]. Most workflow scheduling algorithms use static information about resources to estimate QoS of a workflow [17, 18, 20, 2830]. In these works, the dynamicity of resources has been ignored and they do not support selection and loops in the structure of a workflow.

There are some works in QoS computation of composite web services. A composite web service is made of some tasks, each supposed to call a web service. The works in this area are categorized into two groups: In the first group, deterministic behavior for web services is assumed, while in the second group, the probability distributions of web service responses are considered to be well defined. Jaeger et al. estimated the QoS of a composite service by aggregating the QoS of individual services. They assumed a deterministic behavior of web services in responding to requests [15]. Cardellini et al. also estimated the QoS of a workflow with this assumption. They modeled service selection as a linear programming problem based on the computed QoS values. To adapt with dynamicity of web service behaviors, they proposed rescheduling. However, rescheduling makes overhead for a workflow management system [31]. Hwang et al. [14] used a probability mass function (PMS) for QoS of services. It aggregates the PMS to give a probabilistic QoS for a workflow. Zheng et al. introduced kernel estimation as a nonparametric method to estimate the probability density function of services. The PDFs were aggregated to generate the PDF of QoS for a workflow [16]. However, applying these methods on grid workflows is impractical because of resource heterogeneity and dynamicity.

3 Problem statement and preliminaries

In this section, we illustrate the formal definition for a workflow and grid resources, and describe the problem.

3.1 Workflow

A grid workflow can be represented by a directed graph G=(A,E) where A={a 1, a 2,…,a M } is a set of activities and E is the set of edges showing precedence relationships among activities. In a given workflow graph, the activity with no predecessor is an entry and the activity with no successor is called an exit activity. RE i is the set of resources having software/hardware requirements needed for executing activity a i . D is an M×M matrix where d ij shows the amount of data transmitted from a i to a j .

A composition of four basic structures can be used in the workflow graph: sequence, loop, selection and parallel (Fig. 2). Each X i can be regarded as an operand for the correspondent basic structure. An operand can recursively involve a basic structure or be an atomic activity, i.e., X i ∈{seq,par,sel,loop}∪A. In the loop structure, f l is assumed to be the mean number of iterations, and in the selection block, f i is the probability of selection of X i . For sequence, loop and parallel structures, we have: ∀X i ; f i =1.

Fig. 2
figure 2

Four basic structures used in workflows

3.2 Grid computing resources

The set RE={r 1,r 2,…,r N } is the set of all grid resources. The computational speed of resource r j is represented by CS j . BW is an N×N matrix where BW ij indicates the bandwidth between resources r i and r j . NL is a vector of N elements where NL j shows the network latency of resource r j . A ready time of resource r j is denoted by rt j which is defined as the time the resource is waiting to accept a new process after it has finished processing previously assigned processes. Table 1 shows the summary of symbols used in the proposed method.

Table 1 Symbols used in the proposed algorithm

3.3 Problem statement

Let G=(A, E) be the workflow and RE={r 1,r 2,…,r N } be the set of resources. Let M be an M×1 vector where M(a i ) shows the index of the host for activity a i . The problem is to estimate QoS parameters for G under the mapping M, relative to the dynamic behavior of RE. The QoS parameters include: Response time which is the mean time where the execution of the workflow completes; Reliability which is the probability of the workflow execution being completed without failure; Cost which is the budget user should pay for consuming resources.

3.4 HEFT algorithm

Figure 3 shows the HEFT algorithm [18]. Accordingly, activities and edges are assigned some weights. The weight of each activity defines its average execution time on resources. The weight of each edge is the average required time to transfer data from the predecessor to the successor activity. By traversing the graph of the workflow from the bottom to the top, and using a recursive relation, a rank is computed for each activity. The ranks of leaf activities, i.e., activities without any successor, are the same as their weight. The ranks of other activities are the maxima of sums of ranks of their immediate successors and the weights of connecting edges. This ranking gives execution priority to critical activities to minimize the makespan. Activities are sorted by these ranks to form a list. They are removed from this list one by one, and each activity is assigned to a resource with minimum completion time. In this algorithm, resources are assumed to be available and have static computational speed. The mapping of activities to resources and the estimated values for QoS parameters including response time (makespan), reliability and cost are regarded as the output of this algorithm. We use HEFT mapping as the input to our proposed method and compare our estimation of QoS with HEFT estimation.

Fig. 3
figure 3

HEFT algorithm

The HEFT does not support selection and loop structure. A loop structure can be considered as the repetition of the structure for a specific number of iterations. For selection some solutions are possible:

  • For every workflow that consists of selection, construct a set of workflows. Each workflow in this set is a possible execution of the original workflow with its probability of execution computable according to the selection probabilities. For example, for a workflow which has two selection structures with selection probabilities (0.2, 0.8) and (0.4, 0.3, 0.3), six workflows with probabilities 0.08, 0.06, 0.06, 0.32, 0.24, 0.24 will be constructed. For each workflow a mapping is generated by HEFT. The QoS parameters are computed as weighted sums of the QoS of each workflow where the weights are execution probabilities of workflows.

  • The selection structure can be substituted by a single activity with its number of operations obtainable as a weighted sum of the number of operations of tasks involved in the structure. The selection probabilities form the weights.

  • Ignoring the selection probabilities, a mapping can be found for the workflow. Regarding the found mapping, the QoS parameters are recomputed as the weighted sums of QoS of the activities where the weight of activity a i is equal to f i .

In this paper, we used the third solution because it does not have the computation overhead of the first solution and it does not omit the selection structures as in the second solution.

4 Workflow QoS estimation system

Figure 4 shows the architecture of the proposed method. There are two main components:

  1. (a)

    Resource monitoring and analysis—To detect the dynamic behavior of resources, we employ a simple monitoring system. Some availability states are defined for resources. These states represent load of works on resources. A higher availability state implies a lower workload on the resource, which leads to a better processing capability. The monitoring system monitors each resource periodically and logs the changes in the availability states. A stochastic predictor analyzes the log to predict the availability state of each resource.

  2. (b)

    Workflow QoS computation—In this part, the QoS of a workflow is computed based on the resource analysis result. To achieve this goal, the QoS of each activity is estimated after data transmission modeling. Then, the quality of service for basic structures involved in the workflow is computed. Finally, the QoS composition estimates the whole workflow QoS by traversing a tree corresponding to the workflow from the bottom to the top.

Fig. 4
figure 4

Architecture of the proposed method for workflow QoS estimation

In the rest of the paper, we explain each component in detail.

4.1 Resource monitoring & analysis

A simple monitoring system monitors each resource periodically to record the changes in availability state of the resource with time. In this paper, we use the four availability states supported by Condor [5, 6]: available, user present, CPU threshold exceeded, and unavailable. An available machine is currently connected to the network, has more than 15 minutes of idle time, and a CPU load less than the CPU threshold.Footnote 2 In the user present state, the resource owner has touched the keyboard or mouse. In CPU threshold exceeded, the local CPU load surpasses some specific threshold, due to new or currently running processes. Finally, when a machine fails or becomes unreachable, it will be unavailable.

The multi-state system assumed for modeling the changes in the availability of a resource is shown in Fig. 5. Rood et al. proposed a discrete transitional availability state predictor for atomic jobs [6], which is not suitable for our work. As a workflow has many activities, it takes a long time to be processed. In this regard, the Rood’s predictor misses most of the information in the history, which leads to low information for prediction. So, other predictors must be employed.

Fig. 5
figure 5

State diagram used for modeling the behavior of resources in the NDU Trace

Figure 6 shows the concept of our predictor. Assume we want to predict the state of a resource at time t. A window with size L is used, which contains the latest historical information during time interval [tL,t−1]. Inside the window, there are some subintervals, each corresponding to a specific availability state. For instance, as it is shown in Fig. 6, there are three subintervals with availability states 1, 3, and 2, respectively. We define the distance from the middle of subinterval k (in the time range of [\(t_{1}^{k}, t_{2}^{k}\)]) to time t as in Eq. (1):

$$ d_{mt}^{k} =t- \frac{t_{1}^{k} + t_{2}^{k}}{2} $$
(1)
Fig. 6
figure 6

Concept of resource state prediction using historical information inside a window with size L

Two predictors are proposed: Prediction with Equal Weights (PE) and Prediction with Weighting (PW). In PE, all observations have the same weight while in PW the observations nearer to the prediction time are regarded to be more important, hence they will get more weight. The reason for observation weighting is that our investigation on the NDU trace [21] has shown a behavior similar to the Markov property in state changes of resources. In other words, a future state of a resource depends mostly on the sequence of earlier events than the later ones. So PW tries to give more weight to observations closer to the prediction point, in order to enhance the prediction accuracy. The weight of an observation for resource j at time i belonging to subinterval s is defined in Eq. (2):

$$ w_{j,i} = \left \{ \begin{array}{l@{\quad}l} 1 & \mathrm{PE}\\ \frac{\min_{k} d_{mt}^{k}}{d_{mt}^{s}} & \mathrm{PW} \end{array} \right . $$
(2)

In PE, all observations get 1 as the weight. In PW, the weight is computed in such a way that the observations belonging to the nearest subinterval to the prediction time, have weight equal to 1. Every other observation has weight reversely proportional to the distance of the middle point of its subinterval to the prediction time. In Eq. (2), k is an arbitrary subinterval within the window. With this weighting, all observations within one subinterval get equal weight which is reversely proportional to their average distance from prediction point. The probability that resource j is at state al at time t is computed by the percentage of time the resource was at al according to the history in the window as defined by Eq. (3):

$$ \pi_{j,\mathit{al}} (t)= \frac{\sum_{i=t-L}^{t-1} w_{j,i} ( q_{j,i} =\mathit{al})}{ \sum_{i\in[t-L,t-1]} w_{j,i}} $$
(3)

where q j,i is the observation for resource j at time i and (q j,i =al) is equal to 1 where the state of resource j at time i is al otherwise it is zero. Figure 7 shows steps of resource state prediction.

Fig. 7
figure 7

Stochastic resource availability state predictor

4.2 Workflow QoS computation

To compute the QoS of a workflow under the mapping M, the QoS is estimated at the level of activity. These estimates are aggregated to form the QoS of the basic structures. The composition of the QoS of the basic structures results in the QoS of the whole workflow. In the following sections, each part is explained separately.

4.2.1 Data transfer time modeling

Before the execution of each activity, its required data must be transmitted from predecessors host to the activity host. We assume that the data transfer time between any two activities has a normal distribution with the mean value computed by Eq. (5). It is the relation of the size of data transmitted between two activities to the bandwidth of hosts. The network latency has also been considered. The variance of data transmission can be determined according to network communication.

$$\begin{aligned} & \forall a_{i},\quad a_{j} \, dt\,t( a_{i}, a_{j}, \boldsymbol{M} )\sim N\bigl(\mu\, dt ( a_{i}, a_{j}, \boldsymbol{M} ), \sigma dt^{2} ( a_{i}, a_{j}, \boldsymbol{M} )\bigr) \end{aligned}$$
(4)
$$\begin{aligned} & \mu\, dt ( a_{i}, a_{j}, \boldsymbol{M} ) = \mathit{NL}_{M ( a_i )} +\ \frac{d_{ij}}{\mathit{BW}_{M ( a_i ),M(a_j )}} \end{aligned}$$
(5)

4.2.2 Activity level

In this section, we estimate the QoS parameters of an arbitrary activity a i . Before the start of execution of an activity, some latency (possibly zero) will exist. This latency is due to the time needed for transmitting data from predecessors to the activity or due to the current process on the mapped resource. Equation (8) shows the computation of the latency. Let \(\mathit{FT}_{a_{j}}\) be the finish time of activity a j and \(\mathit{rt}_{M( a_{i} )}\) be the ready time of a i ’s host. The finish time of data transmission is computed by Eq. (6), and the latest predecessor is defined by Eq. (7). The first equation in (8) occurs when the data has been transmitted but the host is not ready to process the activity. In this case, the difference of the ready time and finish time of the latest predecessor defines the latency. The second equation in (8) occurs when the resource is waiting for completion of data transmission. In this case, the latency is equal to data transmission time from the latest predecessor.

$$\begin{aligned} & \mathit{FTDT}= \max_{a_j \in \operatorname{pred}( a_i )} \bigl\{ \mathit{FT}_{a_j} +\mu\, dt ( a_{j}, a_{i}, \boldsymbol{M} ) \bigr\} \end{aligned}$$
(6)
$$\begin{aligned} & a_{p} =\mathop{\mathrm{argmax}}_{a_j \in \operatorname{pred}( a_i )} \ \bigl\{ \mathit{FT}_{a_j} +\mu\, dt ( a_{j}, a_{i}, \boldsymbol{M} ) \bigr\} \end{aligned}$$
(7)
$$\begin{aligned} & \operatorname{lat}( a_{i}, \boldsymbol{M}) = \left \{ \begin{array}{l@{\quad}l} \mathit{rt}_{M( a_i )} - \mathit{FT}_{a_p} & \mathit{rt}_{M ( a_i )} >\mathit{FTDT}\\ \mu\, dt ( a_{p}, a_{i}, \boldsymbol{M} ) & \mathit{rt}_{M (a_i)} \leq \mathit{FTDT} \end{array} \right . \end{aligned}$$
(8)

Let CS j be the built-in computational speed of resource j. We assume that computational speed reduces by a coefficient of α al in the availability state al as computed by Eq. (9). The degradation coefficient is specified by statistic information.

$$ \mathit{CS}_{j\mid \mathit{al}} = \left \{ \begin{array}{l@{\quad}l} \mathit{CS}_{j} ( 1- \alpha_{\mathit{al}} ) & \mathit{al}\in \mathit{AL}\backslash \mathrm{unavailable}\\ 0 & \mathit{al}:\mathrm{unavailable} \end{array} \right . $$
(9)

The expected time of executing activity a i when the host is in state al is computed by Eq. (10) where O i is the number of operations in a i .

$$ t_{i,M( a_i )\mid \mathit{al}} = \frac{O_{i}}{\mathit{CS}_{M( a_i )\mid \mathit{al}}} $$
(10)

Using Eqs. (8), (9), (10) and the prediction result gained from predictor, the mean response time is computed by Eq. (11) which is latency plus a weighted sum of the probability vector of resource states and expected execution time of activity in each state. Note that the mean response time of an activity shows the required time the resource successfully completes the activity execution. So, in the computation, unavailable states have been omitted, and the probability of each state under the condition of the resource being up has been used.

$$ \mu ( a_{i}, \boldsymbol{M} ) = \operatorname{lat}( a_{i}, \boldsymbol{M} ) + \sum_{\mathit{al}\in \mathit{AL}\backslash \mathrm{unavailable}} \frac{\pi_{M( a_i ),\mathit{al}}}{\sum_{\mathit{al}\in \mathit{AL}\backslash \mathrm{unavailable}} \pi_{M( a_i ),\mathit{al}}} \times t_{i,M( a_i )\mid \mathit{al}} $$
(11)

The probability of a successful execution of an activity is computed by multiplying two probabilities: the probability that required data has successfully been transmitted and the probability that a resource is available. This probability is found as in Eq. (12):

$$ r ( a_{i}, \boldsymbol{M} ) =\ \prod_{a_p \in \operatorname{pred}( a_i )} r\,dt ( a_{p}, a_{i}, \boldsymbol{M} ) \times\sum _{\mathit{al}\in \mathit{AL}\backslash \mathrm{unavailable}} \pi_{M( a_i ),\mathit{al}} $$
(12)

The execution cost of activity a i when the host is in state al is computed by Eq. (13). This cost has a direct relation to the number of operations of the activity and the computational speed of the host at availability state al. In this equation, the division by the maximal computational speed is for normalization. β cost is a cost coefficient which is assumed for execution of each operation. This coefficient is specified by the grid owners according to their economic policy.

$$ c_{i,M( a_i ),\mid \mathit{al}} = O_{i} \times\beta_{\mathrm{cost}} \times \frac{\mathit{CS}_{M( a_i )\mid \mathit{al}}}{\max_{j\in \mathit{RE}_i} CS_{j}} $$
(13)

The total execution cost of activity a i is computed by a weighted sum of the probability vector of resource states and the expected cost of execution in each state as in Eq. (14):

$$ c ( a_{i}, \boldsymbol{M} ) = \sum_{\mathit{al}\in \mathit{AL}\backslash \mathrm{unavailable}} \frac{\pi_{M( a_i ),\mathit{al}}}{\sum_{\mathit{al}\in \mathit{AL}\backslash \mathrm{unavailable}} \pi_{M( a_i ),\mathit{al}}} \times c_{i,M( a_i )\mid \mathit{al}} $$
(14)

4.2.3 QoS of basic structures

The QoS of basic structures are computed as shown in Table 2. In a sequence structure, the response times of all operands are accumulated and the reliabilities of them are multiplied. In a parallel structure, the maximal response time of all operands forms the final response time and the reliability is again found by multiplication of reliabilities of all operands. The cost in these structures is the sum of costs for all operands. The computation is similar for selection. The only difference is that the probabilities of selection of operands are involved in the computation. A loop can be considered as a sequence structure which repeats for iterations.

Table 2 QoS of basic structures. X i ∈{seq,par,sel,loop}∪A

4.2.4 QoS composition

The input of the system is a directed graph representing the workflow. This graph is converted to a tree structure (T). In the tree, the leaf nodes are activities and the middle nodes are basic structures with their children from left to right indicating the operands the basic structure must be applied on. An example is shown in Fig. 8. The tree structure can be constructed by syntax definition of the workflow. To compose the whole QoS, it is enough to call {μ(root(T),M),r(root(T),M),c(root(T),M)}. Figure 9 shows the sequence of calls occurred during the mean response time computation from the tree structure in Fig. 8. The whole algorithm based on the mentioned steps has been shown in Figs. 10 and 11.

Fig. 8
figure 8

An example workflow and its corresponding tree structure

Fig. 9
figure 9

Sequence of calls for mean response time computation of sample tree structure in Fig. 8

Fig. 10
figure 10

The pseudocode of workflow QoS estimation

Fig. 11
figure 11

The pseudocode for computing the QoS of a node in tree

4.3 An illustrative example

To demonstrate the steps of the algorithm, in this section we illustrate by an example. Figure 12 shows a workflow to be executed on three resources. The labels on the edges show the data amount transmitted among activities. For simplicity, the inter-bandwidth among all resources is assumed to be 1 while the intra-bandwidth is ∞ and the data transmission variance is zero. The resources change their behavior as shown in Fig. 13. The fastest resource r 1 starts with a user present and transitions to CPU threshold exceeded at time 50. It becomes unavailable at time 200. The slowest resource r 3 is permanently available. The mediocre resource r 2 is in CPU threshold exceeded state. In this example β cost=0.01 and ∀al α al =0.3. The predictor is PW with window size of 350 s.

Fig. 12
figure 12

An example workflow supposed to be executed on three resources. Ranks in part (b) are computed by HEFT as described in Fig. 3

Fig. 13
figure 13

Resources behavior

The workflow is submitted at times 70 and 250. The HEFT and WQE estimations of workflow execution at submit time 70 have been shown at Fig. 14. The actual execution of workflow has also been shown for comparison. As it is found, the WQE is much closer to the actual execution than HEFT. Note that the actual execution can be traced according to resources behavior in Fig. 13. For example, at time 70 the r 1 is in CPU threshold exceeded and has a speed of \(\mathit{CS}_{1\mid \mathrm{CPU\ thr\ execeeded}} =10\times ( 1 - 2 \times0.3 ) =4~\mathrm{ops}\mbox{/}\mathrm{s}\) according to Eq. (9). Thus, submitting a 1 at time 70 takes \(\frac{O_{1}}{ \mathit{CS}_{1\mid \mathrm{CPU\ thr\ execeeded}}} = \frac{40}{4} =10~\mathrm{s}\).

Fig. 14
figure 14

WQE and HEFT estimation of task execution on resources when the workflow in Fig. 12 has been submitted at time 70. The actual execution has been shown for comparison

The final results of the response time, reliability and cost estimation in WQE, HEFT and the actual values are shown in Table 3. At time 250, the r 1 is unavailable and the workflow execution fails. The HEFT assumes the execution is reliable while the WQE computes the reliability as 0.013. For more clarification, some parts of the computations have been shown in Table 3.

Table 3 WQE and HEFT results for workflow of Fig. 12

5 Complexity analysis

In this section, we analyze the time complexity of the proposed method for estimating QoS metrics for a workflow with M activities and E edges supposed to be mapped on N resources defined in the mapping vector M. Before proceeding with the analysis, the following theorems are proved.

Theorem 5.1

The number of edges in a structured workflow with M activities as defined in Sect3.1 is O(M).

Proof

Each activity is inserted to a workflow through a basic structure including sequence, parallel, selection, and loop. As it is observable in Fig. 2, this insertion to any arbitrary basic structure adds at most two more edges to the workflow. Thus, the number of edges is at most twice of the number of activities, i.e., E=O(M). □

Theorem 5.2

The number of nodes in the correspondent tree of any structured workflow with M activities is O(M).

Proof

The minimum possible corresponding tree has two levels with one root and M activities as leaves. This tree obviously has O(M) nodes. As the number of leaves is constant and equal to M, to construct a tree with maximum possible nodes, the middle nodes should be maximized. Ignoring loops, these nodes can be labeled as “seq”, “sel” or “par” and function as operators. In order to maximize the middle nodes, the least number of operands, which is obviously two, should be selected for each node. Consequently, the biggest tree is a full binary tree having M leaves. This tree can be balanced or unbalanced. As analyzing either case gives the same result, we continue the analysis with a balanced full tree, i.e., full and complete tree. Let M′ be the smallest positive number greater than or equal to M such that it is a power of two. In other words, \(M ' = 2^{ \lceil \log_{2}^{M} \rceil}\). Assuming these M′ nodes as leaves of a tree, the total number of nodes in a full and complete tree is \(2^{\log_{2}^{M\prime} +1} -1=2M'-1\). Above each node of this tree, a “loop” can be inserted which leads the number of nodes being 2×(2M′−1)=O(M). □

As our analysis has shown the same time complexity for the response time, reliability and cost estimation, in the rest of this section we concentrate on the response time estimation. Estimating the response time involves four processing steps: first, the availability prediction of resources, second, data transfer time computation, third, the response time estimation of activities, and fourth, the response time composition.

Let |SI| be the number of subintervals within prediction window of size L. NoteFootnote 3 that 1≤|SI|≤L. The availability prediction is done by processing the subintervals within a window.Footnote 4 Thus, the availability prediction for all resources is O(|SI|⋅N).

To compute the mean data transmission time for each pair of activities, a computation as in Eq. (5) is required for each edge in the workflow. According to Theorem 5.1, this processing needs O(M) operations.

The estimation of response time of activity a i needs \(O( \vert \operatorname{pred} ( a_{i} ) \vert )\) operations to compute the latency of data transmission time and O(|AL|) time to compute the weighted sum of the response time at availability levels as indicated by Eq. (11). In total, the estimation of the response time for activity a i takesFootnote 5 \(O( \vert \operatorname{pred} ( a_{i} ) \vert )\) operations.

The composition is done via a post order traversal of the corresponding tree in which the complexity depends on the number of nodes in the tree. According to Theorem 5.2, the process is done within O(M) time.

The time complexity of all four steps results in \(O( \vert \mathit{SI} \vert \cdot N+M + \sum_{i=1,\ldots,M} \vert \operatorname{pred} ( a_{i} ) \vert +M)=O( \vert \mathit{SI} \vert \cdot N+M+E) \stackrel{\mbox{\scriptsize Theorem~5.1}}{\Longrightarrow} O( \vert \mathit{SI} \vert \cdot N+M)\) where 1≤|SI|≤L.

6 Experiments

The simulation has been done over NDU data set. Condor has recorded the availability changes of 64 nodes over 6 months in the early 2007 [21]. To choose a suitable resource state predictor, in the first part of the simulation, PE and PW with different window sizes are initially examined. The results in the second part of the simulation are based on the best performed predictor. To measure the accuracy of the workflow QoS estimation, the actual QoS parameters have been gained by simulating the execution of workflows on resources which behave exactly according to the NDU trace. The errors of the QoS estimation in HEFT and the WQE methods have been measured in comparison with the actual values.

6.1 Resource state prediction results

The state of each resource has been predicted each hour within 6 months. The availability state with the maximum probability has been selected as output of the predictor as in Eq. (15):

$$ \mbox{predicted state for resource }j \mbox{ at time } t=\mathop{\mathrm{arg\,max}}_{\mathit{al}} \pi_{j,\mathit{al}} (t) $$
(15)

When the predicted state is equal to the actual reported state in the trace, the prediction is correct otherwise it is wrong. Figure 15 indicates the prediction accuracy when the window size changes from 1 to 25 hour. The smaller the window size, the better the achieved accuracy. PW outperforms PE. The reason is that the information near to the prediction time gets more weight in PW. This makes sense as the state of a resource is more dependent on its recent states rather than farther states in the past.

Fig. 15
figure 15

Resource availability prediction accuracy vs. size of window

To examine the effect of distance of prediction point from history, we have chosen 1 and 2 hour as the window size, since, in Fig. 15, they showed the highest accuracy. Figure 16 shows how the accuracy reduces when the distance of prediction point from history increases. This accuracy reduction has a great effect in workflow QoS estimation. After submitting the workflow, it takes some time for the activities to become ready for execution. So, when we predict the state of a resource at a workflow submit time, the prediction does not necessary hold true when the activity starts execution. The figure shows that when the distance of prediction point changes from 0 to 12 hours, the accuracy reduces from 96 % to 85 %. When this distance is less than 2 hours, the PW performs better, but for distances above 2 hours PE has performed better. In the rest of the experiments, we use PW with window size of 1 hour as a predictor, since 70 % of the workflows have makespan less than 4 hours and in this range PW performs better than PE.

Fig. 16
figure 16

Resource availability prediction accuracy vs. distance of prediction point from history

6.2 Workflow QoS estimation results

6.2.1 Random workflows

To measure the accuracy of QoS estimation, 25620 random workflows were generated. The 56 nodes of NDU trace have been considered as resources. Simulation parameters have been shown in Table 4. Among the parameters, Height shows the height of the workflow graph which is equal to the length of the longest path from the root to the exit activity. And–Or ratio is the probability that the basic structure would be and rather than or when a split occurs. Max edge ratio is a parameter controlling the maximum out-degree of a node. The number of operations per activity is modeled by a normal distribution with its mean uniformly selected from the range of 15 Mega Operation (MOP) to 1500 MOP while the standard deviation is a quarter of the mean. For the sake of simplicity, we have assumed that there is no loop in the workflows. As there was no bandwidth information for resources of NDU trace, we use the Communication-to-Computation Ratio (CCR) to compute communication time [19]. The communication time between each two activities is modeled as a normal distribution with mean computed by Eq. (16) and the standard deviation of 20 s. We assume that network communication is reliable.

$$ \mu\, dt ( a_{i}, a_{j}, \boldsymbol{M} ) = \left \{ \begin{array}{l@{\quad}l} 0 & \mathrm{if}\ M ( a_{i} ) =M( a_{j} )\\ \mathrm{CCR}\times\frac{\frac{O_{i}}{\mathit{CS}_{M( a i )}} + \frac{O_{j}}{\mathit{CS}_{M( a j )}}}{2} & \mathrm{else} \end{array} \right . $$
(16)
Table 4 Parameters related to resources and 25620 random generated workflows in the simulation

In this part of simulation, we compare the accuracy of HEFT estimation of QoS with WQE estimated values. The accuracy is computed by comparing the estimated values with actual values achieved by simulating workflow execution on NDU trace. Each workflow is submitted to the grid in time interval of 2 hours during 6 months. The results are the average for all workflows and all submit times.

Table 5 shows the failure and the resource utility reports of workflows execution. It has been found that about 75 % of workflows have a failure rate less than 60 %, while 25 % of them have a failure rate above 60 %. About 80 % of workflows use less than 40 % of the resources. The reason is that HEFT is a greedy algorithm toward using the fastest resources which is defined by the static speed of resources.

Table 5 Failure and resource utility report

Table 6 illustrates the actual response time and cost report of successfully executed workflows, respectively. 70 % of workflows have been executed within 4 hours, and the execution of the rest has been completed in at most 8 hours. About 80 % of the workflows have been executed with the cost less than 2850 of base units and the rest of them have cost up to 5700.

Table 6 Mean actual response time and cost report

In order to measure the accuracy of reliability prediction, we use the following rules:

$$ \begin{aligned} & \mathbf{rule\ 1}{:} \quad \mbox{if workflow execution failed} \wedge r \bigl( \mathit{root} ( T ), \boldsymbol{M} \bigr) <0.5\ \rightarrow \mbox{accurate}\\ & \mathbf{rule\ 2}{:}\quad \mbox{if workflow execution succeeded} \wedge r \bigl( \mathit{root} ( T ), \boldsymbol{M} \bigr) >0.5\ \rightarrow \mbox{accurate}\\ & \mathbf{rule\ 3}{:}\quad \mbox{otherwise} \rightarrow \mbox{inaccurate} \end{aligned} $$
(17)

When the reliability is predicted correctly, we measure the certainty of prediction by confidence value as in Eq. (18):

$$ \mathit{confidence}= \left \{ \begin{array}{l@{\quad}l} 1-r ( \mathit{root} ( T ), \boldsymbol{M} ) & \mathbf{rule~1}\\ r ( \mathit{root} ( T ), \boldsymbol{M} ) & \mathbf{rule~2} \end{array} \right . $$
(18)

85 % of failure/success execution of workflows has correctly been predicted in WQE with confidence of 0.82. HEFT has the accuracy of 56 % in failure/success prediction which is low in comparison with WQE. This is because HEFT always assumes that the execution will be completed successfully.

The cumulative distribution function (CDF) of the mean absolute error (MAE) of the response time estimation has been shown in Fig. 17. The curve of WQE is above HEFT, which indicates its superior performance due to considering resource state prediction in the computation of the QoS parameters. In 80 % of cases, the MAE in WQE is less than 18 minutes, while this value increases to 32 minutes in HEFT. Similar results have been obtained in cost prediction as shown in Fig. 18.

Fig. 17
figure 17

Cumulative distribution function of mean absolute error for response time estimation

Fig. 18
figure 18

Cumulative distribution function of mean absolute error for cost estimation

In the rest of this section, we investigate the effect of the workflow structure on estimation accuracy. As the behavior of cost curves was similar to the response time curves, due to lack of space, we only show the results of the response time estimation.

Figure 19 indicates the effect of height when it varies in the range from 2 to 20. As it is shown, the increase of height causes a makespan increment, and therefore, an increment in distance of prediction point from history. Thus, the quality of estimation decreases. In this way, a slight descending slope in the accuracy of reliability prediction and at the same time a large ascending slope in MAE of the response time are generated. HEFT has reverse behavior in reliability prediction. Since, when the height increases,Footnote 6 it becomes greedier toward using few fast resources. In NDU trace, fast resources have high availability. Thus, less reliability prediction error will be involved in HEFT.

Fig. 19
figure 19

The effect of the height variation on the accuracy of workflow QoS estimation. The accuracy of reliability is the percentage of times that a success/failure execution of a workflow has been correctly predicted using the rules in Eq. (17)

To investigate the effect of the number of tasks, we have shown the accuracy as a function of the number of tasks in the range of {5,10,…,70}. Figure 20 shows the result. As expected, HEFT performs worse with a considerable slope variation. On the other hand, we get a less variable slope behavior in WQE. The reason is that increasing the number of tasks does not necessary make an increment in the makespan. It means that the distance of prediction point is kept rather at the same level. So, in WQE, the quality of estimation does not change a lot.

Fig. 20
figure 20

The effect of the number of tasks variation on the accuracy of a workflow QoS estimation. The accuracy of reliability is the percentage of times that a success/failure execution of a workflow has been correctly predicted using the rules in Eq. (17)

Figures 19 and 20 also show that as the workflow becomes larger either by height or the number of tasks, WQE will outperform HEFT even more. In large workflows, HEFT causes a huge error in estimation while WQE has much better performance due to regarding the dynamic states of resources in QoS estimation. For very small workflows, for example, with height less than 4 as shown in Fig. 19(b) or under 10 activities as shown in Fig. 20, applying WQE has less benefit.

6.2.2 Actual workflows

The method has been evaluated on actual workflows (Table 7). We have constructed the tree according to the structure of workflows. The CCR is supposed to be 0.2 to enhance the effect of resource dynamicity on QoS parameters. The number of operations per activity is a normal distribution with mean uniformly chosen from 15 MOP to 1500 MOP and the variance equal to a quarter of the mean. Table 8 shows the results. The superior performance of WQE is enhanced when the workflow becomes bigger. For example, for Avian Flu, the MAE of the response time estimation in WQE is about 79 s less than HEFT, while in Motif, this outperformance increases to 39 min. For very small workflows like Avian Flu and Gene 2 Life, the accuracy of reliability prediction in WQE is less than HEFT, but for other workflows the WQE has predicted better. The reason is that small workflows consume few fast resources. These resources in NDU trace are also highly available. When the size of workflow grows, WQE provides more accurate prediction for reliability.

Table 7 Actual workflow structures used in simulation
Table 8 Evaluation of QoS estimation on actual workflows

6.2.3 Network stability effect

In the previous sections, we assumed that the data transmission time between each pair of activities is done with a mean computed by Eq. (16) and the standard deviation of 20 s. As the network may not be stable for communications, this variance might be higher in reality. To investigate the effect of this variation, we have changed the standard deviation of data transmission time for each edge in the Epigenomics workflow in the range from 20 to 420 s. The result is shown in Fig. 21. As the standard deviation increases, the estimation error of both methods increases, since the estimation of communication time has a direct effect on workflow QoS estimation. A good prediction method for network operation can be composed with the proposed method to minimize the effect of data transfer time variation.

Fig. 21
figure 21

The effect of the data transfer time variation on response time estimation

6.2.4 Computation time

In this part of simulation, the effect of the parameters on the run time of WQE has been investigated. These parameters include the number of activities (M), number of resources (N), and prediction window size (L). In each part of the simulation, one parameter changes while others remain constant. A Java-based simulator runs on a system with Intel® Core i7-3770k CPU (3.50 GHz) and uses 128 MB of memory.

In the first part of the simulation, N=64 and L=1 h. M changes in the range of {500,1000,…,8500} to reflect various numbers of activities. To have fair communication-to-computation and the number of parallel-to-selection splits, CCR and And–Or ratio have both been selected to be 0.5. The max edge ratio is 1.5 and the height is 50. Figure 22 shows that the run time of the algorithm changes from 17.2 ms for a workflow with 500 activities to 68.6 ms for a workflow with 8500 activities. As expected by theoretical analysis of Sect. 5, there is a linear relationship among the points of the plot. For example, when the number of activities changes from 1000 to 4000, the run time smoothly increases from 28.8 to 45.7 ms.

Fig. 22
figure 22

Run time of WQE for workflows with different sizes. CCR=0.5; And–Or ratio = 0.5; Max edge ratio = 1.5; height = 50; L=1 h. NDU resources have been used. The results are averages over 50 runs

In the second part of the simulation, M=4500 and L=1 h. N changes in the range of {4,8,…,64} to reflect various numbers of resources. As expected by theoretical analysis, there is a linear relationship among the run time and the number of resources as shown in Fig. 23.

Fig. 23
figure 23

Run time of WQE for workflow with 4500 tasks vs. number of resources. CCR = 0.5; And–Or ratio = 0.5; Max edge ratio = 1.5; height = 50; L=1 h. The results are averages over 50 runs

Finally, we change L in the range of {1,7,…,108} hours in the case when M=4500 and N=64. Figure 24 shows that as the size of the window increases, the run time of the algorithm slightly changes from 45.4 to 55.9 ms. The slight changes indicate that the most important parameters in the run time of WQE are the number of activities in the workflow and the number of resources. Increasing the size of the prediction window slightly increases the number of subintervals and thus has less effect on the run time increment.

Fig. 24
figure 24

Run time of WQE for workflow with 4500 tasks vs. size of window. CCR = 0.5; And–Or ratio = 0.5; Max edge ratio = 1.5; height = 50. The results are averages over 50 runs

7 Conclusion and directions for future work

Accurate workflow QoS estimation enhances the performance of a workflow scheduling algorithm. In this paper, we propose a method called WQE, for estimating the QoS parameters of a Grid Workflow. These parameters include reliability, response time and execution cost. The two main components of WQE include resource monitoring and analysis and workflow QoS computation.

We have employed a simple monitoring system which monitors each resource periodically to record the changes in availability state of them within time. The availability states represent both the workload and availability of resources. The resource behavior with respect to availability changes is modeled by a multi-state system. Two prediction algorithms (PE and PW) have been proposed to stochastically predict the availability state of a resource. These predictors use different weighting mechanisms for historical availability information. Simulation results showed the superior performance of PW in comparison with PE.

The workflow QoS computation is done in four steps: data transfer modeling, activity level estimation, basic structures computation, and QoS composition. The QoS of activities are computed based on resources availability analysis. We support sequential, parallel, selection, and loop as basic structures. The QoS of each basic structure is computed by aggregating the QoS of each operand involved in the basic structure. Assuming the workflow graph is converted to a tree structure, the QoS composition uses QoS of basic structures to compute the QoS of the root which is regarded as the final computation. NDU trace has been used to simulate workflow executions to get the actual QoS values. Simulations have been carried on for random and actual workflows. WQE outperforms estimation of HEFT, and the estimated values are much closer to actual values.

There are three directions for future work. First, the presented estimation method can be exploited to enhance the quality of a workflow scheduling algorithm. A good trade-off among reliability, performance, and cost in scheduling is possible when employing estimated QoS of workflow. Second, a network operation prediction method can be combined with WQE to improve consistency with the actual world. Finally, the method might be justified for estimating the QoS of the workflow running on virtual machines inside a data center to move toward cloud computing. The major challenge will be predicting the behavior of a virtual machine which needs much more investigation in this way.