1 Introduction

Project scheduling has been a topic of interest for several decades. Typically, a project consists of a number of tasks, and the aim is to optimize an objective related to the completion times of the tasks. In previous research the objective has mainly been to minimize the project duration, but more recently there has also been interest in maximizing the net present value (NPV) of the project, which depends on cash flows tied to task completion dates (Chen et al. 2010; Vanhoucke 2010; Show 2006).

Project scheduling has been formulated in various ways (Brucker et al. 1999; Demeulemeester and Herroelen 2002; Neumann et al. 2003). Brucker et al. (1999) provide an overview of a number of variants. They focus on projects consisting of tasks, shared (renewable) resources, precedences between tasks and deadlines. Amongst methods, they discuss branch & bound, heuristic and local search approaches. Demeulemeester and Herroelen (2002) also describe a variety of problems from this class and detail ways in which these problems have been tackled. These include various exact methods, heuristic schemes and meta-heuristics including genetic algorithms, simulated annealing and tabu search. Neumann et al. (2003) examine project scheduling with time windows, also detailing various exact and heuristic methods.

Project scheduling is closely related to resource constrained job scheduling (Singh and Ernst 2011; Thiruvady et al. 2012, 2014). The problems considered by these studies have similar characteristics (e.g. precedences and resource constraints between jobs, deadlines, etc.), with the main difference being the objective, which is to minimize the total weighted tardiness. Singh and Ernst (2011) examine a Lagrangian relaxation based heuristic which proves to be more effective than heuristics on their own. Thiruvady et al. (2012, 2014) explore ant colony optimization and hybrids with constraint programming for a similar problem with hard deadlines.

Various studies have investigated methods (heuristic and exact) for the NPV problem (Chen et al. 2010; Vanhoucke 2010; Show 2006; Kimms 2001; Gu et al. 2013). Chen et al. (2010) investigate an ant colony optimization approach and show that their method outperforms other heuristic methods based on genetic algorithms, simulated annealing and tabu search on instances with up to 98 tasks. Vanhoucke (2010) considers a heuristic scatter search approach and shows that this method is more effective than a previously suggested branch & bound approach for the same problem (Vanhoucke et al. 2001). Gu et al. (2013) investigate a Lagrangian relaxation and constraint programming hybrid for the same problem and show that improved feasible solutions can be obtained with this approach. Show (2006) also investigates ant colony optimization for a similar problem with up to 50 tasks.

Lagrangian relaxation (LR) is a well-known technique applied to integer programming problems (Fisher 2004). Many computationally hard problems can be tackled by considering a simpler version of the problem which omits (or ‘relaxes’) some complicating constraints. The solution to the relaxed problem can provide useful information about the original problem. In particular, Lagrangian relaxation methods provide an alternative way to obtain upper bounds (for maximization problems), thereby providing performance guarantees. This is done by adding a cost term to the objective which is negative when any of the relaxed constraints would have been violated. Since the objective is maximized, this cost drives the solution towards satisfying the constraints. In the context of an NPV-based problem, Kimms (2001) showed how such a scheme could be successful.

Ant colony optimization (ACO) is a reinforcement-based meta-heuristic inspired by the foraging behavior of real ants which has been successfully applied to various combinatorial optimization problems (Dorigo and Stützle 2004). This includes variants of project scheduling problems (Merkle et al. 2000; Chen et al. 2010; Show 2006). Merkle et al. (2000) investigate resource constrained project scheduling with makespan as the objective. Chen et al. (2010) consider a problem where tasks can be executed in multiple modes. Show (2006) also considers a problem similar to that of Vanhoucke (2010).

We investigate the resource constrained project (RCP) scheduling problem suggested by Kimms (2001). The problem consists of maximizing the NPV of all the tasks subject to precedences between some tasks and a common deadline for all the tasks. Furthermore, the tasks may require a number of renewable resources, each with a maximum availability at every time point. The previous study by Kimms (2001) develops a Lagrangian relaxation based heuristic and shows that this approach is effective at obtaining tight upper bounds. We extend these results here to show that lower bounds can be improved with the assistance of ACO. LR and ACO can each be applied effectively to the RCP problem independently. However, we show that the hybrid of these methods, LR-ACO, proves to be the most effective method for obtaining lower bounds, outperforming LR and ACO individually.

This paper is organized as follows. The RCP problem is stated formally in Sect. 2. Section 3 presents the Lagrangian relaxation models and the LR heuristic. Section 4 describes ACO and how it has been tailored to the current problem. In Sect. 5, the experimental details and an analysis of the results are provided. Section 6 presents further experiments on a subset of the instances, and Sect. 7 concludes the paper.

2 Problem formulation

The RCP scheduling problem can formally be stated as follows. There are a number of tasks \(T = \{o_1,\ldots ,o_n\}\), each task \(i \in T\) having a duration \(d_i\). During the execution of a task there are associated cash flows: let \(cf_{it}\) be the cash flow of task \(i\) in period \(t\). The total cash flow of a task, \(c_i\), can be computed as \(\sum _{t=1}^{d_i} cf_{it}e^{\alpha (d_i-t)}\), where \(\alpha \) is a discount rate. The discounted value of the task at the beginning of the project is then \(c_ie^{-\alpha (s_i+d_i)}\), where \(s_i\) is the start time of the task. Precedences between tasks may exist and are denoted by the set \(P = \{(i_1,j_1),\ldots ,(i_m,j_m)\}\), where \((i,j) \in P\) requires that task \(i\) completes before task \(j\) starts.
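To make the cash flow definitions concrete, the two computations above can be sketched as follows (a minimal illustration; the function and variable names are our own, not from the original study):

```python
import math

def total_cash_flow(cf, alpha):
    """Compound the per-period cash flows cf[0..d-1] of a task to its
    completion date: c_i = sum_{t=1}^{d_i} cf_it * e^{alpha * (d_i - t)}."""
    d = len(cf)
    return sum(cf[t - 1] * math.exp(alpha * (d - t)) for t in range(1, d + 1))

def discounted_value(c, alpha, s, d):
    """Discount the task's compounded value c_i back to the project start:
    c_i * e^{-alpha * (s_i + d_i)}, where s_i is the start time."""
    return c * math.exp(-alpha * (s + d))
```

With a positive discount rate, starting a positive-valued task later strictly reduces its contribution to the NPV, which is why the scheduling scheme in Sect. 4.1 places such tasks as early as possible.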

The constraints to be satisfied include resource and deadline constraints. Given \(k\) renewable resources \(R = \{R_1,\ldots ,R_k\}\) with constant availability, each task \(i\) requires \(r_{ik}\) units of resource \(k\). Additionally, every task must be completed by a pre-defined deadline \(\delta \).

The objective is to maximize the NPV

$$\begin{aligned} max.&\quad \sum _{i=1}^{n} c_i e^{-\alpha (s_i+d_i)} \end{aligned}$$
(1)
$$\begin{aligned} s.t.&\quad s_i + d_i \le s_j&\quad&\forall (i,j) \in P \end{aligned}$$
(2)
$$\begin{aligned}&\sum _{i \in {S(t)}} r_{ik} \le R_k&\quad&\forall k \in R,&\quad&t \in \{1,\ldots ,\delta \} \end{aligned}$$
(3)
$$\begin{aligned}&s_i + d_i \le \delta&\quad&i \in T \end{aligned}$$
(4)

where \(S(t)\) is the set of tasks executing at time \(t\). The objective, Eq. 1, is the net present value, which is to be maximized. Constraint 2 specifies that each task must start after all its predecessors have completed. Constraint 3 requires that resource usage never exceeds availability, and the final constraint requires that all tasks complete by the deadline.
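As an illustration, checking a candidate schedule against constraints (2)–(4) amounts to a sweep over precedences, tasks and time points. The sketch below uses hypothetical names and a plain list representation; it is a feasibility checker only, not the scheduling scheme of Sect. 4.1:

```python
def is_feasible(start, dur, req, avail, prec, deadline):
    """Check a schedule against constraints (2)-(4).

    start[i], dur[i] : start time and duration of task i
    req[i][k]        : units of resource k required by task i
    avail[k]         : constant availability R_k of resource k
    prec             : list of (i, j) pairs; i must finish before j starts
    """
    n, m = len(start), len(avail)
    # (2) precedence: s_i + d_i <= s_j
    if any(start[i] + dur[i] > start[j] for i, j in prec):
        return False
    # (4) deadline: s_i + d_i <= delta
    if any(start[i] + dur[i] > deadline for i in range(n)):
        return False
    # (3) resource usage at every time point t must not exceed availability
    for t in range(deadline):
        running = [i for i in range(n) if start[i] <= t < start[i] + dur[i]]
        for k in range(m):
            if sum(req[i][k] for i in running) > avail[k]:
                return False
    return True
```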

An alternative way to view a solution is by considering a permutation of tasks \(\pi \). A scheduling scheme maps \(\pi \) to a resource-feasible schedule; let \(\sigma (\pi )\) be such a mapping. A feasible schedule is one that assigns start times to all tasks while satisfying the precedence and resource constraints. Each task in the schedule has an associated NPV, and the cumulative NPV, denoted \(f(\sigma (\pi ))\), can be determined trivially. The precedences may be accounted for within a permutation, i.e., if task \(i\) precedes task \(j\), then \(i\) appears earlier in the permutation than \(j\). An alternative is to allow a preceding task to appear later than its successor in the permutation and to modify the scheduling scheme \(\sigma \) so that it nevertheless generates a feasible schedule in which task \(i\) precedes task \(j\). We found the latter scheme more effective and therefore use it in this study.

Permutations may not necessarily map to schedules that satisfy the deadline. However, this situation is avoided by specifying sufficiently large deadlines so that feasible solutions are easily found; hence, constraint (4) is effectively redundant. Since the aim of this study is not to minimize makespan, there is no issue with specifying large deadlines, except that tasks with negative-valued cash flows may be scheduled later when deadlines are larger.

3 Lagrangian relaxation

In this study we consider two integer programming (IP) models to implement the LR method. The first is the one proposed by Kimms (2001) and the second is an adaptation of a similar model used by Singh and Ernst (2011) which was originally used for a resource constrained job scheduling problem.Footnote 1 The motivation for using the latter model is that it is a stronger formulation providing improved run-time scalability.

3.1 LR-Kimms

We provide the IP model suggested by Kimms (2001) and briefly describe the Lagrangian function.Footnote 2 Binary variables \(x_{jt}\) are defined such that \(x_{jt}=1\) if task \(j\) completes at time \(t\) and 0 otherwise. The problem can be specified as follows.

$$\begin{aligned} max.&\quad \sum _{i=1}^{n}\sum _{t=1}^{\delta } c_i e^{-\alpha t} x_{it} \end{aligned}$$
(5)
$$\begin{aligned} s.t.&\quad \sum _{t=1}^{\delta }x_{it} = 1&i \in T \end{aligned}$$
(6)
$$\begin{aligned}&\sum _{t=1}^{\delta }t x_{jt} - \sum _{t=1}^{\delta }t x_{it} \ge d_j&\forall (i,j) \in P \end{aligned}$$
(7)
$$\begin{aligned}&\sum _{i=1}^{n} \sum _{\hat{t}= t}^{t+d_i-1} r_{ik} x_{i\hat{t}} \le R_{kt}&k \in R,&t \in \{1,\ldots ,\delta \} \end{aligned}$$
(8)
$$\begin{aligned}&x_{it} \in \{0,1\}&i \in T,&t \in \{1,\ldots ,\delta \} \end{aligned}$$
(9)

Equation 6 requires that each task completes exactly once. The precedences are incorporated via Eq. 7 and the resource constraints via Eq. 8.

The Lagrangian relaxation of the above problem can be obtained by relaxing the resource constraints and introducing multipliers \(\lambda _{kt},\,k \in R, t \in \{1,\ldots ,\delta \}\). An upper bound can be obtained by solving the Lagrangian dual

$$\begin{aligned} LRR(\lambda ) = max.&\quad \sum _{i=1}^{n}\sum _{t=1}^{\delta } c_i e^{-\alpha t} x_{it}\,+&\sum _{k \in R} \sum _{t=1}^{\delta } \lambda _{kt} \left( R_{kt} - \sum _{i=1}^{n} \sum _{\hat{t}= t}^{t+d_i-1} r_{ik} x_{i\hat{t}}\right) \end{aligned}$$
(10)

Subject to Eqs. 6, 7 and 9. The above objective can be rearranged to obtain

$$\begin{aligned} LRR(\lambda )&= max. \quad \sum _{i=1}^{n}\sum _{t=1}^{\delta } x_{it} \left( c_i e^{-\alpha t} - \sum _{k \in R} \sum _{\hat{t}= t-d_i+1}^{t} \lambda _{k\hat{t}} r_{ik} \right) \nonumber \\&\quad + \left( \sum _{k \in R} \sum _{t=1}^{\delta } \lambda _{kt} R_{kt} \right) \end{aligned}$$
(11)

where the last term is a constant and can be ignored when optimizing.

3.2 LR-SE

The second model is adapted from the one suggested by Singh and Ernst (2011) for multiple resources. Here, the binary variable \(x_{it}=1\) indicates that task \(i\) has completed by time \(t\); unlike LR-Kimms, once a task completes it stays completed (see Eq. 13 below).

$$\begin{aligned} max.&\quad \sum _{i=1}^{n}\sum _{t=2}^{\delta } c_i e^{-\alpha t} (x_{it} - x_{it-1}) \end{aligned}$$
(12)
$$\begin{aligned} s.t.&\quad x_{it} \ge x_{it-1}&i \in T,&t \in \{1,\ldots ,\delta \} \end{aligned}$$
(13)
$$\begin{aligned}&x_{i\delta } = 1&i \in T \end{aligned}$$
(14)
$$\begin{aligned}&x_{jt} \le x_{it-d_j}&(i,j) \in P,&t \in \{1,\ldots ,\delta \} \end{aligned}$$
(15)
$$\begin{aligned}&\sum _{i=1}^{n} r_{ik} (x_{it} - x_{it-d_i}) \le R_{kt}&k \in R,&t \in \{1,\ldots ,\delta \} \end{aligned}$$
(16)
$$\begin{aligned}&x_{it} \in \{0,1\}&i \in T,&t \in \{1,\ldots ,\delta \} \end{aligned}$$
(17)

Equation 13 enforces that a task stays completed once it has finished and Eq. 14 requires that all tasks complete. The precedences are enforced via Eq. 15 and the resource constraints are specified by Eq. 16.

As above, Lagrangian multipliers \(\lambda _{kt},\,k \in R, t \in \{1,\ldots ,\delta \}\) are introduced and an upper bound can be obtained by solving the Lagrangian dual

$$\begin{aligned} LRR(\lambda )&= max. \quad \sum _{i=1}^{n}\sum _{t=2}^{\delta } c_i e^{-\alpha t} (x_{it}-x_{it-1}) \nonumber \\&\quad + \sum _{k \in R} \sum _{t=1}^{\delta } \lambda _{kt} \left( R_{kt} - \sum _{i=1}^{n} r_{ik} (x_{it} - x_{it-d_i}) \right) \end{aligned}$$
(18)

which can be rearranged to obtain

$$\begin{aligned} LRR(\lambda )&= max. \quad \sum _{i=1}^{n}\sum _{t=2}^{\delta } (x_{it}-x_{it-1}) \left( c_i e^{-\alpha t} - \sum _{k \in R} \sum _{\hat{t}= t}^{t+d_i-1} \lambda _{k\hat{t}} r_{ik} \right) \nonumber \\&\quad + \left( \sum _{k \in R} \sum _{t=1}^{\delta } \lambda _{kt} R_{kt} \right) \end{aligned}$$
(19)

with the last term being constant.

3.3 The Lagrangian relaxation heuristic

figure a

Given both models, the high-level LR algorithm is presented in Algorithm 1. To begin with, various parameters and the multipliers are initialized. The main loop starts at line 5 and executes for up to 1,000 iterations, as long as the gap remains above a specified threshold; \(\gamma = 2.0\) is a scaling factor.Footnote 3 Each procedure is described in detail below.

Solve(\(\lambda ^i,\,U\!B\)): The relaxed problem is solved within this procedure. This involves solving the Lagrangian function, i.e., Eq. 11 or 19 depending on the model being used. The upper bound \(U\!B\) is set to

$$\begin{aligned} U\!B = LRR(\lambda ^i) \end{aligned}$$
(20)

which is minimized to provide tight upper bounds. The procedure returns a set of start times for the tasks (\(ST\)) representing the optimal relaxed solution.

GenerateList(\(ST\)): The start times obtained are transformed into a list of tasks. This is done by selecting the earliest start time, appending the corresponding task to \(\pi \) and continuing in the same way with the remaining tasks. In case of ties, tasks with higher NPV values are chosen first. The procedure returns a complete list of tasks (\(\pi \)).
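This list-generation step can be sketched as follows, assuming for illustration that the relaxed solution is given as an array of start times together with per-task NPV values (names are ours):

```python
def generate_list(start_times, npv):
    """Order tasks by start time from the relaxed LR solution; ties are
    broken in favour of higher NPV (npv[i] is the discounted value of
    task i)."""
    tasks = range(len(start_times))
    return sorted(tasks, key=lambda i: (start_times[i], -npv[i]))
```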

ImproveLB(\(\pi \)): \(\pi \) can be mapped to a feasible schedule as will be seen in Sect. 4.1. This provides a lower bound to the optimal NPV. However, this lower bound may be improved further with the assistance of an alternative method resulting in a modified permutation \(\pi \). The hybrid method is obtained by using ACO here.

UpdateBest(\(\pi ^{bs}\),\(\pi \),\(\gamma \)): \(\pi ^{bs}\) = \(\pi \) if \(f(\sigma (\pi )) > f(\sigma (\pi ^{bs}))\). Additionally, if \(\pi ^{bs}\) has not been updated in the last five iterations, \(\gamma \leftarrow \gamma / 2\).

UpdateMult(\(\lambda ^i,\,LB^*,\,U\!B^*,\,ST,\,\gamma \)): The multipliers are updated for all time periods \(t \in \{1,\ldots ,\delta \},\,k \in R\)

$$\begin{aligned} \lambda _{k,t}^{i+1} = max \left( 0, \lambda _{k,t}^{i}+\frac{\gamma (U\!B^*-LB^*)\varDelta _{kt}}{\sum _{k \in R} \sum _{\hat{t}=1}^{\delta }\varDelta _{k\hat{t}}^2}\right) \end{aligned}$$
(21)

where in the case of LR-Kimms

$$\begin{aligned} \varDelta _{kt} = \sum _{i=1}^{n} \sum _{\bar{t}= t}^{t+d_i-1} r_{ik} x_{i\bar{t}} - R_{kt} \end{aligned}$$
(22)

The above equation can be modified for the LR-SE model as follows

$$\begin{aligned} \varDelta _{kt} = \sum _{i=1}^{n} r_{ik} (x_{it}-x_{it-d_i}) - R_{kt} \end{aligned}$$
(23)
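The multiplier update of Eq. (21) is a standard projected subgradient step and might be sketched as follows (illustrative names; `lam[k][t]` and `delta[k][t]` play the roles of \(\lambda _{kt}\) and \(\varDelta _{kt}\)):

```python
def update_multipliers(lam, delta, gamma, ub, lb):
    """Projected subgradient update of Eq. (21): move each multiplier in
    the direction of its constraint violation, with step size scaled by
    gamma and the current bound gap, then project onto lambda >= 0."""
    denom = sum(d * d for row in delta for d in row)
    if denom == 0:  # relaxed solution is already resource feasible
        return lam
    step = gamma * (ub - lb) / denom
    return [[max(0.0, lam[k][t] + step * delta[k][t])
             for t in range(len(lam[k]))]
            for k in range(len(lam))]
```

Multipliers grow where resources are over-used (\(\varDelta _{kt} > 0\)), penalizing those periods in the next relaxed solve, and shrink (never below zero) where capacity is slack.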

4 Ant colony optimization

ACO was first suggested by Dorigo (1992) for combinatorial optimization. When looking for food, ants leave their nests and mark the paths that they use to a food source with pheromone. Other ants looking for food follow these trails based on the amount of pheromone deposited on the paths and, in turn, deposit pheromone on the same paths. Thus, paths with more pheromone receive more ants, which in turn deposit more pheromone, leading to a positive feedback loop. This mechanism leads the colony to converge on better food sources over time (Camazine et al. 2001).

In the context of the RCP problem, we consider a pheromone model that is based on learning an ideal permutation of the tasks. Here, the aim is to learn permutations of the tasks, which are then mapped to schedules using a scheduling scheme (discussed later). We consider the model suggested by den Besten et al. (2000), who examine an ACO algorithm for the single machine problem with the total weighted tardiness objective. The pheromones \(\mathcal {T}\) consist of pheromone values \(\tau _{ij}\) for each task \(j\) and each variable \(i\), i.e., position in the sequence. The motivation for using this model is that, in the absence of any obvious dependencies,Footnote 4 selecting a task for a variable is the simplest model.

figure b
figure c

Two popular variants of ACO are ant colony system (ACS) (Dorigo and Gambardella 1997) and Max-Min ant system (Stützle and Hoos 2000). Initial experiments with both variants showed that ACS was better suited to this problem. Thus, for all experiments in this study we used ACS.

The ACS algorithm is presented in Algorithms 2 and 3. Algorithm 2 shows the high-level ACO procedure. The global best solution (\(\pi ^{bs}\)) and the pheromone trails \(\mathcal {T}\) (\(\tau _{ij}=1/n,\,\forall i,j\)) are first initialised. Algorithm 3 is then called with \(\mathcal {T}\) as the input. It also maintains a best solution (\(\pi ^{bs}\)) and runs until some terminating criterion is met (line 3), such as a number of iterations or a time limit. Within each iteration, \(n_a\) ants construct permutations (line 6), which are mapped to schedules (line 7).

ConstructSolution(\(\mathcal {T}\)): A permutation \(\pi \) of tasks is constructed by selecting a task for each variableFootnote 5 starting at \(\pi _1\). A solution is complete when every variable has been assigned a unique task. A task is selected with one of two schemes, either deterministic or probabilistic. First, a random number \(q \in (0,1]\) is generated and compared with a pre-defined parameter \(q_0\) in order to select a task at \(\pi _i\). If \(q<q_0\), task \(k\) is deterministically selected according to

$$\begin{aligned} k = \mathop {argmax }_{j \in \mathcal{J} {\setminus } \{\pi _{1},\ldots ,\pi _{i-1}\}} \{ \tau _{ij} \eta _{j}^{\beta }\} \end{aligned}$$
(24)

otherwise, \(k\) is probabilistically selected from the following distribution

$$\begin{aligned} \mathbf{P}(\pi _i = k) = \frac{\tau _{ik} \eta _{k}^{\beta }}{\sum _{j\in \mathcal{J} {\setminus } \{\pi _{1},\ldots ,\pi _{i-1}\}} \left( \tau _{ij} \eta _{j}^{\beta }\right) } \end{aligned}$$
(25)

where \(\eta _k\) is heuristic information that may be used to bias the search and \(\beta \) is a factor that determines the contribution of the heuristic information. We attempted various heuristics, such as favouring positive-valued cash flow tasks to be placed early in the sequence, but found no obvious advantage with any of them. Hence, \(\beta =0\) was used which effectively rules out heuristic information.

When a task \(j\) at variable \(i\) is selected, the pheromones are updated as follows:

$$\begin{aligned} \tau _{ij} = \tau _{ij} \times \rho + \tau _{min} \end{aligned}$$
(26)

which is called a local pheromone update, where \(\rho \) is a learning rate parameter chosen to gradually reduce the level of pheromone associated with task \(j\) at variable \(i\). This favours diversity by allowing other tasks to be assigned to the same variable during future solution constructions. \(\tau _{min} = 0.001\) is a lower limit which prevents the probability of selecting a task from reducing to 0.
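With the stated settings (\(\beta = 0\), so heuristic information plays no role), the construction procedure with the local pheromone update of Eq. 26 might be sketched as follows (an illustrative implementation, not the authors' code; `tau[i][j]` plays the role of \(\tau _{ij}\)):

```python
import random

def construct_permutation(tau, q0=0.9, rho=0.1, tau_min=0.001):
    """Build a permutation with the ACS rule: with probability q0 pick the
    unassigned task with the highest pheromone at position i (Eq. 24),
    otherwise sample a task proportionally to the pheromone (Eq. 25 with
    beta = 0). Each selection triggers the local update of Eq. 26."""
    n = len(tau)
    remaining = set(range(n))
    pi = []
    for i in range(n):
        if random.random() < q0:  # deterministic (greedy) selection
            k = max(remaining, key=lambda j: tau[i][j])
        else:                     # probabilistic (roulette-wheel) selection
            total = sum(tau[i][j] for j in remaining)
            r, k = random.uniform(0, total), None
            for j in remaining:
                r -= tau[i][j]
                if r <= 0:
                    k = j
                    break
            k = k if k is not None else next(iter(remaining))
        tau[i][k] = tau[i][k] * rho + tau_min  # local update (Eq. 26)
        remaining.remove(k)
        pi.append(k)
    return pi
```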

ScheduleTasks(): Once a sequence for the current solution has been specified, the schedule \(\sigma (\pi )\) is determined. This is done using a scheduling scheme (see Sect. 4.1): depending on whether the permutation is precedence feasible, the scheduling scheme enforces precedence and resource constraints or only resource constraints. This scheme always generates resource-feasible schedules given a sufficiently long time horizon.

Update(\(\pi ^{ib}\),\(\pi ^{bs}\)): The procedure sets \(\pi ^{bs}\) to \(\pi ^{ib}\) if \(f(\sigma (\pi ^{ib})) > f(\sigma (\pi ^{bs}))\) where \(f(\sigma (\pi ^{ib}))\) is the NPV of the iteration best solution.

PheromoneUpdate(\(\mathcal {T}\),\(\pi ^{bs}\)): All components \((i,j)\) appearing in \(\pi ^{bs}\) are used to update the corresponding components in the pheromone matrix:

$$\begin{aligned} \tau _{ij} = \tau _{ij} \times \rho + \varDelta \end{aligned}$$
(27)

where \(\varDelta = 0.01\) is a constant reward amount. While different reward factors may be used here, it was found on a subset of the instances that a reward based on the NPV did not provide any improvement over this constant reward. Hence, for the sake of simplicity, a constant reward was used. \(\rho \) is the learning rate as specified earlier and is set to \(0.1\) in this study.

ComputeConvergence(\(\pi ^{ib}\)): As a convergence measure we use the iteration best solution and a history of the past \(\theta \) iteration best solutions to determine if the pheromones have converged. If sampling the pheromones repeatedly produces the same solution, the pheromones are considered to have converged. A list of the past \(\theta \) solutions, \(l_{\pi ^{ib}}\), in the form of a queue is maintained. Every time a new iteration best is generated it is appended to the end of this list while the first solution in the list is removed. The quality of the current iteration best \(\pi ^{ib}\) is compared to the quality of all the solutions in the list. If they all have the same objective value the pheromones are re-initialized: \(f(\pi ^{ib}) = f(k), k \in l_{\pi ^{ib}} \Rightarrow \tau _{ij}=1/n\,\forall i,j\). The list is also re-initialized where all previous solutions are removed.
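This convergence test can be sketched with a fixed-length queue of iteration-best objective values (an illustrative implementation; the class and method names are our own):

```python
from collections import deque

class ConvergenceMonitor:
    """Track the objective values of the last theta iteration-best
    solutions; report convergence when they are all identical, which is
    the signal used to re-initialise the pheromones."""

    def __init__(self, theta):
        self.theta = theta
        self.history = deque(maxlen=theta)  # old values drop off the front

    def converged(self, f_ib):
        """Record the latest iteration-best objective and report whether
        the last theta values are all equal."""
        self.history.append(f_ib)
        return (len(self.history) == self.theta
                and len(set(self.history)) == 1)

    def reset(self):
        """Clear the history, e.g. after the pheromones are re-initialised."""
        self.history.clear()
```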

4.1 Scheduling schemes

Given a permutation, the tasks are scheduled using a serial scheduling scheme. The scheme is similar to the one used by Li and Willis (1992) and to a modified scheme by Kimms (2001). The permutation is split into two sets, \(N = N^+ \cup N^-\). \(N^+\) consists of all tasks with positive cash flows that are independent of other tasks, or those tasks whose cash flow is positive and greater than the cumulative cash flow of their dependent tasks:

$$\begin{aligned} N^+ = \left\{ t: cf(t) - \sum _{t' \in S(t)}cf(t') > 0 \right\} \end{aligned}$$
(28)

where \(S(t)\) is the set of direct and indirect successors of \(t\). \(N^-\) is the remaining set of tasks, i.e. \(N {\setminus } N^+\).

We consider two ways of scheduling tasks given a permutation (see Fig. 1). Consider the permutation \(\pi \) in Fig. 1a, where the problem consists of 7 tasks with a precedence between two of them (task 3 must finish before task 4 commences). \(N^-\) consists of tasks 5 and 7, which have negative-valued cash flows. There is a single resource \(R\) and the heights of the tasks specify the amount of resource needed. As the permutation shows, precedences are not maintained when constructing the permutation. Figure 1b shows the placement scheme. Here, positive-valued tasks are placed as early as possible, satisfying the resource constraints. If a task appears whose predecessor is not yet scheduled, it is placed on a waiting list \(\hat{\pi }\). Once the predecessor is placed (task 3), the task on the waiting list is also immediately placed.

Fig. 1
figure 1

This figure demonstrates how a permutation of tasks (without precedences) may be mapped to a schedule. There are 7 tasks requiring a single resource and task 3 precedes task 4. Additionally, tasks 5 and 7 have negative-valued cash flows. There are \(R\) units of resource available and the height of a task is the amount of resource needed. The time horizon starts at \(t_0\) and finishes at \(t_d\) which is the deadline for the tasks. a This is a permutation of the 7 tasks. Note that although task 3 must start before task 4, task 4 is allowed to appear ahead of task 3 in the permutation. b The positive-valued cash flow tasks are placed as early as possible given the available resource. c The negative valued cash flow tasks are placed starting at the end of the schedule. Tasks are successively chosen from the end of the permutation

Figure 1c demonstrates how negative-valued cash flow tasks are scheduled. Here, tasks are considered from the end of the permutation and starting at the deadline, tasks are placed in a greedy fashion similar to how the positive-valued cash flows are placed at the beginning of the schedule. Note that the negative valued cash flow tasks do not have to appear at the end of the sequence. If they happen to be in-between positive-valued tasks they are still placed starting at the deadline.

Precedences are not considered when constructing permutations in the above scheme; however, they are trivially accounted for at the scheduling stage. We also attempted to ensure precedences are satisfied within the permutation itself, but found improved results when ignoring them in the permutations.
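A simplified, single-resource version of this placement scheme (omitting precedences and the waiting list \(\hat{\pi }\), which the full scheme handles) might look as follows; all names are illustrative:

```python
def schedule(pi, dur, req, cap, cash, deadline):
    """Sketch of the two-pass placement of Fig. 1: tasks with positive cash
    flow are placed as early as possible (forward pass); tasks with negative
    cash flow are taken from the end of the permutation and placed as late
    as possible (backward pass). Assumes the deadline is large enough for a
    feasible placement (cf. Sect. 2)."""
    usage = [0] * deadline  # resource units in use at each time point
    start = {}

    def fits(i, s):
        return all(usage[t] + req[i] <= cap for t in range(s, s + dur[i]))

    def place(i, s):
        for t in range(s, s + dur[i]):
            usage[t] += req[i]
        start[i] = s

    for i in [t for t in pi if cash[t] >= 0]:           # forward pass
        s = 0
        while not fits(i, s):
            s += 1
        place(i, s)
    for i in [t for t in reversed(pi) if cash[t] < 0]:  # backward pass
        s = deadline - dur[i]
        while not fits(i, s):
            s -= 1
        place(i, s)
    return start
```

In the sketch, a positive task with cash flow 10 is pulled to time 0, while a negative task is pushed against the deadline, mirroring panels (b) and (c) of Fig. 1.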

4.2 LR-ACO

ACO can be incorporated in a straightforward manner into Algorithm 1. In Algorithm 1, ImproveLB(\(\pi \)) can be replaced with Algorithm 3. The basic idea is to seed ACO with \(\pi \) to improve the lower bound. \(\pi \) is used as the global best solution for the ACO procedure which biases the search towards this solution. Note that there are no changes with respect to the manner in which schedules are generated. Lagrangian relaxation with or without ACO uses the same scheme described in the previous section. There are two variants of LR-ACO that we consider. LR-Kimms-ACO is the LR model originally suggested by Kimms (2001) with ACO and LR-SE-ACO is the more efficient LR model also combined with ACO.

5 Experiments and results

5.1 Algorithm settings

Experiments were conducted with LR, ACO and LR-ACO, where the LR components make use of the model of Kimms (Sect. 3.1). The following parameter settings were chosen by conducting tests on a subset of the instances. To choose the number of solutions per iteration \(n_a\), values in \(\{5,10,20,30\}\) were tested and it was found that 10 was the most effective. Similarly, \(q_0 = 0.9\) was determined from \(\{0.3,0.5,0.7,0.9,1.0\}\); this amounts to highly deterministic selection. \(\rho = 0.1\) was selected from \(\{0.1,0.01\}\), and while this is a relatively high learning rate it is justified given that the pheromones are re-initialized when ACO converges to a single solution. Note that the same settings were used for all algorithms using an ACO component. In the case where LR is combined with ACO, we have allowed the ACO search 500 iterations. This was not determined in any systematic way but rather selected so as to conduct as few “meaningful” iterations as possible; 500 iterations provide a reasonable number of pheromone updates in a relatively short time-frame.

The LR algorithm of Kimms (2001) is deterministic and was run once for every instance. All the runs were given at most 15 minutes of execution time. The thresholds were set to gap \(<0.01\) or \(\lambda < 0.01\), below either of which the algorithm terminates. The experiments were conducted on the Monash Sun Grid and the Enterprisegrid with Nimrod/G (Abramson et al. 2000).

5.2 Benchmark sets

The first set of instances was obtained from the project scheduling problem library (Kolisch and Sprecher 1997); these were also the instances used by Kimms (2001). They include a large number of instances with varying degrees of network complexity, resource factors and resource strengths. We first conduct a number of experiments on all the problem instances with 120 tasks to confirm that results similar to those of Kimms (2001) are achieved, and also to show that LR-Kimms-ACO provides improved results compared to LR and ACO independently. Additionally, we conduct a number of experiments with a subset of the instances (60 and 120 tasks), which are detailed in Sect. 6.

These instances are categorized in terms of three measures. Firstly, the network complexity indicates the proportion of precedences incorporated in the instance, with a larger value implying a larger number of precedences. The second measure is the resource factor, which specifies how many resources are required by an activity in proportion to the total number of resources available. Finally, the resource strength measures the scarceness of resources, with low values implying that resource constraints are tight.

The deadlines \(\delta \) were determined in a similar manner to Kimms (2001) so that feasible solutions are found easily. Note that in this study we do not aim to minimize makespan, so the deadlines chosen here are larger than in the previous study. For a task \(j\), a latest start time (\(ls_j\)) is determined by recursively considering all preceding tasks, i.e., \(ls_j \ge ls_i + d_i\,\forall (i,j) \in P\). Then, \(\delta = 3.5 \times max_jls_j\). The cash flows are determined exactly as in Kimms (2001) by selecting \(c_i\) uniformly at random from \([-500,1{,}000]\). We examine two discount rates \(\alpha \). The first is determined as in Kimms (2001), where \(\alpha = \sqrt[52]{1+0.05}-1\), and the second is \(\alpha =0.01\).
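The deadline computation can be sketched as a simple fixed-point propagation of the recursion above (illustrative code; assumes an acyclic precedence graph):

```python
def project_deadline(dur, prec, factor=3.5):
    """Compute delta = factor * max_j ls_j, where ls_j is obtained by
    propagating ls_j >= ls_i + d_i over all precedences (i, j)."""
    n = len(dur)
    ls = [0] * n
    changed = True
    while changed:  # repeat until no bound improves; terminates on a DAG
        changed = False
        for i, j in prec:
            if ls[i] + dur[i] > ls[j]:
                ls[j] = ls[i] + dur[i]
                changed = True
    return factor * max(ls)
```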

Vanhoucke (2010) also provides a number of project scheduling instances with characteristics similar to those tested by Kimms (2001). We consider all the instances with 100 tasks. The major difference between these two studies is that Vanhoucke (2010) uses tight deadlines, whereas the schedules generated by Kimms (2001) are always deadline feasible. We use settings similar to those of Vanhoucke (2010) in terms of deadlines, and these experiments are discussed in more detail in Sect. 5.4.

5.3 Results for Kimms’ instances with 120 tasks

We consider the datasets from the project scheduling problem library with all instances consisting of 120 tasks. Figure 2a shows the average results across all instances with 120 tasks. The gap is defined as \((ub-lb)/ub\). For ACO, we have used the LR upper bound to determine its gap. We see that the average gap of 2 % for LR is similar to that observed by Kimms (2001). The average gap obtained by LR-Kimms-ACO is significantly lower than those of LR and ACO, and it proves to be the best option. ACO is effective, but repairing solutions provided by LR is slightly more effective.

We now examine the results by resource strength (RS), resource factor (RF) and network complexity (NC); see Fig. 2b, c and d. The first observation is that LR-Kimms-ACO is always the best performing algorithm, providing lower average gaps than either of the two methods on their own. For the more tightly constrained problems (large resource factors and low resource strengths), LR is superior to ACO. This is expected since LR is designed to deal with constraints effectively, whereas ACO, and meta-heuristics in general, often struggle with hard constraints (Meyer and Ernst 2004). As far as network complexity is concerned, ACO is always marginally worse than LR, which, in turn, is worse than LR-Kimms-ACO.

Fig. 2
figure 2

The results of each algorithm for all the problem instances with 120 tasks. a An average over all instances, b instances split by resource strength, c instances split by resource factor and d instances split by network complexity

Table 1 shows the breakdown of the results discussed in Fig. 2. The table shows lower and upper bounds for each algorithm broken down by resource factor, network complexity and resource strength. The row \(Mean\) shows the average across all instances, and the following rows show the averages for each measure. The LR-based algorithms are always superior to ACO concerning the lower bounds. Here, LR-ACO is generally superior to LR, with LR more effective on 4 of the 12 measures. Interestingly, ACO assists in improving the upper bounds quite significantly, outperforming LR across all measures.

Table 1 Breakdown of the results on all the instances with 120 tasks

5.4 Results for Vanhoucke’s Instances with 100 Tasks

For this experiment, we consider the instances from Vanhoucke (2010), who also used tight deadlines. We consider the largest instances, those with 100 tasks or more. Gu et al. (2013) examine a Lagrangian relaxation and constraint programming (CP-LR) hybrid for the same set of instances; however, they do not extend deadlines to allow feasible solutions. Similar to our results, the CP-LR provides feasibility on more than 99 % of the instances with 100 tasks. More interestingly, however, the CP-LR algorithm focuses on project scheduling whereas the proposed LR-ACO is applicable to domains beyond scheduling.Footnote 6

Firstly, we examine deadlines 20 % beyond the minimum project deadline. The results are presented in Table 2 and Fig. 3 for instances ranging from all tasks having positive cash flows (100–0), through 20, 40, 60 and 80 % negative cash flows, to 100 % negative cash flows. The figure shows the gaps whereas the table shows a breakdown into lower and upper bounds. Since the deadlines are hard, feasible solutions may not always be found, and in this case we extend the deadlines to allow feasibility. Note that this was required for less than 1 % of the instances.

Fig. 3
figure 3

A comparison of the scatter search of Vanhoucke (2010), ACO, LR-ACO and LR+ACO. The x-axis represents the proportion of negative cash flow tasks: 100–0 consists of only positive cash flow tasks whereas 100–100 consists of only negative cash flow tasks

Table 2 Results on the instances of Vanhoucke (2010) with 100 tasks

We see that for mainly positive cash flows (0 and 20 % negative cash flows), LR-ACO is the best algorithm. This is mainly due to the upper bounds being very tight. For these instances, LR is most effective in finding lower bounds. Beyond 40 % negative cash flows, the scatter search is the best lower bounding method. While this effect can be attributed to the tight deadlines, we looked more closely at the individual runs of these methods. We find that even the relaxed problems of the LR require significant solving time, which increases with the number of negative cash flow tasks. Thus, in the presence of very tight deadlines, much larger run-times are needed to provide a reasonable number of iterations for the algorithm to converge. Hence, the scatter search is more effective since it is able to generate a large number of solutions (lower bounds) in reduced time-frames.

The conclusion above shows that when there are a large number of negative cash flows (\(\ge \)40 %) and tight deadlines, the scatter search is a more effective algorithm. Hence, we investigated tighter deadlines to see if the scatter search would eventually be effective on the 0 and 20 % instances. These results are also presented in Table 2. We see that with increasing tightness of deadlines, the scatter search is more effective even for the positive cash flow instances.

6 Investigating algorithms

From among the problem instances of Kimms (2001), we selected 12 instances each with 60 and 120 activities, giving a total of 24 instances.Footnote 7 Our aim is to measure the performance of the algorithms on tightly constrained problems, and the instances we have chosen reflect this. Thirty runs per instance were conducted for ACO and LR-ACO, and each algorithm was given 15 min of execution time.

All of the selected instances have a network complexity value of 2.1. Resource factors were chosen from 0.5 and 1.0 where, in the first case, the tasks require half the number of resources available and, in the second case, all the resources available. Resource strengths of 0.2 or 0.5 were chosen to reflect a wide range of resource strengths.

We present the results of the algorithms on the 24 selected instances (see Table 3). There are four categories of instances, each consisting of three instances with 60 tasks and three with 120 tasks. The table specifies network complexities (NC), resource factors (RF) and resource strengths (RS). The tables to follow highlight statistically significant results (\(p = 0.05\)) with italics and boldface; additionally, the best result achieved on any instance is marked in boldface only. The best, mean and standard deviations (sd) for lower bounds (lb) and upper bounds (ub), including gaps where applicable, are also reported. In the case of ACO, the gap is reported relative to the upper bound obtained by LR-SE-ACO.

Table 3 Instances selected for comparing the algorithms

6.1 Comparing LR-SE-ACO and LR-Kimms-ACO

The first set of results is presented in Table 4, with the aim of determining which of the two models (LR-SE-ACO or LR-Kimms-ACO) is more effective. This table clearly shows that for several instances LR-SE-ACO is more effective considering the lower bound. This is mainly due to the LR-SE model being a stronger formulation, thus solving the relaxed problem more quickly and leading to improved results. LR-SE-ACO thus proves to be the superior model, and we therefore make use of it for further comparisons.

Table 4 The results of LR-SE-ACO and LR-Kimms-ACO

6.2 Comparing ACO, LR-SE and LR-SE-ACO

The second comparison is between ACO, LR-SE and LR-SE-ACO; the results are shown in Table 5. Considering the lower bounds, we see that LR-SE-ACO is the best performing method across most instances, especially those with 60 tasks. ACO is also effective and in some cases obtains the best results (e.g. for instance 50-5). LR-SE is generally worse, however, by small margins. For the instances with 120 tasks, a select number of instances in categories 57 and 60 show that LR-SE is as good as or better than the other algorithms on average. However, the best results of LR-SE-ACO and ACO are still superior, with LR-SE-ACO performing best. Additionally, there are no significant differences in the upper bounds, and the gaps obtained are also of similar levels.

Table 5 The results of LR-SE-ACO, ACO and LR-SE

We explain LR-SE-ACO’s improved performance as follows. The LR algorithm obtains optimal start times for the tasks given the relaxed problem. The relaxed problem approaches the original problem over time through the penalties, leading to start times for the tasks close to those of the optimal solution. ACO is able to use these start times to further explore the surrounding regions of the search space effectively through the use of high learning rates. This leads to improvements on most occasions. Where LR-SE-ACO shows no improvement, or worse performance, compared to LR-SE, the ACO component has not been effective; here, the time spent on the ACO component would have been more useful as additional LR iterations.
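The interaction described above can be sketched as a simple loop; this is a hypothetical skeleton only (the function names and signatures are our own, and the real algorithm solves a Lagrangian relaxed scheduling problem at each step):

```python
def lr_aco(solve_relaxed, aco_improve, update_multipliers, iterations=100):
    """Hypothetical sketch of the LR-ACO hybrid described in the text."""
    multipliers = None
    best_lb, best_schedule = float("-inf"), None
    for _ in range(iterations):
        # LR step: start times for the relaxed problem (an upper bound)
        relaxed_schedule, ub = solve_relaxed(multipliers)
        # ACO step: explore the region around the LR start times with a
        # high learning rate, yielding a feasible solution (a lower bound)
        schedule, lb = aco_improve(relaxed_schedule)
        if lb > best_lb:
            best_lb, best_schedule = lb, schedule
        # penalties steer the relaxed problem towards the original one
        multipliers = update_multipliers(multipliers, relaxed_schedule)
    return best_schedule, best_lb
```

With stub callbacks the loop simply tracks the best lower bound found across iterations.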

Table 6 shows the results for two other variants of LR-SE with ACO. The first, referred to as LR-ACO (heur), uses the LR solution information to bias the ACO selection via the heuristic information (\(\eta \)). The second scheme, LR+ACO, runs ACO with a partially converged pheromone matrix after LR has completed. Of the two, LR-ACO (heur) is almost always more effective. Compared with LR-ACO, LR-ACO (heur) performs worse on the small problems while being slightly more effective on larger instances.

Table 6 LR-SE-ACO (heur) and LR-SE+ACO

Now we compare the algorithms with a discount rate of \(\alpha =0.01\). The results are shown in Table 7. As a result of this new discount rate, the gaps obtained are not as tight as in the previous comparisons. In this situation, the advantage gained by LR-SE-ACO is accentuated, providing the best lower bounds across all instances with 60 tasks. For the problems with 120 tasks, LR-SE-ACO is still the best performing method; however, its advantage is reduced. Here, ACO is able to outperform LR-SE-ACO occasionally. It is worth noting that even when this is the case, LR-SE-ACO always performs better on average.
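To illustrate the role of the discount rate, the following sketch uses a common NPV formulation from this literature, discounting each cash flow exponentially by its completion time; this is an assumption for illustration, not necessarily the exact formula of the model used here:

```python
import math

def project_npv(cash_flows, completion_times, alpha=0.01):
    """Discounted NPV: sum of c_i * exp(-alpha * t_i).

    cash_flows may be negative (cash outflows); under this formulation,
    completing positive cash flows early and negative ones late
    increases the NPV.
    """
    return sum(c * math.exp(-alpha * t)
               for c, t in zip(cash_flows, completion_times))

# two positive cash flows and one negative cash flow
npv = project_npv([100.0, 50.0, -30.0], [2.0, 5.0, 10.0], alpha=0.01)
```

A larger \(\alpha\) amplifies the effect of scheduling decisions on the objective, which is consistent with the wider gaps observed in Table 7.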

Table 7 The results of LR-SE-ACO, ACO and LR-SE with \(\alpha =0.01\)

The upper bounds obtained by both algorithms are closely matched, with LR-SE having a slight advantage. However, observing the gaps shows that in all cases LR-SE-ACO is superior, except for the two instances 50-3 and 60-5. This is easily explained by the advantage LR-SE-ACO obtains from ACO’s improvements to the lower bounds.

To summarize the results presented above, LR-SE-ACO is the most effective algorithm for obtaining lower bounds. Specifically, the LR component provides a very good seed for the ACO component, which is quick to improve the solutions found. Clearly, neither ACO nor LR-SE on its own is as effective, whereas the hybrid method is best suited for this problem.

6.3 Convergence of lower bounds

We further analyze the performance of the algorithms concerning lower bounds.Footnote 8 We consider the instances with 120 tasks and report results by averaging their performance across the categories and across all instances. As before, 30 runs per instance were conducted for ACO and LR-SE-ACO, and the results are mean normalized so that we can compare across instances. The first comparison is of the lower bounds obtained by all the algorithms over 15 min, shown in Fig. 4. Here, the results for every instance and for all 30 runs per instance are first mean normalized. The mean normalized results are averaged and the difference of each algorithm to the average is reported.
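The normalization and averaging steps described above can be sketched as follows; we assume here that "mean normalized" means scaling each run's result by the per-instance mean (one common reading), and the function names are our own:

```python
def mean_normalize(values):
    """Scale each run's result by the per-instance mean, so that results
    from instances of different magnitudes become comparable."""
    mean = sum(values) / len(values)
    return [v / mean for v in values]

def difference_to_average(per_algorithm):
    """Given one normalized score per algorithm, report each algorithm's
    difference to the average over all algorithms (as plotted)."""
    overall = sum(per_algorithm.values()) / len(per_algorithm)
    return {name: score - overall for name, score in per_algorithm.items()}
```

An algorithm plotted above zero is then performing better than the average of the four methods at that time point, and one below zero worse.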

Fig. 4
figure 4

A comparison of mean normalized lower bounds (NPV) of four algorithms over 15 min. The results across the 30 runs are first mean normalized by instance. These results are averaged across all algorithms and the difference to the average is shown for each algorithm

Figure 4 shows that LR-SE-ACO is overall the best performing algorithm. ACO and LR-SE-ACO have a significant advantage at the initial stages. LR-SE improves gradually until 100 seconds, at which stage it outperforms ACO. The trend shows that ACO and LR-SE-ACO continue to improve over time whereas LR-SE and LR-SE-ACO (heur) reach a point after which their performance stagnates. These results split by category are shown in Fig. 5, where a similar picture to the overall average performance can be seen. However, we note that when the resource factor is lower (instances 47 and 50) LR-SE-ACO has an early advantage. LR-SE-ACO always has the initial advantage, but with large resource factors (instances 57 and 60) the algorithms are more closely matched until 200 seconds, from which point LR-SE-ACO regains its advantage.

Fig. 5
figure 5

A comparison of mean normalized lower bounds (NPV) of four algorithms by category over 15 min. From left to right and top to bottom we have the results for instance categories 47, 50, 57 and 60. The results across the 30 runs are first mean normalized by instance. These results are then averaged across all algorithms in a category. The difference to the average by category is shown for each algorithm

The initial difference is attributable to ACO’s ability to improve a starting solution effectively. Over time, ACO’s influence is reduced but it still assists LR-SE-ACO, leading to the gradual divergence of the LR-SE-ACO and LR-SE curves. ACO’s initial result is very good and it gradually improves, but not enough to keep pace with the LR-based algorithms.

Interestingly, LR-SE-ACO (heur) follows LR-SE despite using heuristic information. For these runs with \(\alpha = 0.01\), the heuristic information seems to make no significant difference, unlike the results seen in Table 6. In contrast, LR-SE-ACO is always more effective. Thus, depending on the learning rate, providing a bias via a solution (LR-SE-ACO), which is a stronger form of bias, is more effective than a subtle bias through the heuristic information: larger discount rates require a stronger bias.

The next comparison of interest considers iterations as opposed to time. Since a single iteration of LR-SE-ACO is more expensive than that of LR-SE, the aim here is to identify whether it converges more quickly than LR-SE given the same number of relaxed problems solved. We consider 100 iterations, with comparisons made based on mean normalized data (see Fig. 6). LR-SE-ACO is overall the best performing algorithm. This is not surprising given the results previously seen; however, we find that both LR algorithms converge to a point (about 50 iterations) after which they diverge gradually. The explanation is similar to the one provided earlier, where ACO provides large improvements initially but its influence eventually reduces. ACO is the least expensive method per iteration and, not surprisingly, it is significantly worse across the iterations. Over 100 iterations we also see that LR-SE-ACO (heur) does not provide significant improvements over LR-SE-ACO, following a similar trend to what was seen in the long runs.

Fig. 6
figure 6

A comparison of mean normalized lower bounds (NPV) of four algorithms over 100 iterations. The results across the 30 runs are first mean normalized by instance. These results are averaged across all algorithms and the difference to the average is shown for each algorithm

Fig. 7
figure 7

A comparison of mean normalized lower bounds (NPV) of four algorithms by category over 100 iterations. From left to right and top to bottom we have the results for instance categories 47, 50, 57 and 60. The results across the 30 runs are first mean normalized by instance. These results are then averaged across all algorithms in a category. The difference to the average by category is shown for each algorithm

A breakdown by category (Fig. 7) shows a similar trend. However, category 57, with a high resource factor and low resource strength, shows that the initial advantage provided by ACO is lost, but regained over the iterations.

7 Conclusion

In this study we have shown that hybrids of a Lagrangian relaxation based heuristic with ACO can be effectively applied to a resource constrained project scheduling problem. We show through comparisons based on CPU time that LR-SE-ACO outperforms LR-SE (Lagrangian relaxation with list scheduling) and ACO on their own when maximizing NPV. The complementary advantages of each algorithm assist the hybrid to improve upon either method on its own. This study shows that improvements can be made with respect to the lower bounds. While the upper bounds may already be very good, improving them still warrants further research and is currently being addressed.

We find that if the project deadlines are tight, the hybrid LR-ACO is not as effective. This is mainly due to the increased solve time of the sub-problems. In this direction, improvements to the original formulation could potentially allow the algorithm to execute more iterations in reduced time-frames, thereby improving the lower and upper bounds.

ACO has been shown to assist the LR algorithm; however, other methods may be substituted for ACO effectively. In particular, local search methods may prove effective here, exploring the regions of the search space surrounding the start times provided by the LR algorithm. Additionally, further feedback from ACO to the LR components could provide more effective bounds and also help the algorithm converge more quickly.

Given these results, extending all of these algorithms to significantly larger problems should provide interesting results. In fact, we are currently exploring data for a problem obtained from the Australian mining industry with up to 11,000 tasks with promising initial results.