1 Introduction

The resource-constrained project scheduling problem (RCPSP) is one of the most widely studied scheduling problems. The basic type of RCPSP asks for a precedence- and resource-feasible schedule (i.e., a vector of activity starting times) that minimizes the makespan of a project. One of the fundamental assumptions of the basic type of RCPSP is that activity durations are deterministic (i.e., they are known in advance). In reality, however, activity durations are subject to considerable uncertainty. The stochastic RCPSP (or SRCPSP) takes this uncertainty into account and studies the RCPSP when activity durations are stochastic. In contrast to the basic type of RCPSP, the SRCPSP has received comparatively little attention in the literature (refer to Demeulemeester and Herroelen (2002) and Neumann et al. (2003) for an overview of the field of resource-constrained project scheduling, and to Herroelen and Leus (2005) for a survey on project scheduling under uncertainty).

Because the SRCPSP is known to be \(\fancyscript{NP}\)-hard (Ballestín 2007), most researchers have focussed their efforts on developing heuristic solution methods. Golenko-Ginzburg and Gonik (1997) propose two heuristics that both rely on solving a series of multi-dimensional knapsack problems (the first heuristic uses an exact procedure, whereas the second resorts to a heuristic solution of the knapsack problems). Tsai and Gemmill (1998) apply simulated annealing and tabu search procedures, whereas Ballestín (2007) and Ballestín and Leus (2009) use a genetic algorithm and a GRASP algorithm, respectively. Ashtiani et al. (2011) adopt a two-phase local-search procedure. Stork (2001), who builds on the work of Igelmund and Radermacher (1983) and Möhring (2000), is one of the few researchers to have developed exact procedures to optimally solve the SRCPSP. In his PhD thesis, he compares the performance of five different branch-and-bound algorithms.

In this article, we extend the work of Creemers et al. (2010) and present an exact model that uses a backward stochastic dynamic-programming (SDP) recursion to determine the minimum expected makespan of a resource-constrained project with stochastic activity durations. We use acyclic phase-type (PH) distributions to model activity durations and match the first two moments of the activity duration distributions. The computational complexity increases with the number of project activities and with decreasing levels of activity duration variability (lower variability requires more phases, and hence a larger state space). Therefore, the model is intended for solving small- to medium-sized problem instances where activity durations have a moderate-to-high level of variability.

The main contributions of this article are (1) we develop an exact and analytic solution method for optimally solving the SRCPSP, (2) our model drastically improves computational performance when compared to the model of Creemers et al. (2010), (3) our model outperforms the existing state-of-the-art when it comes to optimally solving small- to medium-sized problem instances where activities have a moderate-to-high level of duration variability, (4) our model significantly improves solution quality when compared to the heuristic approaches available in the literature, and (5) we investigate the impact of making scheduling decisions also during the execution of an activity rather than only at the end of an activity.

The remainder of this article is organized as follows. Section 2 presents the basic definitions and outlines the problem statement. The model itself is explained in Sect. 3. Section 4 provides a numerical example. The experiments and the results are discussed in Sect. 5. Section 6 concludes.

2 Definitions and problem statement

A project is a network of activities that can be represented by a graph \(G=\left( V,E \right) \), where \(V = \left\{ 0,1,\ldots ,n \right\} \) is a set of nodes and \(E = \left\{ (i,j) | i,j \in V \right\} \) is a set of arcs. The nodes represent project activities, whereas the arcs represent precedence relationships. Activities \(0\) and \(n\) are dummy activities and represent the start and completion of the project, respectively. The duration of an activity \(i\) is a random variable \(\tilde{p}_{i}\) and has expected value \(\mu _{i}\) and variance \(\sigma _{i}^{2}\). \(\tilde{\mathbf {p}} = \left\{ \tilde{p}_{0}, \tilde{p}_{1}, \ldots , \tilde{p}_{n} \right\} \) denotes the vector of the activity duration random variables. \(p_{i}\) is a realization (or random variate) of \(\tilde{p}_{i}\), and \(\mathbf {p} = \left\{ p_{0}, p_{1}, \ldots , p_{n} \right\} \) is a realization of \(\tilde{\mathbf {p}}\). An activity \(i\) can start when all of its predecessors are finished and if sufficient resources are available. There are \(K\) renewable resource types. The availability of each resource type \(k\) is denoted by \(R_{k}\). Each activity \(i\) requires \(r_{i,k}\) units of resource \(k\), where \(r_{0,k} = r_{n,k} = 0\) for all \(k \in \fancyscript{R} = \left\{ 1,2, \ldots , K \right\} \).

A solution to the basic type of RCPSP is a schedule \(S = \left\{ S_{0}, S_{1}, \ldots , S_{n} \right\} \), where \(S_{i}\) is the starting time of an activity \(i\), \(S_{0}=0\) (i.e., the project starts at time \(0\)), and \(S_{n}\) represents the completion time (or makespan) of the project. In addition, define \(\fancyscript{A}(S,t) = \left\{ i \in V : S_{i} \le t \wedge (S_{i} + p_{i}) \ge t \right\} \), the set of activities in schedule \(S\) that are active at time \(t\). A schedule \(S\) is feasible if

$$\begin{aligned} S_{i} + p_{i}&\le S_{j}&\qquad \forall (i,j) \in E, \end{aligned}$$
(1)
$$\begin{aligned} \sum \limits _{i \in \fancyscript{A}(S,t)} r_{i,k}&\le R_{k}&\qquad \forall t \ge 0, \forall k \in \fancyscript{R}, \end{aligned}$$
(2)
$$\begin{aligned} S_{i}&\ge 0&\qquad \forall i \in V. \end{aligned}$$
(3)

The optimal schedule \(S^{*}\) minimizes \(S_{n}\) subject to Constraints 1–3.

Because activity durations are not known in advance, a solution to the SRCPSP is a policy rather than a schedule. A policy \(\varPi \) is a set of decision rules that defines actions at decision times. Decision times are typically the start of the project and the completion times of activities. An action, on the other hand, corresponds to the start of a precedence- and resource-feasible set of activities. In addition, decisions have to respect the non-anticipativity constraint (i.e., a decision at time \(t\) can only use information that has become available before or at time \(t\)). As time progresses, activity duration realizations become known, and a schedule is generated (i.e., activities are assigned a starting time). Consequently, a policy \(\varPi \) may be interpreted as a function \(\mathbb {R}^{n+1}_{\ge 0} \mapsto \mathbb {R}^{n+1}_{\ge 0}\) that maps given realizations of activity durations \(\mathbf {p}\) to vectors of feasible starting times \(S\left( \mathbf {p}; \varPi \right) = \left\{ S_{0} \left( \mathbf {p}; \varPi \right) ,S_{1}\left( \mathbf {p}; \varPi \right) ,\ldots ,S_{n}\left( \mathbf {p}; \varPi \right) \right\} \) (see for instance Igelmund and Radermacher (1983), Möhring (2000), and Stork (2001)). For a given realization \(\mathbf {p}\) and policy \(\varPi \), \(S_{n} \left( \mathbf {p}; \varPi \right) \) denotes the makespan of schedule \(S\left( \mathbf {p}; \varPi \right) \). The objective of the SRCPSP is to minimize \(E \left( S_{n} \left( \mathbf {p}; \varPi \right) \right) \) over a given class of policies (where \(E \left( \cdot \right) \) is the expectation operator with respect to \(\mathbf {p}\)). Optimization over the class of all policies is computationally intractable. Therefore, most researchers focus their attention on the class of elementary policies \(\mathcal {P}\), and allow decisions to be made only at the start of the project (i.e., at time 0) and at the completion times of activities.
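To make this concrete, the sketch below (in Python) applies one simple elementary policy, a priority-list rule evaluated with a parallel schedule-generation scheme, to a single realization \(\mathbf {p}\): at every decision time (time \(0\) or an activity completion), eligible activities are started in list order as long as the resources permit, so only information available at or before that time is used. The function, its arguments, and the priority-list rule are illustrative assumptions and do not correspond to the policies studied in this article.

```python
def simulate_priority_policy(order, E, p, r, R):
    """Apply a priority-list policy to one duration realization.

    order : all activities (dummies included) in priority order.
    E     : set of precedence arcs (i, j).
    p     : realized durations p[i] (dummies have duration 0).
    r, R  : resource requirements r[i, k] and availabilities R[k].
    Returns the generated schedule S; S[n] is the realized makespan.
    """
    preds = {j: {i for (i, k) in E if k == j} for j in order}
    S, finish, done, ongoing, t = {}, {}, set(), set(), 0.0
    while len(done) < len(order):
        used = {k: sum(r[j, k] for j in ongoing) for k in R}
        for i in order:                                   # decision: start activities in list order
            if i in S or not preds[i] <= done:            # already started / predecessors not finished
                continue
            if any(used[k] + r[i, k] > R[k] for k in R):  # insufficient resources
                continue
            S[i], finish[i] = t, t + p[i]
            ongoing.add(i)
            for k in R:
                used[k] += r[i, k]
        t = min(finish[j] for j in ongoing)               # next decision time (errors out if infeasible)
        for j in [j for j in ongoing if finish[j] <= t]:  # completions observed at time t
            ongoing.remove(j)
            done.add(j)
    return S
```

Averaging \(S_{n}\left( \mathbf {p}; \varPi \right) \) over many sampled realizations estimates the expected makespan of such a policy; the exact approach of Sect. 3 avoids this sampling altogether.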

3 Markov decision chain

A project network with stochastic activity durations is often referred to as a PERT network, and a PERT network with independent and exponentially distributed activity durations is also called a Markovian PERT network. For Markovian PERT networks, Kulkarni and Adlakha (1986) have developed an exact method for deriving the distribution of the earliest project completion time using a continuous-time Markov chain (CTMC). Buss and Rosenblatt (1997), Sobel et al. (2009), and Creemers et al. (2010) use the CTMC of Kulkarni and Adlakha (1986) as a starting point to develop scheduling procedures that maximize an expected-NPV (eNPV) objective. All aforementioned studies, however, assume unlimited resources and exponentially distributed activity durations. In this article, we extend the work of Creemers et al. (2010) to accommodate (1) resource constraints, (2) PH-distributed activity durations, and (3) a minimum-makespan objective.

Below, we first study the special case of exponential activity durations (Sect. 3.1), followed by an overview of the PH distributions that are used in this article (Sect. 3.2). Next, we discuss how to incorporate PH distributions in the model (Sect. 3.3), and explain why they are a good choice for approximating the activity duration distributions (Sect. 3.4).

3.1 Exponential activity durations

In this section, we assume that duration \(\tilde{p}_{i}\) is exponentially distributed with rate parameter \(\lambda _{i} = \mu _{i}^{-1}\) for all \(i \in V \setminus \left\{ 0,n \right\} \). At any time instant \(t \ge 0\), the status of an activity is either idle (not yet started), ongoing (being executed), or finished (completed). Let \(I(t)\), \(O(t)\), and \(F(t)\) denote the activities in \(V\) that are idle, ongoing, and finished at time \(t\), respectively. \(I(t)\), \(O(t)\), and \(F(t)\) are mutually exclusive sets and \(I(t) \cup O(t) \cup F(t) = V\) for all \(t \ge 0\). For ease of notation, we omit index \(t\) when referring to sets \(I(t)\), \(O(t)\), and \(F(t)\).

The state of the system is defined by the status of the individual activities and can be represented by a pair \(\left( I,O \right) \) (set \(F\) can be obtained from sets \(V\), \(I\), and \(O\)). State transitions take place at the completion of an activity and are determined by the policy at hand. The starting conditions of the project are \(I = V \setminus \left\{ 0 \right\} \) and \(O = \left\{ 0 \right\} \). Selecting the optimal scheduling policy \(\varPi ^{*}\) corresponds to minimizing a value function \(G \left( \cdot \right) \) in a continuous-time Markov decision process (CTMDP) on the state space \(Q\), with \(Q\) containing all feasible states. The value function \(G \left( I,O \right) \) returns the expected time until completion of the project upon entry of state \(\left( I,O\right) \), conditional on the assumption that optimal decisions are made in all subsequent states (i.e., the Bellman principle of optimality applies).

Upon entry of state \((I,O)\), a decision needs to be made whether or not to start a set of activities \(W \subseteq H_{I,O}\), with \(H_{I,O}\) the set of activities that are eligible to start. An activity \(i\) is eligible to start if

1. \(i \in I\),
2. \(j \in F\) for all \(j\) for which \(\left( j,i\right) \in E\),
3. \(r_{i,k} \le R_{k} - \sum \limits _{j \in O} r_{j,k}\) for all \(k \in \fancyscript{R}\).

In addition, define \(D \left( I,O, W \right) \), the time until completion of the project if a decision is made to start a set of activities \(W\) upon entry of state \((I,O)\).
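As an illustration, the three eligibility conditions can be checked with a few set operations. The sketch below (Python) assumes that \(I\), \(O\), and \(F\) are stored as sets, that \(E\) is a set of arcs, and that the resource data are kept in dictionaries; all names are illustrative.

```python
def eligible(i, I, O, F, E, r, R):
    """Sketch of the eligibility test for activity i in state (I, O)."""
    if i not in I:                                            # condition 1: i is idle
        return False
    if any(j not in F for (j, k) in E if k == i):             # condition 2: predecessors finished
        return False
    return all(r[i, k] <= R[k] - sum(r[j, k] for j in O)      # condition 3: enough resources left
               for k in R)
```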

If no activities are started, a transition toward another state takes place after the first completion of an activity in \(O\). The probability that an activity \(i \in O\) finishes first equals \(\lambda _{i} / \sum _{j \in O} \lambda _{j}\). The time until the first completion is exponentially distributed and has expected value \(\left( \sum _{i \in O} \lambda _{i} \right) ^{-1}\). Therefore, if no activities are started, the time until completion of the project upon entry of state \((I, O)\) equals

$$\begin{aligned} D \left( I,O, \emptyset \right) = \frac{1}{\sum \limits _{i \in O} \lambda _{i}} + \sum \limits _{i \in O} \frac{\lambda _{i}}{\sum \limits _{j \in O} \lambda _{j}} G \left( I, O \setminus \left\{ i \right\} \right) . \end{aligned}$$
(4)

If, on the other hand, a set of activities \(W \subseteq H_{I,O}\) is started, an immediate transition toward another state is made and the time until completion of the project upon entry of state \((I,O)\) equals

$$\begin{aligned} D \left( I,O, W \right) = G \left( I \setminus W, O \cup W \right) . \end{aligned}$$
(5)

In order to determine the best set of activities to start, Creemers et al. (2010) enumerate all subsets of \(H_{I,O}\), resulting in \(2^{\left| H_{I,O} \right| }\) decisions to be evaluated. In this article, we propose a more efficient approach in which only \(\left| H_{I,O} \right| \) decisions have to be evaluated. Each decision corresponds to the start of a single activity in \(H_{I,O}\), and evaluating all \(\left| H_{I,O} \right| \) decisions suffices to determine the optimal objective value upon entry of state \((I,O)\). To see this, one only has to consider that (1) upon starting an activity, an instantaneous transition is made toward a subsequent state and (2) due to the Bellman principle of optimality, optimal decisions are made in subsequent states. In other words, instead of starting multiple activities in a single instantaneous transition, we make multiple instantaneous transitions, during each of which a single activity is started. This modification results in a drastic reduction of the number of decisions to be evaluated upon entry of a state \((I,O)\). The impact of the modification on the computational performance of the backward SDP recursion is verified in Sect. 5.1.

Upon entry of state \((I,O)\), it is optimal to either not start activities or to start activity \(i^{*}\)

$$\begin{aligned} i^{*} = \underset{i \in H_{I,O}}{{\text {argmin}}} \left\{ D \left( I,O, \{ i \} \right) \right\} . \end{aligned}$$
(6)

Clearly, if \(H_{I,O} = \emptyset \), the optimal decision is to not start activities and \(G \left( I,O \right) = D \left( I,O, \emptyset \right) \). Otherwise, \(G \left( I,O \right) \) equals

$$\begin{aligned} G \left( I,O \right) = \min \left\{ D \left( I,O, \emptyset \right) , D \left( I,O, \{ i^{*} \} \right) \right\} . \end{aligned}$$
(7)

The backward SDP recursion starts in \((I,O) = (\{ n \}, \emptyset )\) and stops if \((I,O) = (V \setminus \{ 0 \}, \{ 0 \})\), and the optimal objective value equals \(\min E \left( S_{n} \left( \mathbf {p}; \varPi ^{*} \right) \right) = G ( V \setminus \{ 0 \}, \{ 0 \})\).
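A compact sketch of the recursion defined by Eqs. (4)–(7) is given below (Python). For brevity, the dummy activities are dropped, so the recursion starts in state \(\left( V \setminus \{0,n\}, \emptyset \right) \) and terminates in \(\left( \emptyset , \emptyset \right) \); decisions start at most one activity per (instantaneous) transition, as argued above. This is an illustrative simplification, not the implementation used in the experiments.

```python
from functools import lru_cache

def min_expected_makespan(V, E, lam, r, R):
    """Backward recursion of Eqs. (4)-(7) for exponential durations.

    V   : the non-dummy activities;  E : precedence arcs among them.
    lam : rates, lam[i] = 1 / mu_i;  r, R : resource data.
    Assumes a precedence- and resource-feasible instance.
    """
    V = frozenset(V)
    preds = {i: frozenset(j for (j, k) in E if k == i) for i in V}

    def eligible(I, O):
        used = {k: sum(r[j, k] for j in O) for k in R}
        return [i for i in I
                if preds[i] <= (V - I - O)                      # all predecessors finished
                and all(used[k] + r[i, k] <= R[k] for k in R)]  # enough resources left

    @lru_cache(maxsize=None)
    def G(I, O):                                   # expected remaining time upon entry of (I, O)
        if not I and not O:
            return 0.0                             # project completed
        best = float('inf')
        for i in eligible(I, O):                   # start a single eligible activity (Eq. 6)
            best = min(best, G(I - {i}, O | {i}))
        if O:                                      # or wait for the first completion (Eq. 4)
            rate = sum(lam[j] for j in O)
            best = min(best, 1.0 / rate
                       + sum(lam[j] / rate * G(I, O - {j}) for j in O))
        return best                                # Eq. (7)

    return G(V, frozenset())
```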

3.2 PH distributions

In this article, we model activity durations using acyclic, continuous-time PH distributions. Continuous-time PH distributions use exponentially distributed building blocks to approximate any positive-valued continuous distribution with arbitrary precision (see Neuts (1981), Latouche and Ramaswami (1999), and Osogami (2005) for further details on PH distributions). More formally, a PH distribution is the distribution of time until absorption in a Markov chain with absorbing state \(0\) and state space \(\left\{ 0,1,\ldots ,Z \right\} \). It is fully characterized by parameters \(\varvec{\tau }\) and \(\mathbf {Z}\). \(\varvec{\tau }\) is the vector of probabilities to start the process in any of the \(Z\) transient states (i.e., phases), and \(\mathbf {Z}\) is the transient state transition matrix. The infinitesimal generator of the Markov chain representing the PH distribution is

$$\begin{aligned} \mathbf {Q}= \left( \begin{array}{cc} 0 &{} \mathbf {0} \\ \mathbf {t} &{} \mathbf {Z} \end{array} \right) , \end{aligned}$$

where \(\mathbf {0}\) is a zero matrix and \(\mathbf {t} = - \mathbf {Z}\mathbf {e}\) (with \(\mathbf {e}\) a vector of ones).
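As a small illustration, the generator \(\mathbf {Q}\) and the mean of a PH distribution (via the standard identity \(E[T] = -\varvec{\tau } \mathbf {Z}^{-1} \mathbf {e}\)) can be obtained directly from the representation \(\left( \varvec{\tau }, \mathbf {Z} \right) \). The sketch below uses numpy and is provided for illustration only.

```python
import numpy as np

def ph_generator_and_mean(tau, Z):
    """Assemble the generator Q of a PH distribution with representation
    (tau, Z) and return it together with the mean time to absorption,
    E[T] = -tau Z^{-1} e."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    e = np.ones(Z.shape[0])
    t = -Z @ e                                   # absorption rates, t = -Z e
    Q = np.zeros((Z.shape[0] + 1, Z.shape[0] + 1))
    Q[1:, 0] = t                                 # rates into the absorbing state 0
    Q[1:, 1:] = Z
    mean = float(-tau @ np.linalg.solve(Z, e))   # -tau Z^{-1} e
    return Q, mean

# e.g. an exponential distribution with rate 0.5 has mean 2:
# Q, mean = ph_generator_and_mean([1.0], [[-0.5]])
```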

Various techniques exist to approximate a given distribution by means of a PH distribution. In this article, we adopt a two-moment matching procedure that minimizes the required number of phases. Let \(\nu _{i}\) denote the squared coefficient of variation (SCV) of the duration of activity \(i\)

$$\begin{aligned} \nu _{i} = \sigma _{i}^{2} \mu _{i}^{-2}. \end{aligned}$$
(8)

We define three cases: (1) \(\nu _{i} = 1\), (2) \(\nu _{i} > 1\), and (3) \(\nu _{i} < 1\). In the first case, we approximate the activity duration distribution by means of an exponential distribution with rate parameter \(\lambda _{i} = \mu _{i}^{-1}\). The parameters of the corresponding PH distribution are

$$\begin{aligned} \begin{array}{lcl} \varvec{\tau }_{i} &{} = &{} 1,\\ \mathbf {Z}_{i} &{} = &{} \left( -\lambda _{i} \right) . \end{array} \end{aligned}$$

In the second case, we use a two-phase Coxian distribution where the rate parameter of the first phase is determined by means of a scaling factor \(\kappa \)

$$\begin{aligned} \lambda _{i,1} = \frac{1}{\mu _{i}\kappa }. \end{aligned}$$
(9)

The expected value of the two-phase Coxian distribution is

$$\begin{aligned} \mu _{i} = \lambda _{i,1}^{-1} + \pi _{i,1,2} \lambda _{i,2}^{-1}, \end{aligned}$$
(10)

where \(\lambda _{i,2}\) is the exponential rate parameter of the second phase and \(\pi _{i,1,2}\) is the probability of visiting the second phase. The variance of the two-phase Coxian distribution is

$$\begin{aligned} \sigma _{i}^{2} = \lambda _{i,1}^{-2} + 2 \pi _{i,1,2} \lambda _{i,2}^{-2} - \pi _{i,1,2}^{2} \lambda _{i,2}^{-2}. \end{aligned}$$
(11)

When deriving parameters \(\lambda _{i,2}\) and \(\pi _{i,1,2}\) as a function of parameters \(\mu _{i}\), \(\nu _{i}\) and \(\kappa \), we obtain

$$\begin{aligned} \lambda _{i,2}= & {} \frac{2 \left( \kappa - 1 \right) }{ \mu _{i} \left( 2 \kappa - 1 - \nu _{i} \right) },\end{aligned}$$
(12)
$$\begin{aligned} \pi _{i,1,2}= & {} \frac{2 \left( \kappa - 1 \right) ^{2}}{1 + \nu _{i} - 2 \kappa }. \end{aligned}$$
(13)

The parameters of the corresponding PH distribution are

$$\begin{aligned} \begin{array}{lcl} \varvec{\tau }_{i} &{} = &{} \mathbf {e}_{1},\\ \mathbf {Z}_{i} &{} = &{} \left( \begin{array}{cc} -\lambda _{i,1} &{} \pi _{i,1,2} \lambda _{i,1}\\ 0 &{} -\lambda _{i,2} \end{array} \right) , \end{array} \end{aligned}$$

where \(\mathbf {e}_{1}\) is a single-entry vector that is populated with zeroes except for the first entry, which equals one. In the third case, we use a hypo-exponential distribution (a series of exponential distributions whose parameters are allowed to differ; a generalization of the Erlang distribution). The number of required phases equals

$$\begin{aligned} Z_{i} = \lceil \nu _{i}^{-1} \rceil . \end{aligned}$$
(14)

We assume that the first \(Z_{i}-1\) phases of the hypo-exponential distribution are independent and identically exponentially distributed with rate parameter \(\lambda _{i,1} = \lambda _{i,2} = \ldots = \lambda _{i,Z_{i}-1}\). The last phase is exponentially distributed with rate parameter \(\lambda _{i,Z_{i}}\). The expected value and variance of the hypo-exponential distribution are

$$\begin{aligned} \mu _{i}= & {} \sum \limits _{z=1}^{Z_{i}} \lambda _{i,z}^{-1},\end{aligned}$$
(15)
$$\begin{aligned} \sigma _{i}^{2}= & {} \sum \limits _{z=1}^{Z_{i}} \lambda _{i,z}^{-2}. \end{aligned}$$
(16)

When deriving the rate parameters of the hypo-exponential distribution as a function of parameters \(\mu _{i}\), \(\nu _{i}\), and \(Z_{i}\), we obtain

$$\begin{aligned} \lambda _{i,Z_{i}}= & {} \frac{1 + \sqrt{ \left( Z_{i} - 1 \right) \left( Z_{i} \nu _{i} - 1\right) }}{ \mu _{i} \left( 1- Z_{i} \nu _{i} + \nu _{i} \right) },\end{aligned}$$
(17)
$$\begin{aligned} \lambda _{i,z}= & {} \frac{\left( Z_{i} - 1 \right) - \sqrt{\left( Z_{i}-1 \right) \left( Z_{i} \nu _{i} - 1\right) }}{\mu _{i} \left( 1-\nu _{i} \right) }, \end{aligned}$$
(18)

for all \(z \in \left\{ 1,2,\ldots ,Z_{i}-1 \right\} \). The parameters of the corresponding PH distribution are

$$\begin{aligned} \begin{array}{lcl} \varvec{\tau }_{i} &{} = &{} \mathbf {e}_{1},\\ \mathbf {Z}_{i} &{} = &{} \left( \begin{array}{cccccc} -\lambda _{i,1} &{}\quad \lambda _{i,1} &{}\quad 0 &{}\quad \cdots &{}\quad 0 &{}\quad 0\\ 0 &{} -\lambda _{i,2} &{} \lambda _{i,2} &{}\quad \cdots &{}\quad 0 &{}\quad 0\\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots &{} \quad \vdots \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \cdots &{}\quad -\lambda _{i,Z_{i}-1} &{} \lambda _{i,Z_{i}-1}\\ 0 &{}\quad 0 &{}\quad 0 &{} \quad \cdots &{}\quad 0 &{}\quad -\lambda _{i,Z_{i}}\\ \end{array} \right) . \end{array} \end{aligned}$$

For the three cases, \(Z_{i}\) equals \(1\), \(2\), and \(\lceil \nu _{i}^{-1} \rceil \), respectively. Figure 1 provides an overview of the PH distributions that are used in this article. Note, however, that our method works with all acyclic PH distributions.
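The three fits translate into a few lines of code. The sketch below (Python) returns \(\left( \varvec{\tau }_{i}, \mathbf {Z}_{i} \right) \) for a given mean and SCV; the scaling factor \(\kappa \) of Eq. (9) is left as a parameter, and the default value \(\kappa = 0.5\) used here (which gives \(\lambda _{i,1} = 2/\mu _{i}\)) is an illustrative choice rather than necessarily the one used in this article.

```python
import math

def fit_ph(mu, scv, kappa=0.5):
    """Two-moment PH fit of Sect. 3.2; returns (tau, Z) as nested lists.
    kappa is assumed to lie in (0, 1) so that the SCV > 1 fit stays valid."""
    if abs(scv - 1.0) < 1e-9:                     # case 1: exponential, Z_i = 1
        return [1.0], [[-1.0 / mu]]
    if scv > 1.0:                                 # case 2: two-phase Coxian, Eqs. (9), (12), (13)
        lam1 = 1.0 / (mu * kappa)
        lam2 = 2.0 * (kappa - 1.0) / (mu * (2.0 * kappa - 1.0 - scv))
        pi12 = 2.0 * (kappa - 1.0) ** 2 / (1.0 + scv - 2.0 * kappa)
        return [1.0, 0.0], [[-lam1, pi12 * lam1], [0.0, -lam2]]
    Z_i = math.ceil(1.0 / scv)                    # case 3: hypo-exponential, Eqs. (14), (17), (18)
    root = math.sqrt((Z_i - 1) * (Z_i * scv - 1.0))
    lam_last = (1.0 + root) / (mu * (1.0 - Z_i * scv + scv))
    lam_first = ((Z_i - 1) - root) / (mu * (1.0 - scv))
    rates = [lam_first] * (Z_i - 1) + [lam_last]
    Z = [[0.0] * Z_i for _ in range(Z_i)]
    for z, lam in enumerate(rates):               # bidiagonal transient generator
        Z[z][z] = -lam
        if z + 1 < Z_i:
            Z[z][z + 1] = lam
    return [1.0] + [0.0] * (Z_i - 1), Z
```

Evaluating Eqs. (10)–(11) or (15)–(16) on the returned rates confirms that both moments are matched.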

Fig. 1 Overview of PH distributions

3.3 PH-distributed activity durations

In this section, we describe how to obtain the minimum expected makespan of a resource-constrained project when activity durations are PH distributed. The duration of an activity \(i \in V \setminus \left\{ 0, n \right\} \) can be seen as a sequence of \(Z_{i}\) phases where each phase \(\theta _{i,z}\): (1) has an exponential duration with rate parameter \(\lambda _{i,z}\), (2) has a probability \(\tau _{i,z}\) to be the initial phase when starting activity \(i\), and (3) is visited with probability \(\pi _{i,y,z}\) when departing from another phase \(\theta _{i,y}\). Also note that, due to the acyclic nature of the adopted PH distributions, a phase can never be visited more than once.

The use of PH-distributed activity durations makes it possible to transform any project network into a Markovian PERT network. Therefore, given a few adaptations, the SDP recursion described in Sect. 3.1 can easily be extended to accommodate PH distributions. The most important adaptation concerns the status of the ongoing activities. If activity durations are PH distributed, we not only have to keep track of the ongoing activities themselves, but we also need to keep track of their ongoing phase. Let \((I,\mathcal {O})\) denote the state of the system, where \(\mathcal {O}\) is the set of ongoing phases. Upon completion of a phase \(\theta _{i,y} \in \mathcal {O}\)

1. Activity \(i\) completes with probability \(\pi _{i,y,0}\) (i.e., the probability of being absorbed when departing from phase \(\theta _{i,y}\)), and a transition is made toward state \(\left( I,\mathcal {O} \setminus \left\{ \theta _{i,y} \right\} \right) \).
2. Activity \(i\) does not complete, phase \(\theta _{i,z}\) is started with probability \(\pi _{i,y,z}\), and a transition is made toward state \((I,\mathcal {O} \cup \{ \theta _{i,z} \} \setminus \{ \theta _{i,y} \})\).

Note that \(F = V \setminus \left( I \cup O \right) \), and \(O\) can be obtained from \(\mathcal {O}\) as follows

$$\begin{aligned} O = \left\{ i \mid \mathcal {Z}_{i} \cap \mathcal {O} \ne \emptyset \right\} , \end{aligned}$$
(19)

where \(\mathcal {Z}_{i} = \left\{ \theta _{i,1}, \theta _{i,2}, \ldots , \theta _{i,Z_{i}} \right\} \).

Let \(H_{I,\mathcal {O}}\) denote the set of activities that are eligible to start upon entry of state \((I,\mathcal {O})\) (its definition is analogous to that of \(H_{I,O}\)). If no activities are started, the time until completion of the project upon entry of state \((I,\mathcal {O})\) equals

$$\begin{aligned} \begin{aligned} D \left( I,\mathcal {O}, \emptyset \right)&= \frac{1}{\lambda _{\mathcal {O}}} +\sum \limits _{\theta _{i,y} \in \mathcal {O}} \frac{\lambda _{i,y}}{\lambda _{\mathcal {O}}} \pi _{i,y,0} G \left( I,\mathcal {O} \setminus \left\{ \theta _{i,y} \right\} \right) \\&\quad + \sum \limits _{\theta _{i,y} \in \mathcal {O}} \frac{\lambda _{i,y}}{\lambda _{\mathcal {O}}} \sum \limits _{z = y\!+1}^{Z_{i}} \pi _{i,y,z} G \left( I,\mathcal {O} \cup \{ \theta _{i,z} \} \setminus \{ \theta _{i,y} \} \right) , \end{aligned} \end{aligned}$$
(20)

where \(\lambda _{\mathcal {O}} = \sum _{ \theta _{j,z} \in \mathcal {O}} \lambda _{j,z}\).
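As an illustration, Eq. (20) can be evaluated as follows, with the ongoing phases stored as (activity, phase) pairs and the value function \(G\) supplied as a callable (e.g., a memoized recursion analogous to the one in Sect. 3.1). The representation of the transition probabilities assumes the acyclic fits of Sect. 3.2, in which a phase is followed either by the next phase or by absorption; all names are illustrative.

```python
def waiting_value(I, O, rates, cont, G):
    """Eq. (20): expected remaining time when no activity is started in (I, O).

    O          : set of ongoing (activity, phase) pairs.
    rates[i][z]: lambda_{i,z};  cont[i][z]: probability of moving on to phase z+1
                 (so 1 - cont[i][z] is the absorption probability pi_{i,z,0}).
    G          : value function taking (I, O) and returning the optimal value.
    """
    total = sum(rates[i][z] for (i, z) in O)                      # lambda_O
    value = 1.0 / total                                           # time to the next phase completion
    for (i, z) in O:
        w = rates[i][z] / total
        p_next = cont[i][z] if z + 1 < len(rates[i]) else 0.0
        value += w * p_next * G(I, O - {(i, z)} | {(i, z + 1)})   # phase z+1 of activity i starts
        value += w * (1.0 - p_next) * G(I, O - {(i, z)})          # activity i completes
    return value
```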

If \(H_{I,\mathcal {O}} \ne \emptyset \), the best activity to start is

$$\begin{aligned} i^{\star } = \underset{i \in H_{I,\mathcal {O}}}{{\text {argmin}}} \left\{ \sum \limits _{z = 1}^{Z_{i}} \tau _{i,z} G \left( I \setminus \{ i \}, \mathcal {O} \cup \{ \theta _{i,z} \} \right) \right\} . \end{aligned}$$
(21)

The optimal objective value upon entry of state \((I,\mathcal {O})\) equals

$$\begin{aligned} G \left( I,\mathcal {O} \right) = \min \left\{ D \left( I,\mathcal {O}, \emptyset \right) , D \left( I,\mathcal {O}, \{ i^{\star } \} \right) \right\} . \end{aligned}$$
(22)

Note that, after the completion of a phase, it is possible to interrupt the execution of an activity and/or to make scheduling decisions. Therefore, using PH-distributed activity durations, it becomes possible to optimize over a class of policies that is far more general than \(\mathcal {P}\). Let \(\mathcal {N}\) denote the class of policies where decisions can be made: (1) at the start of the project, (2) at the completion times of activities, and (3) at the completion times of activity phases. It is clear that \(\mathcal {N}\) dominates \(\mathcal {P}\). In Sect. 5.5, we evaluate the gain in solution quality when we optimize over \(\mathcal {N}\) rather than over \(\mathcal {P}\).

3.4 Why PH distributions?

In most cases, the “true” distribution of the duration of an activity is unknown. Often, it is assumed that the duration of an activity follows a beta, uniform, or Gaussian distribution (see for instance Herroelen and Leus (2005), Bidot et al. (2009), Fu et al. (2012), and Creemers et al. (2014)). These distributions, however, can match only the first two moments of the true duration distribution. PH distributions, on the other hand, can match almost any distribution with arbitrary precision. Therefore, one could argue that PH distributions are a better choice as an approximation of the true duration distribution.

The size of a Markovian PERT network is determined by the number of phases required to model the project activities. Therefore, PH distributions are especially suited to approximate activity durations that have a moderate-to-high level of variability. In fact, any activity duration distribution with an SCV in \(\left[ 0.5,\infty \right) \) can be modeled using a PH distribution of at most two phases. For SCVs smaller than \(0.5\), the number of required phases increases rapidly, making PH distributions less applicable in settings where activity duration variability is small. Solving the SRCPSP, however, mainly makes sense if activity durations exhibit a sufficient degree of variability. If activity durations have a low level of variability, solving the SRCPSP has limited added value, and the RCPSP may serve as an approximation of the SRCPSP.

As is explained in Sect. 3.2, PH distributions are a mixture of exponential distributions. Due to the memoryless property of the exponential distribution, there is no need to keep track of the remaining duration of the ongoing activities (i.e., the remaining duration of a phase \(\theta _{i,z}\) is exponentially distributed with rate parameter \(\lambda _{i,z}\) for every moment in time at which phase \(\theta _{i,z}\) is ongoing). As a result, the state of the system is fully defined by the set of idle activities and the set of ongoing activity phases. This compact representation of the state space allows us to solve problem instances which are beyond reach for other optimal solution methods.

If activity durations are PH distributed, it becomes possible to interrupt the execution of an activity and/or to make scheduling decisions at the completion of a phase rather than only at the completion of an activity. This enables the study of more complex scheduling problems and also allows optimization over a class of policies that is more general than the class of elementary policies \(\mathcal {P}\).

4 Example

Any project network with stochastic activity durations can be transformed into a Markovian PERT network. In order to illustrate this transformation process, we consider a project that consists of five activities (i.e., \(V = \left\{ 0,1,2,3,4 \right\} \), where 0 and 4 are dummy activities). The data of the project are summarized in Table 1. The project network and its transformation are presented in Fig. 2. Activity \(1\) has a duration SCV of \(1/3\) and can be modeled as a series of three phases that have exponentially distributed durations (i.e., a hypo-exponential distribution is used). Activity \(2\) has a duration SCV of 1 and can be modeled as a single phase (i.e., the duration of activity \(2\) is approximated by an exponential distribution). If the activity duration SCV is larger than 1, a two-phase Coxian distribution is used. Activity \(3\), for example, has a duration SCV of \(2\) and can be modeled as a series of two phases, where the second phase is executed with probability \(\pi _{3,1,2}\). Note that this implies that not all phases have to be executed in order to complete the project (i.e., there is a probability \(\pi _{3,1,0}\) of not having to execute the second phase of activity \(3\)).
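With a two-moment fit such as the one sketched in Sect. 3.2, the phase counts of the example follow directly from the SCVs. In the snippet below, the means are placeholders standing in for the values of Table 1 (which are not repeated here); only the resulting phase counts matter.

```python
# Illustrative only: the means are placeholders for the Table 1 values.
for name, mu, scv in [("activity 1", 6.0, 1.0 / 3.0),   # hypo-exponential, 3 phases
                      ("activity 2", 6.0, 1.0),         # exponential, 1 phase
                      ("activity 3", 6.0, 2.0)]:        # two-phase Coxian, 2 phases
    tau, Z = fit_ph(mu, scv)                            # fit_ph from the Sect. 3.2 sketch
    print(name, "->", len(tau), "phase(s)")
```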

Table 1 Data for the example project
Fig. 2 Example project network and its corresponding Markovian PERT network

Next, we use the example project to illustrate the importance of stochastic activity durations when solving the RCPSP. In the example, we assume that there is a single resource type (i.e., \(K=1\)) that has an availability of 10 resource units (i.e., \(R_{1}=10\)). The non-dummy activities each require \(5\) resource units (i.e., \(r_{1,1}=r_{2,1}=r_{3,1}=5\)). If activity durations are deterministic, the optimal policy is to start activities \(2\) and \(3\), and to start activity \(1\) after completion of either activity \(2\) or \(3\). We refer to this policy as \(\varPi _{1}\). Adopting policy \(\varPi _{1}\) results in a project makespan of 18 time units if activity durations are deterministic. Policy \(\varPi _{2}\), on the other hand, starts activities \(1\) and \(2\), and starts activity \(3\) after completion of either activity \(1\) or \(2\). If activity durations are deterministic, policy \(\varPi _{2}\) corresponds to a makespan of 19 time units. Figure 3 illustrates both policies. If activity durations are not deterministic, however, policy \(\varPi _{2}\) may outperform policy \(\varPi _{1}\). This is illustrated in Fig. 4, which shows the expected makespan of the project as a function of the variability of the duration of activity \(3\). It is clear that, as the variability of the duration of activity \(3\) increases, policy \(\varPi _{2}\) becomes more effective when compared to policy \(\varPi _{1}\). For a duration SCV larger than 4.87 (i.e., for \(\nu _{3} > 4.87\)), the expected makespan under policy \(\varPi _{2}\) is smaller than that under policy \(\varPi _{1}\).

Fig. 3 Illustration of policy \(\varPi _{1}\) and \(\varPi _{2}\) if activity durations are deterministic

Fig. 4 Project makespan of policy \(\varPi _{1}\) and \(\varPi _{2}\) as a function of the SCV of the duration of activity \(3\)

5 Results

In this section, we assess the difference in performance between our approach and the approach of Creemers et al. (2010) (Sect. 5.1). Next, we discuss the different problem sets that are used in the literature (Sect. 5.2), and we compare the computational performance of our model and other optimal approaches (Sect. 5.3). We also evaluate the optimality gap of the existing heuristic procedures (Sect. 5.4), and investigate the impact of making decisions also during the execution of an activity (Sect. 5.5).

All our experiments are performed on an AMD Phenom II 3.2 GHz computer with 32,768 MB RAM.

5.1 Improving the model of Creemers et al. (2010)

Creemers et al. (2010) use a backward SDP recursion to find the maximum eNPV of a project where activity durations are exponentially distributed. In this article, we modify this SDP recursion to accommodate (1) resource constraints, (2) PH-distributed activity durations, and (3) a minimum-makespan objective. In addition to these structural changes, we also propose a modification that drastically reduces the number of decisions that have to be evaluated upon entry of a state (see Sect. 3 for more details). In this section, we assess the impact of this modification on the computational performance of the backward SDP recursion. For this purpose, we replicate the study of Creemers et al. (2010).

In their study, Creemers et al. (2010) generate 30 scheduling instances for each combination of order strength (OS) and network size. The OS is either \(0.4\), \(0.6\), or \(0.8\). The size of the network ranges from \(10\) to \(120\) activities. In total, \(1080\) instances are generated. We analyze these instances using (1) the SDP recursion of Creemers et al. (2010) and (2) the modified SDP recursion that is introduced in Sect. 3.3. The results are presented in Table 2. For each combination of parameters, Table 2 aggregates (1) the number of instances that were solved to optimality, (2) the CPU time of the “old” approach, and (3) the CPU time of the “new” approach. It is clear that the modified approach drastically improves computation times. In fact, on average, the computational efficiency has been improved by a factor of 56.49.

Table 2 Comparison of the computational results when using the SDP recursion of Creemers et al. (2010) and the modified SDP recursion

5.2 Datasets used in the literature on the SRCPSP

Various datasets are available in the literature. Tsai and Gemmill (1998), Ballestín and Leus (2009), and Ashtiani et al. (2011) assess the performance of their procedures using the instances of the Patterson dataset (Patterson 1984). Stork (2001) evaluates his branch-and-bound algorithms on the J30 and J60 instances of the well-known PSPLIB dataset (Kolisch and Sprecher 1996). Ballestín and Leus (2009) and Ashtiani et al. (2011) use the J120 instances of the same dataset. Golenko-Ginzburg and Gonik (1997) use a single instance with 36 activities to evaluate their two heuristics. The same problem instance is also used in Ballestín and Leus (2009) and Ashtiani et al. (2011). In our experiments, we use the problem instances of the Patterson dataset and the J30 and J60 instances of the PSPLIB dataset. We do not use the J120 instances of the PSPLIB dataset because they cannot be solved to optimality. We also do not use the example project presented in Golenko-Ginzburg and Gonik (1997) because its activities have a very limited duration variability (e.g., when the uniform distribution is used to model activity durations, the average duration SCV equals \(0.014\)).

5.3 Computational performance and comparison with optimal procedures

In this section, we discuss the computational performance of our model and compare with other optimal approaches available in the literature. In a first experiment, we assume that activity durations are exponentially distributed and solve the instances of the Patterson dataset and the J30 and J60 instances of the PSPLIB dataset. Table 3 summarizes the results (the state-space sizes are expressed in thousands of states), and Fig. 5 presents a box plot of the computation times. It is clear that project networks of up to \(32\) activities are analyzed with ease. The results also show that networks of \(62\) activities can often be solved (we solve 301 out of 480 networks), albeit at a larger computational cost.

Table 3 Computational results if activity durations are exponentially distributed (state-space sizes are expressed in thousands of states)
Fig. 5 Computational performance if activity durations are exponentially distributed

Next, we use the J30 instances of the PSPLIB dataset to analyze the impact of different levels of activity duration variability on the performance of our model. Table 4 summarizes the results (the state-space sizes are expressed in thousands of states). The level of activity duration variability determines the number of required phases. For values of SCV larger than \(0.5\), one or two phases suffice. If, however, the SCV of the activity durations is smaller than \(0.5\), the number of required phases increases rapidly. As a result, the size of the state space increases exponentially and computational performance plummets. For moderate-to-high levels of activity duration variability, however, the computational effort is acceptable.

Table 4 Computational results for different values of SCV when solving the J30 instances of the PSPLIB dataset (state-space sizes are expressed in thousands of states)

The main bottleneck of our approach is memory rather than CPU time. For large networks and/or low levels of activity duration variability, the state space becomes too big to store in memory. As a result, our model is mainly suited for small- to medium-sized projects where activity durations have a moderate-to-high level of variability. For this setting, our approach outperforms the current state-of-the-art.

The literature on optimal solution methods for the SRCPSP is rather scarce. With respect to the Patterson dataset, Tsai and Gemmill (1998) are able to solve \(95\) out of \(110\) instances to optimality if activity durations are deterministic. If activity durations are stochastic, optimality cannot be guaranteed. With respect to the J30 and J60 instances of the PSPLIB dataset, Stork (2001) is able to optimally solve \(179\) and \(11\) out of 480 instances, respectively. It is clear that our model performs better.

5.4 Comparison with heuristic procedures

In this section, we assess the optimality gap of the heuristic approaches that are available in the literature. The models of Ballestín and Leus (2009) and Ashtiani et al. (2011) are the current state-of-the-art when it comes to heuristically solving the SRCPSP. Both studies report results on the J120 dataset. Currently, however, it is impossible to optimally solve the instances of that dataset (i.e., the optimality gap cannot be evaluated). If activity durations are exponentially distributed, our model can optimally solve 480 and 301 instances of the J30 and J60 datasets, respectively (see Sect. 5.3 for more details). For these instances, we can measure the optimality gap if we also have the solutions of the heuristic approaches. Unfortunately, the solutions of Ashtiani et al. (2011) are not available. We are able, however, to compare with the solutions of Ballestín and Leus (2009).

We assess the optimality gap of the GRASP method of Ballestín and Leus (2009) with a limit of 25,000 schedules. The results are reported in Table 5, which presents the minimum, average, and maximum difference between the optimal makespan and the makespan produced by the GRASP procedure. In addition, Fig. 6 presents a boxplot of the optimality gap. We find that our model improves solution quality by \(9.11~\%\) on average for the J30 instances, and by \(15.88~\%\) on average for the J60 instances. It is clear that the optimality gap increases with increasing network size. Therefore, we expect that the gap for the J120 instances is even larger. In addition, the minimum optimality gap increases rather drastically, indicating that it becomes more difficult for the heuristic approaches to approximate the optimal solution as the size of the network increases.

Table 5 Optimality gap of the GRASP method if 25,000 schedules are used and activity durations are exponentially distributed
Fig. 6 Optimality gap of the GRASP method if 25,000 schedules are used and activity durations are exponentially distributed

5.5 Value of non-elementary policies

All solution methods in the literature on the SRCPSP allow decisions to be taken only at the start of the project and at the completion times of activities (i.e., they optimize over the class of elementary policies \(\mathcal {P}\)). In this section, we investigate the difference in solution quality if we allow decisions to be taken also at the completion of an activity phase (i.e., we observe the impact of optimizing over the class of non-elementary policies \(\mathcal {N}\)).

For this experiment, we use the 110 instances of the Patterson dataset. The average SCV of the activity durations ranges from \(0.1\) to \(2.0\). Depending on the SCV, the number of phases differs, and hence, the number of decision moments during the execution of an activity differs as well. Table 6 presents the difference in solution quality when optimizing over \(\mathcal {N}\) rather than over \(\mathcal {P}\). It is clear that the difference is minimal (at most, the difference amounts to 0.55 %). This, however, is not really surprising as it is often optimal to start activities as soon as possible if makespan is to be minimized (i.e., there is limited value in postponing the start of an activity). We do observe, however, that the difference in solution quality grows larger if activity duration variability increases (for a constant number of phases/decision moments). This indicates that it is more beneficial to have decision moments during the execution of an activity if the duration of that activity is more variable.

Table 6 Percentage difference in solution quality for policy classes \(\mathcal {P}\) and \(\mathcal {N}\) for different values of SCV

6 Conclusions

In this article, we have presented an exact and analytic solution method for optimally solving the SRCPSP. Our model extends the SDP recursion of Creemers et al. (2010) and accommodates (1) resource constraints, (2) PH-distributed activity durations, and (3) a minimum-makespan objective. In addition to these structural improvements, we also improve the computational efficiency of the SDP recursion by a factor of 56.49 on average.

Experiments have shown that our model performs best when activity durations have a moderate-to-high level of variability, and that it can be used to optimally solve project instances that have up to \(62\) activities. For this setting, our model outperforms the existing state-of-the-art.

We have also used our model to assess the optimality gap of the heuristic approaches available in the literature. We show that our model improves the solution quality of the GRASP procedure of Ballestín and Leus (2009) by \(9.11\) and \(15.88~\%\) on average for instances that have \(32\) and \(62\) activities, respectively. This indicates that it becomes more difficult for the heuristic approaches to approximate the optimal solution as the size of the network increases.

In addition, we have investigated the difference in solution quality if we allow scheduling decisions to be made at the end of an activity phase rather than only at the end of an activity. An experiment has shown that the difference in solution quality is minimal (i.e., there is limited value in postponing the start of an activity). The experiment also shows that it is more beneficial to have decision moments during the execution of an activity if the duration of that activity is more variable.

Finally, we have also illustrated that variability in activity durations is an important factor when solving the RCPSP. As such, it might not be advisable to assume that activity durations are deterministic when making project scheduling decisions.