1 Introduction

Scheduling problems arise in production planning (Sule 2007), in balancing processes (Shirazi et al. 1995), in telecommunication (Nemeth et al. 1997), and more generally in all situations in which scarce resources are to be allocated to jobs over time (Pinedo 2008). Depending on the application, the corresponding scheduling problem can be such that each job must be processed within a given time window, where the lower bound (release date or ready time) of this time window represents the earliest start of the execution of the job and the upper bound (deadline) corresponds with the latest acceptable completion time, for instance, the ultimate delivery time agreed upon with the customer (Pan and Shi 2005; Gordon et al. 1997; Xu and Parnas 1990). For some of these applications, only release dates or only deadlines are considered (Jouglet et al. 2004; Tanaka and Fujikuma 2012; Pan 2003; Posner 1985). In practice, a job often also needs to be processed before or after other jobs, e.g., due to tool or fixture restrictions or for other case-dependent technological reasons, which leads to precedence constraints (Potts 1985; Lawler 1978; Tanaka and Sato 2013). Finally, the contract with a client can also contain clauses that stipulate that penalties must be paid when the execution of a job is not completed before a reference date (due date) (Ibaraki and Nakamura 1994; Abdul-Razaq and Potts 1988; Jouglet et al. 2004; Tanaka and Fujikuma 2012; Dyer and Wolsey 1990; Talla Nobibon 2011).

In this article, we develop exact algorithms for a single-machine scheduling problem with total weighted tardiness (TWT) penalties. In the standard three-field notation introduced by Graham et al. (1979), the problem that we tackle can be denoted as \(1|r_j,\delta _j, \mathrm{prec}| \sum {w_j T_j}\): the execution of each job is constrained to take place within a time window, and we assume the corresponding deadline to be greater than or equal to a due date, which is the reference for computing the tardiness of the job. The scheduling decisions are also subject to precedence constraints. Below, we briefly summarize the state of the art.

Abdul-Razaq et al. (1990) survey different branch-and-bound (B&B) algorithms for \(1||\sum {w_j T_j}\). A benchmark algorithm is the B&B procedure of Potts and van Wassenhove (1985); an older reference is Held and Karp (1962), who present a dynamic programming (DP) approach. Abdul-Razaq and Potts (1988) introduce a DP-based approach to obtain tight lower bounds for the generalized version of the problem where the cost function is piecewise linear. They examine their lower bounds in a B&B algorithm and solve small instances (with at most 25 jobs) to optimality. Ibaraki and Nakamura (1994) extend their work and construct an exact method, called Successive Sublimation Dynamic Programming (SSDP), which solves medium-sized instances (with up to 50 jobs). Tanaka et al. (2009) improve the SSDP of Ibaraki and Nakamura (1994) and succeed in solving reasonably large instances (with up to 300 jobs) of \(1||\sum {w_j T_j}\) within acceptable runtimes.

Single-machine scheduling for TWT with (possibly unequal) release dates (\(1|r_j| \sum {w_j T_j}\)) has also been studied by several authors. Akturk and Ozdemir (2000, 2001) and Jouglet et al. (2004) develop B&B algorithms that solve small instances. van den Akker et al. (2010) propose a time-indexed formulation and a method based on column generation to solve this problem with identical processing times. Tanaka and Fujikuma (2012) present an SSDP algorithm that can solve instances of \(1|r_j| \sum {w_j T_j}\) with up to 100 jobs.

There are only a few papers dealing with single-machine scheduling with deadlines and/or precedence constraints. Among these, we cite Posner (1985) and Pan (2003), who propose B&B algorithms for \(1|\delta _j|\sum {w_j C_j}\), Pan and Shi (2005), who develop a B&B algorithm to solve \(1|r_j,\delta _j| \sum {w_j C_j}\), Lawler (1978) and Potts (1985), who present B&B algorithms to solve \(1|\mathrm{prec}| \sum {w_j C_j}\), and Tang et al. (2007), who propose a hybrid backward and forward dynamic programming-based Lagrangian relaxation to compute upper and lower bounds for \(1|\mathrm{prec}|\sum {w_j T_j}\). Tanaka and Sato (2013) also propose an SSDP algorithm to solve a generalization of \(1|\mathrm{prec}|\sum {w_j T_j}\) (with a piecewise linear cost function). To the best of our knowledge, scheduling problems with release dates, deadlines, and precedence constraints have not yet been studied in the literature. The goal of this paper is to fill this gap and to propose efficient B&B algorithms that solve all the foregoing subproblems within limited computation times.

The remainder of this paper is structured as follows. In Sect. 2 we provide some definitions and a formal problem statement, while Sect. 3 proposes two different integer programming formulations. In Sect. 4 we explain the branching strategies for our B&B algorithms, while the lower bounds, the dominance rules, and the initial upper bound are discussed in Sects. 5, 6, and 7, respectively. Computational results are reported and discussed in Sect. 8. We provide a summary and conclusions in Sect. 9.

2 Problem description

The jobs to be scheduled are gathered in set \(N = \{1,2,\ldots ,n\}\). Job \(i\) is characterized by a processing time \(p_i\), a release date \(r_i\), a due date \(d_i\), a deadline \(\delta _i\), and a weight \(w_i\), which represents the cost per unit time of delay beyond \(d_i\). Jobs can be processed neither before their release dates nor after their deadlines (\(0 \le r_i \le \delta _i\)). Precedence constraints are represented by a graph \(G=(N^{\prime },A)\), where \(N^{\prime } = N\cup \{0,n+1\}\), with \(0\) a dummy start job and \(n+1\) a dummy end job. Each arc \((i,j) \in A\) implies that job \(i\) must be executed before job \(j\) (job \(i\) is a predecessor of job \(j\)). We will assume that \(G(N^{\prime },A)\) is its own transitive reduction, that is, no transitive arcs are included in \(A\). Let \(\mathcal {P}_i\) be the set of all predecessors of job \(i\) in \(A\) (\(\mathcal {P}_i = \{k|(k,i) \in A\}\)) and \(\mathcal {Q}_i\) the set of successors of job \(i\) (\(\mathcal {Q}_i = \{k|(i,k) \in A\}\)). We also define an associated graph \(\hat{G}=(N^{\prime },\hat{A})\) as the transitive closure of \(G\). We assume that \(\mathcal {P}_0 = \mathcal {Q}_{n+1} = \emptyset \), and that in \(\hat{G}\) every job in \(N\) is a successor of \(0\) and a predecessor of \(n+1\).
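The transitive closure \(\hat{G}\) and the successor sets can be obtained with a single depth-first pass over \(G\). The following Python sketch (the data layout is our own illustration, not from the paper) computes the set of transitive successors of every node, using the arc set of Fig. 1:

```python
from collections import defaultdict

def transitive_closure(nodes, arcs):
    """Set of transitive successors of every node of a DAG, via depth-first search."""
    succ = defaultdict(set)            # direct successors Q_i
    for i, j in arcs:
        succ[i].add(j)

    closure = {}
    def visit(v):
        if v not in closure:
            closure[v] = set()
            for w in succ[v]:
                visit(w)
                closure[v] |= {w} | closure[w]
        return closure[v]

    for v in nodes:
        visit(v)
    return closure

# Arc set of Fig. 1 (n = 4; 0 and 5 are the dummy start and end jobs):
A = [(0, 1), (0, 4), (1, 2), (2, 3), (3, 5), (4, 5)]
closure = transitive_closure(range(6), A)
```

Here, for instance, `closure[1]` contains jobs 2 and 3 and the dummy end job 5; the out-degrees in \(\hat{G}\) computed this way are reused by the branching strategies of Sect. 4.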

Throughout this paper, we use the term ‘sequencing’ to refer to ordering the jobs (establishing a permutation), whereas ‘scheduling’ means that start (or end) times are determined. We denote by \(\pi \) an arbitrary sequence of jobs, where \(\pi _k\) represents the job at the \(k\)th position in that sequence. Let \(\pi ^{-1}(i)\) be the position of job \(i\) in \(\pi \); we only consider sequences for which \(\pi ^{-1}(i) < \pi ^{-1}(j)\) for all \((i,j) \in A\). Value \(C_i\) is the completion time of job \(i\). Each sequence \(\pi \) implies a schedule, as follows:

$$\begin{aligned} C_{\pi _i}= {\left\{ \begin{array}{ll} \max \{r_{\pi _i},C_{\pi _{i-1}}\} + p_{\pi _i} &{} \text{ if } i > 1\\ r_{\pi _i} + p_{\pi _i} &{} \text{ if } i = 1.\end{array}\right. } \end{aligned}$$

Equivalently, the completion time of job \(i\) according to sequence \(\pi \) can also be written as \(C_i(\pi )\). We denote by \(\mathcal {D}\) the set of all feasible permutations, where a permutation \(\pi \) is feasible (\(\pi \in \mathcal {D}\)) if and only if it generates a feasible schedule, which means that

$$\begin{aligned} r_{\pi _i}+p_{\pi _i} \le C_{\pi _i} \le \delta _{\pi _i} \qquad \qquad \forall i \in N. \end{aligned}$$

Note that the set \(\mathcal {D}\) may be empty.

The weighted tardiness associated with the job at the \(i\)th position in the sequence \(\pi \) is given by \(W(\pi _{i}) = w_{\pi _i}\left( C_{\pi _i}-d_{\pi _i}\right) ^+\), where \(x^+ = \max \left\{ 0,x\right\} \). A conceptual formulation of the problem P studied in this paper is the following:

$$\begin{aligned} \mathrm {P}: \underset{\pi \in \mathcal {D}}{\min } \mathrm {TWT}(\pi ) = \sum \limits _{i = 1}^{n}{W(\pi _{i})}. \end{aligned}$$
(1)

This problem is at least as hard as \(1||\sum w_j T_j\), which is known to be strongly NP-hard (Lawler 1977; Lenstra et al. 1977; Pinedo 2008). A stronger result is that the mere verification of the existence of a feasible schedule that respects a set of ready times and deadlines is already NP-complete (problem SS1, p. 236, Garey and Johnson 1979); we do not, however, incorporate the feasibility check as a formal part of the problem statement.
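To make the preceding definitions concrete, the following Python sketch builds the schedule implied by a sequence via the recursion above, checks feasibility against the deadlines, and evaluates objective (1). The instance data is hypothetical (it is not the data of Table 1):

```python
def schedule(pi, r, p):
    """Completion times implied by sequence pi, per the recursion in Sect. 2."""
    C, prev_end = {}, 0
    for job in pi:
        C[job] = max(r[job], prev_end) + p[job]
        prev_end = C[job]
    return C

def feasible(pi, r, p, delta):
    """A sequence is feasible iff no job completes after its deadline."""
    C = schedule(pi, r, p)
    return all(C[j] <= delta[j] for j in pi)

def twt(pi, r, p, d, w):
    """Objective (1): total weighted tardiness of the implied schedule."""
    C = schedule(pi, r, p)
    return sum(w[j] * max(0, C[j] - d[j]) for j in pi)

# Hypothetical 3-job instance:
r = {1: 0, 2: 2, 3: 0}
p = {1: 3, 2: 2, 3: 4}
d = {1: 3, 2: 5, 3: 6}
delta = {1: 10, 2: 10, 3: 10}
w = {1: 1, 2: 2, 3: 3}
C = schedule((1, 2, 3), r, p)   # C = {1: 3, 2: 5, 3: 9}
```

Note that the `max` in the recursion inserts idle time whenever the next job's release date exceeds the previous completion time.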

Example 1

Consider the following instance of \(\mathrm {P}\) with \(n=4\) jobs. The processing times, release dates, due dates, deadlines, and weights of the jobs are given in Table 1. The graph representing the precedence constraints is depicted in Fig. 1, with arc set \(A=\{(0,1), (0,4),(1,2),(2,3),(3,5),(4,5)\}\).

An optimal solution to this instance is \(\pi = (4,1,2,3)\), which leads to the schedule \(C_1 = 6, C_2=9, C_3 = 13\), and \(C_4 = 4\). The objective value is \(w_4 \times 0 + w_1 \times 0 + w_2 \times 0 + w_3 \times (13-8) = 3 \times 5= 15\).

Fig. 1 Precedence graph \(G(N^{\prime },A)\)

Table 1 Job characteristics

3 Mathematical formulations

The conceptual formulation for P presented in the previous section is not linear; therefore, it cannot be used by a standard (linear) mixed-integer programming (MIP) solver. In this section, we propose an assignment formulation (ASF) and a time-indexed formulation (TIF) for the problem. These formulations are adaptations of those presented in Keha et al. (2009) and Talla Nobibon (2011).

3.1 Assignment formulation

We use binary decision variables \(x_{is} \in \{0,1\} (i\in N, s \in \{1,2,\ldots ,n\})\), which identify the position of jobs in the sequence so that \(x_{is}\) is equal to 1 if job \(i\) is the \(s\)th job processed and equal to 0 otherwise. In other words, \(x_{is} = 1\) if and only if \(\pi _s=i\). We also use additional continuous variables \(T_{i} \ge 0\) representing the tardiness of job \(i \in N\) and continuous variables \(\tau _s \ge 0\) representing the machine idle time immediately before the execution of the \(s\)th job. The MIP formulation is given by

$$\begin{aligned}&\mathrm {ASF} : \quad \min \quad \sum \limits _{i = 1}^{n}{w_i T_{i}} \end{aligned}$$
(2)
$$\begin{aligned}&\text {subject to} \nonumber \\&\sum \limits _{s = 1}^{n}{x_{is}} = 1 \quad \forall i \in N \end{aligned}$$
(3)
$$\begin{aligned}&\sum \limits _{i = 1}^{n}{x_{is}} = 1 \quad \forall s \in \{1,2,\ldots ,n\} \end{aligned}$$
(4)
$$\begin{aligned}&\sum \limits _{s = 1}^{n}{ x_{is} s} \le \sum \limits _{t = 1}^{n}{ x_{jt} t} - 1 \quad \forall (i,j) \in A \end{aligned}$$
(5)
$$\begin{aligned}&\tau _s \ge \sum \limits _{i = 1}^{n}{x_{is} r_i} - \sum \limits _{t = 1}^{s-1}{\left( \sum \limits _{i = 1}^{n}{(x_{it} p_i)} + \tau _t\right) } \quad \forall s \in N \end{aligned}$$
(6)
$$\begin{aligned}&\sum \limits _{t = 1}^{s}{\tau _t} + \sum \limits _{t=1}^{s-1}{\sum \limits _{i=1}^{n}{p_i x_{it}}} + \sum \limits _{i=1}^{n}{\left( (p_i-\delta _i) x_{is}\right) } \le 0 \nonumber \\&\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \forall s \in N \end{aligned}$$
(7)
$$\begin{aligned}&T_{i} \ge \sum \limits _{t = 1}^{s}{\tau _t} + \sum \limits _{t=1}^{s-1}{\sum \limits _{j=1}^{n}{p_j x_{jt}}} + p_i - d_i - (1-x_{is})M_i \nonumber \\&\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \forall i \in N, s \in N \end{aligned}$$
(8)
$$\begin{aligned}&x_{is} \in \{0,1\}, \tau _s ,T_{i} \ge 0 \quad \forall i \in N, s\in \{1,2,\ldots ,n\} \end{aligned}$$
(9)

The objective function (2) is a reformulation of (1). The set of constraints (3) ensures that all jobs are executed. Constraints (4) check that each position in the sequence is occupied by exactly one job. The set of constraints (5) enforces the precedence restrictions. The set of equations (6) computes the idle time of the machine between the jobs in positions \(s-1\) and \(s\), and ensures that each job is not started before its release date. In this set of constraints, \(\sum _{t = 1}^{s-1}{\left( \sum _{i = 1}^{n}{(x_{it} p_i)} + \tau _t\right) }\) equals the completion time of the \((s-1)\)th job. Constraints (7) ensure that each job is not completed after its deadline, where \(\sum _{t = 1}^{s}{\tau _t} + \sum _{t=1}^{s-1}{\sum _{i=1}^{n}{p_i x_{it}}}\) is the start time of the \(s\)th job. Constraints (8) compute the correct value of the tardiness of job \(i\), with \(M_i = \delta _i-d_i\) the maximum tardiness of job \(i\).

A variant of ASF is obtained by replacing the set of constraints (5) by the following:

$$\begin{aligned} \sum \limits _{s = v}^{n}{ x_{is}} + \sum \limits _{s = 1}^{v}{ x_{js}} \le 1 \quad \forall (i,j) \in A, \forall v \in \{1,\dots ,n\}. \end{aligned}$$
(10)

We refer to this alternative formulation as \(\mathrm {ASF}^{\prime }\). We have the following result:

Lemma 1

\(\mathrm {ASF}^{\prime }\) is stronger than \(\mathrm {ASF}\).

All proofs are included in the Appendix. The number of constraints in (10) is much higher than in (5). As a result, the additional computational effort needed to process this higher number of constraints might offset the improvement of a stronger bound, and we will empirically compare the performance of the two variants in Sect. 8.4.
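For integral (permutation) solutions, the two constraint sets coincide: a permutation satisfies (5) if and only if it satisfies (10); Lemma 1 concerns the strength of the LP relaxations. The following brute-force check (our own illustration, on a hypothetical 4-job instance with precedence chain \(1\rightarrow 2\rightarrow 3\)) verifies this equivalence over all permutations:

```python
from itertools import permutations

def sat5(pos, arcs):
    # constraints (5): the position of i must be smaller than the position of j
    return all(pos[i] <= pos[j] - 1 for i, j in arcs)

def sat10(pos, arcs, n):
    # constraints (10): for each arc (i, j) and each v, job i in a position
    # >= v and job j in a position <= v cannot hold simultaneously
    return all(not (pos[i] >= v and pos[j] <= v)
               for i, j in arcs for v in range(1, n + 1))

# Hypothetical 4-job instance with precedence chain 1 -> 2 -> 3:
n, arcs = 4, [(1, 2), (2, 3)]
for perm in permutations(range(1, n + 1)):
    pos = {job: s + 1 for s, job in enumerate(perm)}
    assert sat5(pos, arcs) == sat10(pos, arcs, n)
```

The strengthening of \(\mathrm {ASF}^{\prime }\) thus appears only in fractional solutions of the LP relaxation, at the price of the larger constraint count discussed above.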

3.2 Time-indexed formulation

Let \(T_S\) (respectively \(T_E\)) be a lower (respectively upper) bound on the time the execution of any job can be completed; we compute these values as \(T_S = \min \{r_i+p_i|i \in N \}\) and \(T_E = \max \{\delta _i|i \in N \}\).

The time-indexed formulation uses binary decision variables \(x_{it} \in \{0,1\}\), for \(i \in N\) and \(T_S \le t \le T_E\), where \(x_{it} = 1\) if job \(i\) is completed (exactly) at time \(t\) and \(x_{it} = 0\) otherwise. We also introduce the set of parameters \(T_{it}=(t-d_i)^+\), representing the tardiness of job \(i\) when it finishes at time \(t\). The time-indexed formulation is given by

$$\begin{aligned}&\mathrm {TIF} : \quad \min \quad \sum \limits _{i = 1}^{n}{\sum \limits _{t = r_i+p_i}^{\delta _i}{w_i T_{it} x_{it}}} \end{aligned}$$
(11)
$$\begin{aligned}&\text {subject to} \nonumber \\&\sum \limits _{i = 1}^{n}{\sum \limits _{s = \max \{t,r_i+p_i\}}^{\min \{\delta _i,t+p_i-1\}}{x_{is}}} \le 1 \quad \quad \quad \forall t, T_S \le t \le T_E \end{aligned}$$
(12)
$$\begin{aligned}&\sum \limits _{t = r_i+p_i}^{\delta _i}{x_{it}} = 1 \quad \quad \quad \quad \quad \quad \quad \quad \forall i \in N \end{aligned}$$
(13)
$$\begin{aligned}&\sum \limits _{s = r_i+p_i}^{\delta _i}{x_{is} s} \le \sum \limits _{t = r_j+p_j}^{\delta _j}{x_{jt} t} - p_j \quad \forall (i,j) \in A \end{aligned}$$
(14)
$$\begin{aligned}&x_{it} \in \{0,1\} \quad i \in N, r_i+p_i \le t \le \delta _i \end{aligned}$$
(15)

The set of constraints (12) eliminates the parts of the solution space where the jobs overlap. The constraint set (13) ensures that all jobs are scheduled exactly once. We enforce precedence constraints in the formulation using the set of constraints (14).

Similarly as for the assignment formulation, we introduce an alternative formulation \(\mathrm {TIF}^{\prime }\) by replacing the set of constraints (14) by the following:

$$\begin{aligned}&\sum \limits _{s = t}^{\delta _i}{ x_{is}} + \sum \limits _{s = r_j+p_j}^{t-p_i}{ x_{js}} \le 1 \nonumber \\&\forall (i,j) \in A; \forall t: \max \{r_i,r_j+p_j\}+p_i\nonumber \\&\quad \le t \le \min \{\delta _i , \delta _j + p_i\} \end{aligned}$$
(16)

Lemma 2

(From Christofides et al. (1987), Artigues et al. (2007)) \(\mathrm {TIF}^{\prime }\) is stronger than \(\mathrm {TIF}\).

As explained for the assignment formulation, the performance of the new formulation is not necessarily better. In fact, it can be much worse than TIF, since in a time-indexed formulation the number of additional constraints is quite large (pseudo-polynomial).

4 Branching strategies

In this section we discuss two different branching strategies for our B&B algorithms. The structure of the B&B search trees is as follows: each tree consists of a finite number of nodes and branches, and at each level of the tree we make a sequencing decision for one job. Each node thus corresponds with a selection \(S_P\subseteq N\) containing the already scheduled jobs and a set of unscheduled jobs \(U = N \backslash S_P\). Each node also has two feasible partial sequences \(\sigma _B\) and \(\sigma _E\) of the scheduled jobs (each \(i\in S_P\) appears in exactly one of these two): \(\sigma _B\) (respectively \(\sigma _E\)) denotes the partial sequence of jobs scheduled from the beginning (respectively end) of the scheduling horizon; see Fig. 2 for an illustration. All unscheduled jobs belong to the set \(U = E_B \cup E_E \cup E_N\): \(E_B\) is the subset of unscheduled jobs that are eligible to be scheduled immediately after the last job in \(\sigma _B\), \(E_E\) is the subset of unscheduled jobs that are eligible to be scheduled immediately before the first job in \(\sigma _E\), and \(E_N\) is the subset of unscheduled jobs that are not in \(E_B \cup E_E\).

Fig. 2 The structure of a partial schedule

The root node represents an empty schedule (\(S_P = \sigma _B = \sigma _E = \emptyset \)). Each node branches into a number of child nodes, which each correspond with the scheduling of one particular job, called the decision job, as early as possible after the last job in \(\sigma _B\) or as late as possible before the first job in \(\sigma _E\). A branch is called a forward branch if it schedules a job after the last job in \(\sigma _B\), and is called a backward branch if it schedules a job before the first job in \(\sigma _E\). In our branching strategies, there will be either only forward branches or only backward branches emanating from each given node. We will say that a node is of type \(\mathrm {FB}\) (respectively \(\mathrm {BB}\)) if all its branches are forward (respectively backward) branches.

Although scheduling jobs backward (from the end of the time horizon) often improves the tightness of lower bounds when release dates are equal (Potts and van Wassenhove 1985), it probably decreases the quality of the lower bounds in the presence of non-equal release dates; see Sects. 4.2 and 5.3 for a description of backward branching and of the lower bounds, respectively, and Sect. 8.4 for the empirical results and a discussion. Also, the efficiency of some dominance rules may decrease when we switch from forward scheduling to backward scheduling; see Sect. 6.4 for more details. We propose two B&B algorithms, each applying one of the branching strategies: BB1 corresponds with branching strategy 1, where only \(\mathrm {FB}\) nodes are used, and BB2 corresponds with branching strategy 2, where both \(\mathrm {FB}\) and \(\mathrm {BB}\) nodes are created. The bounding and the dominance properties discussed in the following sections are the same in both B&B algorithms.

Let \(C_{\max }(\sigma )\) be the completion time of the last job in the sequence \(\sigma \). Throughout the branching procedure, we maintain two vectors of updated release dates, namely \(\hat{\mathbf {r}} = (\hat{r}_1,\ldots ,\hat{r}_n)\) and \(\bar{\mathbf {r}}= (\bar{r}_1,\ldots ,\bar{r}_n)\), defined as follows:

$$\begin{aligned}&\hat{r}_j = \max \{{r}_j, C_{\max }(\sigma _B)\} \\&\bar{r}_j = \max \left\{ \hat{r}_j, \max _{k\in \mathcal {P}_j} \left\{ \bar{r}_k+p_k\right\} \right\} . \end{aligned}$$

Let \(st(\pi )\) denote the start time of the first job according to sequence \(\pi \). In line with the two vectors of updated release dates, we also introduce two vectors of updated deadlines, namely \(\hat{\mathbf {\delta }} = (\hat{\delta }_1,\ldots ,\hat{\delta }_n)\) and \(\bar{\mathbf {\delta }}= (\bar{\delta }_1,\ldots ,\bar{\delta }_n)\), which are recursively computed as follows:

$$\begin{aligned}&\hat{\delta }_j = \min \{\delta _j,st(\sigma _E)\} \\&\bar{\delta }_j = \min \left\{ \hat{\delta }_j, \min _{k\in \mathcal {Q}_j} \left\{ \bar{\delta }_k-p_k\right\} \right\} . \end{aligned}$$

We use these updated release dates and deadlines in computing lower bounds and dominance rules. In each node of the search tree, \(\bar{\mathbf {r}}\) and \(\bar{\mathbf {\delta }}\) are at least as restrictive as \(\hat{\mathbf {r}}\) and \(\hat{\mathbf {\delta }}\) (\(\bar{r}_j \ge \hat{r}_j\) and \(\bar{\delta }_j \le \hat{\delta }_j\)). Although tighter values are usually preferable, \(\hat{r}_j\) and \(\hat{\delta }_j\) are occasionally preferred over \(\bar{r}_j\) and \(\bar{\delta }_j\), specifically in parts of the computations related to the dominance rules discussed in Sect. 6; further explanation of these occasions is given there. There are also many cases in which \(\bar{r}_j = \hat{r}_j\) (respectively \(\bar{\delta }_j = \hat{\delta }_j\)) and either of the updated release dates (respectively deadlines) can be used; in these cases, we use \(\hat{r}_j\) (respectively \(\hat{\delta }_j\)) because fewer computations are needed.
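The two recursions above can be evaluated with one forward and one backward pass over the jobs in topological order. The following Python sketch mirrors them (the instance data is hypothetical; `pred`/`succ` hold the direct predecessors and successors):

```python
def update_bounds(jobs_topo, r_hat, delta_hat, p, pred, succ):
    """Tighten release dates (forward pass) and deadlines (backward pass)
    along the precedence graph, as in the recursions for r-bar and delta-bar."""
    r_bar, d_bar = dict(r_hat), dict(delta_hat)
    for j in jobs_topo:                    # forward pass: r_bar
        for k in pred.get(j, ()):
            r_bar[j] = max(r_bar[j], r_bar[k] + p[k])
    for j in reversed(jobs_topo):          # backward pass: delta_bar
        for k in succ.get(j, ()):
            d_bar[j] = min(d_bar[j], d_bar[k] - p[k])
    return r_bar, d_bar

# Hypothetical chain 1 -> 2 -> 3:
p = {1: 2, 2: 3, 3: 4}
pred, succ = {2: [1], 3: [2]}, {1: [2], 2: [3]}
r_hat = {1: 0, 2: 0, 3: 0}
delta_hat = {1: 20, 2: 20, 3: 20}
r_bar, d_bar = update_bounds([1, 2, 3], r_hat, delta_hat, p, pred, succ)
```

On this chain, job 3 inherits the release date \(\bar{r}_3 = 5\) from its predecessors, and job 1 inherits the deadline \(\bar{\delta }_1 = 13\) from its successors.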

4.1 Branching strategy 1

Branching strategy 1 only uses \(\mathrm {FB}\) nodes. The search tree is explored depth-first such that among children of a node, those with larger out-degrees (number of transitive successors) of their decision jobs in \(\hat{G}\) are visited first. As a tie-breaking rule, among children with equal out-degrees of their decision jobs, the node with lower index is visited first.

Figure 3 illustrates branching strategy 1 applied to Example 1; an asterisk ‘*’ indicates that the position has not been decided yet. Among the children of the root node, the node \((1,*,*,*)\) corresponds with the decision job (job \(1\)) with the largest out-degree (namely \(3\)). As a result, the node \((1,*,*,*)\) is visited first. The nodes \((2,*,*,*)\) and \((3,*,*,*)\) are not in the tree because they violate precedence constraints. Among the children of \((1,*,*,*)\), the node \((1,2,*,*)\) is visited first because it has the decision job \(2\) with the largest out-degree. Among the children of \((1,2,*,*)\), the node \((1,2,3,*)\) is visited first because its decision job has the largest out-degree and the smallest index. In Fig. 3, green nodes are \(\mathrm {FB}\) nodes; no \(\mathrm {BB}\) nodes are present. Red nodes are considered infeasible because the completion of a job (namely job \(4\)) occurs after its deadline. The node \((1,4,2,3)\) corresponds with a feasible schedule, but it is not optimal: its objective value is greater than \(15\), which is attained by the optimal sequence \((4,1,2,3)\).
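The visiting order of the children of an \(\mathrm {FB}\) node can be expressed as a simple sort. The sketch below (our own illustration) uses the transitive-successor sets of \(\hat{G}\) for Example 1, in which job \(1\) has out-degree \(3\) and job \(4\) has out-degree \(1\):

```python
def child_order(candidates, closure):
    """Visit order for the children of an FB node in branching strategy 1:
    larger out-degree in the transitive closure first; lower index breaks ties."""
    return sorted(candidates, key=lambda j: (-len(closure[j]), j))

# Transitive successors in G-hat for Example 1 (dummy end job 5 included):
closure = {1: {2, 3, 5}, 4: {5}}
order = child_order([1, 4], closure)   # job 1 is visited first
```

Because Python's sort is stable and the index is part of the key, ties in out-degree are broken by the lower job index, as prescribed.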

Fig. 3 Branching strategy 1 for Example 1 without dominance rules and without lower bounds

4.2 Branching strategy 2

In branching strategy 2, we try to exploit the advantages of backward scheduling whenever possible, so the search tree consists of both \(\mathrm {FB}\) and \(\mathrm {BB}\) nodes. If the inequality \(C_{\max }(\sigma _B) < r_{\max }(U) = \max _{j\in U}{\{r_j\}}\) holds, then the start times of the jobs in \(\sigma _E\) will depend on the order in which unscheduled jobs are processed. Therefore, if the inequality \(C_{\max }(\sigma _B) < r_{\max }(U)\) holds, the corresponding node is of type \(\mathrm {FB}\). Otherwise, the completion time of the last job in \(\sigma _E\) can be computed regardless of the sequencing decisions for the jobs in \(U\), and we have a \(\mathrm {BB}\) node. The branching is depth-first for both \(\mathrm {FB}\) and \(\mathrm {BB}\) nodes. Among the children of an \(\mathrm {FB}\) (respectively \(\mathrm {BB}\)) node, those with higher (respectively lower) out-degrees of their decision jobs are visited first. As a tie-breaking rule, among children with equal out-degrees, the node with lower (respectively higher) index is visited first.

Figure 4 illustrates branching strategy 2 for Example 1; green nodes are of type \(\mathrm {FB}\) and blue nodes are of type \(\mathrm {BB}\). The root node is \(\mathrm {FB}\) because \(C_{\max }(\emptyset ) = 0 < 4 = r_{\max }(\{1,2,3,4\})\). At the node labeled \({(1,*,*,*)}\), the completion time \(C_{\max }(1,*,*,*) = 5\) of the decision job surpasses \(r_{\max }(\{1,2,3,4\})=4\); therefore, the end of the scheduling horizon can be computed (\(T_E = 5+3+4+2 = 14\)) and the node is \(\mathrm {BB}\). The red nodes are infeasible because the completion time of job \(4\) falls after its deadline.

Fig. 4 Branching strategy 2 for Example 1 without dominance rules and without lower bounds

5 Lower bounding

In this section we describe the lower bounds that are implemented in our B&B algorithm. Section 5.1 first introduces a conceptual formulation for our problem, Sect. 5.2 describes a very fast lower bounding procedure, and in Sect. 5.3 we describe several lower bounds based on Lagrangian relaxation.

5.1 Another conceptual formulation

Let variable \(C_{j} \ge 0\) denote the completion time of job \(j\in N\) and let variable \(T_{j} \ge 0\) represent the tardiness of job \(j\). An alternative formulation of our problem is given by

$$\begin{aligned}&\mathrm {P}: \quad \min \quad \sum \limits _{j = 1}^{n}{w_j T_{j}} \end{aligned}$$
(17)
$$\begin{aligned}&\text {subject to} \nonumber \\&{T_j \ge C_j -d_j}&{\forall j \in N} \end{aligned}$$
(18)
$$\begin{aligned}&{C_j \ge r_j + p_j}&{\forall j \in N} \end{aligned}$$
(19)
$$\begin{aligned}&{C_j \le \delta _j}&{\forall j \in N} \end{aligned}$$
(20)
$$\begin{aligned}&{C_j \ge C_i + p_j}&{\forall (i,j) \in A } \end{aligned}$$
(21)
$$\begin{aligned}&{C_j \ge C_i + p_j \text { or } C_i \ge C_j + p_i}&{\forall i,j \in N; i<j } \end{aligned}$$
(22)
$$\begin{aligned}&T_j \ge 0&{\forall j \in N} \end{aligned}$$
(23)
$$\begin{aligned}&C_j \ge 0&{\forall j \in N} \end{aligned}$$
(24)

In the above formulation, constraints (18) and (23) reflect the definition of job tardiness. Constraints (19) and (20) enforce time windows. Constraints (21) ensure that each job is scheduled after all its predecessors. Constraints (22) guarantee that jobs do not overlap. We will use this formulation in Sect. 5.3 for producing lower bounds.

To the best of our knowledge, a lower bound procedure specifically for P has to date not been developed in the literature. Lower bounds proposed for \(1||\sum w_j T_j\), \(1|\mathrm{prec}|\sum w_j C_j\), and \(1|r_j|\sum w_j C_j\), however, can also function as lower bounds for P; this is shown in the following theorems, which are extensions of those presented in Akturk and Ozdemir (2000).

Let \(I\) be an instance of \(1|\beta |\sum w_j T_j\). We construct an instance \(I^{\prime }\) of \(1||\sum w_j T_j\) by removing all constraints implied by \(\beta \) and an instance \(I^{{\prime }{\prime }}\) of \(1|\beta |\sum w_j C_j\) by replacing all due dates with zeros. Let \(\mathrm {TWT}^*(I)\) be the optimal objective value of \(I\). Given any valid lower bound \(lb_{I^{\prime }}\) on the optimal value of \(I^{\prime }\), we have

Theorem 1

\(lb_{I^{\prime }} \le \mathrm {TWT}^*(I)\).

A job is called early if it finishes at or before its due date and is said to be tardy if it finishes after its due date. Let \(C_j(S)\) be the completion time of job \(j\) in feasible solution \(S\). For an optimal solution \(S^*\) to \(I\), we partition \(N\) into two subsets: the set \(\mathcal {E}\) of early jobs and the set \(\mathcal {T}\) of tardy jobs. Let \(lb^E\) be a lower bound on the value \(\sum _{j \in \mathcal {E}}{w_j (d_j - C_j(S^*))}\). Given any valid lower bound \(\bar{lb}_{I^{{\prime }{\prime }}}\) on the optimal value of \(I^{{\prime }{\prime }}\), we have

Theorem 2

\(\bar{lb}_{I^{{\prime }{\prime }}} - \sum _{j}{w_j d_j} + lb^E \le \mathrm {TWT}^*(I)\).

In the following, we remove several combinations of constraints in \(\mathrm {P}\) to construct subproblems for which there exist polynomial-time-bounded algorithms for computing lower bounds. These bounds then directly lead to valid lower bounds for \(\mathrm {P}\) via Theorem 1 and Theorem 2.

5.2 A very fast trivial lower bound

Let \(\mathrm {P_{T}}\) be the trivial subproblem of \(\mathrm {P}\) in which constraints (18), (19), (20), and (21) are removed, which is then equivalent to \(1||\sum w_j C_j\). An optimal solution \(S^*\) to \(\mathrm {P_{T}}\) (with optimal value \(\mathrm {OPT}(S^*)\)) follows sequence \(\sigma _{T}\), which orders the jobs according to the shortest weighted processing time (SWPT) rule (Pinedo 2008). By Theorems 1 and 2, \(\mathrm {LB}_\mathrm{T}=\mathrm {OPT}(S^*) - \sum _{j}{w_j d_j} + lb^E\) is a valid lower bound for \(\mathrm {P}\). We compute \(lb^E\) as the sum of the earliness values obtained when each job is scheduled at its latest possible starting time. Note that if \(r_j = d_j = 0\) for all jobs \(j\) and \(\sigma _{T}\) violates neither a deadline nor a precedence constraint, then \(\sigma _{T}\) is optimal for \(\mathrm {P}\) and \(\mathrm {OPT}(S^*)=\mathrm {LB}_\mathrm{T}\). In B&B algorithms, this situation frequently occurs when some jobs have already been scheduled.
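The SWPT component of \(\mathrm {LB}_\mathrm{T}\) can be sketched as follows (the instance data is hypothetical); \(\mathrm {LB}_\mathrm{T}\) is then obtained by subtracting \(\sum _{j}{w_j d_j}\) and adding \(lb^E\) as described above:

```python
def swpt(p, w):
    """Solve 1||sum w_j C_j: sequence jobs by nondecreasing p_j / w_j."""
    order = sorted(p, key=lambda j: p[j] / w[j])
    t, value = 0, 0
    for j in order:
        t += p[j]          # no idle time: jobs run back to back from time 0
        value += w[j] * t
    return order, value

# Hypothetical 3-job instance:
p = {1: 2, 2: 4, 3: 1}
w = {1: 1, 2: 1, 3: 2}
order, opt = swpt(p, w)    # ratios 2.0, 4.0, 0.5 -> job 3 first
```

Both the sort and the single pass over the sequence are cheap, which is why this bound can be evaluated in every node of the search tree.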

5.3 Lagrangian relaxation-based bounds

In this section, we use Lagrangian relaxation for computing various lower bounds. Let \(\mathrm {P}_0\) be the subproblem of \(\mathrm {P}\) in which constraints (19), (20), and (21) are removed. This problem is studied by Potts and van Wassenhove (1985) and is considered as our base problem. Let \(\lambda \) be a vector of Lagrangian multipliers. Potts and van Wassenhove (1985) obtain the following Lagrangian problem associated with \(\mathrm {P}_0\):

$$\begin{aligned} \mathrm {LR}_{\mathrm{P}_0}:\quad&L_0(\lambda ) = \min \sum \limits _{j = 1}^{n}{(w_j - \lambda _j) T_{j}} + \sum \limits _{j = 1}^{n}{\lambda _{j}(C_j - d_j) } \\&\text {subject to constraints } (22)-(24). \end{aligned}$$

Parameter \(\lambda _j\) is the Lagrangian multiplier associated with job \(j\) (\(0 \le \lambda _j \le w_j\)). Potts and Van Wassenhove propose a polynomial-time algorithm to set the multipliers. Their algorithm yields a very good lower bound for \(\mathrm {P}_0\); they compute the optimal values of the multipliers in \(\mathrm {O}(n\,\log n)\) time, and for a given set of multipliers, the bound itself can be computed in linear time. Let \(\lambda _{\mathrm{PV}}\) be the best Lagrangian multipliers computed by Potts and van Wassenhove (1985); we refer to this lower bound as \(\mathrm {LB}_0=L_0(\lambda _{\mathrm{PV}})\). By Theorem 1, \(\mathrm {LB}_0\) is also a valid bound for \(\mathrm {P}\). Quite a number of aspects of the definition of \(\mathrm {P}\) are completely ignored in \(\mathrm {LB}_0\); however, in the following sections, we will examine a number of ways to strengthen \(\mathrm {LB}_0\).
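For fixed multipliers, the evaluation of \(L_0(\lambda )\) reduces to a weighted-completion-time problem: since \(0 \le \lambda _j \le w_j\), each \(T_j\) has a nonnegative coefficient and is set to zero, and minimizing \(\sum \lambda _j C_j\) under the non-overlap constraints is solved by SWPT with weights \(\lambda \). The following sketch evaluates the bound for given multipliers (hypothetical data; the multiplier-setting procedure of Potts and Van Wassenhove is not reproduced here):

```python
def lagrangian_bound(p, d, lam):
    """Evaluate L_0(lambda) for fixed multipliers 0 < lambda_j <= w_j.

    With constraint (18) relaxed, every T_j is set to zero, and the
    remaining term sum lambda_j (C_j - d_j) is minimized by sequencing
    jobs in SWPT order with weights lambda."""
    order = sorted(p, key=lambda j: p[j] / lam[j])
    t, bound = 0, 0
    for j in order:
        t += p[j]
        bound += lam[j] * (t - d[j])
    return bound

# Hypothetical 2-job instance with unit multipliers:
p = {1: 2, 2: 3}
d = {1: 2, 2: 4}
lam = {1: 1, 2: 1}
bound = lagrangian_bound(p, d, lam)
```

For a given \(\lambda \), this evaluation runs in \(\mathrm {O}(n\,\log n)\) time for the sort and linear time for the pass, in line with the complexities cited above.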

5.3.1 Retrieving precedence constraints

When \({A} \ne \emptyset \), incorporating some or all of the precedence constraints into the lower bound will improve its quality. We partition arc set \(A\) as \({A} = A^{\prime } \cup A^{{\prime }{\prime }}\), where \(G^{\prime }=(N,A^{\prime })\) is a two-terminal vertex serial–parallel (VSP) graph and \(G^{{\prime }{\prime }}=(N,A^{{\prime }{\prime }})\) contains the remaining arcs. Figure 5 depicts an example of this graph decomposition. For the precise definition of VSP graphs, we refer to Valdes et al. (1982). It should be noted that there exist two types of serial–parallel graphs: VSP graphs and edge serial–parallel (ESP) graphs. Valdes et al. (1982) describe the link between these two types: a graph is VSP if and only if its so-called ‘line-digraph inverse’ is ESP.

Fig. 5 (a) An example graph \(G\), (b) an associated VSP subgraph \(G^{\prime }\), and (c) \(G^{{\prime }{\prime }}\)

We split the set of constraints (21) into two subsets, as follows:

$$\begin{aligned}&{C_j \ge C_i + p_j}&{\forall (i,j) \in A^{\prime } } \end{aligned}$$
(25)
$$\begin{aligned}&{C_j \ge C_i + p_j}&{\forall (i,j) \in A^{{\prime }{\prime }} } \end{aligned}$$
(26)

We introduce \(\mathrm {P}_1\), which is a generalization of \(\mathrm {P}_0\) where precedence constraints are retrieved by imposing constraints (25) and (26). We create the following associated Lagrangian problem:

$$\begin{aligned} \mathrm {LR}_{\mathrm{P}_1}:&\quad L_1(\lambda ,\mu ) = \min \sum \limits _{j \in N}^{}{(w_j - \lambda _j) T_{j}} \\&+ \sum \limits _{j \in N}^{}{\lambda _j (C_j - d_j)}+ \sum \limits _{j \in N}^{}{\sum _{k \in \mathcal {Q}_j} {\mu _{jk} (C_j + p_k - C_k)}}\\&\text {subject to constraints } (22)-(25). \end{aligned}$$

Here \(\lambda _j\ge 0\) is again the multiplier associated with job \(j\) and \(\mu _{jk} \ge 0\) denotes the Lagrangian multiplier associated with the arc \((j,k) \in A\). We deliberately keep constraints (25) in the Lagrangian problem \(\mathrm {LR}_{\mathrm{P}_1}\). The objective function can be rewritten as

$$\begin{aligned} \sum \limits _{j \in N}^{}{(w_j - \lambda _j) T_{j}} + \sum \limits _{j \in N}^{}{w^{\prime }_j C_j } + c \end{aligned}$$

where

$$\begin{aligned} w^{\prime }_j = \lambda _{j} + \sum _{k \in \mathcal {Q}_j} {\mu _{jk}} - \sum _{k \in \mathcal {P}_j} {\mu _{kj}} \end{aligned}$$

and

$$\begin{aligned} c = \sum \limits _{j \in N}^{}{\sum _{k \in \mathcal {Q}_j} {\mu _{jk} p_k }} - \sum \limits _{j \in N}^{}{\lambda _{j} d_j}, \end{aligned}$$

so it can be seen that \(\mathrm {LR}_{\mathrm{P}_1}\) is a total weighted-completion-times problem with serial–parallel precedence constraints, because all \(T_j\) will be set to zero and \(\sum _{j \in N}^{}{(w_j - \lambda _j) T_{j}}\) can be removed from the formulation. Lawler (1978) proposes an algorithm that solves this problem in \(\mathrm {O}(n\,\log n)\) time, provided that a decomposition tree is also given for the VSP graph \(G^{\prime }\). Valdes et al. (1982) propose an \(\mathrm {O}(n + m)\)-time algorithm to construct a decomposition tree of a VSP graph, where \(m\) is the number of arcs in the graph. Calinescu et al. (2012) show that any VSP graph (directed or undirected), including \(G^{\prime }\), has at most \(2n-3\) arcs. Therefore, for any given \(\lambda \) and \(\mu \), the problem \(\mathrm {LR}_{\mathrm{P}_1}\) is solvable in \(\mathrm {O}(n\,\log n)\) time. From the theory of Lagrangian relaxation (see Fisher (1981)), for any choice of non-negative multipliers, \(L_1(\lambda ,\mu )\) provides a lower bound for \(\mathrm {P}_1\); by Theorem 1, this lower bound is also valid for \(\mathrm {P}\). In Sect. 5.3.2, we explain how to choose appropriate values for \(\lambda \) and \(\mu \), and in Sect. 5.3.3 we describe how to select a suitable VSP graph \(G^{\prime }\) and how to construct a decomposition tree for it.
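For concreteness, the weight transformation above can be sketched as follows. This is a minimal illustration with our own data layout: `succ[j]` plays the role of \(\mathcal {Q}_j\) and `mu` is a dictionary keyed by arcs.

```python
def transformed_weights(lam, mu, d, p, succ):
    """Compute the modified weights w'_j and the constant c of LR_P1,
    following the two displayed formulas: w'_j = lam_j + sum of mu over
    outgoing arcs minus sum of mu over incoming arcs, and
    c = sum mu_{jk} * p_k - sum lam_j * d_j."""
    n = len(lam)
    pred_sum = [0.0] * n          # sum over k in P_j of mu_{kj}
    for j in range(n):
        for k in succ[j]:
            pred_sum[k] += mu[(j, k)]
    w_prime = [0.0] * n
    c = 0.0
    for j in range(n):
        w_prime[j] = lam[j] + sum(mu[(j, k)] for k in succ[j]) - pred_sum[j]
        c += sum(mu[(j, k)] * p[k] for k in succ[j]) - lam[j] * d[j]
    return w_prime, c
```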

5.3.2 Multiplier adjustment

We present a two-phase adjustment (TPA) procedure for the multipliers in \(L_1(\lambda ,\mu )\). Let \(\lambda _{\mathrm{TPA}}\) and \(\mu _{\mathrm{TPA}}\) be Lagrangian multipliers adjusted by TPA; these lead to a new lower bound \(\mathrm {LB}_1 = L_1(\lambda _{\mathrm{TPA}},\mu _{\mathrm{TPA}})\). The TPA procedure is heuristic, in the sense that it may not maximize \(L_1\) over \(\lambda \) and \(\mu \).

In the first stage of TPA, we simply ignore precedence constraints altogether. For a feasible solution \(S\), consider the function \(g(\lambda ,\mu ,S)\) defined as follows:

$$\begin{aligned} g(\lambda ,\mu ,S)= & {} \sum \limits _{j \in N}^{}{(w_j - \lambda _j) T_{j}} + \sum \limits _{j \in N}^{}{\lambda _j (C_j - d_j)} \\&+ \sum \limits _{j \in N}^{}{\sum _{k \in \mathcal {Q}_j} {\mu _{jk} (C_j + p_k - C_k)}}. \end{aligned}$$

We start with the Lagrangian problem \(\hat{\mathrm {LR}}_{\mathrm{P}_1}\) where \(\hat{L}_1(\lambda ,\mu ) = \min g(\lambda ,\mu ,S)\) subject to constraints (22)–(24), which is a relaxation of \(\mathrm {LR}_{\mathrm{P}_1}\). We simply set all \(\mu _{jk}\) to zero (\(\mu =\mu _0 = (0, \ldots ,0)\)); with this choice, \(\hat{L}_1(\lambda ,\mu ) = {L}_0(\lambda )\) and we set \(\lambda _\mathrm{TPA} = \lambda _\mathrm{PV}\).

In the second stage of TPA, the multipliers \(\mu _{jk}\) are adjusted assuming that \(\lambda =\lambda _\mathrm{TPA}\) is predefined and constant. This adjustment is an iterative heuristic; we adopt the quick ascent direction (QAD) algorithm proposed by van de Velde (1995). One iteration of TPA runs in \(\mathrm {O}(m+n\log n)\) time, where \(m = |A|\). We have run a number of experiments to evaluate the improvement of the lower bound as a function of the number of iterations \(k_{\max }\). For a representative dataset, Table 2 shows that the average percentage deviation of \(\mathrm {LB}_1\) from \(\mathrm {LB}_0\) significantly increases in the first iterations, whereas after about five iterations the incremental improvement becomes rather limited; more information on the choices for \(k_{\max }\) follows in Sects. 5.3.3 and 8.2. The instance generation scheme is explained in Sect. 8.1.

Table 2 The average percentage deviation between \(\mathrm {LB}_1\) and \(\mathrm {LB}_0\) tested on \({\text {Ins}^{L}}\)

Theorem 3

\(\mathrm {LB}_0 \le \mathrm {LB}_1\).

5.3.3 Finding a VSP graph

\(\mathrm {LB}_1\) requires a decomposition of graph \(G\) into two subgraphs \(G^{\prime }=(N,A^{\prime })\) and \(G^{{\prime }{\prime }}=(N,A^{{\prime }{\prime }})\), such that \(A^{\prime } \cup A^{{\prime }{\prime }} = {A}\) and \(G^{\prime }\) is a VSP graph. The more arcs we can include in \(A^{\prime }\), the tighter the lower bound. In the following, we discuss procedures to find a VSP subgraph \(G^{\prime }\) with the maximum number of arcs; we refer to this problem as the maximum VSP subgraph (MVSP) problem.

Valdes et al. (1982) state the following result:

Lemma 3

(From Valdes et al. (1982)) A graph \(G\) is VSP if and only if its transitive closure does not contain the graph of Fig. 6 as a subgraph.

Fig. 6

The forbidden subgraph for VSP graphs

Valdes et al. refer to the pattern in Fig. 6 as the forbidden subgraph. Polynomial-time exact procedures exist for finding an ESP subgraph with the maximum number of nodes (see Bein et al. (1992), for instance), but to the best of our knowledge, no exact approach for MVSP has been proposed in the literature. McMahon and Lim (1993) suggest a heuristic traversal procedure to find and eliminate all forbidden subgraphs and, at the same time, construct a binary decomposition tree for the resulting VSP graph. Their procedure runs in \(\mathrm {O}(n + m)\) time. Note that the number of arcs in a VSP graph is bounded by \(2n-3\), whereas an arbitrary input graph may have \(\mathrm {O}(n^2)\) arcs. We implement a slightly modified variant of the algorithm of McMahon and Lim (1993) to compute \(G^{\prime }\): we select arcs for removal so that the lower bound remains reasonably tight, and the procedure simultaneously constructs a decomposition tree for the obtained VSP graph. The time complexity of \(\mathrm {O}(n + m)\) is maintained.

The structure of our heuristic decomposition and arc-elimination procedure is as follows. The procedure constructs a decomposition tree by exploiting parallel and serial node reduction (Lawler 1978). Parallel reduction merges a job pair into one single job if both jobs have the same predecessor and successor sets; in the decomposition tree, such jobs are linked by a \(P\) node, which means they can be processed in parallel (see Fig. 7b). Serial reduction merges a job pair \(\{i,j\}\) into one single job if arc \((i,j) \in A\), job \(i\) has only one successor and job \(j\) has only one predecessor; in the decomposition tree, these two jobs are linked by an \(S\) node, which means they cannot be processed in parallel (see Fig. 7d). Whenever a forbidden subgraph is recognized, the procedure removes arcs such that the forbidden subgraph is resolved and the total number of removed arcs (including transitive and merged arcs) is approximately minimized (see Fig. 7b, c). Notice that some arcs may actually represent multiple merged arcs, so removing one arc in one iteration may imply the removal of multiple arcs simultaneously in the original network \(G\).
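A bare-bones sketch of the two reductions is given below; it omits the arc-elimination step and the construction of the decomposition tree, and only tests whether the reductions collapse the graph to a single node (in which case the input is VSP). All names and the data layout are ours.

```python
def sp_reduce(succ):
    """Repeatedly apply (i) serial reduction: merge (i, j) when j is the
    only successor of i and i the only predecessor of j, and (ii) parallel
    reduction: merge two jobs with identical predecessor and successor
    sets.  Returns True iff the graph reduces to a single node."""
    succ = {u: set(vs) for u, vs in succ.items()}
    pred = {u: set() for u in succ}
    for u, vs in succ.items():
        for v in vs:
            pred[v].add(u)
    changed = True
    while changed and len(succ) > 1:
        changed = False
        nodes = list(succ)
        for i in nodes:                       # serial reduction
            if i in succ and len(succ[i]) == 1:
                (j,) = succ[i]
                if len(pred[j]) == 1:
                    succ[i] = set(succ.pop(j))    # i absorbs j
                    pred.pop(j)
                    for k in succ[i]:
                        pred[k].discard(j)
                        pred[k].add(i)
                    changed = True
                    break
        if changed:
            continue
        for a in nodes:                       # parallel reduction
            for b in nodes:
                if a < b and a in succ and b in succ \
                        and succ[a] == succ[b] and pred[a] == pred[b]:
                    for k in succ[b]:
                        pred[k].discard(b)
                    for k in pred[b]:
                        succ[k].discard(b)
                    succ.pop(b)
                    pred.pop(b)
                    changed = True
                    break
            if changed:
                break
    return len(succ) == 1
```

On the diamond \(0\!\rightarrow\!\{1,2\}\!\rightarrow\!3\) the reductions succeed, whereas the forbidden subgraph of Fig. 6 admits no reduction at all, consistent with Lemma 3.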

Fig. 7

Modified traversal algorithm applied to the input graph in (a)

The proposed algorithm is run only once, in the root node of the search tree. In every other node of the search tree, graphs \(G^{\prime }\) and \(G^{{\prime }{\prime }}\) are constructed by removing, from the corresponding graphs in the parent node, the arcs associated with the scheduled jobs; the resulting graphs then serve as input for computing \(\mathrm {LB}_1\). Notice that for each child node, both graphs \(G^{\prime }\) and \(G^{{\prime }{\prime }}\) as well as the associated decomposition tree can be constructed in \(\mathrm {O}(n)\) time.

To evaluate the impact of our arc-elimination procedure on the quality of the bounds, we examine two variations of \(\mathrm {LB}_1\), namely \(\mathrm {LB}_1({\text {VSP}}) = L_1(\lambda _\mathrm{TPA},\mu _\mathrm{TPA})\), where all forbidden graphs in \(G\) are resolved using the arc-elimination procedure, and \(\mathrm {LB}_1({\text {NO}}) = \hat{L}_1(\lambda _\mathrm{TPA},\mu _\mathrm{TPA})\), in which we simply remove all arcs (\(A^{\prime } = \emptyset \) and \(A^{{\prime }{\prime }} = A\)). Let \(k_{\max }\) be the maximum number of iterations for TPA, as explained in Sect. 5.3.2. Table 3 demonstrates the success of our proposed algorithm in tightening the bound. The distance between the bounds decreases with increasing \(k_{\max }\), but in a B&B algorithm, a large value for \(k_{\max }\) becomes computationally prohibitive.

Table 3 The average percentage deviation between \(\mathrm {LB_1}({\text {VSP}})\) and \(\mathrm {LB_1}({\text {NO}})\) tested on \({\text {Ins}^{L}}\) with 40 jobs.

Theorem 4

\(\mathrm {LB}_1({\text {NO}}) \le \mathrm {LB}_1({\text {VSP}})\) for the same value of \(k_{\max }\).

5.3.4 Retrieving release dates and deadlines

Bound \(\mathrm {LB}_1\) turns out not to be very tight when the release dates are rather heterogeneous. Below, we examine two ways to produce a stronger bound, namely block decomposition and job splitting.

Block decomposition We follow Pan and Shi (2005), Hariri and Potts (1983) and Potts and van Wassenhove (1983) in setting up a decomposition of the job set into separate blocks: a block is a subset of jobs for which it is a dominant decision to schedule them together. We sort and renumber all jobs in non-decreasing order of their modified release dates \(\bar{r}_j\); as a tie-breaking criterion, we consider non-increasing order of \(w_j/p_j\). The resulting non-delay sequence of jobs is given by \(\sigma ^r = (1,\ldots ,n)\), where a sequence is said to be ‘non-delay’ if the machine is never kept idle while some jobs are waiting to be processed (Pinedo 2008). Let \(B_i = (u_i,\ldots ,v_i)\) be one block (in which jobs are sorted according to their new indices). The set \(B = \{B_1,\ldots ,B_\kappa \}\) is a valid decomposition of the job set into \(\kappa \) blocks if the following conditions are satisfied:

  1. \(u_1=1\);

  2. for each \(i,j\) with \(1 < i \le \kappa \) and \(1 \le j \le n\), if \( u_i = j\) then \(v_{i-1}=j-1\) and vice versa;

  3. for each \(i,j\) with \(1 \le i \le \kappa \) and \(u_i \le j \le v_i\), we have \(\bar{r}_{u_i}+\sum _{s = u_i}^{j-1}{p_s} \ge \bar{r}_j\).

Although the sequencing of the jobs within one block is actually still open, the sequencing of the blocks is pre-determined. Given a valid set of blocks \(B\), we compute \(\mathrm {LB}_1\) for each block \(B_i \in B\) separately. The value \(\mathrm {LB}_2\) is then the sum of the bounds per block; analogously to Pan and Shi (2005), Hariri and Potts (1983), Potts and van Wassenhove (1983), \(\mathrm {LB}_2\) can be shown to be a lower bound for \(\mathrm {P}\).
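The decomposition itself can be sketched as a single greedy pass over the sorted jobs: a new block is opened exactly when condition 3 would otherwise be violated, i.e. when the machine would stand idle before the next modified release date. This is a minimal illustration with our own data layout (`jobs` is a list of \((\bar{r}_j, p_j, w_j)\) tuples).

```python
def decompose_blocks(jobs):
    """Greedy block decomposition: jobs are sorted by non-decreasing
    modified release date (ties broken by non-increasing w/p); a new
    block starts whenever the next job's release date exceeds the
    current block's completion time."""
    order = sorted(range(len(jobs)),
                   key=lambda j: (jobs[j][0], -jobs[j][2] / jobs[j][1]))
    blocks, t = [], None
    for j in order:
        r, p, _ = jobs[j]
        if t is None or r > t:     # condition 3 would fail: open a new block
            blocks.append([j])
            t = r + p
        else:                      # job joins the current block
            blocks[-1].append(j)
            t += p
    return blocks
```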

We define \(\mathrm {LB}^*_1 = L_1(\lambda ^*,\mu ^*)\), where \(\lambda ^*\) and \(\mu ^*\) are optimal choices for the Lagrangian multipliers for \(\mathrm {LB}_1\), and \(\mathrm {LB}^*_2 = \sum _{B_i \in B}{L_1(\lambda _{B_i}^*,\mu _{B_i}^*)}\), where \(\lambda _{B_i}^*\) and \(\mu _{B_i}^*\) are the optimal choices for the multipliers for block \(B_i\).

Theorem 5

\(\mathrm {LB}^*_1 \le \mathrm {LB}^*_2\).

Although TPA might not find \(\lambda _{B_i}^*\) and \(\mu _{B_i}^*\) and thus the same result as Theorem 5 might not hold for \(\mathrm {LB}_1\) and \(\mathrm {LB}_2\), empirical results show that \(\mathrm {LB}_2\) is on average far tighter than \(\mathrm {LB}_1\) (these results are shown in Table 8).

Job splitting It sometimes happens that the decomposition procedure fails to improve the bound (only one block is created and \(\mathrm {LB}_2 = \mathrm {LB}_1\)). Another approach is to explicitly re-introduce the release date constraints (which have been removed previously). We define problem \(\mathrm {P}_2\), which is a generalization of \(\mathrm {P}_1\) in which the release date constraints (19) are included. The associated Lagrangian problem is

$$\begin{aligned} \mathrm {LR}_{\mathrm{P}_2}:&\quad L_2(\lambda ,\mu ) =\min \sum \limits _{j \in N}^{}{w^{\prime }_j C_j } +c \\&\text {subject to constraints } (19),(22)-(24). \end{aligned}$$

In contrast to \(\mathrm {LR}_{\mathrm{P}_1}\), we now remove the serial–parallel precedence constraints because they render the Lagrangian problem too difficult. Problem \(\mathrm {LR}_{\mathrm{P}_2}\) is a total weighted-completion-times problem with release dates. This problem is known to be NP-hard (Lenstra et al. 1977), but a number of efficient polynomial algorithms, based on job splitting, have been proposed to compute tight lower bounds (Hariri and Potts 1983; Nessah and Kacem 2012; Belouadah et al. 1992). One of these is the SS procedure proposed by Belouadah et al. (1992), which runs in \(\mathrm {O}(n \log n)\) time and which we adopt here. Essentially, we again decompose the job set into a set of blocks \(B\) and compute \(L_2(\lambda ,\mu )\) for each block \(B_i \in B\). The lower bound \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) is again the sum of the contributions of the individual blocks. Experiments show that \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) is typically tighter than \(\mathrm {LB}_2\) when the release dates are unequal. With equal release dates, on the other hand, normally \(\mathrm {LB}_2\ge \mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) because \(\mathrm {LB}_2\) incorporates a part of the precedence graph. TPA is again applied here for the multiplier updates.

We introduce \(\mathrm {P}_2^{\prime }\), which is a generalization of \(\mathrm {P}_1\) where deadline constraints are retrieved by inclusion of the constraint set (20). The associated Lagrangian problem is

$$\begin{aligned} \mathrm {LR}_{\mathrm{P}_2^{\prime }}:&\quad L_2^{\prime }(\lambda ,\mu ) =\min \sum \limits _{j \in N}^{}{w^{\prime }_j C_j } + c \\&\text {subject to constraints } (20),(22)-(24). \end{aligned}$$

\(\mathrm {LR}_{\mathrm{P}_2^{\prime }}\) is a total weighted-completion-times problem with deadlines. This problem is known to be NP-hard (Lenstra et al. 1977). Posner (1985) proposes a job-splitting lower bounding scheme for \(\mathrm {LR}_{\mathrm{P}_2^{\prime }}\) that runs in \(\mathrm {O}(n \log n)\) time; the lower bound \(\mathrm {LB}_2^{\mathrm{SS}_\delta }\) results from block decomposition and computation of \(L_2^{\prime }(\lambda ,\mu )\) for each block. We again apply TPA for setting the multipliers.

5.3.5 Improvement by slack variables

Relaxed inequality constraints can be considered ‘nasty’ constraints because they decrease the quality of the lower bounds. We follow Hoogeveen and van de Velde (1995) in exploiting slack variables to lessen the effect of such nasty constraints and thus improve the quality of the lower bounds.

We introduce two non-negative vectors of slack variables: vector \(\mathbf {y} = (y_1,\ldots ,y_n)\) and vector \(\mathbf {z} = (z_{11},\ldots ,z_{1n},\ldots ,z_{n1},\ldots ,z_{nn})\). Consider the following sets of constraints:

$$\begin{aligned}&{T_j = C_j - d_j + y_{j}}&\forall j \in N \end{aligned}$$
(27)
$$\begin{aligned}&{C_j = C_i + p_j + z_{ij}}&{\forall (i,j) \in A } \end{aligned}$$
(28)
$$\begin{aligned}&y_{j}, z_{ij} \ge 0&\forall i,j \in N \end{aligned}$$
(29)

Let problem \(\mathrm {P}_3\) be the variant of problem \(\mathrm {P}_1\) in which the sets of constraints (18) and (21) are replaced by the constraints (27)–(29). The Lagrangian problem associated with \(\mathrm {P}_3\) is

$$\begin{aligned}&\mathrm {LR}_{\mathrm{P}_3}:\quad L_3(\lambda ,\mu ) = \min \sum \limits _{j = 1}^{n}{(w_j - \lambda _j) T_{j}} + \sum \limits _{j = 1}^{n}{\lambda _j y_{j}} + \\&\sum \limits _{j = 1}^{n}{\sum _{k \in \mathcal {Q}_j} {\mu _{jk} z_{jk} }} + \sum \limits _{j \in N}^{}{w^{\prime }_j C_j } + c \\&\text {subject to constraints } (22)-(25) \text { and } (29). \end{aligned}$$

The values of the variables \(T_{j}, y_{j}\) and \(z_{jk}\) are zero in any optimal solution to \(\mathrm {LR}_{\mathrm{P}_3}\) because \(0 \le \lambda _j \le w_j\) for each \(j \in N\) and \(\mu _{jk} \ge 0\) for each \((j,k) \in A\). In an optimal solution to \(\mathrm {P}_3\), however, these values might not be zero. In fact, according to the set of constraints (27), unless \(C_j = d_j\), either \(T_{j}\) or \(y_{j}\) is non-zero. Also, from constraints (28), \(z_{jk}\) may not be zero when job \(j\) has at least two successors or job \(k\) has at least two predecessors in \(G\). We introduce three problems that each carry a part of the objective function of \(\mathrm {LR}_{\mathrm{P}_3}\): one of them is \(\mathrm {LR}_{\mathrm{P}_1}\), and the other two are the following slack-variable (SV) problems, where \(Y\) is the set of all \(\mathbf {y}\)-vectors corresponding to feasible solutions to \(\mathrm {P}_3\) and \(Z\) similarly contains all \(\mathbf {z}\)-vectors.

$$\begin{aligned} \mathrm {P}_{\mathrm{SV1}}:&\quad SV_1(\lambda ) = \min \sum \limits _{j = 1}^{n}{(w_j - \lambda _j) T_{j}} + \sum \limits _{j = 1}^{n}{\lambda _j y_{j}} \\&\text {subject to constraints } (22),(23),(25) \text { and } \mathbf {y} \in Y;&\\ \mathrm {P}_{\mathrm{SV2}}:&\quad SV_2(\mu ) = \min \sum \limits _{j = 1}^{n}{\sum _{k \in \mathcal {Q}_j} {\mu _{jk} z_{jk} }} \\&\text {subject to constraint } \mathbf {z} \in Z.&\end{aligned}$$

Note that the term \(\sum _{j = 1}^{n}{(w_j - \lambda _j) T_{j}}\) appears in two of the problems, but it will be set to zero anyway in \(\mathrm {LR}_{\mathrm{P}_1}\).

Hoogeveen and van de Velde (1995) propose \(\mathrm {O}(n\log n)\)-time procedures to compute valid lower bounds for \(\mathrm {P}_{\mathrm{SV1}}\) and \(\mathrm {P}_{\mathrm{SV2}}\). Let \(\mathrm {LB}_{\mathrm{SV1}}\ge 0\) and \(\mathrm {LB}_{\mathrm{SV2}}\ge 0\) be lower bounds for \(\mathrm {P}_{\mathrm{SV1}}\) and \(\mathrm {P}_{\mathrm{SV2}}\), respectively. By adding \(\mathrm {LB}_{\mathrm{SV1}}\) and \(\mathrm {LB}_{\mathrm{SV2}}\) to \(\mathrm {LB}_2\), a better lower bound \(\mathrm {LB}_3\) for \(\mathrm {P}\) is obtained (Hoogeveen and van de Velde 1995). The same SV problems can also be constructed for \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) and \(\mathrm {LB}_2^{\mathrm{SS}_\delta }\) to lead to bounds \(\mathrm {LB}_3^{\mathrm{SS}_\mathrm{r}} = \mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}} + \mathrm {LB}_{\mathrm{SV1}} + \mathrm {LB}_{\mathrm{SV2}}\) and \(\mathrm {LB}_3^{\mathrm{SS}_\delta } = \mathrm {LB}_2^{\mathrm{SS}_\delta } + \mathrm {LB}_{\mathrm{SV1}} + \mathrm {LB}_{\mathrm{SV2}}\). We have the following result:

Observation 1

\(\mathrm {LB}_2 \le \mathrm {LB}_3, \mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}} \le \mathrm {LB}_3^{\mathrm{SS}_\mathrm{r}}\) and \(\mathrm {LB}_2^{\mathrm{SS}_\delta } \le \mathrm {LB}_3^{\mathrm{SS}_\delta }\).

5.3.6 Other Lagrangian bounds

All the lower bounds introduced in this section are based on the formulation (17)–(24). Other Lagrangian relaxation-based lower bounds have also been proposed for special cases of this problem. These other bounds are mostly based on other (conceptual) formulations. For example, to achieve a lower bound, Lagrangian penalties could be added to the objective function while allowing jobs to be processed repeatedly. Many variants of such a lower bound exist (Tanaka et al. 2009; Tanaka and Fujikuma 2012; Tanaka and Sato 2013), but most of these variants are either too weak or too slow. Another lower bound based on Lagrangian relaxation is obtained by relaxing the capacity constraints, such that jobs are allowed to be processed in parallel in exchange for Lagrangian penalties (Tang et al. 2007).

6 Dominance properties

Our search procedure also incorporates a number of dominance rules, which will be described in this section. We will use the following additional notation. Given two partial sequences \(\pi = (\pi _1,\ldots ,\pi _k)\) and \(\pi ^{\prime } = (\pi ^{\prime }_1,\ldots ,\pi ^{\prime }_{k^{\prime }})\), we define a merge operator as follows: \(\pi |\pi ^{\prime } = (\pi _1,\ldots ,\pi _k,\pi ^{\prime }_1,\ldots ,\pi ^{\prime }_{k^{\prime }})\). If \(\pi ^{\prime }\) contains only one job \(j\) then we can also write \(\pi |j = (\pi _1,\ldots ,\pi _k,j)\), and similarly if \(\pi = (j )\) then \(j|\pi ^{\prime } = (j,\pi ^{\prime }_1,\ldots ,\pi ^{\prime }_{k^{\prime }})\).

6.1 General dominance rules

We use the lower bounds proposed in Sect. 5 to prune the search tree. Let \(\mathrm {LB}(U)\) represent any of the lower bounds described in Sect. 5, applied to the set \(U\) of unscheduled jobs, and let \(S_\mathrm{best}\) be the currently best known feasible solution. Notice that \(\mathrm {TWT}(S_\mathrm{best})\) is an upper bound for \(\mathrm {TWT}(S^*)\). The following dominance rule is then immediate:

Dominance rule 1

\(({\varvec{\mathrm{DR}_1}})\) Consider a node associated with selection \(S_P\). If

$$\begin{aligned} \mathrm {TWT}(S_P) + \mathrm {LB}(U) \ge \mathrm {TWT}(S_\mathrm{best}), \end{aligned}$$

then the node associated with \(S_P\) can be fathomed.

As introduced in Sect. 4, a partial schedule can be denoted either by \(S_P\) or by \((\sigma _B,\sigma _E)\). Multiple lower bounds can be used to fathom a node. The selection of lower bounds and the order in which they are computed obviously influence the performance of the B&B algorithm. These issues are examined in Sect. 8.2.
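A minimal sketch of how \(\mathrm {DR}_1\) can be combined with several bounds of increasing computational cost follows; the bound ordering is a design choice (cf. Sect. 8.2), and the names are ours.

```python
def evaluate_node(twt_partial, bounds, twt_best):
    """DR1 with a cascade of lower bounds: 'bounds' is a list of
    zero-argument callables (cheapest first) returning lower bounds on
    the unscheduled jobs; the node is fathomed as soon as one of them
    closes the gap with the incumbent TWT(S_best)."""
    for lb in bounds:              # e.g. [LB0, LB1, LB2, LB3]
        if twt_partial + lb() >= twt_best:
            return "fathom"        # expensive bounds are never evaluated
    return "branch"
```

This layout means the expensive bounds are only computed for nodes that the cheap bounds fail to prune.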

The subset of active schedules is dominant for total weighted tardiness problems (Conway et al. 1967; Pinedo 2008). A feasible schedule is called active if it is not possible to construct another schedule, by changing the sequence of jobs, in which at least one job finishes earlier and no other job finishes later. The dominance of active schedules holds even when deadlines and precedence constraints are given.

Dominance rule 2

\({\varvec{(\mathrm{DR}_2)}}\) Consider a node associated with \({(\sigma _B,\emptyset )}\) that is selected for forward branching, and let \(j\) be a job belonging to \(E_B\). If \(\bar{r}_j \ge \min _{k \in E_B} \{\bar{r}_k + p_k\}\), then the child node associated with the schedule \({(\sigma _B|j,\emptyset )}\) can be fathomed.

We also prune a branch whenever an obvious violation of the deadline constraints is detected. A partial schedule associated with a particular node cannot always be extended to a feasible schedule: scheduling a job in one particular position may force other jobs to violate their deadline constraints, even if its own constraints are respected. Let \(\mathcal {A}\) be an arbitrary subset of \(U\) and let \(\varPi _\mathcal {A}\) be the set of all possible permutations of the jobs in \(\mathcal {A}\). The following theorem states when a job is scheduled in a ‘wrong position’, meaning that this will lead to a violation of deadline constraints.

Theorem 6

Consider a partial schedule \((\sigma _B,\sigma _E)\). If there exists any non-empty subset \(\mathcal {A} \subset U\) such that the inequality \(\min _{\pi \in \varPi _\mathcal {A}} \{C_{\max }(\sigma _B|\pi )\} > \max _{j\in \mathcal {A}}{\{\bar{\delta }_j\}}\) holds, then the schedule \((\sigma _B,\sigma _E)\) is not feasible.

The problem \(\min _{\pi \in \varPi _\mathcal {A}} \{C_{\max }(\sigma _B|\pi )\}\), which is equivalent to \(1|r_j,\delta _j,\mathrm{prec}|C_{\max }\), is NP-hard because merely verifying the existence of a feasible schedule is already NP-complete. We remove the deadlines and obtain a new problem whose optimal solution can be computed in \(\mathrm {O}(n^2)\) time (Lawler 1973). For computational efficiency, we use a linear-time lower bound for this new problem, computed as \(\min _{j\in \mathcal {A} \cap E_B}{\{\bar{r}_j\}} + \sum _{j\in \mathcal {A}}{p_j}\).
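The resulting linear-time sufficient test can be sketched as follows (naming and data layout are ours); it replaces the NP-hard makespan minimization of Theorem 6 by the stated lower bound, so a positive answer proves infeasibility while a negative answer is inconclusive.

```python
def deadline_infeasible(A, r_eligible, p, deadline):
    """Sufficient infeasibility test: lower-bound the minimum makespan of
    the subset A by min(modified release date over A's eligible jobs)
    plus the total processing time of A; if even this bound exceeds the
    largest deadline in A, no feasible completion exists.
    r_eligible maps the jobs of A that are in E_B to their modified
    release dates; p and deadline map jobs to p_j and delta_j."""
    lb_cmax = min(r_eligible.values()) + sum(p[j] for j in A)
    return lb_cmax > max(deadline[j] for j in A)
```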

Dominance rule 3

\({\varvec{(\mathrm{DR}_3)}}\) The node associated with \({(\sigma _B,\sigma _E)}\) can be eliminated if at least one of the following conditions is satisfied:

  1. if \(\sigma _E = \emptyset \) and the condition of Theorem 6 is satisfied for the partial schedule \((\sigma _B,\emptyset )\);

  2. if \(\sigma _E \ne \emptyset \) and \(\max _{j \in U} \{\bar{\delta }_j\} < st(\sigma _E)\).

Additional precedence constraints could be derived from the time windows by constraint propagation techniques, but the solution representation of our B&B algorithms and the above dominance rules (\(\text {DR}_2\) and \(\text {DR}_3\)) are devised in such a way that any violation of these additional precedence constraints is dominated. Consider two jobs \(i\) and \(j\) with \(p_i = p_j = 10, r_i = 0, r_j = 5, \delta _i = 20\), and \(\delta _j = 30\). Using constraint propagation, one could add an extra constraint that allows the processing of job \(j\) to occur only after the completion of job \(i\). Such an additional constraint is not necessary, however, because in the situation described above, all sequences in which job \(j\) precedes job \(i\) are automatically fathomed by \(\text {DR}_2\) and \(\text {DR}_3\). Moreover, increasing the density of the precedence graph in this way would also decrease the tightness of the lower bounds, which is undesirable.

6.2 Dominance rule based on two-job interchange

We describe a dominance rule based on job interchange. This dominance rule consists of two parts. The first part deals with the interchange of jobs in an FB node, whereas the second part deals with the interchange of jobs in a BB node.

6.2.1 Interchanging jobs in an FB node

In an FB node, consider jobs \(j,k \in E_B\) that are not identical (they differ in at least one of their parameters). We will always assume that \( \hat{r}_k < \hat{r}_j+p_j\) and \(\hat{r}_j < \hat{r}_k+p_k\), because otherwise Dominance rule 2 enforces the scheduling of the job with the smaller \(\hat{r}\) before the job with the larger \(\hat{r}\); note here that \(\hat{r}_j = \bar{r}_j\) and \(\hat{r}_k = \bar{r}_k\) because all predecessors of jobs \(j\) and \(k\) have already been scheduled and therefore the branching decisions cover the propagation of precedence constraints. We also assume that any successor of job \(k\) is also a successor of job \(j\) (\(\mathcal {Q}_k \subset \mathcal {Q}_j\)). Consider a node of the search tree in which job \(k\) is scheduled at or after the completion of sequence \(\sigma _B\). Suppose that the partial schedule associated with the current node can be extended to a feasible schedule \(S_1\) in which job \(j\) is scheduled somewhere after job \(k\). We define the set \(\mathcal {B} = U \backslash \{j,k\}\). We also construct a schedule \(S_1^{\prime }\) by interchanging jobs \(j\) and \(k\), while the order of the jobs belonging to \(\mathcal {B}\) remains unchanged. Figure 8 illustrates schedules \(S_1\) and \(S_1^{\prime }\).

Fig. 8

Schedules \(S_1\) and \(S_1^{\prime }\)

To prove that interchanging jobs \(j\) and \(k\) does not increase the total weighted tardiness, we argue that the gain of the interchange, computed as \(\mathrm {TWT}(S_1) - \mathrm {TWT}(S_1^{\prime })\), is greater than or equal to zero, no matter when job \(j\) is scheduled. Let \(st_j(S)\) denote the start time of job \(j\) in schedule \(S\); remember that \(st(\pi )\) denotes the start time of a sequence \(\pi \). Let \(\tau _1\) be the difference between the start time of job \(j\) in \(S_1\) and the start time of \(k\) in \(S_1^{\prime }\). If \(st_k(S_1^{\prime })\) is less than \(st_j(S_1)\), then \(\tau _1\) is negative; otherwise it is non-negative. By interchanging jobs \(j\) and \(k\), each job in set \(\mathcal {B}\) may be shifted either to the right or to the left. Let \(\tau _2 \ge 0\) be the maximum shift to the right of the jobs belonging to set \(\mathcal {B}\); notice that if all jobs in \(\mathcal {B}\) are shifted to the left, then \(\tau _2 = 0\). For \(t\) the start time of job \(j\) in \(S_1\), Jouglet et al. (2004) define a function \(\varGamma _{jk} (t,\tau _1,\tau _2)\) as follows:

$$\begin{aligned} \varGamma _{jk} (t,\tau _1,\tau _2)=&\, w_j \max \{0,t+p_j-d_j\} \\&- w_k \max \{0,t+\tau _1+p_k-d_k\} \\&+w_k \max \{0,\hat{r}_k+p_k-d_k\} \\&- w_j \max \{0,\hat{r}_j+p_j-d_j\} - \tau _2 \sum _{i \in \mathcal {B}}{w_i}. \end{aligned}$$

For the subproblem of \(\mathrm {P}\) where precedence and deadline constraints are removed, Jouglet et al. (2004) show that \(\varGamma _{jk} (t,\tau _1,\tau _2)\) is a lower bound for the gain of interchanging jobs \(j\) and \(k\) when \(t = st_j(S_1)\). This result can be improved by adding the gain of shifting the jobs that are tardy in both schedules \(S_1\) and \(S_1^{\prime }\). We introduce the set \(\mathcal {B}^{\prime }\) of jobs that are certainly tardy in \(S_1^{\prime }\). Let \(\mathcal {\hat{P}}_i\) be the set of transitive predecessors of job \(i\). Since the order in which the jobs in \(\mathcal {B}\) are scheduled has not yet been fixed, \(\mathcal {B}^{\prime }\) cannot be computed exactly; our implementation therefore uses the following subset of \(\mathcal {B}^{\prime }\):

$$\begin{aligned} \left\{ i \in \mathcal {B} \Big | \hat{r}_j+p_j+\sum _{l\in (\mathcal {B} \cap \mathcal {\hat{P}}_i)}{p_l} + p_i \ge d_i \right\} . \end{aligned}$$

Let \(\tau _2^{\prime } \ge 0\) be the minimum shift to the left of the jobs belonging to set \(\mathcal {B}\). Note that at least one of the values \(\tau _2^{\prime }\) and \(\tau _2\) equals zero. We define the function \(\hat{\varGamma }_{jk} (t,\tau _1,\tau _2,\tau _2^{\prime })\) as follows:

$$\begin{aligned} \hat{\varGamma }_{jk} (t,\tau _1,\tau _2,\tau _2^{\prime }) = \varGamma _{jk} (t,\tau _1,\tau _2) + \tau _2^{\prime } \sum _{i \in \mathcal {B}^{\prime }}{w_i}. \end{aligned}$$

The values \(\tau _2\) and \(\tau _2^{\prime }\) cannot be negative. Therefore, we immediately infer \(\varGamma _{jk}(t,\tau _1,\tau _2) \le \hat{\varGamma }_{jk} (t,\tau _1,\tau _2,\tau _2^{\prime })\). We need the following result:

Theorem 7

\(\hat{\varGamma }_{jk} (t,\tau _1,\tau _2,\tau _2^{\prime })\) is a valid lower bound for the gain of interchanging jobs \(j\) and \(k\).
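The bound \(\hat{\varGamma }_{jk}\) is a direct evaluation of the two displayed formulas; the following sketch (with our own parameter layout, and with the weight sums over \(\mathcal {B}\) and \(\mathcal {B}^{\prime }\) passed in precomputed) makes the computation explicit.

```python
def gamma_hat(t, tau1, tau2, tau2p, j, k, p, d, w, r_hat, B_w, Bp_w):
    """Lower bound on the gain of interchanging jobs j and k (Theorem 7):
    Jouglet et al.'s Gamma_jk(t, tau1, tau2) plus tau2p times the weight
    sum Bp_w over the jobs certainly tardy in both schedules.
    B_w = sum of w_i over B, Bp_w = sum of w_i over B'."""
    gamma = (w[j] * max(0, t + p[j] - d[j])
             - w[k] * max(0, t + tau1 + p[k] - d[k])
             + w[k] * max(0, r_hat[k] + p[k] - d[k])
             - w[j] * max(0, r_hat[j] + p[j] - d[j])
             - tau2 * B_w)
    return gamma + tau2p * Bp_w
```

Since at least one of \(\tau _2\) and \(\tau _2^{\prime }\) is zero, at most one of the last two terms is active in any call.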

In a general setting (problem \(\mathrm {P}\)), however, job interchanges are not always feasible for every starting time \(t\). We opt to verify the feasibility of an interchange by ensuring that it does not cause any violation of deadline and/or precedence constraints for all possible \(t = st_j(S_1)\). Let \(\varPsi \) be an upper bound for the completion time of the schedule \(S_1^{\prime }\), computed as follows:

$$\begin{aligned} \varPsi = \max \left\{ \max \{\hat{r}_j+p_j,\hat{r}_k\}+p_k,\max _{i \in \mathcal {B}}\{\hat{r}_i\}\right\} +\sum _{i \in \mathcal {B}}{p_i}. \end{aligned}$$

The following theorem provides the conditions under which interchanging jobs \(j\) and \(k\) is feasible for every possible \(t = st_j(S_1)\).

Theorem 8

For each feasible schedule \(S_1\), an alternative feasible schedule \(S_1^{\prime }\) is created by interchanging jobs \(j\) and \(k\), if the following conditions are satisfied:

  1. \(\bar{\delta }_j - p_j \le \bar{\delta }_k - \tau _1 - p_k\) or \(\varPsi \le \hat{\delta }_k\);

  2. \(\tau _2 = 0\) or \(\varPsi \le \min \limits _{i \in \mathcal {B}}\{\hat{\delta }_i\}\).
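As an illustration, the bound \(\varPsi \) and the two conditions of Theorem 8 can be checked directly. The sketch below is ours, not the authors' implementation; it assumes a simplified job representation (dicts with release date `r`, processing time `p`, and a single updated deadline `delta` standing in for both deadline variants used in the text).

```python
def interchange_feasible(j, k, B, tau1, tau2):
    """Check the sufficient conditions of Theorem 8 for interchanging
    jobs j and k, with B the set of remaining jobs scheduled after them.
    Simplification (ours): one deadline field per job."""
    # Upper bound Psi on the completion time of the interchanged sequence S1'
    psi = max(max(j["r"] + j["p"], k["r"]) + k["p"],
              max((i["r"] for i in B), default=0))
    psi += sum(i["p"] for i in B)
    # Condition 1: deadline of j permits the swap, or Psi respects k's deadline
    cond1 = (j["delta"] - j["p"] <= k["delta"] - tau1 - k["p"]) or psi <= k["delta"]
    # Condition 2: no right-shift of B, or Psi respects every deadline in B
    cond2 = tau2 == 0 or psi <= min((i["delta"] for i in B), default=float("inf"))
    return cond1 and cond2
```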

Jouglet et al. (2004) prove that if \(w_j \ge w_k\) then the value \(\varGamma _{jk} (\max \{d_j-p_j,\hat{r}_k + p_k\},\tau _1,\tau _2)\) is the minimum gain obtained by interchanging jobs \(j\) and \(k\) for the setting where deadlines and precedence constraints are removed. We derive a more general result using the following lemma.

Lemma 4

Let \(f:t \rightarrow \alpha \max \{0,t-a\} - \beta \max \{0,t-b\} + C\) be a function defined on \([u,v]\) for \(a,b,C \in \mathbb {R}\) and \(\alpha ,\beta ,u,v \in \mathbb {R}^+\). The function \(f\) reaches a global minimum at value \(t^*\) computed as follows:

$$\begin{aligned} t^*&(\alpha ,\beta ,a,b,u,v)= \\&{\left\{ \begin{array}{ll}{\min \{\bar{u},v\}}&{}{\text { if } \alpha \ge \beta }\\ {u}&{}{\text { if } \alpha < \beta \text {, } b > a \text {, } \alpha (\bar{v} - \bar{u}) \ge \beta (\bar{v} - b)}\\ {v}&{}{\text { otherwise}} \end{array}\right. } \end{aligned}$$

where \(\bar{u} = \max \{u,a\}\) and \(\bar{v} = \max \{v,b\}\).
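Since \(f\) is piecewise linear on \([u,v]\), its minimum is attained at an interval endpoint or at one of the kinks \(a\) and \(b\). The closed form of Lemma 4 can therefore be cross-checked numerically; the sketch below (function names are ours) simply evaluates \(f\) at these candidate points.

```python
def f(t, alpha, beta, a, b, C):
    """The piecewise-linear function of Lemma 4."""
    return alpha * max(0.0, t - a) - beta * max(0.0, t - b) + C

def piecewise_min(alpha, beta, a, b, C, u, v):
    """Locate a global minimizer of f on [u, v] by evaluating the
    candidate points: the endpoints u, v and the kinks a, b inside [u, v]."""
    candidates = [u, v] + [x for x in (a, b) if u <= x <= v]
    return min(candidates, key=lambda t: f(t, alpha, beta, a, b, C))
```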

Corollary 1 below follows from Theorems 7, 8, and Lemma 4, if we choose \(\alpha = w_j, \beta = w_k, a = d_j - p_j, b = d_k - \tau _1 - p_k, u = \hat{r}_k + p_k, v = \delta _j - p_j\) and \(C = w_k \max \{0,\hat{r}_k+p_k-d_k\} - w_j \max \{0,\hat{r}_j+p_j-d_j\} - \tau _2 \sum _{i \in \mathcal {B}}{w_i} + \tau _2^{\prime } \sum _{i \in \mathcal {B}^{\prime }}{w_i}\). Let \(st_j^*\) be computed as follows:

$$\begin{aligned} st_j^*= t^*(w_j,w_k, d_j \!-\! p_j,d_k - \tau _1 - p_k,\hat{r}_k + p_k,\delta _j - p_j ). \end{aligned}$$

Corollary 1

\(\varGamma ^*_{jk} (\tau _1,\tau _2,\tau _2^{\prime }) = \hat{\varGamma }_{jk} (st_j^*,\tau _1,\tau _2,\tau _2^{\prime })\) is the minimum gain obtained by interchanging jobs \(j\) and \(k\), provided that for every possible \(st_j(S_1)\) interchanging jobs \(j\) and \(k\) is feasible.

To compute \(\varGamma ^*_{jk} (\tau _1,\tau _2,\tau _2^{\prime })\), the values of \(\tau _1, \tau _2\), and \(\tau _2^{\prime }\) must be known. We establish an exhaustive list of cases for which \(\tau _1, \tau _2\), and \(\tau _2^{\prime }\) can be computed, which is summarized in Table 4. Given a particular case, the values \(\tau _1, \tau _2\), and \(\tau _2^{\prime }\) are computed as follows:

$$\begin{aligned}&\tau _1= {\left\{ \begin{array}{ll}{0}&{}{\text {Cases 1,5 }}\\ {\max _{i \in U}\{\hat{r}_i\} - \hat{r}_k - p_k}&{}{\text {Case 2}}\\ {\max \{\hat{r}_j+p_j ,\max _{i \in \mathcal {B}}\{\hat{r}_i\}\} - \hat{r}_k - p_k}&{}{\text {Cases 3,4,6}} \\ {\hat{r}_j + p_j - \hat{r}_k - p_k}&{}{\text {Cases 7,8}} \end{array}\right. } \\&\tau _2= {\left\{ \begin{array}{ll} {p_k - p_j}&{}{\text {Case 1}}\\ {\max _{i \in U}\{\hat{r}_i\} - \hat{r}_k - p_j}&{}{\text {Case 2}}\\ {0}&{}{\text {Cases 3,5,6}}\\ {\max \{\hat{r}_j+p_j ,\max _{i \in \mathcal {B}}\{\hat{r}_i\}\} - \hat{r}_k - p_j}&{}{\text {Case 4}} \\ {\hat{r}_j -\hat{r}_k}&{}{\text {Case 7}}\\ {\hat{r}_j + p_j - \hat{r}_k - p_k}&{}{\text {Case 8}} \end{array}\right. }\\&\tau _2^{\prime }= {\left\{ \begin{array}{ll} {0}&{}{\text {Cases 1,2,4,5,7,8}}\\ {\hat{r}_k - \hat{r}_j}&{}{\text {Case 3}}\\ {\hat{r}_k + p_k - \max \{\hat{r}_j+p_j ,\max _{i \in \mathcal {B}}\{\hat{r}_i\}\}}&{}{\text {Case 6}} \end{array}\right. } \end{aligned}$$

Following the above results, the first part of Dominance rule 4 is derived.

Table 4 Interchange cases

Dominance rule 4

\(({\varvec{\mathrm{DR}_4}}\); first part) Given an FB node associated with \({(\sigma _B,\emptyset )}\), if there exist two non-identical jobs \(j,k \in E_B\) with \(\mathcal {Q}_k \cap \mathcal {Q}_j = \mathcal {Q}_k\) and the inequality \(\varGamma ^*_{jk} (\tau _1,\tau _2,\tau _2^{\prime }) > 0\) holds, then \({(\sigma _B|j,\emptyset )}\) dominates \({(\sigma _B|k,\emptyset )}\).

6.2.2 Interchanging jobs in a BB node

Let \(j,k \in E_E\) where jobs \(j\) and \(k\) are not identical. We also assume that any unscheduled predecessor of job \(k\) is also a predecessor of job \(j\); in other words, \(\mathcal {P}_k \cap \mathcal {P}_j \cap U = \mathcal {P}_k \cap U\). Consider a BB node of the search tree with decision job \(k\). The partial schedule associated with the current node can be extended to a feasible schedule \(S_2\) in which job \(j\) is scheduled before job \(k\) but after all jobs in the sequence \(\sigma _B\). The set \(\mathcal {B} = U\backslash \{j,k\}\) contains all remaining unscheduled jobs. Let schedule \(S_2^{\prime }\) be constructed by interchanging jobs \(j\) and \(k\), while keeping the order in which the jobs belonging to \(\mathcal {B}\) will be scheduled. Figure 9 illustrates schedules \(S_2\) and \(S_2^{\prime }\).

Fig. 9
figure 9

Schedules \(S_2\) and \(S_2^{\prime }\)

For each \(t\) as the start time of job \(j\) in \(S_2\), we define a function \(\Delta _{jk} (t)\) as follows:

$$\begin{aligned} \Delta _{jk} (t) =&\, w_j \max \{0,t+p_j-d_j\} \\&- w_k \max \{0,t+p_k-d_k\} \\&+w_k \max \{0,st(\sigma _E)-d_k\} \\&- w_j \max \{0,st(\sigma _E)-d_j\}\\&- \max \{0,p_k - p_j\} \sum _{i \in \mathcal {B}}{w_i}. \end{aligned}$$
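Evaluating \(\Delta _{jk}(t)\) is a direct transcription of the definition above. The sketch below is illustrative only and uses a hypothetical dict-based job representation (keys `p`, `w`, `d` for processing time, weight, and due date).

```python
def delta_jk(t, j, k, st_sigma_e, B_weights):
    """Evaluate the gain function Delta_jk(t) for interchanging jobs
    j and k in a BB node; st_sigma_e is the start time of sigma_E and
    B_weights lists the weights of the jobs in B."""
    gain = j["w"] * max(0, t + j["p"] - j["d"])       # tardiness of j in S2
    gain -= k["w"] * max(0, t + k["p"] - k["d"])      # tardiness of k in S2'
    gain += k["w"] * max(0, st_sigma_e - k["d"])      # old tardiness of k
    gain -= j["w"] * max(0, st_sigma_e - j["d"])      # new tardiness of j
    gain -= max(0, k["p"] - j["p"]) * sum(B_weights)  # right-shift of B
    return gain
```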

In a BB node, for each \(t\) as the start time of job \(j\), \(\Delta _{jk} (t)\) is a lower bound on the gain of interchanging jobs \(k\) and \(j\), provided that the conditions of Theorem 9 are satisfied. Theorem 9 provides the conditions under which interchanging jobs \(j\) and \(k\) is feasible for every possible \(t = st_j(S_2)\).

Theorem 9

For each feasible schedule \(S_2\), a feasible schedule \(S_2^{\prime }\) can be created by interchanging jobs \(j\) and \(k\), if the following conditions are satisfied:

  1. \(st(\sigma _E) \le \hat{\delta }_j\);

  2. \(p_k - p_j \le 0\) or \(st(\sigma _E) - p_j \le \min \limits _{i \in \mathcal {B}}\{\hat{\delta }_i\}\).

Corollary 2 follows from Theorem 9 and Lemma 4, if we choose \(\alpha = w_j, \beta = w_k, a = d_j - p_j, b = d_k - p_k, u = C_{\max }(\sigma _B), v = st(\sigma _E) - p_k-p_j\), and \(C = w_k \max \{0,st(\sigma _E)-d_k\} - w_j \max \{0,st(\sigma _E)-d_j\} - \max \{0,p_k - p_j\} \sum _{i \in \mathcal {B}}{w_i}\). Let \({st_j^*}^{\prime }\) be computed as follows:

$$\begin{aligned} {st_j^*}^{\prime }= t^*(&w_j,w_k, d_j - p_j,d_k - p_k,\\&C_{\max }(\sigma _B) +\sum _{i \in \mathcal {P}_j \cap U}{p_i}, st(\sigma _E) - p_k-p_j). \end{aligned}$$

Corollary 2

\(\Delta ^*_{jk} = \Delta _{jk} ({st_j^*}^{\prime })\) is the minimum gain obtained by interchanging jobs \(j\) and \(k\), provided that interchanging jobs \(j\) and \(k\) is feasible for every possible \(t = st_j(S_2)\).

Following the above results, the second part of Dominance rule 4 is derived.

Dominance rule 4

\(({\varvec{\mathrm{DR}_4}}\); second part) Given a BB node associated with \({(\sigma _B,\sigma _E)}\), if there exist two non-identical jobs \(j,k \in E_E\) with \(\mathcal {P}_k \cap \mathcal {P}_j \cap U = \mathcal {P}_k \cap U\) and \(\Delta ^*_{jk} > 0\), then \({(\sigma _B,j|\sigma _E)}\) dominates \({(\sigma _B,k|\sigma _E)}\).

6.3 Dominance rule based on job insertion

We describe a dominance rule based on job insertion. This dominance rule, similar to the dominance rule based on job interchange, consists of two parts. The first part deals with the insertion of a job in an FB node, whereas the second part deals with the insertion of a job in a BB node.

6.3.1 Inserting a job in an FB node

In an FB node, let \(j,k \in E_B\) where jobs \(j\) and \(k\) are not identical. Again we assume that \( \hat{r}_k < \hat{r}_j+p_j\) and \(\hat{r}_j < \hat{r}_k+p_k\); otherwise, Dominance rule 2 enforces scheduling the job with the smaller \(\hat{r}\) before the job with the larger \(\hat{r}\) (recall that \(\hat{r}_j = \bar{r}_j\) and \(\hat{r}_k = \bar{r}_k\) because all predecessors of jobs \(j\) and \(k\) have already been scheduled, so the branching decisions already cover the propagation of the precedence constraints). Consider an FB node of the search tree in which job \(k\) is scheduled after the jobs in sequence \(\sigma _B\). Assume that the partial schedule associated with the current node can be extended to the feasible schedule \(S_1\) depicted in Fig. 8. We construct a schedule \(S_1^{{\prime }{\prime }}\) by inserting job \(j\) before job \(k\) while keeping the order of the jobs belonging to \(\mathcal {B}\). Figure 10 illustrates the construction of the schedule \(S_1^{{\prime }{\prime }}\).

Fig. 10
figure 10

Schedule \(S_1^{{\prime }{\prime }}\)

Let \(\tau _3\) be the maximum shift to the right of the jobs belonging to \(\mathcal {B}\), which is computed as follows:

$$\begin{aligned} \tau _3= \max \left\{ 0,\hat{r}_j+p_j+p_k-\max \left\{ \hat{r}_k+p_k,\min _{i \in \mathcal {B}}\{\bar{r}_i\}\right\} \right\} . \end{aligned}$$

For each \(t\) as the start time of job \(j\) in schedule \(S_1\), we define a function \(\varGamma _{jk}^{\prime } (t,\tau _3)\) as follows:

$$\begin{aligned} \varGamma _{jk}^{\prime } (t,\tau _3) =&\, w_j \max \{0,t+p_j-d_j\} \\&- w_k \max \{0,\hat{r}_j + p_j +p_k - d_k\} \\&+w_k \max \{0,\hat{r}_k+p_k-d_k\} \\&- w_j \max \{0,\hat{r}_j+p_j-d_j\} - \tau _3 \sum _{i \in \mathcal {B}}{w_i}. \end{aligned}$$

Job insertion, similar to job interchange, is not always feasible for every starting time \(t\) of job \(j\). We verify the feasibility of an insertion by ensuring that it does not cause any deadline and/or precedence-constraint violation for all possible \(t = st_j(S_1)\). Let \(\varPsi ^{\prime }\) be an upper bound for the completion time of the sequence \(S_1^{{\prime }{\prime }}\), computed as follows:

$$\begin{aligned} \varPsi ^{\prime }= \max \left\{ \hat{r}_j+p_j+p_k, \max _{i \in \mathcal {B}}\{\hat{r}_i\}\right\} +\sum _{i \in \mathcal {B}}{p_i}. \end{aligned}$$

The following theorem provides the conditions under which for every possible \(t = st_j(S_1)\) inserting job \(j\) before job \(k\) is feasible.

Theorem 10

For each feasible schedule \(S_1\), another feasible schedule \(S_1^{{\prime }{\prime }}\) can be created by inserting job \(j\) before job \(k\) if the following conditions hold:

  1. \(\hat{r}_j+p_j+ p_k \le \hat{\delta }_k \);

  2. \(\tau _3 = 0\) or \(\varPsi ^{\prime } \le \min \limits _{i \in \mathcal {B}}\{\hat{\delta }_i\}\).

Corollary 3 below follows from Theorem 10.

Corollary 3

\(\varGamma ^{\prime *}_{jk} (\tau _3)=\varGamma ^{\prime }_{jk} (\hat{r}_k + p_k,\tau _3) = \varGamma _{jk} (\hat{r}_k + p_k,\hat{r}_j+p_j - \hat{r}_k - p_k,\tau _3)\) is the minimum gain obtained by inserting job \(j\) before job \(k\) provided that for every possible \(t = st_j(S_1)\) inserting job \(j\) before job \(k\) is feasible.

Following the above results, the first part of Dominance rule 5 is derived.

Dominance rule 5

\(({\varvec{\mathrm{DR}_5}}\); first part) Consider an FB node associated with \({(\sigma _B,\emptyset )}\). If there exist two non-identical jobs \(j,k \in E_B\) for which the inequality \(\varGamma ^{\prime *}_{jk} (\tau _3) > 0\) holds, then \({(\sigma _B|j,\emptyset )}\) dominates \({(\sigma _B|k,\emptyset )}\).

6.3.2 Inserting a job in a BB node

In a BB node, let \(j,k \in E_E\) where jobs \(j\) and \(k\) are not identical. Consider a node of the search tree in which job \(k\) is scheduled before sequence \(\sigma _E\). Assume that the partial schedule associated with the current node can be extended to the feasible schedule \(S_2\) depicted in Fig. 9. We construct a schedule \(S_2^{{\prime }{\prime }}\) by inserting job \(j\) after job \(k\) but before the jobs in the sequence \(\sigma _E\), keeping the order of the jobs belonging to \(\mathcal {B}\). Figure 11 illustrates schedule \(S_2^{{\prime }{\prime }}\).

Fig. 11
figure 11

Schedule \(S_2^{{\prime }{\prime }}\)

For each \(t\), which is the start time of job \(j\) in schedule \(S_2\), we define the function \(\Delta ^{\prime }_{jk} (t)\) as follows:

$$\begin{aligned} \Delta ^{\prime }_{jk} (t) =&\,w_j \max \{0,t+p_j-d_j\} \\&- w_k \max \{0,st(\sigma _E)-p_j-d_k\} \\&+w_k \max \{0,st(\sigma _E)-d_k\}\\&- w_j \max \{0,st(\sigma _E)-d_j\}. \end{aligned}$$

Similarly to the previous results, for each feasible schedule \(S_2\), a feasible schedule \(S_2^{{\prime }{\prime }}\) is constructed by inserting job \(j\) after job \(k\), if \(st(\sigma _E) \le \hat{\delta }_j\). The following corollary is obtained:

Corollary 4

\(\Delta ^{\prime *}_{jk}=\Delta ^{\prime }_{jk}(C_{\max }(\sigma _B)+\sum _{i \in \mathcal {P}_j \cap U}{p_i})\) is the minimum gain obtained by inserting job \(j\) after job \(k\) provided that \(st(\sigma _E) \le \hat{\delta }_j\).

Following the above results, the second part of Dominance rule 5 is derived.

Dominance rule 5

\(({\varvec{\mathrm{DR}_5}}\); second part) Consider a BB node associated with \({(\sigma _B,\sigma _E)}\). If there exist two non-identical jobs \(j,k \in E_E\) for which the inequality \(\Delta ^{\prime *}_{jk} > 0\) holds, then \({(\sigma _B,j|\sigma _E)}\) dominates \({(\sigma _B,k|\sigma _E)}\).

6.4 Dominance rules on scheduled jobs

The dominance theorem of dynamic programming (see Jouglet et al. 2004) is another existing theorem that can be used to eliminate nodes in the search tree. It compares two partial sequences that contain identical subsets of jobs and eliminates the one having the larger total weighted tardiness. When total weighted tardiness values are the same, then only one of the sequences is kept. Let us consider two feasible partial sequences \(\sigma _1\) and \(\sigma _2\) (\(\sigma _2\) is a feasible permutation of \(\sigma _1\)) of \(k\) jobs, where \(k<n\). Let \(\mathcal {C}\) be the set of jobs in either \(\sigma _1\) or \(\sigma _2\). We are going to decide whether it is advantageous to replace \(\sigma _2\) by \(\sigma _1\) in all (partial) schedules in which \(\sigma _2\) orders the last \(k\) jobs. The set of scheduled jobs and the set of unscheduled jobs are identical for both \(\sigma _1\) and \(\sigma _2\). Sequence \(\sigma _1\) is as good as sequence \(\sigma _2\) if it fulfills one of the following conditions:

  1. \(C_{\max }(\sigma _1) \le C_{\max }(\sigma _2)\) and \(\mathrm {TWT}(\sigma _1) \le \mathrm {TWT}(\sigma _2)\);

  2. \(C_{\max }(\sigma _1) > C_{\max }(\sigma _2)\) and the following inequality also holds:

    $$\begin{aligned} \mathrm {TWT}(\sigma _1) + \left( \min \limits _{i \in U}\{\bar{r}^{\sigma _1}_i\} - \min \limits _{i \in U}\{\bar{r}^{\sigma _2}_i\}\right) \sum \limits _{i \in U}{w_i} \le \mathrm {TWT}(\sigma _2), \end{aligned}$$

    where \(\bar{r}^{\sigma _1}_i\) is the updated release date associated with the sequence \(\sigma _1\) and \(\bar{r}^{\sigma _2}_i\) is the updated release date associated with the sequence \(\sigma _2\).
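The two conditions can be combined into a single comparison. The sketch below (argument names are ours) assumes that the makespan, the total weighted tardiness, and the minimum updated release date over the unscheduled jobs are available for both sequences.

```python
def as_good_as(c1, twt1, minr1, c2, twt2, minr2, w_unsched):
    """Return True if sequence sigma_1 is as good as sigma_2.
    c*: makespans, twt*: total weighted tardiness values,
    minr*: minimum updated release dates over the unscheduled jobs,
    w_unsched: total weight of the unscheduled jobs."""
    if c1 <= c2:
        # Condition 1: sigma_1 finishes no later and costs no more
        return twt1 <= twt2
    # Condition 2: correct TWT(sigma_1) for the later restart of the
    # unscheduled jobs before comparing
    return twt1 + (minr1 - minr2) * w_unsched <= twt2
```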

Jouglet et al. (2004) determine the sequences that can be replaced by a dominant permutation. They find that sequence \(\sigma _1\) dominates sequence \(\sigma _2\) if the following two conditions hold:

  1. sequence \(\sigma _1\) is as good as sequence \(\sigma _2\);

  2. sequence \(\sigma _2\) is not as good as \(\sigma _1\) or \(\sigma _1\) has lexicographically smaller release dates than \(\sigma _2\).

Note that the second condition enforces a tie-breaking rule: a lexicographic number is associated with each sequence, and among equivalent sequences the one with the lower lexicographic number is selected. To avoid conflicts with Dominance rule 2, jobs are renumbered in non-decreasing order of their release dates \(r_j\).

Dominance rule 6

\({\varvec{(\mathrm{DR}_6)}}\) If there exists a better feasible permutation of \(\sigma _B\) and/or a better feasible permutation of \(\sigma _E\), then the node \({(\sigma _B,\sigma _E)}\) is fathomed.

If \(\sigma _E=\emptyset \) and there is a better feasible permutation of \(\sigma _B\), then the dominance is proven similarly to Theorem 13.6 in Jouglet et al. (2004). If \(\sigma _E \ne \emptyset \), then all jobs belonging to the set \(U\) will be scheduled between \(C_{\max }(\sigma _B)\) and \(st(\sigma _E) = C_{\max }(\sigma _B) + \sum _{j \in U}{p_j}\). Therefore, all permutations of \(\sigma _E\) start at time \(st(\sigma _E)\) and if there exists at least one better feasible permutation of \(\sigma _E\), then fathoming the node associated with \({(\sigma _B,\sigma _E)}\) does not eliminate the optimal solution.

Dominance rule 6, where only permutations of the last \(k\) jobs are considered, is referred to as \(\text {DR}_6^k\). Computing \(\text {DR}_6^n\) amounts to enumerating all \(\mathrm {O}(n!)\) feasible solutions, which would yield an optimal solution but is computationally prohibitive. In our B&B algorithm, we therefore choose \(k < n\). There is a trade-off between the computational effort needed to compute \(\text {DR}_6^k\) and the improvement achieved by eliminating dominated nodes. Based on initial experiments (see Table 5; more details on the instance generation are provided in Sect. 8.1), we observe that the algorithms perform worse when \(k > 6\). We also notice that it is not efficient to use \(\text {DR}_6^k\) when \(k > |U|\), because the computational effort to solve the subproblem consisting of the remaining \(|U|\) jobs is less than the computational effort needed to enumerate all feasible permutations of the last \(k\) jobs. Thus, \(k = \min \{|\sigma _B|,|U|,6\}\) while scheduling forward and \(k = \min \{|\sigma _E|,|U|,6\}\) while scheduling backward.

Table 5 Average CPU times (in s; first number) and number of unsolved instances within the time limit (between brackets, if any; out of \(864\)) for different choices of \(k\) in BB1 run on \({\text {Ins}}\)

We observe that in BB2 with unequal release dates, we switch from forward to backward branching at certain moments during the search, which forces us to restart with \(k =|\sigma _E| = 0\); we thus lose a number of pruning opportunities.

7 Initial upper bound

Although for most of the instances the B&B algorithm finds a reasonably good solution (a tight upper bound) quickly, there are instances for which feasible solutions are encountered only after a large part of the search tree has been scanned. Therefore, we initialize the upper bound in the root node of the B&B algorithm using a stand-alone (heuristic) procedure, which we refer to as time-window heuristic (TWH).

figure a

The key idea of our TWH is to iteratively locally improve a given sequence of jobs within a varying time window (Algorithm 1); similar ideas have already been proposed in the literature (Debels and Vanhoucke 2007; Kinable et al. 2014). It starts with any given sequence (note that finding a feasible sequence might be very difficult for some instances, so we also allow infeasible sequences). Then to locally improve the solution, the algorithm constructs a number of subproblems. Each subproblem is defined by two positions: a \(start\) position and an \(end\) position. The subproblem tries to optimally resequence the jobs that are positioned between the given \(start\) and \(end\) positions in the initial sequence such that the completion time of the subsequence does not exceed the start time of the job in the position \(end+1\). This additional condition is fulfilled by updating the deadline of all jobs \(j\) in the subproblem to \(\delta ^{SP}_j = \min \{\delta _j,st_{end+1}(\sigma )\}\) and updating the release date of all jobs \(j\) in the subproblem to \(r^{SP}_j = \max \{r_j,C_{start-1}(\sigma )\}\).
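The release-date and deadline updates that define a subproblem can be sketched as follows. This is our illustration, not the authors' code; it assumes a hypothetical data layout in which `completion` and `start_time` map sequence positions to times in the current schedule.

```python
def subproblem_windows(seq, jobs, start, end, completion, start_time):
    """Tighten the release date and deadline of every job in positions
    [start, end] of the current sequence, so that the resequenced block
    fits between the prefix and the suffix of the schedule.
    jobs[j] = (r, delta)  -- original release date and deadline."""
    window = {}
    for pos in range(start, end + 1):
        j = seq[pos]
        r, delta = jobs[j]
        r_sp = max(r, completion[start - 1])    # cannot start before the prefix ends
        d_sp = min(delta, start_time[end + 1])  # must finish before the suffix starts
        window[j] = (r_sp, d_sp)
    return window
```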

In TWH, the subprocedure IMPROVE_BY_SWAP is a naive local search procedure in which each pair of jobs is examined for swapping exactly once, in a steepest-descent fashion. The length of the subsequence to be reoptimized lies between \(minsize = 10\) and \(maxsize = \min \{\max \{n/2,10\},20\}\). Given a \(start\) and an \(end\) position, CONST_SP constructs the associated subproblem and SOLVE_BB solves it using the same branch-and-bound algorithms described in this paper. A new sequence is constructed using COPYSEQ.

The input sequence for TWH is the result of a dynamic priority rule that stepwise schedules jobs (from time \(0\) onwards) that are eligible according to the precedence constraints (meaning that all their predecessors have already been selected) and whose release date has already been reached; if no job is eligible in this way, then the algorithm proceeds to the earliest ready time of all jobs for which all predecessors have already been scheduled. If multiple jobs are eligible, then priority is given to the one with the earliest deadline and the lowest processing time. The deadline constraints are ignored in the computation of the eligible set of jobs, so the resulting sequence might not be feasible for \(\mathrm {P}\). In such cases, we add a large infeasibility cost to the objective function in the hope of finding a feasible solution during TWH.
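The dispatch rule above can be sketched as a greedy list-scheduling loop. The sketch below is a simplification under our own data layout (`jobs[j] = (r, p, delta)`); as in the text, deadlines are ignored when building the eligible set.

```python
def priority_sequence(jobs, preds):
    """Build the TWH input sequence: repeatedly pick, among the jobs whose
    predecessors are all scheduled and whose release date has been reached,
    the one with the earliest deadline, ties broken by shortest processing
    time. jobs: id -> (r, p, delta); preds: id -> set of predecessor ids."""
    t, seq, done = 0, [], set()
    remaining = set(jobs)
    while remaining:
        ready = [j for j in remaining if preds[j] <= done]
        eligible = [j for j in ready if jobs[j][0] <= t]
        if not eligible:
            t = min(jobs[j][0] for j in ready)  # jump to the earliest ready time
            eligible = [j for j in ready if jobs[j][0] <= t]
        j = min(eligible, key=lambda i: (jobs[i][2], jobs[i][1]))
        t = max(t, jobs[j][0]) + jobs[j][1]
        seq.append(j)
        done.add(j)
        remaining.remove(j)
    return seq
```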

The upper bound that is the output of TWH improves the runtime for those instances for which the branch-and-bound algorithm fails to find a feasible solution quickly. Furthermore, this upper bound turns out to be optimal for most of the instances of \(\mathrm {P}\), and for those instances for which the optimal solution is not found, the optimality gap is very low; see Table 6 (see Sect. 8.1 for more details on the different instance sets). To evaluate the efficiency of TWH, we have run some computational experiments, the results of which are reported in Tables 6 and 7. In Table 6, column total contains the total number of instances and the values in column feas represent the number of instances for which at least one feasible solution exists. Column fnd reports the number of times TWH finds a feasible solution, column opt counts the number of optimal solutions found, and column gap states the average optimality gap, averaged only over the instances for which the optimal solution was not found by TWH. The average optimality gap is computed as the average value of \(((\mathrm {UB} - \mathrm {OPT}) / \mathrm {UB})\), with \(\mathrm {UB}\) the output of TWH. Table 7 reports the average CPU times for the same subset of instances studied in Table 6. Note that the column that reports the average CPU times for \({\text {Ins}^{\mathrm{PAN}}}\) pertains to all instances with \(n = 30,40\), and \(50\).

Table 6 The performance of TWH
Table 7 Average CPU times (in s) of upper bound computation for different instance sets

8 Computational results

All algorithms have been implemented in VC++ 2010, and CPLEX 12.3 is used to solve the MIP formulations. All computational results were obtained on a Dell Latitude laptop with a 2.6 GHz Intel Core i7-3720QM processor and 8 GB of RAM, running Windows 7.

8.1 Instance generation

To the best of our knowledge, there are no benchmark sets of instances of problem P available, and so we have generated our own set of instances, which is referred to as \(\text {Ins}\). Two sets of benchmark instances for subproblems of P are also used in our experiments; these are referred to as \(\text {Ins}^{\mathrm{TAN}}\) and \(\text {Ins}^{\mathrm{PAN}}\) and are discussed in Sects. 8.5.1 and 8.5.2, respectively.

The set \(\text {Ins}\) consists of two disjoint subsets, \(\text {Ins}^S\) and \(\text {Ins}^L\): \(\text {Ins}^S\) contains instances with small processing times and \(\text {Ins}^L\) holds instances with large processing times. The values \(p_i (1 \le i \le n)\) are sampled from the uniform distribution \(U[1,\alpha ]\), where \(\alpha = 10\) for \(\text {Ins}^S\) and \(\alpha = 100\) for \(\text {Ins}^L\). For each subset, we generate instances with \(|N| = n = 10, 20, 30, 40\), and \(50\) jobs. Release dates \(r_i\) are drawn from \(U[0,\tau P]\), where \(P=\sum _{i \in N} p_i\) and \(\tau \in \{0.0,0.5, 1.0\}\). Due dates \(d_i\) are generated from \(U[r_i+p_i,r_i+p_i+\rho P]\) with \(\rho \in \{0.05, 0.25, 0.50\}\) and weights \(w_i\) stem from \(U[1,10]\). Up to here our generation is based on the instance generation procedure of Tanaka and Fujikuma (2012). Our modifications pertain to the generation of deadlines and precedence relations among jobs. Deadlines are chosen from \(U[d_i,d_i + \phi P]\) with \(\phi \in \{1.00, 1.25, 1.50\}\).

The addition of precedence constraints may lead to the generation of many instances with no feasible solution. For this reason, for each instance we first construct a feasible solution without considering precedence constraints (using branch-and-bound). Next, the jobs are re-indexed according to the job order in this feasible solution. If no feasible solution exists even without precedence constraints, we use the original indices. Subsequently, a precedence graph is created using the RanGen software (Demeulemeester et al. 2003) with \(\mathrm{OS} \in \{0.00, 0.25, 0.50, 0.75\}\), where OS is the order strength of the graph (a measure for the density of the graph). For any instance, if a feasible solution exists without precedence constraints, then the addition of precedence constraints will never render it infeasible because RanGen only generates arcs from lower indexed to higher indexed jobs.
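The numerical part of the generation scheme can be sketched as follows. The precedence-graph generation via RanGen and the feasibility-based re-indexing are omitted; the function name and layout are ours.

```python
import random

def generate_instance(n, alpha, tau, rho, phi, seed=0):
    """Sample processing times, release dates, due dates, weights, and
    deadlines according to the scheme described above (uniform draws,
    inclusive integer bounds)."""
    rng = random.Random(seed)
    p = [rng.randint(1, alpha) for _ in range(n)]        # p_i ~ U[1, alpha]
    P = sum(p)
    r = [rng.randint(0, int(tau * P)) for _ in range(n)] # r_i ~ U[0, tau*P]
    d = [rng.randint(r[i] + p[i], r[i] + p[i] + int(rho * P))
         for i in range(n)]                              # d_i ~ U[r+p, r+p+rho*P]
    w = [rng.randint(1, 10) for _ in range(n)]           # w_i ~ U[1, 10]
    delta = [rng.randint(d[i], d[i] + int(phi * P))
             for i in range(n)]                          # delta_i ~ U[d, d+phi*P]
    return p, r, d, w, delta
```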

In conclusion, for each combination of \((\alpha ,n,\tau ,\rho ,\phi ,\mathrm{OS})\), four instances are generated; the total number of instances is thus \(2 \times 5 \times 3 \times 3 \times 3 \times 4 \times 4 = 4320\). In all our experiments, the time limit is set to 1200 s. If an instance is not solved to guaranteed optimality, it is said to be ‘unsolved’ for the procedure. Throughout this section, we report averages computed only over the solved instances.

8.2 Lower bounds

We compare the quality of the lower bounds for the subset of instances with large processing times and \(n=30\). We set \(k_{\max } = 10\) for all lower bounds. The detailed results of this comparison are reported in Table 8.

Table 8 Average percentage gap from optimal value

The average gap for \(\mathrm {LB}_1\) is less than or equal to that for \(\mathrm {LB}_0\), especially when the precedence graph is dense; for \(\mathrm{OS} = 0\), on the other hand, there are no precedence constraints and \(\mathrm {LB}_0\) and \(\mathrm {LB}_1\) are essentially the same. A similar observation can be made for \(\mathrm {LB}_1\) and \(\mathrm {LB}_2\), where the gap for \(\mathrm {LB}_2\) is noticeably smaller than that for \(\mathrm {LB}_1\) when release dates are imposed, while in the case \(\tau = 0\), only one block is created and the lower bounds \(\mathrm {LB}_1\) and \(\mathrm {LB}_2\) coincide. The average gap for \(\mathrm {LB}_3\) is indeed smaller than that for \(\mathrm {LB}_2\), as was to be expected according to Observation 1.

Although we have no theoretical result that would indicate a better performance of \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) in comparison with \(\mathrm {LB}_2\), the average gap for \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) is less than that for \(\mathrm {LB}_2\) in case of non-zero release dates. When release dates are zero, however, the gap for \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) is larger than or equal to that for \(\mathrm {LB}_2\). In fact, when release dates are zero, only one block is created and constraints (19) can be removed from \(\mathrm {LR}_{\mathrm{P}_2}\), and thus \(\mathrm {LR}_{\mathrm{P}_2}\) is a relaxation of \(\mathrm {LR}_{\mathrm{P}_1}\). \(\mathrm {LB}_2^{\mathrm{SS}_\delta }\) performs better than \(\mathrm {LB}_2\) and \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) for most of the instances. Since \(\mathrm {LB}_2^{\mathrm{SS}_\delta }\) and \(\mathrm {LB}^{\mathrm{SS}_\mathrm{r}}_2\) require the same computational effort, we decide not to use \(\mathrm {LB}^{\mathrm{SS}_\mathrm{r}}_2\) in our B&B algorithms, where enumeration of child nodes is more efficient than extra computation of a weak bound. The gap for \(\mathrm {LB}^{\mathrm{SS}_\mathrm{r}}_3\) is less than that for \(\mathrm {LB}_2^{\mathrm{SS}_\mathrm{r}}\) and a similar observation holds for \(\mathrm {LB}^{\mathrm{SS}_\delta }_3\) versus \(\mathrm {LB}_2^{\mathrm{SS}_\delta }\), which confirms the result in Observation 1. Again, since \(\mathrm {LB}^{\mathrm{SS}_\mathrm{r}}_3\) and \(\mathrm {LB}^{\mathrm{SS}_\delta }_3\) are equally expensive in terms of computational effort, we decide not to use \(\mathrm {LB}^{\mathrm{SS}_\mathrm{r}}_3\).

In our final implementation, we will not compute all the bounds for all the nodes because this consumes too much effort. We start with computing \(\mathrm {LB}_\mathrm{T}, \mathrm {LB}_0\), and \(\mathrm {LB}_{\mathrm{SV1}}\) for the unscheduled jobs. Let \(S_\mathrm{best}\) be the best feasible schedule found. If the node is fathomed by \(\text {DR}_1\), then we backtrack; otherwise, if \(\mathrm {TWT}(S_P)\,+\,\lceil \mathrm {LB}_0 + \mathrm {LB}_{\mathrm{SV1}}\rceil \times 1.4< \mathrm {TWT}(S_\mathrm{best})\), then we do not compute the remaining lower bounds and continue branching. If the latter inequality does not hold, then we anticipate that with a better bound we might still be able to fathom the node, and we compute \(\mathrm {LB}_3\) and/or \(\mathrm {LB}^{\mathrm{SS}_\delta }_3\). For all lower bounds we choose \(k_{\max } = 0\) if \(\mathrm{OS} < 0.5\) and \(k_{\max } = 1\) otherwise.
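The bound-triage test above can be written compactly. The sketch below (names are ours) returns True when the cheap bounds are not far enough below the incumbent, so that the more expensive bounds \(\mathrm {LB}_3\) and \(\mathrm {LB}^{\mathrm{SS}_\delta }_3\) should still be computed.

```python
import math

def should_compute_expensive_bounds(twt_partial, lb0, lb_sv1, best_twt):
    """Bound triage: inflate the cheap combined bound by 40 %; if even the
    inflated value stays below the incumbent, skip the expensive bounds
    (return False) and keep branching."""
    inflated = twt_partial + math.ceil(lb0 + lb_sv1) * 1.4
    return not (inflated < best_twt)
```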

8.3 Dominance rules

In each node of the B&B algorithm, dominance rules are tested. Based on some preliminary experiments, we find that applying the rules in the following order performs well, and we will therefore follow this order throughout the algorithm:

$$\begin{aligned} \text {DR}_2, \text {DR}_3, \text {DR}_6^2, \text {DR}_4, \text {DR}_5, \text {DR}_6^{k}, \text {DR}_1. \end{aligned}$$

In order to evaluate the effectiveness of the rules, we examine a number of scenarios with respect to the selection of the implemented dominance rules; the list of scenarios is given in Table 9. Scenario 1 includes the simplest combination of dominance rules, namely \(\text {DR}_2, \text {DR}_3\), and \(\text {DR}_6^2\). From Scenario 2 to Scenario 4, extra rules are gradually added. In Scenario 5, all dominance rules are active except \(\text {DR}_4\), and in Scenario 6, only \(\text {DR}_5\) is inactive. Scenario 7 similarly includes all dominance rules except \(\text {DR}_6\). Finally, in Scenario 8, all dominance rules are active.

Table 9 The list of scenarios

For each of these implementations, we report the average CPU times and the average number of nodes explored in the search tree in Table 10; the results pertain to the instances of \(\text {Ins}\) with \(n= 20,30\). Scenarios 2 and 3 show the effect of \(\text {DR}_4\) and \(\text {DR}_5\). In Scenario 2, \(\text {DR}_4\) improves the performance of both algorithms, whereas in Scenario 3 \(\text {DR}_5\) has a beneficial effect only for BB2. Scenario 4 reflects the impact of \(\text {DR}_6\) for \(k\) jobs.

Table 10 The effect of the dominance rules

Comparing Scenario 5 to Scenario 8, we see that inclusion of \(\text {DR}_1\) has a strong beneficial effect on both algorithms; the effect is strongest in BB2 because tighter bounds can be computed by scheduling backward. From Table 10, we learn that apart from \(\text {DR}_2\), which is always crucial in total tardiness scheduling problems, the most important dominance rule is \(\text {DR}_6\): deactivating this rule triggers a huge increase in the average number of nodes and the average CPU times; incorporating \(\text {DR}_4\) also has a marked effect (compare Scenarios 5 and 8). Among all dominance rules tested, \(\text {DR}_5\) is the least important; removing \(\text {DR}_5\) slightly increases the node count and the runtimes in BB2. In BB1, removing \(\text {DR}_5\) even decreases the number of nodes and the runtimes; it turns out that for \(n>30\), however, the effect of \(\text {DR}_5\) is also (slightly) beneficial for BB1, and so we decide to adopt Scenario 8 as the final setting in which the experiments in the following sections will be run.

As a side note, we observe that for all the foregoing dominance rules, omitting the precedence constraints implied by the sets \(\mathcal {Q}_j\) and \(\mathcal {P}_j\) from the updates of \(\bar{r}_j\) and \(\bar{\delta }_j\) after the root node has only a small effect. We therefore do not include these precedence constraints in the updated release dates and deadlines, which avoids the additional computational overhead.
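For reference, the omitted precedence-based tightening would take the standard longest-path form (a sketch; we assume here that \(p_i\) denotes the processing time of job \(i\) and that \(\mathcal {P}_j\) and \(\mathcal {Q}_j\) collect the jobs that must precede and succeed job \(j\), respectively):

\[ \bar{r}_j \leftarrow \max \Bigl (\bar{r}_j,\; \max _{i \in \mathcal {P}_j} \bigl (\bar{r}_i + p_i\bigr )\Bigr ), \qquad \bar{\delta }_j \leftarrow \min \Bigl (\bar{\delta }_j,\; \min _{i \in \mathcal {Q}_j} \bigl (\bar{\delta }_i - p_i\bigr )\Bigr ). \]

Iterating such updates to a fixed point in every node is precisely the computational overhead that we avoid.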

8.4 Branch-and-bound algorithms

In this section we discuss the performance of our B&B algorithms. In Table 11, we compare the performance of BB1 and BB2 with that of the MIP formulations discussed in Sect. 3. In this table, as well as in the following ones, we report the average runtime and the number of unsolved instances (if there are any).

Table 11 Average CPU times (in s) and number of unsolved instances within the time limit (out of \(432\)) for the MIP formulations and the B&B algorithms run on \({\text {Ins}}\) with \(n = 10, 20\) and \(30\)

Based on Table 11, we conclude that the time-indexed formulations are far better than the assignment formulations when processing times are small. For large processing times, the performance of ASF is slightly better than that of TIF. Although ASF\(^{\prime }\) and TIF\(^{\prime }\) are tighter than their counterparts with aggregate precedence constraints, the extra computational effort needed to process the larger models increases the CPU times for both TIF\(^{\prime }\) and ASF\(^{\prime }\). The B&B algorithms BB1 and BB2 both clearly outperform the MIP formulations, regardless of the size of the processing times. Table 12 shows the performance of BB1 and BB2 applied to the larger instances of \(\text {Ins}\) (\(n = 40\) and \(50\)), which cannot be solved by the MIP formulations. On average, BB1 performs better than BB2, although this does not hold for all parameter settings (more details follow below). BB1 solves all instances with 40 jobs and fails to solve around 1.5 % of the instances with 50 jobs. BB2 fails to solve one instance with 40 jobs and around 2 % of the instances with 50 jobs. We will show below that all these unsolved instances belong to a specific class; it is worth mentioning that the difficult instances are not the same for the two B&B algorithms.

Table 12 Average CPU times (in s) and number of unsolved instances within the time limit (out of \(432\)) for BB1 and BB2 run on \({\text {Ins}}\) with \(n = 40\) and \(50\)

The number of precedence constraints obviously affects the performance of the algorithms (Table 13). On the one hand, by adding precedence constraints, the set of feasible sequences shrinks; on the other hand, the lower bounds also become less tight. The net result of these two effects is a priori not predictable. For instance classes without release dates and deadlines (\(r_j= 0\) and \(\delta _j = \infty \)), the quality of the lower bound is very good when \(\mathrm{OS} = 0\); therefore, the effect of a weaker bound due to higher OS will be more pronounced than when release dates and deadlines are also imposed.

Table 13 Average CPU times (in s) and number of unsolved instances within the time limit (out of \(126\)) for different choices of \(n\) and OS in BB1 and BB2 run on \({\text {Ins}}\)

To identify the classes of difficult instances, we focus on the case \(n=50\). Table 14 shows the outcomes of the experiments for each combination of \(\tau , \rho \), and OS. According to this table, the most time-consuming class of instances is the one where the release dates are neither loose nor tight (\(\tau = 0.50\)), the due dates are loose (\(\rho = 0.50\)), and the set of precedence constraints is empty (\(\mathrm{OS} = 0\)). No clear pattern can be distinguished for the algorithmic performance as a function of the tightness of the deadlines, so these results are excluded from the table. The unsolved instances are distributed differently for the two algorithms, although \(\tau = 0.5\) in all and \(\mathrm{OS} = 0\) in most of the unsolved instances. For example, BB1 solves all instances with \(\mathrm{OS} = 0.25\), whereas BB2 does not solve six of these instances. Also, BB1 fails to solve two instances with \(\rho = 0.25\), whereas BB2 fails on only one such instance.

Table 14 Average CPU times (in s) and number of unsolved instances within the time limit (out of \(24\)) for different choices of \(\tau , \rho \) and OS in BB1 and BB2 run on \({\text {Ins}}\) with \(n = 50\)

We represent each class of instances by a triple \((\tau ,\rho ,\mathrm{OS})\). As mentioned before, the hardest class of instances for both algorithms is \((0.5,0.5,0)\). The class \((0,0.5,0)\), which appears to be the third most difficult class for BB1, is very easy for BB2. Also, \((0.5,0.5,0.25)\), which is the second hardest class for BB2, does not require very high runtimes from BB1. We infer that BB2 is better than BB1 when the release dates are all equal (zero); in this case (cf. Sect. 4), stronger bounds are computed in backward scheduling. Conversely, BB1 is better than BB2 when the release dates are not equal (especially when \(\tau = 0.50\)). With unequal release dates, backward branching cannot start from the root node but only after a certain number of jobs have already been scheduled. Because branching forward increases the earliest possible starting and completion times of the jobs, the trivial lower bound and the Lagrangian-based lower bounds will be stronger for BB1 than for BB2. Moreover, as explained at the end of Sect. 6.4, and contrary to BB2, the value of \(k\) in the computation of \(\text {DR}_6^{k}\) is never restarted in BB1, so no pruning opportunities are lost.

8.5 Experiments for subproblems of \(\mathrm {P}\)

In this section, we present the results of our B&B algorithms for subproblems of \(\mathrm {P}\) that have also been studied in the earlier literature.

8.5.1 A single-machine problem with precedence constraints: \(1|\mathrm{prec}|\sum w_j T_j\)

One special case of \(\mathrm {P}\) is single-machine scheduling with precedence constraints where the objective is to minimize the total weighted tardiness. From our observations in Sect. 8.4, we know that we only need to consider BB2 for this subproblem because all release dates are zero, and so we compare the performance of BB2 with the SSDP algorithm proposed by Tanaka and Sato (2013). We apply both algorithms to the benchmark instances \(\text {Ins}^\mathrm{TAN}\) obtained from Tanaka and Sato (2013). For these instances, the parameter \(Pr\) denotes the probability that each arc \((i,j)\in N \times N\) with \(i\ne j\) is present in the precedence graph. Note that the resulting precedence graph may contain transitive arcs; in such cases, the transitive reduction is computed and used as input to BB2. Table 15 shows the computational results for our B&B algorithms and for the SSDP algorithm (which was run on the same computer). SSDP solves the instances within very short runtimes when there are no precedence constraints. SSDP performs worse, however, when the precedence graph is dense, whereas the B&B algorithms tend to perform better exactly in this case. To conclude this comparison, we underline the fact that our algorithms have been developed for the more general setting in which time windows are also imposed, whereas the instance set examined here does not contain such time windows.
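The transitive reduction step can be sketched as follows; this is a minimal illustration rather than the implementation used in our experiments, and the adjacency-set representation and DFS-based reachability test are illustrative choices. For a DAG, an arc \((i,j)\) is redundant exactly when another path from \(i\) to \(j\) exists.

```python
def transitive_reduction(n, arcs):
    """Return the transitive reduction of a DAG on nodes 0..n-1.

    An arc (i, j) is dropped when j remains reachable from i via some
    other path, i.e., when the arc is implied by transitivity.
    """
    adj = {v: set() for v in range(n)}
    for i, j in arcs:
        adj[i].add(j)

    def reachable(src, dst, skip):
        # Iterative DFS from src towards dst, ignoring the arc `skip`.
        stack, seen = [src], set()
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if (v, w) == skip:
                    continue
                if w == dst:
                    return True
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    return {(i, j) for i, j in arcs if not reachable(i, j, (i, j))}
```

For dense precedence graphs, as generated with high \(Pr\), many arcs are typically transitive, so this preprocessing can shrink the input to BB2 considerably.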

Table 15 Average CPU times (in s) for different choices of \(Pr\) and \(n\) in BB2 and SSDP run on \(\text {Ins}^{\mathrm{TAN}}\)

8.5.2 A single-machine problem with time windows: \(1|r_j,\delta _j|\sum w_j C_j\)

Another special case of \(\mathrm {P}\) is the single-machine problem with time windows where the objective is to minimize the total weighted completion time. We run our B&B algorithms on one of the instance sets provided by Pan and Shi (2005), introduced there as problem set (I), in which the parameters \(\alpha \) and \(\beta \) define the ranges for the generation of the release dates and deadlines, respectively. We refer to this instance set as \(\text {Ins}^{\mathrm{PAN}}\). To solve these instances with our algorithms, we set all due dates to zero. Table 16 shows the computational results of BB1 and BB2 applied to \(\text {Ins}^{\mathrm{PAN}}\). Our B&B algorithms both solve 394 out of the 400 instances to optimality within the time limit of 1200 s. Although a consistent pattern cannot be recognized, it seems that the hardest instances belong to the subsets where \(\alpha = 1\) and \(\beta = 16\). Since we do not have access to the code of Pan and Shi, direct comparisons are difficult, but overall our runtimes are of the same order of magnitude; the most difficult instances for Pan and Shi, however, are not the most difficult ones for our code, and vice versa.
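Note that with all due dates set to zero, the tardiness of each job coincides with its completion time, so the TWT objective reduces exactly to the total weighted completion time:

\[ T_j = \max (C_j - d_j,\, 0) = C_j \quad \text {when } d_j = 0, \qquad \text {and hence} \quad \sum _j w_j T_j = \sum _j w_j C_j . \]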

Table 16 Average CPU times (in s; first number) and number of unsolved instances within the time limit (between brackets, if any; out of \(10\)) for different choices of \(n, \alpha \) and \(\beta \) in BB1 and BB2 run on \(\text {Ins}^{\mathrm{PAN}}\)

In contrast with the discussion in Sect. 8.4, we notice that our two algorithms behave quite similarly on these instances. This can be explained as follows. First, for all members of \(\text {Ins}^{\mathrm{PAN}}\), the release dates are non-zero, so BB2 follows the same steps as BB1 until the release dates of all remaining jobs are less than the decision time. Second, since all due dates are zero, every job is already late in the root node, and thus the scheduling of any job (even at the beginning of the schedule) makes a positive contribution to the objective value. When due dates are non-zero, on the other hand, scheduling backward is advantageous because the jobs at the beginning of the schedule are mostly early, so they contribute zero to the objective value.

9 Summary and conclusion

In this article, we have developed exact algorithms for the single-machine scheduling problem with total weighted tardiness penalties. We work with a rather general problem statement, in that both precedence constraints and time windows (release dates and deadlines) are part of the input; this generalizes quite a number of problems for which computational procedures have already been published. We develop a branch-and-bound algorithm that solves the problem to guaranteed optimality. Computational results show that our approach is effective in solving medium-sized instances and that it compares favorably with straightforward MIP formulations. We have also compared our procedure with two existing methods for special cases of the problem, namely an SSDP algorithm and a B&B algorithm. The SSDP algorithm requires only very low runtimes in the absence of precedence constraints, but it performs worse when the precedence graph is dense, which is exactly the easiest setting for our B&B algorithms.