1 Introduction

We consider the \(P_m || C_{\max }\) scheduling problem [as denoted in the three-field classification by Graham et al. (1979)] where the goal is to schedule n jobs on m identical parallel machines \(M_i~(i=1,\dots ,m)\) to minimize the makespan. \(P_m || C_{\max }\) is strongly NP-hard (Garey and Johnson 1979) and has been intensively investigated in the literature from both a theoretical and a practical point of view. For an exhaustive discussion, we refer, among others, to the books by Leung et al. (2004), Pinedo (2016) and to the comprehensive survey by Chen et al. (1999). The pioneering approximation algorithm for the problem is the Longest Processing Time (LPT) rule proposed by Graham (1969). It requires sorting the jobs in non-ascending order of their processing times \(p_j ~(j=1,\dots ,n)\) and then, given the sorted job set, assigning one job at a time to the machine whose load is smallest so far. This assignment of jobs to machines is also known as list scheduling (LS). Several properties have been established for LPT in the last decades (Graham 1969; Coffman Jr. and Sethi 1976; Chen 1993; Blocher and Sevastyanov 2015). Among the various approaches explicitly based on the LPT rule, we also mention Dosa (2004), Dosa and Vizvari (2006). We recall the main theoretical results for LPT in the next section. LPT generally exhibits much better performance in practice than its worst-case theoretical ratios, especially as the number of jobs gets larger. Frenk and Rinnooy Kan (1987) also showed that LPT is asymptotically optimal under mild conditions on the probability distribution of the job processing times. Due to its simplicity and practical effectiveness, LPT became a cornerstone for the design of more involved exact and heuristic algorithms.
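As an illustration, the LPT rule can be sketched in a few lines of Python (a minimal sketch, not the authors' code; the function name and the heap-based machine selection are our own choices):

```python
# LPT rule: sort jobs by non-ascending processing time, then assign each job
# in turn to the machine with the smallest load so far (list scheduling).
import heapq

def lpt(processing_times, m):
    """Return the makespan and the schedule produced by LPT on m machines."""
    jobs = sorted(processing_times, reverse=True)
    loads = [(0, i) for i in range(m)]  # (current load, machine index)
    heapq.heapify(loads)
    schedule = [[] for _ in range(m)]
    for p in jobs:
        load, i = heapq.heappop(loads)  # least loaded machine so far
        schedule[i].append(p)
        heapq.heappush(loads, (load + p, i))
    return max(load for load, _ in loads), schedule

makespan, schedule = lpt([5, 5, 4, 4, 3, 3, 3], m=3)
print(makespan)  # 11 (an optimal schedule achieves 9)
```

With the heap, the rule runs in \({\mathcal {O}}(n\log n + n\log m)\) time, dominated by the initial sort.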

We mention other popular approximation algorithms which exploit connections of \(P_m || C_{\max }\) with bin packing: MULTIFIT (Coffman Jr. et al. 1978), COMBINE (Lee and Massey 1988) and LISTFIT (Gupta and Ruiz-Torres 2001). Such algorithms provide better worst-case performance than LPT, but at the cost of higher running times. More recently, other more involved heuristics have become available (see, e.g., Paletta and Ruiz-Torres 2015) that, however, require much higher computational effort. In addition, we mention the Largest Differencing Method (LDM) proposed by Karmarkar and Karp (1982), an efficient polynomial-time algorithm for which Michiels et al. (2007) showed an approximation ratio not superior to that of LPT. Also, polynomial-time approximation schemes (PTASs) were derived for the problem. The first PTAS was given by Hochbaum and Shmoys (1987). PTASs with improved running times were then provided by Alon et al. (1998), Hochbaum (1997), Jansen (2010). Recently, an improved PTAS has been proposed by Jansen et al. (2017).

The contribution of this work is twofold. First, we revisit the LPT rule and provide a simple algorithmic variant that manages to improve the long-standing approximation ratio derived by Graham (1969) keeping the same computational complexity. To establish our theoretical results, we also employ linear programming (LP) to analyze the worst-case performance of the proposed algorithm and to derive approximation bounds. In a sense, this paper can also be seen as a follow-up of the work of Mireault et al. (1997) where several LPs were used to determine the worst-case approximation ratio of LPT on two uniform machines. Recently a growing attention has been paid to the use of LP modeling for the derivation of formal proofs (see Chimani and Wiedera 2016; Abolhassani et al. 2016; Della Croce et al. 2018) and we also show here a successful application of this technique.

We then move from approximation to heuristics. By generalizing the proposed LPT-based approach, we obtain a simple algorithm running in \({\mathcal {O}}(n\log n)\) time which drastically improves upon LPT and can hence be regarded as a valuable alternative to the most popular constructive heuristic designed for this problem.

2 Notation and LPT properties

We first recall the main theoretical properties of LPT applied to \(P_m || C_{\max }\). From now on, we will consider the job set \({\mathcal {J}} = \{1,\dots , n\}\) sorted by non-increasing \(p_j\) (\(p_j \ge p_{j+1}, \;\; j=1,\dots ,n-1\)). We denote the solution values of the LPT schedule and the optimal makespan by \(C_{m}^{LPT}({\mathcal {J}})\) and \(C_{m}^*({\mathcal {J}})\), respectively, where index m indicates the number of machines. Also, similarly to Chen (1993), we denote by \(r_l^{LPT}\) the performance ratio \(\frac{C_{m}^{LPT}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})}\) of an LPT schedule with l jobs assigned to the machine yielding the makespan (the critical machine) and by \(j^\prime \) the last job assigned to that machine (the critical job). As is customary in the worst-case performance analysis of LPT, from now on we assume that the critical job is the last one, namely \(j^\prime = n\). Otherwise, jobs scheduled after the critical job would not affect the makespan provided by LPT but could contribute to increasing the optimal solution value. We summarize hereafter bounds on \(C_m^*({\mathcal {J}})\) and properties of the LPT schedule available in the literature.

Proposition 1

(Pinedo 2016) The following expressions hold:

$$\begin{aligned}&C_m^*({\mathcal {J}}) \ge \max \left\{ \frac{\sum \limits _{j=1}^{n} p_j}{m}, p_1 \right\} ; \end{aligned}$$
(1)
$$\begin{aligned}&C_{m}^{LPT}({\mathcal {J}}) = C_m^*({\mathcal {J}}) \text { if } p_{j^\prime } > \frac{C_{m}^*({\mathcal {J}})}{3}; \end{aligned}$$
(2)
$$\begin{aligned}&\text {otherwise, i.e., if} \, p_{j^\prime } \le \frac{C_{m}^*({\mathcal {J}})}{3}, \nonumber \\&C_{m}^{LPT}({\mathcal {J}}) \le \frac{\sum \limits _{j=1}^{j^\prime } p_j}{m} + p_{j^\prime }\left( 1-\frac{1}{m}\right) \nonumber \\&\quad \le C_{m}^*({\mathcal {J}}) + p_{j^\prime }\left( 1-\frac{1}{m}\right) . \end{aligned}$$
(3)

Proposition 2

(Chen 1993) For each job i assigned by LPT in position j on a machine, the following inequality holds

$$\begin{aligned}&p_i \le \frac{C_{m}^*({\mathcal {J}})}{j}. \end{aligned}$$
(4)

Proposition 3

The following tight approximation ratios hold for LPT:

$$\begin{aligned}&r_2^{LPT} \le \frac{4}{3} - \frac{1}{3(m-1)}; \quad \text {(Chen 1993)} \end{aligned}$$
(5)
$$\begin{aligned}&r_l^{LPT} \le \frac{l+1}{l} - \frac{1}{lm} \quad l \ge 3. \quad \text {(Coffman Jr. and Sethi 1976)} \end{aligned}$$
(6)

Approximation bounds (6) were derived by Coffman Jr. and Sethi (1976) and are also known as the a posteriori generalized bounds. For \(l=3\), the corresponding approximation ratio of \(\frac{4}{3} - \frac{1}{3m}\) is the well-known Graham’s bound (Graham 1969) and constitutes the worst-case bound for LPT. A straightforward implication of Proposition 2 is the following:

Lemma 1

If LPT provides a schedule where a non-critical machine processes at least k jobs before the critical job \(j^\prime \), then \(\frac{C_{m}^{LPT}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})} \le \frac{k+1}{k} - \frac{1}{km}\).

Proof

Denote by i the job in the kth position on the non-critical machine, with \(p_i \ge p_{j^\prime }\). Since we have \(p_i \le \frac{C_{m}^*({\mathcal {J}})}{k}\) from Proposition 2 and \(p_{j^\prime } \le \frac{C_{m}^*({\mathcal {J}})}{k}\) also holds, we get from inequality (3) that \(C_{m}^{LPT}({\mathcal {J}}) \le (\frac{k+1}{k} - \frac{1}{km} ) C_{m}^*({\mathcal {J}})\). \(\square \)

3 LPT revisited

3.1 Results for LPT

We provide here further insights into the Longest Processing Time rule. We first elaborate on the approximation bound provided in Lemma 1. We state the following proposition.

Proposition 4

If LPT schedules at least k jobs on a non-critical machine before assigning the critical job, then \(\frac{C_{m}^{LPT}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})} \le \frac{k+1}{k} - \frac{1}{k(m-1)}\) for \(m \ge k + 2\).

Proof

First, we can assume that the critical machine processes at least two jobs; otherwise, the LPT solution would be optimal as \(C_{m}^*({\mathcal {J}}) \ge p_1\) due to Proposition 1. Also, due to Proposition 2, condition \(p_{n} \le \frac{C_{m}^*({\mathcal {J}})}{k}\) holds. Denote by \(t_{c}\) the completion time of the critical machine before loading critical job n. We have \(C_{m}^{LPT}({\mathcal {J}}) = t_{c} + p_{n}\). Also, denote by \(t^{\prime }\) the completion time of a non-critical machine processing at least k jobs and by \(t^{\prime \prime }\) the sum of completion times of the other \((m-2)\) machines, namely \(t^{\prime \prime } = \sum \limits _{j=1}^{n} p_j - t^{\prime } - (t_{c} + p_{n})\). Since list scheduling is applied to the sorted jobs, each of the \((m-2)\) machines must have a completion time not inferior to \(t_{c}\). Hence, the following inequality holds

$$\begin{aligned}&\frac{t^{\prime \prime }}{m-2} \ge t_{c}. \end{aligned}$$
(7)

We now rely on linear programming to evaluate the worst-case performance ratio of LPT for any \(k \ge 1\) and \(m \ge k + 2\). More precisely, we introduce an LP formulation where we can arbitrarily set the value \(C_{m}^{LPT}({\mathcal {J}})\) to 1 and minimize the value of \(C_{m}^*({\mathcal {J}})\). We associate nonnegative variables \(sum_p\) and opt with \(\sum \limits _{j=1}^{n} p_j\) and \(C_{m}^*({\mathcal {J}})\), respectively. We also consider completion times \(t_{c}\), \(t^{\prime }\), \(t^{\prime \prime }\) and processing time \(p_n\) as nonnegative variables in the LP model. Since we have \(p_{n} \le \frac{C_{m}^*({\mathcal {J}})}{k}\), we introduce an auxiliary slack variable sl to write the corresponding constraint in the LP model as \(p_{n} + sl - \frac{opt}{k} = 0\). This implies the following LP model parametric in m and k:

$$\begin{aligned} \text {minimize}\quad&opt \end{aligned}$$
(8)
$$\begin{aligned} \text {subject to}\quad&- m\cdot opt + sum_p \le 0 \end{aligned}$$
(9)
$$\begin{aligned}&k\cdot p_n - t^{\prime } \le 0 \end{aligned}$$
(10)
$$\begin{aligned}&t_{c} - t^{\prime } \le 0 \end{aligned}$$
(11)
$$\begin{aligned}&(m-2)t_{c} - t^{\prime \prime } \le 0 \end{aligned}$$
(12)
$$\begin{aligned}&(t_{c} + p_n) + t^{\prime } + t^{\prime \prime } - sum_p = 0 \end{aligned}$$
(13)
$$\begin{aligned}&t_{c} + p_n = 1 \end{aligned}$$
(14)
$$\begin{aligned}&p_n+ sl - \frac{opt}{k} = 0 \end{aligned}$$
(15)
$$\begin{aligned}&t_{c}, t^{\prime }, t^{\prime \prime }, p_n, sum_p, opt, sl \ge 0 \end{aligned}$$
(16)

The minimization of the objective function (8), after setting without loss of generality the LPT solution value to 1 (constraint (14)), provides an upper bound on the performance ratio of the LPT rule. Constraint (9) represents the bound \(C_{m}^*({\mathcal {J}}) \ge \frac{\sum \nolimits _{j=1}^{n} p_j}{m}\), while constraint (10) states that the value of \(t^{\prime }\) is at least \(kp_{n}\), since k jobs with a processing time not inferior to \(p_{n}\) are assigned to the non-critical machine. Constraint (11) states that the completion time of the critical machine before the execution of the last job is not superior to the completion time of the other machine processing at least k jobs. Constraint (12) fulfills inequality (7). Constraint (13) guarantees that variable \(sum_p\) corresponds to \(\sum \nolimits _{j=1}^{n} p_j\), and constraint (15) represents condition \(p_{n} \le \frac{C_{m}^*({\mathcal {J}})}{k}\). Finally, constraints (16) state that all variables are nonnegative. Note that model (8)–(16) is independent of the number of jobs n. By analyzing the solutions of model (8)–(16) for several given values of m and k, we were able to deduce that a feasible solution of the model, for any \(k \ge 1\) and \(m \ge 2\), is:

$$\begin{aligned}&t_{c} = \frac{k(m - 1) - 1}{(k + 1)m - k - 2}; ~ p_n = \frac{m - 1}{(k + 1)m - k - 2}; \\&t^{\prime } = \frac{k(m - 1)}{(k + 1)m - k - 2}; ~ t^{\prime \prime } = \frac{(m - 2)(k(m - 1) - 1)}{(k + 1)m - k - 2}; \\&opt = \frac{k(m - 1)}{(k + 1)m - k - 2}; ~ sum_p = \frac{m(m - 1)k}{(k + 1)m - k - 2}; \\&sl = 0. \end{aligned}$$
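The feasibility of this solution can be spot-checked with exact rational arithmetic (a verification sketch in Python, not part of the formal proof; the names t1 and t2 stand for \(t^{\prime }\) and \(t^{\prime \prime }\)):

```python
# Exact-arithmetic check that the solution stated above satisfies all
# constraints (9)-(16) of the LP model, for a few values of k and m >= k + 2.
from fractions import Fraction as F

def primal_solution(k, m):
    d = (k + 1) * m - k - 2
    return dict(
        t_c=F(k * (m - 1) - 1, d), p_n=F(m - 1, d),
        t1=F(k * (m - 1), d),                   # t'
        t2=F((m - 2) * (k * (m - 1) - 1), d),   # t''
        opt=F(k * (m - 1), d), sum_p=F(m * (m - 1) * k, d), sl=F(0),
    )

def is_feasible(k, m, s):
    return (
        -m * s['opt'] + s['sum_p'] <= 0                            # (9)
        and k * s['p_n'] - s['t1'] <= 0                            # (10)
        and s['t_c'] - s['t1'] <= 0                                # (11)
        and (m - 2) * s['t_c'] - s['t2'] <= 0                      # (12)
        and s['t_c'] + s['p_n'] + s['t1'] + s['t2'] == s['sum_p']  # (13)
        and s['t_c'] + s['p_n'] == 1                               # (14)
        and s['p_n'] + s['sl'] - s['opt'] / k == 0                 # (15)
        and all(v >= 0 for v in s.values())                        # (16)
    )

for k in (1, 2, 3):
    for m in (k + 2, k + 3, 10):
        assert is_feasible(k, m, primal_solution(k, m))
```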

We can show by strong duality that such a solution is in fact optimal for any \(m \ge k + 2\). Plugging \( p_n = \frac{opt}{k} - sl\) from constraint (15) into constraints (13) and (14), we get an equivalent reduced LP model. If we associate dual variables \(\lambda _i\) \((i=1,\dots ,6)\) with constraints (9)–(14), respectively, the corresponding dual formulation of the reduced problem is as follows:

$$\begin{aligned} \text {maximize}\quad&\lambda _6 \end{aligned}$$
(17)
$$\begin{aligned} \text {subject to}\quad&- m\lambda _1 + \lambda _2 + \frac{\lambda _5 + \lambda _6}{k} \le 1 \end{aligned}$$
(18)
$$\begin{aligned}&\lambda _1 - \lambda _5 \le 0 \end{aligned}$$
(19)
$$\begin{aligned}&-k\cdot \lambda _2 - \lambda _5 - \lambda _6 \le 0 \end{aligned}$$
(20)
$$\begin{aligned}&-\lambda _2 - \lambda _3 + \lambda _5 \le 0 \end{aligned}$$
(21)
$$\begin{aligned}&\lambda _3 + (m - 2)\lambda _4 + \lambda _5 + \lambda _6 \le 0 \end{aligned}$$
(22)
$$\begin{aligned}&-\lambda _4 + \lambda _5 \le 0 \end{aligned}$$
(23)
$$\begin{aligned}&\lambda _1, \lambda _2, \lambda _3, \lambda _4 \le 0 \end{aligned}$$
(24)

where the dual constraints (18)–(23) are related to the primal variables opt, \(sum_p, sl\), \(t^{\prime }, t_{c}, t^{\prime \prime }\), respectively. For any \(m \ge k + 2\), a feasible solution of model (17)–(24) is

$$\begin{aligned} \lambda _1&= \lambda _2 = \lambda _4 = \lambda _5 = \frac{-k}{(k + 1)m - k - 2}; \\ \lambda _3&= 0; ~ \lambda _6 = \frac{k(m - 1)}{(k + 1)m - k - 2}; \end{aligned}$$

where condition \(m \ge k + 2\) is necessary to satisfy constraint (20). Since \(opt = \lambda _6\) in the above solutions, by strong duality these solutions are both optimal. We hence have

$$\begin{aligned} \frac{C_{m}^{LPT}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})}\le & {} \frac{1}{opt} = \frac{(k + 1)m - k - 2}{k(m - 1)} \\= & {} \frac{(k +1)(m - 1) - 1}{k(m - 1)} = \frac{k + 1}{k} - \frac{1}{k(m-1)} \end{aligned}$$

which shows the claim. \(\square \)
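The primal–dual pair used in the proof can likewise be spot-checked numerically (a sketch with exact arithmetic; the constraint numbers in the comments refer to the dual model above):

```python
# Exact-arithmetic spot-check: the dual point stated above is feasible for
# (18)-(24) and its objective lambda_6 equals the primal objective opt,
# certifying optimality for m >= k + 2.
from fractions import Fraction as F

def check_duality(k, m):
    d = (k + 1) * m - k - 2
    l1 = l2 = l4 = l5 = F(-k, d)
    l3 = F(0)
    l6 = F(k * (m - 1), d)
    opt = F(k * (m - 1), d)                        # primal objective value
    assert -m * l1 + l2 + (l5 + l6) / k <= 1       # (18)
    assert l1 - l5 <= 0                            # (19)
    assert -k * l2 - l5 - l6 <= 0                  # (20), needs m >= k + 2
    assert -l2 - l3 + l5 <= 0                      # (21)
    assert l3 + (m - 2) * l4 + l5 + l6 <= 0        # (22)
    assert -l4 + l5 <= 0                           # (23)
    assert all(l <= 0 for l in (l1, l2, l3, l4))   # (24)
    assert l6 == opt                               # strong duality
    assert 1 / opt == F(k + 1, k) - F(1, k * (m - 1))  # resulting ratio
    return True

for k in (1, 2, 3):
    for m in (k + 2, k + 3, 12):
        assert check_duality(k, m)
```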

Notably, with respect to Lemma 1, the result of Proposition 4 for \(k = 3\) already provides a bound better than Graham’s bound, namely \(\frac{4}{3} - \frac{1}{3(m-1)}\), for \(m \ge 5\). Also, with the results of Proposition 4 we can state the following result.

Proposition 5

In \(P_m || C_{\max }\) instances with \(n \ge 2m + 2\) and \(m \ge 5\), LPT has an approximation ratio not superior to \(\frac{4}{3} - \frac{1}{3(m-1)}\).

Proof

In instances with \(n \ge 3m+1\), there exists at least one machine in the LPT schedule executing at least four jobs. Then, either that machine is the critical one or not. Correspondingly, either inequality (6) with \(l \ge 4\) holds or Proposition 4 with \(k \ge 3\) applies, showing the above claim.

Consider the remaining cases with \(2m + 2 \le n \le 3m\). We assume that critical job n is in third position on a machine; otherwise, either the bound on \(r_2^{LPT}\) or at least the bound on \(r_4^{LPT}\) applies. This implies that LPT schedules at least another job in position \(\ge 3\) on a non-critical machine. Hence, the results of Proposition 4 with \(k = 3\) apply. \(\square \)

In instances with \(n \ge 2m+2\) and \(3 \le m \le 4\), we can combine the reasoning underlying model (8)–(16) with a partial enumeration of the optimal/LPT solutions and state the following result.

Proposition 6

In \(P_m || C_{\max }\) instances with \(n \ge 2m+2\), LPT has an approximation ratio not superior to \(\frac{4}{3} - \frac{1}{3(m-1)}\) for \(3 \le m \le 4\).

Proof

See “Appendix.” \(\square \)

Consider now instances with 2m jobs at most. The following proposition holds.

Proposition 7

In \(P_m || C_{\max }\) instances with \(n \le 2m\), LPT has an approximation ratio not superior to \(\frac{4}{3} - \frac{1}{3(m-1)}\).

Proof

We consider the case \(n=2m\) only. All other cases \(n < 2m\) can be reduced to the previous one after adding \(2m - n\) dummy jobs with null processing time.

It is well known (see, e.g., Graham 1969) that if each machine processes two jobs at most in an optimal schedule, the solution provided by LPT would be optimal. Hence, we consider the case where there is one machine processing at least three jobs in an optimal solution. This situation straightforwardly implies that job 1 has to be processed alone on a machine. Therefore, we have \(C_{m}^*({\mathcal {J}}) \ge C_{m-1}^*({\mathcal {J}} \setminus \{1\})\) since the optimal makespan with m machines could be as well given by the machine processing only job 1.

On the other hand, to contradict the claim, LPT must have the critical machine processing more than two jobs; otherwise, we could use bound (5). This implies that job 1 is processed alone on a machine and cannot give the makespan; otherwise, the LPT solution would be optimal due to expression (1). We thus have \(C_{m}^{LPT}({\mathcal {J}}) = C_{m-1}^{LPT}({\mathcal {J}} \setminus \{1\})\). Combining these results with Graham’s bound on the problem instance with \(m -1\) machines and without job 1, we get

$$\begin{aligned}&\frac{C_{m}^{LPT}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})} \le \frac{C_{(m-1)}^{LPT}({\mathcal {J}} \setminus \{1\})}{C_{(m-1)}^*({\mathcal {J}} \setminus \{1\})} \le \frac{4}{3} - \frac{1}{3(m-1)}. \end{aligned}$$

\(\square \)

For instances with exactly \(2m + 1\) jobs, we provide the following proposition.

Proposition 8

In instances with \(n = 2m + 1\), if LPT loads at least three jobs on a machine before the critical job, then the approximation ratio is not superior to \(\frac{4}{3} - \frac{1}{3(m-1)}\).

Proof

If LPT schedules at least three jobs on a machine before critical job n, then job 1 is processed either alone on a machine or with critical job n only. In the latter case, the claim is shown through bound (5). Alternatively, job 1 is processed alone on machine \(M_1\). Also, \(M_1\) cannot give the makespan; otherwise, LPT would yield an optimal solution. This implies that \(C_{m}^{LPT}({\mathcal {J}}) = C_{m-1}^{LPT}({\mathcal {J}} \setminus \{1\})\) and that a trivial upper bound on the LPT solution value is \(p_1 + p_n\), as the starting time of job n is not superior to \(p_1\). In this case, if an optimal solution schedules job 1 with another job, we have \(C_{m}^*({\mathcal {J}}) \ge p_1 + p_n\), and thus, LPT would also give the optimal makespan. If an optimal solution schedules job 1 alone on \(M_1\), then inequality \(C_{m}^*({\mathcal {J}}) \ge C_{m-1}^*({\mathcal {J}} \setminus \{1\})\) holds. Combining these results with Graham’s bound as in Proposition 7, we have

$$\begin{aligned} \frac{C_{m}^{LPT}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})} \le \frac{C_{(m-1)}^{LPT}({\mathcal {J}} \setminus \{1\})}{C_{(m-1)}^*({\mathcal {J}} \setminus \{1\})} \le \frac{4}{3} - \frac{1}{3(m-1)}. \end{aligned}$$

\(\square \)

Summarizing the above properties, the following theorem holds.

Theorem 1

LPT has an approximation ratio not superior to \(\frac{4}{3} - \frac{1}{3(m-1)}\) in all instances with \(m \ge 3\) and \(n \ne 2m + 1\). Also, LPT can actually hit Graham’s bound of \(\frac{4}{3} - \frac{1}{3m}\) for \(m \ge 2\) only in instances with \(2m + 1\) jobs with the critical machine processing three jobs and all the other machines processing two jobs.
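As an illustration of the second claim, Graham's classical worst-case family has \(2m+1\) jobs with processing times \(2m-1, 2m-1, 2m-2, 2m-2, \ldots , m+1, m+1, m, m, m\); the following sketch (our own code, not from the paper) checks that LPT hits the ratio \(\frac{4}{3} - \frac{1}{3m}\) on this family for \(m = 3\):

```python
# Worst-case family (Graham 1969): for m = 3 the jobs are 5, 5, 4, 4, 3, 3, 3.
# LPT yields makespan 4m - 1 = 11, while {5,4}, {5,4}, {3,3,3} gives 3m = 9.
import heapq
from fractions import Fraction

def lpt_makespan(jobs, m):
    loads = [(0, i) for i in range(m)]
    heapq.heapify(loads)
    for p in sorted(jobs, reverse=True):
        load, i = heapq.heappop(loads)
        heapq.heappush(loads, (load + p, i))
    return max(load for load, _ in loads)

m = 3
jobs = [5, 5, 4, 4, 3, 3, 3]          # 2m + 1 = 7 jobs
c_lpt = lpt_makespan(jobs, m)          # 11
c_opt = 3 * m                          # 9
assert Fraction(c_lpt, c_opt) == Fraction(4, 3) - Fraction(1, 3 * m)
```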

3.2 Improving the LPT bound: Algorithm LPT-REV

We consider a slight algorithmic variation in LPT where a subset of the sorted jobs is first loaded on \(M_1\), and then, LPT is applied to the remaining job set on all machines (including \(M_1\)). We denote this variant as \(LPT({\mathcal {S}})\) where \({\mathcal {S}}\) represents the set of jobs assigned altogether to a machine first. Consider the following algorithm, denoted by LPT-REV, which first applies LPT and then \(LPT({\mathcal {S}})\) for at most two suitable choices of the job set \({\mathcal {S}}\).

Algorithm LPT-REV (pseudocode figure)

In practice, the LPT-REV algorithm applies LPT first and then re-applies LPT after having loaded onto a machine either the critical job \(j^\prime \) or the tuple of the k smallest jobs \(j^\prime - k+1,\ldots ,j^\prime \) (this latter case applies only for \(m = 2\)). Eventually, the algorithm takes the best of the computed solutions.
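A compact sketch of LPT-REV for \(m \ge 3\) (our own Python rendering of the description above, under the assumption that \(LPT({\mathcal {S}})\) preloads the jobs of \({\mathcal {S}}\) on \(M_1\) and then list-schedules the remaining sorted jobs on all machines):

```python
import heapq

def list_schedule(jobs, m, preload=0):
    """Greedy list scheduling of (sorted) jobs; M_1 may carry a preloaded set.
    Returns the makespan and the index of the job finishing last
    (None if the preload alone gives the makespan)."""
    heap = [(preload if i == 0 else 0, i) for i in range(m)]
    heapq.heapify(heap)
    makespan, critical = preload, None
    for j, p in enumerate(jobs):
        load, i = heapq.heappop(heap)
        heapq.heappush(heap, (load + p, i))
        if load + p >= makespan:
            makespan, critical = load + p, j
    return makespan, critical

def lpt_rev(processing_times, m):
    """Best of LPT and LPT({critical job}): the m >= 3 case of LPT-REV."""
    jobs = sorted(processing_times, reverse=True)
    c_lpt, crit = list_schedule(jobs, m)
    if crit is None:
        return c_lpt
    rest = jobs[:crit] + jobs[crit + 1:]
    c_rev, _ = list_schedule(rest, m, preload=jobs[crit])
    return min(c_lpt, c_rev)

print(lpt_rev([5, 5, 4, 4, 3, 3, 3], m=3))  # 10, versus 11 for plain LPT
```

The additional run for \(m = 2\) with the tuple of the k smallest jobs can be added analogously; the overall running time remains \({\mathcal {O}}(n\log n)\).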

In the following, we will show how to improve the long-standing Graham’s bound by means of algorithm LPT-REV for \(m \ge 3\). To this end, given Theorem 1, we just need to address the case where the LPT critical job \(j^\prime \) is job \(2m + 1\). So we consider instances with either \(j^\prime = 2m + 1 < n\) or \(j^\prime = 2m + 1 = n\). The LPT schedules with \(j^\prime = 2m + 1 < n\) will be considered in Sect. 3.2.1 by means of Proposition 9 and additional remarks. The LPT schedules with \(j^\prime = 2m + 1 = n\) will be covered in Sect. 3.2.2 by means of Propositions 10–14. Finally, the approximation ratio of LPT-REV will be stated in Sect. 3.2.3 in Theorem 2.

3.2.1 LPT schedules with \(j^\prime = 2m + 1 < n\), \(m\ge 3\)

If there exists a job \(i> j^\prime \) that is critical in the solution provided by \(LPT^\prime \), the following proposition holds.

Proposition 9

In \(P_m || C_{\max }\) instances where there exists a job i such that \(j^\prime < i \le n\) and i is critical in \(LPT^\prime \) schedule, LPT-REV algorithm has a performance guarantee of \(\frac{4}{3} - \frac{7m - 4}{3(3m^2 + m - 1)}\).

Proof

Denote by \(\beta \sum \nolimits _{j=1}^{n} p_j\) the overall processing time of jobs \(j^\prime + 1, \dots , n\), with \(0< \beta < 1\). Due to Graham’s bound, the following relation holds for LPT when only jobs \(1, \dots , j^\prime \) are considered:

$$\begin{aligned} C_{m}^{LPT}({\mathcal {J}})&\le \frac{\sum \nolimits _{j=1}^{j^\prime } p_j}{m}\left( \frac{4}{3} - \frac{1}{3m}\right) \nonumber \\&= \frac{(1 - \beta )\sum \nolimits _{j=1}^{n} p_j}{m}\left( \frac{4}{3} - \frac{1}{3m}\right) . \end{aligned}$$
(25)

From (25), we have

$$\begin{aligned}&\frac{C_{m}^{LPT}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})} \le \frac{C_{m}^{LPT}({\mathcal {J}})}{\frac{\sum \nolimits _{j=1}^{n} p_j}{m}} \le (1 - \beta )\left( \frac{4}{3} - \frac{1}{3m}\right) . \end{aligned}$$
(26)

We introduce a target LPT approximation ratio denoted as \(\rho \) and identify the value of \(\beta \) which gives such a bound. We have

$$\begin{aligned}&(1 - \beta )\left( \frac{4}{3} - \frac{1}{3m}\right) = \rho \implies \beta = 1 - \frac{3m\rho }{4m - 1}. \end{aligned}$$
(27)

Consider now the solution provided by \(LPT^\prime \) and denote by \(t_{c^\prime }\) the processing time of the jobs on the critical machine that are processed before the critical job i. Since the following relations hold

$$\begin{aligned}&mt_{c^\prime } + p_i \le \sum \limits _{j=1}^{n} p_j \le mC_{m}^{*} \implies t_{c^\prime } + \frac{p_i}{m} \le C_{m}^{*};\\&p_i \le \beta \sum \limits _{j=1}^{n} p_j \le m\beta C_{m}^{*}, \end{aligned}$$

we have, in combination with (27), that

$$\begin{aligned}&C_{m}^{LPT^\prime }({\mathcal {J}}) = t_{c^\prime } + p_i = \left( t_{c^\prime } + \frac{p_i}{m}\right) + p_i\left( 1 - \frac{1}{m}\right) \nonumber \\&\quad \le C_{m}^*({\mathcal {J}}) +(m-1) \beta C_{m}^*({\mathcal {J}}) \nonumber \\&\quad \implies \frac{C_{m}^{LPT^\prime }({\mathcal {J}})}{ C_{m}^*({\mathcal {J}})} \le 1 + (m-1)\beta \nonumber \\&\quad = 1 + (m-1)\left( 1 - \frac{3m\rho }{4m - 1}\right) . \end{aligned}$$
(28)

Hence, algorithm LPT-REV has a performance guarantee equal to \(\min \{1 + (m-1)(1 - \frac{3m\rho }{4m - 1}); \rho \}\). This expression reaches its largest value when the two terms are equal, namely

$$\begin{aligned}&1 + (m-1)\left( 1 - \frac{3m\rho }{4m - 1}\right) = \rho . \end{aligned}$$
(29)

From condition (29), we derive

$$\begin{aligned} \rho&= \frac{4m -1}{1 + 3m - \frac{1}{m}} = \frac{4}{3} - \frac{7m - 4}{3(3m^2 + m - 1)} \\&\ge \frac{C_{m}^{\textit{LPT-REV}}({\mathcal {J}})}{C_{m}^*({\mathcal {J}})}. \end{aligned}$$

\(\square \)
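The closed form obtained from (29) can be spot-checked exactly (a sketch; the identity \(\rho = \frac{m(4m-1)}{3m^2 + m - 1}\) is an equivalent rewriting of the expression above):

```python
# Check that rho = m(4m-1)/(3m^2+m-1) solves equation (29) and matches the
# bound 4/3 - (7m-4)/(3(3m^2+m-1)), for a range of m.
from fractions import Fraction as F

for m in range(2, 50):
    rho = F(m * (4 * m - 1), 3 * m * m + m - 1)
    lhs = 1 + (m - 1) * (1 - F(3 * m, 4 * m - 1) * rho)  # left side of (29)
    assert lhs == rho
    assert rho == F(4, 3) - F(7 * m - 4, 3 * (3 * m * m + m - 1))
```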

It is easy to check that the bound of Proposition 9 is strictly inferior to \(\frac{4}{3} - \frac{1}{3(m-1)}\) for \(m \ge 3\). If the critical job in the \(LPT^\prime \) schedule is a job \(h \le j^\prime \), then LPT-REV cannot have a worse approximation ratio than the one reached in a reduced instance where all jobs after \(j^\prime \) are discarded, and the same analysis as in Sect. 3.2.2 applies.

3.2.2 LPT schedules with \(j^\prime = 2m + 1 = n\), \(m \ge 3\)

We show here that the addition of \(LPT^\prime \) allows us to improve Graham’s bound to \(\frac{4}{3} - \frac{1}{3(m-1)}\). We concentrate on instances with \(2m+1\) jobs where LPT must couple jobs \(1,\dots , m\), respectively, with jobs \(2m, \dots , m+1\) on the m machines before scheduling job \(2m + 1\); otherwise, Theorem 1 would hold. Therefore, we will consider the following LPT schedules with a pair of jobs on each machine \(M_i\) \((i=1,\dots ,m)\):

$$\begin{aligned}&M_1: p_{1}, p_{2m} \\&M_2: p_{2}, p_{2m-1} \\&\dots \\&M_{m-1}: p_{m - 1}, p_{m + 2} \\&M_m: p_{m}, p_{m + 1} \end{aligned}$$

where job \(2m+1\) will be assigned to the machine with the least completion time. We analyze the following two subcases.

Case 1: \(p_{(2m+1)} \ge p_{1} - p_{m}\)

In this case, the last job \(2m+1\) has a processing time greater than (or equal to) the difference \( p_{1} - p_{m}\). Consider the \(LPT^\prime \) heuristic. Since the LPT critical job is assumed to be the last one, i.e., \(j^\prime = 2m + 1\), the heuristic will assign jobs \(2m+1, 1,\dots ,m-1\) to machines \(M_1, M_2, \dots , M_m\), respectively. Then, job m will be loaded on \(M_1\) together with job \(2m+1\). Since \(p_{(2m+1)} + p_{m} \ge p_{1} \), job \(m + 1\) will be processed on the last machine \(M_m\) after job \(m-1\). Now we have

$$\begin{aligned} p_{(m-1)} + p_{(m+1)} \ge p_{(2m+1)} + p_{m} \ge p_{1} \end{aligned}$$

since \(p_{(m-1)} \ge p_{m}\) and \(p_{(m+1)} \ge p_{(2m+1)}\). Hence, job \(m+2\) is loaded on machine \(M_{(m-1)}\) with job \(m-2\). As before, it follows that \(p_{(m-2)} + p_{(m+2)} \ge p_{(2m+1)} + p_{m}\). Consequently, job \(m+3\) is processed on \(M_{(m-2)}\) after job \(m-3\). By applying the same argument, \(LPT^\prime \) will assign the second job in reversed order to each machine until job \(2m - 1\) is assigned to \(M_2\). Eventually, job 2m will be assigned to \(M_1\), since it will be the least loaded machine at that point. This is because every other machine holds a largest (resp. smallest) job with processing time not inferior to \(p_m\) (resp. \(p_{(2m +1)}\)). Summarizing, \(LPT^\prime \) will provide the following schedule:

$$\begin{aligned}&M_1: p_{(2m+1)}, p_m, p_{2m} \\&M_2: p_{1}, p_{(2m-1)} \\&M_3: p_{2}, p_{(2m-2)} \\&\dots \\&M_{(m-1)}: p_{(m - 2)}, p_{(m + 2)} \\&M_m: p_{(m - 1)}, p_{(m + 1)} \end{aligned}$$

Assume now that the critical machine is \(M_1\) with non-optimal completion time \(C_{m}^{LPT^\prime }({\mathcal {J}}) = p_{(2m+1)} + p_m + p_{2m} > C_{m}^*({\mathcal {J}})\) (or else LPT-REV would provide the optimal solution). The following proposition holds:

Proposition 10

If \(p_{(2m+1)} \ge p_{1} - p_{m}\) and \(C_{m}^{LPT^\prime }({\mathcal {J}}) = p_{(2m+1)} + p_m + p_{2m} > C_{m}^*({\mathcal {J}})\), then \(C_{m}^*({\mathcal {J}}) \ge p_{(m-1)} + p_m\) in any optimal schedule.

Proof

We prove the claim by contradiction. We assume that an optimal schedule assigns jobs \(1, 2, \dots , m\) to different machines or else \(C_{m}^*({\mathcal {J}}) \ge p_{(m-1)} + p_m\) immediately holds. Correspondingly, since there exists a machine processing three jobs, the optimal makespan can be lower bounded by \(p_m + p_{2m} + p_{(2m+1)}\). But as \(p_m + p_{2m} + p_{(2m+1)} > C_{m}^*({\mathcal {J}})\) holds, a contradiction on the optimality of the schedule is implied. \(\square \)

The following proposition also holds.

Proposition 11

If \(p_{(2m+1)} \ge p_{1} - p_{m}\) and \(C_{m}^{LPT^\prime }({\mathcal {J}}) = p_{(2m+1)} + p_m + p_{2m}\), then algorithm LPT-REV has an approximation ratio not superior to \(\frac{7}{6}\).

Proof

We again employ linear programming to evaluate the performance of \(LPT^\prime \). More precisely, we consider an LP formulation with nonnegative variables \(p_j\) (\(j=1,\dots ,n\)) denoting the processing times and a positive parameter \(OPT > 0\) associated with \(C_{m}^*({\mathcal {J}})\). The corresponding LP model for evaluating the worst-case performance of \(LPT^\prime \) heuristic is as follows:

$$\begin{aligned} \text {maximize}\quad&p_{(2m+1)} + p_m + p_{2m} \end{aligned}$$
(30)
$$\begin{aligned} \text {subject to}\quad&p_{(m-1)} + p_m \le OPT \end{aligned}$$
(31)
$$\begin{aligned}&p_{(2m-1)} + p_{2m} + p_{(2m+1)} \le OPT \end{aligned}$$
(32)
$$\begin{aligned}&p_{(2m+1)} - (p_{1} - p_{m}) \ge 0 \end{aligned}$$
(33)
$$\begin{aligned}&p_1 - p_{(m-1)} \ge 0 \end{aligned}$$
(34)
$$\begin{aligned}&p_{(m-1)} - p_m \ge 0 \end{aligned}$$
(35)
$$\begin{aligned}&p_m - p_{(m+1)} \ge 0 \end{aligned}$$
(36)
$$\begin{aligned}&p_{(m+1)} - p_{(2m-1)} \ge 0 \end{aligned}$$
(37)
$$\begin{aligned}&p_{(2m-1)} - p_{2m} \ge 0 \end{aligned}$$
(38)
$$\begin{aligned}&p_{2m} - p_{(2m + 1)} \ge 0 \end{aligned}$$
(39)
$$\begin{aligned}&p_1, p_{(m-1)}, p_m, p_{(m+1)}, p_{(2m-1)}, p_{2m}, p_{(2m+1)} \ge 0 \end{aligned}$$
(40)

The objective function value (30) represents an upper bound on the worst-case performance of the algorithm. Constraints (31)–(32) state that the optimal value \(C_{m}^*({\mathcal {J}})\) is lower bounded according to Proposition 10 and by the sum of the three jobs with the smallest processing times. Constraint (33) simply represents the initial assumption \(p_{(2m+1)} \ge p_{1} - p_{m}\). Constraints (34)–(39) state that the considered relevant jobs are sorted by non-increasing processing times, while constraints (40) indicate that the variables are nonnegative. We remark that parameter OPT can be arbitrarily set to any value \(> 0\). Further valid inequalities (such as \(p_{(2m+1)} + p_m + p_{2m} \ge p_{1} + p_{(2m-1)}\) or \(OPT \ge \frac{\sum \limits _{j=1}^{n} p_j}{m}\)) were omitted as they do not lead to any improvement on the worst-case performance ratio. Notice that the number of variables of the reduced LP formulation is constant for any value of m.

By setting w.l.o.g. \(OPT = 1\) and solving model (30)–(40), we get an optimal solution value \(z^*\) equal to \(1.1666\ldots = \frac{7}{6}\). Correspondingly, the approximation ratio is \(\frac{z^*}{OPT} = \frac{7}{6}\). \(\square \)
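The LP value can be cross-checked without a solver: constraints (31) and (35) give \(p_m \le \frac{OPT}{2}\), while (32), (38) and (39) give \(p_{2m} + p_{(2m+1)} \le \frac{2}{3}OPT\), so the objective (30) is at most \(\frac{7}{6}OPT\). The following sketch (exact arithmetic, our own variable names) verifies a feasible point attaining this value:

```python
# Exact check of a feasible point of model (30)-(40) attaining the LP value
# 7/6 with OPT = 1 (names p_m1 = p_{m-1}, p_mp = p_{m+1}, p_2m1 = p_{2m-1},
# p_2mp = p_{2m+1} are ours).
from fractions import Fraction as F

OPT = 1
p1 = p_m1 = p_m = F(1, 2)
p_mp = p_2m1 = p_2m = p_2mp = F(1, 3)

assert p_m1 + p_m <= OPT                                          # (31)
assert p_2m1 + p_2m + p_2mp <= OPT                                # (32)
assert p_2mp - (p1 - p_m) >= 0                                    # (33)
assert p1 >= p_m1 >= p_m >= p_mp >= p_2m1 >= p_2m >= p_2mp >= 0   # (34)-(40)
z = p_2mp + p_m + p_2m                                            # objective (30)
assert z == F(7, 6)
```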

Consider now the case where the makespan of \(LPT^\prime \) schedule is given by one of the machines \(M_2, \dots , M_m\). In such a case, a trivial upper bound on \(LPT^\prime \) makespan is equal to \(p_1 + p_{m+1}\). We state the following proposition.

Proposition 12

If \(p_{(2m+1)} \ge p_{1} - p_{m}\) and the makespan of \(LPT^\prime \) is not on \(M_1\), then coupling LPT with \(LPT^\prime \) gives a performance guarantee not superior to \(\frac{15}{13}\) for \(m=3\) and \(\frac{4}{3} - \frac{1}{2m-1}\) for \(m \ge 4\).

Proof

We again consider an LP formulation with nonnegative variables \(p_j\) (\(j=1,\dots ,n\)), a positive parameter \(OPT > 0\) and two nonnegative auxiliary variables \(\alpha , y\). We can evaluate the worst-case performance of \(LPT + LPT^\prime \) by the following LP model

$$\begin{aligned} \text {maximize}\quad&y \end{aligned}$$
(41)
$$\begin{aligned} \text {subject to}\quad&\sum \limits _{j=1}^{2m + 1} p_j \le mOPT \end{aligned}$$
(42)
$$\begin{aligned}&p_{(2m-1)} + p_{2m} + p_{(2m+1)} \le OPT \end{aligned}$$
(43)
$$\begin{aligned}&p_{(j + 1)} - p_j \le 0 \quad j=1, \dots , 2m; \end{aligned}$$
(44)
$$\begin{aligned}&p_{j} + p_{(2m - j + 1)} - \alpha \ge 0 \quad j=1, \dots , m; \end{aligned}$$
(45)
$$\begin{aligned}&p_{(2m+1)} + \alpha - y \ge 0 \end{aligned}$$
(46)
$$\begin{aligned}&p_{(2m+1)} - (p_1 - p_m) \ge 0 \end{aligned}$$
(47)
$$\begin{aligned}&p_{1} + p_{(m + 1)} - y \ge 0 \end{aligned}$$
(48)
$$\begin{aligned}&p_j \ge 0 \quad j=1, \dots , 2m + 1; \end{aligned}$$
(49)
$$\begin{aligned}&\alpha , y \ge 0 \end{aligned}$$
(50)

where y represents the solution value reached by \(LPT + LPT^\prime \) and \(\alpha \) is the starting time of job \(2m + 1\) in LPT. Constraints (45) indicate that \(\alpha \) corresponds to the smallest load on a machine after processing jobs \(1,\dots ,2m\). The solution value of LPT is therefore the sum \(\alpha + p_{(2m + 1)}\). The objective function (41) provides an upper bound on the makespan of \(LPT + LPT^\prime \) since it maximizes the minimum between \(\alpha + p_{(2m + 1)}\) and the makespan reached by \(LPT^\prime \) through variable y and related constraints (46) and (48). Constraints (42) and (43) state that \(C_{m}^*({\mathcal {J}})\) is lower bounded by \(\frac{\sum \limits _{j=1}^{n} p_j}{m}\) and by \(p_{(2m - 1)} + p_{2m} + p_{(2m + 1)}\). Constraints (44) indicate that jobs are sorted by non-increasing processing time, while constraint (47) represents condition \(p_{(2m+1)} \ge p_{1} - p_{m}\). Finally, constraints (49) and (50) indicate that all variables are nonnegative.
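As a cross-check, the model can also be assembled and solved directly. The sketch below is a verification aid of ours, not part of the paper's proof; the solver choice (scipy.optimize.linprog with the HiGHS backend) is an assumption. It builds constraints (42)–(48) for a given m with \(OPT = 1\) and maximizes y:

```python
from scipy.optimize import linprog

def worst_case_lp(m, opt=1.0):
    """Assemble and solve model (41)-(50); return the optimal value of y."""
    n = 2 * m + 1                      # relevant jobs p_1 .. p_{2m+1}
    nv = n + 2                         # variables: p_1..p_n, alpha, y
    A, b = [], []
    def le(pairs, rhs=0.0):            # add one row "sum coef * x <= rhs"
        r = [0.0] * nv
        for idx, coef in pairs:
            r[idx] += coef
        A.append(r)
        b.append(rhs)
    P, ALPHA, Y = lambda j: j - 1, n, n + 1        # 1-based job indexing
    le([(P(j), 1.0) for j in range(1, n + 1)], m * opt)               # (42)
    le([(P(2*m - 1), 1.0), (P(2*m), 1.0), (P(2*m + 1), 1.0)], opt)    # (43)
    for j in range(1, 2*m + 1):                                       # (44)
        le([(P(j + 1), 1.0), (P(j), -1.0)])
    for j in range(1, m + 1):                                         # (45)
        le([(ALPHA, 1.0), (P(j), -1.0), (P(2*m - j + 1), -1.0)])
    le([(Y, 1.0), (P(2*m + 1), -1.0), (ALPHA, -1.0)])                 # (46)
    le([(P(1), 1.0), (P(m), -1.0), (P(2*m + 1), -1.0)])               # (47)
    le([(Y, 1.0), (P(1), -1.0), (P(m + 1), -1.0)])                    # (48)
    c = [0.0] * nv
    c[Y] = -1.0                        # maximize y via minimizing -y
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * nv,
                  method="highs")
    return -res.fun
```

For \(m=3\) the solver should return \(\frac{15}{13}\) and for \(m \ge 4\) the value \(\frac{8m - 7}{3(2m - 1)}\), matching the values derived in this section.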

By setting w.l.o.g. \(OPT = 1\), a feasible solution of model (41)–(50) for any value of m is:

$$\begin{aligned}&y = \frac{8m - 7}{3(2m - 1)}; ~ \alpha = \frac{2(m - 1)}{2m - 1};\\&p_1 = \frac{5m - 4}{3(2m - 1)}; ~ p_2 = p_3 = \cdots = p_{(m-1)} = \frac{4m - 5}{3(2m - 1)}; \\&p_m = p_{m + 1} = \frac{m - 1}{2m - 1}; \\&p_{m + 2} = p_{m + 3} = \cdots = p_{2m + 1} = \frac{1}{3}. \end{aligned}$$

We can show by strong duality that this solution is optimal for any \(m \ge 4\). The dual model with variables \(\lambda _i~(i= 1,\dots , 3m + 5)\) associated with constraints (42)–(48) is as follows:

$$\begin{aligned} \text {minimize}\quad&m\lambda _1 + \lambda _2 \end{aligned}$$
(51)
$$\begin{aligned} \text {subject to}\quad&\lambda _1 - \lambda _3 + \lambda _{(2m + 3)} - \lambda _{(3m + 4)} + \lambda _{(3m + 5)} \ge 0 \end{aligned}$$
(52)
$$\begin{aligned}&\lambda _1 + \lambda _{(1 + j)} - \lambda _{(2 + j)} + \lambda _{(2m + 2 + j)} \ge 0 \quad j=2, \dots , m - 1 \end{aligned}$$
(53)
$$\begin{aligned}&\lambda _1 + \lambda _{(m + 1)} - \lambda _{(m +2)} + \lambda _{(3m + 2)} +\lambda _{(3m + 4)} \ge 0 \end{aligned}$$
(54)
$$\begin{aligned}&\lambda _1 + \lambda _{(m + 2)} - \lambda _{(m + 3)} + \lambda _{(3m + 2)} +\lambda _{(3m + 5)} \ge 0 \end{aligned}$$
(55)
$$\begin{aligned}&\lambda _1 + \lambda _{(1 + j)} - \lambda _{(2 + j)} + \lambda _{(4m + 3 - j)} \ge 0 \quad j=m+2, \dots , 2m - 2 \end{aligned}$$
(56)
$$\begin{aligned}&\lambda _1 + \lambda _{2} + \lambda _{2m} - \lambda _{(2m + 1)} + \lambda _{(2m + 4)} \ge 0 \end{aligned}$$
(57)
$$\begin{aligned}&\lambda _1 + \lambda _{2} + \lambda _{(2m+1)} - \lambda _{(2m+2)} + \lambda _{(2m + 3)} \ge 0 \end{aligned}$$
(58)
$$\begin{aligned}&\lambda _1 + \lambda _{2} + \lambda _{(2m+2)} + \lambda _{(3m + 3)} + \lambda _{(3m + 4)} \ge 0 \end{aligned}$$
(59)
$$\begin{aligned}&-\sum \limits _{j=(2m + 3)}^{(3m + 2)}\lambda _j + \lambda _{(3m + 3)} \ge 0 \end{aligned}$$
(60)
$$\begin{aligned}&- \lambda _{(3m + 3)} - \lambda _{(3m + 5)} \ge 1 \end{aligned}$$
(61)
$$\begin{aligned}&\lambda _1, \lambda _2, \dots , \lambda _{(2m + 2)} \ge 0 \end{aligned}$$
(62)
$$\begin{aligned}&\lambda _{(2m + 3)} , \lambda _{(2m + 4)}, \dots \lambda _{(3m + 5)} \le 0 \end{aligned}$$
(63)

Constraints (52)–(61) correspond to primal variables \(p_j, \alpha , y\), respectively. A feasible solution of model (51)–(63) for \(m \ge 4\) is:

$$\begin{aligned}&\lambda _1 = \frac{2}{2m - 1}; ~ \lambda _2 = \frac{2m - 7}{3(2m - 1)}; \\&\lambda _3 = \lambda _4 = \dots = \lambda _{(m+1)} = 0; \\&\lambda _{(m+2)} = \frac{1}{2m - 1}; ~ \lambda _{(m+3)} = \lambda _{(m+4)} = \dots = \lambda _{2m} = 0; \\&\lambda _{(2m + 1)} = \frac{2m - 7}{3(2m - 1)}; \\&\lambda _{(2m + 2)} = \frac{4(m - 2)}{3(2m - 1)}; ~ \lambda _{(2m+3)} = 0; \\&\lambda _{(2m+4)} = \lambda _{(2m+5)} = \dots = \lambda _{(3m + 1)} = \frac{-2}{2m - 1}; \\&\lambda _{(3m + 2)} = \frac{-1}{2m - 1}; ~ \lambda _{(3m + 3)} = \frac{3 - 2m}{2m - 1}; \\&\lambda _{(3m + 4)} = 0; ~ \lambda _{(3m + 5)} = \frac{-2}{2m - 1}. \end{aligned}$$

The corresponding solution value is \(m\lambda _1 + \lambda _2 = \frac{8m - 7}{3(2m - 1)} = y\). Hence, for \(m \ge 4\) we have:

$$\begin{aligned} \frac{\min \{C_{m}^{LPT}({\mathcal {J}}),C_{m}^{LPT^\prime } ({\mathcal {J}})\}}{C_{m}^*({\mathcal {J}})}\le & {} \frac{y}{OPT} = \frac{8m - 7}{3(2m - 1)} \\= & {} \frac{4(2m - 1) - 3}{3(2m - 1)} \\= & {} \frac{4}{3} - \frac{1}{2m-1} \end{aligned}$$

For \(m=3\), an optimal solution of model (41)–(50) has value \(y = 1.15385\ldots =\frac{15}{13}\).

The corresponding approximation ratio is then \(\frac{y}{OPT} = \frac{15}{13}\). \(\square \)

Case 2: \(p_{(2m+1)} < p_{1} - p_{m}\)

The processing time of the last job is smaller than the difference \(p_{1} - p_{m}\). The following proposition holds.

Proposition 13

If \(p_{(2m+1)} < p_{1} - p_{m}\), the solution given by LPT has a performance guarantee not superior to \(\frac{15}{13}\) for \(m=3\) and \(\frac{4}{3} - \frac{1}{2m-1}\) for \(m \ge 4\).

Proof

We consider the worst-case LPT performance and notice that it can be evaluated through model (41)–(50) by simply reversing the inequality sign in constraint (47) and disregarding constraint (48). Correspondingly, dual model (51)–(63) still applies, with the differences that variable \(\lambda _{(3m + 5)}\) is discarded and that we have \(\lambda _{(3m + 4)} \ge 0\). The primal solutions turn out to be the same as those stated in Proposition 12. Likewise, the dual solutions change only in the following entries, which do not contribute to the objective function: \(\lambda _{(3m + 2)} = \frac{-3}{2m - 1}; \lambda _{(3m + 3)} = -1; \lambda _{(3m + 4)} = \frac{2}{2m - 1}\). Therefore, the approximation ratios stated in Proposition 12 hold. \(\square \)

3.2.3 The resulting approximation ratio of LPT-REV

We can now state the following theorem for LPT-REV.

Theorem 2

Algorithm LPT-REV runs in \({\mathcal {O}}(n \log n)\) time and has an approximation ratio not superior to \(\frac{4}{3} - \frac{1}{3(m-1)}\) for \(m \ge 3\).

Proof

Putting together the results of Propositions 11, 12 and 13, it immediately follows that LPT-REV has an approximation ratio not superior to \(\frac{4}{3} - \frac{1}{3(m-1)}\) for \(m \ge 3\). Besides, it is well known that the running time of the LPT heuristic is in \({\mathcal {O}}(n \log n)\): sorting the jobs by processing time has complexity \({\mathcal {O}}(n \log n)\), while an efficient implementation of the underlying LS procedure may employ a Fibonacci heap for extracting the machine with smallest load at each iteration, with overall complexity \({\mathcal {O}}(n \log m)\). Since the proposed algorithm requires running LPT first (to compute \(z_1\)) and then the LS heuristic twice (to compute \(z_2\) and \(z_3\)) after the job sorting, the resulting time complexity is \({\mathcal {O}}(n \log n)\). \(\square \)
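For concreteness, the LS step with heap-based machine selection can be sketched as follows. Python's binary heap (heapq) is used here in place of a Fibonacci heap, which preserves the \({\mathcal {O}}(n \log m)\) bound for the assignment phase:

```python
import heapq

def lpt_makespan(p, m):
    """LPT: sort jobs non-increasingly, then assign each job to the
    machine with the smallest current load, kept in a min-heap."""
    loads = [0] * m
    heapq.heapify(loads)                   # O(m)
    for pj in sorted(p, reverse=True):     # O(n log n) sort dominates
        least = heapq.heappop(loads)       # least-loaded machine, O(log m)
        heapq.heappush(loads, least + pj)  # O(log m)
    return max(loads)
```

For instance, on the 8-job instance \(p = (5,5,4,4,3,3,3,3)\) with \(m = 3\), LPT yields makespan 11, while the optimum is 10 (e.g., \(\{5,5\}, \{4,3,3\}, \{4,3,3\}\)).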

We notice that algorithm LPT-REV has a worst-case performance at least as good as the one of the more computationally demanding Largest Differencing Method of Karmarkar and Karp (1982), which runs in \({\mathcal {O}}(n \log n + n m \log m)\) and has an approximation ratio proved to be in the interval \(\left[ \frac{4}{3} - \frac{1}{3(m-1)}, \frac{4}{3} - \frac{1}{3m}\right] \) for \(m \ge 3\) (Michiels et al. 2007).

3.2.4 LPT-REV performance analysis for \(P_2 || C_{\max }\)

We remark that in the problem variant with two machines \((m=2)\), the approximation ratio of \(\frac{4}{3} - \frac{1}{3(m-1)} = 1\) cannot be reached by any polynomial-time algorithm unless \({\mathcal {P}} = \mathcal {NP}\), since \(P_2 || C_{\max }\) is well known to be NP-hard. At the current state of the art, the approach proposed by He et al. (2000) for \(P_2 || C_{\max }\) is the best approximation algorithm running in linear time and has an approximation bound of \(\frac{12}{11}\). Although algorithm LPT-REV does not improve this bound, the following analysis shows that the proposed approach reaches an approximation ratio not superior to \(\frac{9}{8}\) and thus improves upon the bound of \(\frac{7}{6}\) implied for LPT [as well as for the Karmarkar–Karp algorithm, as shown by Fischetti and Martello (1987)] when \(m=2\).

We proceed by assuming that the critical job in the LPT solution is the last one \((j^\prime = n)\). Otherwise, we would get an approximation ratio of \(\frac{14}{13} < \frac{9}{8}\) by using the same reasoning employed in the proof of Proposition 9, taking into account in this case both \(LPT^{\prime }\) and \(LPT^{\prime \prime }\). Given Lemma 1 and bound (6), an approximation bound worse than \(\frac{9}{8}\) could be reached only in instances with \(n \le 6\) and where LPT schedules no more than three jobs on each machine.

For \(n \le 4\), LPT will output an optimal solution according to Proposition 7. For \(n = 6\), we can evaluate the worst-case LPT performance by model (64)–(70) (in “Appendix”) since \(n=3m\). The corresponding optimal objective value for \(m=2\) is \(opt = 0.8888\ldots =\frac{8}{9}\), which gives an approximation ratio equal to \(\frac{1}{opt} = \frac{9}{8}\). For \(n = 5\), we have the following proposition.

Proposition 14

In \(P_2 || C_{\max }\) instances with \(n = 5\), LPT-REV provides an optimal solution.

Proof

According to Proposition 8, LPT could not give the optimal makespan only if it assigns jobs 1 and 4 to \(M_1\) and jobs 2 and 3 to \(M_2\) before assigning the critical job 5. Considering this LPT schedule, we analyze all possible optimal solution configurations and show that LPT-REV always identifies them. We have the following two cases:

  • Case 1: jobs 1 and 2 are on the same machine in an optimal solution.

    There exists an optimal solution which assigns jobs 3, 4, 5 to \(M_1\) and jobs 1, 2 to \(M_2\). In fact, no other schedule can provide a smaller makespan. The same solution is also provided by \(LPT^{\prime \prime }\).

  • Case 2: jobs 1 and 2 are on different machines in an optimal solution.

    If there exists an optimal solution with job 1 on \(M_1\) and jobs 2 and 3 on \(M_2\), such a solution coincides with the LPT schedule. If instead an optimal solution assigns jobs 1 and 3 to \(M_1\), then it must assign jobs 2, 4 and 5 to \(M_2\). If \(p_3 = p_4\), clearly LPT also gives the optimal makespan. If \(p_3 > p_4\), inequality \(p_1 \le p_2 + p_5\) must hold, or else the optimal solution could be improved by exchanging job 3 with job 4 on the machines, giving a contradiction. But then, inequality \(p_1 \le p_2 + p_5\) implies that the \(LPT^\prime \) solution is also optimal, since it assigns jobs 5, 2 and 4 to \(M_1\) and jobs 1 and 3 to \(M_2\).

\(\square \)

4 From approximation to heuristics: a new LPT-based approach

Moving from approximation algorithms to heuristics, we decided to check whether LPT-REV could lead to a heuristic procedure competitive with the relevant literature, with emphasis on algorithms running with very low computational complexity. To this end, we considered the benchmark literature instances provided by Iori and Martello (2008). All tests were conducted on an Intel i5 CPU @ 2.30 GHz with 8 GB of RAM, and all algorithms were implemented in C++.

Table 1 SLACK versus LPT/COMBINE/LDM: performance comparison on \(P_m || C_{\max }\) instances from Iori and Martello (2008)
Table 2 SLACK + NS versus LDM: performance comparison on \(P_m || C_{\max }\) instances from Iori and Martello (2008)

Iori and Martello (2008) considered two classical classes of instances from the literature: uniform instances proposed by França et al. (1994) and non-uniform instances proposed by Frangioni et al. (2004). In uniform instances, the processing times are integer uniformly distributed in the range \([a, b]\). In non-uniform instances, \(98\%\) of the processing times are integer uniformly distributed in \([0.9(b-a), b]\), while the remaining ones are uniformly distributed in \([a, 0.2(b-a)]\). For both classes, we have \(a = 1; b = 100, 1000, 10000\). For each class, the following values were considered for the number of machines and jobs: \(m = 5, 10, 25\) and \(n = 10, 50, 100, 500, 1000\). For each pair (mn) with \(m < n\), 10 instances were generated for a total of 780 instances.
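For reference, the two instance classes can be generated as follows. This is a sketch based on the description above; the rounding of the non-uniform range endpoints and the per-job (rather than exact-count) 98% split are our assumptions:

```python
import random

def uniform_instance(n, a, b, rng):
    """Processing times integer uniform in [a, b]."""
    return [rng.randint(a, b) for _ in range(n)]

def non_uniform_instance(n, a, b, rng):
    """98% of jobs in [0.9(b-a), b], the remaining ones in [a, 0.2(b-a)].
    Rounding of the fractional endpoints is an assumption."""
    low_hi = round(0.2 * (b - a))
    high_lo = round(0.9 * (b - a))
    return [rng.randint(high_lo, b) if rng.random() < 0.98
            else rng.randint(a, low_hi)
            for _ in range(n)]
```

With \(a = 1, b = 100\), the non-uniform class mixes many long jobs in [89, 100] with a few short jobs in [1, 20].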

Preliminary testing showed that LPT-REV is not significantly superior to LPT, improving upon it in only 142 of the 780 instances. Besides, it can be noted that in our theoretical results, \(LPT^\prime \) was necessary to improve Graham’s bound for \(m\ge 3\), while \(LPT^{\prime \prime }\) was necessary for \(m = 2\) only. Remarkably, for \(m \ge 3\) the relevant subcase was the one with \(2m + 1\) jobs and \(p_{2m+1} \ge p_1-p_m\), where \(LPT^\prime \) schedules job \(2m+1\) first and then applies list scheduling first to the sorted job subset \(\{1,\ldots ,m\}\) and then to the sorted job subset \(\{m+1,\ldots ,2m\}\). This suggests a general greedy approach that considers not only the ordering of the jobs but also the differences in processing time within job subsets of size m. We propose a constructive procedure that splits the sorted job set into tuples of m consecutive jobs (\(1,\dots ,m; m+1,\dots ,2m;\) etc.) and sorts the tuples in non-increasing order of the difference between the largest and the smallest job in the tuple. Then, list scheduling is applied to the set of sorted tuples. We denote this approach as SLACK.

(Figure: pseudocode of algorithm SLACK)

In terms of computational complexity, since construction and sorting of the tuples can be performed in \({\mathcal {O}}(\Bigl \lceil \frac{n}{m} \Bigr \rceil \log \Bigl \lceil \frac{n}{m} \Bigr \rceil )\), the running time of SLACK is \({\mathcal {O}}(n \log n)\) due to the initial sorting of the jobs.
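A compact sketch of SLACK follows. The handling of a final tuple with fewer than m jobs is our assumption: its slack is computed over the jobs it actually contains.

```python
import heapq

def slack_makespan(p, m):
    """SLACK: sort jobs non-increasingly, split into tuples of m consecutive
    jobs, sort tuples by non-increasing (largest - smallest), then apply
    list scheduling to the concatenated tuple order."""
    jobs = sorted(p, reverse=True)
    tuples = [jobs[i:i + m] for i in range(0, len(jobs), m)]
    # within a tuple the jobs are sorted, so t[0] - t[-1] is its slack;
    # Python's sort is stable, hence ties keep the original tuple order
    tuples.sort(key=lambda t: t[0] - t[-1], reverse=True)
    loads = [0] * m
    heapq.heapify(loads)
    for t in tuples:
        for pj in t:                      # list scheduling on the new order
            heapq.heappush(loads, heapq.heappop(loads) + pj)
    return max(loads)
```

For example, on \(p = (8,8,7,3,2,2)\) with \(m = 2\), the tuple (7, 3) has the largest slack and is scheduled first; SLACK reaches makespan 15, which matches the average-load lower bound \(\frac{30}{2}\) and is therefore optimal here.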

To get a clear picture of the performance of SLACK, we compared it to LPT, to LDM of Karmarkar and Karp (1982) and to COMBINE, the algorithm proposed by Lee and Massey (1988) that couples LPT with the MULTIFIT heuristic introduced by Coffman Jr. et al. (1978). Notice that with an increase in the computational effort, both COMBINE and LDM generally exhibit better performances than LPT (see, e.g., Lee and Massey 1988; Michiels et al. 2007) but still guarantee a very limited computational complexity.

The comparison is executed by counting how many times SLACK wins (W), is equivalent to (E), or loses (L) when compared to the relevant competitor. The results are reported in Table 1 where instances are aggregated by processing time range and number of machines as in Iori and Martello (2008). Running times of the heuristics are generally negligible; hence, we just report an aggregated entry summing up the CPU time over all instances.

The SLACK algorithm strongly outperforms the LPT rule in each instance category, with the most impressive performance difference on non-uniform instances. Overall, on the 780 benchmark literature instances, SLACK wins 513 times (65.8% of the cases) against LPT, ties 224 times (28.7%) and loses only 43 times (5.5%). Given these results, the SLACK heuristic can be regarded as a valuable alternative to the popular LPT rule.

The comparison between SLACK and COMBINE provides additional evidence of the effectiveness of SLACK. The proposed heuristic outperforms COMBINE on non-uniform instances and compares favorably to it on uniform instances. Finally, SLACK turns out to be competitive with LDM on non-uniform instances, while it is slightly inferior on uniform instances, though yielding a makespan no worse than LDM’s in 70% of the instances. Notice that on the uniform instances, the smaller the processing time range, the better the performance of SLACK. A possible explanation of this phenomenon is that, with wider processing time ranges, the slack between the largest and smallest processing times of each tuple does not fully capture the variance of the job processing times, which may negatively affect the performance of the algorithm. Besides, SLACK runs much faster than LDM, taking roughly 2% of the computational time required by that algorithm. Taking into account this last aspect, we performed a further test where we evaluated the performance of SLACK enhanced by a simple neighborhood search (NS) procedure. This procedure, called SLACK + NS, with an overall negligible increase in CPU time (about 57 ms for the whole batch of instances), looks for the best swap \(SW_{ij}\) between any job i on the critical machine and any job j on any other machine that improves upon the makespan provided by SLACK. If an improvement is found, NS restarts and iterates until a local minimum is reached. The relevant results are provided in Table 2 and indicate that the performance of SLACK + NS is already globally superior to that of LDM (though still inferior on uniform instances), while the overall CPU time remains smaller by more than an order of magnitude.
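The NS procedure can be sketched as follows; this is our reading of the description, and the best-improvement policy with restart is an assumption:

```python
def neighborhood_search(assign, p, m):
    """Best-improvement swaps between the critical machine and the others.

    assign[j] is the machine index of job j.  Repeats until no swap of a job
    on the critical (makespan-defining) machine with a job on another
    machine reduces the makespan; each swap strictly improves, so the loop
    terminates.  Returns the final makespan."""
    loads = [0] * m
    for j, mach in enumerate(assign):
        loads[mach] += p[j]
    while True:
        crit = max(range(m), key=lambda k: loads[k])   # critical machine
        best, best_pair = max(loads), None
        for i in range(len(p)):
            if assign[i] != crit:
                continue
            for j in range(len(p)):
                if assign[j] == crit:
                    continue
                li = loads[crit] - p[i] + p[j]         # new critical load
                lj = loads[assign[j]] - p[j] + p[i]    # new load of the other
                rest = [loads[k] for k in range(m)
                        if k != crit and k != assign[j]]
                new_mk = max([li, lj] + rest)
                if new_mk < best:
                    best, best_pair = new_mk, (i, j)
        if best_pair is None:                          # local minimum
            return max(loads)
        i, j = best_pair
        other = assign[j]
        loads[crit] += p[j] - p[i]
        loads[other] += p[i] - p[j]
        assign[i], assign[j] = other, crit
```

For instance, starting from the LPT schedule of \(p = (3,3,2,2,2)\) on two machines (makespan 7), one swap of a job of length 3 with a job of length 2 yields the optimal makespan 6.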

5 Concluding remarks

We provided new insights into the well-known LPT rule for the \(P_m || C_{\max }\) problem and proposed a slight algorithmic revisiting which improves previously published approximation ratios for LPT. As a second major contribution, starting from our approximation results we came up with a simple heuristic with very low computational complexity which strongly outperforms LPT on a large set of benchmark literature instances and is competitive with more involved approaches such as COMBINE and LDM.

In our analysis of \(P_m || C_{\max }\), we deployed a novel approach which relies on linear programming. The proposed LP reasoning could be considered a valid alternative to techniques based on analytical derivation and may as well find application in other combinatorial optimization problems. For example, an attempt in this direction has been recently proposed by Della Croce et al. (2018) for a multiperiod variant of the knapsack problem.

We remark that in this work we did not derive tight approximation bounds for the LPT-REV algorithm. We reasonably expect that improved bounds can be stated and leave this issue to future research. Nonetheless, we identified \(P_m || C_{\max }\) instances for \(m \ge 3\) which provide a lower bound on the worst-case performance ratio of LPT-REV equal to \(\frac{4}{3} - \frac{7}{3(3m + 1)}\). These instances have \(2m+2\) jobs with processing times:

$$\begin{aligned}&p_j = 2m - \left\lfloor \frac{j + 1}{2} \right\rfloor , \quad 1 \le j \le 2m - 2; \\&p_j = m, \quad 2m - 1 \le j \le 2m + 2 \end{aligned}$$

It is easy to check that \(C_{m}^{ \textit{LPT-REV}} = 4m - 1\) and \(C_{m}^{*} = 3m + 1\) and that such values give the above performance ratio.
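These values are easy to check computationally. The sketch below (a verification aid of ours) rebuilds the family and verifies that the total load equals \(m(3m+1)\), so that \(C_{m}^{*} \ge 3m + 1\) by the average-load bound, and that plain LPT, which upper-bounds the LPT-REV value, yields exactly \(4m - 1\):

```python
import heapq

def family_instance(m):
    """Lower-bound family: p_j = 2m - floor((j+1)/2) for j = 1..2m-2,
    followed by four jobs of length m (2m+2 jobs in total)."""
    p = [2 * m - (j + 1) // 2 for j in range(1, 2 * m - 1)]
    return p + [m] * 4

def lpt_makespan(p, m):
    """Plain LPT with a min-heap of machine loads."""
    loads = [0] * m
    heapq.heapify(loads)
    for pj in sorted(p, reverse=True):
        heapq.heappush(loads, heapq.heappop(loads) + pj)
    return max(loads)
```

For \(m = 3\) the instance is \((5,5,4,4,3,3,3,3)\): LPT yields 15... no — it yields \(4m - 1 = 11\), while \(C_{m}^{*} = 10 = 3m + 1\) via the partition \(\{5,5\}, \{4,3,3\}, \{4,3,3\}\).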