1 Introduction

The single machine model studied in this paper combines two popular features in scheduling theory: (1) Generalized due-dates (gdd), and (2) Position-dependent job processing times. In a gdd setting, the due-dates are not job-specific, but the input contains a sorted set of numbers, such that the \(j\)-th number is the due-date assigned to the job processed in the \(j\)-th position. When position-dependent processing times are considered, the processing time of a given job varies in the most general way as a function of its position in the sequence. The objective function considered is minimum number of tardy jobs.

Hall [1] introduced the concept of generalized due-dates, and Hall et al. [2] provided the first set of complexity results for a number of problems considering classical scheduling measures. Some of the recently published papers dealing with various extensions of these gdd models are: Gerstl and Mosheiov [3], Choi and Park [4], Park et al. [5], Choi et al. [6], Gerstl and Mosheiov [7], Choi et al. [8], Li and Chen [9], Park et al. [10], Mor et al. [11], Choi et al. [12] and Mosheiov et al. [13, 14]. These papers contain numerous applications of gdd scheduling. An interesting example from the petrochemical industry was described in the original paper of Hall [1]. In this setting, a number of interchangeable heat exchangers must be maintained up to a certain date. The identity of the heat exchangers is immaterial, implying that the problem can be formulated as a gdd scheduling problem.

Scheduling with position-dependent job processing times is also a wide area that has been studied by many researchers over the last three decades. Most researchers considered either learning effects (see, e.g., Biskup [15, 16]) or job deterioration (see, e.g., Yin et al. [17] and Huang and Wang [18]). These two settings assume monotonicity: under a learning effect, the job processing times are non-increasing as a function of their position, whereas under deterioration, the job processing times are assumed to be non-decreasing as a function of the job position. In this paper we assume position-dependent processing times in the most general way, i.e., they do not follow any given function, and in particular no monotonicity is assumed. This general form is justified, e.g., in settings combining the following two effects: (1) a learning process (of the scheduler/producer) is in effect, so that processing times decrease for jobs performed later in the sequence, and (2) the entire system deteriorates, leading to larger processing times for jobs processed later. In such systems, the impact of the job position on its processing time may change, and in particular the resulting processing-time sequence is not necessarily monotone. Some examples of studies of scheduling with general position-dependent job processing times are: Mosheiov [19], Gerstl and Mosheiov [20], Yu et al. [21], Agnetis and Mosheiov [22], Gerstl et al. [23], Pei et al. [24], Fiszman and Mosheiov [25], Kovalyov et al. [26], Yang and Lu [27], Mosheiov et al. [28], Montoya-Torres et al. [29], and Przybylski [30].

As mentioned, the scheduling measure considered here is minimum number of tardy jobs. The classical single-machine problem of minimizing the number of tardy jobs is solved in \(O(n\log n)\) time (where \(n\) is the number of jobs); see Moore [31]. Numerous extensions of this problem have been studied since. Some prominent examples are: Ho and Chang [32], Lann and Mosheiov [33], Lodree et al. [34], Mosheiov and Sidney [35], Mosheiov and Oron [36], Adamu and Adewumi [37], Allahverdi et al. [38], Aydilek et al. [39], Mor et al. [40], He et al. [41] and Hermelin et al. [42].

Our paper studies for the first time a setting combining all these features: (1) Generalized due-dates, (2) Position-dependent job processing times, and (3) The objective of minimum number of tardy jobs. The single-machine problem combining features (1) and (3), i.e., minimizing the number of tardy jobs with generalized due-dates, is easily shown to be solvable in polynomial time as well. Specifically, it is solved by the Shortest Processing Time first (SPT) policy, i.e., in \(O(n\log n)\) time. On the other hand, the complexity status of the single-machine problem combining all three features was unknown. In this paper we prove that the problem is strongly NP-hard. Consequently, we first introduce an efficient heuristic of a greedy nature. Then, an exact solution algorithm is introduced; this algorithm finds the optimal schedule significantly faster than a standard full-enumeration procedure. A numerical study is performed in order to measure (1) the running time required by the (exact) algorithm, and (2) the percentage of instances solved to optimality by the heuristic.

The paper is organized as follows: Sect. 2 contains the notation and the formulation. Section 3 presents the NP-hardness proof. In Sect. 4 we introduce the exact solution algorithm and the heuristic. The results of our numerical tests are reported in Sect. 5. Conclusions and some ideas for future research are provided in the last section.

2 Formulation

We study a single machine \(n\)-job scheduling problem. \({p}_{jr}\) denotes the processing time of job \(j\) if assigned to position \(r; j,r=1,\dots ,n\). \({d}_{r}\) denotes the \(r\)-th generalized due-date, i.e., the due-date of the job assigned to position \(r\), \(r=1,\dots ,n\).

For a given schedule of the jobs, \({C}_{r}\) denotes the completion time of the job in position \(r\), \(r=1,\dots ,n\). The tardiness of the job in position \(r\) is \({T}_{r}=\text{max}\left\{{C}_{r}-{d}_{r},0\right\}\), \(r=1,\dots ,n\). \({U}_{r}\) is the tardiness indicator: \({U}_{r}=1\) if \({T}_{r}>0\), and \({U}_{r}=0\) otherwise (\({T}_{r}=0\)). The objective function is the number of tardy jobs, \({\sum }_{r=1}^{n}{U}_{r}\), to be minimized. Hence, the problem studied here is:

$$1 \left| { p_{jr} , gdd } \right|\sum U_{r}$$
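For concreteness, the objective can be computed as follows. This is a hypothetical Python helper (not part of the paper's C implementation); jobs, positions, and due-dates are 0-indexed, with `p[j][r]` the processing time of job `j` in position `r`:

```python
def num_tardy(seq, p, d):
    """Return sum of U_r for the schedule in which seq[r] is the job
    assigned to position r, given processing-time matrix p[j][r] and
    generalized due-dates d[r]."""
    completion = 0
    tardy = 0
    for r, j in enumerate(seq):
        completion += p[j][r]      # C_r: completion time of position r
        if completion > d[r]:      # U_r = 1 iff C_r > d_r
            tardy += 1
    return tardy
```

For example, with `p = [[2, 3], [1, 4]]` and `d = [1, 5]`, the sequence `[1, 0]` has no tardy jobs, while `[0, 1]` has two.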

3 NP-hardness proof

In this section we study the complexity status of the problem.

Theorem 1

Problem \(1 \left| {p}_{jr}, gdd \right|\sum {U}_{r}\) is strongly NP-hard.

Proof

The proof is by reduction from 3-Partition.

3-Partition: Consider a set \(A\) of positive rational numbers, \(A=\left\{{a}_{1},{a}_{2},\dots ,{a}_{3t}\right\}\), where \(\sum_{i=1}^{3t}{a}_{i}=t\) and \(\frac{1}{4}<{a}_{i}<\frac{1}{2}\), \(i=1,2,\dots ,3t\). Can the set \(A\) be partitioned into \(t\) disjoint triplets, \({A}_{1}, {A}_{2},\dots ,{A}_{t}\), such that \(\sum_{i\in {A}_{j}}{a}_{i}=1\), \(j=1,2,\dots ,t\)?

We construct the following instance of the scheduling problem from 3-partition:

There are \(n=3t\) jobs. The position-dependent processing time of job \(j\) if assigned to position \(r\) (\({p}_{jr}, j,r=1,\dots ,n\)) is the following:

$$\begin{gathered} p_{j1} = p_{j2} = p_{j3} = a_{j} , \hfill \\ p_{j4} = p_{j5} = p_{j6} = 2a_{j} , \hfill \\ p_{j7} = p_{j8} = p_{j9} = 3a_{j} , \hfill \\ \ldots \hfill \\ p_{{j\left( {3k - 2} \right)}} = p_{{j\left( {3k - 1} \right)}} = p_{j3k} = ka_{j} , \hfill \\ \ldots \hfill \\ p_{{j\left( {3t - 2} \right)}} = p_{{j\left( {3t - 1} \right)}} = p_{j3t} = ta_{j} . \hfill \\ \end{gathered}$$

Let the generalized due-dates be:

$$\begin{gathered} d_{1} = d_{2} = d_{3} = 1, \hfill \\ d_{4} = d_{5} = d_{6} = 3, \hfill \\ d_{7} = d_{8} = d_{9} = 6, \hfill \\ \cdots \hfill \\ d_{3k - 2} = d_{3k - 1} = d_{3k} = \mathop \sum \limits_{i = 1}^{k} i, \hfill \\ \cdots \hfill \\ d_{3t - 2} = d_{3t - 1} = d_{3t} = \mathop \sum \limits_{i = 1}^{t} i. \hfill \\ \end{gathered}$$

The scheduling measure is minimum number of tardy jobs.
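The construction above can be sketched in code. The following Python helper (an illustrative sketch, with assumed names and 0-indexed positions, not taken from the paper) builds the processing-time matrix and generalized due-dates from a 3-Partition instance:

```python
def build_instance(a):
    """Build the reduction's scheduling instance from a 3-Partition
    multiset a of size 3t (with sum t): p[j][r] = k * a[j] and
    d[r] = 1 + 2 + ... + k, where k is the (1-indexed) triplet index
    of position r, i.e., k = floor(r / 3) + 1 for 0-indexed r."""
    t = len(a) // 3
    n = 3 * t
    p = [[((r // 3) + 1) * a[j] for r in range(n)] for j in range(n)]
    d = [sum(range(1, (r // 3) + 2)) for r in range(n)]
    return p, d
```

For instance, with \(t=2\) and `a = [0.3, 0.3, 0.4, 0.3, 0.3, 0.4]`, the due-dates come out as `[1, 1, 1, 3, 3, 3]`, matching the construction.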

The recognition version (RV) of this scheduling problem: Is there a schedule with no tardy jobs?

We prove in the following that there is a YES answer to 3-Partition if and only if there is a YES answer to RV.

(\(\Rightarrow\)) Assume first that there is a YES solution to 3-partition. Then, schedule the jobs of the first triplet \({A}_{1}\) in the first 3 positions, the jobs of the second triplet \({A}_{2}\) in the next 3 positions, etc. It follows that the third job is completed at time \({d}_{3}=1,\) the sixth job is completed exactly at time \({d}_{6}=3\), and so on; see Fig. 1. The resulting sequence contains no tardy jobs.

Fig. 1
figure 1

A feasible schedule of \(3t\) jobs with no tardy jobs

(\(\Leftarrow\)) Assume now that there is NO solution to 3-Partition. It follows that any allocation to triplets contains some triplets with total load strictly smaller than 1 and some with total load strictly larger than 1. Consider first the case that there is a single triplet of each type: triplet \({A}_{k}\), in which the sum of the elements is strictly smaller than 1 (say, \(1-\epsilon\)), and triplet \({A}_{l}\), in which the sum of the elements is strictly larger than 1 (\(1+\epsilon\)); the general case, with several triplets of each type, is handled similarly.

We create a schedule based on this allocation to triplets. Consider first a schedule in which \(l<k\), i.e., the triplet of jobs \({A}_{l}\) is processed before the triplet \({A}_{k}\). In this case, the completion time of job \(3l\) (the last job in this triplet) is \(1+2+\dots +(l-1)+l\left(1+\epsilon \right)={\sum }_{i=1}^{l}i+l\epsilon >{\sum }_{i=1}^{l}i={d}_{3l}\). Hence, this schedule contains at least one tardy job (Fig. 2). Assume now that the triplet of jobs \({A}_{l}\) is processed after the triplet \({A}_{k}\), and that there are \(x\) positions between their triplet indices (i.e., \(l=k+x\)). Then, the completion time of job \(3l\) (the last job in triplet \({A}_{l}\)) is \(1+2+\dots +(k-1)+k\left(1-\epsilon \right)+\left(k+1\right)+\left(k+2\right)+\dots +\left(k+x-1\right)+(k+x)(1+\epsilon )={\sum }_{i=1}^{l}i-k\epsilon +\left(k+x\right)\epsilon ={\sum }_{i=1}^{l}i+x\epsilon >{\sum }_{i=1}^{l}i={d}_{3l}\). Hence, in this case as well, the schedule contains at least one tardy job. It follows that when there is no solution to 3-Partition, every schedule contains a tardy job, i.e., the answer to RV is NO. ■

Fig. 2
figure 2

Average running time of algorithm F as a function of the number of jobs for different tightness factors (logarithmic scale)

4 A heuristic and an exact solution algorithm

Since the problem \(1 \left| {p}_{jr}, gdd \right|\sum {U}_{r}\) is NP-hard in the strong sense, we focus in this section on introducing a simple heuristic and an efficient exact solution algorithm. The heuristic (denoted H) assigns the jobs sequentially from position 1 to position \(n\). It is of a greedy nature: at each iteration (position), the shortest possible job is selected. Let \(G\) denote the set of jobs not yet scheduled. Initially, \(G=\left\{1,\dots ,n\right\}\). Let \(\overrightarrow{S}\) denote the resulting job sequence; the initial sequence is empty. In the first iteration, we schedule the shortest job in the first position (\(i=1\)). The index of this job is denoted by \(minJobIndex\): \(minJobIndex=\mathrm{argmin}_{j\in G}\left\{{p}_{j1}\right\}\). This job is removed from \(G\) (\(G=G\backslash \left\{minJobIndex\right\}\)) and added to the job sequence \(\overrightarrow{S}\) in position \(i=1\). We repeat the same job-selection procedure (over the unscheduled jobs) for the remaining positions, \(2\) to \(n\). The following pseudo-code introduces the heuristic in detail:

figure a

Running Time: We consider all \(n\) positions. For each position, \(O(n)\) jobs are checked. The calculation of the completion time of the selected job requires constant time. It follows that the total running time of H is \(O\left({n}^{2}\right)\).
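Since the pseudo-code appears only as a figure in the original, the greedy rule can be conveyed by the following Python sketch (an illustrative reimplementation under assumed 0-indexed inputs, not the paper's C code):

```python
def heuristic_H(p, d):
    """Greedy heuristic H: for each position r in turn, schedule the
    unscheduled job with the shortest processing time at position r.
    Returns the sequence and its number of tardy jobs."""
    n = len(d)
    G = set(range(n))          # unscheduled jobs
    S = []                     # resulting sequence
    completion, tardy = 0, 0
    for r in range(n):
        j_min = min(G, key=lambda j: p[j][r])   # argmin of p[j][r] over G
        G.remove(j_min)
        S.append(j_min)
        completion += p[j_min][r]
        if completion > d[r]:
            tardy += 1
    return S, tardy
```

Each of the \(n\) iterations scans \(O(n)\) candidate jobs, matching the \(O(n^2)\) bound stated above.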

We now introduce an exact (non-polynomial) algorithm, denoted F, that solves problem \(1 \left| {p}_{jr}, gdd \right|\sum {U}_{r}\) to optimality. The algorithm starts by running H, thus obtaining a feasible schedule and an upper bound on the optimal number of tardy jobs. The procedure builds partial schedules: at each iteration a job is added to the sequence, and the number of tardy jobs (so far) and the upper bound are updated. The evaluation of all \(n!\) schedules is avoided by discarding partial schedules whose number of tardy jobs (\(minNumOfTardyJobs\)) exceeds the current upper bound. (It should be noted that in the worst case, the algorithm evaluates the assignment of all jobs to all positions.) After obtaining the initial upper bound, Algorithm F calls function L, which evaluates all options of adding the unscheduled jobs to the remaining positions.

In the following we introduce Algorithm F and the function L.

figure b

\({\varvec{F}} ({p}_{jr},{d}_{r})\)
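As Algorithm F and function L are likewise shown only as figures, the following Python sketch illustrates the branch-and-bound idea described above. It is an assumed reconstruction, not the paper's implementation: for simplicity it initializes the upper bound with the trivial value \(n\) rather than with H's value, and prunes any partial schedule whose tardy count already reaches the best bound found so far:

```python
def algorithm_F(p, d):
    """Branch-and-bound sketch: enumerate assignments of jobs to
    positions, pruning partial schedules whose number of tardy jobs
    already reaches the current upper bound. Returns the optimal
    number of tardy jobs."""
    n = len(d)
    best = n  # trivial upper bound (the paper initializes with H's value)

    def extend(r, completion, tardy, unscheduled):
        nonlocal best
        if tardy >= best:      # prune: cannot beat the current bound
            return
        if r == n:             # complete schedule with fewer tardy jobs
            best = tardy
            return
        for j in list(unscheduled):
            c = completion + p[j][r]
            unscheduled.remove(j)
            extend(r + 1, c, tardy + (1 if c > d[r] else 0), unscheduled)
            unscheduled.add(j)

    extend(0, 0, 0, set(range(n)))
    return best
```

In the worst case this still tries every assignment of jobs to positions, as noted above, but the pruning step discards large parts of the search tree in practice.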

Special Case 1: Position-independent processing times (\(1 \left| gdd \right|\sum {U}_{r}\)).

The special case of position-independent processing times (i.e., the case that \({p}_{jr}={p}_{j};\, j,r=1,\dots ,n\)) was shown to be solved by the SPT (Shortest Processing Time first) policy; see Hall [1].

Special Case 2: A common due-date (\(1 \left| {p}_{jr}, {d}_{j}=d \right|\sum {U}_{r}\)).

A special case of general learning curves with a common due-date was studied by Mosheiov and Sidney [35], who proved that an optimal solution is obtained in polynomial time. A similar idea can be used for solving the setting of general position-dependent processing times.

We start by fixing a number \(k\) (\(1\le k\le n\)). Then we solve the problem of minimizing the makespan of \(k\) jobs (out of the original set of \(n\) jobs). This problem can be formulated as a Linear Assignment Problem (LAP), in which \(n\) jobs need to be assigned to \(k\) positions. Thus, the input matrix is of size \(n\times k\), and each entry contains the processing time of job \(i\) if assigned to the \(j\)-th position (\({p}_{ij}\), \(i=1,\dots ,n\), \(j=1,\dots ,k\)). The LAP is the following:

$$\begin{gathered} {\text{MIN}} \mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{k} X_{ij} p_{ij} \hfill \\ {\text{S}}.{\text{T}}. \hfill \\ \mathop \sum \limits_{j = 1}^{k} X_{ij} \le 1 , i = 1, \ldots ,n \hfill \\ \mathop \sum \limits_{i = 1}^{n} X_{ij} = 1 , j = 1, \ldots ,k \hfill \\ X_{ij} binary, i = 1, \ldots ,n, j = 1, \ldots ,k. \hfill \\ \end{gathered}$$

The optimal value of this LAP, denoted \({C}_{max}(k)\), is the minimal makespan of \(k\) jobs. This value is compared to \(d\): if \({C}_{max}(k)\le d\), then \(k\) jobs can be completed on time (leading to \(n-k\) tardy jobs). The procedure is repeated for all values of \(k\), and the largest \(k\) for which \({C}_{max}(k)\le d\) yields the optimal solution. Running time: each LAP is solved in \(O({n}^{3})\) time, and the procedure is repeated \(O(n)\) times, leading to a total running time of \(O({n}^{4})\).
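The procedure can be sketched as follows. For illustration only, the \(O(n^3)\) LAP solver is replaced by an exhaustive search over orderings (exponential, so suitable for tiny instances only); function and variable names are assumptions, not from the paper:

```python
from itertools import permutations

def max_on_time_common_due_date(p, d):
    """Special Case 2 sketch: find the largest k such that some k jobs,
    assigned to positions 0..k-1, all finish by the common due-date d.
    The optimal number of tardy jobs is then n - k. Brute force stands
    in for the paper's O(n^3) LAP solver."""
    n = len(p)
    best_k = 0
    for k in range(1, n + 1):
        # C_max(k): minimal makespan over all ways to fill k positions
        cmax = min(
            sum(p[j][r] for r, j in enumerate(perm))
            for perm in permutations(range(n), k)
        )
        if cmax <= d:
            best_k = k     # k jobs can all complete on time
    return best_k
```

A practical implementation would solve each \(n\times k\) assignment problem with a polynomial LAP algorithm (e.g., the Hungarian method) instead of the exhaustive search above.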

5 Numerical study

We tested numerically the performance of the exact algorithm F and the heuristic H. In all our numerical experiments, the job-position processing times were generated uniformly in the interval \([1,{p}_{max}=100]\). The generalized due-dates were generated uniformly in the interval \([1,{d}_{max}=\alpha P]\), where \(P=n{p}_{max}\) (\(n\) is the number of jobs) and \(\alpha\) is the tightness factor. We assumed \(\alpha =0.25, 0.5,\dots ,1.5\). For each combination of \(n\) and \(\alpha\), 10 instances were generated and solved. [All algorithms were coded in C and executed on a Macintosh with a 2.7 GHz Intel Core i7 processor and 16 GB RAM.]

We first evaluated Algorithm F assuming a small tightness factor, \(\alpha =0.25\). In this part of the numerical study, only small instances were solved (up to 14 jobs), and all of them were solved to optimality. Table 1 reports the average and worst-case running times for instances of \(n=10,11,\dots ,14\). Table 1 is limited to instances of this size because the running time for \(n=14\) increased significantly, reaching an average of 611 s.

Next, the running time of Algorithm F was measured for larger tightness factors: \(\alpha =0.5, 0.75,\dots ,1.5\). For these values, we were able to solve instances with larger numbers of jobs: \(n=10,15,\dots ,30\). The running times, reported in Table 2, indicate that instances of up to 25 jobs were solved in a few milliseconds. However, for \(n=30\) and \(\alpha =1\), a single instance required more than 2.7 s. A logarithmic-scale graph of the average running time as a function of the number of jobs for different tightness factors is provided in Fig. 2.

In the last part of the numerical study, the proposed heuristic \(H\) was evaluated. The results of \(H\) were compared to those obtained by Algorithm F given an upper bound of 60 s on its running time. We say that \(H\) reached an optimal schedule for a given instance if its solution is identical to that obtained by the modified Algorithm F (within the 60-s limit). As above, the tightness factors were \(\alpha =0.5, 0.75,\dots ,1.5\), and larger instances were solved: \(n=10,20,\dots ,50\). Again, for each combination of \(n\) and \(\alpha\), 10 problems were generated and solved. Table 3 and Fig. 3 provide the number of times that heuristic \(H\) reached an optimal schedule (as a function of the number of jobs and the tightness factor). Note that \(H\) performs well for small- and medium-size problems: for instances of up to 30 jobs, the optimum was not obtained by the heuristic in only one case out of 300. Also, for the (relatively) large tightness factor \(\alpha =1.5\), the heuristic missed the optimum in only one case out of 250. Based on these promising results, we believe that Heuristic \(H\) can be used in most practical settings.

Table 1 Average and worst-case run times (sec) for Algorithm F with the smallest tightness factor (\(\alpha =0.25\))
Table 2 Average and worst-case run times (sec) for Algorithm F with different tightness factors (\(\alpha =0.5, 0.75, 1, 1.25, 1.5\))
Table 3 Number of times that algorithm H reached the optimum with different tightness factors (\(\alpha =0.5, 0.75, 1, 1.25, 1.5\))
Fig. 3
figure 3

Number of optimal schedules obtained by algorithm H as a function of the number of jobs for different tightness factors

6 Conclusion

We studied a single-machine scheduling problem of minimizing the number of tardy jobs. The special features considered are generalized due-dates and general position-dependent job processing times. We first proved that the problem is NP-hard in the strong sense. Hence, a simple greedy heuristic and an exact algorithm were introduced, and both procedures were tested numerically. The exact algorithm was shown to handle medium-size instances, and the heuristic reached the optimum in the vast majority of the (larger) instances in our tests.

Since no studies considering both generalized due-dates and general position-dependent processing times have been published, future research may focus on other settings combining these two features. Challenging options are other machine settings (multiple machines or shops) and other scheduling measures. In addition, we note that the complexity of the problem of minimizing the number of tardy jobs with general position-dependent processing times (and standard job-dependent due-dates) is still unknown, and is clearly another possible topic for future research.