
1 Introduction

The Unrelated Parallel Machine Scheduling Problem with Setup Times (UPMSP-ST) consists of scheduling a set N of n independent jobs on a set M of m unrelated parallel machines. Each job \(j \in N\) must be processed exactly once by exactly one machine \(i \in M\), and requires a processing time \(p_{ij}\). Each machine can process only one job at a time. In addition, job execution requires a setup time \(S_{ijk}\), which depends on the machine i and on the sequence in which jobs j and k are processed. The objective is to minimize the makespan.
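For concreteness, the makespan of a schedule, represented as one job sequence per machine, can be computed by accumulating setup and processing times along each sequence and taking the largest machine completion time. The sketch below is our own illustration of these definitions (the encoding, with job 0 as the dummy start job, is an assumption, not the authors' code):

```cpp
#include <algorithm>
#include <vector>

// Completion time of one machine: accumulate setup + processing along its
// job sequence. setup[j][k] plays the role of S_{ijk} for this machine,
// with j = 0 denoting the dummy start job; p[k] plays the role of p_{ik}.
int machineCompletion(const std::vector<int>& seq,
                      const std::vector<int>& p,
                      const std::vector<std::vector<int>>& setup) {
    int c = 0, prev = 0;                       // start from the dummy job 0
    for (int k : seq) { c += setup[prev][k] + p[k]; prev = k; }
    return c;
}

// Makespan: the largest completion time over all machines.
int makespan(const std::vector<std::vector<int>>& schedule,
             const std::vector<std::vector<int>>& p,               // p[i][k]
             const std::vector<std::vector<std::vector<int>>>& setup) {
    int cmax = 0;
    for (std::size_t i = 0; i < schedule.size(); ++i)
        cmax = std::max(cmax, machineCompletion(schedule[i], p[i], setup[i]));
    return cmax;
}
```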

The study of the UPMSP-ST is relevant due to its theoretical and practical importance. From a theoretical point of view, it attracts the interest of researchers because it is NP-hard, since it is a generalization of the Parallel Machine Scheduling Problem with Identical Machines [1]. In practice, it is found in a large number of industries, such as the textile industry [2]. According to Avalos-Rosales et al. [3], in many situations where there are different production capacities, the setup time of a machine depends on the job previously processed [4]. This situation is also found in the manufacture of chemical products, where reactors must be cleaned between the handling of two mixtures; however, the time required for cleaning depends on the jobs that were previously completed [5].

In this work, a hybrid algorithm, named e-SGVNS, is proposed. It is an improvement of the SGVNS algorithm of Rego and Souza [6], and it is based on the General Variable Neighborhood Search – GVNS [7]. It explores the solution space through five strategies: swap of jobs on the same machine; insertion of a job on the same machine; swap of jobs between machines; insertion of jobs on different machines; and the application of a mathematical programming formulation, based on the time-dependent traveling salesman problem, to obtain the optimal solution to the sequencing problem on each machine. The first four strategies are used as the shaking mechanism, while the last three are applied as local search through the Variable Neighborhood Descent. Unlike SGVNS, the proposed algorithm limits the increase of the perturbation level. In addition, it applies the MILP to all machines whose completion time is equal to the makespan, and not just to a single machine that meets this condition. This algorithm has been shown to be competitive when compared to state-of-the-art algorithms.

The remainder of this paper is organized as follows: Sect. 2 gives a brief review of the literature. In Sect. 3, a mathematical programming formulation for the problem is presented. In Sect. 4, the proposed algorithm is detailed. Section 5 describes the computational experiments. The results are presented in Sect. 6, while in Sect. 7 the work is concluded.

2 Related Work

Santos et al. [8] implemented four different stochastic local search (SLS) methods for the UPMSP-ST. The algorithms explore six different neighborhoods. The computational results show that the SLS algorithms produce good results, outperforming the current best algorithms for the UPMSP-ST. They updated 901 best-known solutions from 1000 instances used for testing.

Arnaout [9] introduced and applied a Worm Optimization (WO) algorithm for the UPMSP-ST. The WO algorithm is based on the behaviors of the worm, a nematode with only 302 neurons. The WO algorithm was compared to tabu search (TS), ant colony optimization (ACO), restrictive simulated annealing (RSA), genetic algorithm (GA), and ABC/HABC. The experiments showed the superiority of WO, followed by HABC, ABC, RSA, GA, ACO, and TS last.

Arnaout et al. [10] proposed a two-stage Ant Colony Optimization algorithm (ACOII) for the UPMSP-ST. This algorithm is an enhancement of the ACOI algorithm that was introduced in [11]. An extensive set of experiments was performed to verify the quality of the method. The results proved the superiority of the ACOII in relation to the other algorithms with which it was compared.

Tran et al. [5] introduced a new mathematical formulation for the UPMSP-ST. This formulation provides stronger dual bounds, making it more efficient for finding the optimal solution. The computational experiments showed that it makes it possible to solve larger instances than previously existing formulations.

A variant of the Large Neighborhood Search metaheuristic, named LA-ALNS, which uses Learning Automata to adapt the selection probabilities of the removal and insertion heuristics, is presented by Cota et al. [12] for the UPMSP-ST. The algorithm was used to solve instances of up to 150 jobs and 10 machines. The LA-ALNS was compared with three other algorithms, and the results show that the developed method performs best in 88% of the instances. In addition, statistical tests indicated that LA-ALNS is better than the other algorithms found in the literature.

The UPMSP-ST was also approached by Fanjul-Peyro and Ruiz [13]. Seven algorithms were proposed: IG, NSP, VIR, IG+, NSP+, VIR+ and NVST-IG+. The first three are the base algorithms, the following three are improved versions of them, and the last algorithm is a combination of the best ideas from the previous ones. These methods are mainly composed of a solution initialization, a Variable Neighborhood Descent – VND method [7] and a solution modification procedure. Tests were performed with 1400 instances, and it was shown that the results were statistically better than those of the algorithms previously considered state-of-the-art [14, 15].

A Genetic Algorithm was proposed by Vallada and Ruiz [16] for the UPMSP-ST. The algorithm includes a fast local search and a new crossover operator. Furthermore, the work also provides a mixed integer linear programming model for the problem. After several statistical analyses, the authors concluded that their method provides better results for small and, especially, for large instances, when compared with other methods of the literature at the time [17, 18].

Rego and Souza [6] proposed the SGVNS algorithm for treating the UPMSP-ST. It explores the solution space by three local search strategies: insertion of jobs on different machines, swap of jobs between machines, and the application of a mixed integer linear programming formulation to obtain the optimum scheduling on each machine. The SGVNS algorithm was tested on 810 instances and compared to other literature methods (ACOII, AIRP and LA-ALNS). SGVNS had better performance on small instances, while the results of LA-ALNS and ACOII were significantly better than those of SGVNS overall. Even so, SGVNS was superior in 5 groups of instances and able to find the best results in 79 of the 810 instances.

3 Mathematical Formulation

This section provides a Mixed Integer Linear Programming (MILP) formulation for the unrelated parallel machine scheduling problem with sequence-dependent setup times with the objective of minimizing the makespan. This formulation was proposed by Tran et al. [5].

In order to introduce this MILP, the parameters and decision variables are defined and shown in Table 1.

The objective function is given by Eq. (1):

$$\begin{aligned} \text {min } C_{\max }, \end{aligned}$$
(1)

and the constraints are given by Eqs. (2)–(13):

$$\begin{aligned} \sum _{i \in M}^{} \sum _{\begin{array}{c} j \in N \cup \{0\},\\ j \ne k \end{array}}^{} X_{ijk} = 1&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall k \in N \end{aligned}$$
(2)
$$\begin{aligned} \sum _{i \in M}^{} \sum _{\begin{array}{c} k \in N \cup \{0\},\\ j \ne k \end{array}} X_{ijk} = 1&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall j \in N \end{aligned}$$
(3)
$$\begin{aligned} \sum _{\begin{array}{c} k \in N \cup \{0\},\\ k \ne j \end{array}} X_{ijk} = \sum _{\begin{array}{c} h \in N \cup \{0\},\\ h \ne j \end{array}} X_{ihj}&\qquad \qquad \qquad \qquad \qquad \,\, \forall j \in N,~\forall i \in M \end{aligned}$$
(4)
$$\begin{aligned} C_k \geqslant C_j + S_{ijk} + p_{ik} - V(1 - X_{ijk})&\qquad \forall j \in N \cup \{0\}, \forall k \in N, j \ne k, \forall i \in M \end{aligned}$$
(5)
$$\begin{aligned} \sum _{j \in N} X_{i0j} \leqslant 1&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \,\,\, \forall i \in M \end{aligned}$$
(6)
$$\begin{aligned} C_{0} = 0&\end{aligned}$$
(7)
$$\begin{aligned} \sum _{k \in N} \sum _{\begin{array}{c} j \in N \cup \{0\},\\ j \ne k \end{array}} (S_{ijk} + p_{ik})X_{ijk} = O_i ,&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall i \in M, \end{aligned}$$
(8)
$$\begin{aligned} O_{i} \leqslant C_{\text {max}},&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall i \in M,\end{aligned}$$
(9)
$$\begin{aligned} X_{ijk} \in \{0,1\}&\,\,\,\qquad \forall j \in N \cup \{0\}, \forall k \in N, j \ne k, \forall i \in M, \end{aligned}$$
(10)
$$\begin{aligned} C_{j} \ge 0&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \,\,\, \forall j \in N \end{aligned}$$
(11)
$$\begin{aligned} O_{i} \ge 0&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \,\,\, \forall i \in M \end{aligned}$$
(12)
$$\begin{aligned} C_{\text {max}} \ge 0&\end{aligned}$$
(13)
Table 1. Parameters and decision variables of Tran et al. [5] model.

Equation (1) defines the objective function of the problem, which is to minimize the maximum completion time, or makespan. Equations (2)–(13) define the constraints of the model. The constraint set (2) ensures that each job is assigned to exactly one machine and has exactly one predecessor job. Constraints (3) define that every job has exactly one successor job. Each constraint (4) establishes that if a job j is scheduled on a machine i, then a predecessor job h and a successor job k must exist on the same machine. Constraints (5) ensure the correct processing order: if a job k is assigned to a machine i immediately after job j, that is, if \(X_{ijk} = 1\), the completion time \(C_k\) of job k must be greater than or equal to the completion time \(C_j\) of job j plus the setup time \(S_{ijk}\) between jobs j and k and the processing time \(p_{ik}\) of k on machine i. If \(X_{ijk} = 0\), then a sufficiently large value V makes this constraint redundant. Constraint set (6) establishes that at most one job is scheduled as the first job on each machine. Constraint (7) establishes that the completion time of the dummy job is zero. Constraints (8) compute, for each machine, the time at which it finishes processing its last job. Constraints (9) ensure that the makespan is at least the completion time of each machine. Constraints (10)–(13) define the domains of the decision variables.

4 The Enhanced Smart GVNS Algorithm

The algorithm presented in this work, named e-SGVNS, is an improvement of the SGVNS algorithm of Rego and Souza [6]. In turn, SGVNS is a variant of the General Variable Neighborhood Search (GVNS) metaheuristic [7].

This metaheuristic performs systematic neighborhood exchanges to explore the solution space of the problem. It uses the Variable Neighborhood Descent procedure – VND [19], described in Sect. 4.4, as the local search procedure, and it has a perturbation phase, described in Sect. 4.2, in order to avoid getting stuck in local optima.

The perturbation phase of e-SGVNS depends on the perturbation level of the algorithm. This level is increased whenever a certain number of VND applications occur without producing an improvement in the current solution. The e-SGVNS was implemented according to Algorithm 1:

figure a

Algorithm 1 has the following inputs: (1) the stopping criterion, which in our case is the CPU time limit t described in Sect. 5.2; (2) MaxP, the maximum perturbation level; (3) MaxSameLevelP, the maximum number of iterations without improvement in f(s) at the same perturbation level; and (4) the set \(\mathcal {N}\) of neighborhoods. In line 1, the solution s is initialized by the procedure defined in Sect. 4.1. The loop in lines 5–23 is repeated while the stopping criterion is not satisfied. In line 6, a random neighbor \(s'\) is generated by a perturbation performed according to the procedure defined in Sect. 4.2. In line 7, a local search on \(s'\) using the neighborhood structures described in Sect. 4.3 is performed. It stops when it finds the first solution that is better than s or when the whole neighborhood has been explored. The solution returned by this local search is assigned to \(s''\) if its value is better than that of the current solution. Otherwise, the procedure continues the exploration from a new neighborhood structure.
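The control flow just described can be sketched generically as follows. This is our simplified reading of Algorithm 1, not the authors' code: the solution type and the shaking and local search routines are abstract placeholders standing for the procedures of Sects. 4.1–4.4, and a fixed iteration count stands in for the CPU time limit.

```cpp
#include <algorithm>
#include <functional>

// e-SGVNS-style control loop: shake with perturbation level p, refine with a
// local search (e.g. VND), accept improvements, and raise p after
// maxSameLevelP non-improving iterations, capping it at maxP and resetting it
// to its initial value 2 on every improvement.
template <typename S>
S esgvnsLoop(S s,
             std::function<double(const S&)> f,
             std::function<S(const S&, int)> shake,   // applies p random moves
             std::function<S(const S&)> localSearch,
             int maxP, int maxSameLevelP, int iterations) {
    int p = 2, sameLevel = 0;
    for (int it = 0; it < iterations; ++it) {     // stand-in for the time limit
        S s2 = localSearch(shake(s, p));
        if (f(s2) < f(s)) {                       // improvement: accept, reset
            s = s2; p = 2; sameLevel = 0;
        } else if (++sameLevel >= maxSameLevelP) { // stuck: raise perturbation
            p = std::min(p + 1, maxP);
            sameLevel = 0;
        }
    }
    return s;
}
```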

4.1 Initial Solution

An initial solution to the problem is constructed according to Algorithm 2.

figure b

Algorithm 2 receives as input the sets M and N of machines and jobs, respectively. At each iteration, it looks for a position j on a machine i at which to insert job k into the schedule, always choosing the position that yields the smallest increase in the objective function, according to Eq. (1). These steps are repeated for all jobs, so the procedure ends when every job has been allocated to some machine.
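The constructive step above can be sketched as a greedy insertion procedure. The sketch below is our illustration in the spirit of Algorithm 2, under the same schedule encoding assumed earlier (job 0 as the dummy start job); it is not the authors' implementation.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Greedy constructive heuristic: insert each job at the machine/position pair
// that least increases the makespan. p[i][k] stands for p_{ik} and
// setup[i][j][k] for S_{ijk}, with job 0 as the dummy start job.
std::vector<std::vector<int>> greedyInitial(
        int m, const std::vector<int>& jobs,
        const std::vector<std::vector<int>>& p,
        const std::vector<std::vector<std::vector<int>>>& setup) {
    std::vector<std::vector<int>> sched(m);
    auto completion = [&](int i, const std::vector<int>& seq) {
        int c = 0, prev = 0;
        for (int k : seq) { c += setup[i][prev][k] + p[i][k]; prev = k; }
        return c;
    };
    for (int k : jobs) {
        int bestI = 0, bestMax = std::numeric_limits<int>::max();
        std::size_t bestPos = 0;
        for (int i = 0; i < m; ++i)
            for (std::size_t pos = 0; pos <= sched[i].size(); ++pos) {
                std::vector<int> trial = sched[i];
                trial.insert(trial.begin() + pos, k);
                int cmax = completion(i, trial);         // new load of machine i
                for (int i2 = 0; i2 < m; ++i2)           // loads of the others
                    if (i2 != i) cmax = std::max(cmax, completion(i2, sched[i2]));
                if (cmax < bestMax) { bestMax = cmax; bestI = i; bestPos = pos; }
            }
        sched[bestI].insert(sched[bestI].begin() + bestPos, k);
    }
    return sched;
}
```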

4.2 Shaking

The shaking procedure is an important phase of a VNS-based algorithm. It is applied so that the local search is not confined to the same region of the solution space of the problem and, consequently, other solutions can be explored. The shaking procedure implemented here progressively increases the level of perturbation of a solution when it is stuck in a local optimum.

The shaking procedure consists of applying to the current solution p moves chosen among the following: (1) change of execution order of two jobs on the same machine; (2) change of execution order of two jobs belonging to different machines; (3) insertion of a job from a machine into another position of the same machine and (4) insertion of a job from one machine into a position of another machine.

It works as follows: p independent moves are applied consecutively to the current solution s, generating an intermediate solution \(s'\). This solution \(s'\) is then refined by the VND local search method (line 7 of Algorithm 1). The level of perturbation p increases after a certain number of attempts to explore the neighborhood without improvement in the current solution; this limit is controlled by the parameter MaxSameLevelP. When p increases, p random moves (chosen from those mentioned above) are applied to the current solution. Whenever there is an improvement in the current solution, the perturbation returns to its initial level, \(p = 2\).

The operation of each type of perturbation is detailed below:

  • Swap on the Same Machine. This operation consists of randomly choosing two jobs \(j_1\) and \(j_2\) that are, respectively, in the positions x and y of a machine i, and allocating \(j_1\) in the position y and \(j_2\) in the position x of the same machine i.

  • Swap between Different Machines. This perturbation consists in randomly choosing a job \(j_1\) that is in the position x on a machine \(i_1\) and another job \(j_2\) that is in the position y of the machine \(i_2\). Then, job \(j_1\) is allocated to machine \(i_2\) in position y, and job \(j_2\) is allocated to machine \(i_1\) in position x.

  • Insertion on the Same Machine. It starts with the random choice of a job \(j_1\) that is initially in the position x of the machine i. Then, a random choice of another position y of the same machine is made. Finally, job \(j_1\) is removed from position x and inserted into position y of machine i.

  • Insertion between Different Machines. It consists of a random choice of a job \(j_1\) that is in the position x of the machine \(i_1\) and a random choice of position y of the machine \(i_2\). Then, the job \(j_1\) is removed from machine \(i_1\) and inserted into position y of machine \(i_2\).
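Each of the four moves reduces to swap or erase/insert operations on the per-machine job sequences. A minimal index-level sketch (our illustration, not the authors' code; positions are assumed valid, and in the insertion moves the target position y refers to the sequence after the removal):

```cpp
#include <utility>
#include <vector>

using Schedule = std::vector<std::vector<int>>;  // one job sequence per machine

// Swap on the same machine: exchange the jobs at positions x and y of machine i.
void swapSameMachine(Schedule& s, int i, int x, int y) {
    std::swap(s[i][x], s[i][y]);
}

// Swap between different machines: exchange the job at position x of machine
// i1 with the job at position y of machine i2.
void swapBetweenMachines(Schedule& s, int i1, int x, int i2, int y) {
    std::swap(s[i1][x], s[i2][y]);
}

// Insertion on the same machine: remove the job at position x of machine i
// and reinsert it at position y.
void insertSameMachine(Schedule& s, int i, int x, int y) {
    int job = s[i][x];
    s[i].erase(s[i].begin() + x);
    s[i].insert(s[i].begin() + y, job);
}

// Insertion between different machines: remove the job at position x of
// machine i1 and insert it at position y of machine i2.
void insertBetweenMachines(Schedule& s, int i1, int x, int i2, int y) {
    int job = s[i1][x];
    s[i1].erase(s[i1].begin() + x);
    s[i2].insert(s[i2].begin() + y, job);
}
```

The shaking driver then simply draws one of these four moves, with uniformly random machines and positions, p times in a row.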

4.3 Neighborhoods

We used three neighborhood structures to explore the solution space of the problem, and they are described below.

Fig. 1.
figure 1

Insertion move between machines \(i_1\) and \(i_2\).

Fig. 2.
figure 2

Swap move between machines \(i_1\) and \(i_2\).

  • \(\mathcal {N}_1\): Insertion between Machines. Let \(\pi \) and \(\sigma \) be two schedules, where \(\pi = (\pi _1, \pi _2, \dots , \pi _t)\) is performed on machine \(i_1\) and \(\sigma = (\sigma _1, \sigma _2, \dots , \sigma _r)\) on machine \(i_2\). In these schedules, t and r represent the number of jobs on machines \(i_1\) and \(i_2\), respectively. In this neighborhood, each job \(\pi _x \in \pi \) is removed from machine \(i_1\) and added to machine \(i_2\) at position \(y \in \{1, \cdots , r+1\}\). The set of insertion moves of the jobs of a machine \(i_1\) in every possible position of another machine \(i_2\) defines the neighborhood \(\mathcal {N}_1 (\pi , \sigma )\), which is composed of \(t \times (r+1)\) neighbors.

    Figure 1 illustrates an insertion move of a job \(\pi _x\) of a machine \(i_1\) in the position y of the machine \(i_2\). The right side of this figure shows the result of applying this move.

  • \(\mathcal {N}_2\): Swap Move between Machines. Let \(\pi \) and \(\sigma \) be two schedules as described above. Let also be two jobs \(\pi _x \in \pi \) and \(\sigma _y \in \sigma \). The swap move between machines consists in swapping these jobs between these schedules, that is, to move the job \(\pi _x\) to the position y of the machine \(i_2\) and the job \(\sigma _y\) to the position x of the machine \(i_1\). The set of swap moves between machines \(i_1\) and \(i_2\) defines the neighborhood \(\mathcal {N} _2 (\pi , \sigma )\), formed by \(t \times r\) neighbors.

    Figure 2 illustrates the swap between two jobs \(\pi _x\) and \(\sigma _y\), which are initially allocated to machines \(i_1\) and \(i_2\), respectively. After the swap move, the job \(\sigma _y\) is allocated to machine \(i_1\) and the job \(\pi _x\) to machine \(i_2\).

  • \(\mathcal {N}_3\): Scheduling by Mathematical Programming. In this local search, the objective is to determine the best scheduling of the jobs in each machine by applying a MILP formulation. For this, the time-dependent traveling salesman problem (TDTSP) formulation of Bigras et al. [20] was adapted, where the distance between the cities i and j is represented by the sum of the processing time of job i and the setup time between jobs i and j. In addition, a dummy job 0 was added to allow the creation of a Hamiltonian cycle, where 0 represents the first and the last job.

    So, the MILP formulation is solved for the sequencing problem in each machine in which the completion time is equal to the makespan. If there is an improvement in the current solution, the local search method returns to the first neighborhood (\(\mathcal {N}_1\)). If there is no improvement in the current machine and there is another machine whose completion time is equal to the makespan, then the model is applied to this machine. If there is no improvement by applying this formulation, then the exploration in this neighborhood \(\mathcal {N}_3\) is ended.

    In order to introduce this MILP, the parameters and decision variables are defined and shown in Table 2. The other parameters used by the model are described in Table 1.

Table 2. Parameters and decision variables based on the Bigras et al. [20] model.

Then, the mathematical formulation used as the local search strategy in each machine i is given by Eqs. (14)–(20).

Objective function:

$$\begin{aligned} \text {min } C_{\max }^{i}, \end{aligned}$$
(14)

Subject to:

$$\begin{aligned} \sum _{\begin{array}{c} j \in N_i \cup \{0\},\\ j \ne k \end{array}} Y_{jk} = 1&\qquad \qquad \qquad \qquad \qquad \qquad \forall k \in N_i \end{aligned}$$
(15)
$$\begin{aligned} \sum _{\begin{array}{c} k \in N_i \cup \{0\},\\ j \ne k \end{array}} Y_{jk} = 1&\qquad \qquad \qquad \qquad \qquad \qquad \forall j \in N_i \end{aligned}$$
(16)
$$\begin{aligned} \sum _{\begin{array}{c} k \in N_i \end{array}}\sum _{\begin{array}{c} j \in N_i \cup \{0\},\\ j \ne k \end{array}} (S_{ijk} + p_{ik})Y_{jk} = C_{\max }^{i}&\end{aligned}$$
(17)
$$\begin{aligned} \sum _{\begin{array}{c} j \in \delta \end{array}}~\sum _{k \notin \delta } Y_{jk} \ge 1&\qquad \qquad \qquad \qquad \qquad \forall \delta \subset N_i, \delta \ne \emptyset \end{aligned}$$
(18)
$$\begin{aligned} Y_{jk} \in \{0,1\}&\qquad \qquad \forall j \in \{0\} \cup N_i,~\forall k \in \{0\} \cup N_i, j \ne k \end{aligned}$$
(19)
$$\begin{aligned} C_{\max }^{i} \ge 0&\end{aligned}$$
(20)

Equation (14) defines the objective function, which is to minimize the completion time of machine i. Equations (15)–(18) define the constraints of the sub-model. Constraints (15) ensure that every job k has exactly one predecessor job, and that the predecessor of the first job is the dummy job 0. Constraints (16) ensure that each job j has exactly one successor job, and that the successor of the last job is the dummy job 0. Constraint (17) computes the completion time of machine i. Constraints (18) ensure that there is no subcycle; therefore, any nonempty subset \(\delta \subset N_i\) of jobs must have at least one link to its complement \(N_i \backslash \delta \). This strategy is similar to the subtour elimination constraints for the traveling salesman problem, as proposed by Bigras et al. [20]. Constraints (19) and (20) define the domains of the decision variables.

The mathematical model has a constraint for each subset of jobs. Thus, in cases where the scheduling problem has many subsets of jobs, the model will demand a high computational cost. For this reason, the set of constraints (18) was initially disregarded from the model. However, the relaxed model can produce an invalid solution, that is, a solution containing one or more subcycles. If this happens, a new set of constraints for each subcycle is added to the mathematical model to be solved again. In this new set of constraints (18), the set \(\delta \) is formed by the group of jobs belonging to the subcycle. This process is repeated until a valid solution is found.
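The cut-generation loop above needs to identify the subcycles of a candidate solution. Since constraints (15) and (16) make Y a successor assignment, the solution decomposes into disjoint cycles, and every cycle that avoids the dummy job 0 yields a violated constraint (18). A hedged sketch of this detection step (the successor-array encoding is our assumption, not part of the paper's model):

```cpp
#include <vector>

// Extract the subcycles of a successor assignment: succ[j] = k when Y_{jk} = 1,
// with jobs numbered 1..n and 0 the dummy job. Every cycle not containing the
// dummy job 0 is a subcycle delta for which a cut (18) must be added.
std::vector<std::vector<int>> findSubcycles(const std::vector<int>& succ) {
    int n = static_cast<int>(succ.size());
    std::vector<bool> seen(n, false);
    std::vector<std::vector<int>> cycles;
    for (int start = 0; start < n; ++start) {
        if (seen[start]) continue;
        std::vector<int> cyc;
        int j = start;                       // follow successors until we loop
        while (!seen[j]) { seen[j] = true; cyc.push_back(j); j = succ[j]; }
        bool hasDummy = false;
        for (int v : cyc) if (v == 0) hasDummy = true;
        if (!hasDummy) cycles.push_back(cyc); // violates constraints (18)
    }
    return cycles;
}
```

For instance, a successor assignment with the same cycle structure as the invalid solution of Table 3 decomposes into the subcycles {1, 2} and {3, 5, 4}, and one cut per subcycle would be added before re-solving.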

To illustrate this situation, consider the matrix below, which represents the values of the decision variables for a problem with one machine and five jobs (Table 3).

Table 3. Example of an invalid solution.

Consider that if \(Y_{jk} = 1\) then job j immediately precedes job k, and that the first job of the sequence is preceded by the dummy job 0. Then, we have the following subcycles: \(\delta _1 = \{5, 4, 3\}\) and \(\delta _2 = \{1, 2\}\). This solution is invalid, since there should be a single schedule involving all jobs, not two as can be observed. Figure 3 illustrates this situation:

Thus, a new constraint must be added for any solution that has a subcycle, since this situation does not obey Eq. (18).

Fig. 3.
figure 3

Representation of an invalid solution.

4.4 Local Search

The local search of our algorithm is performed by a VND procedure, which uses the three neighborhood structures \(\mathcal {N}_1\), \(\mathcal {N}_2\) and \(\mathcal {N}_3\) defined in Sect. 4.3. Its pseudo-code is presented in Algorithm 3.

figure c

Thus, the VND returns a local optimum in relation to all three neighborhoods \(\mathcal {N}_1\), \(\mathcal {N}_2\) and \(\mathcal {N}_3\).
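Algorithm 3 follows the standard VND template. A generic sketch of that template (our simplified reading, with each neighborhood search abstracted as a function that returns an improving neighbor or the unchanged solution):

```cpp
#include <functional>
#include <vector>

// Classic VND scheme over an ordered list of neighborhood searches. After
// every improvement, the search restarts from the first neighborhood; the
// procedure stops when no neighborhood improves the current solution, so the
// result is a local optimum with respect to all of them.
template <typename S>
S vnd(S s, std::function<double(const S&)> f,
      const std::vector<std::function<S(const S&)>>& neighborhoods) {
    std::size_t k = 0;
    while (k < neighborhoods.size()) {
        S s2 = neighborhoods[k](s);
        if (f(s2) < f(s)) { s = s2; k = 0; }  // improvement: back to N1
        else ++k;                             // otherwise try next neighborhood
    }
    return s;
}
```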

5 Computational Experiments

The e-SGVNS algorithm was coded in the C++ language, and the tests were performed on a microcomputer with the following configuration: Intel(R) Core(TM) i7 processor with clock frequency of 2.4 GHz, 8 GB of RAM, and a 64-bit Ubuntu operating system. The mathematical heuristic used as local search was implemented using the Gurobi API [21] for the C++ language.

The proposed algorithm was tested on three sets of instances made available by Rabadi et al. [18]: Balanced, Process Domain, and Setup Domain. Each set is formed by 18 groups of instances, and each group contains 15 instances, totaling 810 instances. In the first set, the processing times and the setup times are balanced. In the second, the processing times dominate the setup times, and in the third, the setup times dominate the processing times.

5.1 Parameter Tuning

The implementation of the e-SGVNS algorithm requires the calibration of two parameters: MaxP and MaxSameLevelP, which are defined in Algorithm 1.

The Irace package [22] was used to tune the values of these parameters. Irace is an R package that implements an iterated racing procedure, whose main objective is to find the most appropriate configurations for an optimization algorithm, considering a set of instances of the problem.

We tested the following values for the two parameters of the e-SGVNS: MaxP and MaxSameLevelP \(\in \) {3, 4, 5, 6, 7, 8, 9, 10, 11, 12}. The best configurations returned by Irace were MaxP = 5 and MaxSameLevelP = 12.

5.2 Stopping Criterion

As the stopping criterion of the e-SGVNS algorithm, and for a fair comparison, the average execution time of ACOII reported by Arnaout et al. [10] was used. Their time was divided by 2.0 because our computer is approximately 2.0 times faster than the computer used in [10], according to PassMark [23]. Cota et al. [12] also used this stopping criterion to report the results of the LA-ALNS algorithm (Table 4).

Table 4. Time limit for e-SGVNS Algorithm in minutes (the same of SGVNS).

The WO algorithm used a time limit different from that of the others. Thus, it is compared only with respect to its results and not its efficiency.

6 Results

Tables 5, 6 and 7 compare the results of the proposed e-SGVNS algorithm with those of ACOII, reported by Arnaout et al. [10], LA-ALNS, reported by Cota et al. [12], WO, described by Arnaout [9], and SGVNS, by Rego and Souza [6], in relation to the average Relative Percent Deviation (RPD) in each group of 15 instances. For each instance l, the RPD is calculated by:

$$\begin{aligned} RPD_l = \dfrac{f_l^{alg} - f_l^{\star }}{f_l^{\star }}\times 100 \end{aligned}$$
(21)

where \(f_l^{alg}\) is the value of the objective function for the algorithm alg in relation to the instance l, while \(f_l^{\star }\) represents the Lower Bound (LB) for the l-th instance reported by Al-Salem [24].

Table 5. Average RPD in Balanced instances.
Table 6. Average RPD in the Process Domain instances.

In these tables, the first and second columns represent the number of machines and jobs, respectively. In the subsequent columns are the average RPD for ACOII, LA-ALNS, SGVNS, WO and e-SGVNS algorithms, respectively.

Table 7. Average RPD in the Setup Domain instances.

According to Tables 5, 6 and 7, the LA-ALNS algorithm was superior in 20 groups of instances, the WO algorithm in 14, and the e-SGVNS algorithm in 20. The SGVNS and ACOII algorithms were outperformed in all instance sets. Considering these results, it is possible to affirm that the LA-ALNS algorithm obtained the best average results, even though it was not applied to all the instances made available in [18].

The proposed algorithm presented an RPD value of less than 1 in instances with two machines. Considering instances with 4 machines, the RPD was always less than 2, while for instances with up to 8 machines, it was always less than 3. For the other instances, the RPD was always less than 4. These results indicate that the proposed method performed better on instances with fewer machines, in which the solution space is smaller. In the other cases, the method has lower performance, given the high computational cost of the mathematical heuristic used as one of the local search operators.

6.1 Statistical Analysis

A hypothesis test was performed to verify if the differences between the results presented by the algorithms are statistically significant. Therefore, the following hypothesis test was used:

$$ {\left\{ \begin{array}{ll} H_0: \mu _1 = \mu _2 = \mu _3 = \mu _4 = \mu _5 &{}\\ H_1: \exists i,j \,\, | \,\, \mu _i \ne \mu _j \end{array}\right. } $$

in which \(\mu _1, \mu _2, \mu _3, \mu _4\) and \(\mu _5\) are the average RPDs of ACOII, LA-ALNS, SGVNS, WO and e-SGVNS, respectively.

An exploratory analysis of the data was performed in order to better understand the data of the samples before the application of the statistical test.

Fig. 4.
figure 4

Boxplot of the results.

Figure 4 shows the boxplot containing the sample distributions of the RPD values for the collected samples.

Before performing the hypothesis test, it is necessary to decide between the two types of test: parametric or non-parametric. Generally, parametric tests are more powerful; however, they require three assumptions:

  1. Normality: every sample must originate from a population with normal distribution;

  2. Independence: the samples must be independent of each other;

  3. Homoscedasticity: every sample must come from a population with constant variance.

The Shapiro-Wilk normality test was applied to the samples, and its results are shown in Table 8.

Table 8. Shapiro-Wilk normality test.

Considering a significance level of 0.05, the results presented above indicate that the samples of the ACOII and WO algorithms do not come from populations with normal distribution, since their p-values are lower than the significance level. For the samples of the LA-ALNS, SGVNS and e-SGVNS algorithms, the test does not reject the hypothesis of normality. As the normality assumption is violated for at least one sample, a parametric test cannot be applied.

Therefore, it was decided to use the Pairwise Wilcoxon test, which calculates pairwise comparisons between group levels with corrections for multiple testing.

The results of the Pairwise Wilcoxon test for the samples of the average results of the ACOII, LA-ALNS, SGVNS, WO and e-SGVNS algorithms are presented in Table 9. In this comparison, we excluded the instance sets for which Cota et al. [12] did not report the results of the LA-ALNS algorithm.

Table 9. Pairwise comparisons using Wilcoxon test in all algorithms.

According to Table 9, the observed differences are statistically significant for all algorithm pairs, except for (e-SGVNS, LA-ALNS) and (e-SGVNS, WO).

Table 10 displays the Pairwise Wilcoxon test considering the average RPD of algorithms that were tested in all instances.

Table 10. Pairwise comparisons using Wilcoxon test in all instances.

Considering that this p-value is much lower than 0.05, the null hypothesis of equality between the means is rejected, and it is concluded that there is evidence that at least two populations have different distribution functions.

As can be seen in Table 9, there is a statistically significant difference between the e-SGVNS algorithm and the SGVNS, ACOII algorithms.

7 Conclusions

This work dealt with the unrelated parallel machine scheduling problem with sequence-dependent setup times, aiming to minimize the makespan.

Since the problem is NP-hard, a hybrid heuristic algorithm was developed for it. The proposed algorithm, named Enhanced Smart General Variable Neighborhood Search (e-SGVNS), combines heuristic and exact optimization strategies to explore the solution space of the problem. The exact strategy works as a local search and consists of applying a mathematical programming formulation, based on the time-dependent traveling salesman problem, to obtain the optimal solution to the sequencing problem on each machine. The heuristic strategy, in turn, explores neighborhoods based on swap and insertion moves.

The e-SGVNS was tested on benchmark instances from the literature, and its results were compared to those of four other literature methods (ACOII, LA-ALNS, SGVNS and WO).

The statistical analysis of the average results produced by the algorithms showed that e-SGVNS is statistically better than the SGVNS and ACOII algorithms. On the other hand, there is no statistical evidence of a significant difference among the average results of the e-SGVNS, LA-ALNS and WO algorithms.

Overall, the e-SGVNS algorithm performed best on small instances, with up to 4 machines and up to 120 jobs, regardless of instance type.

As future work, we intend to test other mathematical programming formulations to perform the exact local search, for instance, a mixed integer linear programming formulation that considers two machines instead of a single one.