1 Introduction

The scheduling problems encounter commonly in modern industries, such as logistic transportation and chemistry industry, which attract more and more researchers’ attention to do such research work. While the batch scheduling problem is a key branch of such problems, which occurs in many real-life situations, such as semiconductor manufacturing and aluminum production. Thus, there exists extensive literature on batch scheduling problem. The batch scheduling problem is first proposed by Ikura and Gimple [1] who have led to extensive researches in such field (e.g., [2,3,4,5]). Since then, many researchers start to study such problems and propose many effective algorithms to solve them. Considering non-identical jobs have different features, such as arbitrary sizes including Uzsoy [2] and Ozturk et al. [6]. Some researchers study batch scheduling problem considering parallel machines, which we can refer Wang et al. [7] and Cheng et al. [8]. For solving such problem, many heuristic algorithms are proposed, as we can see in Ozturk [6], Abedi et al. [9], Damodaran et al. [10] and Melouk et al. [11]. Meanwhile, some researchers including Ng et al. [12], Yuan et al. [13] and Shen and Buscher [14] consider serial-batching scheduling problems, and some algorithms are developed to solve such problems.

Some production activities result in the effect of deterioration in dynamic and sophisticated production environment, such as machines maintenance or workers’ assignments, and there exists much related research work recently. Most research refers to linear deterioration, such as Browne and Yechiali [15], Mosheiov [16] and Cheng and Ding [17]. Moreover, Mosheiov [18] and Cheng et al. [19] propose some heuristics and a dynamic programming algorithm respectively to solve such problem. Some researchers consider different deterioration models, for example, Sundararaghavan and Kunnathur [20] and Mosheiov [18] study the step-deterioration model where the jobs share the same common break-point. While Jeng and Lin [21] and Low et al. [22] discuss such case where jobs are characterized with different break-points. Layegh et al. [23] propose a memetic algorithm to such problem with the objective of minimizing the total weighted completion time. While some researchers consider batching scheduling under the effect of deterioration, such as Li et al. [24], Pei et al. [25] and Liao et al. [26]. Based on the above research work, we can find that potential work considering other features including grouping is worth investigating.

In manufacturing system, enterprise can increase the efficiency of production by using group technology [27]. To solve such a practical issue, many researchers focus on such area and have done a lot of research work. Wu [28] and Wang et al. [29] considers a single-machine group scheduling problem with deteriorating effect. Other researchers consider such an issue like Wu et al. [30] and Wang et al. [31]. Furthermore, some researchers consider groups scheduling problem with parallel machines, such as Nomden and Zee [32] and Rustogi and Strusevich [33]. While parallel machine scheduling problems with inclusive processing set restrictions are of practical value, and such issues can be found in many real-life situations, for example, cargo loading operations case where multiple cranes with different weight capacity limits and coating process case where coating machines have the same functionality [34, 35]. Such problem with different processing times and release dates of jobs is proved to be NP-hard in the strong sense in Lawler et al. [36]. To solve such problem Li and Wang [37] develop a polynomial time approximation scheme. Li and Li [34] schedule jobs with release dates and equal processing times in such case aiming at minimize the makespan. However, to the best of our knowledge, there is few previous work taking grouping and inclusive processing set restrictions into account simultaneously. Thus, the problem considering such two features is worth studying.

Due to the demand of different customers and limited production capacity, the decision-makers of enterprise would choose outsourcing strategies to maintain their competitive advantage in the market place. The importance of outsourcing has increased in the last few years as increasing global competition [38,39,40]. Wu et al. [38] use a bi-objective genetic algorithm to determine the outsourcing order location with the objective of minimizing the total allocation cost. Some researcher consider cases with one in-house machine, such as Qi [41]. While some researchers study complex in-house production environment by extending to multiple machine, such as Ruiz-Torres [42]. Such work includes some research on two-stage production, for example, Qi [43] study a two-stage production system with an option of outsourcing some stage-one operations. Neto et al. [44] consider the parallel machine scheduling problem with outsourcing allowed and propose an ACO algorithm to solve the problem, which aims to minimize the sum of outsourcing and delay costs. Based on the above research of outsourcing cases, we can find out that most related research on such case mainly focuses on the general processing way and rarely considers batch scheduling cases.

Recently, we have conducted some previous research (Pei et al. [26, 45,46,47,48,49,50,51,52]). However, there are some significant differences between the previous research and this paper. In this paper, we take new features of outsourcing and inclusive processing set restrictions into account based on previous models. In addition, a novel specific deterioration function with serial-batching is investigated, which is different from previous research. Our previous work considers the jobs with the same size, while the studied paper considers non-identical jobs with distinct sizes. We also propose a new hybrid VNS–NKEA algorithm to solve the studied problem which is distinguishing from Fan et al. [47] and Liao et al. [26]. To present the difference between our previous research and the current study, the comparisons of our previous research and the current study are showed in Table 1. Although some previous research work considering step-deterioration effect like [51, 52] has been done, the features of the studied problem in this paper are different from those of [51, 52]. Pei et al. [51] address the parallel-machine scheduling problem with maximizing the total net revenue. The step-deterioration function and optimization objective in [51] are different from this paper. Ma et al. [52] consider the problem where jobs with release times are processed in the form of parallel-batching on a single machine, while this paper considers that jobs in different groups with outsourcing option are all ready before processing and are processed on some specific machines in the form of serial-batching. Meanwhile, the proposed algorithms in [51, 52] are different from this paper.

Table 1 Comparisons of our previous research and the current work

The main contributions of this paper can be summarized as follows:

  1. 1.

    This studied paper considers a group parallel-machine scheduling problem where jobs with distinct sizes and inclusive processing set restrictions are processed in the form of serial-batching. At the same time, we take the effect of step-deterioration into account and each machine has the distinct effect of deterioration, which means the break point is related to the machines. Moreover, we also consider jobs in each group with outsourcing allowed under our study.

  2. 2.

    We discuss some special cases of the studied problem and identify some structural properties on generating batches, sequencing the formed batches and sequencing the groups. Eventually, we propose a schedule rule to solve the in-house single machine group scheduling problem.

  3. 3.

    Based on the structural properties for special cases, we develop a novel hybrid VNS–NKEA algorithm which integrates effectively the core ideas of neighborhood knowledge-based evolutionary algorithm (NKEA) and variable neighborhood search (VNS) to solve the general problem.

The remainder of this paper is organized as follows. In Sect. 2, we present the notation and studied problem description. Some properties of special cases are identified in Sect. 3. To solve the studied problem, we propose a novel algorithm in Sect. 4. Moreover, we conduct numerical computational experiments to evaluate algorithms in Sect. 5. Finally, this paper is concluded in Sect. 6.

2 Notations and studied problems statement

The notation referred in the whole of this paper is described as in Table 2.

Table 2 The list of the notation

The studied problem can be briefly described as follows:

There is a set of groups containing non-preemptive deteriorating jobs G = {G1, …, Gk, …, GN} to be processed in the manufacturer or to be outsourced. The outsourcing strategies is not to decide which group is outsourced but which jobs in each group. Each job is associated with distinct outsourcing cost. All jobs from each group are processed on some machine independently, that is, the jobs from different groups cannot be processed in the same batch. We assume that the outsourcing has infinite productive ability, which means that the jobs which assigned into the outsourcing are processed with no limitations. While there exist m parallel in-house machines \( M = \left\{ {M_{1} , \ldots ,M_{j} , \ldots ,M_{m} } \right\} \) which can process any job in the groups’ set in the manufacturer. The jobs in each group are only processed on some machine, i.e., jobs in the same group cannot be processed in different machines. Each machine process the jobs in the form of batches which contain a certain number of jobs, i.e., the jobs assigned into the same batch are processed simultaneously. Each machine is associated with a specified breakpoint time of deterioration, that is, the deterioration incurred in machines may be different. The number of jobs in a batch cannot exceed the capacity of the machines. The completion time of all job in the batch is identical, which is equal to that of the last-processed job in the batch. Similarly, the completion time of some group is equal to that of the last-processed batch in the group. In this paper, we consider the effects of deteriorating jobs related to the groups, which are dependent on the current processing time and its belonged group. Each machine has the distinct effect of deterioration. Following the deterioration model of [23], we extend it to group scheduling with serial-batching. The extended deterioration model can be defined as:

$$ p_{ki} = \left\{ {\begin{array}{*{20}l} {\theta_{k} a_{ki} } &\quad {if\quad t \le Q_{j} } \\ {\theta_{k} \left( {a_{ki} + b_{ki} } \right) } &\quad {otherwise} \hfill \\ \end{array} } \right. $$

where pki is the actual processing time of the job Jki, t is the starting time of the job Jki, and Qj is the breakpoint time of deterioration on machine Mj.

Motivated by the practical environment, there exists a set-up time before processing the jobs in each group and each batch respectively. The set-up time of group is θG while that of batch is θB [53]. We assume the two kind of setup time is fixed, which means the set-up time is not related to other issues. Inclusive processing set restrictions is commonly found in the productive process of manufacturing. Moreover, there exists extensive applications in the scheduling problem with inclusive processing set restrictions. In order to make our study more practical, the studied scheduling problem we investigate is also subject to the inclusive processing set restrictions. Each machine is associated with the dedicated groups’ set, i.e., the machine Mj can process any job belonging to any group in the groups’ set Aj which is not empty.

The objective of problem is aimed at minimizing the in-house makespan incurred when jobs are processed and the total cost incurred when jobs are outsourced. For the sake of convenience, adopting the standard classification scheme for scheduling problems proposed by Graham et al. [54] and referring to [37], we denote the problem as \( P\left| {s - batch,G_{k} , p_{ki} = a_{ki} \, or\,p_{ki} = a_{ki} + b_{ki} , s_{ki} , incl.proc.sets} \right|\left( {1 - \delta } \right)C_{max}^M + \delta OC \), where δ denotes a constant cost parameter and 0 ≤ δ ≤ 1 [55].

3 Some special cases for the problem \( P\big| s - batch,G_{k} , p_{ki} = a_{ki} \,or\,p_{ki} = a_{ki} + b_{ki} , s_{ki} , incl.proc.sets \big|\left( {1 - \delta } \right)C_{max}^{M} + \delta OC \)

In this section, we mainly focus on the special cases for the problem \( P| s - batch,G_{k} , p_{ki} = a_{ki} \,or\,p_{ki} = a_{ki} + b_{ki} ,s_{ki} , incl.proc.sets |\left( {1 - \delta } \right)C_{max}^{M} + \delta OC \). To obtain the special cases, we relax some constrains of the problem, such as the environment of production, the number of machine or sizes of jobs. We analyze different special cases of the problem to obtain some properties corresponding to the cases.

3.1 Problem \( 1\left| {p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \)

The schedule of all jobs in each serial batch is equivalent to the schedule for the problem \( 1\left| {p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \). In this problem, we consider the jobs are processed by the general way under the effect of step-deterioration.

Definition 1

Let the optimal scheduling sequence be \( Z = \left\{ {J_{1} , \ldots , J_{i - 1} , J_{i} , J_{i + 1} , \ldots , J_{n} } \right\} \) on any machine Mj. If the start time of the job \( J_{l} \) is \( S_{l} \) and it satisfies \( S_{l} < Q_{j} \le S_{l} + a_{l} \), then the makespan of the schedule is Cmax(Z) = ∑ni=1 ai + ∑ni=l+1 bi, the early set of jobs is denoted as EJ = {J1, …, Jl−1}, the later set of jobs is LJ = {Jl+1, …, Jn} and the job \( J_{l} \) is defined as the key job, just as illustrated in Fig. 1.

Fig. 1
figure 1

The illustration of the sequence Z

Property 1

For the problem \( 1\left| {p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \), if the jobs \( J_{i} \,{\text{and}}\,J_{i + 1} \) (i ≥ l) satisfy that \( b_{i} < b_{i + 1} \), then the schedule obtained by swapping the two jobs dominates the original schedule.

Proof

Let a sequence be \( Z^{\prime} = \left\{ {J_{1} , \ldots , J_{i - 1} , J_{i + 1} , J_{i} , \ldots , J_{n} } \right\} \) on any machine Mj. Z′ is obtained from sequence Z by a pairwise interchange of jobs Ji and \( J_{i + 1} \) (i ≥ l), as illustrated in Fig. 2, and we assume that bi < bi+1. We consider two cases of sequence Z′:

Fig. 2
figure 2

The illustration of the sequence Z

Case 1 If i = l, then

$$ \begin{aligned} C_{max} \left( {Z^{{\prime }} } \right) & = \mathop \sum \limits_{i = 1}^{n} a_{i} + b_{l} + \mathop \sum \limits_{i = l + 2}^{n} b_{i} \\ \Delta & = C_{max} \left( {Z^{{\prime }} } \right) - C_{max} \left( Z \right) = b_{l} - b_{l + 1} < 0. \\ \end{aligned} $$

Case 2 If i > l, then

$$ \Delta = C_{max} \left( {Z^{\prime}} \right) - C_{max} \left( Z \right) = 0. $$

Based on the above cases, we can obtain that Δ ≤ 0, that is, Z′ dominates Z, which conflicts with the optimal schedule. Hence, if the jobs \( J_{i} \,{\text{and}}\,J_{i + 1} \) (i ≥ l) satisfy that \( b_{i} < b_{i + 1} \), then the schedule obtained by swapping the two jobs dominates the original schedule

The proof is completed.\( \hfill\square \)

Corollary 1

For the problem \( 1\left| {p_{i} = a_{i} or p_{i} = a_{i} + b_{i} } \right|C_{max} \), there exists an optimal schedule such that \( b_{l} \ge b_{l + 1} \ge \ldots \ge b_{n} \).

Property 2

For the problem \( 1\left| {p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \) , if the jobs \( J_{l - 1} \,{\text{and}}\,J_{l + 1} \) satisfy that \( a_{l - 1} > a_{l + 1} \,{\text{and}}\,b_{l - 1} < b_{l + 1} \) , then swapping the two jobs will get the schedule improved.

Proof

Let a sequence be \( Z^{\prime} = \left\{ {J_{1} , \ldots , J_{l + 1} , J_{l} , J_{l - 1} , \ldots , J_{n} } \right\} \) on any machine Mj. Z′ is obtained from sequence Z by a pairwise interchange of jobs Jl−1 and \( J_{l + 1} \), and we assume that al−1 > al+1 and bl−1 < bl+1, as illustrated in Fig. 3

$$ \begin{aligned} C_{max} \left( {Z^{\prime}} \right) & \le \mathop \sum \limits_{i = 1}^{n} a_{i} + b_{l - 1} + \mathop \sum \limits_{ = l + 2}^{n} b_{j} \\ \Delta & = C_{max} \left( {Z^{\prime}} \right) - C_{max} \left( Z \right) \le b_{l - 1} - b_{l + 1} < 0 \\ \end{aligned} $$

since Δ < 0, Z′ dominates Z, which conflicts with the optimal schedule. Hence, if the two jobs \( J_{l - 1} \,{\text{and}}\,J_{l + 1} \) satisfy that \( a_{l - 1} > a_{l + 1} \,{\text{and}}\,b_{l - 1} < b_{l + 1} \), then it will get the schedule improved by swapping the two jobs.

Fig. 3
figure 3

The illustration of the sequence Z

The proof is completed.\( \hfill\square \)

Property 3

For the problem \( 1\left| {p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \) , if the jobs \( J_{l - 1} \,{\text{and}}\,J_{l + 1} \) satisfy that \( a_{l - 1} < a_{l + 1} \,{\text{and}}\,b_{l - 1} < b_{l + 1} \) , then it will get the schedule improved by swapping the two jobs if the key job keeps unchanged.

Proof

The proof is similar to that of the Property 2, and we omit it. \( \hfill\square \)

Corollary 2

For the problem \( 1\left| {p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \), there exists a schedule such that \( b_{l} \ge b_{l + 1} \ge \cdots \ge b_{n} \) and a1 ≤ ··· ≤ al−1.

Based on the above properties, we propose a heuristic algorithm for sequencing the processed jobs on any machine Mj as follows:

Algorithm H1

Step 1

Sequence the jobs in non-decreasing order of ai such that a1 ≤ ··· ≤ ai ≤ ··· ≤ an

Step 2

Select the job Ji which satisfies \( S_{i} < Q_{j} \le S_{i} + a_{i} \), and we define the job as Jl

Step 3

Group the jobs into two subset EJ and LJ, where the jobs in EJ are processed before Qj and the jobs in LJ are processed after Qj

Step 4

Sequence the jobs in {Ji} ∪ LJ such that all jobs are sequenced in non-increasing order of bi

3.2 Problem \( 1\left| {s - batch,s_{i} ,p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \)

The schedule for each group is equivalent to solving the problem \( 1\left| {s - batch,s_{i} ,p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \), where the jobs are associated with sizes and processed in the form of serial batches.

Definition 2

Let a schedule be \( Z = \left\{ {b_{1} , \ldots , b_{f - 1} , b_{f} , b_{f + 1} , \ldots , b_{n} } \right\} \) on any machine Mj. If the start time of the batch \( b_{f} \) is \( S_{f}^{b} \) and it satisfies \( S_{f}^{b} < Q_{j} \le S_{f}^{b} + \sum\nolimits_{i = 1}^{{n_{f} }} a_{i} \), then we denote such a batch as bl and assume that JB is the set of the jobs are processed after Qj in the batch bl. The makespan of the schedule is \( C_{max} = \sum\nolimits_{f = 1}^{n} \sum\nolimits_{i = 1}^{{n_{f} }} a_{i} + \sum\nolimits_{i = 1}^{{\left| {J_{B} } \right|}} b_{i} + \sum\nolimits_{f = l + 1}^{n} \sum\nolimits_{i = 1}^{{n_{f} }} b_{i} + n\theta_{B} \) and bl is defined as the key batch, as illustrated in Fig. 4.

Fig. 4
figure 4

The illustration of the schedule Z

Property 4

For the problem \( 1\left| {s - batch,s_{i} ,p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \) , if the batches \( b_{l - 1} \,{\text{and}}\,b_{l + 1} \) satisfy that \( \sum\nolimits_{i = 1}^{{n_{l - 1} }} a_{l - 1,i} > \sum\nolimits_{i = 1}^{{n_{l + 1} }} a_{l + 1,i} \,{\text{and}}\,\sum\nolimits_{i = 1}^{{n_{l - 1} }} b_{l - 1,i} < \sum\nolimits_{i = 1}^{{n_{l + 1} }} b_{l + 1,i} \) , then it will get the schedule improved by swapping the two batches.

Proof

The proof is similar to that of the Property 2, and we omit it. \( \hfill\square \)

Property 5

For the problem \( 1\left| {s - batch,s_{i} ,p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \) , if the batches \( b_{l - 1} \,{\text{and}}\,b_{l + 1} \) satisfy that \( \sum\nolimits_{i = 1}^{{n_{l - 1} }} a_{l - 1,i} < \sum\nolimits_{i = 1}^{{n_{l + 1} }} a_{l + 1,i} \,{\text{and}}\,\sum\nolimits_{i = 1}^{{n_{l - 1} }} b_{l - 1,i} < \sum\nolimits_{i = 1}^{{n_{l + 1} }} b_{l + 1,i} \) , then it will get the schedule improved by swapping the two batches if the key job in the batch b l keeps unchanged.

Proof

The proof is similar to that of the Property 2, and we omit it. \( \hfill\square \)

Based on the above heuristic algorithm H1 and the BF rule [56], we proposed a heuristic algorithm for solving the problem on any machine Mj. The details of the heuristic algorithm are described as follows:

Algorithm H2

Step 1

Sequence the jobs according to the algorithm H1 and obtain a new jobs’ set J

Step 2

Assign the first job in J to the feasible batches, and select the feasible batch with smallest residual capacity and place the job into the batch, if there is not such batch or the job fits no existing batches, then go to Step 3

Step 3

Generate a new batch and place the job into the new batch

Step 4

Remove the first job from J

Step 5

Repeat Step 2 and Step 4 until all jobs are assigned into batches so that we can obtain a schedule \( Z = \left\{ {b_{1} , \ldots , b_{f - 1} , b_{f} , b_{f + 1} , \ldots , b_{n} } \right\} \)

Step 6

Select the group bf which satisfies \( S_{f}^{b} < Q_{j} \le S_{f}^{b} + \sum\nolimits_{i = 1}^{{n_{f} }} a_{i} \), and we define the batch as bl

Step 7

If the two batches \( b_{l - 1} \,{\text{and}}\,b_{l + 1} \) satisfy the conditions in Properties 4 or 5, then swap such two batches

3.3 Problem \( 1\left| {s - batch,G_{k} , p_{ki} = a_{ki} \,or\,p_{ki} = a_{ki} + b_{ki} ,s_{ki} } \right|C_{max} \)

For this problem \( 1\left| {s - batch,G_{k} , p_{ki} = a_{ki} \,or\,p_{ki} = a_{ki} + b_{ki} ,s_{ki} } \right|C_{max} \), we need to sequence the groups, the batches in each group and the jobs in each batch.

Definition 3

Let a sequence be \( Z = \left\{ {G_{1} , \ldots , G_{f - 1} , G_{f} , G_{f + 1} , \ldots , G_{N} } \right\} \) on any machine Mj. If the start time of the group \( G_{l} \) is \( S_{l} \) and it satisfies \( S_{l} < Q_{j} \le S_{l} + \sum\nolimits_{j = 1}^{{n_{l} }} a_{lj} \), then we denote the group as Gl and assume that JG is the set of the jobs are processed after Qj in the group Gl. The makespan of the schedule is \( C_{max} = \sum\nolimits_{k = 1}^{N} \theta_{k} \sum\nolimits_{i = 1}^{{n_{k} }} a_{ki} + \sum\nolimits_{i = 1}^{{\left| {J_{G} } \right|}} b_{kj} + \sum\nolimits_{k = l + 1}^{N} \theta_{k} \sum\nolimits_{i = 1}^{{n_{k} }} b_{ki} + N\theta_{G} + \sum\nolimits_{k = 1}^{N} \sum\nolimits_{f = 1}^{{N_{k} }} \theta_{B} \), and Gl is defined as the key group.

Property 6

For the problem \( 1\left| {s - batch,G_{k} ,s_{i} ,p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \) , if the groups \( G_{l - 1} \,{\text{and}}\,G_{l + 1} \) satisfy that \( \theta_{l - 1} \sum\nolimits_{i = 1}^{{n_{l - 1} }} a_{l - 1,i} > \theta_{l + 1} \sum\nolimits_{i = 1}^{{n_{l + 1} }} a_{l + 1,i} \,{\text{and}}\,\theta_{l - 1} \sum\nolimits_{i = 1}^{{n_{l - 1} }} b_{l - 1,i} < \theta_{l + 1} \sum\nolimits_{i = 1}^{{n_{l + 1} }} b_{l + 1,i} \) , then it will get the schedule improved by swapping the two groups.

Proof

The proof is similar to that of the Property 2, and we omit it. \( \hfill\square \)

Property 7

For the problem \( 1\left| {s - batch,G_{k} ,s_{i} ,p_{i} = a_{i} \,or\,p_{i} = a_{i} + b_{i} } \right|C_{max} \) , if the groups \( G_{l - 1} \,{\text{and}}\,G_{l + 1} \) satisfy that \( \theta_{l - 1} \sum\nolimits_{i = 1}^{{n_{l - 1} }} a_{l - 1,i} < \theta_{l + 1} \sum\nolimits_{i = 1}^{{n_{l + 1} }} a_{l + 1,i} {\text{and }}\theta_{l - 1} \sum\nolimits_{i = 1}^{{n_{l - 1} }} b_{l - 1,i} < \theta_{l + 1} \sum\nolimits_{i = 1}^{{n_{l + 1} }} b_{l + 1,i} \) , then it will get the schedule improved by swapping the two groups if the key job in the key group keeps unchanged.

Proof

The proof is similar to that of the Property 2, and we omit it.\( \hfill\square \)

Based on the heuristic algorithms H1 and H2, we can make decisions on sequencing the batches and the jobs. For sequencing the groups on any machine Mj, we also propose a heuristic algorithm to get a better schedule. The details of the heuristic algorithm are described as follows:

Algorithm H3

Step 1

Sequence the groups such that \( \theta_{1} \sum\nolimits_{i = 1}^{{n_{1} }} a_{1i} \le \ldots \le \theta_{k} \sum\nolimits_{i = 1}^{{n_{k} }} a_{ki} \le \ldots \le \theta_{N} \sum\nolimits_{i = 1}^{{n_{N} }} a_{Ni} \)

Step 2

Select the group Gk which satisfies \( S_{k} < Q_{j} \le S_{k} + \sum\nolimits_{i = 1}^{{n_{k} }} a_{ki} \), and we define the group as Gl

Step 3

Group the groups into two subset EG and LG where all jobs in EG are processed before Qj while all jobs in LG are processed after Qj

Step 4

Sequence the groups in {Gl} ∪ LG such that all groups are sequenced in non-increasing order of \( \sum\nolimits_{i = 1}^{{n_{k} }} b_{ki} \)

Step 5

If the two groups Gl−1 and Gl+1 satisfy the conditions in Properties 6 or 7, then swap such two groups to generate a new schedule

4 The hybrid VNS–NKEA algorithm

Hansen and Mladenović are the pioneers of the well-known variable neighborhood search (VNS) [26]. Since the algorithm was introduced in 1997 by the pioneers, VNS has motivated many researchers to solve some optimization problems, which indicates that VNS has the unique performance in solving such problems. Recently, much work tries to apply the VNS to solve the NP-hard problems in the aspect of optimization, such as Wen et al. [57] and Duarte et al. [58]. As referred in [59,60,61,62], the performance of the VNS is mainly affected by the following four key issues:

  1. (i)

    The quality of the initial solution;

  2. (ii)

    The design of the neighborhood structures;

  3. (iii)

    The local-search strategies;

  4. (iv)

    The order of the neighborhood structures selected.

In this paper, we modify the VNS based on the above factors to solve the problem we study. According to the properties we identify, a heuristic algorithm is proposed to ensure the quality of the initial solution. To obtain excellent neighborhoods, we propose some constructed strategies based on the structure of the problem. In each neighborhood, we apply some core operators from other algorithms to enhance the ability of local search, which can improve the searching efficiency.

The Neighborhood Knowledge-Based Evolutionary Algorithm, which was firstly proposed by Yu et al. [63], demonstrates the good performance of solving the multi-objective optimization problems. The algorithm consists of three major stages: the direction learning stage, the mutual adaptation stage, and the self-adaptation stage, which shows novel strategies to search in the solution space. NKEA not only involves global and local exploration, but also conducts careful use of the neighborhood knowledge between nominee solutions to enhance solution quality. The local-searching strategies play an important role to the VNS, which influences the performance heavily. Hence, we utilize the major three stages as the local-searching strategies in our proposed hybrid algorithm to enhance the ability to search in the local solution space, which integrates the advantages of both NKEA and VNS.

If we set δ = 0 and assume that jobs with same sizes are processed in m parallel machines in the general way without considering deterioration effect, then the problem is reduced to \( P\left| { incl.proc.sets} \right|C_{max}^{M} \). Meanwhile, the problem \( P\left| { incl.proc.sets} \right|C_{max}^{M} \) is NP-hard in the ordinary sense [37], thus the problem \( P\left| {s - batch,G_{k} , p_{ki} = a_{ki} \,or\,p_{ki} = a_{ki} + b_{ki} ,s_{ki} , incl.proc.sets} \right|\left( {1 - \delta } \right)C_{max}^{M} + \delta OC \) is also NP-hard. In order to address such a problem, we develop the novel hybrid VNS–NKEA algorithm. To solve the problem \( P\left| {s - batch,G_{k} , p_{ki} = a_{ki} \,or\,p_{ki} = a_{ki} + b_{ki} ,s_{ki} , incl.proc.sets} \right|\left( {1 - \delta } \right)C_{max}^{M} + \delta OC \), the decisions generated by the algorithm are follows:

  1. (i)

    whether some job in some group must be outsourced or not;

  2. (ii)

    In which machines each job in one group must be processed;

  3. (iii)

    How to assign the jobs into batches;

  4. (iv)

    What is the sequence of the jobs in some batch;

  5. (v)

    What is the sequence of the batches in some group;

  6. (vi)

    What is the sequence of the groups assigned onto a certain machine.

4.1 Encoding and decoding strategies

The coding is a key issue for the algorithm to solve the problem. We represent the individuals referred in the algorithm by adopting the binary coding which is a string of 0 s and 1 s. Figure 5 shows a chromosome contains nine genes which belong to two groups G1 and G2. The set of jobs in the group G1 is \( \left\{ {J_{11} ,J_{12} , J_{13} ,J_{14} ,J_{15} } \right\} \) and that of the other is \( \left\{ {J_{21} ,J_{22} , J_{23} ,J_{24} } \right\} \). The genes corresponding to all jobs in the groups are regarded as a one-dimensional vector. The gene value 0 represents that the job needs to be outsourced while value 1 means that the job is processed in the manufacturer. As Fig. 6 shows, each job in the groups is associated with a position value. Due to the locations of the jobs which are independent of the chromosomes keep unchanged in the groups, the position values of the jobs keep unchanged. The position values of the jobs in one group start from 1 and end with the number of the jobs in the group, which is unrelated to other groups. To make the decision on the assignment of the groups, we present a novel decoding mechanism in Fig. 7. Based on the position values, we can obtain the gene values of the groups which are the sum of the position values of the jobs assigned into the manufacturer. According to the gene values of the groups, we can take reminder of the number of machines which can process the jobs of the groups. There exist two special cases in the decoding procedures. Case I: if the reminder is 0, then we select the machine randomly. Case II: if there is only one machine for some group, then this group has to be processed on the machine regardless of the reminder. Figure 7 presents an example which includes five groups \( \left\{ {G_{1} ,G_{2} , G_{3} ,G_{4} ,G_{5} } \right\} \). The set of gene value of the groups is \( \left\{ {10,4, 7,4,9} \right\} \). After taking reminder, we can obtain the schedule of assignment \( \left\{ {M_{2} ,M_{4} , M_{1} ,M_{3} ,M_{2} } \right\} \), which means that machine M2 processes the group G1 and G5, while M1, M3, and M4 process the group G3, G4, and G2, respectively.

Fig. 5
figure 5

The representation of a chromosome

Fig. 6
figure 6

The two parts of a chromosome

Fig. 7
figure 7

The decoding of a chromosome

4.2 Neighborhood structures

The design of neighborhood structures influences heavily the effectiveness of VNS in the shaking phase. The neighborhood structures are based on some evolutionary strategies. In order to design effective neighborhood structures, three different kinds of neighborhood structures are proposed in this paper, in which three operators are utilized, including insertion operator, logical operator, and crossover operator. The different neighborhood are applied in the hybrid algorithm, which can make the hybrid algorithm effectively avoid trapping into the local optimization as well as improve the ability of global search. The insertion operator and logical operator can contribute to improving the diversity of solutions. While it is beneficial to guarantee the quality of solutions by using crossover operator.

  1. (i)

    Insertion neighborhood structure: insert the part sequence between two selected points whose values are both 0 into another two points whose values are 0. If there exist less than three points satisfy that the values are 0, then the part sequence is selected randomly and inserted into any location selected randomly. Figure 8 shows an illustrative example of the insertion neighborhood structure.

    Fig. 8
    figure 8

    An illustrative example of insertion neighborhood structure

  2. (ii)

    Logical neighborhood structure: if and only if the gene values from the two parents are both 1, then the equipotential gene value in the offspring is 1, or the gene value in the offspring is 0. Figure 9 shows an illustrative example of the logical neighborhood structure.

    Fig. 9
    figure 9

    An illustrative example of logical neighborhood structure

  3. (iii)

    Crossover neighborhood structure: a random one-point crossover operator is utilized to construct a neighborhood structure, in which a segmentation point is selected randomly for both parents. The segments to the right of the segmentation point are exchanged to form two distinct offsprings. Figure 10 shows an illustrative example of the crossover neighborhood structure.

    Fig. 10
    figure 10

    An illustrative example of crossover neighborhood structure

4.3 Reinforced-local searching

In each neighborhood structure, in order to improve the searching efficiency, we appy the core evolutionary idea of the NKEA [63] to be the local-seaching strategies in our proposed algorithm. During the local-searching process, the major operations are carried out including the direction learning algorithm (LDA), the mutual adaption algorithm (MAA), and the self adaption algorithm (SAA). The pseudo-code of the three major modified operations are presented in Figs. 1112, and 13, respectively.

Fig. 11
figure 11

The pseudo-code of the MDLA

Fig. 12
figure 12

The pseudo-code of the MMAA

Fig. 13
figure 13

The pseudo-code of the MSAA

4.4 The framework of the proposed algorithm

It is useful to design distinct and effective neighborhood structures at shuffling the solution space. Further, to intensify the search in promising regions, we apply the evolutionary major operations from the NKEA. The searching mechanism is carried out in our proposed algorithm to balance global exploration and local exploitation effectively. The main steps of the novel hybrid VNS–NKEA algorithm are described in Fig. 14. Figure 15 illustrates the framework of our proposed algorithm.

Fig. 14
figure 14

The pseudo-code of the implemented VNS–NKEA algorithm

Fig. 15
figure 15

The framework of the VNS–NKEA algorithm

5 Computational experiment

In this section, we present computational experiments which evaluate the effectiveness of the proposed hybrid algorithm by comparing with the performance of other algorithms, i.e., GA–VNS [57], VNS [64], and NKEA [63]. All algorithms were coded in JAVA and run on a PC powered by an Intel (R) Core (TM) i7-6700 CPU 3.40 GHz with 8.00 GB RAM operating under Windows 7. Motivated by the previous literature, we follow the experimental design in [26, 53]. The processing time of the jobs in each group were generated from a uniform distribution over the integers between 1 and 100. Besides, the extra punitive processing time of the jobs were obtained by another interval. The performance measures for all algorithms include the mean and the maximum fitness value (Table 3).

Table 3 Parameters setting

The above Table 4 and Fig. 16 show the results of each instance whose codes are executed for 10 runs independently. In the Table 4, we present the average and the worst value of objective corresponding to each instance from all compared algorithm including VNS–NKEA, VNS, NKEA, and GA–VNS. For each instance, we can see that the worst value from VNS is larger than those of other compared algorithms including VNS–NKEA, VNS, and NKEA. While the worst value from the hybrid is lower than that from VNS or NKEA except for instance 5 and 9 in which the values are equal. As for the average objective value, the value from the hybrid algorithm is lower than any other compared algorithms except for instance 5, 9, and 13. However, the average value from VNS is larger than that from other algorithms. Figure 16 presents the convergence curves for each instance from each compared algorithm in terms of the average objective value for each iteration. For instances 1–6, all algorithms are executed for 100 iterations while other instances are executed for 200 iterations. The solution quality of VNS is better than that of NKEA in early iterations while we can get the contrary result in the later iterations. When GA–VNS is compared with NKEA, the results are the same as VNS except for instances 5, 9, and 14. Meanwhile, GA–VNS performs better than VNS in terms of the convergence speed and solution quality. As for the hybrid algorithm, the eventual solution quality of VNS–NKEA is better than that of other compared algorithm obviously except instances 5, 9, 10, 13, and 14. In addition, the hybrid algorithm starts to converge from 15 to 25 iterations, and it is earlier than other compared algorithms. Based on the above analysis, we can conclude that VNS tends to be trapped into local optimal resulting in premature convergence while NKEA leads to slow convergence easily. In summary, the hybrid algorithm integrates VNS and NKEA effectively, which embodies the characteristics of both VNS and NKEA for each instance.

Table 4 The average and the maximum objective value of compared algorithms for each instance
Fig. 16
figure 16figure 16figure 16

Convergence curves of each instance, a convergence curves of instance 1, b convergence curves of instance 2, c convergence curves of instance 3, d convergence curves of instance 4, e convergence curves of instance 5, f convergence curves of instance 6, g convergence curves of instance 7, h convergence curves of instance 8, i convergence curves of instance 9, j convergence curves of instance 10, k convergence curves of instance 11, l convergence curves of instance 12, m convergence curves of instance 13, n convergence curves of instance 14, o convergence curves of instance 15, p convergence curves of instance 16

6 Conclusion

In this paper, we investigate a parallel-machine grouping scheduling problem where the jobs with inclusive processing set restrictions are characterized by distinct sizes. For solving such a problem, we first analyze some special case of the studied problem and identify some structural properties corresponding to the optimal solution. Moreover, according to the structural properties, we propose a novel hybrid VNS–NKEA algorithm to solve the problem. In addition, we implement the hybrid algorithm and compare with other algorithms including VNS, NKEA, and GA–VNS in terms of the solution quality and convergence speed by conducting numeral instances. The results show that the proposed hybrid algorithm performs quite better than the compared algorithms, which demonstrates that the hybrid algorithm integrates VNS and NKEA effectively.

For the future research work, we put forward some suggestions on models and methods of scheduling problems. In terms of models, we can consider more complex and practical situations, and machine maintenance, dynamic job arrival, uncertain processing times can be further incorporated. In addition, other different outsourcing features, i.e., capacity and budget limitation, can be also added in models. As for methods, due to the complexity of problem, most literature focus on developing meta-heuristic algorithms. Based on such circumstance, researchers can try to develop some other methods to reduce the complexity.