1 Introduction

As a relatively new branch of scheduling, batch scheduling problems (BSPs) [1] arise widely in industrial manufacturing systems, such as metal manufacturing, pharmaceuticals, aeronautics, semiconductor manufacturing, and logistics [2]. Unlike classical scheduling problems, a defining feature of scheduling on batch processing machines (BPMs) is that several jobs can be processed simultaneously as a batch on one BPM, as long as the total size of the jobs in the batch does not exceed the capacity of the BPM.

In recent years, the coordination of production scheduling and product transportation has become increasingly important in supply chain management. Under the make-to-order (MTO) business model, in which products are customized and delivered from factory to customer with no finished-product inventory held between them, production and distribution are closely linked and must be scheduled jointly. Integrated scheduling models are often used in time-sensitive supply chains, such as the production and delivery of perishable products [3] and industrial adhesive materials [4]. Because of the time-sensitive nature of these products, orders must be delivered to customers directly, without intermediate inventory. Delayed delivery may not only incur a tardiness penalty, owing to customer dissatisfaction and potential loss of reputation, but may also cause the supply chain to fail. Moreover, as competition in MTO manufacturing intensifies, productivity relative to due dates has become more significant for manufacturers. The integrated scheduling problem with due-date considerations is therefore crucial for most of today's businesses, and effectively integrating the production and delivery stages at the operational level, so as to lower operational costs and improve customer service, is vital to a company's success.

A large number of companies globally rely on third-party logistics (3PL) providers for their daily transportation of products and other logistics needs. 3PL providers usually follow a fixed daily or weekly service plan to serve their customers. The objective of integrated scheduling is generally to achieve an overall optimization for the supply chain and the obtained schedule can offer effective guidance for operations management.

There are two different modes related to vehicle capacity. In the first mode, it is assumed that each job occupies one unit of the capacity, without loss of generality. On the one hand, if the sizes of jobs are identical, each job is assumed to occupy one unit of the capacity. On the other hand, if the jobs have non-identical sizes, each job is assumed to take up one standard-size pallet for delivery convenience, regardless of the sizes of jobs. Hence, each vehicle can transport a fixed number of jobs in the first mode. In the second mode, the amount of capacity occupied by each job is assumed to equal the size of the job, which is suitable for addressing BSPs with unequal-size jobs. Therefore, the second mode is employed in addressing the studied problem.
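The difference between the two capacity modes can be sketched as follows. This is an illustrative fragment only; the function names and the unit choices are ours, not part of the studied model.

```python
def vehicles_needed_mode1(num_jobs, cv):
    # Mode 1: every job occupies one capacity unit (e.g., one standard pallet),
    # so each vehicle carries at most cv jobs.
    return -(-num_jobs // cv)  # ceiling division

def fits_in_vehicle_mode2(job_sizes, cv):
    # Mode 2: the capacity occupied by a job equals its size s_j,
    # so a load is feasible only if the total size fits the vehicle.
    return sum(job_sizes) <= cv
```

Under mode 1 a vehicle's load is counted in jobs, while under mode 2 it is counted in job sizes, which is why mode 2 matches BSPs with unequal-size jobs.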

In this paper, the problem of integrated scheduling of production and distribution is investigated, and the objective of the studied problem is to minimize the total weighted tardiness (TWT) of the jobs. Furthermore, an ant colony optimization (ACO) algorithm with a novel local optimization is proposed to group the jobs into batches in the production stage. Moreover, a heuristic is developed to transport the completed jobs to customers during the distribution phase. The main contributions of the proposed algorithm are the following.

  (1) A new local optimization strategy, called LOC, is proposed to improve the local exploitation ability of the algorithm by further searching the neighborhood of a solution to improve solution quality.

  (2) Two interval candidate lists (ICLs) are proposed to reduce the search of the feasible solution space and improve the search speed.

  (3) Three objective-oriented heuristics are developed to accelerate the convergence of the algorithm.

The rest of this paper is organized as follows. In Section 2, the literature on batch scheduling and the integrated scheduling model is reviewed. In Section 3, the studied problem is formulated. Thereafter, an ACO algorithm with a novel local optimization based on efficient solution construction is proposed in Section 4. In Section 5, results and analysis on comparative experiments are provided. Conclusions and future research directions are presented in Section 6.

A list of all abbreviations used in this paper can be found in Appendix A.

2 Literature review

With the development of modern manufacturing, the BSP has been widely investigated [5,6,7]. Makespan, one of the most common objectives of scheduling problems, is usually used to evaluate the productivity of companies in many different scenarios. Reference [8] studied the BSP with non-identical job sizes on a single machine and proposed an artificial bee colony (ABC) approach to minimize the makespan. Reference [9] designed two improved heuristics, (1) the first-fit longest processing time (FFLPT) rule, improved by considering identical job sizes, and (2) the best-fit longest processing time (BFLPT) rule, improved by using an enumeration scheme, for minimizing the makespan on a BPM with non-identical job sizes and unequal job processing times. Reference [10] dealt with the problem of scheduling jobs with arbitrary sizes on parallel BPMs with non-identical capacities and presented a heuristic based on the first-fit-decreasing (FFD) rule, as well as a meta-heuristic based on the max-min ant system (MMAS), to minimize the makespan. Reference [11] proposed a branch-and-bound algorithm to minimize the makespan when scheduling jobs with unit size, different processing times, and arbitrary release dates on parallel BPMs. Reference [12] studied the problem of scheduling jobs with dynamic arrival times on parallel BPMs with arbitrary capacities to minimize makespan and proposed two ACO-based meta-heuristics to tackle it.

The objective of total (weighted) completion time (TCT/TWCT) is generally related to customer satisfaction. Therefore, with the rise of MTO manufacturing, an increasing number of BSP researchers have begun to consider this objective. Reference [13] proposed a dynamic programming approach to minimize the TCT of the problem of scheduling jobs on an unbounded BPM. Reference [14] investigated the problem of scheduling jobs with incompatible families and dynamic arrivals on a single BPM to minimize the TCT; they put forward a decomposed branch-and-bound algorithm based on the characteristics of dynamic job arrivals. Reference [15] studied the problem of serial batch scheduling on uniform parallel machines to minimize TCT and presented a polynomial-time procedure with complexity \(O(m^{2}n^{m+2})\). Reference [16] proposed two ACO-based algorithms with different coding methods, based on the job sequence and the batch sequence, respectively, to minimize the TCT of the BSP. Reference [17] investigated the problem of minimizing the total weighted completion time on parallel BPMs with identical capacities, non-identical job sizes, and unequal weights; they presented an ACO-based meta-heuristic, as well as a mixed-integer programming (MIP) formulation, to address it.

Minimizing total (weighted) tardiness, an important objective reflecting production performance for time-sensitive jobs, has been widely considered in scheduling problems with order due dates. Reference [18] studied the scheduling problem of minimizing earliness–tardiness (E/T) on a single BPM, in which jobs have non-identical sizes and a common due date; they presented a heuristic called the minimum attribute ratio of the batch (MARB), a hybrid genetic algorithm (GA), and a mathematical model to solve the problem. Reference [19] dealt with scheduling problems in two-stage hybrid flow shops in which jobs have different due dates and each machine has an identical capacity; they proposed a MIP model and three iterative algorithms to address the problem. Reference [20] provided a mathematical model considering dynamic starting conditions and developed heuristic algorithms to minimize TWT under the constraints of unequal release times, incompatible job families, non-identical job sizes, and heterogeneous BPMs. Reference [21] studied the scheduling problem on non-identical parallel BPMs with job release times and non-identical job sizes to minimize the TWT and presented several heuristics, a MIP model, dynamic programming methods, and simulated annealing (SA) algorithms to solve the problem.

The above studies considered only the production stage of products and ignored the transportation process. With the development of the logistics industry, the transportation of products plays an important role in this field. Reference [22] studied the vehicle scheduling problem of minimizing the required transportation time and presented two meta-heuristic methods, variable neighborhood search (VNS) and the greedy randomized adaptive search procedure (GRASP), to solve the problem. Reference [23] proposed a novel dynamic public transport vehicle allocation scheme based on the emergent intelligence (EI) technique and built mathematical models for the estimation of resource, utilization, and reliability parameters in public transport systems. Reference [24] dealt with transportation vehicle scheduling problems in a supply-chain network and presented a collaborative transportation scheduling strategy, as well as a meta-heuristic algorithm called chemical reaction optimization (CRO), to minimize the total transportation cost. Reference [25] investigated the problem of minimizing the total transportation cost in urban cold-chain logistics and presented an improved GA to address it. Reference [26] dealt with the vehicle routing problem of minimizing the traveling cost; a hybridization of particle swarm optimization (PSO) and adaptive large neighborhood search (ALNS) was developed for the first time to solve the problem.

In earlier work, the objectives of single-phase BSPs were related to production efficiency and included makespan, total completion time, and total tardiness, among others. With the growth of the service and manufacturing industries, two-stage integrated scheduling problems have played an increasingly important role in reducing operating costs and improving service levels and customer satisfaction. In addition, to be closer to real-world applications, the completion time of an order has been extended from the completion time of processing in a single production stage to the delivery time to the customer in the transportation phase. This setting is often faced by manufacturers of time-sensitive products, such as fashion apparel and toys, which are sold only during specific seasons. It is therefore necessary for these manufacturers to make production and distribution scheduling decisions jointly. Moreover, customers often set due dates on the orders they place with the manufacturer, and a penalty is typically imposed on the manufacturer if the orders are not completed and delivered on time. Hence, the manufacturer is incentivized to meet the due dates as closely as possible.

With the rise of supply-chain management technology, a growing number of studies on production scheduling with simultaneous order distribution have appeared. Reference [27] presented a branch-and-bound algorithm for small-scale problems and a tabu search and a hybrid GA for large-scale problems of minimizing the total tardiness of a set of jobs to be scheduled on identical parallel machines for which jobs can only be delivered on certain fixed delivery dates. Reference [28] considered the production-distribution scheduling problem of minimizing total weighted delivery times on parallel BPMs with non-identical job sizes and equal processing time. They proposed a deterministic heuristic and two hybrid meta-heuristics based on MMAS to solve the problem. Reference [29] studied an integrated production and distribution problem considering homogenous vehicles with capacity constraints to minimize the total weighted delivery times of orders and presented an improved large neighborhood search algorithm with a local search. Reference [30] developed an approximation algorithm with a worst-case ratio of 3/2 to minimize the total delivery times of jobs for the problem that considers both production and job delivery simultaneously with the consideration of machine availability. Reference [31] dealt with the problem of minimizing total weighted delivery time of scheduling a set of jobs with identical processing time, non-identical sizes, and unequal weights on parallel BPMs with arbitrary capacities. They presented two heuristic algorithms and an algorithm based on ACO to address the problem. Reference [32] investigated several integrated problems of production, inventory, and delivery. After being processed, the orders were delivered to the customers by transporters at fixed delivery departure times. They considered two classes of problems based on whether the order delivery was splittable or not. 
For each class of problems, they analyzed the computational complexity and pointed out that each problem is NP-hard. For a detailed review of integrated scheduling, the reader is referred to [33, 34].

In two-phase integrated scheduling problems, the completion time of each job in the production stage is the start time of that job in the transportation phase, so the processing completion times differ across production schedules. Thus, in the transportation stage, the jobs transported on each delivery differ, and the resulting distribution schedule differs as well. In addition, it is difficult to coordinate the integrated scheduling of the production and transportation stages, especially with a parallel-BPM environment in the production stage. Hence, the relationship between the production and transportation stages has rarely been studied. In summary, studies on integrated scheduling considering TWT, especially on non-identical parallel BPMs with different job sizes, unequal job weights, and arbitrary job due dates, are still far fewer than those on traditional objectives. Since the TWT objective is widely applicable in the service and manufacturing industries, integrated scheduling problems considering TWT merit further study.

3 Problem formulation

The problem studied in this paper can be denoted by \(P_{m}~|~p\text{-}batch, delivery, p_{j}=p, w_{j}, s_{j}, d_{j}, S_{i}, CV, DT_{l}, VA_{l}~|~{\sum}_{j=1}^{n}w_{j}\cdot T_{j}\). It is assumed that all parameters are integers and known in advance. The assumptions of this problem are the following.

There is a set of n jobs, denoted by J = {Jj | j = 1,2,...,n}, to be grouped into b batches that are scheduled on m parallel BPMs, denoted by M = {M1,M2,…,Mm}. Each job Jj ∈ J has four attributes: processing time pj = p, size sj, weight wj, and due date dj. The generated batch set is denoted by B = {Bu | u = 1,2,…,b}. Each batch Bu ∈ B generally includes more than one job. All jobs in the same batch are processed simultaneously without interruption. Note that b is not determined in advance. The capacities of the machines are non-identical; each machine Mi has a capacity Si. There are z distinct machine capacities, denoted by S1,S2,⋯ ,Sz (1 ≤ z ≤ m). Without loss of generality, it can be assumed that S1 < S2 < ⋯ < Sz. The number of machines whose capacity equals Sξ (1 ≤ ξ ≤ z) is denoted by mξ, where m1 + m2 + ⋯ + mz = m and Si ∈ {S1,S2,⋯ ,Sz}. The size of each job is at most the maximum machine capacity, Sz. All machines and jobs are available at time zero. Due to the capacity constraint, some jobs can only be processed on machines with large capacities. Moreover, the total size of the jobs in a batch cannot exceed the capacity of the machine that processes the batch. The k-th batch scheduled on Mi is denoted by Bki, and its processing time, denoted by Pki, equals p.

Once the processing of a job is finished, the job is delivered to the customer by one of the vehicles at one of d different departure times. At the l-th departure time, denoted by DTl (l = 1,2,…,d), there are VAl vehicles available for transportation, and each vehicle has the identical capacity CV.
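For concreteness, an instance of the problem and its objective can be sketched as plain data structures. This is an illustrative layout of our own; the field and class names are not from the paper.

```python
from dataclasses import dataclass

@dataclass
class Job:
    j: int   # job index
    p: int   # processing time (p_j = p for all jobs)
    s: int   # size s_j
    w: int   # weight w_j
    d: int   # due date d_j

@dataclass
class Instance:
    jobs: list            # the job set J
    machine_caps: list    # capacity S_i of each BPM M_i
    depart_times: list    # departure times DT_l
    vehicles: list        # VA_l vehicles available at DT_l
    cv: int               # identical vehicle capacity CV

def weighted_tardiness(jobs, delivery):
    # Objective: sum_j w_j * T_j with T_j = max(0, delivery_j - d_j),
    # where delivery_j is the time job j reaches the customer.
    return sum(job.w * max(0, delivery[job.j] - job.d) for job in jobs)
```

Note that tardiness is measured against the delivery time, not the processing completion time, which is what couples the two stages.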

To describe the studied problem more accurately, the MIP model of this problem is presented in Appendix B.

In [1], the scheduling problem of minimizing makespan on a single BPM with arbitrary job sizes was studied and proved strongly NP-hard. Since the studied problem additionally involves non-identical due dates, job weights, and machine capacities, it is more general than the problem in [1], and hence the problem investigated in this paper is also NP-hard.

4 Proposed algorithm

Considering that the studied problem is NP-hard, a meta-heuristic based on ACO is proposed to address it. The ACO algorithm is inspired by the communication mechanism of ant foraging behavior, in which individual ants communicate by secreting pheromones. As ants move, they leave pheromones along their paths, and following ants choose their paths by sensing the pheromone concentration. Because the collective behavior of the colony exhibits positive feedback, self-organization, strong robustness, and distributed computation, ACO has been successfully used to handle a great number of complex discrete combinatorial optimization problems.

In this paper, the problem under study is solved in two stages. In the production stage, the jobs are grouped into batches that are scheduled on the parallel BPMs. In the transportation stage, an effective heuristic is proposed to build the delivery schedule.

This section is organized as follows. First, the solution construction scheme is introduced. Second, the pheromone update is presented. Third, the local optimization strategy developed to improve the solutions is described. Fourth, a detailed description of the proposed algorithm is given.

4.1 Solution construction

Encoding is the first step in applying ACO to a combinatorial optimization problem. In this study, each solution is encoded as a 2 × n vector, where n is the number of jobs. In the production phase of the integrated scheduling problem, jobs must be grouped into batches before they can be processed on the BPMs. Once a job is assigned to a batch, the batch index and machine index of the job are determined simultaneously.

To record the state of each job, an access list, denoted by AL, is employed. Its initial value is (1,2,…,n), indicating that no job has been scheduled; when the entry at a job's position becomes zero, that job has been scheduled. After all jobs are assigned, AL = (0,0,…,0), indicating that a production schedule to be delivered has been built.

4.1.1 Interval candidate list

In the solution construction (SC) algorithm for the studied problem, the selection of the first job in a new empty batch on a machine has a great influence on solution quality. Therefore, ICLs based on four attributes, i.e., the capacity of batches, the sizes of jobs, the due dates of jobs, and the departure times, are designed to reduce the search of the feasible solution space and improve the search speed. Specifically, three steps construct the interval candidate list \(ICL^{1}_{lki}\): (1) build a set of jobs whose sizes fit the current batch Bki on machine Mi; (2) divide an interval into sub-intervals according to the departure times; and (3) assign jobs to the sub-intervals according to their due dates. \(ICL^{1}_{lki}\) is used to randomly select a job from the sub-intervals, in order from front to back, and assign it to the new empty batch Bki on machine Mi. The interval candidate list \(ICL^{2}_{lki}\) is obtained from \(ICL^{1}_{lki}\) by removing the jobs that do not fit the remaining space of the current batch Bki on machine Mi. \(ICL^{2}_{lki}\) is used to select a job from the sub-intervals, in order from front to back, and assign it to the current batch Bki on machine Mi according to (7). \(ICL^{1}_{lki}\) and \(ICL^{2}_{lki}\) are defined in (1) and (2), respectively.

$$ ICL^{1}_{lki} =\{J_{j}|s_{j}\leq S_{ki} \wedge d_{j}\geq DT_{l}\wedge d_{j}\leq DT_{l+1}\} $$
(1)
$$ ICL^{2}_{lki} = \{J_{j}|s_{j}\!\leq\! (S_{ki}-\underset{h\in B_{ki}}{\sum}s_{h}) \wedge d_{j}\!\geq\! DT_{l} \wedge d_{j}\!\leq\! DT_{l+1}\} $$
(2)

where Jj in (1) and (2) represents any one unscheduled job.
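Definitions (1) and (2) can be sketched directly in Python. The data layout below (a named tuple with index, size, and due date) is our own simplification for illustration.

```python
from collections import namedtuple

Job = namedtuple("Job", "j s d")  # index, size, due date

def icl1(unscheduled, s_ki, dt, l):
    # Eq. (1): jobs that fit an empty batch of capacity S_ki and whose
    # due dates fall in the departure interval [DT_l, DT_{l+1}].
    return [jb for jb in unscheduled
            if jb.s <= s_ki and dt[l] <= jb.d <= dt[l + 1]]

def icl2(unscheduled, s_ki, batch, dt, l):
    # Eq. (2): same due-date interval test, but the size is checked against
    # the remaining space of the partially filled batch B_ki.
    remaining = s_ki - sum(jb.s for jb in batch)
    return [jb for jb in unscheduled
            if jb.s <= remaining and dt[l] <= jb.d <= dt[l + 1]]
```

Jobs whose due dates fall in earlier sub-intervals are thus offered to the batch first, which is how the lists prune the feasible space.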

4.1.2 Pheromone trails

Pheromone trails are of great importance in the construction of solutions by ants. Each ant selects unscheduled jobs and assigns them to a batch based on the state-transition probability. Pheromones represent the ants' past experience in building solutions, and an unsuitable pheromone definition generally leads to poor solutions. In this study, the pheromone ψhj represents the desirability of placing jobs Jh and Jj in the same batch, and 𝜃jki represents the average desirability of assigning candidate job Jj to batch Bki. To improve the accuracy of the algorithm, 𝜃jki is defined as follows:

$$ \theta_{jki} = \frac{{\sum}_{J_{h} \in B_{ki}} \psi_{hj}}{|B_{ki}|} $$
(3)

where ψhj denotes the expectation of candidate jobs Jj and Jh being grouped into the same batch; | Bki | denotes the number of jobs in Bki.
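Equation (3) is a plain average over the jobs already in the batch, as the following sketch shows (the pheromone matrix is represented here as a dictionary keyed by job-index pairs, an implementation choice of ours):

```python
def theta(j, batch, psi):
    # Eq. (3): average pheromone between candidate job j and the jobs
    # already placed in batch B_ki.
    return sum(psi[(h, j)] for h in batch) / len(batch)
```

Because the sum is divided by |Bki|, a candidate is not favored merely because the batch is large.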

4.1.3 Heuristic information

As a crucial component of the ACO algorithm, heuristic information guides the search direction during the search process. Specifically, it provides problem-specific knowledge, thereby accelerating the convergence of the algorithm. In the problem studied herein, jobs with large weights, tight due dates, and small sizes should be processed with higher priority; thus, \({\eta _{j}^{1}}\) is defined in (4). Additionally, since a job with a tight due date should be selected with higher probability, \({\eta _{j}^{2}}\) is defined in (5). To reduce the number of batches and allow jobs to be processed as early as possible, each batch should be as full as possible, reducing its idle space on the machine; therefore, the third heuristic information \(\eta _{jki}^{3}\) is defined in (6):

$$ {\eta_{j}^{1}}=\frac{w_{j}}{d_{j}\cdot s_{j}+\gamma} $$
(4)
$$ {\eta_{j}^{2}}=\frac{1}{d_{j}} $$
(5)
$$ \eta_{jki}^{3}=\frac{1}{S_{i}-({\sum}_{J_{h}\in B_{ki}}s_{h}+s_{j})+1} $$
(6)

where γ is a constant.
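The three heuristics (4)-(6) can be sketched as follows. The value of γ is not specified here in the text, so the constant below is an assumption for illustration.

```python
from collections import namedtuple

Job = namedtuple("Job", "w d s")  # weight, due date, size
GAMMA = 1.0  # the constant gamma in Eq. (4); this value is an assumption

def eta1(jb):
    # Eq. (4): favors heavy, urgent, small jobs.
    return jb.w / (jb.d * jb.s + GAMMA)

def eta2(jb):
    # Eq. (5): favors tight due dates.
    return 1.0 / jb.d

def eta3(jb, batch, s_i):
    # Eq. (6): favors jobs that leave little idle space in the batch
    # (the denominator is the idle space after adding the job, plus 1).
    return 1.0 / (s_i - (sum(h.s for h in batch) + jb.s) + 1)
```

Note that \(\eta^{3}\) attains its maximum value of 1 exactly when the job fills the batch completely.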

4.1.4 Decision rules

To build each batch, two decision rules are required: one for selecting the machine and one for selecting the jobs. In the machine-selection rule, the machine with the smallest completion time is chosen to decrease the makespan of production; if more than one machine has the same smallest completion time, the machine with the smallest index is selected. In the job-selection rule, when an empty batch is opened on the selected machine, one unscheduled job is randomly selected from \(ICL_{lki}^{1}\) and added to the batch as its first job. Then, unscheduled jobs are chosen from \(ICL_{lki}^{2}\) and added to the batch one by one according to the state-transition probability, defined in (7):

$$ p_{jki} = \begin{cases} \frac {{(\theta_{jki})}^{\alpha}\cdot ({\eta_{j}^{1}})^{\beta1} \cdot ({\eta_{j}^{2}})^{\beta2}\cdot (\eta_{jki}^{3})^{\beta3}} {{{\sum}_{J_{h} \in ICL_{lki}^{2}}(\theta_{hki})}^{\alpha} \cdot ({\eta_{h}^{1}})^{\beta1} \cdot ({\eta_{h}^{2}})^{\beta2}\cdot(\eta_{hki}^{3})^{\beta3}}, ~~\text{if}~J_{j} \in ICL_{lki}^{2} \\ 0, {~~~~~~~~~~}\text{otherwise. } \end{cases} $$
(7)

According to (7), the selection of jobs is influenced by four parameters, i.e., α,β1,β2, and β3, determining the relative importance of pheromone trails and heuristic information.
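Combining (3)-(7), the selection step can be sketched as one self-contained function. The parameter defaults and γ = 1 are assumptions for illustration; the paper tunes these values experimentally.

```python
from collections import namedtuple

Job = namedtuple("Job", "j w d s")  # index, weight, due date, size

def transition_probs(cands, batch, psi, s_i, alpha=1.0, b1=1.0, b2=1.0, b3=1.0):
    # Eq. (7): normalized selection probability over the candidates in ICL^2.
    def score(jb):
        th = sum(psi[(h.j, jb.j)] for h in batch) / len(batch)   # Eq. (3)
        e1 = jb.w / (jb.d * jb.s + 1.0)                          # Eq. (4), gamma = 1 assumed
        e2 = 1.0 / jb.d                                          # Eq. (5)
        e3 = 1.0 / (s_i - (sum(h.s for h in batch) + jb.s) + 1)  # Eq. (6)
        return th ** alpha * e1 ** b1 * e2 ** b2 * e3 ** b3
    raw = {jb.j: score(jb) for jb in cands}
    total = sum(raw.values())
    return {idx: v / total for idx, v in raw.items()}
```

Jobs outside \(ICL_{lki}^{2}\) simply never enter `cands`, which realizes the zero branch of (7).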

4.2 Update of pheromone trails

Pheromone update is a vital step in ACO for exploring and exploiting better solutions and searching the solution space efficiently. For ant colonies, the pheromone update dynamically adjusts the search direction based on the ants' previous experience. There are generally two strategies for updating pheromones: local and global updates. In the local update, only the best solution found by the colony in the current generation is used to update the pheromones, whereas the global update employs the best solution found from the first generation up to the current generation. To decrease the complexity of the proposed algorithm and elevate the quality of solutions, only the global update strategy is utilized in this study. Moreover, pheromones evaporate at a fixed rate to prevent them from accumulating indefinitely. Specifically, the pheromone trails are updated according to (8):

$$ \psi_{hj}(t+1)=(1-\rho)\cdot\psi_{hj}(t)+{{{\varDelta}}}\psi_{hj}(t)\cdot m_{hj}(t) $$
(8)

where ρ denotes the fixed evaporation rate, ranging from 0 to 1. mhj(t) represents the number of times that jobs Jh and Jj are grouped into the same batch at the current iteration t. Δψhj(t), representing the increment of pheromones generated by the global best solution at iteration t, is defined as (9):

$$ {{{\varDelta}}} \psi_{hj}(t)= \begin{cases} \frac {Q}{TWT^{\ast}(t)} ,\text{ if jobs } J_{h}\text{ and } J_{j} \text{ are grouped into the same batch} \\ 0, \text{otherwise. } \end{cases} $$
(9)

In the t-th generation, if jobs Jh and Jj are assigned to the same batch, Δψhj(t) = Q/TWT∗(t), where Q is an input parameter and TWT∗(t) denotes the objective value of the global best solution at the t-th generation, that is, the minimum TWT found up to the t-th generation. If Jh and Jj are not grouped in the same batch, Δψhj(t) = 0.
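A sketch of the update (8)-(9) follows. Interpreting the count mhj(t) as the number of times a pair shares a batch within the global best solution is our reading, and the dictionary layout is an implementation choice.

```python
def update_pheromones(psi, best_batches, best_twt, rho=0.1, Q=100.0):
    # Eqs. (8)-(9): evaporate all trails at rate rho, then deposit
    # Q / TWT*(t) along every job pair sharing a batch in the global
    # best solution. m_hj(t) is taken as the pair's co-occurrence count
    # within that solution (an interpretive assumption).
    delta = Q / best_twt
    m = {}
    for batch in best_batches:          # batches as lists of job indices
        for a in range(len(batch)):
            for b in range(a + 1, len(batch)):
                key = tuple(sorted((batch[a], batch[b])))
                m[key] = m.get(key, 0) + 1
    for key in psi:
        psi[key] = (1 - rho) * psi[key] + delta * m.get(tuple(sorted(key)), 0)
    return psi
```

Pairs absent from the best solution only evaporate, so trails that stop being reinforced fade over the generations.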

4.3 Local optimization strategy

Ants can search for feasible solutions under the guidance of historical and heuristic information. However, it is difficult to explore the entire solution space exhaustively due to the complexity of the problem; thus, it is often possible to find a better neighbor of a constructed solution. To improve the local exploitation ability of the algorithm and further search the neighborhood to improve solution quality, a swap-based local optimization strategy, called LOC, is proposed. In LOC, jobs in different batches are exchanged so that jobs can be finished before, or as close as possible to, their due dates. Before introducing LOC, the following definition is provided.

Definition 1

For the studied problem, the sequence number of batches on a machine is called the rank of the batch. Furthermore, the first batch on each machine is called the first-rank batch, the second batch on each machine is called the second-rank batch, and so on.

Since processing and transporting jobs earlier benefits the objective, jobs with larger weights and smaller due dates should be processed as early as possible; in other words, these jobs should be assigned to lower-rank batches. In LOC, a job with a smaller weight and larger due date in a lower-rank batch is exchanged with a job with a larger weight and smaller due date in a higher-rank batch, subject to the machine-capacity constraint.

To make LOC more readable, an example is given as follows. In the production stage, suppose there are two machines and five jobs, whose information is listed in Table 1. In Table 1, the first row gives the indices of the jobs, while the second, third, and fourth rows give the weights, sizes, and due dates of the jobs, respectively. In the transportation stage, it is assumed that there are two departure times, at times 3 and 6 (so the interval between them is three), and that two vehicles, each with a capacity of five, are available at every departure time.

Table 1 Information of jobs

The distributions of jobs in batches before and after applying the LOC algorithm are displayed in Figs. 1 and 2, respectively.

Fig. 1 Gantt chart of feasible schedule before using LOC

Fig. 2 Gantt chart of improved schedule after using LOC

From Figs. 1 and 2, it can be found that job J2 is in the second-rank batch and J5 is in the first-rank batch. Moreover, w2 > w5 and d2 < d5. Therefore, J2 is exchanged with J5 after LOC is executed, causing the objective value of TWT to be reduced from 34 to 20.

The LOC algorithm is described as Algorithm 1.

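Since Algorithm 1 is given as a figure, the swap step can be sketched as follows for one machine's batches in rank order. Returning after the first improving swap, and the data layout, are simplifications of ours.

```python
from collections import namedtuple

Job = namedtuple("Job", "j w d s")  # index, weight, due date, size

def loc(batches, cap):
    # Sketch of one LOC pass: find a job `a` in a lower-rank (earlier)
    # batch and a job `b` with larger weight and smaller due date in a
    # higher-rank batch, and swap them if both batches still fit the
    # machine capacity.
    for lo in range(len(batches)):
        for hi in range(lo + 1, len(batches)):
            for a in list(batches[lo]):
                for b in list(batches[hi]):
                    if b.w > a.w and b.d < a.d:
                        new_lo = sum(x.s for x in batches[lo]) - a.s + b.s
                        new_hi = sum(x.s for x in batches[hi]) - b.s + a.s
                        if new_lo <= cap and new_hi <= cap:
                            batches[lo][batches[lo].index(a)] = b
                            batches[hi][batches[hi].index(b)] = a
                            return batches  # one swap per call (simplification)
    return batches
```

On the example of Figs. 1 and 2, such a pass moves J2 into the first-rank batch in place of J5, since w2 > w5 and d2 < d5.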

4.4 Description of proposed algorithm

To make the proposed algorithm easier to understand, the processes of algorithm SC and job transportation (JT) are described as Algorithms 2 and 3, respectively.

Algorithm SC consists of four parts. First, some parameters are initialized. Second, an empty batch is opened on the selected machine and the first job for the current batch is chosen. Third, unscheduled jobs are selected and grouped into the current batch based on the state-transition probability pjki. Finally, the solution to the production sub-problem is output.


Let Flagj denote the shipment state of job Jj; that is, Flagj = 0 indicates that job Jj has not been shipped, while Flagj = 1 means that it has. Let cj represent the completion time of job Jj, and let Jtrp denote the set of jobs being transported. Algorithm JT is described as Algorithm 3.
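Algorithm JT itself is given as a figure; the following is a hedged sketch of a delivery heuristic in its spirit: at each departure time DTl, completed, unshipped jobs are loaded, earliest due date first, onto the VAl vehicles of capacity CV. The tie-breaking and loading order are our assumptions, not necessarily the paper's exact rule.

```python
from collections import namedtuple

Job = namedtuple("Job", "j s d")  # index, size, due date

def transport(jobs, completion, depart_times, vehicles, cv):
    # Sketch of a JT-style heuristic: delivery[j] is the departure time at
    # which job j is shipped (i.e., the moment Flag_j switches from 0 to 1).
    delivery, shipped = {}, set()
    for l, dt in enumerate(depart_times):
        space = [cv] * vehicles[l]          # remaining capacity per vehicle
        ready = sorted((jb for jb in jobs
                        if completion[jb.j] <= dt and jb.j not in shipped),
                       key=lambda jb: jb.d)  # earliest due date first (assumption)
        for jb in ready:
            for v in range(len(space)):
                if jb.s <= space[v]:        # first vehicle with enough space
                    space[v] -= jb.s
                    shipped.add(jb.j)
                    delivery[jb.j] = dt
                    break
    return delivery
```

Jobs that miss a departure, either because they finish too late or because the vehicles are full, simply wait for the next one.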

The proposed algorithm, i.e., ant colony optimization with local optimization (LACO), is described as Algorithm 4. In line 1, the parameters, such as the maximum number of iterations, Imax; the index of the current iteration, g; the evaporation rate of pheromones, ρ; the global best solution, GS; the number of ants, Nant; and the pheromone matrix, among others, are initialized. In lines 2–11, the best solution is generated: in line 4, Algorithm SC is called to construct a production schedule; in line 5, Algorithm LOC is called to improve it; and in line 6, Algorithm JT is called to generate the complete solution. In lines 7 and 9, GS, the best solution from the first generation to the current generation, and the pheromone trails are updated, respectively. In line 12, the global best solution GS is output.


4.5 Complexity analysis

For each iteration of the proposed LACO algorithm, the computation time consists of three separate parts, i.e., building the batches, local optimization, and transporting the jobs, the complexities of which are O(n2), O(b2), and O(dn2), respectively. Since the maximum number of batches, b, is not more than the number of jobs, n, and the local optimization strategy is executed by every ant, the computational complexity of the LACO algorithm is O(Imax × Nant × d × n2), where Imax, Nant, and d denote the maximum number of iterations, number of ants, and number of departure times, respectively.

5 Computational experiments

To evaluate the performance of the LACO algorithm comprehensively, it is compared with three representative algorithms: PSO [35], the random-keys genetic algorithm (RKGA) [36], and SA [37]. To verify the effectiveness of the local optimization strategy in LACO, WACO (LACO without the local optimization strategy) is also implemented as a benchmark. Moreover, several modifications are required to apply the benchmark algorithms to the problem studied in this paper.

To make a fair comparison, the job-transportation strategy is identical in all algorithms. Additionally, the batching method of this paper is incorporated into the benchmark algorithm SA; meanwhile, the machine environment in [37] is adjusted to the studied problem.

All algorithms are implemented in C++ with Visual Studio 2017 and executed on a Windows 7 PC with an Intel Core i3 3.20 GHz processor and 8 GB of RAM.

5.1 Experimental designs

In this paper, eight categories of test instances are determined by the combination of four factors: the number of jobs (n), the number of machines (m), the tightness of due dates (A), and the departure mode (DW). Specifically, two combinations of n and m are considered, i.e., (n = 100, m = 5) and (n = 200, m = 10), denoted by NM1 and NM2, respectively. Parameter A is set according to [38] to adjust the urgency of the due dates of jobs; its value is selected from the set {0.5, 1}, denoted by A1 and A2, respectively. Two departure modes are designed, denoted by DW1 and DW2. Hence, eight categories of test instances are generated in total, denoted by NM1A1DW1, NM1A1DW2, NM1A2DW1, NM1A2DW2, NM2A1DW1, NM2A1DW2, NM2A2DW1, and NM2A2DW2. For each category, ten instances are generated randomly according to the settings in Table 2, where P and U represent the Poisson distribution and the uniform distribution, respectively.

Table 2 Factors and levels of the test instances

To test the effect of the number of jobs on the performance of the LACO algorithm, two levels of the number of jobs are considered. The processing times of jobs (pj), the weights of jobs (wj), the types of machine capacities (Sξ), and the number of machines (m) are set according to the method in [31]. The due date of each job (dj) is generated by U[2p, A × Cmax], where Cmax denotes the makespan obtained by the first-fit (FF) rule. The sizes of jobs are set according to the method in [39]. In the distribution stage, the capacity of each vehicle (CV) is set to 80. The vehicles depart at d different departure times, where d is the number of departure times needed until all jobs have been transported. At the l-th departure time, denoted by DTl (l = 1, 2, ⋯, d), the corresponding number of vehicles is denoted by VAl. In DW1, for 100 jobs, the numbers of vehicles at departure times with odd and even indexes are generated by U[1,2] and U[3,4], respectively; for 200 jobs, they are generated by U[3,4] and U[4,5], respectively. In DW2, the number of vehicles at each departure time is set by U[2,3] for 100 jobs and by U[4,5] for 200 jobs.
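As a concrete illustration, the generator below sketches these settings in Python. The exact distributions for pj, wj, and job sizes come from Table 2 and the methods in [31, 39], which are not reproduced here, so uniform stand-ins are used; likewise, summing the processing times merely stands in for the FF makespan. Only the due-date rule dj ~ U[2pj, A × Cmax] and the DW1/DW2 vehicle counts for 100 jobs follow the text directly.

```python
import random

def gen_instance(n=100, A=0.5, dw=1, seed=0):
    """Hedged sketch of the instance generator described above."""
    rng = random.Random(seed)
    p = [rng.randint(1, 10) for _ in range(n)]        # stand-in processing times
    w = [rng.randint(1, 5) for _ in range(n)]         # stand-in weights
    c_max = sum(p)                                    # stand-in for the FF makespan
    d = [rng.uniform(2 * pj, A * c_max) for pj in p]  # d_j ~ U[2p_j, A * Cmax]

    def vehicles(l):                                  # l = departure-time index (100 jobs)
        if dw == 1:                                   # DW1: odd/even indexes differ
            return rng.randint(1, 2) if l % 2 == 1 else rng.randint(3, 4)
        return rng.randint(2, 3)                      # DW2: U[2,3] at every departure

    return p, w, d, vehicles
```

The departure count d itself is not fixed in advance: vehicles keep being dispatched (here, by further calls to `vehicles`) until every job has been transported.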

5.2 Parameter settings

In the proposed LACO algorithm, the settings of several parameters have a great influence on solution quality and computation time; to achieve better performance, it is therefore necessary to determine appropriate values for them. The LACO parameters include Nant, Imax, ρ, Q, α, β1, β2, and β3. To balance solution quality and computation time, several of these values are set according to preliminary experiments as follows: Nant = 20, Imax = 200, ρ = 0.5, and Q = 1. Meanwhile, to compare the algorithms fairly, the number of particles, the number of chromosomes, and the number of iterations at the same temperature are set to 20 in PSO, RKGA, and SA, respectively; that is, the number of fitness function evaluations per iteration is 20 for all algorithms. Moreover, the maximum number of iterations for all algorithms is set to 200. Therefore, the number of fitness function evaluations is the same for all algorithms and equal to 4,000.

To determine the effect of the values of the other four key parameters on the performance of LACO, the Taguchi design-of-experiment (DOE) method is employed. The combinations of selected values of these parameters are listed in Table 3.

Table 3 Combinations of parameter values

To determine the effect of the four parameters, i.e., α, β1, β2, and β3, on pheromone trails and heuristic information, the values of pheromones and heuristic information are set within (0,1). Moreover, the input parameter γ is set to 5 to ensure the value of heuristic information \({\eta _{j}^{1}}\) is within (0,1). As shown in Table 3, five levels of values are set for each parameter. As a result, an orthogonal array L25(54) is selected. To test the robustness of the proposed LACO algorithm and simplify the simulated experiments, one group of representative instances, denoted by NM2A1DW1, is chosen, in which 200 jobs with urgent due dates are processed on 10 machines and transported in the first mode of departure. Furthermore, five out of 10 instances are randomly selected and each selected instance is independently tested 10 times. Finally, the average TWT value obtained on each instance is regarded as the response variable (RV ). Apparently, the smaller the value of RV, the better the combinatorial values of the four parameters. The orthogonal array and obtained RV values are listed in Table 4.

Table 4 Orthogonal array and RV values

For each parameter, the RV values of the rows in Table 4 sharing the same level are averaged; the resulting values are listed in Table 5 and illustrated in Fig. 3. In Table 5, Delta represents the difference between the maximum and minimum numbers in each column, and the ranks are assigned in decreasing order of Delta. The smaller the rank, the greater the influence of the corresponding parameter on the objective value.
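This level-averaging, Delta, and ranking procedure can be written down directly. A small helper, assuming `levels` holds the orthogonal-array rows of Table 4 (one tuple of factor levels per run) and `rv` the corresponding response values:

```python
def taguchi_main_effects(levels, rv):
    """Per-level average response, Delta, and rank for each factor.

    levels[i] -- tuple of factor levels for run i (e.g., values in 1..5)
    rv[i]     -- response value (average TWT) of run i
    Returns (averages, deltas, ranks), one entry per factor.
    """
    n_factors = len(levels[0])
    averages, deltas = [], []
    for f in range(n_factors):
        by_level = {}
        for run, value in zip(levels, rv):
            by_level.setdefault(run[f], []).append(value)
        avg = {lv: sum(v) / len(v) for lv, v in by_level.items()}
        averages.append(avg)
        # Delta = max minus min of the level averages in this column
        deltas.append(max(avg.values()) - min(avg.values()))
    # rank 1 = largest Delta = most influential factor
    order = sorted(range(n_factors), key=lambda f: -deltas[f])
    ranks = [0] * n_factors
    for r, f in enumerate(order, start=1):
        ranks[f] = r
    return averages, deltas, ranks
```

For the L25(5^4) array, `levels` has 25 rows of 4 factors, so each level average in Table 5 is the mean of five RV values.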

Table 5 Average response values
Fig. 3 Trends of factor levels of α, β1, β2, and β3

From Fig. 3 and Table 5, it can be seen that β1 is the most significant of the four parameters. Moreover, a smaller value of β1 degrades the performance of the algorithm. This is because jobs with loose due dates and larger weights are then given higher priority for processing and transportation, which can result in greater delays for jobs with tight due dates. As for β3, if it is too large, the idle space in each batch on the machine increases. On the one hand, if the value of α is too large, the algorithm converges faster without guaranteeing better solution quality; on the other hand, a smaller value of α shrinks the solution space searched by the algorithm. Therefore, an appropriate value of α is required to ensure better performance of LACO. The influence of β2 on the algorithm is insignificant, so any reasonable value suffices. According to the above analysis, the values of the four parameters are set to α = 1, β1 = 5, β2 = 0.2, and β3 = 0.2, which are used in the following experiments.

The parameters of all the benchmark algorithms are set as in Table 6; for each algorithm, the maximum iteration number Imax is set to 200. In PSO, npop denotes the number of particles, c1 and c2 denote the cognitive and social learning coefficients, and w and α denote the inertia weight and decrement factor, respectively. The velocity and position of each particle are restricted to [-4.0, 4.0] and [0.0, 4.0], respectively. There are three main parameters in RKGA, i.e., the population size N, crossover probability pc, and mutation probability pm. In SA, K and Tmin denote the number of iterations at the same temperature and the lower bound of the temperature, respectively, while 𝜃 denotes the cooling rate. The values of the other parameters of the benchmark algorithms are set according to their original papers [35,36,37].

Table 6 Parameter settings

5.3 Evaluation metrics

To guarantee the fairness of comparison between the algorithms, the following performance metrics are adopted.

(1) Relative distance (RAlg):

This metric measures the quality of the obtained solution [31]. Specifically, RAlg indicates the distance between the solution obtained by algorithm Alg and the best objective value, denoted by TWTBest, found by any of the algorithms (LACO, WACO, PSO, RKGA, and SA) on each test instance. RAlg is formulated as (10):

$$ R_{Alg}(\%)=\frac{TWT^{Alg}-TWT^{Best}}{TWT^{Best}}\times 100 $$
(10)

where TWTAlg denotes the average value obtained by algorithm Alg over its 10 runs on each instance. Clearly, the smaller the value of RAlg, the closer the objective value of algorithm Alg is to TWTBest.
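Formula (10) translates directly into code. A minimal helper, assuming each algorithm's average TWT on one instance is available as a mapping (note the metric is undefined when TWTBest = 0):

```python
def relative_distance(twt_by_alg):
    """R_Alg per (10): percentage gap between each algorithm's average TWT
    over its 10 runs and the best average TWT found by any algorithm."""
    best = min(twt_by_alg.values())  # TWT^Best; must be nonzero
    return {alg: (v - best) / best * 100 for alg, v in twt_by_alg.items()}
```

By construction, the algorithm that attains TWTBest on an instance has RAlg = 0, and every other algorithm has a positive gap.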

(2) Run time (t):

This metric is generally used to compare the time performance of different algorithms under the same execution environment.

5.4 Experimental results and analysis

The experimental results of the different combinations are shown in Tables 7–14. In each table, the first column gives the indexes of the instances in the corresponding instance group. Each algorithm is run on each instance 10 times, and the average values of \(\bar {R}\) and \(\bar {t}\) are calculated to measure the performance of the algorithms. The last row gives the average results over the 10 instances of each instance group. Note that the best result of \(\bar {R}\) on each test instance is shown in bold type.

Table 7 Average results on NM1A1DW1
Table 8 Average results on NM1A1DW2
Table 9 Average results on NM1A2DW1
Table 10 Average results on NM1A2DW2
Table 11 Average results on NM2A1DW1
Table 12 Average results on NM2A1DW2
Table 13 Average results on NM2A2DW1
Table 14 Average results on NM2A2DW2

Table 7 shows the experimental results on the instances with n = 100, m = 5, A = 0.5, and DW1, denoted by NM1A1DW1, and Table 8 lists the results on instance group NM1A1DW2. It can be observed from Tables 7 and 8 that the LACO algorithm obtains the smallest \(\bar {R}\) of all the algorithms, except on the seventh instance in NM1A1DW1, regardless of the departure mode. This shows that the quality of the solution obtained by LACO is better than that of every other benchmark algorithm; in other words, the solution found by LACO is much closer to the best solution found by all algorithms on each instance. It can also be seen that the performance of each algorithm on instances with the first departure mode, DW1, differs from that with DW2: the average values of \(\bar {R}\) in Table 7 are slightly lower than those in Table 8. Although the computation time of PSO is clearly less than that of LACO, the solution quality of PSO is much worse, which may be because the strategies adopted in LACO, such as the local optimization, are relatively more deliberate.

It can be seen from Tables 7 and 8 that the difference between the runtime of LACO and that of WACO is very small, meaning that the computation time of the local optimization strategy is negligible. Furthermore, the solution quality of LACO slightly outperforms that of WACO. This means that the performance of LACO is better than that of WACO as a whole.

Tables 9 and 10 present the results obtained by the algorithms on instances with 100 jobs, five machines, and A = 1, under the two departure modes. It can be observed that the average values of \(\bar {R}\) obtained by LACO in Tables 9 and 10 are larger than the corresponding values in Tables 7 and 8, respectively. The reason may be that the delays of most jobs become smaller when the value of A increases, which results in a decrease in the objective values; according to (10), the average values of \(\bar {R}\) then become slightly larger.

Tables 11–14 provide the experimental results on 200 jobs and 10 machines with A1DW1, A1DW2, A2DW1, and A2DW2. It can be observed from the four tables that, as the number of jobs increases, all algorithms take slightly longer to find the solutions, which is still acceptable. At the same time, the average values of \(\bar {R}\) obtained by the LACO algorithm remain better than those of the other benchmark algorithms. Additionally, according to the experimental results in Tables 11 and 13, the average values obtained by all algorithms in Table 11 are slightly better than those in Table 13. The reason for this phenomenon may be the different setting of parameter A: since the value of A in Table 11 is less than that in Table 13, there are more jobs with tight due dates in instance group A1DW1 than in A2DW1, which results in the objective values obtained on A1DW1 being greater than those on A2DW1. According to (10), one can conclude that the average \(\bar {R}\) values on A1DW1 are slightly less than those on A2DW1. The comparative results in Tables 12 and 14 are consistent with those in Tables 11 and 13, respectively.

It is remarkable that, according to Tables 7–14, the average values of \(\bar {R}\) obtained by all algorithms on instances with the DW1 mode are slightly better than those on the DW2 mode, which means that the first departure mode is superior to the second for the studied problem. Moreover, it can be seen from each table that the average values of \(\bar {R}\) obtained by LACO are slightly better than those of WACO, because the WACO algorithm is unable to find a better assignment of jobs to machines than the LACO algorithm. In other words, in the solution found by WACO, more jobs with loose due dates or small weights are processed in advance, so jobs with tight due dates or large weights are delayed; therefore, the objective values obtained by the WACO algorithm are slightly larger than those of the LACO algorithm.

Overall, according to the comparative experiments, the LACO algorithm finds better solutions than any other benchmark algorithm, although its computation time is slightly longer than that of the WACO, PSO, and RKGA algorithms; this computation time remains acceptable.

6 Conclusions

In this paper, an algorithm based on ACO, called LACO, is proposed to address the integrated production-distribution scheduling problem on parallel BPMs with the objective of minimizing the TWT of jobs. In the production stage, the jobs are processed on BPMs with non-identical capacities; in the delivery stage, finished jobs are transported by vehicles with equal capacities. A new local optimization strategy, LOC, is proposed to improve the local exploitation ability of the algorithm by further searching neighborhood solutions to improve solution quality. Two interval candidate lists are proposed to reduce the search of the feasible solution space and improve the search speed. Additionally, three objective-oriented heuristics are developed to accelerate the convergence of the algorithm. To verify the performance of the proposed algorithm, several algorithms are compared with LACO on deliberately designed test instances. Experimental results show that the proposed LACO offers better solutions than the other four benchmark algorithms for the studied problem.

In future research, the studied problem can be extended to other variations that are interesting and closer to reality, although the extended problems may be more complex. For example, the production stage can be extended to more complex environments, such as machines with unequal processing speeds and jobs with dynamic release times. Another realistic extension is to allow vehicles with non-identical capacities. Additionally, other objectives could be considered, such as the maximum tardiness and the total revenue.