1 Introduction

The manufacturing industry consumes an enormous amount of energy to generate economic benefits while the increasing demand of energy has imposed tremendous pressure on the environment (Guemri et al. 2014; Sinoquet et al. 2011). For instance, the endothermic greenhouse gases (e.g., CO2 and N2O) have made a great contribution to the global warming (China Energy and CO2 Emission Research Group 2009). According to the International Energy Outlook 2016(EIA International Energy Outlook 2016), the industrial sector worldwide would account for more than half of the total energy use through 2040. Furthermore, the energy-related CO2 emissions around the world rise from 32.2 billion metric tons in 2012 to 35.6 billion metric tons in 2020 and to 43.2 billion metric tons in 2040. Thus, it is a great challenge from the viewpoint of enterprises and countries to decrease the ultimate achievement of emissions targets, which will depend on how to balance environmental goals with economic growth.

The CE assessment of machines and the advanced production scheduling play a significant role in balancing the economic benefits with the environmental impact (Ballestin et al. 2012; Kelly and Zyngier 2017; May et al. 2016; Piroozfard et al. 2018; Schinko et al. 2014). An empirical model based on power measurements was developed by Li and Kara (2011) who optimized the energy efficiency of processes under different cutting conditions by an effective method. Laurent et al. (2010) and Kara et al. (2010) studied the energy efficiency and the CE respectively in a manufacturing system by using life cycle assessment (LCA). Zheng and Wang (2015) investigated the low-carbon production scheduling problem with a Pareto-based estimation of distribution algorithm, concerning the objectives of makespan and CE. Liu and Huang (2014) introduced two scheduling problems involving two objectives (i.e. reduction of power consumption and carbon footprints) on batch-processing machines and a hybrid flow shop respectively. NSGA-II was used to solve their problems and an adaptive multi-objective GA is proposed to validate the results.

Flexible job shop problem (FJSP) is an extension of the job shop problem (JSP) that considers the flexibility of machines, and has been extensively existed in modern manufacture. With the development of customized manufacturing, workers with specific capacity are needed on various machines. To make the scheduling model closer to realistic production, it is essential to take worker resources into consideration. Fryer (2010) presented a dual-constraint JSP problem by considering the constraints of machines and workers simultaneously, where each worker was assumed with equal flexibility. Felan and Fry (2001) introduced the flexibility of workers for the same problem, and concluded that workers with different flexibility perform better than all workers with equal flexibility. A variable neighborhood search was proposed by Lei and Guo (2014) in order to minimize the makespan of FJSP with worker flexibility. Xu et al. (2011) pointed out that workers supposed to be trained in several skills or departments such that they can be assigned to a variety of jobs as the need arises. Zheng and Wang (2016) investigated a FJSP that workers have different skills to operate different machines, and developed a knowledge-guided fruit fly optimization algorithm to solve their problem. A FJSP considering different labor wages under dynamic electricity pricing is solved by the non-dominated sorting genetic algorithm-III (NSGA-III) (Gong et al. 2019b). Ferjani et al. (2017) studied the dynamic multi-skilled workers assignment problem and considered the impact of fatigue.

Analyzing the literatures above, most of them studying the constraints imposed by workers assume that all workers have constant efficiency throughout the whole production process. From a realistic perspective, workers ought to have learning abilities that the efficiency of a worker is supposed to be meliorated with the accumulation operation time towards corresponding jobs or machines. This learning effect has been proved to exist in many empirical studies. The effects of learning in scheduling problems were considered for the first time by Biskup (1999). Ostermeier (2019) pointed out that workers are prone to learning and forgetting even in the short-term when they are working in a highly-manual production environment where several complex products are processed alternately, frequent new product introductions occur and a high degree of product variation and customisation is presented. Koulamas and Kyparisis (2007) presented a new learning effect for scheduling environments and derived some results on special cases for the two-machine flow-shop problem with position-based learning effect. Biskup (2008) gave a review on scheduling with learning effects and discussed the existing literature on scheduling with position-based learning effect and sum-of-processing-time based learning effect, respectively. More recently, Wu et al. (2018) investigated the FJSP with consideration of worker’s learning ability and put forward a hybrid genetic algorithm to solve their problem, whereas they only optimized the maximum completion time (economic benefits), and ignored the environmental impact. This paper aims to further narrow the gap between the scheduling models and the realistic production while balancing the economic benefits and the environmental impact.

In this paper, we proposed a low carbon flexible job shop scheduling problem considering worker learning (LFJSP-WL) with reference to the CE reduction, the makespan and the total cost of workers. Since the JSP has been proved to be strong NP-hard combinatorial optimization problems due to its high computational complexity (Brucker and Schlie 1990; Garey et al. 1976; Zhou et al. 2009), the FJSP with worker learning must be more complex than JSP. Thus it is quite difficult for traditional approaches to obtain an optimal solution within a certain time.

For the multi-objective FJSP problem, the memetic algorithm (MA) hybridizing evolutionary algorithms with local search has shown promising performance (Kannan and Ramaraj 2010). In addition, MA takes advantage of both global and local search. Thereby, it is able to strike a better balance between exploration and intensification (Hertz and Kobler 2000) and it has been proved to be an effective method for FJSP(Frutos et al. 2010; Gong et al. 2019a; Ma et al. 2014a, 2014b; Phu-Ang and Thammano 2017; Yi et al. 2016; Yuan and Xu 2013). In order to solve the new mathematical model (LFJSP-WL) efficiently, we put forward a memetic algorithm based on NSGA-II with a useful variable neighborhood search (VNS), in which a three-layer chromosome representation encoding method and an active schedule decoding method are proposed. Besides, we present a mixed initialize method which consists of three initialize strategies, and apply several useful mutation operators and crossover operators for the three subproblems respectively. The elite pool strategy is adopted to reserve elite solutions generated along with all iterations.

The rest of this paper is organized as follows. The mathematical model of the proposed LDFJSP-LE is introduced in Sect. 2. The memetic algorithm (MA) is elaborated in Sect. 3. The case study is presented in Sect. 4. Conclusions and future studies are described in Sect. 5.

2 The mathematical model of the proposed LDFJSP-LE

2.1 Problem description

The problem of LFJSP-WL is described as follows. It considers processing n independent jobs J = (J1, J2,…, Jn) on m machines M = (M1, M2,…, Mm) with l workers W = (W1, W2,…, Wl). A job Ji consists of a predefined sequence of ri operations (Oi1, Oi2,…,Oir). Each Oij indicates that the jth operation of Ji can be processed on a machine out of from the given machine set M. Each machine can be performed by at least one worker out of from the given worker set W. As regards worker, there are two different types (ElMaraghy et al. 1999), one supposed that every worker owns the same capacity, the other considered different workers have diverse skills and operate machines with varying efficiencies. The latter type is adopted in LFJSP-WL, furthermore, the skill level and learning ability of workers are taken into account, a worker with higher-level should have preferable learning parameters and higher-cost. Owing to workers’ learning capacity, the efficiency of workers will be enhanced since the accumulation of processing time when performing different machines and jobs, i.e. worker’s learning ability for machines and jobs. Nonetheless, efficiency will reach the limitation in spite of the study time increasing continuously. Three objectives, productivity (i.e. makespan), environmental impact (i.e. total CE) and economic influence (i.e. total cost of workers), are optimized simultaneously, which can provide guidance to production managers when they make scheduling decisions considering profit and environment protection. The model consists of three subproblems: (1) operation sequence; (2) machine assignment; (3) worker assignment.

To make it easier to understand, an instance of the proposed LFJSP-WL is shown in Table 1. There are two jobs, four machines and five workers. Job 1, 2 have two and three operations respectively. Machine 1, 2, 3, 4 can be operated by two, two, three and three corresponding workers chosen from the worker set with three different skill levels, the level classification and some learning parameters are organized in Table 2. For each column in Table 1. Column 1 indicates jobs. Column 2 shows the operations belonging to a certain job. Column 3 states the machines which can process the corresponding operations. Column 4 represents the workers which can operate the corresponding machines. Column 5 lists the homologous cost of workers. Column 6 expounds the basic processing time before considering the improvement of worker’s efficiency. Especially, the actual time will be changed compared to the basic time with the accumulation processing time of the worker. Column 7–11 addresses the special energy consumption, the CE coefficient of cutting tools, the CE coefficient of auxiliary materials, the CE of machine tool at the stand-by state and the limiting efficiency, those mentioned parameters are related to corresponding machines respectively. For each row in Table 2. The level classification, basic efficiency and learning coefficient of diverse workers are listed where Level 1, 2, 3 represents the best-skilled worker set, the average-skilled worker set and the worst-skilled worker set. Take the first operation of job1 for example. There are two machines (M1, M2) for this operation, M1 can be operated by W1, W2 while M2 can be operated by W2, W4. The cost of the worker on M1 is 110 and 95 respectively. Similarly, the cost of the worker on M2 considering the skill level is 100 and 80 respectively, and corresponding basic time is 15, 12, 5 and 6 time units, the other variable values can be obtained in the same way.

Table 1 An instance of LFJSP-WL
Table 2 Parameters of the workers with different levels

Some assumptions are made to simplify our model. based on the assumptions which have been widely proposed in many other papers on FJSP (Demir and İşleyen 2014; Lei and Guo 2014; Li et al. 2012; Zhang et al. 2016), we further consider the worker resources and their learning ability, which indicates that our model is more close to reality, especially in a highly-manual production environment. The assumptions made for LDFJSP-LE are listed as follows:

All machines and workers are ready at time 0;

  1. (1)

    All raw materials are ready at time 0;

  2. (2)

    Jobs are mutually independent and have the same priority;

  3. (3)

    The efficiency of various workers only concerned with the accumulation of processing time;

  4. (4)

    Each machine can be operated by one and only one worker at a time;

  5. (5)

    Each operation can be processed by one and only one machine at a time;

  6. (6)

    An operation cannot be performed until its preceding operation completed;

  7. (7)

    Interruption is not allowed once the operation has started;

  8. (8)

    Workers with higher skill level rank supposed to be owned better parameters related to learning effect and cause the higher cost;

  9. (9)

    Setting up time of machines, transportation time among different positions and released time of jobs are negligible.

2.2 Problem formulation

Refer to Wu et al. (2018) and Zhang et al. (2015), the mathematical model can be established as follows.

$$ A_{ijkr} = \sum\limits_{h = 1}^{n} {\sum\limits_{g = 1}^{{p_{h} }} {a_{hg} \cdot X_{hg - ij} \cdot X_{hgkr} } } $$
(1)
$$ B_{ijr} = \sum\limits_{g = 1}^{{p_{i} }} {a_{ig} \cdot X_{ig - ij} \cdot X_{igr} } $$
(2)
$$ a_{ij} = t_{ij} \cdot \max (e_{kr} \cdot (1 + A_{ijkr} )^{{b_{r} }} ,e_{ir} \cdot (1 + B_{ijr} )^{{b_{r} }} ,d_{k} ) $$
(3)

Through Eqs. (1)–(3), the actual processing time of Oij processed by worker r on machine k can be obtained. Owing to the differences between the accumulation time of worker r operating machine k (Aijkr) and the accumulation working time of worker r processing job i (Bijr) before performing Oij, the degree of familiarity with the job and the machine may be different for worker r, indicating that a worker’s efficiency is decided by the maximum of the two parts. According to the learning effect model (Kuo and Yang 2006), we build the efficiency model of each worker. The ekr·(1 + Aijkr)br indicates the efficiency of worker r towards machine k when considering learning effect, while the eir·(1 + Bijr)br indicates the efficiency of worker r for processing job i when considering the learning effect. It is worth to note that the bigger the value of a worker’s efficiency is, the longer time needs for the worker to perform a processing activity. Furthermore, each machine has its ultimate efficiency dk, thus we calculate the final efficiency by max(ekr·(1 + Aijkr)br, eir·(1 + Bijr)br, dk). Then the final actual processing time can be obtained via multiplying final efficiency by basic processing time tij.

$$ CE_{i} = CE_{i}^{p} + CE_{i}^{t} + CE_{i}^{a} $$
(4)

Equation (4) shows the total carbon emission of job i which involves machining processes, tool wear and auxiliary materials.

$$ SEC = C_{0} + \frac{{C_{1} }}{MRR} $$
(5)
$$ CE_{i}^{p} = \left( {\sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {V_{ij} \cdot SEC_{ijk} \cdot X_{ijk} } } } \right) \cdot EF^{e} $$
(6)

Equation (5) is the calculation of the special energy consumption (SEC) for removing 1 cm3 material during processing (Kara and Li 2011), where MRR denotes the material removal rate, and C0 and C1 are special coefficients. Based on Eq. (5), the material removal energy can be calculated via multiplying SEC by the material removal volume (Vij) (Zhang et al. (2016)). Then the total CE of job i in the processing state can be calculated by Eq. (6) (Zhang et al. 2015), where Xijk ensures that operation Oij is performed on machine k, and EFe represents the emission factor of electric energy (kg CO2 – e/kg).

$$ \begin{aligned} CE_{i}^{t} & = \sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {\frac{{a_{ij} }}{{T_{k}^{t} }}} } \cdot TL_{k}^{t} \cdot EF^{e} \cdot X_{ijk} \\ & { = }\sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {a_{ij} \left( {\frac{{TL_{k}^{t} \cdot EF^{e} \cdot X_{ijk} }}{{T_{k}^{t} }}} \right)} } \\ & { = }\sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {a_{ij} } } \cdot\upmu _{ijk}^{t} \\ \end{aligned} $$
(7)

Tool wear is a crucial factor which usually appears during the processing in workshops. By regrinding, a cutting tool can get recovery after reaching the life limitation, but the CE (carbon emission) of regrinding cutting tool should be considered. According to Zhang et al. (2015), the CE of job i related to the cutting tool can be calculated by Eq. (7).

$$ \begin{aligned} CE_{i}^{a} & = \sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {\left( {\frac{{a_{ij} }}{{T_{k}^{c} }} \cdot TQ_{k}^{c} \cdot EF_{ijk}^{c} \cdot X_{ijk} + \frac{{a_{ij} }}{{T_{k}^{l} }} \cdot TQ_{k}^{l} \cdot EF_{ijk}^{l} \cdot X_{ijk} } \right)} } \\ & = \sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {a_{ij} \cdot X_{ijk} \left( {\frac{{TQ_{k}^{c} \cdot EF_{ijk}^{c} }}{{T_{k}^{c} }} + \frac{{TQ_{k}^{l} \cdot EF_{ijk}^{l} }}{{T_{k}^{l} }}} \right)} } \\ & = \sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {a_{ij} \cdot\upmu _{ijk}^{a} } } \\ \end{aligned} $$
(8)

In addition to the resources mentioned above, the machine tool consumes a certain amount of auxiliary materials, particularly the coolant and lubricant oil. Therefore, coolant and lubricant oil are considered in this model. Since they will decrease bit by bit with the machine tool processing, the supplement actions should be taken after a fixed interval time. According to Zhang et al. (2015), the mathematical derivation of Eq. (8) is similar to Eq. (7). Besides, two types of CE are included: CE of coolant and CE of lubricant oil.

$$ CE_{k}^{s} = \left( {C_{k} - \sum\limits_{i = 1}^{n} {\sum\limits_{j = 1}^{{p_{i} }} {a_{ij} \cdot X_{ijk} } } } \right) \cdot e_{k}^{s} $$
(9)

Equation (9) obtains the CE of machine tool at standby state, which is calculated by multiplying total standby time with CE per unit time during the standby state.

In this model, we optimize three multiple conflicting objectives simultaneously. After the analysis of Eqs. (1)-(9), the objectives are presented as below:

$$ {\text{Minimize}}\quad C_{max} { = }\mathop {\max}\limits_{1 \le k \le m}^{{}} \{ C_{k} \} $$
(10)
$$ {\text{Minimize}}\quad W = \sum\limits_{i = 1}^{n} {\sum\limits_{j = 1}^{{p_{i} }} {\sum\limits_{k = 1}^{m} {\sum\limits_{r = 1}^{l} {W_{ijkr} X_{ijkr} } } } } $$
(11)
$$ {\text{Minimize}}\quad CE_{total} = \sum\limits_{i = 1}^{n} {CE_{i} } + \sum\limits_{k = 1}^{m} {CE_{k}^{s} } $$
(12)

Subject to:

$$ C_{ij} = S_{ij} + a_{ij} $$
(13)
$$ C_{i(j + 1)} - C_{ij} \ge a_{i(j + 1)} $$
(14)
$$ S_{ij} + (1 - X_{hgk - ijk} ) \cdot L \ge C_{hg} $$
(15)
$$ (C_{hg} - C_{ij} - a_{hg} ) \cdot (C_{ij} - C_{hg} - a_{ij} ) \cdot X_{ijk} \cdot X_{hgk} \le 0 $$
(16)
$$ (C_{hg} - C_{ij} - a_{hg} ) \cdot (C_{ij} - C_{hg} - a_{ij} ) \cdot X_{ijkr} \cdot X_{hgqr} \le 0 $$
(17)
$$ \sum\limits_{k = 1}^{m} {X_{ijk} } = 1 $$
(18)
$$ \sum\limits_{r = 1}^{l} {X_{ijkr} = 1} $$
(19)

Equation (10) represents the makespan objective. Equation (11) represents the total cost of workers objective. Equation (12) represents the total carbon emission objective. Constraint (13) explains the relation between completion time, starting time, and actual processing time of operation Oij. Constraint (14) ensures the precedence constraint between operations of a job. Constraint (15) ensures that each operation cannot start until the immediate precedence operation completed on the same machine. Constraint (16) ensures that each machine can process at most one operation at a time. Constraint (17) ensures that each worker can process exactly one operation at a time. Constraint (18) ensures that each operation can be processed on exactly one machine. Constraint (19) ensures that each operation on a machine can be performed by one and only one worker.

3 Proposed MA for LDFJSP-LE

3.1 The framework

In this section, we propose a memetic algorithm with variable neighborhood structures (MA) to solve LFJSP-WL. The genetic operators in MA conclude crossover operator, mutation operator and the local search which can improve the quality of solutions effectively. Furthermore, to retain the good solutions produced in each iteration, the elite pool strategy is adopted during evolution. The pseudo-code of MA is presented as below.

figure a

3.2 Encoding and decoding

A three-layer chromosome representation that contains operation sequence (OS) vector, machine assignment (MA) vector and worker assignment (WA) vector is adopted for this research (Gong et al. (2018a)). The length of the chromosome equals to 3* p where p stands for the total number of operations of all jobs. The OS vector consists of p integer values in the range of 1 to n, where n represents the total number of jobs, and different integer values which appear pi times indicate the corresponding job-number. By scanning the OS vector from left to right, the jth occurrence of an integer value indicates the jth operation of the corresponding jobs. An example is presented in Fig. 1a, the first position shows the first operation of job 2, namely, O21, and the fifth position shows O22. An example of the MA vector is presented in Fig. 1b, the MA vector is as long as the OS vector which consists of three parts owing to three jobs. The number in each part represents the machine-number selected for the corresponding operation. Number 1 in the first position of the first part represents machine 1 is selected for operation O11. Number 2 in the next position of the first part represents machine 2 is selected for operation O22. Number 2 in the last position of the first part represents machine 2 is selected for operation O13. An example of the WA vector is shown in Fig. 1c, the length of WA vector also equals to OS vector and consists of three parts. The number in each part represents the worker-number selected for the corresponding operation. Number 3 in the first position of the second part represents worker 3 is selected for operation O21. Number 2 in the next position of the second part represents worker 2 is selected for operation O22. The example shown in Fig. 1 can be interpreted as follows: (O21, M3, W3), (O31, M3, W3), (O11, M1, W1), (O32, M1, W1), (O22, M2, W2), (O12, M2, W2), (O33, M1, W1), (O13, M2, W1).

Fig. 1
figure 1

Individual encoding example

For the above chromosome representation, an active decoding method referring to Gong et al. (2019b) which considers the constraints of machines and workers simultaneously is applied for this research. The details of the decoding method are as follows:

Step 1 Extract operation (Oij) from OS vector in sequence from left to right, then get the corresponding machine (Mk) and worker (Wr);

Step 2 Get the worker’s (Wr) accumulation processing time of Mk and job i before processing Oij respectively. Next, calculate the actual processing time of Oij by combining worker’s learning parameters;

Step 3 Find the blank spaces, namely, the idle time between adjacent operations which processed prior to Oij, from left to right. Then obtain the blank space [Tks, Tke] existing on Mk and [Trs, Tre] existing on Wr. Additionally, getting the completion time of Oi(j−1), namely, C i(j−1);

Step 4 Get all the overlapping parts [Tis, Tie] between [Tks, Tke] and [Trs, Tre]. If Tie—max (Tis, C i(j−1)) ≥ aij, and the earliest processing time Oij can be obtained by ts = max(Tis, C i(j−1)). The completion time of Oij can be calculated by te = ts + aij. Otherwise, go to step 5;

Step 5 If Oij cannot insert into the idle period, the earliest processing time Oij can be calculated by ts = max(Mkc, Wrc), where Mkc represents the completion time of the last operation processed on machine k, Wrc represents the completion time of the last operation performed by worker r. The completion time of Oij can be obtained by te = ts + aij;

Step 6 If all operations in OS vector are scheduled, go to the end. Otherwise, go to step 1.

3.3 Initialization of population

The population initialization plays a vital role in improving the quality and diversity of the population. Since the problem consists of three subproblems, we employ three initial strategies for the three corresponding subproblems respectively. The OS vectors are generated by using three strategies: (1) Sequence the operations randomly; (2) A job with the longest remaining processing time has the priority to be selected; (3) Choose a job with the maximum rest operations. The MA vectors also produced by three methods: (1) Select a machine out of alternative machines randomly; (2) Choose a machine with the lowest global processing time; (3) Give preference to the machine with the smallest SEC. Similarly, we initial the WA vectors with three strategies: (1) Select a worker out of alternative workers randomly; (2) Choose a worker with the lowest global processing time; (3) A worker with the lowest cost has the prior to be chosen.

3.4 Crossover operator

Crossover operator is an indispensable element during the reproduction phase which generates new individuals (offspring) by exchanging good genes between two different parental chromosomes. Two different crossover operators are employed respectively for this problem. The job-based crossover (JBX) operator is proposed for OS. First, we divide the Jobset J into two subsets Jobset1 and Jobset2 randomly; Next copy any genes in parent 1 belongs to Jobset1 into offspring 1 in the same position and copy any genes in parent 2 belongs to Jobset2 into offspring 2 in the same position; Finally, add any genes in parent 2 belongs to Jobset2 into offspring 1 in the remain position one by one, and add any genes in parent 1 belongs to Jobset1 into offspring 2 in the remain position one by one. An example is shown in Fig. 2. For MA and WA, the random probability crossover (RPX) operator is proposed for this problem. The theory is similar to JBX and the main difference is that a binary vector unit is used instead of subsets, where the 0–1 expresses the two different subsets. An example of MA is shown in Fig. 3.

Fig. 2
figure 2

OS job-based crossover operator

Fig. 3
figure 3

MA random probability crossover operator

3.5 Mutation operator

Mutation operator can provide extra variation into the population so that enhances the diversity of the population. Three mutation operators are constructed for this problem respectively. The procedures for the mutation are depicted as follows and an example of OS is shown in Fig. 4.

Fig. 4
figure 4

OS mutation operator

Step 1 Generate a random number Rm. If Rm < Mr where Mr states the mutation rate, go to Step 2, else go to the end;

Step 2 Get two different positions point1, point2 in the OS vector randomly, and generate two random position numbers mr, wr in the corresponding MA vector, WA vector respectively, then go to Step 3;

Step 3 For OS vector, insert the gene in point1 into the position before point2, as shown in Fig. 4, then go to step 4;

Step 4 For MA vector, replace the machine in position mr with another machine randomly amongst the alternative machines and choose an available worker randomly, then go to step 5;

Step 5 For WA vector, replace the worker in position wr with another worker randomly amongst the alternative workers.

3.6 Local search with four neighborhood structures

The neighborhood structure is a mechanism that can produce neighbor solutions by utilizing a small change in the original solutions. Inspired by Lei and Guo (2014), we construct four neighborhood structures as follows:

  • Neighborhood structure 1(N1): N1 is designed for the OS vector. It first gets a critical block with more than two operations, then obtains the first operation in this critical block, finally moves an internal operation which differs from the first operation to the beginning of the block.

  • Neighborhood structure 2(N2): N2 is also designed for the OS vector. It first obtains a critical block with more than two operations, next get the last operation in this critical block. Then moves an internal operation which differs from the last operation to the rear of the block.

  • Neighborhood structure 3(N3): N3 is designed for the MA vector. It first selects a gene in MA vector randomly, then selects a new machine with the smallest SEC from the alternative machines, and lastly changes the worker randomly for the corresponding new machine.

  • Neighborhood structure 4(N4): N4 is designed for the WA vector. Similarly, it obtains a worker gene wk in WA vector randomly, then replaces wk with the lowest cost out of from alternative workers.

figure b

3.7 Elite pool

To reserve elite solutions along with each iteration, the notion of elite pool is used in the proposed MA. The elite pool (ep) initially filled with a number of solutions generated by initialization. Then at each iteration obtains the non-dominated solutions (ns) produced by non-dominated sorting, and updates the elite pool finally through doing non-dominated sorting amongst the mixture of ns and ep.

figure c

4 Case study

4.1 Performance evaluate metrics

In order to evaluate the effectiveness of MA, three commonly used metrics are adopted. They are the Set Coverage (SC), the inverted generational distance (IGD) and the ratio of non-dominated solutions (Rnd) (Yuan and Hua 2014; Wu and Che 2019), which can be calculated as follows.

  1. 1.

    C (A, B) measures the fractions of members of B which are dominated by members of A, it can be obtained by Eq. (20).

    $$ C(A,B) = \frac{{|\{ b \in B|\exists a \in A:{\text{ a dominates }}b\} |}}{|B|} $$
    (20)

    where A and B denote the approximation to Pareto front (PF) respectively, |B| represents the number of solutions in B. C (A, B) = 1 indicates that all solutions in B are dominated by at least one solution in A, and C (A, B) = 0 shows that all solutions in B are not dominated by any solution in A. If C (A, B) is larger than C (B, A), that means the solutions in A is better than solutions in B in a sense.

  2. 2.

    Let P* denotes the set of non-dominated solutions amongst all solutions obtained by all runs of all algorithms, the IGD of set A can be calculated by Eq. (21).

    $$ IGD(A,P*) = \frac{1}{|P*|}\sum\limits_{x \in P*} {\mathop {\min }\limits_{y \in A} } \, d(x,y) $$
    (21)

    where d (x, y) shows the Euclidean distance between point x and y, |P*| represents the size of P*. The smaller value of IGD(A, P*) states that set A is closer to P*. In addition, the objectives of every non-dominated solutions should be normalized by Eq. (22) before calculating IGD.

    $$ \tilde{f}_{i} (x) = (f_{i} (x) - f_{i}^{\min } )/(f_{i}^{\max } - f_{i}^{\min } ), \, i = 1,2,3 $$
    (22)

    where \(f_{i}^{\min }\) is the minimal value of the ith objective amongst all obtained solutions, besides, \(f_{i}^{\max }\) states the maximum.

  3. 3.

    Let N denotes the non-dominated solutions obtained by one algorithm, and the Nnd of the algorithm shows the number of solutions in N that remains in P*.

    $$ R_{nd} = N_{nd} /|P*| $$
    (23)

    where the definition of P* is identical with Eq. (21). Rnd = 0 expresses that solutions obtained by the algorithm all dominated by those solutions obtained by other algorithms. Rnd = 1 represents that any solution obtained by the algorithm cannot be dominated by any other solutions obtained by other algorithms.

4.2 Design of experiments

4.2.1 Experiment setting

All the algorithms are coded in MATLAB R2016 and implemented on a desktop computer with an IntelCorei3 CPU with 3.60 GHz frequency and 4 GB RAM.

4.2.2 Data source

The flexible job-shop scheduling problem (FJSP) is commonly seen in the real manufacturing industry. A reasonable FJSP solution can effectively help manufacturers to improve production efficiency and reduce production costs. Some benchmarks have been proposed by researchers based on realistic production conditions, where consist of many actual production problems that plague the enterprises (e.g. maximal completion time of machines, workload of the most loaded machine, total workload of all machines etc.). Although the benchmarks contain many practical problems, the benchmarks can also be expanded easily to use if some new factors or actual production targets that have not been considered before appear, (e.g. Demir and İşleyen 2014; Kacem et al. 2014; Zheng and Wang 2016; Gong et al. 2018b; Gong et al. 2019a).

In our problem, we consider the objectives of minimizing the makespan, total CE and total cost of workers, which are related to the interests of the companies, especially when they make scheduling decisions considering profit and environment protection. We construct our benchmarks for testing LFJSP-WL extended from Dauzèrepérès and Paulli (1997) and Brandimarte (1993), both of them are widely used for FJSP. For the sake of simplicity, the original names of these benchmarks are used in this paper. Some supplementary data for extension are shown in Table 3. The intervals in rows 1–4 may be different in different levels (worker set), and the value of variables in rows 5–8 are related to diverse machines. Besides, the removal volume of materials and CE coefficient of the tool are given in the last two rows respectively, and the values are related to various operations. All the parameters in column 1 mentioned in Table 3 are distributed according to the given intervals and distribution mode (normal distribution and uniform distribution). The total number of workers is generated by [m × 0.6] while the partial worker flexible is considered in this paper, namely, some machines cannot be operated by an apart of workers.

Table 3 Parameters for extension

The emission factors used in this paper can be obtained according to 2050 China energy and CO2 emission report (China Energy and CO2 Emission Research Group 2009) and China energy conservation and emission reduction development report (China Energy Saving and Investment Company 2009). For example, the emission factors of coolant (EFc), lubricant oil (EFl) and electric energy (EFe) are 3.782, 0.469 and 2.41.

4.2.3 Parameters setting

In MA, there are four critical parameters, i.e., population size (PS), crossover rate (CR), mutation rate (MR) and neighborhood search rate (NR). To obtain the best combination of these parameters in our algorithm, The Taguchi method of DOE (Montgomery 2008) is used with instance mk04. Different levels for each parameter are listed in Table 4 and the combinations are shown in Table 5. For each parameter combination, the iteration index is 100 and the proposed MA runs 10 times independently. One of the proposed metrics, i.e., Rnd, is adopted to measure each combination. We first obtain the exact Pareto front and individuals in the elite pool of all experiments (DT), and the Pareto front (DE) amongst DT can be calculated with DT as the reference front. Finally, the average percentage of Pareto front (\(\overline{R}\)) in DE for each experiment is used to measure the effectiveness of various combinations.

Table 4 Parameter level
Table 5 Results of parameter combination

Based on the results in Table 5, the trend of four critical parameters is illustrated in Fig. 5. It can be seen that different combinations of the four parameters have various influences on the final results. The level with the maximum Rnd is the best one for every factor. According to Fig. 5, we choose PS = 180, CR = 0.7, MR = 0.1, and NR = 0.3 for the following experiment parameters of the LFJSP-WL benchmarks.

Fig. 5
figure 5

The factor level trend of PS, CR, MR, NR

4.3 Effectiveness of the elite pool

A comparison experiment is executed to test the effectiveness of the elite pool proposed in our paper. To ensure fairness, all parameters and operators of MA are consistent, the only difference is that one algorithm with the elite pool, i.e. MA, while another modified memetic algorithm without the elite pool, i.e. MMA. The parameter settings are following the results of Taguchi method and set the iteration index as 100. After 10 independent runs, MA gets the corresponding non-dominated solutions and the solutions in the elite pool respectively, while MMA obtains the corresponding non-dominated solutions. Then both algorithms integrated the obtained solutions as a comparison population, one metric (i.e. SC) is adopted to measure the results.

According to the results shown in Table 6. The C(B,A) are all equal to 0 in all problems which states that all solutions obtained by MA cannot be dominated by any solution obtained by MMA. In addition, the value of C(A,B) in each problem is greater than 0 expressing that some solutions obtained by MMA are dominated by at least one solution obtained by MA. What is more, the C(A,B) in the problem la14 is approximately equals to 0.5 which indicates that nearly half of the solutions obtained by MMA are dominated. Based on the above discussion, the effectiveness of the elite pool can be verified.

Table 6 Effectiveness test of elite pool

4.4 Comparison of two initialization methods

Two initialization methods are constructed in this paper. One introduced in Sect. 3 (method1), another method modified from Zhang et al. (2011) which includes global selection, local selection and random selection (method2). To obtain the best effectiveness of MA, a comparison experiment between method1 and method2 is carried out, the algorithm settings except the initialization method are identical.

From the comparison results listed in Table 7, method1 outperforms method2 in 15 problems out of 22 problems. Especially, C(B,A) = 0 in la21, la22, la29 and la31 problems, which shows that all solutions obtained by method1 cannot be dominated by solutions obtained by method2 in these problems. Overall, method 1 behaves better than method 2.

Table 7 Comparison of initialization methods

4.5 Comparison of NSGA-II, NNIA, NNIALS and MA

To make a more detailed performance analysis with respect to MA towards LFJSP-WL, we compare MA with three other commonly used multi-objective algorithms, namely, NSGA-II (Deb et al. 2002), NNIA (Gong et al. 2008) and NNIALS (NNIA combines our proposed local search). IGD, SC and Rnd are used to evaluate the performance of different algorithms. All algorithms have the same parameters and set the iteration index as 100. After each algorithm runs 10 times independently, the comparison results are listed in Tables 8 and 9.

Table 8 Set Coverage comparison of MA, NSGA-II, NNIA and NNIALS
Table 9 Rnd and IGD comparison of MA, NSGA-II, NNIA and NNIALS

In Table 8, the value of C(A,B) listed in the third column outperforms the value of C(B,A) listed in the fourth column significantly for all test problems, which means that MA derives better solutions than NSGA-II. With regard to the comparison of MA and NNIA, MA also shows great superiority to NNIA in the light of the values listed in the fifth column and sixth column for all test problems. For the last two columns expresses the comparison of MA and NNIALS, we can observe that MA obtains better solutions in 21 test problems out of 22 test problems.

In Table 9, the Rnd and IGD comparison of the four algorithms are presented. In terms of Rnd, MA outperforms the other three algorithms by a large margin for all 22 test problems. Besides, MA obtains all better values of IGD compared to NSGA-II and NNIA. It can be found that 21 test problems out of 22 test problems MA is superior to NNIALS. In summary, MA shows its effectiveness for the proposed LFJSP-WL, while the values of NNIALS also indicates the validity of our proposed local search in a sense.

5 Conclusions and future studies

This paper put forward a novel LFJSP-WL model that addresses productivity (i.e. makespan), environmental impact (i.e. total CE) and economic influence (i.e. total cost of workers). The CE of the model consists of four categories, namely, the energy consumption of processing, the consumption of auxiliary materials, the tool wear and the energy consumption in the stand-by stage of the machine tool. An effective memetic algorithm is tailored to solve this problem, some useful initialization methods, crossover operators and mutation operators are presented, and the variable neighborhood search with four neighborhood structures is proposed to strengthen the local search capacity. Besides, the elite pool is adopted to reserve elite solutions along with evolution. To evaluate the effectiveness of MA, we constructed 22 benchmarks based on the classic FJSP benchmarks from Dauzèrepérès and Paulli (1997) and (Brandimarte (1993)). The experimental results show that the proposed LFJSP-WL model can be solved effectively by MA.

In the future, some limitations in this paper can be further studied. First, consider the CE of inventory, the time of transportation and the workpiece clamping time. Second, address the uncertain events, for example, the machine breakdown, the arrival of new jobs and the cancellation of the order. Finally, the combination of learning and forgetting effects would be an interesting topic.