1 Introduction

The concept of the smart industry, often referred to as Industry 4.0, encompasses a transformative paradigm shift in manufacturing and production processes, leveraging advanced technologies such as the Internet of Things, artificial intelligence, robotics, and data analytics [8]. It aims to create highly interconnected, data-driven, and adaptive manufacturing ecosystems, enabling real-time monitoring, optimization, and automation of production, supply chains, and services. Smart industry endeavors to enhance efficiency, flexibility, and innovation while fostering sustainable practices, ultimately reshaping traditional industrial practices into agile, intelligent, and interconnected systems.

The smart industry also fosters an interconnected ecosystem where customers, suppliers, and producers collaborate harmoniously [10]. On the customer side, mass customization, a hallmark of this transformation, empowers customers to define their unique product preferences, shaping demand in real time. On the supplier side, intelligent data-driven systems allow suppliers to seamlessly adjust their offerings, optimizing inventory and production processes to meet dynamic customization needs. This enhanced interplay among the smart industry's technological prowess, empowered customers, and agile suppliers has fundamentally reshaped conventional supply chains through comprehensive integration [24].

This new paradigm paves the way for transforming classic production processes into mass customization processes, in which the client plays an active role in the design of the final product. Personalization has a significant impact on production: since the finished products are not all the same, their production processes cannot all be the same either. In production systems configured as a flow shop, personalization typically appears in the form of missing operations [25]. In missing operation flow shop scheduling problems, the operation route may differ from job to job, the difference being that a job may skip one or more of the operations. Consequently, the cardinality of the set of operations is not constant across jobs. This represents a challenging scenario for production scheduling decision-making, because the orders to be planned are not all alike [20]. Furthermore, decision makers nowadays must satisfy several criteria to schedule production efficiently, which further increases the complexity and difficulty of the problem.

This article addresses a multi-objective flow shop scheduling problem with missing operations using a metaheuristic approach [14]. Specifically, the problem considered is a regular flow shop system, with one machine or production resource per stage, in which the jobs to be processed may not need to visit every machine. As mentioned before, optimizing this problem involves considering more than one criterion simultaneously. In this case, three objective functions are analyzed, namely makespan, total tardiness, and total completion time. Treated as a multi-objective optimization problem, these goals make it possible to optimize production system utilization, customer service level, and the flow of production orders, respectively. As far as the authors know, this is the first time that a missing operation flow shop problem with three objectives is studied. The metaheuristics applied to solve the problem are NSGA-II, NSGA-III, MOEA/D, and SPEA2.

The article is structured as follows. Section 2 formally presents the flowshop problem, describing its mathematical formulation and the main related works. Section 3 presents the metaheuristic algorithms used for the resolution of the multiobjective flowshop problem. Section 4 describes the computational experimentation, including the implementation details, the description of instances and the main results. Finally, Sect. 5 presents the conclusions of the research and formulates the main lines for future work.

2 Mass Customization and the Multiobjective Flowshop Problem with Missing Operations

This section first gives an overview of mass customization in smart industry environments. Then, the problem addressed in this work is described in detail, including the mathematical definition of the objective functions considered in the multi-objective approach. Finally, the related works found in the literature are reviewed in order to highlight the main contributions of the reported research.

2.1 Mass Customization Impact on the Shop-Floor Operations

As mentioned above, mass customization is an important aspect of smart industry. Mass customization refers to the capacity of efficiently producing goods and services that are tailored to meet individual customer preferences and requirements, while still achieving economies of scale similar to mass production [1, 18]. Several companies have successfully invested in enhancing their mass customization strategies to offer personalized products to their customers. One example is Nike, which allows customers to design their own sneakers through its Nike By You platform, where they can choose colors and materials and customize various design elements [26]. Larger products have also joined this wave of customization. For example, in car manufacturing, BMW enables customers to personalize their luxury vehicles with a wide range of custom features, including paint colors, interior materials, and technology options [27].

Smart industry enables manufacturers to gather insights from customer preferences, adapt production processes, optimize resource allocation, and dynamically reconfigure assembly lines, resulting in the cost-effective creation of highly customized products on a scale previously unattainable. Mass customization starts with smart product design, in which the preferences of the user are translated into instructions for the shop-floor operations on how to plan the production phase. In this regard, mass customization has a huge impact on shop-floor operations [28]. The aspects involved in efficient shop-floor management include: i) Workflow flexibility and advanced manufacturing technologies: shop floors must be designed to accommodate varying product configurations and customization options based on advanced manufacturing assets [11]; ii) Real-time data integration: shop-floor operations need to integrate data systems to ensure accurate and up-to-date information for decision-making [29]; iii) Inventory management: inventory management systems must be optimized to ensure that the right components are available for each customization option [6]; iv) Quality 4.0: robust quality assurance protocols and testing procedures are crucial to maintain customer satisfaction [30]; v) Skilled workforce: training programs are essential to empower workers with the knowledge needed to execute customization tasks effectively [12]; vi) Smart production planning and control: the activities of loading, scheduling, sequencing, monitoring, and controlling the use of resources and materials during production are performed intelligently by means of data analytics, AI, and machine learning [17].

Figure 1 presents a summary of the main concepts involved in a smart industry and the impact of mass customization on shop-floor operations management. This article focuses on smart production planning, proposing new resolution methodologies for the flowshop problem that arises in the context of mass customization with missing operations.

Fig. 1. Smart industry: the new paradigm and the impact on shop-floor operations management.

2.2 Mathematical Formulation

The mathematical formulation of the multiobjective flowshop problem with missing operations considers the following elements:

  • A set of machines or operations M that can be performed.

  • A set of jobs J that have to be delivered.

  • A due date \(d_{j}\) by which each job j has to be delivered.

  • A completion time function \(C(j): P_{JM} \times Or \rightarrow \mathcal {R}_{0}^{+}\) that, given the matrix \(P_{JM}\) of processing times \(p_{jm}\) of each job j on each machine m and the processing order Or of the jobs on the machines, returns the completion time of job j, i.e., the time at which the job has completed all its required operations on all the machines.

  • A weight \(w_j\) for each job j that gives its relative importance in the total completion time.

Then, the optimization problem addressed in this work is to define the processing order of jobs on the machines that simultaneously minimizes the following three metrics: the makespan, the weighted total completion time, and the total tardiness, which are computed as in Eqs. (1a)–(1c).

$$\begin{aligned} \text{ min } \quad & \text {Makespan} = \displaystyle \max _{j\in {J}} \{C(j)\} \end{aligned}$$
(1a)
$$\begin{aligned} \text{ min } \quad & \text {Total Completion Time} = \displaystyle \sum _{j\in {J}} w_j \times C(j) \end{aligned}$$
(1b)
$$\begin{aligned} \text{ min } \quad & \text {Total Tardiness} = \displaystyle \sum _{j\in {J}} \max \{0,\, C(j) - d_j\} \end{aligned}$$
(1c)

Missing operations are reflected in the matrix of processing times \(P_{JM}\), in which several parameters \(p_{jm}\) can be equal to 0. The problem is subject to two groups of constraints. First, non-overlapping constraints prohibit more than one job from being processed on the same machine simultaneously. Second, precedence constraints require that the processing of a job on a machine start only after the same job has finished on the previous machine. These restrictions shape the flowshop problem, which is known to be a computationally complex problem [4].
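As an illustration of Eqs. (1a)–(1c), the following minimal sketch evaluates the three objectives for a given job sequence. It rests on two assumptions that are not stated in the formulation above: the same job order is used on every machine, and a job with \(p_{jm} = 0\) skips machine m without waiting for it to become free. The function name `evaluate` and the toy data are hypothetical and are not taken from the authors' implementation.

```python
def evaluate(order, proc, due, weight):
    """Return (makespan, weighted total completion time, total tardiness).

    order  -- job indices in processing order (assumed identical on all machines)
    proc   -- proc[j][m] is the processing time of job j on machine m (0 = missing)
    due    -- due[j] is the due date of job j
    weight -- weight[j] is the weight of job j
    """
    n_machines = len(proc[0])
    machine_free = [0] * n_machines  # time at which each machine becomes idle
    completion = {}
    for j in order:
        t = 0  # time at which job j leaves its last visited machine
        for m in range(n_machines):
            p = proc[j][m]
            if p == 0:
                continue  # assumption: a missing operation is skipped entirely
            start = max(t, machine_free[m])  # precedence and non-overlap
            t = start + p
            machine_free[m] = t
        completion[j] = t
    makespan = max(completion.values())
    weighted_completion = sum(weight[j] * completion[j] for j in order)
    tardiness = sum(max(0, completion[j] - due[j]) for j in order)
    return makespan, weighted_completion, tardiness


# Toy instance: 3 jobs, 3 machines, two missing operations.
proc = [[3, 0, 2], [2, 4, 1], [0, 3, 3]]
due = [6, 8, 9]
weight = [1, 2, 1]
print(evaluate([0, 1, 2], proc, due, weight))
print(evaluate([2, 0, 1], proc, due, weight))
```

In this toy instance neither sequence is obviously preferable without comparing all three values, which is the situation a multi-objective approach is meant to handle.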

2.3 Related Work

The study of missing operations in regular flow shop problems (i.e., with a single machine per stage) is not new; it has been studied for more than two decades [5]. However, in recent years it has gained renewed attention in the scientific community since, as mentioned previously, there is a growing trend towards transforming traditional manufacturing processes into personalized ones [3]. One example is the personalized concrete industry, where the production process is configured as a flow shop with missing operations. In [25], the objective is to minimize the cycle time, and the authors developed a comprehensive, innovative approach that copes with the particular restrictions of the production process, such as lags between operations. A different type of problem is tackled in [19], where a non-permutation flow shop problem with missing operations is considered and mathematical programming models are developed to optimize the makespan. A non-permutation flow shop problem is also analyzed in [20], but in this case total tardiness is the objective function and metaheuristics are used for optimization. More recently, [7] studied a missing operation flow shop problem arising in the semiconductor industry. A special feature of that problem is the presence of waiting time constraints; the objective is to minimize total tardiness, and heuristic algorithms are implemented to solve it. Regarding multi-objective optimization with missing operations in regular flow shop problems, the literature is scarcer. As far as the authors know, only [21, 22] address it, approaching a bi-objective problem by means of evolutionary algorithms. In those studies, makespan and total tardiness are minimized, and different levels of missing operations are considered.

Therefore, the literature review did not reveal any missing operation problem with three objectives, let alone one that analyzes makespan, total tardiness, and total completion time simultaneously. These objectives represent different interests that decision makers must balance in their optimization processes. Makespan contributes to optimizing the usage of production resources by reducing their idle time. Total tardiness focuses on the service level, trying to meet the due dates agreed with the client. Total completion time aims to reduce the time a production order spends in the system, that is, the time during which the order is considered work-in-progress. The present article addresses these three objectives in a multi-objective manner.

3 Resolution Approach

This section describes the proposed resolution approach for solving the multiobjective flowshop problem.

3.1 Overall Description and Algorithms

Various strategies have been proposed to solve multi-objective optimization problems. Resolution approaches include exact methods rooted in mathematical programming [23], as well as heuristic and metaheuristic strategies [16]. For complex combinatorial challenges like the one described in this paper, metaheuristics offer an efficient resolution strategy to attain high-quality solutions in reasonable computing times and, thus, have been extensively used in similar problems [20, 22].

Among metaheuristics, multi-objective evolutionary algorithms (MOEAs) are population-based methods inspired by the evolutionary process of species in nature. MOEAs have proven to be successful methods in diverse complex optimization problems [14]. In particular, this article proposes applying four state-of-the-art evolutionary metaheuristics to address the target problem: SPEA2, NSGA-II, NSGA-III, and MOEA/D, which are described next.

Strength Pareto Evolutionary Algorithm 2 (SPEA2). SPEA2 is a MOEA that focuses on non-dominated sorting and density estimation to generate a diverse set of solutions, allowing for effective exploration of the Pareto front. A notable aspect of SPEA2 is its fitness calculation, which takes into account both Pareto dominance and diversity. The algorithm introduces the concept of strength to gauge how many candidate solutions are dominated by or dominate other solutions. Additionally, fitness assignment involves density estimation. Elitism is also incorporated through the use of a population that stores the non-dominated individuals discovered during the search.
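The strength-based fitness described above can be sketched as follows, assuming minimization and the usual SPEA2 definitions: the strength of a solution is the number of solutions it dominates, its raw fitness is the sum of the strengths of the solutions that dominate it, and its density is the inverse of the distance to the k-th nearest neighbor plus two. The function names and sample points are hypothetical; in SPEA2 the computation is applied to the union of the population and the archive, which is represented here by a single list.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in all objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def spea2_fitness(points):
    """Return (strength, raw fitness, fitness) lists for objective vectors."""
    n = len(points)
    # Strength: how many solutions each point dominates.
    strength = [sum(dominates(p, q) for q in points) for p in points]
    # Raw fitness: sum of the strengths of the dominators (0 for non-dominated).
    raw = [
        sum(strength[j] for j in range(n) if dominates(points[j], points[i]))
        for i in range(n)
    ]
    # Density: based on the distance to the k-th nearest neighbor, k = sqrt(n).
    k = max(1, int(math.sqrt(n)))
    fitness = []
    for i in range(n):
        dists = sorted(math.dist(points[i], points[j]) for j in range(n) if j != i)
        sigma_k = dists[min(k, len(dists)) - 1] if dists else 0.0
        fitness.append(raw[i] + 1.0 / (sigma_k + 2.0))
    return strength, raw, fitness


points = [(1, 5, 3), (2, 6, 4), (3, 1, 2), (4, 7, 5)]
strength, raw, fitness = spea2_fitness(points)
print(strength, raw)
```

Because the density term never exceeds 0.5, non-dominated solutions always receive a fitness below one, so dominance takes priority and density only breaks ties.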

Non-dominated Sorting Genetic Algorithm II (NSGA-II). NSGA-II is a widely used evolutionary algorithm that employs non-dominated sorting, crowding distance, and elitism to evolve a diverse population of solutions, efficiently approximating the Pareto front. NSGA-II is characterized by an evolutionary search using a non-dominated elitist ordering that diminishes the complexity of the dominance check, a crowding technique for diversity preservation, and a fitness assignment method considering dominance ranks and crowding distance values. All these features are integrated to provide a robust and effective search, which has been successfully applied to solve multiobjective optimization problems in many application areas.
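The two ranking ingredients of NSGA-II can be illustrated with the sketch below, which assumes minimization. For brevity it uses a straightforward front-peeling procedure rather than the faster bookkeeping of the original algorithm; the resulting fronts are the same. Function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in all objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Split point indices into successive fronts (front 0 is non-dominated)."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Crowding distance of each index in a front (larger = less crowded)."""
    dist = {i: 0.0 for i in front}
    for o in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][o])
        lo, hi = points[ordered[0]][o], points[ordered[-1]][o]
        # Boundary solutions are always preserved.
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for a, b, c in zip(ordered, ordered[1:], ordered[2:]):
            dist[b] += (points[c][o] - points[a][o]) / (hi - lo)
    return dist


points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = non_dominated_sort(points)
distances = crowding_distance(points, fronts[0])
print(fronts, distances)
```

Selection then prefers solutions in earlier fronts and, within the same front, solutions with a larger crowding distance.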

Non-dominated Sorting Genetic Algorithm III (NSGA-III). NSGA-III is an extension of NSGA-II that incorporates reference points to guide the optimization process, enhancing the spread of solutions along the Pareto front and supporting better convergence.
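As an illustration of what such reference points look like, the sketch below generates uniformly spaced points on the unit simplex with the systematic (Das and Dennis) construction commonly paired with NSGA-III. The number of divisions is a hypothetical choice; the configuration used in the experiments is not stated in this article.

```python
from itertools import combinations


def das_dennis(n_obj, divisions):
    """Uniformly spaced reference points on the unit simplex.

    Each point has coordinates that are multiples of 1/divisions and sum to 1.
    Points are enumerated with a stars-and-bars construction.
    """
    points = []
    slots = divisions + n_obj - 1
    for bars in combinations(range(slots), n_obj - 1):
        counts = []
        prev = -1
        for b in bars:
            counts.append(b - prev - 1)  # stars between consecutive bars
            prev = b
        counts.append(slots - 1 - prev)  # stars after the last bar
        points.append(tuple(c / divisions for c in counts))
    return points


reference_points = das_dennis(n_obj=3, divisions=12)
print(len(reference_points))
```

For three objectives, as in the problem studied here, 12 divisions yield 91 reference points; each solution is associated with its closest reference direction during environmental selection.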

Multi-objective Evolutionary Algorithm based on Decomposition (MOEA/D). MOEA/D decomposes a multi-objective optimization problem into subproblems, each solved by a separate optimization process. It balances exploration and exploitation to efficiently approximate the Pareto front by iteratively updating solutions through collaboration among subproblems.
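The decomposition idea can be sketched as follows. The weighted Tchebycheff function used here is one common scalarization in MOEA/D; this article does not state which aggregation function was used, so it should be read as an assumption. Function names, weight vectors, and values are hypothetical.

```python
import math


def tchebycheff(objectives, weights, ideal):
    """Scalar value of one subproblem: weighted Tchebycheff distance to the ideal point."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def neighborhoods(weight_vectors, size):
    """For each subproblem, the indices of the `size` closest weight vectors."""
    result = []
    for w in weight_vectors:
        ranked = sorted(
            range(len(weight_vectors)),
            key=lambda k: math.dist(w, weight_vectors[k]),
        )
        result.append(ranked[:size])
    return result


# One subproblem evaluated on a solution with three objective values.
value = tchebycheff((10, 4, 6), (0.5, 0.25, 0.25), (2, 2, 2))

# Neighborhoods over a small set of weight vectors.
weights = [(1.0, 0.0), (0.75, 0.25), (0.25, 0.75), (0.0, 1.0)]
neigh = neighborhoods(weights, size=2)
print(value, neigh)
```

Each subproblem is optimized using parents drawn from its neighborhood, and a new solution may replace the incumbents of neighboring subproblems when it improves their scalar value.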

3.2 Description of the Proposed Metaheuristics

The proposed MOEAs operate using the following features:

Solution Representation. As is usual in similar works, solutions are represented as permutations of integers stored in a vector. The index within the vector represents the position in the processing sequence on the initial machine, and the integer value at that index identifies the job to be scheduled. Thus, the length of the vector equals the total number of jobs.

Initialization. The population, comprising \(\#P\) individuals, is initialized through a random procedure that generates permutations without repeated integer values. Each value within a solution representation is chosen from the interval [1,n], where n is the number of jobs, using a uniform probability distribution.

Evolutionary Operators. The well-known Partially Mapped Crossover (PMX) is employed as the recombination operator. This crossing mechanism pairs two chosen individuals with a probability of \(p_c\), and it has been widely utilized in various studies tackling permutation-encoded scheduling issues. Subsequently, the mutation operator relies on Swap Mutation, involving the interchange of two elements within the permutation. Application of the mutation operator to an individual occurs with a probability of \(p_m\). Notably, the proposed operators ensure the feasibility of the resultant solutions.
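The representation, initialization, and operators described above can be sketched as follows. This is an illustrative version and not the jMetal code used in the experiments; the function names are hypothetical, and the cut points of PMX are passed explicitly so the behavior can be checked by hand.

```python
import random


def init_population(size, n_jobs, rng):
    """Random permutations of the job identifiers 1..n_jobs."""
    return [rng.sample(range(1, n_jobs + 1), n_jobs) for _ in range(size)]


def pmx(parent1, parent2, cut1, cut2):
    """Partially Mapped Crossover producing one child.

    The child copies parent1[cut1:cut2]; the other positions take the value of
    parent2, repaired through the segment mapping when it would be repeated.
    """
    n = len(parent1)
    child = [None] * n
    child[cut1:cut2] = parent1[cut1:cut2]
    segment = set(parent1[cut1:cut2])
    pos_in_p1 = {v: i for i, v in enumerate(parent1)}
    for i in list(range(cut1)) + list(range(cut2, n)):
        v = parent2[i]
        while v in segment:  # follow the mapping until the value is unused
            v = parent2[pos_in_p1[v]]
        child[i] = v
    return child


def swap_mutation(perm, rng):
    """Exchange the jobs at two distinct random positions."""
    i, j = rng.sample(range(len(perm)), 2)
    mutant = list(perm)
    mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant


rng = random.Random(1)
population = init_population(5, 8, rng)
child = pmx([1, 2, 3, 4, 5, 6, 7, 8], [3, 7, 5, 1, 6, 8, 2, 4], 3, 6)
mutant = swap_mutation(child, rng)
print(child, mutant)
```

Both operators return permutations, which is why no repair step is needed to keep offspring feasible.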

4 Computational Experimentation

This section presents the computational experimentation of the proposed approach, including the description of instances, the methodology used for the experimental evaluation, and the main numerical results.

4.1 Description of the Problem Instances

A set of realistic instances was constructed for the computational experimentation, following the procedure by Henneberg and Neufeld [9]. Processing times were generated as integer values within the range [0:100] following a pseudo-uniform distribution, in which the probability of a processing time being zero is higher than that of the other possible values. This approach ensured varied processing times, including the possibility of missing operations. The sets of instances were constructed considering three different numbers of jobs (30, 40, and 50), two different numbers of machines or operations (10 and 20), and three different percentage probabilities of missing operations (0%, 10%, and 20%). The instances were named using the convention n \(\times \) \(m - p\%\), where n is the number of jobs, m is the number of machines, and \(p\%\) is the percentage probability of missing operations.
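A minimal sketch of such a generator is given below. It assumes that each operation is missing with probability p and that the remaining processing times are drawn uniformly from 1 to 100; the exact procedure of [9] may differ in these details, and the function name is hypothetical. Due dates and weights are not covered because their generation is not described here.

```python
import random


def generate_processing_times(n_jobs, n_machines, p_missing, seed=None):
    """Matrix proc[j][m] with zeros marking missing operations.

    Assumption: an operation is missing with probability p_missing; otherwise
    its processing time is a uniform integer in [1, 100].
    """
    rng = random.Random(seed)
    return [
        [0 if rng.random() < p_missing else rng.randint(1, 100)
         for _ in range(n_machines)]
        for _ in range(n_jobs)
    ]


# Instance of size 30 x 10 with a 20% probability of missing operations.
instance = generate_processing_times(30, 10, 0.20, seed=42)
missing = sum(row.count(0) for row in instance)
print(missing / (30 * 10))
```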

4.2 Methodology for the Computational Experimentation

This subsection describes how the computational experimentation of the proposed MOEAs was performed.

Implementation Details and Execution Platform. The proposed MOEAs were implemented in Java, using the jMetal framework version 6.1 [13]. The computational experiments were executed at the National Supercomputing Center, Uruguay (Cluster-UY) [15].

Evaluation Metrics. The evaluation is performed considering two multiobjective optimization metrics: spread and relative hypervolume (RHV). Spread [2] is a metric of diversity that evaluates the distribution of the non-dominated solutions, assessing the capacity of correctly sampling the Pareto front. Unlike other typical distribution metrics such as spacing, the spread as formulated in Eq. (2) takes into account the information about the extreme points of the true Pareto front to calculate a more accurate value of the dispersion.

$$\begin{aligned} Spread = \frac{\sum _{o\in \mathcal {O}} d_{o}^{e} + \sum _{i\in \mathcal{N}\mathcal{D}} |\overline{d}-d_i|}{\sum _{o\in \mathcal {O}} d_{o}^{e} + |\mathcal{N}\mathcal{D}| \overline{d}} \end{aligned}$$
(2)

where \(\mathcal {O}\) is the set of objectives, \(\mathcal{N}\mathcal{D}\) is the set of non-dominated solutions, \(d_{o}^{e}\) is the distance between the extreme point of the Pareto front regarding objective o and the closest non-dominated solution in the computed Pareto front, \(d_i\) is the distance between the non-dominated solution i in the computed Pareto front and its closest neighboring non-dominated solution, and \(\overline{d}\) is the average value of all \(d_i\). On the other hand, the RHV quantifies the ratio between the hypervolumes (in the objective space) covered by the computed Pareto front and the true Pareto front of the problem. Thus, in an ideal situation the RHV value equals one. Consequently, RHV serves as a comprehensive metric that evaluates both numerical accuracy (proximity of the computed Pareto front to the true Pareto front) and the distribution of the non-dominated solutions. When the true Pareto front is unknown for a problem instance, as is the case in this study, the true Pareto front is approximated using all the non-dominated solutions obtained from all the runs performed for that instance.
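The following sketch implements Eq. (2) and the approximation of the true Pareto front by merging the non-dominated solutions of several runs. It assumes minimization, Euclidean distances, objective values that are already normalized, and a computed front with at least two points; the extreme points are passed in explicitly because their exact selection rule is not specified above. Function names and data are hypothetical.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in all objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def reference_front(fronts):
    """Merge the fronts of all runs and keep only the non-dominated points."""
    merged = sorted({tuple(p) for front in fronts for p in front})
    return [p for p in merged if not any(dominates(q, p) for q in merged)]


def spread(front, extremes):
    """Spread of a computed front, following Eq. (2)."""
    # d_o^e: distance from each extreme point to its closest computed solution.
    d_ext = [min(math.dist(e, s) for s in front) for e in extremes]
    # d_i: distance from each solution to its nearest neighbor in the front.
    d = [
        min(math.dist(s, t) for k, t in enumerate(front) if k != i)
        for i, s in enumerate(front)
    ]
    d_bar = sum(d) / len(d)
    numerator = sum(d_ext) + sum(abs(d_bar - d_i) for d_i in d)
    denominator = sum(d_ext) + len(front) * d_bar
    return numerator / denominator


merged = reference_front([[(1, 2), (2, 1)], [(1, 1), (3, 0)]])
extremes = [(0.0, 1.0), (1.0, 0.0)]
even = spread([(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)], extremes)
uneven = spread([(0.0, 1.0), (0.1, 0.9), (1.0, 0.0)], extremes)
print(merged, even, uneven)
```

An evenly spaced front that reaches both extreme points yields a spread close to zero, whereas clustering solutions near one extreme increases the value, which is why smaller spread values are preferred.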

Parametrization. The parametric configuration was selected by means of statistical analysis. This process established the values of the key parameters of the studied MOEAs: population size (\(\#P\)), crossover probability (\(p_c\)), and mutation probability (\(p_m\)). The following candidate values were assessed: 50 and 100 for population size; 0.5, 0.7, and 0.9 for crossover probability; and 0.01, 0.05, and 0.1 for mutation probability. Consequently, a comprehensive evaluation encompassing sixteen parametric configurations ensued for each of the four MOEAs. The analysis for the parameter setting was based on the RHV, which, as mentioned above, is a robust summary metric. The stopping condition was set to 150,000 evaluations of the objective function. The comparison used three small instances different from those of the main computational study. As the RHV values did not follow a normal distribution according to the Shapiro-Wilk test, the Friedman rank test, a non-parametric method, was employed to assess the goodness of each configuration. In addition, the neighborhood size of MOEA/D was set to 3% of \(\#P\), which showed good performance in our previous work [21]. After the parameter setting, the following configurations were chosen for the studied MOEAs:

  • MOEA/D: \(\#P = 50\), \(p_c = 0.5\), and \(p_m=0.1\)

  • NSGA-II: \(\#P = 100\), \(p_c = 0.7\), and \(p_m=0.1\)

  • NSGA-III: \(\#P = 50\), \(p_c = 0.7\), and \(p_m=0.1\)

  • SPEA2: \(\#P = 100\), \(p_c = 0.9\), and \(p_m=0.1\)

4.3 Numerical Results

This subsection describes the results of the computational experimentation. For each instance and each MOEA, 30 independent runs were performed.

Multi-objective Optimization Metrics. Tables 1 and 2 summarize the results for RHV and spread, respectively. For each instance, the tables report the statistical test used to study whether there are significant differences among the means or medians of the studied MOEAs, together with a central tendency measure and a dispersion measure. For the instances in which the results follow a normal distribution, the ANOVA test is applied (denoted “A” in Tables 1 and 2), and the mean and standard deviation are reported. Conversely, when the results do not follow a normal distribution, the Kruskal-Wallis test (denoted “K-W” in the tables) is applied, and the median and interquartile range are reported. For each instance, the best result is marked in bold. A gray background indicates the cases in which the test verified a statistically significant difference with respect to the other MOEAs. Regarding RHV, NSGA-II obtained the largest mean/median in 8 out of 18 instances, SPEA2 in 6, and NSGA-III in 4. The largest mean/median value overall was obtained by SPEA2 for instance 30J \(\times \) 10M-0% (0.7822). In terms of spread, SPEA2 obtained the smallest value in 13 out of 18 instances, whereas NSGA-II and NSGA-III obtained the smallest values in 3 and 2 instances, respectively. The overall smallest spread was obtained by SPEA2 for instance 30J \(\times \) 20M-0% (0.3895). Overall, SPEA2 and NSGA-II had the best performance for the instances studied, outperforming the other MOEAs in both analyzed metrics.

Table 1. Results of RHV metric for the studied MOEAs.

Consolidated Pareto Fronts. Table 3 reports the spread and RHV metrics for the consolidated Pareto fronts, computed from all the non-dominated solutions obtained by each MOEA for each instance in the 30 independent runs. Regarding RHV, NSGA-II, SPEA2, and NSGA-III outperform the rest of the MOEAs in 6, 6, and 5 instances, respectively. Regarding spread, NSGA-II, SPEA2, NSGA-III, and MOEA/D outperform the rest of the MOEAs in 10, 6, 1, and 1 instances, respectively. As in the previous numerical results, SPEA2 and NSGA-II obtain the best results in more instances than the other two MOEAs.

Table 2. Results of Spread metric for the studied MOEAs.
Table 3. Consolidated Spread and RHV metrics for the studied MOEAs.

Impact of Missing Operations Over Instances. To show the relation between the values of the objectives, Fig. 2 presents the consolidated Pareto fronts for instance 30J \(\times \) 10M according to the percentage of missing operations. Similar results were obtained for the rest of the instances. Total tardiness and weighted total completion time seem to be highly sensitive to the percentage of missing operations: the larger the percentage probability of missing operations, the smaller their values. In contrast, the makespan is less affected by the percentage probability of missing operations.

Fig. 2. Pareto front of instance 30J \(\times \) 20M with different levels of missing operations.

5 Conclusions and Future Work

Mass customization, as part of the smart industry paradigm, seeks to combine the advantages of mass production with those of customization. This article studied the flow shop problem with missing operations that arises in shop-floor operations as a consequence of mass customization. Four state-of-the-art MOEAs were applied to simultaneously optimize three traditional metrics of flowshop problems: weighted total completion time, total tardiness, and makespan. In the computational analysis over realistic instances, SPEA2 and NSGA-II consistently computed the best results regarding both convergence and diversity metrics. Another relevant result is that the percentage of missing operations has a greater impact on total tardiness and weighted total completion time, whereas the makespan remained relatively unaffected.

Future work includes enlarging the computational experimentation, incorporating more instances and higher percentage probabilities of missing operations. Additionally, other MOEAs will be included in the analysis to assess their performance in this specific problem domain.