1 Introduction

Weld quality plays a major role in determining product performance in manufacturing environments. The quality of welded materials can be evaluated by means of characteristics such as ultimate tensile strength (UTS), yield stress (YS), percentage elongation (% Elng) and hardness. Friction stir welding (FSW) is a solid-state joining process that was introduced in 1991 [1]. FSW is usually applied for welding aluminum, magnesium and other soft metals. The basic concept of FSW is simple: a rotating tool composed of a shoulder and a pin at its end is inserted into the workpiece and moved along the weld line [2]. While traveling along the workpiece, the tool plastically deforms the joint material and mixes it to form a strong weld joint [3]. Welding characteristics are controlled by a number of process parameters such as plunge depth, tool rotation speed, tool geometry, shoulder diameter, pin diameter, tool pin length, dwell time and welding speed. The advantage of FSW over fusion welding processes is that it does not involve melting of the material, which results in fewer weld cracks and defects. Moreover, it does not need shielding gas, electrodes or filler material and produces less distortion in the welded joints [2, 3].

In the FSW process, the optimal parameter settings are difficult to determine because the number of process parameters is large and the relationships among them are nonlinear, highly complex and interdependent. Because existing mathematical models describe the nonlinear characteristics of the FSW process inefficiently, intelligent systems such as artificial neural networks (ANNs) come into the picture. ANNs are powerful tools for correlating the input and output parameters of the FSW process when compared with other modeling techniques such as regression analysis and analytical and numerical techniques. For modeling of weld quality, different types of ANNs can be used, namely back-propagation neural networks (BPNNs) and radial basis function (RBF) networks. Boldsaikhan et al. [4] used the back-propagation algorithm to train an ANN model that classifies the frequency patterns of feedback forces in the FSW process for detection of wormhole defects. Lakshminarayanan and Balasubramanian [5] compared ANN modeling with response surface methodology for prediction of the ultimate tensile strength of friction stir (FS) welded Al alloy and concluded that the ANN performed better. Buffa et al. [6] developed an ANN model using the back-propagation training algorithm and combined it with a finite element model (FEM) for FSW of Ti–6Al–4V alloy; the model estimated the microhardness and microstructure of the weld. Okuyucu et al. [7] used an ANN model to predict the mechanical properties of FS welded Al plates. Fratini et al. [8] used ANN and FEM models for prediction of the average grain size of FS welded Al alloys. Ghetiya and Patel [9] developed an ANN model for estimating the tensile strength of Al alloy in the FSW process. Asadi et al. [10] developed a BPNN that accurately estimated both grain size and hardness in an AZ91/SiC nanocomposite. In related work, Akbari et al. [11] described the use of ANNs and evolutionary algorithms (EAs) for estimating and optimizing the properties of FS welded plates. The literature thus shows that various researchers have successfully used ANN models to correlate the input and output relationships in the FSW process.

Multi-objective optimization (MOO) problems are common in engineering. Classical approaches to MOO, such as the weighted sum and weighted metric methods combined with single-objective EAs, have been applied [12], but they suffer from several difficulties: they convert the MOO problem into a single-objective problem, they require good knowledge of the problem, and they may not guarantee a good distribution of solutions. To overcome those difficulties, multi-objective EAs (MOEAs) have been developed [13]. Many successful MOEAs have been proposed, such as the elitist non-dominated sorting genetic algorithm (NSGA-II) [14], multi-objective particle swarm optimization (MOPSO) [15] and differential evolution for multi-objective optimization (DEMO) [16]. These MOEAs share the goal of finding a uniformly distributed Pareto optimal front for the problem. They differ in the criteria by which non-dominated solutions are chosen and in how they balance exploration and exploitation of the search space. EAs have been applied successfully to the FSW process. Shojaeefard et al. [17] used a BPNN to model the FSW process and MOPSO to obtain optimum mechanical properties. Two inputs, namely rotational speed and welding speed, and two outputs, namely tensile strength and hardness of the welded joint, were considered in the optimization problem. To determine the best compromise solution, the technique for order preference by similarity to the ideal solution (TOPSIS) was applied. Tutum and Hattel [18] developed a thermo-mechanical model of the FSW process and applied NSGA-II to optimize residual stresses in the welded joint and production efficiency. Shojaeefard et al. [19] used a BPNN for modeling FSW of AA5083 aluminum alloy. The inputs were the rotational and welding speeds, and the outputs were welding force, peak temperature and heat-affected zone width. For optimization, NSGA-II was applied, and TOPSIS was used to find the best compromise solution. The literature indicates that DEMO has not yet been tested for optimization of manufacturing or welding processes, although it has great potential. Moreover, the numbers of input and output parameters considered in FSW optimization problems have been small. Therefore, it is necessary to apply DEMO and compare its results with those obtained from other algorithms such as NSGA-II, and to consider more input and output parameters within the optimization process to ensure the best weld quality.

In this work, an experimental study of the FSW process is conducted using Taguchi and full factorial designs of experiments. Then, the contribution of the various FSW process parameters to the weld qualities is investigated. Subsequently, ANN models are developed using the back-propagation training algorithm for modeling the FSW process. Thereafter, two multi-objective optimization methods, namely NSGA-II and DEMO, are employed for optimization of the FSW process. The objective is to find the optimal process parameter settings corresponding to maximum weld quality and to compare the performance of NSGA-II and DEMO. Finally, TOPSIS is used to find the best compromise solutions, and a confirmation experiment is conducted accordingly.

2 Experimental details

2.1 Experimental approach and results

In the current work, 6-mm-thick aluminum plates (1100 Al alloy) are used for the experiments. The plates are cut into rectangular pieces of 200 × 100 mm and joined as butt joints by the FSW process. It is important to select an FSW tool material that resists wear and can withstand the vertical pressure and torque applied to it. For the present work, stainless steel (SS-310) is used as the tool material because of its excellent high-temperature properties. A vertical milling machine is used to carry out the welding operations, with the following specifications: spindle speed, 12 steps (50–1500 rpm); table feed, 8 steps (22–555 mm/min); main motor power, 5.5 kW; table motor power, 0.75 kW.

The parameters used in the present work are plunge depth (PD), tool rotational speed (RPM), welding speed (WS), tool geometry (TG: straight cylindrical (SC), tapered cylindrical (TC), square (SQ) or threaded (THD)), shoulder diameter (SD), pin diameter (PnD), tool pin length (TPL) and dwell time (DT). A total of 59 experiments are conducted by varying these eight input parameters. The first experimental set is designed using Taguchi’s L32 orthogonal array, in which plunge depth is varied at two levels because of its small working range and the remaining parameters are varied at four levels. The second experimental set is based on a full factorial design in which TG, RPM and PnD are varied at three levels, which gives 3 × 3 × 3 = 27 runs and, together with the 32 Taguchi runs, accounts for the 59 experiments. The parameter settings are shown in Table 1. For the tapered cylindrical tool, a taper angle of 10° is used, and for the threaded tool, a pitch of 1 mm is used.

Table 1 Full experimental input parameter settings
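As an illustration of how the full factorial set can be enumerated, the following minimal Python sketch lists every combination of three factors at three levels. The level values are hypothetical placeholders (the actual settings are those of Table 1), and `full_factorial` is an illustrative helper, not part of the original study.

```python
from itertools import product

# Hypothetical three-level settings; the real levels are those listed in Table 1.
levels = {
    "TG": ["SC", "TC", "SQ"],
    "RPM": [600, 1000, 1500],
    "PnD": [5, 6, 8],
}

def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

runs = full_factorial(levels)
print(len(runs))   # three factors at three levels give 27 runs
print(runs[0])
```

Each dictionary in `runs` describes one welding condition; the remaining five parameters would be held at fixed values chosen by the experimenter.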

Once the welding is complete, specimens are prepared for measuring the weld quality characteristics. The tensile specimens are prepared as per the ASTM E8 guidelines of the American Society for Testing and Materials (ASTM). The tensile, bending and hardness specimens are shown in Fig. 1a–c, respectively. Tensile tests are carried out in a digitally controlled closed-loop servo-hydraulic dynamic testing machine (Make: INSTRON, Model 8801) with a capacity of 10 tons (100 kN). The experimental weld qualities corresponding to each welding parameter setting in Table 1 are given in Table 2. For the bending test, both root and face bend tests are carried out to obtain an accurate bending angle. The hardness values are measured with a Vickers microhardness indentation machine (Make: Omni Tech) at a load of 500 g applied for 10 s. The output responses considered for optimization are ultimate tensile strength (UTS in MPa), yield strength (YS in MPa), ductility (% Elng), bending angle (BA in °) and nugget zone hardness (HRD in HV).

Fig. 1 a Tensile, b bending and c microhardness specimens (dimensions are in mm)

Table 2 Experimental results corresponding to various welding conditions in Table 1

2.2 Effect of process parameters on the weld qualities

To analyze the significance and contribution of each parameter to the weld qualities, analysis of variance (ANOVA) is carried out. The percentage influence of the considered process variables on the measured outputs is shown in Table 3. The measured weld characteristics are significantly influenced by RPM, TG and PnD. Because RPM governs overall material mixing both at the surface and through the thickness of the workpiece, it is the most influential factor for UTS, with a contribution of 29.67%. TG and PnD govern material mixing along the thickness direction and are the next most influential factors, with contributions of 21.85 and 21.07%, respectively.

Table 3 Percentage contribution on weld quality parameters for each input parameter
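The percentage contributions in Table 3 can be understood through the sum-of-squares decomposition used in ANOVA. The sketch below assumes the common definition, contribution = 100 × SS_factor / SS_total, which may differ from the exact variant used in this study (for example, if error pooling is applied). The data and the helper name `percent_contribution` are hypothetical.

```python
from collections import defaultdict

def percent_contribution(factor_levels, response):
    """Share (in %) of the total sum of squares explained by one factor."""
    grand_mean = sum(response) / len(response)
    ss_total = sum((y - grand_mean) ** 2 for y in response)
    groups = defaultdict(list)
    for level, y in zip(factor_levels, response):
        groups[level].append(y)
    # Between-level sum of squares: group size times squared mean offset.
    ss_factor = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                    for ys in groups.values())
    return 100.0 * ss_factor / ss_total

# Hypothetical four-run example with two factors at two levels each.
rpm_levels = ["low", "low", "high", "high"]
tg_levels = ["SC", "TC", "SC", "TC"]
uts = [10.0, 12.0, 20.0, 22.0]

rpm_share = percent_contribution(rpm_levels, uts)
tg_share = percent_contribution(tg_levels, uts)
print(round(rpm_share, 2), round(tg_share, 2))
```

In this toy data set the first factor explains almost all of the variation, which is the kind of ranking that Table 3 reports for the real experiments.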

Bending angle is measured at the time of visible crack initiation. All the good joints are bent up to an angle of 140° without any crack. PnD is the most influential parameter for both BA and % Elng, with contributions of 28.63 and 38.65%, respectively. RPM and TG are the next most influential factors for BA and % Elng. PD, TPL and WS do not have a significant effect on the weld qualities. Overall, RPM, TG and PnD are the most prominent parameters affecting weld quality in the FSW process.

3 Prediction of weld quality using BPNN models

ANN modeling can be conducted using experimental data. A detailed description of the principles of multilayer neural networks and the back-propagation training algorithm can be found in [20]. A schematic diagram of the proposed ANN architecture is shown in Fig. 2. The network is made up of three layers, namely the input layer, the hidden layer and the output layer. In the present work, each neuron of a layer is connected to all the neurons in the adjacent layer. The input neurons receive information from an external source, together with an appropriate bias, and each input is multiplied by the weight of its connection to a hidden neuron. In each hidden neuron, the sum of these products is passed through an activation function, here the log-sigmoid function. The outputs of the hidden neurons are then multiplied by the connection weights between the hidden and output neurons, and the sum of these products is passed through the activation function of the output layer, which is also the log-sigmoid function. The resulting values are taken as the output of the ANN model.

Fig. 2 The proposed ANN architecture
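The forward computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' source code: the function names, the hidden-layer size of 5 and the input pattern are hypothetical, and adding the bias to each neuron's weighted sum is an assumption about how the bias enters the network.

```python
import math
import random

def logsig(x):
    """Log-sigmoid activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Propagate one normalized input pattern through a single-hidden-layer network."""
    hidden = [logsig(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [logsig(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]

rng = random.Random(0)
n_in, n_hidden, n_out = 8, 5, 1   # 8 process parameters; hidden size is a placeholder
w_hidden = [[rng.uniform(-0.9, 0.9) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [rng.uniform(-0.9, 0.9) for _ in range(n_hidden)]
w_out = [[rng.uniform(-0.9, 0.9) for _ in range(n_hidden)] for _ in range(n_out)]
b_out = [rng.uniform(-0.9, 0.9) for _ in range(n_out)]

pattern = [0.5] * n_in            # one normalized input pattern (hypothetical)
prediction = forward(pattern, w_hidden, b_hidden, w_out, b_out)
print(prediction)
```

Because the output activation is also a log-sigmoid, every prediction lies strictly between 0 and 1 and has to be mapped back to physical units afterwards.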

In this work, source code for an ANN model with a single hidden layer of multiple neurons has been developed for correlating the FSW process parameters with the weld quality parameters. The ANN models are trained in a supervised manner using the batch mode of training and the back-propagation algorithm. Training uses 40 randomly selected input–output data pairs from the total of 59 experiments. The remaining 19 pairs are divided into a validation set of 9 and a testing set of 10. The purpose of validation is to prevent overtraining: the training and validation errors are monitored, and training is stopped when the best agreement between them is reached. Initial weight values are chosen randomly between −0.9 and 0.9. All the input and output data are normalized between 0.1 and 0.9. The objective of the training process is to minimize the mean square error (MSE) by updating the network parameters through the gradient descent method.
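The data preparation described in this paragraph can be sketched as follows. The sketch assumes linear min-max scaling for the normalization to [0.1, 0.9], which the text does not state explicitly; `normalize` and `split_indices` are hypothetical helper names, and the seed is arbitrary.

```python
import random

def normalize(values, lo=0.1, hi=0.9):
    """Linearly rescale a list of values into [lo, hi] (assumes max > min)."""
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]

def split_indices(n_total=59, n_train=40, n_val=9, seed=0):
    """Randomly split experiment indices into training, validation and testing sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

print(normalize([600, 1050, 1500]))        # lowest value maps to 0.1, highest to 0.9
train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))
```

Keeping the scaled values away from 0 and 1 is a common choice with log-sigmoid outputs, because the activation only approaches those limits asymptotically.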

$${\text{MSE}}\left( i \right) = \frac{1}{2PN}\sum\limits_{p = 1}^{P} \sum\limits_{k = 1}^{N} \left( {O_{Ok}^{p} \left( i \right) - T_{k}^{p} } \right)^{2} ,$$
(1)

where \({\text{MSE}} \left( i \right)\) is the MSE at the ith iteration, P is the total number of training patterns, N is the number of neurons in the output layer, \(O_{Ok}^{p} \left( i \right)\) is the output of the kth output neuron for the pth pattern at the ith iteration and \(T_{k}^{p}\) is the desired kth output for the pth pattern. The performance of a neural network depends on the number of hidden neurons (NHN), the learning rate (η) and the momentum coefficient (α). Therefore, several combinations should be tried in order to choose an optimal one. The considered outputs are UTS, YS, % Elng, BA and HRD. The number of hidden neurons is varied within a range of 5–30, and η and α are each varied within a range of 0.05–0.95. This process is carried out separately for each output. After training, the testing data set is used to test the network performance. The optimum ANN architecture, learning rate and momentum coefficient for the five ANN models are shown in Table 4. The ANN predicted values and the percentage errors in the outputs (UTS, YS, % Elng, BA and HRD) are shown in Table 5. The average errors of the ANN models in predicting the joint properties are within ±10%, so the developed models can be used effectively for prediction of weld quality in the FSW process.
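As a small worked illustration of Eq. 1 and of the roles of η and α, the following Python sketch computes the batch MSE and applies one weight update. The update rule shown is the standard gradient descent step with momentum, which is assumed here rather than taken from the authors' code; the function names and numbers are hypothetical.

```python
def mse(outputs, targets):
    """Batch mean square error of Eq. 1: squared errors summed over patterns and
    output neurons, divided by 2 * P * N."""
    n_patterns = len(targets)
    n_outputs = len(targets[0])
    total = sum((o - t) ** 2
                for out_p, tgt_p in zip(outputs, targets)
                for o, t in zip(out_p, tgt_p))
    return total / (2 * n_patterns * n_outputs)

def momentum_update(weight, gradient, previous_delta, eta, alpha):
    """One gradient descent step with momentum (assumed standard form)."""
    delta = -eta * gradient + alpha * previous_delta
    return weight + delta, delta

# Two patterns, one output neuron each (hypothetical normalized values).
error = mse([[0.5], [0.7]], [[0.4], [0.9]])
print(error)

new_weight, delta = momentum_update(0.5, gradient=0.2, previous_delta=0.1,
                                    eta=0.5, alpha=0.9)
print(new_weight, delta)
```

A larger η speeds up learning at the risk of oscillation, while α reuses part of the previous step to smooth the trajectory, which is why the two are tuned together with the number of hidden neurons.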

Table 4 The best ANN architecture and parameters for all models
Table 5 The ANN predicted values and percentage errors in the outputs

4 Multi-objective optimization

Multi-objective optimization problems consist of two or more objectives, which usually conflict, so it is generally not possible to optimize all objectives simultaneously. To solve such problems, the concept of “non-dominance” is used. A solution is non-dominated if no other solution in the solution space is at least as good in every objective and strictly better in at least one. The set of optimal non-dominated solutions is called the “Pareto optimal front”; in other words, it is the best set of solutions that can be obtained from the multi-objective optimization problem. To optimize the FSW process parameters, two multi-objective EAs are applied. Each of the suggested methods is composed of two stages: first, a Pareto front is generated by NSGA-II or DEMO; then, to obtain the best compromise solutions from the Pareto front, TOPSIS, a multi-attribute decision-making technique proposed by Hwang and Yoon [21], is applied.
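The notion of non-dominance can be made concrete with a short Python sketch for maximization problems. The objective values are hypothetical (UTS, HRD) pairs chosen only for illustration, and the function names are not taken from the original study.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def non_dominated(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (UTS in MPa, HRD in HV) pairs.
candidates = [(140, 62), (145, 58), (138, 60), (142, 61)]
front = non_dominated(candidates)
print(front)
```

Here (138, 60) is dropped because (140, 62) is better in both objectives, while the other three remain because each trades strength against hardness.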

4.1 Elitist non-dominated sorting genetic algorithm (NSGA-II)

NSGA-II is a multi-objective optimization algorithm proposed by Deb et al. [14]. It incorporates non-dominated sorting and a crowding distance metric to generate a uniformly distributed Pareto optimal front. A detailed description of the algorithm is available in [13, 14]. A flowchart of the combination of the pre-trained ANN models and NSGA-II is shown in Fig. 3.

Fig. 3 Neuro-NSGA-II flowchart for optimization of FSW process

4.2 Differential evolution for multi-objective

DEMO is a multi-objective optimization algorithm proposed by Robič and Filipič [16]. It combines the advantages of differential evolution (DE) with the mechanisms of non-dominated sorting and the crowding distance metric to create a powerful MOO algorithm. DEMO proceeds through the following steps, and a complete flowchart of the Neuro-DEMO procedure is shown in Fig. 4.

Fig. 4 Neuro-DEMO flowchart for optimization of FSW process

  • Step 1. Create initial population P of random individuals.

  • Step 2. While stopping criteria are not satisfied, do:

    • Create mutant vector \(V_{i}^{t + 1} = x_{r1} + F \cdot \left( {x_{r2} - x_{r3} } \right)\) for each \(i = 1 \ldots N\), where N is the population size, \(x_{r1}\), \(x_{r2}\) and \(x_{r3}\) are randomly selected individuals and F is a real constant factor, \(F \in \left[ {0, 2} \right]\).

    • Evaluate the mutant vector.

    • If the mutant vector dominates the parent, it replaces the parent. If the parent dominates the mutant vector, then the mutant vector is discarded. Otherwise the mutant vector is added to the population.

  • Step 3. If the new population has more individuals than parent population, truncate it.

The truncation procedure is composed of two steps: first, the vectors of the extended population are sorted with the non-dominated sorting method; then, the sorted vectors are evaluated by means of the crowding distance. This procedure helps to preserve elitism and to obtain a uniformly distributed Pareto optimal front.
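The steps above can be sketched as one DEMO generation in Python. This is a simplified illustration under stated assumptions: it follows the steps as listed (mutation only, with any crossover omitted), clamps mutants to the variable bounds, uses a simple quadratic-time front peeling, and is exercised on toy objectives instead of the ANN models. All function names are hypothetical.

```python
import random

def dominates(a, b):
    """Maximization: a is no worse everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def sort_fronts(objs):
    """Split indices into successive non-dominated fronts by repeated peeling."""
    remaining, fronts = list(range(len(objs))), []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def crowding(objs, front):
    """Crowding distance of every index in one front."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        order = sorted(front, key=lambda i: objs[i][m])
        span = objs[order[-1]][m] - objs[order[0]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if span == 0:
            continue
        for k in range(1, len(order) - 1):
            dist[order[k]] += (objs[order[k + 1]][m] - objs[order[k - 1]][m]) / span
    return dist

def truncate(pop, objs, size):
    """Keep whole fronts first, then the least crowded members of the last front."""
    keep = []
    for front in sort_fronts(objs):
        if len(keep) + len(front) > size:
            d = crowding(objs, front)
            front = sorted(front, key=lambda i: d[i], reverse=True)[:size - len(keep)]
        keep.extend(front)
        if len(keep) == size:
            break
    return [pop[i] for i in keep], [objs[i] for i in keep]

def demo_generation(pop, objs, evaluate, bounds, F=0.9, rng=random):
    """One generation: mutate, compare with the parent, then truncate."""
    size = len(pop)
    pop, objs = list(pop), list(objs)
    for i in range(size):
        r1, r2, r3 = rng.sample([j for j in range(size) if j != i], 3)
        mutant = [min(max(a + F * (b - c), lo), hi)   # clamping is an assumption
                  for a, b, c, (lo, hi) in zip(pop[r1], pop[r2], pop[r3], bounds)]
        f = evaluate(mutant)
        if dominates(f, objs[i]):          # mutant replaces the parent
            pop[i], objs[i] = mutant, f
        elif not dominates(objs[i], f):    # mutually non-dominated: keep both
            pop.append(mutant)
            objs.append(f)
    return truncate(pop, objs, size)

rng = random.Random(1)
bounds = [(0.0, 1.0), (0.0, 1.0)]
evaluate = lambda x: (x[0] - x[1], 1.0 - x[0] ** 2 - x[1])  # toy objectives, maximized
pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(20)]
objs = [evaluate(x) for x in pop]
for _ in range(30):
    pop, objs = demo_generation(pop, objs, evaluate, bounds, F=0.9, rng=rng)
print(len(pop))
```

In the Neuro-DEMO setting, `evaluate` would call the five pre-trained ANN models, and the tool geometry would need a suitable numeric encoding, which is outside the scope of this sketch.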

4.3 Multi-objective optimization of FSW process parameters

The optimization procedure starts by randomly creating an initial population of solutions inside the experimental search space. The population is then fed to the pre-trained ANN models, which compute the response characteristics and pass them to the NSGA-II and DEMO algorithms. In each algorithm, various operators are used to generate a new population. The new population is again fed to the ANN models, and the response characteristics are again computed and passed to each algorithm. This process continues until the optimal quality characteristics are obtained. The objective of the optimization procedure is to maximize the weld quality characteristics, as expressed in the following equations:

$${\text{Max}}.\; f\left( {\text{UTS, YS, \% Elng, BA, HRD}} \right)$$
(2)
$${\text{Subject to:}}\; \left\{ {\begin{array}{l} {0.09 \le {\text{PD}} \le 0.15\;{\text{mm}}} \\ {600 \le {\text{RPM}} \le 1500\;{\text{rpm}}} \\ {63 \le {\text{WS}} \le 200\;{\text{mm/min}}} \\ {{\text{TG}} \in \left\{ {\text{SC, TC, SQ, THD}} \right\}} \\ {20 \le {\text{SD}} \le 35\;{\text{mm}}} \\ {5 \le {\text{PnD}} \le 8\;{\text{mm}}} \\ {5.6 \le {\text{TPL}} \le 5.8\;{\text{mm}}} \\ {10 \le {\text{DT}} \le 25\;{\text{s}}} \\ \end{array} } \right.$$
(3)
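One possible way to encode the search space of Eq. 3 and to draw the random initial population is sketched below. The numeric bounds and geometry codes are taken from the text, while the dictionary layout, the uniform sampling and the function names are assumptions made for illustration.

```python
import random

# Continuous bounds from Eq. 3 (lower, upper).
BOUNDS = {
    "PD": (0.09, 0.15),
    "RPM": (600, 1500),
    "WS": (63, 200),
    "SD": (20, 35),
    "PnD": (5, 8),
    "TPL": (5.6, 5.8),
    "DT": (10, 25),
}
TOOL_GEOMETRIES = ["SC", "TC", "SQ", "THD"]   # categorical variable TG

def random_individual(rng):
    """Draw one candidate parameter setting uniformly inside the search space."""
    individual = {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}
    individual["TG"] = rng.choice(TOOL_GEOMETRIES)
    return individual

def is_feasible(individual):
    """Check that every variable respects the constraints."""
    in_bounds = all(lo <= individual[name] <= hi for name, (lo, hi) in BOUNDS.items())
    return in_bounds and individual["TG"] in TOOL_GEOMETRIES

rng = random.Random(42)
population = [random_individual(rng) for _ in range(100)]   # population size of 100
print(all(is_feasible(ind) for ind in population))
```

Each individual would then be normalized and passed to the trained ANN models to obtain the five objective values.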

The parameters used for NSGA-II are a population size of 100, tournament selection with a tournament size of 5, simulated binary crossover with a crossover rate of 0.9 and random mutation with a mutation rate of 0.1. DEMO is run with a population size of 100 and an F factor of 0.9. The two algorithms are run for 500, 1000, 1500 and 2000 iterations. After the non-dominated solutions are obtained, TOPSIS is applied to select the best solution among them. The results show that DEMO finds the optimal solutions within 500 iterations, whereas NSGA-II needs 2000 iterations. The Pareto fronts in 2D planes obtained from Neuro-NSGA-II and Neuro-DEMO are shown in Fig. 5. Twenty Pareto fronts are obtained in total, one for each of the ten pairs of the five objectives and each of the two optimization techniques; for representation purposes, only six fronts are shown. The figure shows that the optimal fronts achieved by the Neuro-DEMO framework outperform those obtained from the Neuro-NSGA-II paradigm in most cases, with a better distribution and uniformity of Pareto solutions when the objectives are compared in 2D planes. The minimum, mean and maximum values of the overall Pareto solutions for both Neuro-NSGA-II and Neuro-DEMO are given in Table 6. Table 6 shows that the results produced by Neuro-DEMO are more reliable than those of Neuro-NSGA-II for the tensile properties, whereas both generate almost similar results for BA and HRD. In addition, Neuro-DEMO finds more accurate results in less than a quarter of the computational time required by Neuro-NSGA-II. For those reasons, the Neuro-DEMO framework can be recommended as more efficient and reliable than Neuro-NSGA-II for optimization of FSW process parameters. Furthermore, Neuro-DEMO can be suggested for optimizing other welding processes.

Fig. 5 Representative Pareto fronts in 2D planes resulting from: a–c Neuro-NSGA-II and d–f Neuro-DEMO

Table 6 Comparison of results obtained from Neuro-NSGA-II and Neuro-DEMO

The four best solutions produced by the proposed procedure are shown in Table 7. The table shows that the predicted weld quality characteristics obtained from DEMO are better than those from NSGA-II, and that DEMO reaches the target solutions with fewer iterations and less computational time. This is due to the combination of the powerful DE search scheme with the non-dominated sorting method.

Table 7 Four best solutions obtained from the hybrid NSGA-II and DEMO with TOPSIS for five objectives
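The TOPSIS ranking used to pick the best compromise solutions can be sketched as follows. The sketch implements the usual TOPSIS steps (vector normalization, weighting, distances to the ideal and anti-ideal points, closeness coefficient) and assumes that all criteria are to be maximized with equal weights, since the weights used in the study are not stated here. The candidate values are hypothetical.

```python
import math

def topsis(matrix, weights=None):
    """Return the closeness coefficient of each alternative (row).
    All criteria (columns) are treated as benefits to be maximized."""
    n_crit = len(matrix[0])
    weights = weights or [1.0 / n_crit] * n_crit     # equal weights are an assumption
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    ideal = [max(col) for col in zip(*weighted)]
    anti_ideal = [min(col) for col in zip(*weighted)]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, ideal)))
        d_worst = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, anti_ideal)))
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical Pareto solutions described by (UTS in MPa, HRD in HV).
pareto = [(140, 62), (145, 58), (142, 61)]
scores = topsis(pareto)
best = max(range(len(pareto)), key=lambda i: scores[i])
print(scores, pareto[best])
```

The alternative with the largest closeness coefficient is the one nearest to the ideal point and farthest from the anti-ideal point; the function assumes that the alternatives are not all identical, otherwise both distances are zero.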

4.4 Confirmation experiment

One confirmation experiment is conducted to validate the best weld qualities predicted by DEMO. The optimum process parameter settings corresponding to solution 1 of DEMO in Table 7 are rounded to the nearest settings available on the FSW machine, and the experiment is conducted with them. The measured weld quality values are 145.38 MPa, 99.25 MPa, 19.98%, 140° and 64.1 HV for UTS, YS, % Elng, BA and HRD, respectively. The mean absolute percentage error is 7.4%, which indicates good agreement between the simulated and experimental weld characteristics and suggests that the Neuro-DEMO framework is an efficient and well-performing technique for modeling and MOO of the FSW process. Moreover, it can be extended to other welding processes.
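For reference, the mean absolute percentage error can be computed as in the short Python sketch below. It assumes that each error is taken relative to the measured value, which the text does not specify. The measured values are those reported above; the predicted values belong to Table 7 and are therefore left as an argument.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, taken relative to the measured values."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)

# Measured confirmation values from the text: UTS, YS, % Elng, BA, HRD.
measured = [145.38, 99.25, 19.98, 140.0, 64.1]

# Synthetic check with made-up numbers: errors of 10% and 10% average to 10%.
print(mape([100.0, 200.0], [110.0, 180.0]))
```

Calling `mape(measured, predicted)` with the five DEMO predictions of solution 1 would give the single figure used to judge the agreement.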

5 Conclusion

In this work, optimization of the FSW process using Neuro-NSGA-II and Neuro-DEMO has been investigated. ANN models have been used to predict the weld quality characteristics before the optimization procedure. The optimization problem considers eight inputs and five weld quality characteristics: ultimate tensile strength, yield strength, percentage elongation, bending angle and nugget zone hardness. The results show that the Neuro-DEMO paradigm is able to find the optimum parameter settings of the FSW process efficiently and robustly. Furthermore, the predicted optimal weld qualities obtained from Neuro-DEMO are more accurate than those from Neuro-NSGA-II. The confirmation experiment indicates that the proposed Neuro-DEMO approach is a good tool for optimization of the FSW process, and the approach can also be used for optimization of other welding processes.