1 Introduction

Friction stir welding (FSW) is a solid-state joining process introduced by Thomas et al. (1991). It is typically used for welding soft metals such as aluminum and copper. The advantage of FSW is that welding is performed below the melting temperature, which avoids crack formation, solute redistribution and porosity in the as-welded joint (Neto and Neto 2013; Mishra and Mahoney 2007). The process uses a cylindrical tool with a profiled pin at its end. The tool is rotated and traversed at a constant speed along the joint line. This movement causes plastic deformation and material mixing of the workpiece along the weld line, which can produce an excellent welded joint (Neto and Neto 2013). Weld quality is usually evaluated through characteristics such as tensile strength, yield stress, elongation and hardness. These characteristics are controlled by a number of process parameters such as plunge depth, tool rotation speed, tool geometry, shoulder diameter, pin diameter, tool pin length and dwell time. To obtain high weld quality, it is important to select optimal process parameter settings. This selection is not easy, because the number of parameters involved is large and the relationships between them and the output parameters are nonlinear and complex. Hard computing techniques require a precise analytical model and a lot of computation time, which makes them difficult to implement, especially when coupled with optimization techniques. Therefore, soft computing techniques are preferable for the optimization of the FSW process.

Table 1 Chemical composition and mechanical properties of the base material

Recent studies have attempted to model the FSW process using artificial neural networks (ANNs). Boldsaikhan et al. (2011) studied the wormhole defect in FSW. They used the back-propagation algorithm to train an ANN model that classifies frequency patterns of the feedback forces as indicators of wormhole defects. Lakshminarayanan and Balasubramanian (2009) focused on estimating the ultimate tensile strength in FSW of an Al alloy and found that ANN modeling was more accurate than response surface methodology. Buffa et al. (2012) combined an ANN model with a finite element model (FEM) for FSW of Ti–6Al–4V alloy; the combined model could predict the microhardness as well as the microstructure of the welded joints. Okuyucu et al. (2007) used an ANN model to predict the mechanical properties of FSWed joints by considering only two input parameters (welding speed and tool rotation speed). Fratini et al. (2009) developed an ANN model coupled with an FEM model to estimate the average grain size of FSWed joints. Ghetiyaa and Patel (2014) developed an ANN model for predicting the tensile strength of an Al alloy in the FSW process. The literature thus shows that researchers have successfully used ANN models to correlate the input and output parameters of the FSW process.

Evolutionary algorithms simulate Darwin’s principle of evolution to construct powerful search and optimization algorithms. Genetic algorithms (GAs) have been used as an optimization tool in various problem domains (Deb 2001). Kennedy and Eberhart (1995) proposed an optimization algorithm to simulate the behavior of flocks with particle swarm optimization (PSO). Storn and Price (1997) proposed differential evolution (DE) algorithm which is a fast and efficient population-based optimization algorithm.

Few attempts have been made to apply evolutionary algorithms to the optimization of the FSW process. Shojaeefard et al. (2013) used a back-propagation training algorithm to develop an ANN model of the FSW process and multi-objective particle swarm optimization to optimize the mechanical properties. For the optimization problem, they considered only two inputs, namely rotational speed and welding speed, and two weld qualities, namely tensile strength and hardness of the welded joint. Tutum and Hattel (2010) addressed the multi-objective optimization of residual stresses and production efficiency in the FSW process. They developed a thermomechanical model and applied NSGA-II for optimization. Shojaeefard et al. (2014) developed an ANN model for FSW of AA5083 Al alloy using a back-propagation training algorithm and used NSGA-II to optimize two input parameters with three outputs. From the literature, it can be seen that the number of input and output parameters considered for optimization of the FSW process is small. For effective application of FSW, however, it is important to consider all significant input and output parameters in the optimization process to ensure the best weld quality.

In this work a back-propagation training algorithm is used to develop ANN models that correlate the input and output parameters of the FSW process. Four optimization techniques, namely binary-coded GA, real-coded GA, DE and PSO, are applied to the ANN models to obtain the optimum input parameter settings offline. The results obtained from these algorithms are compared to determine the best optimization algorithm. Two cases are considered: the first is the maximization of the mechanical properties of the welded joint, and the second is the determination of input parameter settings that achieve desired weld quality values. Experiments are conducted to confirm the predicted results.

2 Experimental details

2.1 Experimental approach and results

In this work commercially available 6-mm-thick aluminum plates are used for experimentation. The plates are cut and machined into rectangular pieces of \(200\times 100\) mm for joining. Butt joints are prepared by the FSW process. The chemical composition and mechanical properties of the plates are given in Table 1. The selection of an appropriate tool material for FSW is an important issue. The FSW tool should withstand the vertical pressure and torque applied to it and should not wear out easily. Stainless steel (SS-310) is used as the tool material because of its excellent properties at high temperature. A vertical milling machine is used to carry out the welding process, with the following specifications: spindle speed, 12 steps (50–1500 rpm); table feed, 8 steps (22–555 mm/min); main motor power, 5.5 kW; table motor power, 0.75 kW.

Initially, a list of all possible process parameters is prepared. Depending on the machine flexibility and setup limitations, the number is narrowed down to eight. The parameters considered in the present work are plunge depth (PD), tool rotational speed (RPM), welding speed (WS), tool geometry (TG), shoulder diameter (SD), pin diameter (PnD), tool pin length (TPL) and dwell time (DT). Four different tool geometries are used: straight cylindrical (SC), tapered cylindrical (TC), square (SQ) and threaded (THRD).

For the present investigation 59 experiments have been conducted, as shown in “Appendix, Table 11.” Of these 59 experiments, the first 32 are based on Taguchi’s \(\hbox {L}_{32}\) orthogonal array (OA) design of experiments. In the \(\hbox {L}_{32}\) OA design, plunge depth is varied at two levels and the other seven parameters are varied at four levels. Initial trial runs showed that the working range of PD is narrow, which is why it is varied at only two levels. The remaining 27 experimental data sets are based on a full factorial experimental design in which three process parameters (TG, RPM and PnD) are varied at three levels.
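The 27-run block follows directly from taking every combination of three factors at three levels. A minimal sketch of that enumeration is given below; the level labels and the function name `full_factorial` are placeholders introduced for illustration, and the actual settings are those listed in Table 11.

```python
from itertools import product

# Placeholder level labels (assumption); the real settings are in Table 11.
levels = {
    "TG": ["level_1", "level_2", "level_3"],
    "RPM": ["level_1", "level_2", "level_3"],
    "PnD": ["level_1", "level_2", "level_3"],
}

def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

runs = full_factorial(levels)
print(len(runs))  # 3 factors at 3 levels each gives 3**3 = 27 runs
```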

The plates to be joined are clamped in such a way that plate movement is completely restricted under the vertical and translational forces exerted by the tool. The tool rotation speed and the traverse speed of the bed are set before each welding run. Forty-one tools of different shapes and dimensions are fabricated in house to conduct the experiments. The welded plates are cut as per the diagram shown in Fig. 1a. The tensile specimens are then prepared as per the American Society for Testing and Materials (ASTM E8) guidelines. The tensile, bending and hardness specimens are shown in Fig. 1b–d, respectively. Tensile tests are carried out on a digitally controlled closed-loop servo-hydraulic dynamic testing machine (Make: INSTRON, Model: 8801). The measured weld quality values for each welding condition, corresponding to the parameter settings in Table 11, are given in “Appendix, Table 12.” Two types of bend tests, namely root and face bend tests, are carried out to obtain an accurate bending angle. Hardness values are measured at three layers across the material thickness, at distances of 1, 2.5 and 4 mm from the top surface of the weld. A total of 15 points are considered for nugget zone hardness, and their average value is used for analysis. The weld quality characteristics considered for optimization are ultimate tensile strength (UTS in MPa), yield strength (YS in MPa), ductility (% Elong.), bending angle (BA in degree) and nugget zone hardness (HRD in HV).

Fig. 1 Schematic diagrams of (a) position of extraction of tensile, bending and hardness specimens, (b) tensile, (c) bending and (d) microhardness specimen dimensions (all dimensions are in mm)

Table 2 Percentage contribution of each input parameter on the different weld qualities
Fig. 2 Architecture of multilayer neural network (L and M are number of input and hidden neurons, respectively)

2.2 Contribution of process parameters on the weld qualities

Analysis of variance (ANOVA) is applied to identify the effect of each input parameter on the weld qualities. The contribution of each process parameter to the different weld quality parameters is shown in Table 2. The ANOVA results show that the measured weld qualities are mostly influenced by RPM, TG and PnD. For UTS, the largest contribution is that of RPM (29.67 %). This is because RPM is responsible for overall material mixing both at the surface and along the thickness direction of the workpiece. The next most influential parameters are TG and PnD, with contributions of 21.85 and 21.07 %, respectively. These two parameters are responsible for material mixing along the thickness of the workpiece. Similarly, for YS the contributions of these three parameters are 26.34, 19.26 and 16.63 %, respectively.
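The text does not state how the percentage contributions in Table 2 were computed. A common convention in Taguchi-style ANOVA is the ratio of a factor's sum of squares to the total sum of squares, and the sketch below assumes that convention. The function name `percent_contribution` and the sample data are hypothetical and serve only to illustrate the arithmetic.

```python
def percent_contribution(factor_levels, response):
    """Return 100 * SS_factor / SS_total for a single factor.

    factor_levels[i] is the level of the factor in run i and
    response[i] is the measured quality value in that run.
    Assumes the response is not constant (SS_total > 0).
    """
    n = len(response)
    grand_mean = sum(response) / n
    ss_total = sum((y - grand_mean) ** 2 for y in response)

    # Group the responses by factor level.
    groups = {}
    for level, y in zip(factor_levels, response):
        groups.setdefault(level, []).append(y)

    # Between-level sum of squares.
    ss_factor = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
        for ys in groups.values()
    )
    return 100.0 * ss_factor / ss_total

# Hypothetical four-run example: the factor explains all of the variation.
print(percent_contribution([1, 1, 2, 2], [1.0, 1.0, 3.0, 3.0]))
```

Some ANOVA variants subtract an error term from each factor's sum of squares before taking the ratio; that correction is not included here.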

For bending test, all the good joints are bent up to \(140{^{\circ }}\) without any visible defect (or crack formation). The PnD is found to be the most important factor for bending angle as well as % elongation having contribution of 28.63 and 38.65 %, respectively. RPM and TG are found to be the next influencing factors. It is also found that RPM, TG and PnD parameters have the most significant effect on the considered weld qualities, whereas other considered parameters do not have significant effect on the weld qualities.

Table 3 Actual values, ANN predicted values and corresponding percentage errors for the testing cases

3 Modeling of weld quality characteristics using ANN

The ANN approach to modeling can be applied to experimental data without making any simplifying assumptions. A detailed description of the operating principles of ANNs can be found in Haykin (2003). A schematic representation of a fully connected multilayer neural network (MLNN) architecture is shown in Fig. 2. The network consists of three layers, namely an input layer containing the input neurons/parameters, a hidden layer containing the hidden neurons and an output layer containing the output neuron(s)/parameter(s). The input layer receives information from an external source, which is then multiplied by the interconnection weights between the input layer and the adjacent hidden layer. The products are summed and then modified by an activation function. In this case the log-sigmoid function is chosen as the activation function for both the hidden and output layers. The modified values are the outputs of the hidden layer and become the input signals for the next layer, finally reaching the output layer. The procedure terminates at the external receptor node(s).
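The forward pass described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the explicit per-neuron bias terms and the list-of-lists weight layout are choices made here, not details taken from the authors' C code.

```python
import math

def logsig(x):
    """Log-sigmoid activation: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Propagate one input pattern through a single-hidden-layer network.

    w_hidden[m][l] is the weight from input l to hidden neuron m;
    w_out[k][m] is the weight from hidden neuron m to output neuron k.
    """
    hidden = [
        logsig(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        logsig(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(w_out, b_out)
    ]

# With all weights and biases at zero every neuron outputs logsig(0) = 0.5.
print(forward([0.1, 0.9], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [[0.0, 0.0]], [0.0]))
```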

In the present work, a C code for a single-hidden-layer ANN model with multiple neurons is developed for mapping the FSW process parameters to the weld quality parameters. The model is trained in a supervised manner using the batch mode of training with the error back-propagation algorithm. It is trained on a randomly selected set of 40 input–output data pairs. Initial weight values are chosen randomly in the range \(\pm 0.9\); the bias value at the input layer is taken as zero and those for the hidden and output layers as one. All the input and output variables are normalized between 0.1 and 0.9. The training objective is to minimize the mean square error (MSE) by updating the network parameters through the gradient descent method.
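The text does not give the normalization formula. A linear min-max mapping onto [0.1, 0.9] is the usual choice and is assumed in the sketch below; the function names and the sample bounds are hypothetical.

```python
def normalize(value, lo, hi, a=0.1, b=0.9):
    """Map value from [lo, hi] linearly onto [a, b] (assumed mapping)."""
    return a + (b - a) * (value - lo) / (hi - lo)

def denormalize(scaled, lo, hi, a=0.1, b=0.9):
    """Inverse of normalize: map scaled from [a, b] back onto [lo, hi]."""
    return lo + (scaled - a) * (hi - lo) / (b - a)

# Hypothetical bounds for one parameter, for illustration only.
lo, hi = 100.0, 500.0
scaled = normalize(300.0, lo, hi)      # midpoint of the range maps to about 0.5
restored = denormalize(scaled, lo, hi)
print(scaled, restored)
```

Keeping the targets inside [0.1, 0.9] avoids the saturated ends of the log-sigmoid output, which it can approach but never reach.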

$$\begin{aligned} \hbox {MSE}(n)=\frac{1}{2PN_O }\mathop \sum \limits _{p=1}^P \mathop \sum \limits _{k=1}^{N_O } \left( {O_{Ok}^p \left( n \right) -T_k^p } \right) ^{2} \end{aligned}$$
(1)

where \(\hbox {MSE}(n)\) is the MSE at the \(n\mathrm{th}\) iteration, P is the total number of training patterns, \(N_O\) is the number of neurons in the output layer, \(O_{Ok}^p (n)\) is the output of the \(k\mathrm{th}\) output neuron for the \(p\mathrm{th}\) pattern at the \(n\mathrm{th}\) iteration and \(T_k^p\) is the desired \(k\mathrm{th}\) output for the \(p\mathrm{th}\) pattern. The performance of a neural network depends on the number of hidden neurons (NHN), the learning rate (\(\eta \)) and the momentum coefficient (\(\alpha \)). Therefore, several combinations should be tried to find an optimal one. To develop optimal ANN models for UTS, YS, % Elong., BA and HRD, NHN is varied between 5 and 30, and \(\eta \) and \(\alpha \) are each varied between 0.05 and 0.95. This process is carried out separately for each output. After training, the remaining 19 data sets are used to test the network performance. The ANN predicted values and the percentage errors in prediction are shown in Table 3. The average errors in the prediction of joint properties are within 10 %, so the developed model can be used effectively for prediction of weld quality in the FSW process.
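For reference, Eq. (1) can be evaluated over a batch of patterns as in the sketch below. The function name and the nested-list data layout are assumptions made for illustration.

```python
def mse(outputs, targets):
    """Eq. (1): mean square error over P patterns and N_O output neurons.

    outputs[p][k] is the network output and targets[p][k] the desired
    output of neuron k for pattern p.
    """
    P = len(outputs)
    N_O = len(outputs[0])
    total = sum(
        (o - t) ** 2
        for out_p, tgt_p in zip(outputs, targets)
        for o, t in zip(out_p, tgt_p)
    )
    return total / (2.0 * P * N_O)

# Two patterns, one output neuron: (1**2 + 0**2) / (2 * 2 * 1) = 0.25
print(mse([[1.0], [0.0]], [[0.0], [0.0]]))
```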

4 Optimization procedures

4.1 Genetic algorithms

GAs are search and optimization procedures that are motivated by the principles of natural evolution. In binary GA, the input parameters of the optimization problem are represented by binary strings (Deb 2001). In other words, initially a population of random strings of bits is created. To make sure that the population satisfies the problem bounds, Eq. (2) is applied

$$\begin{aligned} x_i =x_i ^{\min }+\frac{x_i ^{\max }-x_i ^{\min }}{2^{l_i }-1}\hbox {DV}\left( {s_i } \right) \end{aligned}$$
(2)

where \(l_i\) is the string length used to code the \(i\mathrm{th}\) parameter and \(\hbox {DV}({s_i })\) is the decoded value of the string \(s_i\). The string length \(l_i \) can be obtained from the following relation for a desired accuracy (\(\varepsilon \)):

$$\begin{aligned} l_i =\log _2 \left( {\frac{x_i ^{\max }-x_i ^{\min }+1}{\varepsilon }} \right) \end{aligned}$$
(3)
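Equations (2) and (3) can be illustrated with a short sketch. The function names are hypothetical, and rounding the string length up to a whole number of bits is an assumption, since Eq. (3) as printed does not specify rounding.

```python
import math

def string_length(x_min, x_max, eps):
    """Eq. (3) as printed, rounded up to a whole number of bits (assumption)."""
    return math.ceil(math.log2((x_max - x_min + 1) / eps))

def decode(bits, x_min, x_max):
    """Eq. (2): map a bit string such as '0110' to a value in [x_min, x_max]."""
    l = len(bits)
    dv = int(bits, 2)  # decoded (integer) value of the string
    return x_min + (x_max - x_min) / (2 ** l - 1) * dv

# A parameter with four discrete settings coded as 1..4 with unit accuracy
# needs two bits, and the four strings decode to 1, 2, 3 and 4.
print(string_length(1, 4, 1))
print([decode(s, 1, 4) for s in ("00", "01", "10", "11")])
```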

In order to decide the survival of each individual, the next necessary procedure is fitness evaluation, by means of the objective function and constraints. If there is absence of constraints, the fitness is made equal to the objective function value.

The objective function that is used in this work is created by using weighted sum method. The weighted sum method converts the set of objectives into one single objective by the multiplication of each objective with a specific weight which depends on the importance of the objective. The objective function considered in this work is defined as:

$$\begin{aligned} f_i= & {} 0.25\hbox {UTS}_{i} +0.25\hbox {YS}_{i} \nonumber \\&+\,0.2\hbox {Elong}_i +0.15\hbox {BA}_i +0.15\hbox {HRD}_i \end{aligned}$$
(4)

where \(f_i \) is the objective function value or the fitness of the \(i\mathrm{th}\) individual in the population; \(\hbox {UTS}_i ,\hbox {YS}_i , \hbox {Elong}_i , \hbox {BA}_i\) and \(\hbox {HRD}_i\) are the ultimate tensile strength, yield stress, % elongation, bending angle and hardness values corresponding to the \(i\mathrm{th}\) individual in the population. The flowchart of the neuro-GA model is shown in Fig. 3.
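A minimal sketch of the weighted sum in Eq. (4) follows. The dictionary layout and function name are assumptions; the sample inputs are the measured values of the confirmation experiment reported in Sect. 5.1 and are used here only to show the arithmetic.

```python
# Weights from Eq. (4).
WEIGHTS = {"UTS": 0.25, "YS": 0.25, "Elong": 0.20, "BA": 0.15, "HRD": 0.15}

def weighted_sum(responses, weights=WEIGHTS):
    """Eq. (4): combine the five responses into a single fitness value."""
    return sum(weights[name] * responses[name] for name in weights)

sample = {"UTS": 145.38, "YS": 99.25, "Elong": 19.98, "BA": 140.0, "HRD": 64.1}
print(weighted_sum(sample))  # approximately 95.77
```

Because the responses enter in their physical units, those with larger magnitudes (UTS, BA) dominate the sum regardless of the weights; this is a property of the formulation as written.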

GAs rely on three main operators: selection, crossover and mutation (Deb 2001). In this work the tournament selection method is used with a tournament size of 5. After selection of good individuals, crossover and mutation operations are performed. Crossover gives the GA its exploration ability, whereas mutation gives it its exploitation ability. In this work uniform crossover and global mutation are applied.
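Tournament selection of size 5 and uniform crossover on bit strings can be sketched as follows. The details are assumptions made for illustration: contenders are drawn without replacement, each bit position is swapped with probability 0.5, and the function names are hypothetical.

```python
import random

def tournament_select(population, fitness, size=5, rng=random):
    """Draw `size` distinct individuals and return the fittest (maximization)."""
    contenders = rng.sample(range(len(population)), size)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]

def uniform_crossover(parent_a, parent_b, rng=random):
    """Swap each bit position between the two parents with probability 0.5."""
    child_a, child_b = [], []
    for a, b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            a, b = b, a
        child_a.append(a)
        child_b.append(b)
    return "".join(child_a), "".join(child_b)

population = ["00", "01", "10", "11", "00"]
fitness = [1.0, 5.0, 3.0, 2.0, 0.5]
print(tournament_select(population, fitness))  # size equals population, so "01"
print(uniform_crossover("0000", "1111"))
```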

In real-coded GA instead of binary strings, real values are used directly. Similar operators to that in binary-coded GA can be used. In this work tournament selection, simulated binary crossover (SBX) and random mutation are performed.

Fig. 3 Flowchart of neuro-GA

4.2 Differential evolution

DE is a very simple population-based optimization technique (Storn and Price 1997). The population is composed of a set of random vectors. The basic concept of DE relies on selecting two random vectors from the population and finding the difference between them. This difference is then multiplied with a certain weight and added to a third random vector. This operation is called mutation. For all the target vectors \(x_i\), the mutant vector \(v_i \) is calculated by using Eq. (5)

$$\begin{aligned} v_i =x_{r_1 } +F\left( {x_{r_2 } -x_{r_3 } } \right) \end{aligned}$$
(5)

where \(x_{r_1 } ,x_{r_2 } \) and \(x_{r_3 } \) are randomly chosen vectors and F is a constant factor \(\in [{0,2}]\).

Another way to create the mutant vector is to replace \(x_{r_1 } \) by \(x_{\mathrm{best}} \) which is the best vector obtained so far during the evolution. And to increase diversity, two difference vectors are used in this work as it is shown in Eq. (6)

$$\begin{aligned} v_i =x_{\mathrm{best}} +F\left( {x_{r_1 } -x_{r_2 } +x_{r_3 } -x_{r_4 } } \right) \end{aligned}$$
(6)

where \(x_{r_1 } ,x_{r_2 } ,x_{r_3 } \) and \(x_{r_4 } \) are randomly chosen vectors.
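The mutation of Eq. (6) can be sketched as below. Drawing the four random indices as mutually distinct and different from the target index i is the usual DE convention and is assumed here; the function name is hypothetical.

```python
import random

def mutate_best_2(population, best, i, F, rng=random):
    """Eq. (6): v_i = x_best + F * (x_r1 - x_r2 + x_r3 - x_r4).

    population is a list of equal-length vectors (lists of floats) and
    must contain at least five members.
    """
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3, r4 = rng.sample(candidates, 4)
    return [
        b + F * (population[r1][j] - population[r2][j]
                 + population[r3][j] - population[r4][j])
        for j, b in enumerate(best)
    ]

# If all members are identical the difference terms vanish and v_i = x_best.
pop = [[1.0, 2.0] for _ in range(6)]
print(mutate_best_2(pop, best=[3.0, 4.0], i=0, F=1.5))
```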

To perform crossover, DE mixes the components of each mutant vector \(v_{i,G+1}\) with those of the corresponding target vector \(x_{i,G}\) to generate a trial vector \(u_{i,G+1}\). This process is shown in Eq. (7)

$$\begin{aligned} u_{ij,G+1} =\left\{ {{\begin{array}{ll} v_{ij,G+1} &{}\quad \hbox {if} \;r\le \textit{pc}\;\hbox {or}\;j=\delta \\ x_{ij,G} &{}\quad \hbox {if}\; r>\textit{pc}\;\hbox {and}\;j\ne \delta \\ \end{array} }} \right. \end{aligned}$$
(7)

where r is a random number \(\in [0,1]\), \(\delta \in \left\{ {1,2,\ldots ,n} \right\} \) is a randomly chosen component index, G is the generation number, and pc is the crossover probability.

The last operator in DE is greedy selection. If the trial vector \(u_{i,G+1} \) gives a better objective function value than the target vector \(x_{i,G} \), then \(x_{i,G+1} \) is set to \(u_{i,G+1} \); otherwise, the old vector \(x_{i,G} \) is retained. The flowchart of DE is shown in Fig. 4.
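The crossover and greedy selection steps can be sketched as follows. The sketch takes the mutant vector of Eqs. (5) and (6) as the donor, which is the usual reading of Eq. (7) in Storn and Price (1997), and treats the problem as a maximization. The function names are hypothetical.

```python
import random

def binomial_crossover(target, mutant, pc, rng=random):
    """Eq. (7): build a trial vector component by component.

    Component j is taken from the mutant if r <= pc or j equals the
    forced index delta; otherwise it is taken from the target.
    """
    n = len(target)
    delta = rng.randrange(n)  # guarantees at least one mutant component
    return [
        mutant[j] if (rng.random() <= pc or j == delta) else target[j]
        for j in range(n)
    ]

def greedy_select(target, trial, objective):
    """Keep the trial vector only if it improves the objective (maximization)."""
    return trial if objective(trial) > objective(target) else target

target = [0.0, 0.0, 0.0, 0.0]
mutant = [1.0, 1.0, 1.0, 1.0]
trial = binomial_crossover(target, mutant, pc=0.5)
print(trial, greedy_select(target, trial, objective=sum))
```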

Fig. 4 Flowchart of neuro-differential evolution

4.3 Particle swarm optimization

PSO is a population-based optimization algorithm. The population consists of random solutions called particles. These particles move through the search space by following the current best particles. In every iteration, each particle is updated using two best values: p-best, the best position found so far by that particle, and g-best, the best position found so far by the whole population. After finding these two values, the particle updates its velocity and position, and the process is repeated until the best solution is reached. The position and velocity of a particle are updated using Eqs. (8) and (9). The flowchart of PSO is shown in Fig. 5.

$$\begin{aligned}&\hbox {par}_i^{({t+1})}=\hbox {par}_i ^{( t)}+v_i ^{({t+1})} \end{aligned}$$
(8)
$$\begin{aligned}&v_i^{(t+1)}=w_i v_i^{(t)}+c_1 r_1 ({\hbox {pbest}^{(t)}-\hbox {par}_i ^{(t)}})\nonumber \\&\qquad \qquad +c_2 r_2 ( {\hbox {gbest}^{(t)}-\hbox {par}_i^{(t)} }) \end{aligned}$$
(9)

where \(\hbox {par}_i^{(t)} \) is the position of the \(i\mathrm{th}\) particle at the \(t\mathrm{th}\) iteration, \(v_i^{(t)} \) is the velocity of the \(i\mathrm{th}\) particle at the \(t\mathrm{th}\) iteration, \(w_i \) is the inertia weight applied to the \(i\mathrm{th}\) particle, \(c_1\) and \(c_2\) are acceleration coefficients, and \(r_1\) and \(r_2\) are random numbers \(\in [0,1]\).
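One update of a single particle according to Eqs. (8) and (9) can be sketched as below. The default values w = 0.9 and c1 = c2 = 2 are the settings reported in Sect. 5.1. Drawing r1 and r2 once per particle per iteration (rather than once per dimension) and omitting any bound handling are assumptions made for brevity; the function name is hypothetical.

```python
import random

def pso_step(position, velocity, pbest, gbest, w=0.9, c1=2.0, c2=2.0, rng=random):
    """Apply Eq. (9) and then Eq. (8) to one particle; return (position, velocity)."""
    r1, r2 = rng.random(), rng.random()
    new_velocity = [
        w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        for x, v, pb, gb in zip(position, velocity, pbest, gbest)
    ]
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity

# A particle already sitting on both best positions only keeps its inertia term.
pos, vel = pso_step([1.0], [1.0], pbest=[1.0], gbest=[1.0])
print(pos, vel)  # about [1.9] and [0.9]
```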

Fig. 5 Neuro-PSO flowchart

Table 4 Bounds and number of bits used in binary-coded GA
Table 5 Parameters of binary-coded GA computations

5 Results and discussion

5.1 Determination of optimal input parameters for maximization of weld qualities

Initial population of solutions is created randomly and normalized between [0.1 0.9] and fed to the pretrained ANN models. The response characteristics are computed inside the ANN models, denormalized and fed to the four evolutionary algorithms mentioned in Sect. 4. The objective of these optimization algorithms is to find out the optimal input parameter settings for higher weld quality characteristics and to compare the performance of each of them.

In binary GA, 2 bits are chosen to represent four tool geometries (SC, TC, SQ and THRD) and 5-bit string for the tool rotation speed. The number of bits in each string of the input parameters with accuracy \(\varepsilon =0.001\) and the bounds of the input parameters are shown in Table 4. The parameters of binary-coded GA computations are shown in Table 5.

In GA, the performance of the algorithm is influenced by the population size and by the crossover and mutation probabilities. A large population size allows better exploration of the search space and reduces the chance of getting stuck in local optima. Large crossover and small mutation rates help to maintain good convergence of the algorithm. The variations of the maximum objective function value with population size, crossover rate and mutation rate are shown in Fig. 6a–c. The figures show that a population size of more than 200, a crossover rate above 0.5 and a mutation rate between 0.1 and 0.25 give the best objective function values. The convergence of the objective function value to the optimal solution with a population size of 200, a crossover rate of 0.9 and a mutation rate of 0.2 is shown in Fig. 6d.

Fig. 6 Variation of maximum objective function value with: (a) population size, (b) crossover rate, (c) mutation rate in binary-coded GA, and (d) the convergence of the binary-coded GA

Similar analysis is done in the case of real-coded GA. The parameters of real-coded GA computations are shown in Table 6. The variations of the maximum objective function values with population sizes, crossover rates and mutation probabilities are shown in Fig. 7a–c. It can be seen that population size with 300 individuals and more, crossover rates higher than 0.8 and mutation rates between 0.15 and 0.3 are giving the best objective function values. The convergence of the objective function value to the optimal solution with 400 population size, 0.95 crossover rate and 0.2 mutation rate is shown in Fig. 7d.

Table 6 Parameters of real-coded GA computations
Fig. 7 Variation of maximum objective function value with: (a) population size, (b) crossover rate, (c) mutation rate in real-coded GA, and (d) the convergence of the real-coded GA

In PSO, the performance of the algorithm is influenced by the population size, the inertia component w and the acceleration coefficients c1 and c2. A population size of more than 150 particles, an inertia component of less than 1.5, an acceleration coefficient c1 of more than 1 and c2 between 0.5 and 3.5 are found to give the best objective function values, as shown in Fig. 8a–d. The convergence of the objective function value to the optimal solution with w = 0.9, c1 = c2 = 2, a population size of 150 and 200 iterations is shown in Fig. 8e.

Fig. 8 Variation of maximum objective function value with: (a) population size, (b) inertia component, (c) coefficient c1 and (d) coefficient c2 using PSO, and (e) the convergence of the PSO algorithm

To check the performance of the DE algorithm, the population size, the factor F and the crossover rate are varied. From Fig. 9a–c, it can be seen that a population size of more than 150, F between 0.5 and 1.7 and a crossover rate of more than 0.5 give the best objective function values. The convergence of the DE algorithm with a crossover rate of 0.5, an F factor of 1.5 and a population size of 150 is shown in Fig. 9d.

Fig. 9 Variation of maximum objective function value with: (a) population size, (b) factor F and (c) crossover rate in DE, and (d) the convergence of the DE algorithm

Table 7 Results obtained from maximization of weld quality parameters

The comparative results of all the techniques used in this study are shown in Table 7. The tabulated results show that the real-coded GA, PSO and DE algorithms give better weld characteristics than the binary-coded GA. Moreover, PSO gives a near-optimal solution with a relatively low population size and number of iterations. Approximately the same optimum parameter settings are obtained from real-coded GA, DE and PSO. It is clear from Table 7 that low values of PD, WS, PnD and DT, combined with large values of rotation speed, SD and TPL and the THRD tool, give the best weld quality. This can be explained by the excellent material mixing at the surface and through the thickness of the weld joint, together with suitable heat characteristics, which may produce high-quality joints after cooling. One confirmation experiment is conducted to check the best model predicted outputs. The optimum process parameter settings are rounded to the nearest settings available on the FSW machine. The measured weld quality values are 145.38 MPa, 99.25 MPa, 19.98 %, \(140{^{\circ }}\) and 64.1 HV for UTS, YS, % Elong., BA and HRD, respectively. The mean absolute percentage error is 9.85 %. Although this error is relatively high, the experiment gives good confirmation of the model predicted values.

Table 8 Desired weld quality parameters
Fig. 10 Objective function value versus the iteration number for (a) binary-coded GA, (b) real-coded GA, (c) DE and (d) PSO

Table 9 Optimized parameter settings with model predicted weld quality for the target value

5.2 Determination of optimum input parameters setting for desired weld quality parameters

In the previous case (Sect. 5.1), the optimum process parameter settings were determined for maximum weld quality characteristics. In addition to maximization, specific desired weld quality values are sometimes required. Therefore, to find the optimum process parameter settings for desired (target) weld quality characteristics, the following objective function is considered.

$$\begin{aligned} O_f (i)= & {} w_1 \left( {\frac{\hbox {UTS}_t -\hbox {UTS}(i)}{\hbox {UTS}_t }} \right) ^{2}+w_2 \left( {\frac{\hbox {YS}_t -\hbox {YS}(i)}{\hbox {YS}_t }} \right) ^{2}\nonumber \\&+\,w_3 \left( {\frac{\%\, \hbox {Elong}_t -\%\, \hbox {Elong}(i)}{\%\, \hbox {Elong}_t }} \right) ^{2}\nonumber \\&+\,w_4 \left( {\frac{\hbox {BA}_t -\hbox {BA}(i)}{\hbox {BA}_t }} \right) ^{2}\nonumber \\&+\,w_5 \left( {\frac{\hbox {HRD}_t -\hbox {HRD}(i)}{\hbox {HRD}_t }} \right) ^{2} \end{aligned}$$
(10)

where \(O_f (i)\) is the value of the objective function of the \(i\mathrm{th}\) individual in the population; \(\hbox {UTS}_t \) is the target (desired) value for the joint tensile strength; \(\hbox {UTS}(i)\) is the value of tensile strength of the \(i\mathrm{th}\) individual; \(\hbox {YS}_t \) is the target value for the yield stress; \(\hbox {YS}(i)\) is the value of yield stress of the \(i\mathrm{th}\) individual; \(\%\, \hbox {Elong}_t \) is the target value for elongation; \(\%\,\hbox {Elong}(i)\) is the % elongation value of the \(i\mathrm{th}\) individual; \(\hbox {BA}_t \) is the target value for the bending angle; \(\hbox {BA}(i)\) is the bending angle value of the \(i\mathrm{th}\) individual; \(\hbox {HRD}_t \) is the target value for the nugget zone hardness, \(\hbox {HRD}(i)\) is the experimental value for the nugget zone hardness of the \(i\mathrm{th}\) individual in the population; \(w_1 ,w_2 ,w_3 ,w_4 \) and \(w_5 \) are weights that give different status or importance to each response. The responses evaluated in this work do not have equal importance. The most important response is the UTS, followed by the yield stress, elongation, bending angle and hardness. The weights are 0.25 for UTS and yield stress, 0.2 for elongation and 0.15 for bending angle and nugget zone hardness.
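Equation (10) is a weighted sum of squared relative deviations from the targets and is to be minimized. A minimal sketch follows; the function name, the dictionary layout and the target values are hypothetical (the actual targets are those of Table 8), while the weights are the ones stated above.

```python
# Weights w_1..w_5 as stated in the text.
WEIGHTS = {"UTS": 0.25, "YS": 0.25, "Elong": 0.20, "BA": 0.15, "HRD": 0.15}

def target_objective(predicted, target, weights=WEIGHTS):
    """Eq. (10): weighted sum of squared relative deviations from the targets."""
    return sum(
        weights[name] * ((target[name] - predicted[name]) / target[name]) ** 2
        for name in weights
    )

# Hypothetical target values, for illustration only.
target = {"UTS": 100.0, "YS": 80.0, "Elong": 10.0, "BA": 100.0, "HRD": 50.0}
predicted = dict(target, UTS=90.0)  # UTS is 10 % below its target
print(target_objective(predicted, target))  # 0.25 * 0.1**2, about 0.0025
```

Dividing each deviation by its target makes the terms dimensionless, so the weights express relative importance directly, unlike the raw weighted sum of Eq. (4).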

In this case, one arbitrary set of target values is considered. These desired weld quality parameters are shown in Table 8. A procedure similar to that used for the first objective function (i.e., maximization of quality parameters) is followed to obtain the best parameters for each algorithm. The objective function (Eq. 10) value versus the iteration number for the four optimization techniques is shown in Fig. 10a–d. The optimum input parameter settings and the model predicted outputs corresponding to the target value are shown in Table 9. It is clear that real-coded GA, DE and PSO find approximately the same near-optimum weld characteristics, which are better than those found by binary-coded GA. Moreover, PSO converges fastest among the four algorithms, which makes it the preferred algorithm.

Table 10 Comparison between experimental and model predicted weld characteristics
Table 11 Design matrix of experimental run

5.3 Confirmation results

To check the accuracy of the modeling and optimization procedure, two confirmation experiments are conducted. The input parameters are taken from Table 9 and rounded to the nearest settings available on the FSW machine. The first experiment checks the optimal solution of binary-coded GA for the target value. The second checks the solution obtained from real-coded GA, DE and PSO. The experimental weld characteristics, the corresponding model predicted outputs and the percentage errors are shown in Table 10. The results in Table 10 show that the errors in both experiments are less than 10 %, which is acceptable. The model predicted weld quality parameters are also close to the corresponding experimental results. Moreover, the second experiment gives better results (6.6 % error), which leads to the conclusion that the results obtained from real-coded GA, PSO and DE are more accurate.

From the aforementioned case studies, PSO can be recommended as the more suitable algorithm for optimization of the FSW process. This is mainly because it has shown the ability to find the optimal solutions for both objective functions in fewer iterations (Figs. 9, 10) than the other three algorithms.

6 Conclusions

In the present work, the contribution of each FSW process parameter to the weld qualities has been studied. ANOVA has shown that the most significant parameters for UTS, YS, ductility and BA are RPM, TG and PnD, whereas TG, TPL and DT are the most significant for HRD. Modeling and optimization of the FSW process using ANN and four evolutionary algorithms have also been investigated. The search for the optimum is based on two cases. The first case is the maximization of an objective function that takes into account the joint strength, yield stress, percentage elongation, bending angle and nugget zone hardness. The second case is the determination of optimum input parameter settings for desired weld quality values. For both cases, it is found that the neuro-EA approach can be a powerful tool in welding process optimization with a relatively small experimental data set. PSO and DE are the more suitable algorithms because they are able to find the optimum values of the objective functions. Moreover, since PSO reaches the optimum solution in fewer iterations, it can be considered the best of the four optimization methods for the FSW process.