1 Introduction

Friction stir welding (FSW) is a solid-state welding process developed by The Welding Institute (UK) in 1991. It is increasingly used for joining aluminum alloys for which conventional fusion welding is problematic because of welding defects such as hot cracking, distortion and loss of work hardening [1, 2]. In essence, a non-consumable rotating tool, harder than the base material, is plunged into the adjoining edges of the plates to be joined under adequate axial force and advanced along the line of the joint [3]. The tool consists of two parts, namely the shoulder and the pin. The material around the tool pin is softened by the frictional heat generated by the tool rotation. As the tool advances, it transports plastically deformed material from the front of the rotating tool to the back, where forging occurs, and the tool traverses the weld line at the welding speed to complete the joint. Since the material subjected to FSW does not melt, the resultant weld offers advantages over conventional fusion welds, such as less distortion, lower residual stresses and fewer weld defects [4–7]. It also has the important added benefit that joining of dissimilar aluminum alloys is possible [8–11]. Generally, a welding process is optimized through trial and error, which is costly. To avoid this drawback, numerical modeling has recently been introduced and has shown that significant reductions in the cost of both the design and the analysis of welding operations are possible [12].

Artificial neural network (ANN) techniques may be utilized as alternative methods for modeling materials and their processing techniques [12]. An ANN "learns" from examples and recognizes patterns in a series of input and output values without any prior assumptions about their nature. Moreover, an ANN does not integrate any physical information about the process criteria in its model and, therefore, has the ability to predict the strength of the welded joints by incorporating different process parameters. During modeling, an ANN also has the added benefit that it retains significant data in memory that may be adjusted as new experimental data become available [13]. Chang and Na [14] developed a combined model of finite element analysis and neural networks that can be effectively applied to predict laser spot-welded bead shapes in stainless steel welded with and without a gap. Dutta and Pratihar [15] modeled the gas tungsten arc welding process using conventional regression analysis and neural network-based approaches and found that the ANN outperformed the regression analysis approach. Anand et al. [16] carried out a comparative study of friction welding of Incoloy 800H using response surface methodology (RSM) and ANN. They reported that the ANN solutions were closer to the actual values than those of RSM. Varol et al. [17] used an ANN to predict the densification behavior of Al–Cu–Mg/B4C metal matrix composites. They reported that ANN is an alternative tool for evaluating the density and porosity values of synthesized metal matrix composites. Asadi et al. [18] used an ANN to evaluate the grain size and hardness of friction stir processed (FSP) AZ91/SiC nanocomposites. They concluded that ANN can be used as an alternative technique for analyzing the relationship between input parameters and outputs (responses). Okuyucu et al. [19] demonstrated the possibility of utilizing neural network techniques to compute the mechanical properties of friction stir-welded aluminum plates from process parameters such as rotational speed and welding speed. Lakshminarayanan and Balasubramanian [20] used the back propagation algorithm with a single hidden layer, improved with numerical optimization techniques, to predict the tensile strength of friction stir-welded AA7039. The predictive ANN model was found to improve the prediction of tensile strength within the trained range. They concluded that the ANN model is more robust and accurate in estimating the tensile strength than the response surface model.

Hence, the present work was conducted to establish an appropriate ANN model to predict the strength of friction stir-welded joints. The tool pin profile, tool rotational speed, welding speed and axial force are used as inputs of the network, whereas the ultimate tensile strength (UTS) is the output. Finally, the suitability of the proposed model is assessed using standard statistical parameters.

2 Experimental procedure

Five different tool pin profiles, namely straight square (SS), tapered square (TS), straight hexagon (SH), straight octagon (SO) and tapered octagon (TO), all without draft, were manufactured using a CNC turning center and a wire-cut electrical discharge machine to obtain accurate profiles. All the tools were made of high carbon high chromium steel (D2 tool steel) and had a shoulder diameter of 18 mm, a pin diameter of 6 mm and a pin length of 5.6 mm. The shoulder surface in contact with the workpiece had three equally spaced concentric circular slots 2 mm deep. The manufactured tools are shown in Fig. 1. The composition of the tool material is shown in Table 1. Specimens of 100 mm × 50 mm × 6 mm were prepared from rolled plate stock. AA6351-T6 was placed on the advancing side and AA5083-H111 on the retreating side of the tool. The chemical compositions and mechanical properties of AA6351-T6 and AA5083-H111 are presented in Tables 2 and 3, respectively. The composition was measured using optical emission spectrometry. A dedicated FSW machine was used for welding. The machine had a load cell to measure and display the axial force, and it maintained the set axial force automatically.

Fig. 1 Different tool geometries utilized: SS straight square, TS tapered square, SH straight hexagon, SO straight octagon, TO tapered octagon

Table 1 Chemical composition of tool material (mass fraction %)
Table 2 Chemical composition of AA6351-T6 and AA5083-H111 (mass fraction %)
Table 3 Mechanical properties of AA6351-T6 and AA5083-H111

The FSW line was parallel to the rolling direction of AA5083-H111 and perpendicular to the rolling direction of AA6351-T6. Feasible limits of the parameters were chosen such that the joints were free from visible defects [20]. The maximum and minimum limits of the parameters are coded as +2 and −2, respectively. The intermediate coded values were calculated using the following equation:

$$ X_i = \frac{2\left[2X - (X_{\max} + X_{\min})\right]}{X_{\max} - X_{\min}} $$
(1)

where $X_i$ is the required coded value of a variable $X$, and $X$ is any value of the variable from $X_{\min}$ to $X_{\max}$. $X_{\min}$ is the lower limit of the variable and $X_{\max}$ is the upper limit of the variable. The welding parameters and tools utilized are presented in Table 4 and Fig. 1, respectively. The experiments were conducted using a four-factor, five-level central composite design matrix [21]. The welded specimen from trial number 25 is shown in Fig. 2.
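The coding in Eq. (1) can be illustrated with a minimal sketch. The function names `coded_value` and `actual_value` are hypothetical, and the limits of 600 rpm and 1300 rpm are assumed here to be the −2 and +2 levels of rotational speed, based on the extreme values quoted in Section 4; Table 4 remains the authoritative source for the levels.

```python
def coded_value(x, x_min, x_max):
    """Eq. (1): map an actual value x in [x_min, x_max] to the coded range [-2, +2]."""
    return 2.0 * (2.0 * x - (x_max + x_min)) / (x_max - x_min)


def actual_value(coded, x_min, x_max):
    """Inverse of Eq. (1): recover the actual value from a coded level."""
    return (coded * (x_max - x_min) / 2.0 + (x_max + x_min)) / 2.0


# Assumed limits for tool rotational speed (rpm); see the note above.
N_MIN, N_MAX = 600.0, 1300.0

# The five coded levels of a five-level design and their actual values.
coded_levels = (-2, -1, 0, 1, 2)
speed_levels = [actual_value(c, N_MIN, N_MAX) for c in coded_levels]

# The centre of the range maps to the coded value 0.
centre_code = coded_value(950.0, N_MIN, N_MAX)
```

Under these assumed limits, the intermediate rotational speed of 950 rpm corresponds to the coded level 0, and the coded levels are equally spaced in the actual variable.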

Table 4 FSW process parameters and their levels

Fig. 2 Friction stir-welded specimen produced using the straight square tool at a rotational speed of 950 rpm, a welding speed of 63 mm/min and an axial force of 1.5 ton

Tensile testing was conducted as per the American Society for Testing and Materials (ASTM E8) code of practice to obtain the UTS [22]. Standard-sized specimens were prepared perpendicular to the weld joint and tested in a computerized universal testing machine (HITECH TUE-C-1000). Three specimens were prepared and tested for each weld. The average experimental values for each set of process parameters are presented in Table 5.

Table 5 Comparison of measured and predicted values of UTS

3 Development of neural network model

A typical ANN architecture consists of an input layer, an output layer and one or more hidden layers, each made up of processing units called neurons. The feed-forward three-layered back propagation network architecture is depicted in Fig. 3. The input layer consists of four nodes, which represent the tool pin profile, tool rotational speed, welding speed and axial force. The output layer consists of one node, which represents the UTS. The number of hidden layers and the number of neurons in each hidden layer were determined by training several networks. The networks trained had four input neurons for the process parameters and 4–6 hidden layers of 10–20 neurons each. The experimental inputs and outputs are normalized to the range 0.1–0.9 using the following equation [23]:

$$ Y_i = 0.1 + \frac{0.8\,(Z_i - Z_{\min})}{Z_{\max} - Z_{\min}} $$
(2)

where $Y_i$ is the normalized input/output value, $Z_i$ is the actual input/output value, $Z_{\max}$ is the maximum input/output value and $Z_{\min}$ is the minimum input/output value.
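A minimal sketch of the scaling in Eq. (2), together with its inverse for converting a network output back to physical units, is given below. The function names are hypothetical, and the welding speed limits of 36 and 90 mm/min are taken from the extreme values quoted in Section 4 purely for illustration.

```python
def normalize(z, z_min, z_max):
    """Eq. (2): scale an actual value z into the range 0.1-0.9."""
    return 0.1 + 0.8 * (z - z_min) / (z_max - z_min)


def denormalize(y, z_min, z_max):
    """Inverse of Eq. (2): recover the actual value from a normalized one."""
    return z_min + (y - 0.1) * (z_max - z_min) / 0.8


# Illustrative limits for welding speed (mm/min); assumed, see the note above.
S_MIN, S_MAX = 36.0, 90.0

y_low = normalize(S_MIN, S_MIN, S_MAX)    # lower limit maps to 0.1
y_mid = normalize(63.0, S_MIN, S_MAX)     # mid-range value maps to 0.5
y_high = normalize(S_MAX, S_MIN, S_MAX)   # upper limit maps to 0.9
```

Keeping the scaled values away from 0 and 1 is a common choice when the network uses saturating activation functions; the source does not state its reason for choosing the 0.1–0.9 range.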

Fig. 3 Neural network architecture
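To make the layered structure concrete, the following sketch propagates one normalized input vector through a small feed-forward network with four inputs and one output. It is an illustration only, not the network trained in this work: the single hidden layer of 10 neurons, the tanh hidden activation, the linear output and the random untrained weights are all assumptions, and all function names are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create placeholder (untrained) weights and biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def dense(inputs, weights, biases, activation):
    """Weighted sum plus bias for each neuron, followed by the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through tanh hidden layers and a linear output layer."""
    signal = inputs
    for weights, biases in layers[:-1]:
        signal = dense(signal, weights, biases, math.tanh)
    weights, biases = layers[-1]
    return dense(signal, weights, biases, lambda v: v)


rng = random.Random(0)
N_INPUTS, N_HIDDEN, N_OUTPUTS = 4, 10, 1  # hidden size is an assumption
layers = [
    init_layer(N_INPUTS, N_HIDDEN, rng),
    init_layer(N_HIDDEN, N_OUTPUTS, rng),
]

# One normalized sample: pin profile, rotational speed, welding speed, axial force.
sample = [0.5, 0.5, 0.5, 0.5]
prediction = forward(sample, layers)  # list with one normalized UTS value
```

In back propagation training, the weights and biases would be adjusted iteratively to reduce the error between such predictions and the measured values; that step is omitted here.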

The development process for the neural network is presented in Fig. 4. Of the 25 experimental trials, 15 samples were used for training, five for testing and five for validation. The network was trained several times, with a limit of 1000 epochs, to obtain the best validation performance using MATLAB version 7.8 (R2009a) [16]. The best validation performance, a mean squared error of 0.00081163, was obtained at 140 epochs (see Fig. 5). The low value of the mean squared error indicates that the neural network model developed should provide good predictions. The simulated results from the network were exported to the MATLAB workspace. The predicted UTS values from the ANN are presented in Table 5. Regression analysis was performed to obtain a correlation coefficient and best-fit curves for the training, testing and validation of the neural network model. The correlation coefficient was calculated to show the relationship between the experimental data and the values predicted by the neural network model [17]. The accuracy of the prediction is illustrated by the lines of best fit plotted with the corresponding experimental data for the training, testing, validation and overall data in Fig. 6. A correlation coefficient (R) close to 1 implies a close relationship between the actual and predicted outputs. The correlation coefficients obtained for the training, testing, validation and overall data are 1, 0.93991, 0.99871 and 0.97852, respectively, which show that the neural network model was capable of predicting the output with good accuracy (Fig. 6). A graphical comparison of the UTS from the ANN model and the experimental results for all the experimental trials is presented in Fig. 7. It is once again apparent that good agreement was achieved.
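The data division and the two statistics used above can be sketched as follows. This is not the MATLAB procedure used in this work: the random assignment of trials to subsets is an assumption (the source does not state how the samples were assigned), the function names are hypothetical, and the `actual` and `predicted` lists are placeholder numbers, not values from Table 5.

```python
import random


def split_trials(n_trials, n_train, n_test, n_val, seed=0):
    """Randomly divide trial indices into training, testing and validation sets."""
    if n_train + n_test + n_val != n_trials:
        raise ValueError("subset sizes must add up to the number of trials")
    indices = list(range(n_trials))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    val = indices[n_train + n_test:]
    return train, test, val


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / (var_a * var_p) ** 0.5


# 25 trials divided 15 / 5 / 5, as described in the text.
train_idx, test_idx, val_idx = split_trials(25, 15, 5, 5)

# Placeholder normalized values for illustration only (not experimental data).
actual = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
mse = mean_squared_error(actual, predicted)
r = correlation_coefficient(actual, predicted)
```

With only five samples in each of the testing and validation subsets, R values for those subsets are sensitive to individual points, which is worth keeping in mind when comparing them with the training value.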

Fig. 4 Flow chart of artificial neural network development using the neural network tool

Fig. 5 Network training to predict UTS

Fig. 6 Line of best fit and correlation coefficient between actual and predicted values for training, validation, testing and all data of UTS

Fig. 7 Comparison of UTS of friction stir-welded dissimilar aluminum alloy

4 Effect of FSW process parameters

The effect of tool pin profile on the ultimate tensile strength of friction stir-welded joints of dissimilar aluminum alloys is presented in Fig. 8. The straight square tool pin profile produces a higher tensile strength than the other tool pin profiles. The ratio between static and dynamic volume of the straight square, straight hexagon, straight octagon, tapered octagon and tapered square tools is 1.56, 1.21, 1.11, 2.04 and 3.51, respectively. The static volume is the volume of the tool pin, and the dynamic volume is the volume swept by the rotating tool pin. This ratio influences the material flow path from the leading edge to the trailing edge of the rotating tool. Both the highest and the lowest ratios impair the material flow and result in lower tensile strength. The straight square (SS) pin profiled tool produced the highest tensile strength, probably because of the pulsating effect of the rotating square, whereas the other tool pin profiles produced a more continuous flow of material towards the root of the tool. The square, hexagon and octagon pin profiles produce 63, 95 and 126 pulses/s, respectively, when rotating at 950 rpm. Although the hexagon and octagon pin profiles produce the most pulses, at this high rotational speed they almost resemble a conventional cylindrical pin profiled tool. The tapered square and tapered octagon tool pin profiles produce the lowest tensile strength because of reduced sweeping of material [24]. The macrostructures of the dissimilar joints made using the various tool pin profiles are presented in Table 6, together with the probable reasons for the observations. No visible defects were observed when using the SS, SH and SO tool pin profiles, owing to sufficient pulsating stirring action and flow of plasticized material. When the tapered tool pin profiles were used, insufficient interaction of the tool and reduced sweeping of material led to a tunnel defect at the bottom of the joint, which lowered the tensile strength.
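The pulse rates and volume ratios quoted above can be checked with a short sketch. Two assumptions are made here and are not stated in the source: each flat face of the pin produces one pulse per revolution, and the quoted ratio is the swept (dynamic) volume divided by the pin (static) volume for a straight regular polygon inscribed in the swept circle. The function names are hypothetical, and tapered profiles are not covered.

```python
import math


def pulses_per_second(n_faces, rpm):
    """Pulses per second, assuming one pulse per flat face per revolution."""
    return n_faces * rpm / 60.0


def swept_to_pin_volume_ratio(n_faces):
    """Swept-circle area over regular n-gon area for a straight pin.

    For equal pin length the volume ratio equals the cross-section area ratio.
    A regular n-gon inscribed in a circle of radius r has area
    (n / 2) * r**2 * sin(2 * pi / n), so the radius cancels out.
    """
    return math.pi / ((n_faces / 2.0) * math.sin(2.0 * math.pi / n_faces))


RPM = 950.0
profiles = {"square": 4, "hexagon": 6, "octagon": 8}

pulse_rates = {name: pulses_per_second(n, RPM) for name, n in profiles.items()}
ratios = {name: swept_to_pin_volume_ratio(n) for name, n in profiles.items()}
```

Under these assumptions the pulse rates truncate to the 63, 95 and 126 pulses/s given in the text, and the hexagon and octagon ratios round to the quoted 1.21 and 1.11. The idealized square gives π/2, about 1.57, slightly above the quoted 1.56; the difference may come from rounding or from the actual pin geometry.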
Figure 9 illustrates that the highest ultimate tensile strength is obtained at an intermediate rotational speed of 950 rpm. The lower (600 rpm) and the higher (1300 rpm) rotational speeds both produce lower, albeit similar, tensile strengths. The reduced strength at the lower rotational speed is due to the lack of stirring action of the tool and, therefore, the lower associated heat input. This produces defects at the bottom of the joints and, therefore, yields a lower tensile strength. Typically in FSW, an increased rotational speed is associated with an increased heat input. Excessive heat input hampers the material flow and causes excessive release of stirred material to the upper surface, which leaves voids in the weld zone. These defects at the higher and lower rotational speeds are visible in the macrostructural observations presented in Table 7.

Fig. 8 Effect of tool pin profile on UTS

Table 6 Effect of tool pin profile on macrostructure (N = 950 rpm, S = 63 mm/min, F = 1.5 ton)
Fig. 9 Effect of tool rotational speed on UTS

Table 7 Effect of tool rotational speed on macrostructure (P = SS, S = 63 mm/min, F = 1.5 ton)

Figure 10 illustrates that the welding speed has a similar effect on the weld quality to that of the rotational speed. Once again the highest UTS is obtained at an intermediate welding speed (63 mm/min). The low (36 mm/min) and high (90 mm/min) welding speeds both produced lower joint tensile strengths, with the low welding speed giving the lower of the two. This is once again a function of the associated heat input. The lower welding speed produces excessive frictional heat generation that leads to defects at the bottom of the weld (Table 8). The converse is true at the higher welding speed, where there is insufficient time for adequate stirring and heat build-up, once again leading to weld defects that reduce the weld strength accordingly. The effects of axial force on the tensile strength and on the macrostructural weld geometry are presented in Fig. 11 and Table 9, respectively. The highest tensile strength is obtained at an intermediate force of 1.5 ton. Higher and lower axial loads once again produce lower tensile strengths because of the formation of weld defects. The tensile strength is lower at the lower load because of insufficient material consolidation, which leads to wormhole defects (Table 9). Tunnel defects are formed at the higher axial load because of insufficient coalescence of transferred material [25] (Table 9). The presence of these macro-scale defects once again leads to a reduction in UTS.

Fig. 10 Effect of welding speed on UTS

Table 8 Effect of welding speed on macrostructure (P = SS, N = 950 rpm, F = 1.5 ton)
Fig. 11 Effect of axial force on UTS

Table 9 Effect of axial force on macrostructure (P = SS, N = 950 rpm, S = 63 mm/min)

5 Conclusion

1. ANN models were developed to predict the UTS of dissimilar aluminum friction stir-welded joints. Friction stir welding process parameters such as tool pin profile, tool rotational speed, welding speed and axial force were incorporated into the models.

2. A comparison between the ANN model and the experimental data showed good agreement (less than 5 % difference) for the process parameter ranges evaluated.

3. The results showed that the ANN model is statistically accurate and is a robust tool to describe and predict the UTS of friction stir-welded dissimilar aluminum alloys. This method can be used for other combinations of dissimilar alloys.

4. The experimental results also showed that the UTS of the joint is sensitive to tool pin profile, tool rotational speed, welding speed and axial force. In each case, a relatively narrow band was identified where the joint efficiency was adequate. Process parameters outside this band led to significant joint defects that in most cases reduced the effective weld zone and hence the UTS. This implies that careful process control will be required to produce joints of consistent quality. Most of the process parameters directly or indirectly control the heat introduced into the weld zone. The heat input and its control play a significant role in the success of the process.