Introduction

Pulsed gas metal arc welding (P-GMAW) is widely used for joining metal plates because it delivers uniform metal transfer with reduced heat input [1]. Weld quality can be improved significantly by controlling the pulse parameters to achieve a stable arc [2]. Arc stability is found to be excellent under the one-droplet-per-pulse (ODPP) condition, with a droplet diameter close to the electrode wire diameter [3]. This condition may be attained by choosing a suitable amplitude and duration of the peak current, above the transition current, to ensure droplet detachment [4]. Various mathematical models that focus on either the mean or the variance of the weld quality features have been proposed to monitor metal transfer behavior [5]. The dual response approach considers both mean and variance to improve the regression model [6], which was further used to optimize the process variables.

However, GMAW processes are highly non-linear and subject to several uncontrollable factors, which calls for an adaptive intelligent system to describe the process characteristics for further monitoring. Therefore, soft computing tools, including evolutionary algorithms, have also been used alongside numerical techniques to model and optimize GMAW parameters more accurately [7, 8].

Multi-objective optimization is often treated as a set of single-objective problems in order to handle a larger number of objectives. Higher joint tensile strength and minimum distortion of the butt-welded plate are both crucial weld quality features. Conventional optimization techniques such as RSM and the Taguchi method are used in manufacturing optimization problems [9, 10], but they are limited to a regular experimental space and are often found to be inadequate for highly non-linear arc welding processes with many uncertainties. This limitation may be overcome by introducing the genetic algorithm (GA) [11], which can locate the global optimum rather than local optima [12]. However, with improper GA parameter settings there is a risk of insufficient sweeping of the search space [13]. A controlled random search algorithm, similar to GA, has been used to overcome these difficulties [14]. An adaptive gradient descent neural network has also been found useful in GMAW optimization [15]. GA has also been applied to a trained neural network, an approach called neuro-GA, to improve the optimization capability [16].

In recent years, advanced multi-objective non-elitist GA tools such as weight-based GA and the non-dominated sorting GA (NSGA) have been developed. The weighting method is limited because it is highly sensitive to the chosen weight vectors, which requires prior information about the problem. NSGA favors non-dominated solutions and preserves diversity among the solutions of a particular non-dominated front using a sharing approach. In NSGA-II, Deb et al. [17] introduced elitism to enhance the probability of creating better offspring. Since NSGA-II works with a population of solutions, multiple Pareto-optimal solutions may be captured simultaneously [18]. NSGA-II has been applied in various fields, such as optimization of the electrochemical machining process [19] and kinematic modeling and performance optimization of a robot manipulator [20], and neuro-NSGA-II has been used for wire electric discharge machining optimization [21].

Although there are various works on arc welding process optimization using numerical, analytical and regression methods, as well as soft computing tools such as ANN, GA and their hybrids, there is hardly any work on multi-objective optimization using NSGA-II, especially for the pulsed GMAW process. Therefore, the present work applies a hybrid neuro-based NSGA-II to pulsed GMAW process optimization to achieve the desired weld quality. Neuro-NSGA-II, a combination of a back-propagation neural network (BPNN) with NSGA-II, has been used to obtain multiple solutions (a Pareto-optimal front) in P-GMAW. The BPNN computes the fitness of each solution after adequate training, whereas the NSGA-II algorithm generates the optimum solutions for two conflicting objectives, i.e. the transverse shrinkage and the ultimate tensile strength of the butt-welded joint.

Experimental Procedure

In this work, a pair of low carbon steel (C = 0.208%) specimens, each with dimensions of 150 × 120 × 6 mm, was used as the base plates for the butt welding experiments. The specimens were prepared with a V-shaped groove having a groove angle of 60°, a root face of 1.5 mm and a root gap of 1 mm. A constant voltage P-GMAW machine (FRONIOUS make) was used. The experiments were carried out using copper-coated mild steel filler electrode wire (ESAB, S-6 wire, 1.2 mm diameter) in pure argon (99.9%) shielding gas. The schematic diagram of the experimental set-up is shown in Fig. 1.

Fig. 1 Schematic representation of the experimental set-up at 0° torch angle

Development of Design of Experiments

A half-fractional central composite RSM design (alpha = 2.378) with nine center-point experiments was used to design the welding experiments. Earlier experiments at different welding torch angles showed that three pulse parameters, namely peak voltage, pulse frequency and pulse on-time (at constant mean voltage), significantly affect weld quality [22, 23]. The present work therefore focuses on these pulse voltage parameters at different torch angles in order to model and optimize the pulsed GMAW process for the desired weld joint quality. These three pulse parameters, together with torch angle, wire feed rate and welding speed (six parameters in total), were taken as the control parameters for the design of experiments. A negative torch angle indicates backhand welding, whereas a positive value indicates forehand welding; the torch-perpendicular condition corresponds to a 0° torch angle. The complete matrix of fifty-three experiments was generated using MINITAB software.
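The reported run count and axial distance can be cross-checked with a short calculation. The sketch below assumes the usual central composite structure (a two-level fractional factorial block, one pair of axial points per factor, plus center points) and the rotatability condition alpha = F^(1/4); the function name `ccd_summary` is illustrative and is not part of the MINITAB workflow used in the paper.

```python
def ccd_summary(k: int, fraction: int, n_center: int) -> dict:
    """Run counts and rotatable axial distance for a fractional central composite design.

    k: number of factors; fraction: p in 2**(k - p); n_center: number of center runs.
    Assumes the standard CCD layout; the paper itself only reports the totals.
    """
    n_factorial = 2 ** (k - fraction)   # two-level fractional factorial points
    n_axial = 2 * k                     # one pair of axial (star) points per factor
    alpha = n_factorial ** 0.25         # rotatability condition: alpha = F**(1/4)
    return {
        "factorial": n_factorial,
        "axial": n_axial,
        "center": n_center,
        "total": n_factorial + n_axial + n_center,
        "alpha": alpha,
    }


summary = ccd_summary(k=6, fraction=1, n_center=9)
print(summary["total"], round(summary["alpha"], 3))  # 53 2.378
```

Under these assumptions, six factors in a half fraction give 32 factorial, 12 axial and 9 center runs, which agrees with the fifty-three experiments and alpha = 2.378 stated above.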

To make a butt weld joint, the two plates were tack welded at both ends (points T1 and T2) before the welding run, as shown in Fig. 2. A Vernier caliper was used to measure the transverse shrinkage after welding. The transverse shrinkage is expressed as the ratio of the change in length due to welding to the initial length across the weld plate.

Fig. 2 Schematic representation for measurement of transverse shrinkage (butt joint)

It was measured at three linear positions (L11, L22 and L33), and the average was calculated as per Eq. 1, where ∆L11, ∆L22 and ∆L33 represent the change in length along the respective lines after welding.

$$ \rho_{t} = \frac{1}{3}\left[ {\frac{{\Delta L_{11} }}{{L_{11} }} + \frac{{\Delta L_{22} }}{{L_{22} }} + \frac{{\Delta L_{33} }}{{L_{33} }}} \right] $$
(1)
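Eq. 1 can be evaluated directly from the caliper readings. The following is a minimal sketch; the sign convention (change in length taken as initial minus final) and the sample readings are assumptions for illustration and are not data from the paper.

```python
def transverse_shrinkage(initial_lengths, final_lengths):
    """Mean relative shrinkage over the measurement lines, as in Eq. 1 (a fraction, not %)."""
    if len(initial_lengths) != len(final_lengths) or not initial_lengths:
        raise ValueError("need matching, non-empty lists of lengths")
    # Assumed convention: delta L = length before welding - length after welding
    ratios = [(l0 - l1) / l0 for l0, l1 in zip(initial_lengths, final_lengths)]
    return sum(ratios) / len(ratios)


# Hypothetical readings in mm along L11, L22 and L33 (not taken from the paper)
before = [240.00, 240.00, 240.00]
after = [239.70, 239.76, 239.64]
rho_t = transverse_shrinkage(before, after)
print(f"{100 * rho_t:.3f} %")
```

Multiplying the result by 100 gives the percentage form in which the shrinkage values are quoted in the text.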

The butt weld joint samples were prepared for the tensile test according to the ASTM E8 standard. The tensile tests were carried out using a universal tensile testing machine (INSTRON, 8862) with attached software (Instron Wave Matrix) to generate the stress–strain diagram. The ultimate tensile strength of the base metal was found to be 630–680 MPa, with an average of 652.4 MPa over ten randomly chosen samples. The input parameters, based on the coded matrix generated using the half-fractional central composite RSM, are shown in Table 1.

Table 1 Process parameters with corresponding weld quality characteristics

Neuro Elitist Non-dominated Sorting Genetic Algorithm (Neuro-NSGA-II)

NSGA-II provides both a diversity-preserving mechanism and sorting of the generated solutions. However, the objective functions must conflict with each other. The prime objective is to generate a set of Pareto-optimal solutions. The diversity among the solutions on a specific Pareto front is maintained using the crowding distance operator. The elite solutions of the parent population are carried over to the subsequent generation after competing with the members of the offspring population. The fitness of the objective functions was computed by the BPNN, and NSGA-II used these values to generate the Pareto-optimal front. The computational procedure of the neuro-NSGA-II algorithm is outlined in Fig. 3.

Fig. 3 Computational procedure of neuro-NSGA-II

The initial binary parent population (\( P_{i} \)) of size N is generated randomly and converted to real values using linear mapping. The offspring population (\( Q_{i} \)) of size N is then generated by applying binary tournament selection, crossover and mutation to the parent population (\( P_{i} \)). Tournament selection is carried out between two different parent solutions (chromosomes) at a time, based on their fitness values, so that the better parent solutions are selected for crossover and mutation. The crossover (recombination) operator takes two different parent strings and exchanges one or more binary genes between them to create two offspring solutions. The mutation operator randomly inverts one or more genes in a chromosome.
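The encoding and the genetic operators described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the scalar fitness used in the tournament, the parameter bounds and the 12-bit demonstration strings are assumptions, and the two-point crossover variant is chosen because it is the one reported later in the paper.

```python
import random


def decode(bits, low, high):
    """Linearly map a binary gene (list of 0/1) onto the real interval [low, high]."""
    value = int("".join(map(str, bits)), 2)
    return low + (high - low) * value / (2 ** len(bits) - 1)


def binary_tournament(population, fitness, rng):
    """Pick two distinct parents at random and return the fitter one (higher is better)."""
    a, b = rng.sample(range(len(population)), 2)
    return population[a] if fitness[a] >= fitness[b] else population[b]


def two_point_crossover(p1, p2, rng):
    """Swap the segment between two cut points to create two offspring."""
    i, j = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]


def bit_flip_mutation(chromosome, rate, rng):
    """Invert each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


rng = random.Random(0)             # fixed seed so the sketch is repeatable
p1, p2 = [0] * 12, [1] * 12        # two illustrative parent strings
winner = binary_tournament([p1, p2], [0.2, 0.7], rng)
c1, c2 = two_point_crossover(p1, p2, rng)
m1 = bit_flip_mutation(c1, rate=0.1, rng=rng)
x = decode([1, 0, 0, 0, 0, 0, 0, 0, 0, 0], low=0.0, high=1023.0)
print(winner, c1, c2, m1, x)
```

In NSGA-II proper, the tournament compares non-domination rank and crowding distance rather than a single scalar fitness; the scalar comparison here only keeps the sketch short.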

The parent population (\( P_{i} \)) and the offspring population (\( Q_{i} \)) are then combined as \( R_{i} \), which allows a global non-domination check over all 2 \( N \) solutions. The real values of the input parameters corresponding to these solutions are normalized to the range 0.1 to 0.9, and the fitness values are determined using the previously trained BPNN model. The BPNN model is likewise trained and tested on normalized values of the experimental dataset. In this way, the fitness of the objective functions is calculated.
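A common way to scale data to the range 0.1 to 0.9 is linear min-max scaling. The sketch below assumes that form, since the text states only the target range; the parameter bounds in the example are hypothetical.

```python
def normalize(x, x_min, x_max, lo=0.1, hi=0.9):
    """Min-max scale x from [x_min, x_max] to [lo, hi] (assumed linear form)."""
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)


def denormalize(y, x_min, x_max, lo=0.1, hi=0.9):
    """Inverse of normalize: map a scaled value back to physical units."""
    return x_min + (x_max - x_min) * (y - lo) / (hi - lo)


# Hypothetical parameter range of 20 to 30 (units arbitrary)
scaled = normalize(25.0, 20.0, 30.0)
restored = denormalize(scaled, 20.0, 30.0)
print(scaled, restored)
```

Keeping the scaled values away from 0 and 1 avoids the flat tails of a log-sigmoid activation, which is the usual motivation for a 0.1 to 0.9 range.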

The combined population of size 2 \( N \) is then sorted based on non-domination rank (r) and crowding distance (d). The rank of a solution indicates its non-domination level relative to the other solutions, whereas the crowding distance represents the degree of spread among the solutions. The best solution has the minimum rank and the maximum crowding distance.
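Rank and crowding distance can be computed as in the following sketch. It uses a simple quadratic-time peeling of fronts rather than the bookkeeping of the original NSGA-II paper, assumes both objectives are maximized, and runs on made-up objective values.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_ranks(objs):
    """Rank 1 = non-dominated; rank 2 = non-dominated once rank 1 is removed; and so on."""
    ranks = [0] * len(objs)
    remaining = set(range(len(objs)))
    rank = 0
    while remaining:
        rank += 1
        front = {i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)}
        for i in front:
            ranks[i] = rank
        remaining -= front
    return ranks


def crowding_distance(objs, front):
    """Crowding distance of each index in `front`; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[0])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        span = objs[ordered[-1]][m] - objs[ordered[0]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if span == 0:
            continue
        # Interior points accumulate the normalized gap between their neighbours
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][m] - objs[prev][m]) / span
    return dist


# Made-up objective pairs (objective 1, objective 2), both to be maximized
objs = [(1, 5), (2, 4), (3, 3), (1, 1), (2, 2)]
ranks = non_dominated_ranks(objs)
dist = crowding_distance(objs, [i for i, r in enumerate(ranks) if r == 1])
print(ranks)  # [1, 1, 1, 3, 2]
print(dist)
```

The first three points form the rank-1 front; the two end points of that front receive an infinite distance so that the extremes of the front are always retained.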

The new population (\( P_{i + 1} \)) of size N is generated by non-dominated sorting, based on the rank of the solutions and the crowding distance of each solution within a particular rank. It is filled starting with the solutions of the best non-dominated front, followed by the second non-dominated front, and so on until \( N \) solutions have been selected. When adding a front would make the number of selected solutions exceed \( N \), the crowding comparison operator is used to choose only the better solutions from that last front, so that the total number of selected solutions equals \( N \) in this generation. This sequence is repeated until the maximum number of generations is reached. Thus, the Pareto-optimal front comprising the best solutions is generated, which offers a larger number of optimal solutions for the two conflicting objectives.
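The survivor selection step can be written compactly once ranks and crowding distances are available. The sketch below sorts by rank and then by decreasing crowding distance, which yields the same set as filling front by front and truncating the last front; the function name and the input values are illustrative assumptions.

```python
def select_survivors(ranks, distances, n):
    """Indices of the n best solutions: lower rank first, larger crowding distance as tie-break."""
    order = sorted(range(len(ranks)), key=lambda i: (ranks[i], -distances[i]))
    return order[:n]


inf = float("inf")
# Hypothetical ranks and crowding distances for a combined population of six
ranks = [1, 2, 1, 2, 2, 3]
distances = [inf, 0.4, inf, inf, 1.2, inf]
survivors = select_survivors(ranks, distances, n=4)
print(survivors)  # [0, 2, 3, 4]
```

Here the two rank-1 solutions are accepted in full, and the rank-2 front is truncated so that its least crowded members fill the remaining two places.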

Optimization of P-GMAW Using Neuro-NSGA-II

In the present work, the joint tensile strength was found to improve with an increase in pulse voltage, owing to adequate fusion, which is responsible for strong bonding between the weld and the base plates. However, the transverse shrinkage was also found to be higher with an increase in pulse voltage. The maximum joint strength was 520 MPa, i.e. a joint efficiency of 79.7% (experiment #30), with a higher transverse shrinkage of 0.141%, whereas the minimum transverse shrinkage was 0.057% (experiment #43), with a poor joint strength of 47.59 MPa (joint efficiency of 7.3%). Thus, there were two conflicting objectives: maximizing joint strength while minimizing distortion to achieve better weld quality.
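The joint efficiencies quoted above are consistent with dividing the joint strength by the average base-metal strength of 652.4 MPa reported in the experimental section. The sketch below makes that assumed definition explicit.

```python
BASE_METAL_UTS_MPA = 652.4  # average base-metal strength reported in the text


def joint_efficiency(joint_uts_mpa, base_uts_mpa=BASE_METAL_UTS_MPA):
    """Joint efficiency in percent, assumed to be joint UTS relative to base-metal UTS."""
    return 100.0 * joint_uts_mpa / base_uts_mpa


print(round(joint_efficiency(520.0), 1))  # experiment #30 -> 79.7
print(round(joint_efficiency(47.59), 1))  # experiment #43 -> 7.3
```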

A fully interconnected multi-layered neural network with one hidden layer was used, as there was no further improvement in mean error with more than one hidden layer. First, the network architecture and its parameters are initialized. The network structure is represented by the number of neurons in the input layer (i), the hidden layer (j) and the output layer (k). The numbers of input and output nodes equal the numbers of input and output parameters considered in the network, whereas the number of neurons in the hidden layer may be any positive integer. The interconnection weights between two subsequent layers are randomly initialized within −0.1 to +0.1. The biases were taken as 0 for the nodes of the input layer and 1 for the nodes of the hidden and output layers. The learning rate (η) and momentum coefficient (α) were varied from 0.1 to 0.9 in increments of 0.1.
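The initialization and forward computation of such a network can be sketched in plain Python. Several details are assumptions made for illustration: the bias of 1 is treated as a fixed term added to each net input, the activation is the log-sigmoid function named later in the paper, and the 6-7-2 layout is the architecture reported later as the best one.

```python
import math
import random


def init_layer(n_in, n_out, rng, bound=0.1):
    """Weight matrix of shape (n_out, n_in) with entries drawn uniformly from [-bound, +bound]."""
    return [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]


def log_sigmoid(v):
    """Logistic activation: 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(x, w_hidden, w_out, bias=1.0):
    """Forward pass of an i-j-k network; returns (hidden activations, outputs)."""
    # Assumption: the bias is a fixed constant added to every net input
    hidden = [log_sigmoid(sum(w * xi for w, xi in zip(row, x)) + bias) for row in w_hidden]
    out = [log_sigmoid(sum(w * h for w, h in zip(row, hidden)) + bias) for row in w_out]
    return hidden, out


rng = random.Random(1)
w_hidden = init_layer(6, 7, rng)  # 6 inputs -> 7 hidden neurons
w_out = init_layer(7, 2, rng)     # 7 hidden neurons -> 2 outputs
hidden, y = forward([0.5] * 6, w_hidden, w_out)
print(y)
```

The two outputs correspond to the normalized joint strength and transverse shrinkage; with untrained random weights they carry no physical meaning.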

The dataset patterns were divided into training and testing sets for training and subsequently testing the network, respectively. One epoch indicates a complete presentation of the entire normalized training set, which is used for training to accelerate the learning process [24]. In the forward pass, the synaptic weights remain unchanged throughout the network and the output signals are computed on a neuron-by-neuron basis. The activation function \( \varphi ( \cdot ) \) is taken as log-sigmoid in this case. The error signal at output node t in the output layer at the nth iteration (i.e. for the nth training pattern) is then determined. In the backward pass, computation starts at the output layer, and the error signals are passed leftward through the network. The synaptic weights of the network are modified according to the delta rule using this recursive computation. The delta rule for the weight correction is further modified by including a momentum coefficient (\( \alpha \)) to avoid the risk of network instability associated with the learning rate (\( \eta \)). The sequential mode of training was used, and the target mean square error (MSE) for training was set to a very small value, which served as the error goal. This stopping limit was set according to the convergence behavior of the MSE of the testing patterns in the network.
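The weight correction with momentum can be illustrated for a single connection. The sketch below uses the textbook form of the generalized delta rule, Δw(n) = η·δ·y + α·Δw(n−1), with the local gradient of a log-sigmoid output neuron; the numerical values are arbitrary and the function names are hypothetical.

```python
def output_local_gradient(target, output):
    """Local gradient of a log-sigmoid output neuron: error times the derivative o * (1 - o)."""
    return (target - output) * output * (1.0 - output)


def momentum_update(weight, prev_delta, local_gradient, input_signal, eta, alpha):
    """Generalized delta rule: dw(n) = eta * delta * y + alpha * dw(n - 1)."""
    delta_w = eta * local_gradient * input_signal + alpha * prev_delta
    return weight + delta_w, delta_w


# Arbitrary illustrative values for one hidden-to-output connection
delta = output_local_gradient(target=0.9, output=0.5)
new_weight, delta_w = momentum_update(
    weight=0.05, prev_delta=0.02, local_gradient=delta,
    input_signal=0.5, eta=0.6, alpha=0.4,
)
print(delta, delta_w, new_weight)
```

In sequential training this update is applied after every pattern, and the previous correction `delta_w` is carried forward to the next one.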

The performance of the BPNN model depends on network parameters such as the number of hidden layers (\( h \)), the number of nodes in the hidden layer (\( j \)), the learning rate of the synaptic weights (\( \eta \)) and the momentum coefficient (\( \alpha \)). There was no significant improvement in the testing MSE with more hidden layers in this case. Therefore, a single hidden layer was considered, whereas \( j \), \( \eta \) and \( \alpha \) were varied from 1 to 30, 0.1 to 0.9 and 0.1 to 0.9, respectively. Several trials were made to obtain the optimal architecture, i.e. the one that provides the minimum MSE in testing. The networks were compared on the basis of their prediction accuracy in testing by running the code several times. Once the best network was developed, it was tested on seven randomly chosen experiments out of the 53 (experiment nos. 2, 5, 12, 19, 31, 39 and 49) to check its prediction capability. The optimum architecture was found by varying the number of neurons in the hidden layer together with \( \eta \) and \( \alpha \). This evaluation was based on the MSE in testing (MSE_TEST), computed from the absolute values of the weld quality features. The best BPNN model, with a 6-7-2 architecture and network parameters \( \eta = 0.6 \) and \( \alpha = 0.4 \), was used to obtain the fitness values of the solutions generated by NSGA-II. This optimum architecture gave the minimum MSE in training (MSE_TRAIN) and testing (MSE_TEST) of 0.00737 and 0.00938, respectively. The effect of each of the six process parameters on the weld quality parameters (i.e. mean joint tensile strength and mean transverse distortion) was also studied using the best BPNN network described above, with the other five parameters fixed at their center-point values.

It was found that both outputs increased at higher values of each pulse voltage parameter, whereas wire feed rate and welding speed had the opposite effect. This is due to the higher heat input in the first case and the lower heat content of the metal droplets transferred to the weld in the second. To identify the critical parameters and their degree of importance for the model outputs, a sensitivity analysis was also performed using the PaD method. The contributions of the three pulse parameters (Vp, fp and tp) to the outputs were found to be significant, at 53.2, 21.4 and 12.2%, respectively, whereas the other input parameters (F, S and αt) were less significant (1, 9.1 and 3.1%, respectively).

The optimal parameter settings for maximum joint tensile strength with minimum distortion were determined by the NSGA-II technique, which was applied to obtain the Pareto-optimal solutions. The second objective, transverse shrinkage, was inverted to convert it into a maximization problem (Eq. 2). Thus, the objective functions were considered as follows:

$$ \begin{aligned} \text{Objective 1} &= \text{Joint strength} \\ \text{Objective 2} &= \frac{1}{\text{Transverse shrinkage}} \end{aligned} $$
(2)
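Eq. 2 translates into a small helper that returns both objectives in maximization form. The sketch below applies it to the two extreme experiments quoted earlier in this section to show the conflict between the objectives; the function name and the choice of percent as the shrinkage unit are assumptions.

```python
def objectives(joint_strength_mpa, transverse_shrinkage_pct):
    """Both objectives in maximization form, following Eq. 2."""
    if transverse_shrinkage_pct <= 0:
        raise ValueError("shrinkage must be positive for the reciprocal to be defined")
    return joint_strength_mpa, 1.0 / transverse_shrinkage_pct


exp30 = objectives(520.0, 0.141)  # strongest joint among the experiments
exp43 = objectives(47.59, 0.057)  # lowest shrinkage among the experiments
# Each experiment wins on one objective and loses on the other
conflicting = exp30[0] > exp43[0] and exp30[1] < exp43[1]
print(exp30, exp43, conflicting)
```

Because neither experiment is better on both objectives, neither dominates the other, which is the situation a Pareto-optimal front is meant to describe.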

To search for the optimal values of the decision variables, they were represented as binary strings. The bit length of each process parameter was 10, so the string length of each chromosome was 60. The optimization capability of NSGA-II also depends on control factors such as the crossover rate, the mutation rate and the number of generations. The crossover rate (\( C_{r} \)) was 0.9 with two-point crossover, whereas the mutation rate (\( M_{r} \)) was 0.1 with bitwise mutation. First, a total of 100 solutions were generated randomly within the experimental parameter range. The 100 scattered solutions were found to converge after the 4th generation of the optimization process, at which point only three ranks of solutions remained, as shown in Fig. 4. The Pareto-optimal front after the 20th generation was taken as the best, as there was no further improvement in solution ranking, i.e. all the solutions were in the same rank. The final Pareto-optimal front after 20 generations is shown in Fig. 5, and the corresponding 100 solutions are listed in Table 2. The maximum joint strength was 555 MPa (solution #27) with higher transverse shrinkage, whereas the minimum transverse shrinkage was 0.02% (solutions #1, #61 and #95) with lower joint strength. Thus, both the maximum joint strength and the transverse shrinkage improved over the design-of-experiments results. The maximum ultimate butt joint strength improved from 520 MPa (experiment #30) to 555 MPa (solution #27), i.e. the joint efficiency rose from 79.7 to 85.1% (an increase of 5.4 percentage points), at the same transverse shrinkage. On the other hand, the transverse shrinkage was reduced significantly from 0.131% (experiment #48) to 0.096% (solutions #35 and #41), i.e. a reduction of 26.7%, at almost the same joint strength; this reduction was more significant at higher joint strength/efficiency.

According to the simulation results for the 100 solutions as well as the parametric study, the joint tensile strength was enhanced at negative torch angles (backhand welding) because of the higher heat concentration caused by weld arc preheating. However, the transverse shrinkage increased with negative torch angle because of the greater heating, whereas it was lower at higher wire feed rate and welding speed because of the lower heat input to the weld. Nevertheless, there should be adequate heat input to the weld (affected by the mean voltage as a function of the pulsed voltage parameters) to achieve the two conflicting objectives. Two optimal solutions (solutions #55 and #67) were found to be close to two experimental results (experiments #31 and #11, respectively). Thus, the approach was validated without further experiments. Therefore, the neuro-based NSGA-II is a powerful tool for obtaining a large number of optimal solutions to meet multiple objectives in P-GMAW.

Fig. 4 Solutions after the 4th generation

Fig. 5 Pareto-optimal front

Table 2 Pareto optimal solutions with corresponding weld quality features

Conclusion

Butt weld quality in P-GMAW can be improved, with higher tensile strength and less transverse shrinkage, by adjusting the process parameters. However, one particular weld quality feature is sometimes more important than the other. Thus, multi-objective optimization is often required, and it can be solved efficiently by the neuro-based NSGA-II technique. This tool can generate a large number of solutions for different service requirements without much computation time. Using this hybrid approach, the maximum ultimate joint strength increased from 520 to 555 MPa (i.e. by more than 5%) at the same transverse shrinkage, whereas the transverse shrinkage was reduced significantly, by 26.7%, at higher joint efficiency (more than 60%). Thus, both joint strength and transverse shrinkage were found to improve significantly, as evidenced by simulation as well as experimental results.