1 Introduction

Welding technology is widely used in manufacturing, and online quality monitoring is essential for automatic robotic welding. Previous research has shown that when the arc behavior becomes unstable due to external conditions in gas metal arc welding (GMAW), weld seam defects occur, and the complexity of the weld current signal also changes (Ref 1,2,3). Hence, the weld quality can be effectively diagnosed by using the current signal. Recently, some nonlinear analysis methods have been used to diagnose weld quality. Vieria et al. extracted the fractal dimension feature of the current signal of different droplet transfer modes and combined it with principal component analysis to classify short‐circuit, globular, and spray droplet transfer modes (Ref 4). Li et al. extracted the maximum Lyapunov exponent of the current signal as a criterion for evaluating weld stability and used it to aid the selection of welding parameters (Ref 5). Lv et al. calculated the approximate entropy of the current signal of different CO2 gas metal welding processes, and the results showed that the larger the approximate entropy value is, the more stable the welding process (Ref 6). Nie et al. extracted the approximate entropy of the voltage signal under different pulsed metal inert gas welding process parameters, and the results showed that approximate entropy has a negative correlation with weld stability (Ref 7). However, approximate entropy has some shortcomings, such as sensitivity to data length and a lack of relative consistency (Ref 8, 9). Compared to approximate entropy, fuzzy entropy has better statistical properties for evaluating the complexity of time series (Ref 10,11,12). Because the actual signal often contains different frequency components, a signal decomposition method is required before calculating the fuzzy entropy (Ref 13,14,15).

Variational mode decomposition (VMD) has been proposed for the analysis of nonlinear or nonstationary signals (Ref 16), and the original signal can be decomposed into a series of subsignals of different frequencies. Gu et al. (Ref 17) used an adaptive VMD and a Teager energy operator to diagnose the incipient fault of rolling bearings. Sahani et al. (Ref 18) applied VMD and an online sequential extreme learning machine to detect power quality events. Wang et al. (Ref 19) extracted multicomponent features of a vibration signal by VMD and then identified multiple rubbing-caused rotor–stator faults. Li et al. (Ref 20) combined successive VMD and snake optimizer slope entropy to obtain the feature matrix dataset of ship-radiated noise signals, and the highest recognition rate was 95.1%.

After extracting the signal feature with VMD and fuzzy entropy, a multiclassifier model often needs to be established to achieve automatic weld quality diagnosis. The online sequential extreme learning machine (OS-ELM) is an incremental fast learning algorithm for single-hidden layer feedforward neural networks (Ref 21), and it provides better generalizability than that of backpropagation neural network (BP) algorithms. The OS-ELM has been applied to many research areas, such as electrocardiogram signals, electroencephalogram signals, and mechanical fault diagnosis (Ref 22, 23). Therefore, a multiclassifier model for weld quality can be established by OS-ELM.

In summary, a weld quality diagnosis method that incorporates VMD, fuzzy entropy, and OS-ELM is proposed. First, all current signals are normalized to (−1,1), and the normalized signals are decomposed into several intrinsic mode functions (IMFs) by VMD. Then, the fuzzy entropies of the IMFs that mainly contain weld quality information are extracted. Finally, the fuzzy entropies are set as input vectors for the OS-ELM classifier, and the output of the OS-ELM is applied to automatically identify weld quality types.

2 Theoretical Methods

2.1 Brief Review of VMD

VMD applies an entirely nonrecursive variational model to achieve adaptive decomposition by searching for the optimal solution of the model. The original signal is decomposed into several IMFs \(u_{k}\left( t \right)\), which are amplitude-modulated and frequency-modulated signals. VMD mainly includes two parts: construction of the variational problem and its solution. The constrained variational problem can be described as follows:

$$\begin{gathered} \min_{\left\{ u_{k} \right\},\left\{ \omega_{k} \right\}} \left\{ \sum_{k = 1}^{K} \left\| \partial_{t} \left[ \left( \delta \left( t \right) + \frac{j}{\pi t} \right) * u_{k} \left( t \right) \right] e^{ - j\omega_{k} t} \right\|_{2}^{2} \right\} \hfill \\ \quad \quad {\text{s}}.{\text{t}}.\ \sum_{k = 1}^{K} u_{k} \left( t \right) = x\left( t \right) \hfill \\ \end{gathered}$$
(1)

where \(x\left( t \right)\) represents the original signal, \(u_{k} \left( t \right)\) is the band-limited IMF, and \(K\) is the number of IMFs. \(*\) represents the convolution operation, \(\omega_{k}\) is the center frequency of each \(u_{k} \left( t \right)\), \(\partial_{t}\) is the partial derivative with respect to time, \(\delta \left( t \right)\) is the impulse function, and \(j\) is the imaginary unit.

To solve the above constrained variational problem, the quadratic penalty factor α and the Lagrangian multiplier λ are used to convert it into an unconstrained one. The augmented Lagrangian function \(L\) is given as follows:

$$\begin{gathered} L\left( \left\{ u_{k} \right\},\left\{ \omega_{k} \right\},\lambda \right) = \alpha \sum_{k = 1}^{K} \left\| \partial_{t} \left[ \left( \delta \left( t \right) + \frac{j}{\pi t} \right) * u_{k} \left( t \right) \right] e^{ - j\omega_{k} t} \right\|_{2}^{2} \hfill \\ \quad \quad \quad \quad \quad \quad + \left\| x\left( t \right) - \sum_{k = 1}^{K} u_{k} \left( t \right) \right\|_{2}^{2} + \left\langle \lambda \left( t \right), x\left( t \right) - \sum_{k = 1}^{K} u_{k} \left( t \right) \right\rangle \hfill \\ \end{gathered}$$
(2)

where \(\left\| x\left( t \right) - \sum_{k = 1}^{K} u_{k} \left( t \right) \right\|_{2}^{2}\) is the quadratic penalty term, and \(\left\langle \cdot , \cdot \right\rangle\) is the inner product operation.
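
For reference, in the standard VMD formulation the saddle point of the augmented Lagrangian in Eq 2 is usually sought with the alternating direction method of multipliers, which updates each mode, each center frequency, and the multiplier in turn in the frequency domain. A sketch of these updates is given below. The hat denotes the Fourier transform, \(\hat{x}\left( \omega \right)\) is the spectrum of the original signal of Eq 1, the superscript \(n\) is the iteration counter, and the multiplier step size \(\tau\) is a symbol that is not used elsewhere in this paper:

$$\begin{gathered} \hat{u}_{k}^{n + 1} \left( \omega \right) = \frac{\hat{x}\left( \omega \right) - \sum_{i \ne k} \hat{u}_{i} \left( \omega \right) + \frac{\hat{\lambda }\left( \omega \right)}{2}}{1 + 2\alpha \left( \omega - \omega_{k} \right)^{2}} \hfill \\ \omega_{k}^{n + 1} = \frac{\int_{0}^{\infty } \omega \left| \hat{u}_{k} \left( \omega \right) \right|^{2} d\omega }{\int_{0}^{\infty } \left| \hat{u}_{k} \left( \omega \right) \right|^{2} d\omega } \hfill \\ \hat{\lambda }^{n + 1} \left( \omega \right) = \hat{\lambda }^{n} \left( \omega \right) + \tau \left( \hat{x}\left( \omega \right) - \sum_{k = 1}^{K} \hat{u}_{k}^{n + 1} \left( \omega \right) \right) \hfill \\ \end{gathered}$$

The first update acts as a Wiener-type filter centered at \(\omega_{k}\), so a larger α gives narrower-band modes. The second update places each center frequency at the center of gravity of the power spectrum of its mode.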

2.2 Brief Review of Fuzzy Entropy

The main steps for fuzzy entropy are as follows (Ref 11):

(1) For the original time series \(\left\{ {u\left( {\text{i}} \right):1 \le {\text{i}} \le {\text{N}}} \right\}\) and given embedding dimension m, new time series \(\left\{ {X_{{\text{i}}}^{{\text{m}}} , {\text{i}} = 1,2,{ } \cdots ,{\text{ N}} - {\text{m}} + 1} \right\}\) are constructed:

$$X_{{\text{i}}}^{{\text{m}}} = \left\{ {u\left( {\text{i}} \right),u\left( {{\text{i}} + 1} \right), \cdots ,u\left( {{\text{i}} + {\text{m}} - 1} \right)} \right\} - u_{0} \left( {\text{i}} \right)$$
(3)
$$u_{0} \left( i \right) = \frac{{\mathop \sum \nolimits_{k = 0}^{m - 1} u\left( {i + k} \right)}}{m}$$
(4)

(2) The distance \(d_{ij}^{m}\) between \(X_{i}^{m}\) and \(X_{j}^{m}\) is defined as the maximum absolute difference of their corresponding components.

$$\begin{gathered} d_{{{\text{ij}}}}^{{\text{m}}} = d\left[ {X_{{\text{i}}}^{{\text{m}}} ,X_{{\text{j}}}^{{\text{m}}} } \right] = \mathop {\max }\limits_{{{\text{k}} \in \left( {0,{\text{m}} - 1} \right)}} \left| {\left[ {u\left( {{\text{i}} + {\text{k}}} \right) - u_{0} \left( {\text{i}} \right)} \right] - \left[ {u\left( {{\text{j}} + {\text{k}}} \right) - u_{0} \left( {\text{j}} \right)} \right]} \right| \hfill \\ {\text{i}},{\text{j}} = 1,2, \ldots ,{\text{N}} - {\text{m}},{ } {\text{i}} \ne {\text{j}} \hfill \\ \end{gathered}$$
(5)

(3) Given n and r, the similarity degree \(D_{{{\text{ij}}}}^{{\text{m}}}\) is calculated by a fuzzy function.

$$D_{{{\text{ij}}}}^{{\text{m}}} = u\left( {d_{{{\text{ij}}}}^{{\text{m}}} ,{\text{n}},{\text{r}}} \right) = e^{{\left( { - \left( {d_{{{\text{ij}}}}^{{\text{m}}} } \right)^{n} /{\text{r}}} \right)}}$$
(6)

(4) The function \(\phi^{{\text{m}}}\) is calculated as follows:

$$\phi^{{\text{m}}} \left( {{\text{n}},{\text{r}}} \right) = \frac{1}{{{\text{N}} - {\text{m}}}}\mathop \sum \limits_{{{\text{i}} = 1}}^{{{\text{N}} - {\text{m}}}} \left( {\frac{1}{{{\text{N}} - {\text{m}} - 1}}\mathop \sum \limits_{{\begin{array}{*{20}c} {{\text{j}} = 1} \\ {{\text{j}} \ne i} \\ \end{array} }}^{{{\text{N}} - {\text{m}}}} D_{{{\text{ij}}}}^{{\text{m}}} } \right)$$
(7)

(5) Finally, the fuzzy entropy (FuzzyEn(m, n, r)) of the original time series is defined as the negative natural logarithm of the deviation of \(\phi^{{\text{m}}}\) from \(\phi^{{{\text{m}} + 1}}\).

$${\text{FuzzyEn}}\left( {{\text{m}},{\text{n}},{\text{r}},{\text{N}}} \right) = {\text{ln}}\phi^{{\text{m}}} \left( {{\text{n}},{\text{r}}} \right) - {\text{ln}}\phi^{{{\text{m}} + 1}} \left( {{\text{n}},{\text{r}}} \right)$$
(8)

The fuzzy entropy algorithm has four main parameters. Generally, r = (0.1~0.25)*SD, where SD is the standard deviation of the time series, n = 2, and m = 2. The last parameter, the data length N, should generally be in the range \(10^{m}\) to \(30^{m}\).
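
Steps (1) to (5) can be sketched in a few lines of Python. This is a minimal, unoptimized illustration that follows Eq 3 to 8 literally and is not the implementation used in the experiments. The function name `fuzzy_entropy` and the default `r_factor = 0.2` (a value inside the 0.1 to 0.25 range quoted above) are assumptions made for the example.

```python
import math


def fuzzy_entropy(u, m=2, n=2, r_factor=0.2):
    """Fuzzy entropy of the sequence u, following Eq 3 to 8 (O(N^2) sketch)."""
    N = len(u)
    mean = sum(u) / N
    sd = math.sqrt(sum((x - mean) ** 2 for x in u) / N)
    if sd == 0.0:
        raise ValueError("constant series: r = r_factor * SD would be zero")
    r = r_factor * sd
    count = N - m  # number of vectors compared, as in Eq 5 and Eq 7

    def phi(dim):
        # Eq 3 and 4: windows of length dim with their own mean removed
        vectors = []
        for i in range(count):
            window = u[i:i + dim]
            baseline = sum(window) / dim
            vectors.append([x - baseline for x in window])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i != j:
                    # Eq 5: maximum absolute component difference
                    d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                    # Eq 6: exponential fuzzy membership
                    total += math.exp(-(d ** n) / r)
        # Eq 7: average similarity over all pairs with i != j
        return total / (count * (count - 1))

    # Eq 8
    return math.log(phi(m)) - math.log(phi(m + 1))
```

A regular signal such as a sampled sine wave should give a lower value than an irregular one such as uniform random noise of the same length, which is the property exploited for weld quality features. The double loop costs O(N²) operations per call, so a vectorized implementation would be preferable for long records.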

3 Experiment and Analysis

3.1 Experiment

To study the relationship between the current signal and weld seam quality, a robotic GMAW experiment is carried out on Q235 steel plate specimens (Fig. 1). The current signal is acquired by a current sensor at a sampling rate of 10 kHz, and five types of weld seam quality are obtained by changing the current and shielding gas flow rate parameters (Table 1 and Fig. 2).

Fig. 1 GMAW experiment

Table 1 Weld process parameters of the experiment

Fig. 2 Weld seam of tests one to five in the experiment. (a) Weld seam of test one, (b) weld seam of test two, (c) weld seam of test three, (d) weld seam of test four, and (e) weld seam of test five

In GMAW, the flow rate of the shielding gas is a very important process parameter. When the flow rate is too low (0 L/min), the water vapor and oxygen in air will enter the high-temperature arc area and then undergo metallurgical reactions with the molten pool metal. This will lead to unstable welding processes, poor weld quality, and the presence of pores on the weld seam surface (Fig. 2a). Generally, the larger the current parameter value is, the greater the arc heat input, which will cause the force of the arc droplet to change. As the current parameter increases, the droplet transfer changes from short-circuit transfer to mixed transfer and then to spray transfer, and the weld quality also changes. When the current parameter is 80 A and the shielding gas flow rate is sufficient (15 L/min), stable short-circuit droplet transfer occurs, and good weld quality can be obtained (Fig. 2b). As the current parameter increases to 140 and 180 A, mixed droplet transfer occurs, and the welding process becomes unstable and exhibits a large amount of spatter (Fig. 2c and d). When the current parameter increases to 330 A, spray droplet transfer occurs, and the weld seam is very wide with a small amount of spatter (Fig. 2e). At the same time, the current signal is acquired.

3.2 VMD of the Current Signal of Different Weld Parameters

To avoid the influence of signal amplitude on the analysis of the relationship between the current signal and weld quality, all signals are normalized to (−1,1). The normalized current signals and frequency spectra for the two gas flow rates are shown in Fig. 3 (0 L/min) and Fig. 4 (15 L/min). The signal waveform fluctuates periodically and contains many frequency components in the low-frequency band. When the gas flow rate is insufficient (0 L/min), the external air entering the arc area causes the arc to become unstable, and droplet transfer is irregular. The frequency components of the signal are relatively dispersed, with many frequency peaks. However, when the gas flow rate is sufficient (15 L/min), arc droplet transfer is very stable. The frequency components of the signal are relatively concentrated (f1 = 65.92 Hz and f2 = 134.3 Hz), and the signal has stronger periodicity. These results show that a change in the gas flow rate directly affects the arc physical behavior, which in turn affects the fluctuation of the current signal.
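
The normalization step can be illustrated with a short Python sketch. The text does not state which scaling is used, so min-max scaling is assumed here; it maps the smallest sample to −1 and the largest to 1. The function name `normalize` is hypothetical.

```python
def normalize(signal):
    """Rescale a sequence linearly so that its minimum is -1 and its maximum is 1.

    Min-max scaling is an assumption; the paper only states the target range.
    """
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("constant signal cannot be rescaled")
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]
```

Because the mapping is linear, it changes the amplitude and offset of the current signal but not the location of its spectral peaks, which is why it can be applied before the frequency analysis.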

Fig. 3 Current signal and frequency distribution of test one. (a) Current signal and (b) frequency distribution

Fig. 4 Current signal and frequency spectrum of test two. (a) Current signal and (b) frequency distribution

To depict the frequency components of the current signal, the normalized current signal (Fig. 4) is decomposed by VMD, and several IMF components (u1-u9) are obtained (Fig. 5a). Compared to the result of EMD (Fig. 5b) (Ref 24), the periodicity of the signal components is stronger. Components imf 4-imf 9 of EMD and u4-u9 of VMD are selected to calculate the frequency spectrum by Fourier transform, which shows that the frequency content of the VMD components is more concentrated. In particular, the peak frequencies of the signal (Fig. 4) are f1 = 65.92 Hz and f2 = 134.3 Hz, and VMD accurately separates them into the u8 and u9 components (Fig. 6). In contrast, four EMD components (imf 5-imf 8) contain this peak frequency information (Fig. 7). Furthermore, for the normalized current signal with an insufficient gas flow rate (Fig. 3), the frequency distribution is dispersed. VMD can still extract several frequency peaks of this signal (Fig. 8 and 9), such as f1 = 68 Hz, f2 = 117 Hz, and f3 = 268 Hz. Therefore, VMD is more suitable than EMD for decomposing welding current signals.

Fig. 5 VMD and EMD of the current signal of test two. (a) VMD and (b) EMD

Fig. 6 Frequency spectrum of u4-9 of VMD of test two

Fig. 7 Frequency spectrum of imf 4-imf 9 of EMD of test two

Fig. 8 Frequency spectrum of u4-9 of VMD of test one

Fig. 9 Frequency spectrum of imf 4-imf 9 of EMD of test one

3.3 Fuzzy Entropy Analysis of VMD Component

To depict the complexity of the IMF components, fuzzy entropy is used for analysis. The current signals of the five kinds of weld quality (Table 1 and Fig. 2) are divided into segments of 4096 points, and a total of 500 data samples are obtained. Each sample is decomposed by VMD and EMD, and several frequency components are then selected to calculate the fuzzy entropy and sample entropy. The fuzzy entropy of the 500 data samples is calculated to obtain the mean value (Fig. 10). The result shows that the fuzzy entropy of VMD has good distinguishability for weld quality, and there is a certain difference between the fuzzy entropies of different IMF components. However, the fuzzy entropies of different frequency components differ only slightly after the signal is decomposed by EMD. To analyze the clustering of different weld types in the 500 data samples, a nonlinear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE (Ref 25)), is used for 3D visualization of the entropy features (Fig. 11), and the results show that the fuzzy entropy of VMD has better clustering performance than that of EMD.
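
As a worked illustration of the sample construction, the Python sketch below cuts a record into 4096-point segments and computes the time span and frequency resolution implied by the 10 kHz sampling rate of Section 3.1. Non-overlapping segments are an assumption, since the text does not say whether the segments overlap, and the helper names are hypothetical.

```python
FS = 10_000.0   # sampling rate in Hz (Section 3.1)
SEGMENT = 4096  # points per data sample


def split_segments(signal, length=SEGMENT):
    """Cut a record into non-overlapping segments, dropping any short tail."""
    count = len(signal) // length
    return [signal[i * length:(i + 1) * length] for i in range(count)]


def bin_frequency(k, fs=FS, length=SEGMENT):
    """Frequency in Hz of the k-th bin of a length-point discrete Fourier transform."""
    return k * fs / length


segment_duration = SEGMENT / FS      # 0.4096 s of welding per sample
frequency_resolution = FS / SEGMENT  # about 2.44 Hz between adjacent bins
```

Each sample therefore covers about 0.41 s of welding. Bins 27 and 55 of a 4096-point transform lie at about 65.92 Hz and 134.3 Hz, which matches the peak frequencies f1 and f2 quoted for test two, so those values are consistent with spectra computed on segments of this length.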

Fig. 10 Entropy of different signal decomposition methods. (a) Fuzzy entropies of u1-u9 of VMD, (b) sample entropies of u1-u9 of VMD, and (c) fuzzy entropies of imf 1-imf 8 of EMD

Fig. 11 Feature visualization based on t-SNE. (a) Fuzzy entropies of u1-u9 of VMD and (b) fuzzy entropies of imf 1-imf 8 of EMD

4 Quality Diagnosis Based on Fuzzy Entropy and OS-ELM

4.1 Brief Review of OS-ELM

The OS-ELM is an incremental fast learning algorithm for single-hidden layer feedforward neural networks, and it commonly contains an initialization phase and a sequential learning phase.

  1. Step 1: Initialization phase

    Initialize the learning using a small amount of initial training data \(\aleph_{0} = \left\{ \left( x_{i} , t_{i} \right) \right\}_{i = 1}^{N_{0}}\) from the given training set \(\aleph = \left\{ \left( x_{i} , t_{i} \right) | x_{i} \in R^{n} , t_{i} \in R^{m} , i = 1, \ldots ,N \right\}\), with \(N_{0} \ge \tilde{N}\), where \(\tilde{N}\) is the number of hidden nodes.

    1. (a)

      Assign arbitrary input weight \({\text{w}}_{i}\) and bias \(b_{i}\) (for additive hidden nodes) or center \(\mu_{i}\) and impact width \({\upsigma }_{i}\) (for RBF hidden nodes), \({\text{i}} = 1,{ } \ldots { },{\tilde{\text{N}}}\).

    2. (b)

      Calculate the initial hidden layer output matrix \(H_{0} = \left[ h_{1} , \ldots , h_{N_{0}} \right]^{\text{T}}\), where \(h_{i} = \left[ g\left( w_{1} \cdot x_{i} + b_{1} \right), \ldots , g\left( w_{\tilde{N}} \cdot x_{i} + b_{\tilde{N}} \right) \right]^{\text{T}}\), \(i = 1, \ldots ,N_{0}\).

    3. (c)

      Estimate the initial output weight \(\beta^{\left( 0 \right)} = {\text{M}}_{0} {\text{H}}_{0}^{{\text{T}}} T_{0}\), where \({\text{M}}_{0} = \left( {{\text{H}}_{0}^{{\text{T}}} H_{0} } \right)^{ - 1}\) and \(T_{0} = \left[ {t_{1} , \ldots , t_{{{\text{N}}_{0} }} } \right]^{{\text{T}}}\).

    4. (d)

      Set \({\text{k}} = 0\).

  2. Step 2: Sequential Learning Phase
    1. (a)

      Present the \(\left( {{\text{k}} + 1} \right){\text{th}}\) portion of the new observations \(\aleph_{{{\text{k}} + 1}} = \left\{ {\left( {{\text{x}}_{i} ,{\text{ t}}_{{\text{i}}} } \right)} \right\}_{{{\text{i}} = \left( {\mathop \sum \limits_{{{\text{j}} = 0}}^{{\text{k}}} {\text{N}}_{{\text{j}}} } \right) + 1}}^{{\mathop \sum \limits_{{{\text{j}} = 0}}^{{{\text{k}} + 1}} {\text{N}}_{{\text{j}}} }}\), where \(N_{{{\text{k}} + 1}}\) denotes the number of observations in the \(\left( {k + 1} \right){\text{th}}\) portion.

    2. (b)

      Calculate the hidden layer output vector \(h_{{\left( {{\text{k}} + 1} \right)}} = \left[ {g\left( {w_{1} \cdot x_{{\text{i}}} + b_{1} } \right), \ldots , g\left( {w_{{{\tilde{\text{N}}}}} \cdot x_{{\text{i}}} + b_{{{\tilde{\text{N}}}}} } \right)} \right]^{{\text{T}}}\).

    3. (c)

      Calculate the latest output weight \(\beta^{{\left( {{\text{k}} + 1} \right)}}\) based on the RLS algorithm:

      $$M_{{{\text{k}} + 1}} = M_{{\text{k}}} - \frac{{M_{{\text{k}}} h_{{{\text{k}} + 1}} h_{{{\text{k}} + 1}}^{{\text{T}}} M_{{\text{k}}} }}{{1 + h_{{{\text{k}} + 1}}^{{\text{T}}} M_{{\text{k}}} h_{{{\text{k}} + 1}} }}$$
      (9)
      $$\beta^{{\left( {{\text{k}} + 1} \right)}} = \beta^{{\left( {\text{k}} \right)}} + M_{{{\text{k}} + 1}} h_{{{\text{k}} + 1}} \left( {t_{{\text{i}}}^{{\text{T}}} - h_{{{\text{k}} + 1}}^{{\text{T}}} \beta^{{\left( {\text{k}} \right)}} } \right)$$
      (10)
    4. (d)

      Set \({\text{k}} = {\text{k}} + 1\) and then train the next sample.
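
The two phases above can be sketched in plain Python for sigmoid additive hidden nodes and one new sample per sequential step. This is a minimal illustration under those assumptions and not the implementation used in the experiments. All function names are hypothetical, list-based linear algebra is used only to keep the sketch free of dependencies, and `L` stands for the number of hidden nodes \(\tilde{N}\).

```python
import math


def hidden(x, W, b):
    """Hidden-layer output h for one input x, with sigmoid additive nodes."""
    return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + bi)))
            for w, bi in zip(W, b)]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(A)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def init_phase(X0, T0, W, b):
    """Step 1: M0 = (H0^T H0)^-1 and beta0 = M0 H0^T T0."""
    H0 = [hidden(x, W, b) for x in X0]
    L, m = len(W), len(T0[0])
    HtH = [[sum(h[a] * h[c] for h in H0) for c in range(L)] for a in range(L)]
    HtT = [[sum(h[a] * t[c] for h, t in zip(H0, T0)) for c in range(m)]
           for a in range(L)]
    M = inverse(HtH)
    beta = [[sum(M[a][k] * HtT[k][c] for k in range(L)) for c in range(m)]
            for a in range(L)]
    return M, beta


def sequential_step(M, beta, x, t, W, b):
    """Step 2: recursive least-squares update for one sample, Eq 9 and 10."""
    h = hidden(x, W, b)
    L, m = len(h), len(t)
    Mh = [sum(M[a][k] * h[k] for k in range(L)) for a in range(L)]
    denom = 1.0 + sum(hk * v for hk, v in zip(h, Mh))
    # Eq 9; M is symmetric, so h^T M equals (M h)^T
    M = [[M[a][c] - Mh[a] * Mh[c] / denom for c in range(L)] for a in range(L)]
    gain = [sum(M[a][k] * h[k] for k in range(L)) for a in range(L)]
    error = [t[c] - sum(h[k] * beta[k][c] for k in range(L)) for c in range(m)]
    # Eq 10
    beta = [[beta[a][c] + gain[a] * error[c] for c in range(m)] for a in range(L)]
    return M, beta


def predict(x, beta, W, b):
    """Return the index of the largest output node for input x."""
    h = hidden(x, W, b)
    scores = [sum(h[k] * beta[k][c] for k in range(len(h)))
              for c in range(len(beta[0]))]
    return scores.index(max(scores))
```

A useful property to check is that, after all samples have been presented one by one, the output weights agree with the batch least-squares solution computed on the whole training set up to rounding error. This is why the sequential phase does not need to store past samples.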

4.2 The Classification of Weld Quality Based on Fuzzy Entropy and OS-ELM

The fuzzy entropies of IMFs are different for different weld quality types. The fuzzy entropies of IMFs are input into the OS-ELM, and the weld quality type is output. A multiclassifier model for weld quality based on VMD fuzzy entropy and OS-ELM is shown in Fig. 12.

Fig. 12 Weld quality diagnosis based on VMD fuzzy entropy and OS-ELM

The steps of weld quality diagnosis are described as follows:

  1. (1)

    Obtain weld current signals for different weld quality types (Table 1 and Fig. 2) and normalize all signals to (−1,1) to avoid the influence of signal amplitude on the analysis results.

  2. (2)

    Decompose the normalized current signal into a series of band-limited IMFs by VMD and select the first nine IMFs that contain the weld quality information.

  3. (3)

    Extract the fuzzy entropies of the first nine IMFs and construct the input feature vector.

  4. (4)

    Diagnose weld quality based on the OS-ELM.

The OS-ELM classification model is constructed, the input vector is the first nine fuzzy entropies, and the output vector is five kinds of weld quality.

VMD is performed on each segment of 4096 points, the fuzzy entropies of the first nine IMFs are calculated as the input vector of the OS-ELM, and a total of 500 data samples are obtained. A total of 250 samples are used as training samples, and the other 250 samples are used as prediction samples. Generally, the number of hidden neurons and the activation function are very important parameters for the OS-ELM classification model. The classification accuracies for different numbers of hidden neurons and different activation functions are obtained (Fig. 13), demonstrating that the sig function is the more suitable activation function for weld quality diagnosis. When the number of hidden neurons is 30, the classification accuracy of OS-ELM reaches 95.2%. In addition, to verify the effectiveness of this method, the sample entropy (Ref 26) of the first nine IMFs of VMD and the fuzzy entropy of the first eight IMFs of EMD are used as input feature vectors for comparison (Table 2). The 250 training samples are randomly selected in each of ten trials, and the diagnosis results show that the method based on VMD fuzzy entropy and OS-ELM obtains better classification, with an average accuracy of 95.5% (Fig. 14).
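
The evaluation protocol, a random 250/250 split repeated for ten trials, can be sketched as follows. This is an illustrative Python sketch and not the evaluation code behind Table 2. A plain random shuffle is assumed because the text does not say whether the split is stratified by weld type, and `fit` and `predict` are hypothetical callables standing in for any classifier.

```python
import random


def repeated_holdout(features, labels, fit, predict,
                     n_train=250, trials=10, seed=0):
    """Average test accuracy over several random train/test splits.

    fit(train_features, train_labels) returns a model, and
    predict(model, feature_vector) returns a predicted label.
    """
    rng = random.Random(seed)
    indices = list(range(len(features)))
    accuracies = []
    for _ in range(trials):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / trials, accuracies
```

Reporting the per-trial accuracies alongside their mean shows how much the result depends on the particular split.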

Fig. 13 OS-ELM with different numbers of hidden neurons and activation functions

Table 2 Average classification accuracy of different methods

Fig. 14 Ten trials are carried out for diagnosing each dataset

5 Conclusions

A weld quality diagnosis method based on VMD and fuzzy entropy is proposed to extract quality features, and the fuzzy entropies of the IMFs are chosen as the input features of the OS-ELM classifier model, which achieves high classification accuracy. The results are summarized as follows:

  1. (1)

    Changes in gas flow rates and current parameters directly affect the arc physical behavior, and the fluctuation of the current signal is also affected.

  2. (2)

    VMD is more suitable for signal decomposition than EMD in welding, and the frequency of the IMF component is more concentrated.

  3. (3)

    Fuzzy entropy can measure the complexity of the current signal, and the results show that the fuzzy entropies of VMD have good distinguishability for different weld quality types.

  4. (4)

    The first nine fuzzy entropies of VMD are selected as the input features to the OS-ELM classifier model, and the classification accuracy of five weld quality types reaches 95.5%.