1 Introduction

Currently, intelligent manufacturing is an important extension of manufacturing automation, and many countries have proposed different future policies to improve the efficiency and autonomy of modern manufacturing; these include “Industry 4.0” by Germany and “Intelligent manufacturing 2025” by China. Robot welding is a typical representative of intelligent manufacturing that has broad applications in many areas. However, robot welding is a relatively complex manufacturing process that depends on many factors, such as welding voltage, welding current, welding speed, and welding gun height. As a result, welding defects that affect welding quality inevitably appear in welding workpieces. Different welding defects have different impacts on the structural strengths and comprehensive performances of welding objects. Therefore, an effective and accurate defect recognition system is a key element of intelligent welding robots that can effectively help with the assessment of structural properties and system maintenance [1].

For welding defect detection, a suitable sensor system is the key component of the detection system. Until now, different sensors, such as vision sensors [2], infrared sensors [3], ultrasonic sensors [1], and X-ray sensors [4], have been applied in industrial inspection applications. Compared with other sensors, nondestructive X-ray detection sensors can acquire the internal structure and defects effectively, and this can help to evaluate the effects of defects on the structural strengths and the comprehensive performances of different objects. Duan et al. proposed an automatic inspection method for detecting welding defects with X-ray images [5]. Roy et al. proposed a welding defect identification method for friction stir welding by using X-ray micro-CT scans [6]. Inspired by these works, the X-ray inspection is proposed in this paper for welding defect recognition.

Faced with the weak-textured and weak-contrast welding images produced by X-ray inspection, a novel welding defect recognition algorithm based on multi-feature fusion is proposed to assist with assessing structural properties and performing system maintenance. It is evaluated and verified on a public dataset (GDXray set) through a comprehensive experimental analysis and comparison. The main contributions of this paper can be summarized as follows: (1) To address the training issue of recognition networks, an effective data augmentation algorithm is proposed to enlarge and construct the dataset. (2) Combined with transfer learning and the pre-trained AlexNet network model, a novel feature extraction method is proposed for multi-scale feature extraction of X-ray welding images to acquire abstract and effective image features. (3) To ensure the detection precision of the proposed approach, a welding defect recognition algorithm based on multi-feature fusion, which fuses support vector machine (SVM) classifiers with Dempster-Shafer (DS) evidence theory, is proposed to realize accurate defect detection.

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 shows the system framework of the proposed method. Section 4 describes the data augmentation algorithm for welding images. Section 5 describes the feature extraction methods used. Section 6 explains the proposed defect recognition algorithm. Section 7 presents detailed experiments and discussions. Finally, Section 8 describes the conclusions and future prospects of this paper.

2 Related Work

To improve the recognition efficiency and precision of detection methods, a considerable amount of literature has been published on welding defect recognition. These studies can mainly be divided into three categories: image-based methods [7], feature-based methods [8] and deep learning-based methods [9].

2.1 Image-Based Methods

Image-based methods are conventional image analysis methods for different detection tasks based on the principles of image morphology [10].

Owing to the good robustness and high precision of laser structured light, Chu et al. proposed an automatic post-welding quality detection method [11]. Laser structured light acted as the robot sensor to acquire the 3D profile of weld beads. On this basis, the detailed parameters of the weld beads and welding defects were extracted. To improve the measurement efficiency of laser structured light, some optimized laser structured light sensors have been designed for welding robots. Zhang et al. proposed a weld bead inspection method based on cross-structured light [12]. Jia et al. proposed an inspection method for weld beads based on grid laser structured light sensors [13]. However, structured light sensors are local sensors, and they can only acquire limited measurement data in each measurement. To address this issue, some researchers have proposed inspection methods based on passive light vision that can acquire additional measurement information over larger measurement ranges. Combined with monocular vision, Du et al. proposed an inspection method for weld beads based on the shape from shading (SFS) algorithm [14]. Chen et al. developed a defect detection algorithm based on X-ray welding images [15]. Combined with optimized image smoothing and the information fusion method, Du et al. proposed a real-time defect inspection method based on X-ray welding images [16].

For welding defect recognition, image-based methods always involve many processing steps, such as image filtering, edge analysis, and image postprocessing. However, a complex welding environment reduces the robustness of such algorithms. Therefore, image-based methods rely on a priori knowledge and are mainly designed for specific objects or application scenes.

2.2 Feature-Based Methods

Because they perform well on small-scale samples, many researchers have proposed feature-based recognition methods that combine feature vectors with different classifiers to detect welding defects [17].

In our previous work, a welding defect inspection algorithm was proposed based on an SVM classifier [18]. Combined with monocular vision, the 3D profiles of weld beads were acquired based on the SFS algorithm. On this basis, a defect recognition algorithm was proposed based on the 3D curvature features and the SVM classifier. Kasban et al. proposed a new welding defect detection approach based on radiography images [19]. The discrete wavelet transform (DWT), discrete cosine transform (DCT), and discrete sine transform (DST) were used for effective feature extraction, and an artificial neural network (ANN) was built for defect detection. Duan et al. proposed an automatic welding defect detection method based on X-ray welding images [5]. It could be mainly divided into three steps: defect extraction, detection and recognition. Defect extraction was used to detect potential defects. On this basis, defect detection and recognition were solved by the adaptive cascade boosting (AdaBoost) classifier. Das et al. proposed a welding quality evaluation method based on an ANN model [20]. The wavelet packet transformation was used for the feature extraction of friction stir welding, and the ANN model was built for accurate quality evaluation.

Feature-based detection methods provide a fast and accurate detection scheme for different small-scale samples. However, the detection performance relies on effective feature selection and design. Achieving a strong image feature expression in complex welding environments, with their varying backgrounds and materials, remains a challenge.

2.3 Deep Learning-Based Methods

With the strong support of hardware platforms and big data, deep learning methods have developed rapidly; they can process raw data well and provide an end-to-end detection scheme. Based on the strong feature expression abilities of deep learning models, many researchers have sought to apply deep learning methods in welding robots to realize intelligent detection schemes [21,22,23].

Combined with a three-way image acquisition system, Zhang et al. proposed an online defect detection method based on a convolutional neural network (CNN) [24]. Based on transfer learning, Sassi et al. proposed a quality control and assessment method for the inspection of welding defects [25]. To realize the inspection of small-scale weld beads in complex welding environments, Yang et al. proposed a weld bead location method based on a deep convolutional neural network (DCNN) [26]. Günther et al. proposed a representation and prediction method for laser welding [27]. A deep auto-encoding neural network was used for the feature expression of welding images, and the temporal-difference learning algorithm was adopted for automatic predictions about the welding process. Combined with the SqueezeNet-based CNN model, Yang et al. proposed a machine vision-based surface defect detection method with multi-scale and channel-compressed features [28]. Inspired by feature fusion, Gao et al. proposed a vision-based defect recognition method [29]. The Gaussian pyramid was used to generate multiscale images of defects. On this basis, a pretrained VGG16 CNN model was applied to the multiscale images to learn strong image features, and these outputs were fused to improve the recognition precision of the model. By fusing a CNN model with a multilayer perceptron (MLP), Makantasis et al. proposed a fully automated tunnel assessment method [30]. Combined with a deep semantic segmentation network, Zou et al. proposed an automatic crack detection and location method [31]. Gong et al. proposed a defect detection method for aeronautics composite materials based on a deep transfer learning model, which could be well applied to inclusion defect detection from X-ray images [32].

Although deep learning achieves good detection performances in many application scenarios, it mainly relies on labeled data for network training purposes. The manual annotation of large datasets of welding images is a time-consuming and laborious task. Furthermore, for real welding production, it is not easy to collect many samples under different welding situations for model training. More importantly, when faced with a complex welding process, different welding parameters cause different welding defects. Unbalanced samples of welding defects will also affect the detection performances of deep learning models.

3 System Framework

For different welding objects, X-ray detection can acquire internal defect information, which is the basis for accurately assessing structural properties and performing system maintenance. To effectively help with the quantitative assessment of the effects of welding defects on the comprehensive performances of welding objects, a welding defect recognition system based on X-ray detection is set up in this paper, as shown in Fig. 1.

Fig. 1 The framework of the proposed method

Nevertheless, X-ray welding images present some unique characteristics that bring certain challenges for accurate welding defect recognition.

(1) Due to the materials of welding workpieces, X-ray welding images exhibit weak-textured and weak-contrast features, which affect the accuracy of feature expression.

(2) The welding defects are small-scale samples, and it is not easy to collect sufficient training samples for model training.

To address the above issues regarding welding defect recognition, a novel welding defect recognition algorithm is proposed. Its performance on X-ray welding images depends on the following key components.

(1) Data Acquisition: In addition to an X-ray detection system, a suitable dataset needs to be set up, and this is the basis of the welding defect recognition system.

(2) Feature Expression: Faced with weak-textured and weak-contrast X-ray welding images, the effective feature expression of welding images is the core of the defect recognition system.

(3) Defect Recognition: To ensure the detection performance of the proposed method on small-scale welding defects, a suitable recognition model is also a key component of the defect recognition system.

4 Data Augmentation

4.1 Dataset

GDXray is a public X-ray image set provided for research and educational purposes only. It includes a subset of welding images (Welds) collected by the BAM Federal Institute for Materials Research and Testing, Berlin, Germany [33], and this subset is composed of 78 welding images with a length of 4K. Figure 2 shows some samples of X-ray welding images.

Fig. 2 Samples of X-ray welding images

As shown in Fig. 2, the welding images are weak-textured and weak-contrast, and cracks, blow holes or solids appear randomly in them.

A variety of different welding defects exist in the welding images of this set. Two main flaws are considered in this paper for welding defect recognition [34]: cracks and blow holes (or solids), as shown in Fig. 3.

Fig. 3 Samples of welding defects. a, c Cracks. b, d Blow holes or solids

4.2 Image Preprocessing

X-ray welding images are inevitably affected by image noise during the process of image collection. To improve the image quality, image preprocessing is proposed to reduce the noise.

The valuable information contained in 2D X-ray images is mainly concentrated in the low-frequency part, whereas the image noise mainly consists of high-frequency signals. On this basis, a Gaussian low-pass filter is applied to the 2D X-ray images for image filtering.

However, the Gaussian low-pass filter cannot completely remove the image noise, and some relatively large noise remains. The median filter provides a good means of removing this large noise while effectively retaining the image details. Therefore, a preprocessing method for 2D X-ray images is proposed that combines the Gaussian low-pass filter and the median filter.
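The two-stage preprocessing described above can be sketched as follows. This is a minimal illustration rather than the exact implementation used in this paper: the frequency-domain form of the Gaussian filter, the `cutoff` value, and the 3 × 3 median window are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_lowpass(image, cutoff=30.0):
    """Attenuate high frequencies with H = exp(-D^2 / (2 * cutoff^2))."""
    rows, cols = image.shape
    # Integer frequency indices laid out the same way as np.fft.fft2 output.
    u = np.fft.fftfreq(rows) * rows
    v = np.fft.fftfreq(cols) * cols
    dist_sq = u[:, None] ** 2 + v[None, :] ** 2
    transfer = np.exp(-dist_sq / (2.0 * cutoff ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighborhood."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    rows, cols = image.shape
    # Stack every shifted view of the window, then take the per-pixel median.
    windows = [padded[r:r + rows, c:c + cols]
               for r in range(size) for c in range(size)]
    return np.median(np.stack(windows, axis=0), axis=0)

def preprocess(image, cutoff=30.0, size=3):
    """Gaussian low-pass filtering followed by median filtering."""
    smoothed = gaussian_lowpass(image.astype(np.float64), cutoff)
    return median_filter(smoothed, size)
```

Applying the low-pass filter first suppresses the broadband noise, and the median filter then removes the remaining isolated outliers; both parameters would need to be tuned on real X-ray images.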

4.3 Random Cropping

The Welds subset in GDXray includes only 78 X-ray welding images, which is too few to be used directly for model training and testing. Because the raw images have a length of 4K, random cropping is used to obtain many image patches from them for constructing the dataset.

As shown in Fig. 3, image patches with different sizes, such as 320 × 320 and 240 × 240, are acquired from the X-ray welding images to construct the dataset. The set includes three types of samples: cracks, blow holes or solids, and no defects. Figure 4 shows some samples of different patches.

Fig. 4 Samples of welding defects
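A minimal sketch of the random cropping step is given below. The function name `random_crops`, the number of patches, and the uniform sampling of positions are assumptions made for illustration; the class label of each patch (crack, blow hole or solid, no defect) would still have to come from annotation, which is not shown.

```python
import numpy as np

def random_crops(image, patch_sizes=(320, 240), n_patches=10, seed=0):
    """Cut square patches of randomly chosen size and position from one image."""
    rng = np.random.default_rng(seed)
    height, width = image.shape[:2]
    patches = []
    for _ in range(n_patches):
        size = int(rng.choice(patch_sizes))
        if size > height or size > width:
            continue  # this size does not fit, so fewer patches are returned
        top = int(rng.integers(0, height - size + 1))
        left = int(rng.integers(0, width - size + 1))
        patches.append(image[top:top + size, left:left + size].copy())
    return patches
```

Because the patches come in different sizes, they would typically be resized to the input size of the feature extractor before use.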

5 Feature Extraction

For the task of recognizing welding defects, the weak-textured and weak-contrast X-ray images bring certain challenges to the feature expression of welding images. Furthermore, multi-scale samples also have some impact on defect recognition. To ensure the recognition precision of the proposed approach for welding defects, an effective feature expression method is a core component of the defect recognition algorithm.

5.1 Transfer Learning

For feature expression and description, many researchers have proposed different handcrafted features, such as histograms of oriented gradients (HOGs) [35] and local binary patterns (LBPs) [36]. They can be used to extract low-level image features, such as edges and textures. For weak-textured and weak-contrast X-ray welding images, these handcrafted features have certain limitations with respect to high-precision defect detection.

Transfer learning is a typical representative of multi-task learning models that can transfer the learned information from the source domain to the target domain. It does not need a large training dataset regarding the target domain, and this enables researchers to avoid a large amount of data collection and annotation work. Through model training on a large-scale image dataset, transfer learning provides a good detection scheme for a small-scale dataset. To acquire strong features from X-ray welding images, combined with transfer learning, a pre-trained CNN network is adopted to act as a feature extractor for X-ray welding images.

The AlexNet network is a typical CNN model that achieves good detection performance on ImageNet [37]. Figure 5 shows the structure of the AlexNet network. For welding image defect recognition, an AlexNet network pre-trained on ImageNet is adopted for the feature expression of welding images.

Fig. 5 The network structure of AlexNet

Fig. 6 Feature maps generated by the AlexNet network for different samples

Specifically, as shown in Fig. 5, due to the 3 input channels of the pre-trained AlexNet network, the gray-scale images are converted to RGB images to serve the feature extraction of X-ray welding images.
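One simple way to perform this conversion, assumed here for illustration, is to copy the single gray channel into all three input channels. The helper name `gray_to_rgb` is hypothetical, and the paper does not state which conversion was actually used.

```python
import numpy as np

def gray_to_rgb(gray):
    """Replicate a 2D gray-scale image into three identical channels (H, W, 3)."""
    if gray.ndim != 2:
        raise ValueError("expected a 2D gray-scale image")
    return np.repeat(gray[:, :, np.newaxis], 3, axis=2)
```

The resulting three-channel array keeps the original intensities, so no information is added or lost before the image is passed to the pre-trained network.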

5.2 Feature Selection

Generally, a single image feature has limited feature expression ability, and this results in certain limitations with respect to recognition tasks with complex samples. The pre-trained AlexNet network can generate many different feature maps for the given images. Faced with these feature maps, a suitable feature selection scheme is a core part of the proposed defect recognition algorithm.

For the pre-trained CNN network model, different network layers can acquire different feature maps with different spatial resolutions. To effectively demonstrate the feature expression abilities of different network layers, with typical samples, such as blow holes or solids, cracks and no defects, Fig. 6 shows the feature maps generated from different network layers.

As shown in Fig. 6, for the experimental samples of different categories, there are obvious differences between the feature maps, and these differences could help with defect recognition for welding images. Furthermore, the shallow network layers acquire lower-level feature maps with higher spatial resolution, which are suitable for small-scale objects. As the number of network layers increases, the network acquires higher-level and more abstract feature maps that are suitable for large-scale object detection or recognition.

However, as the number of network layers increases, the features describing details or micro defects are lost, which affects the detection precision on multi-scale samples. For welding defects (see Fig. 4), there are large gaps in image scale between different samples, and some micro defects also exist in the welding images. Therefore, a single feature map from one specific network layer cannot meet the detection demands of defect recognition.

To ensure the detection precision of the recognition network with respect to multi-scale welding defects, multi-feature fusion is proposed to enhance its detection performance. Here, combined with the pre-trained AlexNet, the feature maps from different network layers are fused for high-precision defect detection.

6 Defect Recognition

On the basis of data augmentation, this section focuses on the defect recognition algorithm for X-ray welding image patches. To effectively solve the recognition issue regarding small-scale welding defects, a novel welding defect recognition algorithm is proposed based on the shallow learning method. Detailed descriptions of defect classification and feature fusion are provided in this section.

6.1 Defect Classification

For accurate defect recognition in welding images, faced with small-scale samples, an effective classifier is also a key part of the whole recognition system. To date, different classifiers have been proposed for different recognition tasks related to small-scale samples, such as ANNs [38], AdaBoost [39], K-nearest neighbors (KNN) [40], and SVMs [41]. Based on its excellent classification performances on small-scale samples, nonlinear problems, and high-dimensional spaces, the SVM classifier is proposed for accurate defect recognition, as shown in Fig. 7.

Fig. 7 A diagram of the SVM classifier

Fig. 8 Diagram of multi-feature fusion

The image features are fed into the SVM classifier as its input. To classify high-dimensional and linearly non-separable samples, an inner product (kernel) function is applied in the SVM classifier to realize a nonlinear mapping transformation. Due to its good nonlinear mapping capability, the radial basis function (RBF) is adopted as the kernel function, as shown in Eq. 1.

$$\begin{aligned} K({x},{y}) = \exp ( - g{\left\| {{x} - {y}} \right\| ^2}), \end{aligned}$$
(1)

where x and y are two feature vectors and g denotes the parameter of the RBF kernel. The optimal parameters of the SVM classifier are determined by a grid search strategy combined with cross-validation.
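As a small worked illustration of Eq. 1, the kernel value for two feature vectors can be computed as below. The candidate grid is purely hypothetical: the paper does not list the searched values, and the penalty parameter `C` is the usual second SVM parameter, assumed here rather than taken from the text.

```python
import math
from itertools import product

def rbf_kernel(x, y, g):
    """Eq. 1: K(x, y) = exp(-g * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-g * sq_dist)

def parameter_grid(c_exponents=range(-2, 5), g_exponents=range(-6, 1)):
    """Hypothetical (C, g) candidates on a power-of-two grid for a grid search."""
    return [(2.0 ** c, 2.0 ** g) for c, g in product(c_exponents, g_exponents)]
```

In a grid search, each (C, g) pair would be scored by cross-validation on the training set and the best-scoring pair kept. A larger g makes the kernel decay faster with distance, so only very close samples are treated as similar.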

6.2 Feature Fusion

The image feature maps of the pre-trained AlexNet model have different feature lengths. To effectively fuse these different features, DS evidence theory is adopted for multi-feature fusion. It is a typical fusion method that can combine multiple sources of evidence, such as multichannel sensor data and multiple classifiers. The flow chart of multi-feature fusion is shown in Fig. 8.

The image features obtained from different network layers of the pre-trained AlexNet model are fed into separate SVM classifiers to obtain the prediction probabilities for the X-ray welding images. These prediction probabilities are then input into the DS evidence theory module for fusion as follows.

$$\begin{aligned} m(\mathrm{{W}}) = \frac{1}{K}\sum \limits _{{A_1} \cap {A_2} \cap {A_3}= W} {{m_1}({A_1})\cdot } {{m_2}({A_2})\cdot } {m_3}({A_3}) \end{aligned}$$
(2)
$$\begin{aligned} K = \sum \limits _{{A_1} \cap {A_2} \cap {A_3} \ne \emptyset } {{m_1}({A_1}) \cdot } {{m_2}({A_2})\cdot } {m_3}({A_3}) \end{aligned}$$
(3)

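where \(m(W)\) is the fused output probability that the welding image has status W, \(m_i\,(i=1,2,3)\) are the output probabilities of the different SVM classifiers, \(A_i\) is a candidate status assigned by the i-th classifier, and K is the normalization factor.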
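Equations 2 and 3 can be illustrated with a short sketch of Dempster's combination rule. The three probability vectors below are made-up values, not outputs of the classifiers in this paper, and the function names are hypothetical. Each classifier is assumed to assign its probabilities to single statuses only.

```python
from itertools import product

def dempster_combine(*mass_functions):
    """Fuse mass functions (dict: frozenset of statuses -> mass) via Eqs. 2 and 3."""
    unnormalized = {}
    for combo in product(*(m.items() for m in mass_functions)):
        focal_sets = [focal for focal, _ in combo]
        intersection = frozenset.intersection(*focal_sets)
        if not intersection:
            continue  # conflicting evidence is excluded from both sums
        weight = 1.0
        for _, mass in combo:
            weight *= mass
        unnormalized[intersection] = unnormalized.get(intersection, 0.0) + weight
    k = sum(unnormalized.values())  # normalization factor K of Eq. 3
    if k == 0.0:
        raise ValueError("the evidence is in total conflict and cannot be fused")
    return {focal: value / k for focal, value in unnormalized.items()}

def masses_from_probabilities(probabilities, statuses):
    """Turn one classifier's class probabilities into singleton mass assignments."""
    return {frozenset([s]): p for s, p in zip(statuses, probabilities)}

STATUSES = ("crack", "blow hole or solid", "no defect")
m1 = masses_from_probabilities((0.6, 0.3, 0.1), STATUSES)  # illustrative values
m2 = masses_from_probabilities((0.5, 0.4, 0.1), STATUSES)  # illustrative values
m3 = masses_from_probabilities((0.7, 0.2, 0.1), STATUSES)  # illustrative values
fused = dempster_combine(m1, m2, m3)
best = max(fused, key=fused.get)
```

With these illustrative inputs, the fused mass for the crack status is 0.21 / 0.235, or about 0.89, which is higher than the 0.7 reported by the most confident single classifier. This shows how agreeing evidence reinforces a decision.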

To better illustrate the flow chart of the proposed method, Algorithm 1 shows the pseudocode of the whole training process.

Algorithm 1 The training process of the proposed method

7 Experiments and Discussions

To verify the effectiveness and superiority of the proposed method, this section tests the model performance through a comprehensive experimental analysis and comparison.

First, the detailed experimental configuration is described. Second, the effectiveness of different feature expression methods is verified. Third, different multi-feature fusion experiments are carried out on the constructed welding image dataset. Finally, the superiority of the proposed method is tested through an experimental comparison with other advanced methods.

7.1 Experimental Configuration

The Welds subset of GDXray is divided into two parts with a ratio of 55:45 (training set and test set). For model validation purposes, these two sets are disjoint.

On this basis, due to the 4K length, to enlarge and construct the dataset, some image patches are acquired by random cropping. They are labeled with different values for algorithm verification. Furthermore, the numbers of samples belonging to different categories are almost the same to avoid the issue of imbalanced data. Detailed information about the dataset is shown in Table 1.

Table 1 Data augmentation based on random cropping
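The image-level split described at the start of this subsection can be sketched as follows. Splitting the raw images before cropping keeps patches from the same image out of both sets. The random shuffle, the seed, and the rounding rule are assumptions for illustration, since the paper does not state how the images were assigned.

```python
import random

def split_images(image_ids, train_ratio=0.55, seed=0):
    """Shuffle image identifiers and split them into disjoint train/test lists."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_ratio)  # rounding rule is an assumption
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers for the 78 images of the Welds subset.
train_ids, test_ids = split_images(range(78))
```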

The proposed defect detection method includes multiple SVM classifiers for feature fusion. Five-fold cross-validation is utilized for these SVM classifiers to ensure the reliability of the experimental results.
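For reference, five-fold cross-validation partitions the training samples as sketched below. The helper `k_fold_indices` is a hypothetical name and uses contiguous folds for clarity; in practice the samples would normally be shuffled first.

```python
def k_fold_indices(n_samples, n_folds=5):
    """Yield (train_indices, validation_indices) for each of the n_folds folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size
```

Each sample is used for validation exactly once, and the reported score is the average over the five folds.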

7.2 Feature Extraction

Different convolution layers of the AlexNet model have different feature expression abilities, and this leads to different detection precisions. On the basis of the dataset, the feature expression abilities of typical network layers (see Fig. 5) are tested. The experimental results on welding images are shown in Table 2.

Table 2 Identification results of transfer learning

Table 2 shows that the different network layers result in different identification precision rates due to their different feature expression abilities. For the shallow network layer, lower-level feature maps yield relatively lower identification precision and vice versa.

Furthermore, to demonstrate the feature expression performance of transfer learning, common handcrafted features, such as HOG and LBP, are also set as comparison methods. Based on the welding defect dataset, the classification results of different handcrafted features are shown in Table 3.

Table 3 Identification results of handcrafted features

As shown in Table 3, transfer learning can achieve excellent recognition performance on X-ray welding images compared with handcrafted features. It can be seen that the pre-trained deep network models have stronger feature expression abilities than the handcrafted features through network training on the large-scale image set, so they can acquire more effective image features.

7.3 Feature Fusion

To ensure the recognition performance of the proposed method with respect to welding defects, multi-feature fusion is used for accurate defect detection. Combined with a pre-trained AlexNet network, to solve the classification problem for multi-scale defect samples, low, middle and deep feature maps are fused together to improve the detection precision for welding defects.

For the AlexNet network, different network layers have different feature expression abilities, and these have certain effects on feature fusion. Here, different combinations of network layers are tested. Furthermore, to verify the fusion performance of the proposed method, a common feature fusion method, score-level fusion [42], is set as a comparative method. Table 4 shows the classification results of different fusion methods.

Table 4 Classification results of different features
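For comparison with the DS-based fusion, a score-level fusion baseline can be sketched as a weighted average of the classifiers' probability vectors. This sum rule is only one common variant and is assumed here; the exact scheme of [42] is not described in this paper, and the input scores are made up.

```python
def score_level_fusion(score_lists, weights=None):
    """Fuse per-classifier class scores by a weighted average (sum rule)."""
    n_classifiers = len(score_lists)
    if weights is None:
        weights = [1.0 / n_classifiers] * n_classifiers
    n_classes = len(score_lists[0])
    return [sum(w * scores[c] for w, scores in zip(weights, score_lists))
            for c in range(n_classes)]

def predict(fused_scores):
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused_scores)), key=fused_scores.__getitem__)

# Illustrative scores for (crack, blow hole or solid, no defect).
scores = [(0.6, 0.3, 0.1), (0.5, 0.4, 0.1), (0.7, 0.2, 0.1)]
fused_scores = score_level_fusion(scores)
```

Averaging keeps the fused score between the individual scores, whereas the multiplicative DS rule sharpens the result when the classifiers agree.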

As shown in Table 4, compared with the single features, feature fusion further improves the recognition performance for welding defects. Because the Conv\(\_\)3 layer can acquire more effective features than the Conv\(\_\)2 layer, the feature fusion of the Conv\(\_\)3, Pooling\(\_\)5 and FC\(\_\)2 layers results in a higher classification precision. The score-level fusion method achieves a result similar to that of the proposed method, but the proposed method based on DS evidence theory achieves a higher recognition precision.

To better show the recognition performance of the proposed method, its confusion matrix is given in Fig. 9.

Fig. 9 The confusion matrix of the proposed method

In the confusion matrix in Fig. 9, the proposed recognition method shows relatively poor precision on crack defects. Some micro cracks are similar to the normal samples, while parts of large cracks produce image features similar to those of blow holes or solids. These factors reduce the recognition precision for crack defects. On the whole, the proposed recognition method achieves better recognition performance on X-ray welding images than single features or other fusion methods.
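For readers who want to reproduce this kind of analysis, a confusion matrix and the per-class precision and recall can be computed as sketched below. The row and column convention (rows are true classes, columns are predicted classes) is an assumption and may differ from Fig. 9.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count samples per (true class, predicted class); rows are true classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_precision(matrix):
    """Diagonal entry divided by its column sum (all samples predicted as that class)."""
    result = []
    for j in range(len(matrix)):
        column_sum = sum(row[j] for row in matrix)
        result.append(matrix[j][j] / column_sum if column_sum else 0.0)
    return result

def per_class_recall(matrix):
    """Diagonal entry divided by its row sum (all samples truly of that class)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]
```

A crack that is mistaken for a normal sample lowers the recall of the crack class and the precision of the no-defect class, which is the pattern discussed above.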

7.4 Comparison with Other Detection Methods

To better show the superiority of the proposed method, some advanced pre-trained CNN network models, such as VGG16 [43], GoogLeNet [44], MobileNetV2 [45], ResNet18 [46] and InceptionV3 [47], are set as comparison methods for feature extraction.

Table 5 The classification results of different networks

As shown in Table 5, transfer learning can achieve higher classification accuracy for welding defects than handcrafted features due to its stronger feature expression ability.

Table 5 also indicates that the different pre-trained CNN network models have different feature expression abilities, leading to different classification accuracies for welding defects. Compared with other pre-trained CNN network models, the proposed fusion method results in a higher classification accuracy, indicating a better detection performance on the X-ray welding images.

7.5 Time Analysis

For different defect recognition methods, running time is also an important model evaluation indicator. Therefore, the running times of various methods are measured and discussed in this section. For a typical defect recognition system, the core stages involve data loading time, model loading time, feature extraction time and recognition time. Here, the individual models and the proposed fusion method are tested separately, and the related experiments are carried out on an Intel i7-7700HQ CPU with 16 GB of memory. The experimental results are shown in Fig. 10.

Fig. 10 The average running times of different models

As shown in Fig. 10, the proposed fusion method requires 2.35 s in total for defect recognition, which by itself cannot meet the needs of fast defect detection. However, in a defect recognition system, model training and model pre-loading are always performed offline, so the model loading time can be ignored in the model evaluation. The online running time is then only 0.42 s, which is shorter than those of the other pre-trained network models, as shown in Table 6. Furthermore, this running time could be further reduced with high-performance hardware, such as an Nvidia graphics processing unit (GPU) or a field-programmable gate array (FPGA).

Table 6 The online average running times of different models
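The distinction between total and online running time can be made explicit with a small timing sketch. The stage names follow the four stages listed above, while the helper names and the choice of which stages count as offline are assumptions for illustration.

```python
import time

def time_stages(stages):
    """Run each (name, callable) stage once and record its wall-clock time in seconds."""
    timings = {}
    for name, run in stages:
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings

def online_time(timings, offline=("model loading",)):
    """Sum the stage times, leaving out stages that are performed offline."""
    return sum(t for name, t in timings.items() if name not in offline)
```

Averaging such measurements over many test patches would give figures comparable to the average running times reported here.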

The above experiments and analysis show that the proposed fusion method not only achieves higher detection precision on X-ray welding images than other methods but also runs faster. Therefore, the proposed method provides a good detection scheme for detection issues related to small-scale samples.

8 Conclusion

Faced with weak-textured and weak-contrast X-ray welding images, and inspired by multi-feature fusion, a novel defect recognition method based on transfer learning and DS evidence theory is proposed for accurate defect recognition to assist with the assessment of structural properties and system maintenance. To solve classification problems for multi-scale samples, multi-scale features are extracted with the pre-trained AlexNet network through transfer learning to obtain an effective feature expression. The recognition model is established based on the SVM classifier and DS evidence theory to predict welding defects in X-ray welding images online. It is evaluated and verified on a public dataset (GDXray), and a comprehensive experimental analysis and comparison show that it achieves better recognition performance than existing methods.

In the future, we will continue this work and perform more research to improve the recognition precision of our approach with respect to welding defects.