1 Introduction

Gearboxes are widely used in machinery, electric power, aerospace and other fields, and bearings are among their most important components [1]. Because of heavy workloads and tight construction schedules, gearboxes often run overloaded for long periods, so they fail frequently and cause unplanned downtime [2].

In recent years, experiments have shown that a single signal processing method rarely achieves the desired effect on fault signals collected under complex conditions [3]. With the advent of the big-data era, the computing power of equipment has improved significantly, and deep learning methods have therefore begun to be applied to equipment fault diagnosis. Common deep learning network structures include the deep belief network (DBN) [4], the stacked denoising autoencoder (SDAE) [5] and the convolutional neural network (CNN) [6]. Compared with traditional methods, deep learning methods can generally extract the characteristics of fault signals better and improve the efficiency of mechanical fault diagnosis.

Some scholars apply a one-dimensional convolutional neural network (1-D CNN) directly to one-dimensional time-domain signals to diagnose equipment faults. Wu et al. [7] used a 1-D CNN to correctly distinguish the faults of fixed-shaft and planetary gearboxes. Hu Yingqing et al. [8] proposed a multi-channel 1-D CNN model and applied it to gearbox fault diagnosis; their experimental results show that the multi-channel 1-D CNN has better feature extraction ability than a single-channel 1-D CNN. However, CNNs are better suited to 2D images than to 1D data. Ye Xing et al. [9] processed vibration signals with empirical mode decomposition (EMD), used the kurtosis value to select intrinsic mode function (IMF) components, and trained and tested a multi-channel convolutional neural network on the result, reporting good results. To address the insufficient generalization ability of conventional fault diagnosis methods, Gu Yuhai et al. [5] proposed a fault diagnosis method combining EMD and a convolutional neural network; after 6 dB white noise was added to the original signal, the fault diagnosis accuracy still reached 96.19%.

Based on the above analysis, this paper proposes a new deep neural network model, the multi-channel convolutional neural network (MC-CNN), and applies it to fault diagnosis from gearbox vibration signals to improve diagnostic accuracy.

EMD is used to decompose the vibration signals. The kurtosis value and correlation coefficient of each IMF component are combined into a weighted value, which is used to select the IMF component with the most obvious fault characteristics. The selected IMF components are converted into a 2D time-domain image data set, and the MC-CNN is trained and tested on it. After the accuracy of the method is established, signals with a low signal-to-noise ratio are constructed to test the network structure, and the results are compared with those of other network structures.

2 Main Method

2.1 EMD

EMD is a time-frequency analysis method that adaptively decomposes time-varying, non-stationary signals. As Eq. (1) shows, it decomposes any nonlinear signal into a series of IMF components and a residual term [7],

$$ x\left( t \right) = \mathop \sum \limits_{i = 1}^n c_i \left( t \right) + r_n \left( t \right) $$
(1)

where \(n\) is the number of IMF components obtained by decomposition; \(c_i \left( t \right)\) is the ith IMF component; \(r_n \left( t \right)\) is the final residual component.

Assume that the signal to be decomposed is \(x\left( t \right)\), and the specific algorithm flow is as follows:

  1. Locate the local maxima and minima of \(x\left( t \right)\), and obtain the upper and lower envelopes by spline interpolation.

  2. Subtract the mean \(m_1 \left( t \right)\) of the upper and lower envelopes from \(x\left( t \right)\) to obtain a component \(h_1 \left( t \right)\), as defined in Eq. (2).

    $$ h_1 \left( t \right) = x\left( t \right) - m_1 \left( t \right) $$
    (2)

  3. Judge whether \(h_1 \left( t \right)\) meets the IMF conditions. If so, \(h_1 \left( t \right)\) is taken as the first IMF, \(c_1\). Otherwise, repeat the above steps on \(h_1 \left( t \right)\); if the conditions are satisfied at the \(k\)th iteration, the first IMF is given by Eq. (3).

    $$ c_1 = h_{1\left( {k - 1} \right)} \left( t \right) - m_{1k} \left( t \right) $$
    (3)

  4. Subtract \(c_1 \left( t \right)\) from \(x\left( t \right)\) to get the residual component \(r_1 \left( t \right)\), repeat the steps above for \(r_1 \left( t \right)\), and judge whether the new residual component needs further decomposition. If so, repeat the above steps; otherwise, stop the decomposition.
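The sifting procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this paper: the envelopes are built by piecewise-linear interpolation instead of spline interpolation, the IMF conditions are replaced by a fixed number of sifting passes, and the function names (`local_extrema`, `envelope`, `sift_once`, `extract_imf`, `emd`) and the two-tone test signal are hypothetical.

```python
import math


def local_extrema(x):
    """Return index lists (maxima, minima) of strict interior local extrema."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knot indices.

    Assumption: linear interpolation stands in for the spline, and the
    envelope is held constant before the first and after the last knot.
    """
    env, j = [], 0
    for i in range(len(x)):
        if i <= knots[0]:
            env.append(x[knots[0]])
        elif i >= knots[-1]:
            env.append(x[knots[-1]])
        else:
            while knots[j + 1] < i:
                j += 1
            a, b = knots[j], knots[j + 1]
            w = (i - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    return env


def sift_once(x):
    """One sifting pass, Eq. (2): h(t) = x(t) - m(t). None if x has no oscillation."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return None
    upper, lower = envelope(x, maxima), envelope(x, minima)
    return [xi - (u + l) / 2 for xi, u, l in zip(x, upper, lower)]


def extract_imf(x, max_sifts=10):
    """Sift a fixed number of times (assumed stand-in for the IMF conditions)."""
    h = x
    for _ in range(max_sifts):
        nxt = sift_once(h)
        if nxt is None:
            break
        h = nxt
    return h


def emd(x, max_imfs=5):
    """Return (imfs, residual) such that Eq. (1) holds sample by sample."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        if sift_once(residual) is None:
            break
        c = extract_imf(residual)
        imfs.append(c)
        residual = [r - ci for r, ci in zip(residual, c)]
    return imfs, residual


# Made-up two-tone test signal, used only to exercise the sketch.
signal = [math.sin(2 * math.pi * 5 * k / 400) + 0.5 * math.sin(2 * math.pi * 50 * k / 400)
          for k in range(400)]
imfs, residual = emd(signal)
```

By construction, summing the returned IMFs and the residual reproduces the input signal, which is the property stated in Eq. (1).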

2.2 Kurtosis and Correlation Coefficient

Kurtosis is a numerical statistic that reflects the distribution characteristics of random variables. Kurtosis is extremely sensitive to impact signals and is very suitable for fault diagnosis of mechanical systems [10].
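As a brief illustration of this sensitivity, the sketch below computes kurtosis as the fourth standardized moment. The exact definition used in this paper is not stated, so the non-excess form is an assumption, and the function name `kurtosis` and the test signals are made up for the example.

```python
import math


def kurtosis(x):
    """Fourth standardized moment: E[(x - mean)^4] / (E[(x - mean)^2])^2."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2)


# A pure sine sampled over whole periods has kurtosis 1.5.
smooth = [math.sin(2 * math.pi * 10 * k / 1000) for k in range(1000)]

# The same sine with one made-up impact added at a single sample.
impulsive = list(smooth)
impulsive[500] += 10.0

k_smooth = kurtosis(smooth)
k_impulsive = kurtosis(impulsive)
```

A single impact raises the kurtosis well above that of the smooth signal, which is why the statistic is useful for detecting impulsive fault signatures.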

The correlation coefficient \(\rho\), defined in Eq. (4), reflects how closely \(x\left( t \right)\) and an IMF are correlated.

$$ \rho = \frac{{c\left( {1,2} \right)}}{{\sqrt {c\left( {1,1} \right)*c\left( {2,2} \right)} }} = \frac{{\sum_{k = 0}^N x\left( k \right)IMF\left( k \right)}}{{\sqrt {\sum_{k = 0}^N x^2 \left( k \right)\sum_{k = 0}^N IMF^2 \left( k \right)} }} $$
(4)

where \(c\) is the covariance matrix of \(x\left( t \right)\) and the IMF, and \(N\) is the number of sampling points of the signal.

By analyzing the \(\rho\) between each order of IMF and \(x\left( t \right)\), together with the characteristics of the EMD method and of the noise itself, the IMF components with large correlation coefficients are selected to achieve denoising. Since each order of IMF is decomposed from \(x\left( t \right)\), \(0 < \rho < 1\) should hold in most cases. However, experiments show that when the SNR of the noisy signal \(x\left( t \right)\) is large, the correlation coefficient between a decomposed IMF and the signal can be less than 0. So that the correlation coefficient reflects the relative variation of the rubbing AE signal energy and the noise energy in each order of IMF, the absolute value is taken when the correlation coefficient is negative [11].
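A minimal sketch of Eq. (4) with the absolute-value rule is given below. It implements the right-hand (sum-of-products) form of Eq. (4), which matches the covariance form only for zero-mean signals; the function name `correlation_coefficient` and the sine and cosine test signals are assumptions made for illustration.

```python
import math


def correlation_coefficient(x, imf):
    """Right-hand form of Eq. (4); a negative value is replaced by its absolute value."""
    num = sum(a * b for a, b in zip(x, imf))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in imf))
    return abs(num / den)


# Made-up zero-mean test signals sampled over whole periods.
sine = [math.sin(2 * math.pi * 5 * k / 500) for k in range(500)]
cosine = [math.cos(2 * math.pi * 5 * k / 500) for k in range(500)]
inverted = [-v for v in sine]

rho_same = correlation_coefficient(sine, sine)          # identical signals
rho_inverted = correlation_coefficient(sine, inverted)  # sign flipped, absolute value taken
rho_orthogonal = correlation_coefficient(sine, cosine)  # orthogonal signals
```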

3 Multichannel Convolutional Neural Network

In this section, a CNN-based MC-CNN is proposed. Its main advantages are as follows: the first convolution layer uses a large 14*14 convolution kernel, which improves the robustness of the model; and the multi-channel module uses convolution kernels of different sizes to obtain different feature information from the vibration images, which enhances the feature extraction capability of MC-CNN.

Figure 1 shows the network model of MC-CNN, which contains two multi-channel convolution modules.

Fig. 1. MC-CNN network structure

3.1 Vibration Image

Calculating the kurtosis value of each IMF component shows that some IMF components have large kurtosis values but no obvious fault characteristics, which hinders identification by MC-CNN. When processing the original signals of the CWRU bearing inner ring fault, it was found that after EMD the IMF component with the maximum kurtosis value differs from one signal segment to another, as shown in Tables 1 and 2. Therefore, the kurtosis value and the correlation coefficient are combined by weighting, and the IMF component with the maximum weighted value \(T\) is selected, which yields an IMF component that expresses the impact effectively and has obvious characteristics. Equation (5) normalizes each kurtosis value by the total kurtosis, and Eq. (6) defines \(T\) in terms of \(\omega\) and \(\rho\). The calculation process is as follows:

$$ \omega_{\left( i \right)} = ku_{\left( i \right)} /\mathop \sum \limits_{j = 1}^Q ku_{\left( j \right)} $$
(5)

where \(Q\) represents the number of IMF components, and \(ku_{\left( i \right)}\) represents the kurtosis value of the \(i\) th IMF component.

$$ T_{\left( i \right)} = 0.8*\omega_{\left( i \right)} + 0.2*\rho_{\left( i \right)} $$
(6)

where \(\rho_{\left( i \right)}\) represents the correlation coefficient of the \(i\) th IMF component, and \(T_{\left( i \right)}\) represents the weighted value of the \(i\) th IMF component.
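The selection rule in Eqs. (5) and (6) can be illustrated with a short sketch. The function name `select_imf` and the kurtosis and correlation values are made up for the example and are not taken from Tables 1 and 2; only the weights 0.8 and 0.2 come from Eq. (6).

```python
def select_imf(kurtosis_values, correlations, w_ku=0.8, w_rho=0.2):
    """Return (index of the IMF with the largest T, list of all T values)."""
    total = sum(kurtosis_values)
    omega = [ku / total for ku in kurtosis_values]                       # Eq. (5)
    scores = [w_ku * w + w_rho * r for w, r in zip(omega, correlations)]  # Eq. (6)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Made-up values for three IMF components.
ku = [4.0, 5.0, 1.0]
rho = [0.9, 0.1, 0.2]
best, scores = select_imf(ku, rho)
```

In this made-up case the second IMF has the largest kurtosis, but the first IMF has the largest weighted value \(T\) because of its much higher correlation coefficient, so the first IMF is selected.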

Table 1. Segmented signal 36
Table 2. Segmented signal 38

Figure 2 shows the specific operation process. Every 2048 sampling points are taken as one signal segment, so the sampled signal of a bearing can be divided into multiple vibration signal segments. After each segment is decomposed by EMD, the resulting IMF components are arranged in order of frequency from high to low.
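The segmentation step can be sketched as follows. Whether the segments overlap and how a trailing remainder is handled are not stated here, so non-overlapping segments with the remainder discarded are assumptions, as is the function name `segment_signal`.

```python
def segment_signal(signal, length=2048):
    """Split a signal into non-overlapping segments of `length` samples.

    Assumption: samples left over after the last full segment are discarded.
    """
    count = len(signal) // length
    return [signal[i * length:(i + 1) * length] for i in range(count)]


# Made-up signal of 5000 samples: two full segments, 904 samples discarded.
segments = segment_signal(list(range(5000)))
```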

Fig. 2. Vibration image signal construction process

3.2 Multichannel Convolution Layer

The first convolution layer uses a single convolution kernel, and the subsequent network structure uses multi-channel convolution modules. Compared with a single-channel convolution kernel, a multi-channel convolution module can extract different fault features, which improves fault diagnosis performance (Fig. 3).
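The idea of parallel branches with different kernel sizes can be illustrated with a toy sketch in plain Python. It is not the MC-CNN module itself: the branch kernel sizes (1, 3 and 5), the fixed averaging kernels used in place of learned weights, the zero padding, the absence of activation functions, and the stacking of the branch outputs are all assumptions, and the function names are hypothetical.

```python
def conv2d_same(image, kernel):
    """Stride-1 2-D cross-correlation with zero padding (odd square kernel)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += image[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def averaging_kernel(size):
    """Fixed size-by-size kernel standing in for learned weights."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


def multi_channel_module(image, kernels):
    """Apply every branch kernel to the same input and stack the feature maps."""
    return [conv2d_same(image, kernel) for kernel in kernels]


# Made-up 6x6 input image and three branches with kernel sizes 1, 3 and 5.
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
feature_maps = multi_channel_module(image, [averaging_kernel(s) for s in (1, 3, 5)])
```

Each branch sees the same input at a different receptive-field size, so the stacked feature maps describe the image at several scales; padding keeps all branch outputs the same shape so they can be combined.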

Fig. 3. Multi-channel convolution module

3.3 Network Structure and Parameters

MC-CNN comprehensively uses single-channel convolution and multi-channel convolution modules to effectively extract fault features and improve the robustness of the network model. The design of the network structural parameters is shown in Table 3.

Table 3. Multi-channel convolutional neural network structure parameters

Table 3 lists two multi-channel convolution modules: the first has three channels and the second has four, and each channel extracts features from the two-dimensional images passed from the previous layer. In the first module, the structural parameters of convolution layer A are (5,5,1)/(5,5,1)/(1,1), which means that 5*5 convolution kernels with a step size of 1 are used for feature extraction, followed by a 1*1 convolution kernel with a step size of 1.

3.4 Model Loss Function and Training Method

During MC-CNN training, the Adam adaptive optimizer was used to update the network parameters until the optimal solution was obtained. The training batch size was set to 128, and the number of iterations was 160. Finally, a SoftMax classifier was used to achieve fault classification.
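The final classification step can be sketched as follows: SoftMax converts the network's output scores into class probabilities, and the class with the highest probability is taken as the predicted fault type. The function names `softmax` and `predict` and the example scores are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable SoftMax: subtract the maximum before exponentiating."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Index of the class with the highest SoftMax probability."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up output scores for three classes.
probs = softmax([0.1, 2.0, -1.0])
label = predict([0.1, 2.0, -1.0])
```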

4 Fault Diagnosis Method Based on MC-CNN

After the vibration signals are collected, they are divided into equal-length segments of 2048 points. Each segment is decomposed by EMD, and the kurtosis value and correlation coefficient of each IMF component are calculated. The weighted value \(T\) of kurtosis and correlation coefficient is used to select the IMF component, and the IMF image data set is generated. The IMF image data set is divided into a training set and a test set according to a certain proportion. The multi-channel convolutional neural network is initialized and trained on the training set, evaluated on the test set, and then subjected to the robustness test (Fig. 4).

Fig. 4. Overall process diagram

5 Experimental Verification

5.1 CWRU Bearing Data Set

The data in this paper come from the Case Western Reserve University (CWRU) bearing data set, which is widely used in bearing fault diagnosis. Grooves with three fault diameters, 0.1778 mm, 0.3556 mm and 0.5334 mm, were machined in advance into the rolling element, inner ring and outer ring of the bearing by electrical discharge machining (EDM), and the bearings were run under different motor loads. In this paper, 10 kinds of drive-end bearing vibration signals, sampled at 48 kHz, were selected, covering different fault locations under different motor loads. The bearing fault types are shown in Table 4.

Table 4. Bearing failure type

5.2 Results

To verify the superiority of the signal processing method, the IMF image data sets obtained by the weighted value method and by the kurtosis method are used to train the same network model. Figure 5 shows that the training accuracy of the weighted value method is higher than that of the kurtosis method. Figure 6 shows the confusion matrix obtained by training on the weighted image data set, and Fig. 7 compares different network models on the weighted data set.

Fig. 5. Training (validation) accuracy and loss

Fig. 6. Confusion matrices

Fig. 7. Recognition accuracy of different network models

5.3 Robustness

Noise was added to the original signal to construct signals with signal-to-noise ratios of –6, –4, –2 and 0 dB. The weighted value method is used to obtain the image data sets, which are used to train different network models. Experimental results show that MC-CNN outperforms the other network models at every SNR, which supports the advantage of the network structure. Figure 8 shows that the accuracy of MC-CNN is still 82.34% at an SNR of –6 dB, which demonstrates the robustness of this method to a certain extent.
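One common way to construct such signals is sketched below: white Gaussian noise is scaled so that the ratio of signal power to noise power matches the target SNR in decibels. The type of noise and the scaling procedure used in this experiment are not specified, so Gaussian noise, the exact power normalization, and the function names `add_noise` and `measured_snr_db` are assumptions.

```python
import math
import random


def add_noise(signal, snr_db, seed=0):
    """Add white Gaussian noise scaled to give exactly the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    p_signal = sum(v * v for v in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    target_noise_power = p_signal / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / p_noise)
    return [s + scale * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB computed from the clean signal and the added noise."""
    p_signal = sum(v * v for v in clean)
    p_noise = sum((b - a) ** 2 for a, b in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)


# Made-up clean signal, corrupted at the four SNR levels used in the test.
clean = [math.sin(2 * math.pi * 10 * k / 2048) for k in range(2048)]
noisy_by_snr = {snr: add_noise(clean, snr) for snr in (-6, -4, -2, 0)}
```

At –6 dB the noise power is about four times the signal power, which gives a sense of how severe the lowest test condition is.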

Fig. 8. Accuracy at different signal-to-noise ratios

5.4 Data Visualization

To further explore the feature extraction process of MC-CNN, the 10 types of data were input into the network, and the features extracted by each layer were reduced in dimension and visualized with t-SNE. Figure 9 shows the results for the first convolution layer and the SoftMax layer, respectively. It can be observed that the classes of the original data gradually become clearly separated as the data pass through the network model.

Fig. 9. t-SNE dimensionality-reduction visualization of raw data

6 Conclusions

An intelligent fault diagnosis method based on EMD and a multi-channel convolutional neural network is proposed, which transforms vibration signal diagnosis into a two-dimensional image recognition problem that is easier for a CNN to handle. In signal preprocessing, the kurtosis and correlation coefficient are used together to obtain IMF images with more obvious characteristics. On the basis of CNN, the first convolution layer uses a large convolution kernel to improve the robustness of the network model, and the multi-channel convolution layers help to obtain different features from the image. Experimental results show that the weighted value method outperforms the kurtosis method, with a fault diagnosis accuracy of up to 99.30%. In the robustness test, MC-CNN is more robust and more accurate than single-channel CNN, LeNet and VGG-16, which indicates that this method has good generalization ability.