1 Introduction

In recent years, health monitoring in the mechanical industry has become an urgent issue. Some scholars have begun to monitor the health status of structures such as beams and trusses [1, 2], and with the development of automated machinery and equipment, interest in fault diagnosis of key components such as bearings has grown steadily [3,4,5]. At present, bearing fault diagnosis methods fall into three categories: signal processing and analysis, traditional fault diagnosis based on feature extraction, and deep learning methods that extract features automatically [6]. In practice, the three types of methods play to their respective strengths and complement each other. From the perspective of signal processing, Li et al. [7] proposed a new time–frequency analysis (TFA) post-processing algorithm called local maximum high-order time iterative synchrosqueezing (LHTIS) and demonstrated its effectiveness by analyzing and processing fault signals. Pan et al. [8] proposed a multi-class fuzzy support matrix machine and successfully applied it to roller bearing fault diagnosis. In one study, the VMD parameters and kernel fuzzy c-means (KFCM) were optimized separately, and the bearing fault types of small samples were then identified [9]. Another study proposed a bearing fault diagnosis method based on the wavelet packet transform and a convolutional neural network optimized by a simulated annealing algorithm [10]. Ensemble self-taught learning convolutional auto-encoders (STL-CAEs) were proposed in [11] to address the scarcity of labeled data. Combining Dempster–Shafer (DS) evidence theory with support vector machines (SVM) has also appeared in bearing fault diagnosis research [12], and a new fault diagnosis method called RSG was proposed in [13]. Yang et al. [14] combined a two-dimensional convolutional neural network (2DCNN) feature extractor with a random forest (RF) classifier to diagnose faults in the high-speed bearings of offshore wind turbines, reaching 99.5% accuracy with 700 training samples and 300 test samples. In [15], 30 training samples and 30 test samples were intercepted, the time-domain and frequency-domain features of the samples were calculated and mixed, and a deep neural network identified the fault type with an accuracy of 99.1%. In [16], the weighted signal difference average (WSDA) was proposed as a new fitness function to optimize VMD, and a one-dimensional neural network was used for rolling bearing fault diagnosis; with 5000 training samples and 1000 test samples, the diagnostic accuracy was 99.6%. In [17], a few-shot learning method was successfully applied to the fault diagnosis of rolling bearings and verified under mixed working conditions; with 60, 200, 900, and 19,800 training samples and 75 test samples, the accuracy rates were 82.8%, 94.32%, 98.55%, and 99.77%, respectively.

It can be seen from previous research that deep learning is widely used in the field of bearing fault diagnosis, owing to its advantages in feature extraction and classification. However, the accuracy of deep learning often comes at the cost of a large number of training samples. In addition, some studies are carried out only for a single working condition, and the effect of applying the proposed model to other working conditions remains to be verified. Based on the above analysis, the WHO-VMD-CCWT-EFF is proposed in this paper. To verify the usability and universality of the model, the CWRU dataset [18] and the Paderborn dataset [19] are used for various single and multiple working condition experiments. The experimental results indicate that WHO-VMD-CCWT-EFF can achieve good results with only 10 training samples and 90 test samples. The main contributions of this paper are summarized as follows:

  1. A correlation coefficient weight threshold denoising method is proposed to denoise the fault signal decomposed by VMD.

  2. To extract better classification features, an entropy feature fusion method is proposed, and the new bearing fault diagnosis method named WHO-VMD-CCWT-EFF is verified in the experiments.

  3. A new deviation metric is used to measure the stability of the model and is validated in various experiments.

  4. In addition to the single working condition experiments, experiments mixing multiple working conditions are carried out with good results, and the WHO-VMD-CCWT-EFF remains applicable and stable with a small amount of data.

The rest of this paper is organized as follows: Section 2 introduces the theoretical basis related to the model. Section 3 introduces the framework of the model. The CWRU dataset and the Paderborn dataset are used for the experiments and analyses in Sect. 4. Finally, Sect. 5 gives the conclusion.

2 Theoretical backgrounds

2.1 WHO-VMD

VMD [20] is a non-recursive, adaptive signal processing algorithm developed from algorithms such as empirical mode decomposition (EMD). It decomposes the original signal in the frequency domain into intrinsic mode functions (IMFs) with limited bandwidth and specific center frequencies. WHO is a meta-heuristic optimization algorithm proposed by Iraj Naruei [21]. Similar optimization algorithms, such as the improved grey wolf optimization (IGWO), ant lion optimizer (ALO), and marine predator algorithm (MPA), have also been applied successfully to various structural detection tasks. This paper selects WHO to optimize the VMD parameters [22,23,24].

The algorithm is mainly inspired by a behavior of wild horses that distinguishes them from other animals: foals leave their natal groups before puberty and join other groups, which avoids mating between relatives. In addition to mating behavior, wild horses update their positions through social behaviors such as grazing, group leadership, and the exchange and selection of leaders.

In the process of VMD decomposition of the bearing fault signal, it is found that the number of decomposition layers \(k\) (i.e., the number of IMFs) and the penalty factor α directly affect the decomposition result. To select relatively optimal parameters, the VMD parameters \( \left( {k,\alpha } \right)\) are optimized by WHO, as shown in Fig. 1.

  • Step 1: Set the parameters of WHO. The total number of wild horses N = 30, the maximum number of iterations Max_iter = 30, the crossover ratio PC = 0.13, the percentage of stallions in the group population PS = 0.2, the number of stallions Nstallion = N*PS, and the number of foals in each group Nfoal = (N-Nstallion)/Nstallion. The parameters to be optimized are \(\alpha \in \left[ {100,{ }2000} \right]\), \(k \in \left[ {4,{ }8} \right]\), \(\alpha ,k \in Z\).

  • Step 2: Create populations, select leaders, and calculate the fitness function values.

  • Step 3: Search and update according to grazing behavior if Rand > PC, otherwise update by mating behavior. Here Rand is a random number drawn from a uniform distribution on [0, 1].

  • Step 4: Update the group leaders and the stallions, respectively.

  • Step 5: Determine whether the maximum number of iterations has been reached; if so, output \(\left( {k,\alpha } \right)\), otherwise return to Step 3.

Fig. 1 Flow chart of WHO-VMD algorithm
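
To make the loop in Fig. 1 concrete, the sketch below shows how an optimizer can drive the selection of \((k,\alpha)\). It is not an implementation of WHO itself: the horse-behavior updates (grazing, mating, leader exchange) are replaced by uniform random sampling within the Step 1 bounds, and the `fitness(k, alpha)` callable is assumed to be supplied externally, for example the power-spectral-entropy fitness sketched at the end of Sect. 2.2.

```python
import numpy as np

def vmd_parameter_search(fitness, n_agents=30, max_iter=30,
                         k_bounds=(4, 8), alpha_bounds=(100, 2000), seed=0):
    """Stand-in for the WHO search over (k, alpha) shown in Fig. 1.

    `fitness(k, alpha)` is assumed to run VMD and return the power
    spectral entropy of the resulting IMFs (Sect. 2.2). The wild-horse
    position updates are replaced here by random sampling, so this only
    illustrates the evaluate-and-keep-the-best loop, not WHO itself.
    """
    rng = np.random.default_rng(seed)
    best_params, best_fit = None, np.inf
    for _ in range(max_iter):                      # Steps 3-5: iterate
        for _ in range(n_agents):                  # one "population" per iteration
            k = int(rng.integers(k_bounds[0], k_bounds[1] + 1))
            alpha = int(rng.integers(alpha_bounds[0], alpha_bounds[1] + 1))
            fit = fitness(k, alpha)
            if fit < best_fit:                     # keep the best (k, alpha) so far
                best_params, best_fit = (k, alpha), fit
    return best_params, best_fit
```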

2.2 Power spectrum entropy

The power spectral entropy [25] represents the uncertainty of signal energy under power spectral partitioning and quantitatively describes the complexity of the signal energy distribution in the frequency domain. In actual industrial environments, bearing fault signals are collected in the presence of different noise sources, so their frequency components are complex. To effectively reflect the fault characteristics contained in each frequency component, the power spectrum entropy is used as the fitness function of the optimization algorithm. When the power spectral entropy is small, the frequency components of the signal are simple and the power spectrum is concentrated on a few frequency components, which reflects the characteristics of the fault signal. In addition, the power spectral entropy values of the IMF components in the same fault state after VMD decomposition are relatively stable, whereas they vary between different fault states. This further indicates that the power spectral entropy is suitable as the fitness function of the optimization algorithm.

Step 1: Define the original fault signal sequence as \(x\left( t \right) = \left\{ {x\left( 1 \right),x\left( 2 \right),x\left( 3 \right), \ldots x\left( L \right)} \right\}\) and compute its power spectrum:

$$ P\left( i \right) = \frac{{\left| {x\left( w \right)} \right|^{2} }}{2\pi L} $$
(1)

where \(L\) is the length of the signal, \(P\left( i \right)\) is the power spectrum of the signal. \(x\left( w \right)\) is the Fourier transform of the signal.

Step 2: Obtain the power spectral density distribution function by normalization:

$$ p\left( i \right) = \frac{P\left( i \right)}{{\mathop \sum \nolimits_{i = 1}^{N} P\left( i \right)}}\quad i = 1,2, \ldots ,N $$
(2)

where \(N\) is the number of frequency components in the Fourier transform.

Step 3: Define the power spectrum entropy through the power spectral density distribution function as:

$$ H = - \mathop \sum \limits_{i = 1}^{N} p\left( i \right){\text{log}}p\left( i \right) $$
(3)
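
A minimal implementation of Eqs. (1)–(3) is given below, together with a wrapper that turns it into the fitness used by the parameter search sketched in Sect. 2.1. The wrapper assumes the third-party `vmdpy` package (with the signature `VMD(f, alpha, tau, K, DC, init, tol)`), and averaging the entropy over the IMFs is our assumption, since the exact aggregation is not restated here.

```python
import numpy as np
from vmdpy import VMD   # assumed third-party VMD implementation


def power_spectral_entropy(x):
    """Power spectral entropy of a 1-D signal, following Eqs. (1)-(3)."""
    x = np.asarray(x, dtype=float)
    L = len(x)
    P = np.abs(np.fft.rfft(x)) ** 2 / (2 * np.pi * L)   # power spectrum, Eq. (1)
    p = P / P.sum()                                      # normalised density, Eq. (2)
    p = p[p > 0]                                         # drop empty bins before the log
    return float(-np.sum(p * np.log(p)))                 # entropy, Eq. (3)


def make_vmd_fitness(signal):
    """fitness(k, alpha): mean power spectral entropy of the k IMFs."""
    def fitness(k, alpha):
        imfs, _, _ = VMD(signal, alpha, 0.0, int(k), 0, 1, 1e-7)
        return float(np.mean([power_spectral_entropy(imf) for imf in imfs]))
    return fitness
```

With these two pieces, `vmd_parameter_search(make_vmd_fitness(x))` from Sect. 2.1 returns the \((k,\alpha)\) pair with the lowest mean power spectral entropy.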

2.3 Correlation coefficient

The correlation coefficient is a description of the similarity between two random signals or deterministic signals. After the VMD decomposition of the bearing fault signal, the correlation between each IMF and the original bearing fault signal can be judged by calculating the correlation coefficient value. Then, it can be inferred from the correlation coefficient whether the IMF contains the main features of the original signal. Generally speaking, the closer the absolute value of the correlation coefficient is to 1, the higher the degree of correlation between the two, and the more obvious the features of the original signal contained in the IMF. The correlation coefficient \(R_{k}\) between the k-th IMF and the original signal is defined as:

$$ R_{k} = \frac{{E\left( {u_{k} \left( t \right)f\left( t \right)} \right) - E\left( {u_{k} \left( t \right)} \right)E\left( {f\left( t \right)} \right)}}{{\sqrt {D\left( {u_{k} \left( t \right)} \right)} \sqrt {D\left( {f\left( t \right)} \right)} }} $$
(4)

where \(f\left( t \right)\) is the original signal and \(u_{k} \left( t \right)\) is the k-th IMF. E and D denote the expectation and the variance, respectively.

2.4 Entropy features

2.4.1 RCMDE

RCMDE [26, 27] was first proposed and applied to biomedical signals in 2017. It extends dispersion entropy through refined composite multiscale coarse-graining. The specific calculation steps are as follows:

Step 1: For the sequence X of length \(N\), divide it into segments of length τ. The average value of each segment is calculated and arranged to obtain a coarse-grained sequence.

$$ x_{k,j}^{\tau } = \frac{1}{\tau }\mathop \sum \limits_{{b = k + \tau \left( {j - 1} \right)}}^{k + \tau j - 1} X_{b} ,1 \le j \le \left\lfloor\frac{N}{\tau }\right\rfloor,1 \le k \le \tau $$
(5)

where \(x_{k}^{\tau } = \left\{ {x_{k,1}^{\tau } ,x_{k,2}^{\tau } ,...} \right\}\) is the k-th coarse-grained sequence at the τ scale.

Step 2: Map the time series \(x_{k,j}^{\tau }\) to \(y_{k,j}^{\tau }\) by Eq. (6)

$$ y_{k}^{\tau } = \frac{1}{{\sigma \sqrt {2\pi } }}\mathop \int \limits_{ - \infty }^{{x_{k}^{\tau } }} e^{{\frac{{ - \left( {t - u} \right)^{2} }}{{2\sigma^{2} }}}} dt $$
(6)

where \(u\) and \(\sigma \) are the mean and standard deviation of the sequence \(x_{k}^{\tau }\), respectively.

Step 3: Map the time series \(y_{k}^{\tau }\) to \(Z_{j}^{c}\) by Eq. (7)

$$ Z_{j}^{c} = {\text{Round}}\left( {c \cdot y_{k}^{\tau } + 0.5} \right) $$
(7)

where \({\text{Round}}()\) represents the rounding function, and \(c\) represents the number of categories.

Step 4: Calculate the embedding vector by Eq. (8).

$$ z_{i}^{m,c} = \left\{ {z_{i}^{c} ,z_{i + d}^{c} , \cdots ,z_{{i + \left( {m - 1} \right)d}}^{c} } \right\} $$
$$ i = 1,2, \cdots ,N - \left( {m - 1} \right)d $$
(8)

where \(m\) is the embedding dimension and \(d\) is the time delay.

Step 5: Calculate the dispersion patterns and its corresponding probability. Assuming that \(z_{i}^{c} = v_{0}\), \(z_{i + d}^{c} = v_{1}\), and \(z_{{i + \left( {m - 1} \right)d}}^{c} = v_{m - 1}\), the dispersion pattern corresponding to \(z_{i}^{m,c}\) is \(\pi_{{v_{0} v_{1} \cdots v_{m - 1} }}\). Calculate the probability corresponding to the dispersion pattern according to Formula (9).

$$ p\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right) = \frac{{{\text{Number}}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right)}}{{N - \left( {m - 1} \right)d}} $$
(9)

Step 6: Calculate the average value \(\overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right)\) of the probability of the dispersion pattern, and obtain the RCMDE value through \(\overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right)\).

$$ {\text{RCMDE }}(x_{k}^{\tau } ,m,c,d,\tau ) = - \mathop \sum \limits_{\pi = 1}^{{c^{m} }} \overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right)\ln \left( {\overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right)} \right) $$
(10)
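
A compact sketch of the RCMDE computation follows, assuming SciPy's `norm.cdf` for the mapping of Eq. (6). The default parameters are illustrative rather than the paper's settings, and the normal CDF here uses the mean and standard deviation of each coarse-grained series itself.

```python
import numpy as np
from itertools import product
from scipy.stats import norm


def _dispersion_probs(x, m, c, d):
    """Dispersion-pattern probabilities of one series (Eqs. 6-9)."""
    y = norm.cdf(x, loc=x.mean(), scale=x.std())            # Eq. (6)
    z = np.clip(np.round(c * y + 0.5).astype(int), 1, c)    # Eq. (7)
    n_vec = len(x) - (m - 1) * d
    counts = {}
    for i in range(n_vec):
        pattern = tuple(int(z[i + j * d]) for j in range(m))   # Eq. (8)
        counts[pattern] = counts.get(pattern, 0) + 1
    patterns = product(range(1, c + 1), repeat=m)               # all c**m patterns
    return np.array([counts.get(p, 0) / n_vec for p in patterns])  # Eq. (9)


def rcmde(x, m=2, c=6, d=1, tau=3):
    """Refined composite multiscale dispersion entropy at scale tau (Eq. 10)."""
    x = np.asarray(x, dtype=float)
    probs = []
    for k in range(tau):                                     # tau shifted coarse-grainings
        n_seg = (len(x) - k) // tau
        cg = x[k:k + n_seg * tau].reshape(n_seg, tau).mean(axis=1)  # Eq. (5)
        probs.append(_dispersion_probs(cg, m, c, d))
    p_bar = np.mean(probs, axis=0)                           # averaged pattern probabilities
    p_bar = p_bar[p_bar > 0]
    return float(-np.sum(p_bar * np.log(p_bar)))
```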

2.4.2 RCMFDE

RCMFDE is based on the work of Azami et al. [28, 29] on dispersion entropy. Fluctuation dispersion entropy improves on dispersion entropy by taking the volatility of the time series into account while maintaining stable performance and low computational cost. Like RCMDE, RCMFDE obtains the dispersion patterns through Eqs. (5)–(8). The probability corresponding to each dispersion pattern is calculated according to Eq. (11).

$$ p\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right) = \frac{{{\text{count}}\left\{ {i \mid i \le N - \left( {m - 1} \right)d,\;z_{i}^{m,c} \;{\text{has pattern}}\;\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right\}}}{{N - \left( {m - 1} \right)d}} $$
(11)

Among them, \({\text{count ()}}\) is the number of maps from \(z_{i}^{m,c}\) to \(\pi_{{v_{0} v_{1} \cdots v_{m - 1} }}\).

Calculate the average value of the dispersion pattern probabilities at scale τ, and the RCMFDE is obtained through \(\overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right)\).

$$ \overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right) = \frac{1}{\tau }\mathop \sum \limits_{k = 1}^{\tau } p_{k} $$
(12)
$$ E_{{{\text{RCMFD}}}} \left( {x_{k}^{\tau } ,m,c,d,\tau } \right) = - \mathop \sum \limits_{\pi = 1}^{{\left( {2c - 1} \right)^{m - 1} }} \overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right) \cdot \ln \left[ {\overline{p}\left( {\pi_{{v_{0} v_{1} \cdots v_{m - 1} }} } \right)} \right] $$
(13)
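
RCMFDE only changes the patterns that are counted, so a sketch of the fluctuation-pattern probabilities of Eq. (11) is shown below; the composite averaging of Eqs. (12)–(13) follows the same scheme as the RCMDE sketch above. Parameter defaults are again illustrative.

```python
import numpy as np
from scipy.stats import norm


def fluctuation_dispersion_probs(x, m=3, c=6, d=1):
    """Probabilities of fluctuation dispersion patterns (Eq. 11).

    The patterns are the differences between successive dispersion
    classes, so there are (2c-1)**(m-1) possible patterns; only the
    observed ones are returned here.
    """
    x = np.asarray(x, dtype=float)
    y = norm.cdf(x, loc=x.mean(), scale=x.std())            # Eq. (6)
    z = np.clip(np.round(c * y + 0.5).astype(int), 1, c)    # Eq. (7)
    n_vec = len(x) - (m - 1) * d
    counts = {}
    for i in range(n_vec):
        pattern = tuple(int(z[i + (j + 1) * d] - z[i + j * d]) for j in range(m - 1))
        counts[pattern] = counts.get(pattern, 0) + 1
    return {pattern: cnt / n_vec for pattern, cnt in counts.items()}
```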

2.4.3 RCmvMFE

RCmvMFE [30] is a tool proposed in 2017 to analyze the complexity of multi-channel signals. The detailed description of RCmvMFE is as follows:

  • Step 1: For a multivariate signal \(Y = \left\{ {y_{k,b} } \right\}_{k = 1,\,b = 1}^{p,\,C}\) containing \(p\) channels of length \(C\), the coarse-grained operations are performed to obtain a time series, represented as \(z_{\alpha }^{\left( \beta \right)} = \left\{ {x_{\alpha ,k,i}^{\left( \beta \right)} } \right\}\), where \( \beta\) is the time series scale.

    $$ x_{\alpha ,k,i}^{\left( \beta \right)} = \frac{1}{\beta }\mathop \sum \limits_{{b = \left( {i - 1} \right)\beta + \alpha }}^{i\beta + \alpha - 1} y_{k,b} \quad 1 \le i \le \left \lfloor \frac{C}{\beta } \right \rfloor = N,1 \le k \le p,1 \le \alpha \le \beta $$
    (14)
  • Step 2: The multivariate embedded reconstruction is used.

    $$ X_{m} \left( i \right) = \left[ {x_{1,i} ,x_{{1,i + \tau_{1} }} , \ldots ,x_{{1,i + \left( {m_{1} - 1} \right)\tau_{1} }} ,x_{2,i} ,x_{{2,i + \tau_{2} }} , \ldots ,x_{{2,i + \left( {m_{2} - 1} \right)\tau_{2} }} , \ldots ,x_{P,i} ,x_{{P,i + \tau_{P} }} , \ldots ,x_{{P,i + \left( {m_{P} - 1} \right)\tau_{P} }} } \right] $$
    (15)

    where \(M = \left[ {m_{1} ,m_{2} ,. . .m_{p} } \right]\), \(\tau = \left[ {\tau_{1} ,\tau_{2} ,. . .\tau_{P} } \right]\) are the embedding dimension and delay time, respectively, \(n = {\text{max}}\left\{ M \right\} \times {\text{max}}\left\{ \tau \right\}\), \(i = 1,2, . . .N - n\).

  • Step 3: Calculate the distance between \(X_{m} \left( i \right)\) and \(X_{m} \left( j \right)\), where \(i \ne j\).

    $$ d\left[ {X_{m} \left( i \right),X_{m} \left( j \right)} \right] = \mathop {{\text{max}}}\limits_{l = 1,2, \ldots ,m} \left\{ {\left| {x\left( {i + l - 1} \right) - x\left( {j + l - 1} \right)} \right|} \right\} $$
    (16)
  • Step 4: According to the given threshold r and fuzzy membership function \(\theta \left( {d,r} \right)\), \(\phi^{m} \left( r \right)\) with the embedding dimension \(m\) can be obtained:

    $$ \theta \left( {d,r} \right) = \exp \left( {\frac{{ - (d)^{fp} }}{r}} \right) $$
    $$ \phi^{m} \left( r \right) = \frac{1}{{\left( {N - n} \right)}}\mathop \sum \limits_{i = 1}^{N - n} \frac{{\mathop \sum \nolimits_{j = 1,i \ne j}^{N - n} \exp \left( {\frac{{ - (d\left[ {X_{m} \left( i \right),X_{m} \left( j \right)} \right])^{fp} }}{r}} \right)}}{N - n - 1} $$
    (17)
  • Step 5: Let m = m + 1 and repeat steps 2–4. Calculate the average values \(\overline{\phi }_{\beta ,\alpha }^{m}\) and \(\overline{\phi }_{\beta ,\alpha }^{m + 1}\) of Eq. (17). Then RCmvMFE can be calculated by Eq. (18)

    $$ {\text{RCmvMFE}}\left( {Y, \beta ,M,n,r} \right) = - {\text{ln}}\left( {\frac{{\overline{\phi }_{\beta ,\alpha }^{m + 1} }}{{\overline{\phi }_{\beta ,\alpha }^{m} }}} \right) $$
    (18)

2.4.4 RCmvMSE

RCmvMSE [31] differs from RCmvMFE only in the way the probability is calculated when the embedding dimension is \(m\).

$$ B_{i}^{m} \left( r \right) = (N - n - 1)^{ - 1} P_{i} $$
(19)
$$ B^{m} \left( r \right) = (N - n)^{ - 1} \mathop \sum \limits_{i = 1}^{N - n} B_{i}^{m} \left( r \right) $$
(20)

In Eq. (19), \(P_{i}\) denotes the number of vectors \(X_{m} \left( j \right)\) (\(j \ne i\)) whose distance from \(X_{m} \left( i \right)\) does not exceed r. Let m = m + 1 and repeat the above steps to obtain \(B^{m + 1} \left( r \right)\). Calculate the mean values \(\overline{B}_{\beta ,\alpha }^{m}\) and \(\overline{B}_{\beta ,\alpha }^{m + 1}\) in the \(m\) and \(m + 1\) dimensions. RCmvMSE can then be calculated by Eq. (21)

$$ {\text{RCmvMSE}}\left( {Y, \beta ,M,n,r} \right) = - {\text{ln}}\left( {\frac{{\overline{B}_{\beta ,\alpha }^{m + 1} }}{{\overline{B}_{\beta ,\alpha }^{m} }}} \right) $$
(21)

2.4.5 MPE

To better study and analyze the dynamic characteristics of EEG signals, Ouyang et al. [32] proposed multiscale permutation entropy (MPE) based on permutation entropy.

Step 1: A new time series is obtained by coarse-graining an original sequence Y of length N, where \(\tau\) is the scale factor.

$$ y_{j}^{\left( \tau \right)} = \frac{1}{\tau }\mathop \sum \limits_{{i = \left( {j - 1} \right)\tau + 1}}^{j\tau } x_{i} ,1 \le j \le \left \lfloor\frac{N}{\tau } \right \rfloor$$
(22)

Step 2: The phase space reconstruction is applied with \( y^{\left( \tau \right)}\) to obtain the time series \(X_{i}\).

$$ X_{i} = \left( {y_{i} ,y_{i + \lambda } , \ldots ,y_{{i + \left( {m - 1} \right)\lambda }} } \right) $$
(23)

where \(m\) is the embedding dimension and \(\lambda\) is the delay time.

Step 3: Each \(X_{i}\) is sorted in ascending order to generate a sequence of position indices. There are \(m!\) possible permutations, and the probability of each permutation is calculated according to Eq. (24).

$$ P\left( \omega \right) = \frac{T\left( \omega \right)}{{N - \left( {m - 1} \right)\lambda }} $$
(24)

where \(T\left( \omega \right)\) is the number of occurrences of permutation \(\omega\), \(1 \le \omega \le m!\)

Step 4: Define the multiscale permutation entropy by Eq. (25).

$$ H_{PE} = - \sum P\left( \omega \right)\ln P\left( \omega \right) $$
$$ H_{MPE} = [H_{P1} ,H_{P2} ...H_{P\tau } ] $$
(25)
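
A short implementation of MPE under these definitions is given below; `m`, `lam`, and the number of scales are illustrative defaults, not the paper's settings.

```python
import numpy as np


def permutation_entropy(x, m=3, lam=1):
    """Permutation entropy of one series (Eqs. 23-25, single scale)."""
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (m - 1) * lam
    counts = {}
    for i in range(n_vec):
        pattern = tuple(np.argsort(x[i:i + (m - 1) * lam + 1:lam]))  # ordinal pattern of X_i
        counts[pattern] = counts.get(pattern, 0) + 1
    p = np.array(list(counts.values())) / n_vec                      # Eq. (24)
    return float(-np.sum(p * np.log(p)))                             # Eq. (25)


def mpe(x, m=3, lam=1, max_scale=5):
    """Multiscale permutation entropy: one PE value per scale factor."""
    x = np.asarray(x, dtype=float)
    values = []
    for tau in range(1, max_scale + 1):
        n_seg = len(x) // tau
        cg = x[:n_seg * tau].reshape(n_seg, tau).mean(axis=1)        # Eq. (22)
        values.append(permutation_entropy(cg, m, lam))
    return np.array(values)
```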

2.5 Deviation

In order to measure the stability of the model, a new deviation indicator is defined. Suppose that for experiment A, the result of the i-th repeated experiment is \(A_{i}\), i = 1, 2, 3, …, N.

$$ {\text{Deviation}}_{A} = {\text{max}}\left[ {A_{i} } \right] - {\text{min}}\left[ {A_{i} } \right] $$
(26)

where \({\text{max}}\left[ {A_{i} } \right]\) is to find the maximum value of \( A_{i}\), \({\text{min}}\left[ {A_{i} } \right]\) is to find the minimum value of \(A_{i}\).
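
A one-line realization of Eq. (26), applied to a hypothetical set of repeated-trial accuracies:

```python
import numpy as np

def deviation(results):
    """Deviation of Eq. (26): spread of the repeated-experiment results."""
    results = np.asarray(results, dtype=float)
    return float(results.max() - results.min())

# ten hypothetical repeated accuracies (%) of one experiment
print(deviation([99.5, 99.7, 99.6, 99.4, 99.8, 99.5, 99.6, 99.7, 99.5, 99.6]))  # approx. 0.4
```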

3 The WHO–VMD–CCWT–EFF

The framework of the WHO–VMD–CCWT–EFF model is shown in Fig. 2. The method comprises three parts: fault signal denoising, feature extraction and fusion, and feature classification.

Fig. 2 Flow chart of the WHO-VMD-CCWT-EFF algorithm

Denoising: First, the VMD optimized by the WHO algorithm decomposes the bearing fault signals into IMFs. Secondly, the correlation coefficients between each IMF and the original bearing fault signal are calculated. Then, the IMFs whose correlation coefficients with the original bearing signal are greater than 0.2 are selected. Finally, the correlation coefficients are used as weights to multiply the corresponding IMFs and reconstruct the fault signal.
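
The denoising step can be sketched as follows, assuming the IMFs have already been obtained from the WHO-optimized VMD; any renormalization of the weights is not specified, so the raw correlation coefficients are used directly here.

```python
import numpy as np


def ccwt_denoise(imfs, signal, threshold=0.2):
    """Correlation coefficient weight threshold (CCWT) reconstruction.

    IMFs whose correlation coefficient with the raw signal is below the
    threshold are dropped; the remaining IMFs are multiplied by their
    coefficients (Eq. 4) and summed to rebuild the denoised signal.
    """
    imfs = np.asarray(imfs, dtype=float)                 # shape (k, signal_length)
    r = np.array([np.corrcoef(imf, signal)[0, 1] for imf in imfs])
    keep = np.abs(r) > threshold
    return np.sum(r[keep, None] * imfs[keep], axis=0)
```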

Feature extraction: RCMDE, RCMFDE, RCmvMFE, RCmvMSE, and MPE are extracted from the denoised fault signal, and the five entropy features are then fused.

Classification: The feature samples of the fault signals are divided into training and test sets according to the experimental requirements, and the faults are then classified by the Fisher classifier.
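
A minimal end-of-pipeline sketch is shown below. Scikit-learn's linear discriminant analysis stands in for the Fisher classifier, the fused entropy features are replaced by random placeholders, and the 1:9 split mirrors the small-sample setting used in Sect. 4; none of these stand-ins come from the paper itself.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder fused-entropy feature matrix: 10 fault classes, 100 samples each.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 25))
y = np.repeat(np.arange(10), 100)

# 10 training and 90 test samples per class (a 1:9 ratio).
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.1, stratify=y, random_state=0)

clf = LinearDiscriminantAnalysis()      # stand-in for the Fisher classifier
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```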

4 Experimental analyses

In order to verify the effectiveness of the method proposed in this paper, two classical public datasets are used in the experiments.

4.1 Analysis of bearing fault signal

4.1.1 WHO–VMD

To address the issue that the VMD decomposition is greatly affected by the parameters \(\left( {k,\alpha } \right)\), WHO is used to optimize them. Figure 3 shows the convergence curves of some artificial damage and real damage signals from the Paderborn dataset. It can be seen that the number of iterations required to achieve convergence differs among the fault signals. Therefore, to accommodate as many fault signals as possible, 30 is chosen as the number of iterations. Similarly, the number of search agents is set to 30. From Fig. 4, it can be seen that the convergence of the fault signal is better when the number of search agents is 30.

Fig. 3 Convergence curve of partial fault signal under the Paderborn dataset

Fig. 4 Convergence curves of IR with Label = 2 in the Paderborn University dataset under different search agents

Taking the 0.3556 mm outer race bearing fault at 0HP as an example, the fitness curve of the WHO-VMD is shown in Fig. 5. As can be seen from Fig. 5, the value of the fitness function at the first iteration is \(7.2145 \times 10^{ - 4}\). The fitness value then decreases slightly to \(7.0273 \times 10^{ - 4}\) and levels off after two iterations. It is reduced to \(5.66 \times 10^{ - 4}\) after the fifth iteration and reaches its minimum after the 14th iteration. The results show that WHO has a fast convergence rate in the process of VMD parameter optimization, which proves that WHO is suitable for optimizing the parameters of VMD. The parameters obtained by WHO-VMD in this experiment are \(k = 5,\,\alpha = 1106\).

Fig. 5 The convergence curve of the WHO

To verify the superiority of the WHO algorithm in optimizing the VMD parameters, the particle swarm optimization algorithm (PSO) [33], the whale optimization algorithm (WOA) [34], and the moth-flame optimization algorithm (MFO) [35] are also used to optimize the VMD parameters. The population size of each optimization algorithm is set to 30 and the maximum number of iterations to 30, yielding the fitness function convergence curves shown in Fig. 6. As can be seen from Fig. 6, the convergence curve of PSO is unstable and fluctuates abruptly. The WOA and MFO converge after the 2nd and 6th iterations, respectively, with a power spectrum entropy value of \(7.0273 \times 10^{ - 4}\); they find their optima quickly and converge within a few iterations. In contrast, WHO reaches the termination fitness value of WOA and MFO after the 2nd iteration and then continues to iterate down to \(5.646 \times 10^{ - 4}\). This proves the superiority of WHO in optimizing the VMD parameters.

Fig. 6 VMD optimized by four different optimization algorithms

4.1.2 CCWT

After the VMD parameters are determined, the fault signal is decomposed into IMFs and denoised through CCWT. The correlation coefficients of the IR and OR signals in the Paderborn real damage D2 dataset are shown in Fig. 7. When 0.3 is chosen as the denoising threshold for CCWT, half of the IMFs are filtered out, which may lead to excessive denoising and loss of otherwise useful information. When 0.1 is chosen as the denoising threshold, no IMFs are filtered out and the expected denoising effect cannot be achieved. In this paper, 0.2 is chosen as the denoising threshold of CCWT. The IMFs with correlation coefficients less than 0.2 are filtered out first, and the correlation coefficients of the remaining IMFs are then used as weighting coefficients to reconstruct the original signal.

Fig. 7 The correlation coefficient about IR and OR in the Paderborn real damage D2 dataset. a OR (Label = 3), b IR (Label = 6)

Take the OR (Label = 3) of the Paderborn real damage D4 dataset as an example. For easy observation, 500 sample points are selected to compare the differences before and after denoising, as shown in Fig. 8. When the correlation coefficient threshold (CCT) is used, the IMFs whose correlation coefficient value is less than 0.2 are removed and the remaining components are reconstructed. Building on this, the CCWT applies the correlation coefficient values greater than 0.2 to the corresponding IMF components as weights. As can be seen from Fig. 8, compared with the signal denoised by CCT, the signal denoised by CCWT is smoother and has fewer burrs. Therefore, we consider the denoising effect of CCWT to be better.
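
For reference, the plain CCT reconstruction differs from the CCWT sketch in Sect. 3 only in dropping the weighting, which is what the comparison in Fig. 8 isolates:

```python
import numpy as np


def cct_denoise(imfs, signal, threshold=0.2):
    """Correlation coefficient threshold (CCT): unweighted sum of the kept IMFs."""
    imfs = np.asarray(imfs, dtype=float)
    r = np.array([np.corrcoef(imf, signal)[0, 1] for imf in imfs])
    return imfs[np.abs(r) > threshold].sum(axis=0)
```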

Fig. 8 Comparison of CCT and CCWT a original signal, b CCT denoising, c CCWT denoising

In addition, although the threshold is set to 0.2 considering the characteristics of most fault signals, the two datasets contain a large number of fault signals, and some signals have correlation coefficient values that are all greater than 0.2. Taking the IR (Label = 8) of the Paderborn real damage D1 dataset as an example, the denoising effects of CCT and CCWT in such situations are explored. As can be seen from Fig. 9, the signal after CCT denoising is almost indistinguishable from the original signal because all the correlation coefficient values are greater than 0.2. CCWT not only effectively avoids this defect but also achieves a good denoising effect, because the weighting operation on the IMFs enhances the useful signal and weakens the noisy signal. The larger the correlation coefficient between an IMF and the original signal during CCWT denoising, the more useful information the IMF contains; conversely, the smaller the correlation coefficient, the more the IMF is considered to contain noise. Using the correlation coefficient values as weights is therefore equivalent to amplifying the IMFs that are considered to contain useful information and shrinking those that contain noise, so the CCWT method is considered to both enhance the useful signal and weaken the noise.

Fig. 9 Comparison of CCT and CCWT, a original signal, b CCT denoising, c CCWT denoising

4.2 The CWRU dataset

4.2.1 The CWRU dataset description

The experimental data in this section come from the rolling bearing test stand shown in Fig. 10. The 6205-2RS JEM SKF deep groove ball bearing is used as the test bearing, the data are collected under four loads of 0HP, 1HP, 2HP, and 3HP, and the sampling frequency is 12 kHz. Three damage faults made by electro-discharge machining (EDM), namely the inner race fault, outer race fault, and ball fault, are included in the experiment. Each fault includes three degrees of damage with diameters of 0.1778 mm, 0.3556 mm, and 0.5334 mm, as shown in Table 1. In the experiment, 100 samples are intercepted for each fault signal without overlap.
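
The non-overlapping interception can be done as below; the segment length is an assumption for illustration, since the paper states only the number of samples per fault signal.

```python
import numpy as np


def segment_signal(record, n_segments=100, seg_len=1024):
    """Cut non-overlapping segments from one fault record (seg_len is illustrative)."""
    record = np.asarray(record, dtype=float)
    assert len(record) >= n_segments * seg_len, "record too short for this split"
    return record[:n_segments * seg_len].reshape(n_segments, seg_len)
```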

Fig. 10 The CWRU bearing data center bearing test stand

Table 1 The CWRU dataset description

4.2.2 Experimental analysis of single working condition bearing fault diagnosis

After feature extraction of the bearing fault signal, the Fisher classifier is used to classify the fused features. The number of samples for each class of fault signal is set to 100, and the training and test sets are then divided according to different proportions. Each result is the average of ten repetitions. In order to verify the performance of the selected classifier, three classifiers are compared with different ratios of training samples to test samples. As shown in Fig. 11, the accuracy of all three classifiers under the four working conditions shows an increasing trend as the ratio of training samples to test samples grows. The best decision tree performance is achieved with the 2HP data, while the SVM has higher accuracy under 1HP. For the Fisher classifier, except for the slightly lower performance of 0HP when the ratio of training samples to test samples is 1:9, the accuracy under the other working conditions exceeds 99%. When the ratio of training samples to test samples is 1:9, the Fisher classifier improves accuracy by 4.5–5.4% over the decision tree classifier and by 3.2–4.63% over the SVM classifier. It is clear that the Fisher classifier still performs well when the number of training samples is small, which is the desired behavior.

Fig. 11 Diagnostic accuracy of three classifiers under four working conditions

For a bearing fault diagnosis method, superior performance is the primary requirement, but the stability of the model is also crucial. To further illustrate the stability of the WHO-VMD-CCWT-EFF, the results of 10 experiments for the four working conditions are recorded as shown in Fig. 12. The difference between the maximum and minimum values in the 10 experiments is used as the deviation to measure the stability of the WHO-VMD-CCWT-EFF. It can be seen from Fig. 12 that when the ratio of training samples to test samples is 1:9, the deviation under 0HP is 2.23%, which is the largest among the four working conditions. At a training sample to test sample ratio of 2:8, the deviations of 0HP and 1HP are 0.25%, and the deviations of 2HP and 3HP are 0. This indicates that the 0HP data with few training samples are slightly less stable than the data under the other working conditions, but when the proportion of training samples is slightly larger, while still small, the 0HP data also show very good performance.

Fig. 12 The 10-time fault diagnosis accuracy of the Fisher classifier under different training and testing ratios. a 0HP, b 1HP, c 2HP, d 3HP

Table 2 presents the experimental results of the four working conditions under different ratios of training samples to test samples. It can be seen that under all four working conditions, the Fisher classifier not only has higher accuracy but also maintains the smallest deviation, which further verifies the effectiveness and stability of the WHO-VMD-CCWT-EFF. At the same time, the accuracy of the other two classifiers reaches over 99% when the ratio of training samples to test samples is 9:1 and remains around 95% when the ratio is 1:9. This indicates that the denoising and feature extraction are effective; the feature extraction will be analyzed in detail later.

Table 2 Experimental results of four working conditions under different ratios of training samples and test samples (%)

4.2.3 Experimental analysis of bearing fault diagnosis under multiple working conditions

Since the actual industrial environment is complex and changeable, it is impossible to ensure that the collected data always come from the same working condition. Therefore, a variety of bearing fault diagnosis experiments under multiple working conditions are carried out and analyzed. The CWRU dataset includes four working conditions, which means that six mixed experiments with two working conditions and four mixed experiments with three working conditions are included. As in the single working condition experiments, the training set and the test set are divided according to different proportions. In addition, each experiment is repeated ten times and the average value is taken as the final result. The experimental data for the two working conditions and the three working conditions are described in Tables 3 and 4, respectively (with 0HP + 1HP and 0HP + 1HP + 2HP as examples).

Table 3 Dataset descriptions with different ratios of training data to test data under two working conditions
Table 4 Dataset descriptions with different ratios of training data to test data under three working conditions

Figure 13 shows the experimental results under multiple working conditions. Compared with Fig. 11, it can be seen that the WHO-VMD-CCWT-EFF performs better in the multiple working condition experiments. To explain this phenomenon, the experiments shown in Fig. 14 are performed. It is not difficult to see that the accuracy rates of the three classifiers under multiple working conditions are almost always higher than those under a single working condition. Therefore, we conclude that the phenomenon is not related to the classifier; it may arise because the extracted features of the same fault under different working conditions are relatively similar, or because the number of experimental samples increases under multiple working conditions.

Fig. 13 Experimental results of different ratios of training samples to test samples under multiple working conditions

Fig. 14 Experiment comparison between a single working condition and multiple working conditions under three classifiers

To verify whether this phenomenon is related to the increase in the number of samples, experiments are carried out with the combination 0HP + 1HP as an example. Of course, the number of training samples and test samples for the selected combinations is the same as for the experiments under a single working condition. Table 5 presents the experimental data in detail.

Table 5 Description of the number of samples under two working conditions that are consistent with the number of samples under a single operating condition (0HP + 1HP as an example)

Figure 15 shows the experimental results: the difference in the accuracy of the Fisher classifier between the two cases is 0–0.41%, and the differences for the decision tree and SVM are 0–2.9%. It is not difficult to conclude that, for some classifiers, an increase in sample size has a certain impact on accuracy, but the proposed model almost overcomes this shortcoming. For further validation, the experiments shown in Fig. 16 are performed. The numbers of samples for the two working conditions and three working conditions in the figure are given in Tables 3 and 4. In Fig. 16, the ratio of training samples to test samples for each experiment is 1:9. From the figure, the accuracy of some three working condition experiments is higher than that of the two working condition experiments, while the rest are lower, and the accuracy of the four working condition experiment is the lowest. This indicates that, for the proposed model, an increase in the number of samples does not necessarily lead to an improvement in the fault diagnosis rate.

Fig. 15 Effect of using the same number of data samples and the same proportion of data samples on bearing fault diagnosis a the same sample size as the single working condition experiment b the same sample proportion as the single working condition experiment

Fig. 16 Effect of the phenomenon of increasing sample size due to increased working conditions on bearing fault diagnosis a 0H + 1H and its extended experiments b 0H + 2H and its extended experiments c 0H + 3H and its extended experiments d 1H + 2H and its extended experiments e 1H + 3H and its extended experiments f 2H + 3H and its extended experiments

4.3 Paderborn university dataset

4.3.1 Dataset description

The Paderborn dataset was published by Christian Lessmeier et al. of the KAt-DataCenter, and the 6205 deep groove ball bearing is used as the test bearing for the collection of the artificial damage dataset and the real damage dataset. The rolling bearing test stand is shown in Fig. 17. As with the CWRU data, the Paderborn dataset is collected under four working conditions, as shown in Table 6. As shown in Tables 7 and 8, the artificial damage dataset contains a total of 8 faults, while the real damage dataset contains 9 faults. The bearing damage locations in both datasets are the inner ring (IR) and the outer ring (OR). For each type of fault signal, 100 samples are intercepted without overlap for the experiment.

Fig. 17 Rolling bearing test stand

Table 6 Four working conditions of bearing experimental data
Table 7 The artificial damage dataset description
Table 8 The real damage dataset description

4.3.2 Experimental analysis of bearing fault diagnosis in single working condition

Whether bearing fault types can be diagnosed efficiently and accurately depends largely on the feature extraction, and the classification effect is better when the extracted fault features are more distinct. To confirm the advantages of the feature extraction method in this paper, the features of the artificial damage and the real damage under the four working conditions are visualized.

Figure 18a–d and e–h show the feature visualization plots for the four single working conditions under the artificial damage and real damage data, respectively. As shown in Fig. 18, D3 performs best in the feature visualization of the artificial damage, which means that the feature extraction for D3 is more successful. In contrast, D1 performs relatively poorly, which explains the lower accuracy of D1 compared with the other data in Table 9. The same situation occurs in the real damage experiment. This suggests that the relatively low accuracy of D1 compared with the other data may be due to the acquisition conditions.

Fig. 18 Feature visualization a artificial damage D1, b artificial damage D2, c artificial damage D3, d artificial damage D4, e real damage D1, f real damage D2, g real damage D3, h real damage D4

Table 9 Experimental results of artificial damage

The experimental results of the artificial damage and real damage are given in Tables 9 and 10, respectively. The accuracy is relatively low and the deviation is large when the ratio of training samples to test samples is low. As the ratio increases, all the data reach 100% except the real damage D1, which reaches 99.77%. This shows that increasing the ratio of training samples to test samples improves the experimental results. In addition, the real damage D2 and D3 reach 100% when the ratio of training samples to test samples is 2:8, and D4 reaches 100% at 3:7. This indicates that even with only a small number of samples, the WHO-VMD-CCWT-EFF can identify the type of bearing fault.

Table 10 Real damage experimental results

4.3.3 Experimental analysis of bearing fault diagnosis under multiple working conditions

As with the CWRU data, the experiments on artificial damage and real damage under multiple working conditions are also carried out after completing the experiments of single working conditions. Here the experimental data for the artificial damage dataset are analyzed specifically, and the same is true for the real damage dataset. The experimental data for the two working conditions and the three working conditions are described in detail in Tables 11 and 12, respectively.

Table 11 Dataset descriptions with different ratios of training data to test data under two working conditions (D1 + D2 under artificial damage dataset as an example)
Table 12 Dataset descriptions with different ratios of training data to test data under three working conditions (D1 + D2 + D3 under artificial damage dataset as an example)

Tables 13 and 14 present the results of the experiments on artificial damage and real damage under multiple working conditions, respectively. For both artificial damage and real damage, the accuracy reaches more than 99% when the ratio of training samples to test samples is 1:9. Compared with a single working condition, the accuracy under multiple working conditions is higher, which is consistent with the conclusions obtained from the CWRU data. In addition, when the ratio of training samples to test samples is 1:9, the deviation is relatively large, but it improves at 2:8 and then fluctuates as the ratio increases. This reminds us that the selection of the training to test sample ratio is very important when experimenting with multiple working conditions.

Table 13 Experimental results of artificial damage under multiple working conditions
Table 14 Real damage experimental results under multiple working conditions

To verify the effectiveness of the entropy fusion method proposed in this paper, experiments with single entropies and fused entropies are carried out. Nine entropies are selected for the experiments in Tables 15 and 16: RCMDE, RCMFDE, RCmvMFE, RCmvMSE, MPE, multiscale dispersion entropy (MDE), multiscale weighted permutation entropy (MWPE), multivariate fuzzy entropy (MVFE), and multivariate sample entropy (MVSE). Among them, MVFE and MVSE are single-scale entropies, and the remaining seven are multiscale entropies. Taking the multiple working condition experiments as an example, the performance of the multiscale entropies on both artificial damage and real damage far exceeds that of the single-scale entropies. In order to improve the accuracy and reduce the deviation of the model, an entropy fusion method is proposed.

Table 15 Accuracy of single entropy of artificial damage
Table 16 Accuracy of single entropy of real damage

For the fusion of entropy features, the main goals are to improve the accuracy and minimize the deviation. According to Tables 15 and 16, the better-performing entropy features are fused in turn to obtain the experimental results of the artificial damage fusion entropy features in Table 17 and the real damage fusion entropy features in Table 18. Comparing Table 17 with Table 18, we can see that the WHO-VMD-CCWT-EFF is more applicable to the real damage dataset, which is the direction of our efforts. In the entropy fusion, the experimental performance of D1, D2, and D4 is not the best, but the difference from the best-performing result is slight.
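
The incremental fusion can be evaluated with a loop of the following form; the feature blocks are random placeholders standing in for the extracted entropy features, and cross-validated LDA accuracy replaces the repeated train/test splits used in Tables 17 and 18.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
y = np.repeat(np.arange(9), 100)                        # 9 real-damage fault classes
feature_blocks = {name: rng.standard_normal((900, 5))   # placeholder entropy features
                  for name in ["RCMDE", "RCMFDE", "RCmvMFE", "RCmvMSE", "MPE"]}

names = list(feature_blocks)
for k in range(1, len(names) + 1):                      # fuse the entropies one by one
    X = np.hstack([feature_blocks[n] for n in names[:k]])
    acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
    print(f"{k} entropies fused ({', '.join(names[:k])}): {acc:.3f}")
```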

Table 17 Accuracy of artificial damage fusion entropy
Table 18 Accuracy of real damage fusion entropy

To make the model more convincing, the results of the three working condition experiments of D1, D2, and D3 with different ratios of training samples to test samples are analyzed. The trend plots of the entropy fusion experimental results for artificial and real damage under D1, D2, and D3 are given in Figs. 19 and 20, respectively. As the number of fused entropy features increases, the artificial damage accuracy increases and the deviation decreases. For the real damage, the performance of the five-entropy fusion is not the best when the ratio of training samples to test samples is high. This shows that for the fusion of entropy features, more is not always better; the category and number of entropies should be selected reasonably according to the experiment. In this paper, five entropy features are fused in the model.

Fig. 19 Experimental results of entropy fusion of artificial damage D1 + D2 + D3

Fig. 20 Experimental results of entropy fusion of real damage D1 + D2 + D3

4.4 Model comparison

As shown in Table 19, in order to verify the performance of the proposed method, existing models from recent years are selected for comparison. The CWRU dataset, the most widely used benchmark in bearing fault diagnosis, is used for the model comparison. The comparison results indicate that the proposed model identifies bearing faults with higher accuracy when the training and testing data are the same as in the other literature, and that it uses less data while maintaining the same level of accuracy.

Table 19 Comparison of the proposed model with other models in the literature

5 Conclusions

A bearing fault diagnosis method, WHO-VMD-CCWT-EFF, based on signal denoising and feature fusion is proposed to address the low accuracy of traditional methods and the need for large numbers of data samples in deep learning methods. This paper focuses on two aspects: bearing fault signal denoising and feature extraction. In order to verify the effectiveness and stability of the model, the CWRU dataset and Paderborn dataset are used for various single and multiple working condition experiments. The experimental results show that WHO-VMD-CCWT-EFF exhibits superior performance under both single and multiple working conditions when the training data of both datasets are small. The following experimental results confirm this conclusion.

  1. The WHO-VMD-CCWT-EFF model can accurately identify the fault status of bearings. This is proved by the fact that the experiments on the CWRU dataset and the Paderborn dataset (12 single working conditions and 30 multiple working conditions) achieve over 99% accuracy.

  2. In the Paderborn dataset, when the ratio of training samples to test samples is 1:9, the difference between real and artificial damage is 0.02%–1.44%. This indicates that even under small sample experimental conditions, the model has good stability and generalization ability.

  3. The fused entropy feature vector is an effective method for extracting bearing fault features. In the experiments on the CWRU dataset, in addition to the Fisher classifier achieving an accuracy of over 98% with small samples, the accuracies of the decision tree and SVM classifiers with small samples also reach over 93.5%, which proves this point.

  4. Compared with the Paderborn dataset, the CWRU dataset performs better in the experiments. This indicates that differences in data from different devices can affect the performance of the model. Therefore, in the future, we will focus on researching cross-equipment bearing fault diagnosis.