1 Introduction

Rotating machinery is an indispensable part of industrial society, and most major accidents in rotating machinery are caused by bearing failure. Bearings, among the most heavily stressed components in rotating machinery, have a significant impact on its performance and productivity [1, 2]. When a bearing fails, repeated collisions and periodic pulses occur because of the deformation at the fault position [3]. Early bearing defects first excite resonance frequencies in the high-frequency domain [4]. Traditional fault diagnosis techniques mainly rely on demodulation and envelope-based methods, because they can identify the frequencies caused by bearing defects. However, the main challenge in applying these techniques is to correctly select the center frequency and bandwidth of the band used to compute the envelope spectrum.

Spectral kurtosis (SK) is widely used to select the resonant frequency band in the fault diagnosis of bearing vibration signals. Building on extensive systematic research on spectral kurtosis [6, 7], Antoni proposed the Fast Kurtogram (FK) [5], which uses binary and 1/3-binary tree structures. Later, Randall and Sawalhi [8] combined minimum entropy deconvolution with FK to amplify the pulse signal, making it easier to detect periodic components in the vibration signal. Combet and Gelman [9] improved FK through time-domain synchronous averaging for early fault diagnosis of gears. Lei [10, 11] improved FK for early bearing fault diagnosis and achieved good results. However, FK itself has certain limitations. For example, its segmentation of the spectrum is not adaptive, so it may not find the most suitable bandwidth and center frequency. FK is also not immune to accidental pulses. Hence, the Protrugram was proposed by Barszcz [12] to find the optimal center frequency of the resonance band. Wang [13] proposed adaptive SK, which can adaptively estimate the bandwidth, and applied it to mechanical fault diagnosis in [14, 15]. Xu and Tian [16] proposed empirical scanning spectrum kurtosis for the selection of the optimal frequency band. The Sparsogram [17] was proposed to avoid the effects of interference. To process signals with low SNR, the Autogram [18], based on the kurtosis of the unbiased autocorrelation of the squared envelope, was proposed. Tse [19] improved the Sparsogram by using genetic algorithms to adaptively determine the optimal center frequency and bandwidth. Unfortunately, the genetic algorithm increases the computation time. Time–frequency analysis plays a very important role in signal decomposition and feature extraction [20].
Similarly, methods such as the synchrosqueezed wavelet transform (SWT) and the synchrosqueezed Fourier transform (SFT) are widely used for fault detection and classification in bearings [21, 22]. Empirical mode decomposition (EMD), proposed by Huang [23], can adaptively process non-stationary signals. Inspired by EMD, Gilles developed a new method named the empirical wavelet transform (EWT) [24]. The method divides the signal into a number of IMFs by a local maximum rule in the Fourier spectrum. The main idea of EWT is to partition the spectrum of the signal by designing multiple filters with non-fixed bandwidths for mode decomposition. Kedadouche [25] compared the advantages and disadvantages of EMD and EWT and provided a more reliable basis for follow-up research on EWT. EWT has since been improved for fault diagnosis and applied to rotor fault diagnosis [26, 27]. An adaptive parameterless EWT method is proposed in [28], which combines EWT with the quadrature derivative-based normalized Hilbert transform for adaptive mode separation. Wang [29, 30] mathematically formulated EWT as a constrained optimization problem for bearing fault diagnosis. Chen [31] extracted modulated signals by applying EWT to orthogonal basis decomposition components and used wavelet spatial neighboring coefficient denoising with a data-driven threshold to improve the SNR. Jiang and Zhu [32] presented a decoupling diagnosis method for compound faults of rolling bearings based on the empirical wavelet transform and the Duffing oscillator. The mixed-mode components obtained by EWT are observed directly as chaotic motions after processing by the Duffing oscillator, so single fault components are identified from the empirical modes one by one. Hu [33, 34] proposed an enhanced EWT (EEWT) that improves the spectrum segmentation with an order statistics filter (OSF) and is able to select the optimal resonance frequency band of a modulated signal.
OSF was also used by Zhang [35] and Xu [36] to construct a new boundary distribution diagram similar to FK. Huang [37] proposed the frequency phase space empirical wavelet transform, which divides the Fourier spectrum with the Teager energy distribution as a reference. Zhao [38] used a maximum–minimum length curve method to transform the spectrum of the original vibration signal to scale space in order to find meaningful clusters; the resulting method is named MSCEWT. Hsueh et al. [39] proposed a model based on a deep CNN and EWT to detect and infer the motor operating state [40]. Ou [41] combined an improved particle swarm algorithm, a majorization-minimization-based total variation denoising algorithm, and EWT to decompose bearing fault signals and extract fault features. Liu and Chen [42] presented a general overview of recent advances in research on the EWT algorithm and its state-of-the-art applications in a wide range of areas, providing comprehensive references for readers interested in EWT.

In this paper, a novel method named the sparsity-guided multi-scale empirical wavelet transform (SMSEWT) is proposed, which optimizes the spectrum segmentation, extracts the useful frequency band, reduces the number of invalid components, and suppresses mode aliasing. In order to group frequency bands containing similar information into the final components, the Fourier spectrum is divided equally and the kurtosis of each part is calculated. Adjacent frequency bands with similar kurtosis are considered to contain the same kind of information and are merged to achieve adaptive segmentation of the spectrum. Subsequently, empirical wavelet filters are constructed and the time-domain waveform of each frequency band is obtained. Sparsity is then used to select the envelope component richest in periodic pulses, from which bearing faults can be diagnosed. The rest of this paper is structured as follows. Section 2 briefly introduces the basic theory of EWT. Section 3 proposes the multi-scale EWT, which optimizes the spectral segmentation used to construct the EWT filters. In Sect. 4, SMSEWT is proposed for the extraction of useful envelope components. Experimental validation and comparisons are performed in Sect. 5.

2 Empirical wavelet transform

2.1 Basic theory of EWT

In the empirical wavelet transform, the frequency domain \([0,\uppi ]\) is adaptively divided into \(\mathrm{N}\) contiguous segments \({\Lambda }_{\mathrm{n}}\). A filter bank is then built according to the characteristics of the Fourier spectrum to extract the different components, as shown in Fig. 1. With \({\upomega }_{\mathrm{n}}\) denoting the boundaries between segments:

Fig. 1 Empirical wavelet transform filtering method

$$\left\{\begin{array}{c}{\Lambda }_{\mathrm{n}}=\left[{\upomega }_{\mathrm{n}-1},{\upomega }_{\mathrm{n}}\right] \\ \bigcup_{\mathrm{n}=1}^{\mathrm{N}}{\Lambda }_{\mathrm{n}}=[0,\uppi ]\end{array}\right.$$
(1)

EWT consists of an empirical scaling function \({\widehat{{\varnothing }}}_{1}\left(\upomega \right)\) and several empirical wavelet functions \({\widehat{\Psi }}_{\mathrm{n}}\left(\upomega \right)\). The empirical scaling function and the empirical wavelet functions are expressed by:

$${\widehat{{\varnothing }}}_{\mathrm{n}}\left(\upomega \right)=\left\{\begin{array}{ll}1, & \left|\upomega \right|\le \left(1-\upgamma \right){\upomega }_{\mathrm{n}}\\ \mathrm{cos}\left[\frac{\uppi }{2}\upbeta \left(\frac{1}{2\upgamma {\upomega }_{\mathrm{n}}}\left(\left|\upomega \right|-\left(1-\upgamma \right){\upomega }_{\mathrm{n}}\right)\right)\right], & \left(1-\upgamma \right){\upomega }_{\mathrm{n}}\le \left|\upomega \right|\le \left(1+\upgamma \right){\upomega }_{\mathrm{n}}\\ 0, & \text{otherwise}\end{array}\right.$$
(2)
$${\widehat{\Psi }}_{\mathrm{n}}\left(\upomega \right)=\left\{\begin{array}{ll}1, & \left(1+\upgamma \right){\upomega }_{\mathrm{n}}\le \left|\upomega \right|\le \left(1-\upgamma \right){\upomega }_{\mathrm{n}+1}\\ \mathrm{cos}\left[\frac{\uppi }{2}\upbeta \left(\frac{1}{2\upgamma {\upomega }_{\mathrm{n}+1}}\left(\left|\upomega \right|-\left(1-\upgamma \right){\upomega }_{\mathrm{n}+1}\right)\right)\right], & \left(1-\upgamma \right){\upomega }_{\mathrm{n}+1}\le \left|\upomega \right|\le \left(1+\upgamma \right){\upomega }_{\mathrm{n}+1}\\ \mathrm{sin}\left[\frac{\uppi }{2}\upbeta \left(\frac{1}{2\upgamma {\upomega }_{\mathrm{n}}}\left(\left|\upomega \right|-\left(1-\upgamma \right){\upomega }_{\mathrm{n}}\right)\right)\right], & \left(1-\upgamma \right){\upomega }_{\mathrm{n}}\le \left|\upomega \right|\le \left(1+\upgamma \right){\upomega }_{\mathrm{n}}\\ 0, & \text{otherwise}\end{array}\right.$$
(3)

where the transition function \(\upbeta (\mathrm{x})\), the coefficient \(\upgamma\), and the transition phase \({\uptau }_{\mathrm{n}}\) are:

$$\left\{\begin{array}{l}\beta \left(\mathrm{x}\right)={\mathrm{x}}^{4}\left(35-84\mathrm{x}+70{\mathrm{x}}^{2}-20{\mathrm{x}}^{3}\right)\\ \gamma <\mathrm{min}\left(\frac{{\upomega }_{\mathrm{n}+1}-{\upomega }_{\mathrm{n}}}{{\upomega }_{\mathrm{n}+1}+{\upomega }_{\mathrm{n}}}\right)\\ {\uptau }_{\mathrm{n}}=\gamma {\upomega }_{\mathrm{n}} , 0<\gamma <1\end{array}\right.$$
(4)
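
As an illustration of Eqs. 2 to 4, the following minimal Python sketch samples the empirical scaling function and the empirical wavelets on a normalized frequency grid. It is not the implementation used in this paper: the function names `beta` and `ewt_filter_bank`, the grid size, the default choice of \(\upgamma\) as 0.9 times its upper bound in Eq. 4, and the convention that the last wavelet extends up to \(\uppi\) are all assumptions made for the example.

```python
import numpy as np

def beta(x):
    """Transition polynomial of Eq. 4, clipped so that beta = 0 for x <= 0 and 1 for x >= 1."""
    x = np.clip(x, 0.0, 1.0)
    return x**4 * (35 - 84 * x + 70 * x**2 - 20 * x**3)

def ewt_filter_bank(boundaries, n_freq=512, gamma=None):
    """Sample the scaling function (row 0) and the wavelets (rows 1..) on [0, pi].

    boundaries: increasing interior boundaries omega_1 .. omega_{N-1} in rad.
    """
    w = np.linspace(0.0, np.pi, n_freq)           # non-negative grid, so |w| = w
    b = np.asarray(boundaries, dtype=float)
    edges = np.concatenate([b, [np.pi]])
    if gamma is None:
        # Eq. 4: gamma must stay below min (w_{n+1} - w_n) / (w_{n+1} + w_n).
        ratios = (edges[1:] - edges[:-1]) / (edges[1:] + edges[:-1])
        gamma = 0.9 * ratios.min()                # assumed safety factor

    def transition(wn):
        # Argument shared by the cos and sin branches around boundary wn.
        return 0.5 * np.pi * beta((w - (1 - gamma) * wn) / (2 * gamma * wn))

    def fall(wn):                                 # cos branch: 1 -> 0 across wn
        return np.cos(transition(wn))

    def rise(wn):                                 # sin branch: 0 -> 1 across wn
        return np.sin(transition(wn))

    bank = [fall(b[0])]                           # Eq. 2, scaling function
    for n in range(len(b)):                       # Eq. 3, one wavelet per band
        upper = fall(b[n + 1]) if n + 1 < len(b) else np.ones_like(w)
        bank.append(rise(b[n]) * upper)
    return np.array(bank)

bank = ewt_filter_bank([0.8, 1.6])
print(bank.shape)
```

Because the cos and sin branches share the same argument inside each transition band, the squared responses sum to one at every frequency, which is a convenient sanity check for any implementation of these filters.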

Let \(\mathrm{F}(\bullet )\) denote the Fourier transform and \({\mathrm{F}}^{-1}(\bullet )\) the inverse Fourier transform. The detail coefficients \({\mathrm{W}}_{\mathrm{f}}^{\upvarepsilon }\) are defined as the inner product of the signal with the empirical wavelets:

$${\mathrm{W}}_{\mathrm{f}}^{\upvarepsilon }\left(\mathrm{n},\mathrm{t}\right)= <\mathrm{f},{\Psi }_{\mathrm{n}}>=\int \mathrm{f}(\uptau )\overline{{\Psi }_{\mathrm{n}}\left(\uptau -\mathrm{t}\right)}\mathrm{d}\tau$$
(5)

The approximation coefficients \({\mathrm{W}}_{\mathrm{f}}^{\upvarepsilon }(0,\mathrm{t})\) are obtained from the inner product with the scaling function:

$${\mathrm{W}}_{\mathrm{f}}^{\upvarepsilon }\left(0,\mathrm{t}\right)= <\mathrm{f},{{\varnothing }}_{1}>=\int \mathrm{f}(\uptau )\overline{{{\varnothing } }_{1}\left(\uptau -\mathrm{t}\right)}\mathrm{d}\tau$$
(6)

The empirical modes are then given by:

$${\mathrm{f}}_{0}\left(\mathrm{t}\right)={\mathrm{W}}_{\mathrm{f}}^{\upvarepsilon }\left(0,\mathrm{t}\right)*{{\varnothing }}_{1}\left(\mathrm{t}\right)$$
(7)
$${\mathrm{f}}_{\mathrm{n}}\left(\mathrm{t}\right)={\mathrm{W}}_{\mathrm{f}}^{\upvarepsilon }\left(\mathrm{n},\mathrm{t}\right){*\Psi }_{\mathrm{n}}\left(\mathrm{t}\right)$$
(8)
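
Equations 5 to 8 can be evaluated in the frequency domain, since correlation and convolution both become multiplications of spectra. The sketch below is one hedged way to do this with NumPy for a real-valued signal; the function name `ewt_decompose`, the use of circular (FFT-based) convolution, and the brick-wall filters in the usage lines are assumptions for illustration, not part of the original method.

```python
import numpy as np

def ewt_decompose(signal, bank):
    """Return (coefficients, modes) for a real signal.

    bank: real frequency responses sampled on the rfft grid,
          shape (N, len(signal) // 2 + 1).
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    # Eqs. 5-6: the inner product with a shifted filter is a correlation,
    # i.e. the signal spectrum times the conjugate filter response.
    coeffs = np.array([np.fft.irfft(spectrum * np.conj(h), n=n) for h in bank])
    # Eqs. 7-8: convolving the coefficients with the same filter multiplies
    # their spectrum by the filter response once more.
    modes = np.array([np.fft.irfft(np.fft.rfft(c) * h, n=n)
                      for c, h in zip(coeffs, bank)])
    return coeffs, modes

# Usage with two ideal (brick-wall) filters as a stand-in for Eqs. 2-3.
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
k = np.arange(256 // 2 + 1)
bank = np.array([(k < 40).astype(float), (k >= 40).astype(float)])
coeffs, modes = ewt_decompose(x, bank)
print(np.allclose(modes.sum(axis=0), x))
```

When the squared filter responses sum to one at every frequency, the modes add up to the original signal, which is what the final line checks for the stand-in filters.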

2.2 The shortcomings of EWT

Gilles proposed several rules for finding \({\upomega }_{\mathrm{n}}\), which divide the whole Fourier spectrum into N parts. The most common rule uses the local maxima of the spectrum to determine the segments: N is set by the first several local maxima, and each \({\upomega }_{\mathrm{n}}\) is taken as the midpoint between two adjacent maxima. When several maxima are concentrated close together, this rule may split a resonance sideband into different parts, so the extracted periodic pulse component is degraded. To illustrate this, a periodic pulse signal with a center frequency of 1500 Hz and a sideband width of 100 Hz is simulated by Eq. 9 and shown in Fig. 2.

Fig. 2 The waveform of the simulation signal

$${\mathrm{s}}_{0}\left(\mathrm{t}\right)={\sum }_{\mathrm{i}=1}^{\mathrm{M}}{\mathrm{Ae}}^{-\mathrm{g}2\uppi {\mathrm{f}}_{\mathrm{n}}^{\mathrm{i}}\mathrm{t}}\mathrm{sin}\left(2\uppi {\mathrm{f}}_{\mathrm{n}}^{\mathrm{i}}\mathrm{t}\sqrt{1-{\mathrm{g}}^{2}}\right)$$
(9)

where the amplitude \(A\)=5, the natural frequency \({f}_{n}\)=1600 Hz, the damping coefficient \(g=0.05\), \(\mathrm{M}=100\), and noise is added at \(\mathrm{SNR}=-5\mathrm{ dB}\).

In Fig. 3, a resonance band appears in the spectrum of the simulated signal. This sideband must be extracted for analysis in order to diagnose the fault. Under the EWT rule, two local maxima of the spectrum are selected, which are highlighted by the red points. According to the local maxima rule, the first boundary is located near 777 Hz and the second is around 1434 Hz, as shown in Fig. 3. Unfortunately, the resulting band deviates greatly from the ideal one: the ideal resonance frequency band is split into two parts, which weakens the fault feature information.

Fig. 3 Ideal boundaries and EWT boundaries

Figure 4 shows the filters constructed by the original EWT, which divide the whole frequency domain into three parts. Both \({\widehat{\Psi }}_{1}\) and \({\widehat{\Psi }}_{2}\) contain a part of the ideal band. As shown in Fig. 5, the periodic pulses are only vaguely visible in the second empirical mode, which contains too much noise. Meanwhile, the periodic impulses in the third empirical mode are submerged in strong high-frequency content. Clearly, the local maximum rule cannot extract the appropriate resonance sideband from the spectrum because of the way it divides the segments, which degrades the final extraction.

Fig. 4 Filters of EWT

Fig. 5 Results decomposed by EWT
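
The local maxima rule discussed in this section can be sketched as follows. This is a simplified illustration rather than Gilles' reference code: the function name `local_maxima_boundaries` is hypothetical, and treating the zero frequency as an extra leading point, so that the first boundary lies halfway between 0 and the first retained maximum, is an assumption of the example.

```python
import numpy as np

def local_maxima_boundaries(magnitude, freqs, n_maxima):
    """Boundaries as midpoints between the n_maxima largest local maxima."""
    m = np.asarray(magnitude, dtype=float)
    f = np.asarray(freqs, dtype=float)
    interior = np.arange(1, len(m) - 1)
    # A sample is a local maximum if it exceeds its left neighbour and is
    # not smaller than its right neighbour.
    is_peak = (m[interior] > m[interior - 1]) & (m[interior] >= m[interior + 1])
    peaks = interior[is_peak]
    # Keep the strongest peaks, then order them by frequency.
    strongest = peaks[np.argsort(m[peaks])[::-1][:n_maxima]]
    centers = np.concatenate([[0.0], np.sort(f[strongest])])
    return 0.5 * (centers[:-1] + centers[1:])

# Toy spectrum with peaks at 200 Hz, 600 Hz and 800 Hz (10 Hz resolution).
mag = np.zeros(100)
mag[20], mag[60], mag[80] = 5.0, 3.0, 1.0
print(local_maxima_boundaries(mag, np.arange(100) * 10.0, n_maxima=2))
```

The rule only looks at where the largest peaks are, not at how wide the band around them is, which is why a resonance sideband containing several close maxima can end up split across two filters.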

3 Optimized empirical wavelet transform

The resonance band in the frequency domain can be regarded as a series of pulses. Therefore, by dividing the spectrum into fragments at a given scale and calculating the kurtosis of each fragment, the trend of the spectrum can be depicted, which is a good basis for effectively partitioning the frequency domain. In this section, the multi-scale empirical wavelet transform (MSEWT) is proposed, which introduces the concept of multi-scale kurtosis into the frequency domain. The proposed method is shown in Fig. 6, and the detailed procedure is as follows:

Fig. 6 Flowcharts of multi-scale empirical wavelet transform

Step 1: Cut the frequency domain using a given scale. To simplify implementation, \(\upeta\) is introduced to represent the scale (scale = \(\upeta {\mathrm{f}}_{\mathrm{q}}\)). The range \(1\le\upeta \le 3\), i.e. \({\mathrm{f}}_{\mathrm{q}}\le \mathrm{scale}\le 3{\mathrm{f}}_{\mathrm{q}}\), is regarded as suitable.

Step 2: Calculate the kurtosis of each fragment. Since kurtosis is sensitive to impulses, the kurtosis of a fragment increases when the fragment contains an obvious frequency peak. The variation of the local kurtosis across the spectrum therefore reflects the strength of the frequency components in the spectrum. Extracting the part where the frequency content is concentrated may yield the desired fault feature components.

Step 3: Merge fragments. The artificial averaging rule is defined by the following formula:

$$({K}_{i}-{\overline{K} }_{n})({K}_{i+1}-{\overline{K} }_{n})\le 0$$
(10)

where \({K}_{i}\) represents the kurtosis of the i-th scale fragment and \({\overline{K} }_{n}\) is the average kurtosis of all fragments.

If the kurtosis values of two adjacent scale fragments satisfy Eq. 10, the boundary between them is retained. If Eq. 10 is not satisfied, the two fragments are merged, since they contain the same kind of components and need not be separated.

Step 4: Construct the filter bank based on the empirical scaling function and the empirical wavelet functions. Each individual filter in the filter bank represents an independent component, and all components together make up the original signal. Reconstructing the information in each filter decomposes the original signal into several components located in different frequency bands.
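
Steps 1 to 3 can be sketched in a few lines of Python. The sketch is only an illustration under stated assumptions: the function names `kurtosis` and `msewt_boundaries` are hypothetical, the scale is expressed as a number of equal fragments rather than as \(\upeta {\mathrm{f}}_{\mathrm{q}}\), the kurtosis is the non-excess fourth standardized moment, and \({\overline{K} }_{n}\) is taken as the mean kurtosis over all fragments.

```python
import numpy as np

def kurtosis(x):
    """Fourth standardized moment (non-excess); 0 for a constant fragment."""
    c = np.asarray(x, dtype=float)
    c = c - c.mean()
    var = np.mean(c**2)
    return float(np.mean(c**4) / var**2) if var > 0 else 0.0

def msewt_boundaries(magnitude, n_fragments):
    """Return the retained boundary indices and the kurtosis of each fragment."""
    magnitude = np.asarray(magnitude, dtype=float)
    # Step 1: cut the spectrum into equal fragments.
    edges = np.round(np.linspace(0, len(magnitude), n_fragments + 1)).astype(int)
    # Step 2: kurtosis of each fragment.
    k = np.array([kurtosis(magnitude[a:b]) for a, b in zip(edges[:-1], edges[1:])])
    # Step 3 (Eq. 10): keep a boundary only where the kurtosis crosses the mean.
    k_mean = k.mean()
    keep = (k[:-1] - k_mean) * (k[1:] - k_mean) <= 0
    return edges[1:-1][keep], k

# Toy spectrum: gently varying floor plus two strong lines in adjacent fragments.
mag = 1.0 + 0.1 * np.sin(0.7 * np.arange(1000))
mag[450] += 20.0
mag[550] += 20.0
boundaries, k = msewt_boundaries(mag, n_fragments=10)
print(boundaries)
```

In the toy spectrum, the two fragments holding the strong lines have kurtosis above the mean and are merged into one band, while the flat fragments on either side are merged with each other, so only the two boundaries enclosing the band survive.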

3.1 Case 1 of simulation experiment

When a rolling ball passes over the fault position on the bearing outer ring, a pulse occurs. Periodic pulses are therefore often used to simulate bearing failures.

For the signal given by Eq. 9, the spectrum is divided into 25 parts, with the initial boundaries marked in red. Kurtosis is used to describe the trend of the spectrum so that the resonance frequency band emerges. The useless boundaries are removed by the artificial averaging rule. The blue lines in Fig. 7 are the boundaries after merging. After the redundant boundaries are removed, only two boundaries remain, which separate the resonance sideband from the rest of the spectrum. Figure 8 shows the filters constructed by MSEWT.

Fig. 7 Boundaries and kurtosis of the spectrum

Fig. 8 Filters of MSEWT

As shown in Fig. 8, the empirical scaling function and the empirical wavelet functions are constructed according to the final boundaries. Three components are obtained by MSEWT and shown in Fig. 9, with the second empirical mode containing rich characteristic frequency information. The optimized frequency division method can effectively separate the resonance sideband from noisy signals.

Fig. 9 The results decomposed by MSEWT

3.2 Case 2 of simulation experiment

In this section, a single-frequency harmonic signal is added to the periodic pulses, as shown in Fig. 10. The composition of signal 2 is as follows:

Fig. 10 Simulated signal

$$\left\{\begin{array}{c}{s}_{\mathrm{c}1}\left(t\right)={\sum }_{i=1}^{M}A{e}^{-g2\pi {f}_{n}^{i}t}\mathrm{sin}\left(2\pi {f}_{n}^{i}t\sqrt{1-{g}^{2}}\right)\\ {s}_{\mathrm{c}2}\left(t\right)=0.5\mathrm{sin}\left(2\pi ft\right)\\ {s}_{2}\left(t\right)={s}_{\mathrm{c}1}\left(t\right)+{s}_{\mathrm{c}2}\left(t\right)+\zeta \end{array}\right.$$
(11)

where the amplitude \(A\)=5, the natural frequency \({f}_{n}\)=1600 Hz, the damping coefficient \(g=0.05\), \(\mathrm{M}=100\), the frequency \(f\)=300 Hz, and \(\zeta\) is noise added at \(\mathrm{SNR}=-3\mathrm{ dB}\).

The spectrum separated by the scaled boundaries is shown in Fig. 11. The 300 Hz component is isolated in the second fragment, which should be extracted for further analysis. The kurtosis is high in the second fragment because of its sensitivity to spectral peaks, as shown in Fig. 11b. The dotted blue lines divide the frequency domain into five parts according to the artificial averaging rule, exactly separating the single frequency component and the periodic pulse component.

Fig. 11 Spectrum and initial boundaries of signal

Then, one scaling function and four empirical wavelet functions are constructed, as shown in Fig. 12a. The resulting components show that the single frequency component and the periodic pulse component are effectively extracted in the second and fourth parts, respectively. The result shows that the proposed spectrum partition method can effectively isolate a frequency band containing a single frequency component.

Fig. 12 The results decomposed by MSEWT

3.3 Case 3 of simulation experiment

In this section, simulation signal 3 (shown in Fig. 13), which includes a periodic impulse signal, a modulation signal, and an accidental impulse signal, is constructed to further verify the effectiveness of the method in spectrum segmentation. The composition of the simulation signal is as follows:

Fig. 13 Simulated signal

$$\left\{\begin{array}{c}{s}_{\mathrm{c}1}\left(t\right)={\sum }_{i=1}^{M}5{e}^{-{g}_{1}2\pi {f}_{n1}^{i}t}\mathrm{sin}\left(2\pi {f}_{n1}^{i}t\sqrt{1-{{g}_{1}}^{2}}\right)\\ {s}_{\mathrm{c}2}\left(t\right)=0.8\mathrm{sin}\left(2\pi \bullet 3000t\right)\left(1+\mathrm{cos}\left(2\pi \bullet ft\right)\right)\\ {s}_{\mathrm{c}3}\left(t\right)=150{e}^{-{g}_{2}2\pi {f}_{n2}^{i}t}\mathrm{sin}\left(2\pi {f}_{n2}^{i}t\sqrt{1-{{g}_{2}}^{2}}\right)\\ {s}_{3}\left(t\right)={s}_{\mathrm{c}1}\left(t\right)+{s}_{\mathrm{c}2}\left(t\right)+{s}_{\mathrm{c}3}\left(t\right)+\zeta \end{array}\right.$$
(12)

where the natural frequencies are \({f}_{n1}=1600\) Hz and \({f}_{n2}=4000\) Hz, the damping coefficients are \({g}_{1}=0.05\) and \({g}_{2}=0.1\), \(\mathrm{M}=100\), the frequency \(f\)=150 Hz, and \(\zeta\) is noise added at \(\mathrm{SNR}=-2\mathrm{ dB}\).

The spectrum of signal 3 is shown in Fig. 14a, divided into 25 parts. The boundaries obtained by the proposed method are shown in Fig. 14b: MSEWT retains four boundaries, marked by blue dotted lines. Filters are then constructed according to these boundaries, dividing the spectrum into five parts. Figure 15 shows the components and their spectra. The periodic pulse information is extracted, and the periodic characteristics are well restored in the extracted component after the noise interference is eliminated. The modulation signal is separated by the third band-pass filter. The waveforms of the two signals are basically identical, and obvious modulation spectral lines appear in the spectrum. Signal 3 demonstrates that MSEWT can effectively separate the periodic pulse component, the accidental pulse component, and the modulation component (Fig. 16).

Fig. 14 Spectrum and initial boundaries of signal

Fig. 15 The results decomposed by MSEWT

Fig. 16 The SMSEWT method

4 Fault feature extraction method based on SMSEWT

The MSEWT method proposed in Sect. 3 obtains more reasonable boundaries and components, but the required information still has to be extracted from them. In engineering applications, bearing faults are normally diagnosed by analyzing the envelope spectrum for the fault characteristic frequency and its harmonics. The bearing fault characteristic frequency and its harmonics appear in the envelope spectrum as a few spikes with large amplitudes that reflect bearing fault signatures. Hence, [17] defines these spikes in the envelope spectrum as the sparse representation of a bearing fault signal in the frequency domain. A sparsity measure applied to the envelope spectrum performs well because it avoids accidental interference in the spectrum. This paper proposes the sparsity-guided MSEWT (SMSEWT) to extract information for bearing fault diagnosis. More importantly, in order to obtain the envelope of the signal directly, the Hilbert transform is incorporated into the filtering structure of SMSEWT.

4.1 The theoretical basis of sparsity

Sparsity has been applied in many fields. The basic expression can be described as:

$$S=\frac{\sqrt{{\sum }_{f=0}^{\frac{Fs}{2}-1}{\left|d(f)\right|}^{2}}}{{\sum }_{f=0}^{\frac{Fs}{2}-1}\left|d(f)\right|}=\frac{{\left|\left|d(f)\right|\right|}_{2}}{{\left|\left|d(f)\right|\right|}_{1}}$$
(13)

where \({\left|\left|\mathrm{d}(\mathrm{f})\right|\right|}_{2}\) and \({\left|\left|\mathrm{d}(\mathrm{f})\right|\right|}_{1}\) are the L2 norm and the L1 norm of \(\mathrm{d}(\mathrm{f})\), and \(\mathrm{d}(\mathrm{f})\) is the power spectrum of the envelope. The sparsity of a bearing signal needs to be calculated on the envelope spectrum, which can be regarded as a sparse signal. In the proposed method, the envelope spectrum is obtained with the Hilbert transform.
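
A short numerical example may help to show how Eq. 13 behaves. In the sketch below, the function name `sparsity` and the convention of returning 0 for an all-zero spectrum are assumptions made for illustration.

```python
import numpy as np

def sparsity(d):
    """Ratio of the L2 norm to the L1 norm of a spectrum d(f), as in Eq. 13."""
    d = np.abs(np.asarray(d, dtype=float))
    l1 = d.sum()
    return float(np.sqrt(np.sum(d**2)) / l1) if l1 > 0 else 0.0

print(sparsity([0.0, 0.0, 3.0, 0.0]))   # one spike: the ratio reaches its maximum of 1
print(sparsity(np.ones(4)))             # flat spectrum of 4 bins: 1 / sqrt(4)
```

For a spectrum with a single spike the ratio equals 1, while for a flat spectrum of length n it drops to \(1/\sqrt{n}\). An envelope spectrum dominated by a few fault-related lines therefore scores higher than one filled with broadband noise.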

4.2 The theoretical basis of Hilbert transform

Much mechanical fault information exists in the form of modulation in vibration and noise signals. Processing a modulated signal has two main purposes: obtaining the envelope and demodulating the phase. In digital signal processing, both are particularly convenient with the help of the Hilbert transform. The modulated signal is assumed to have the form:

$$x(t)=A(t)\mathrm{cos}(2\pi {f}_{0}t+\varphi (t))$$
(14)

Its Hilbert transform can be expressed as:

$$\widehat{x}(t)=\frac{1}{\pi }\int_{-\infty }^{\infty }\frac{x(\tau )}{t-\tau }d\tau$$
(15)

Thus the envelope estimate of \(x(t)\) can be obtained as the magnitude of the analytic signal, where \(\mathrm{j}\) is the imaginary unit:

$$\overline{A }\left(t\right)=\left|x\left(t\right)+\mathrm{j}\widehat{x}(t)\right|$$
(16)

Then, the envelope spectrum can be obtained by calculating the FFT of \(\overline{A }\left(t\right)\).
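
The envelope and its spectrum can be computed numerically as sketched below. This is a minimal illustration, not the code used in the experiments: the analytic signal is built with the standard one-sided-spectrum construction in NumPy, and the function names `analytic_signal` and `envelope_spectrum`, the removal of the envelope mean before the FFT, and the test signal are assumptions of the example.

```python
import numpy as np

def analytic_signal(x):
    """Return x(t) + j * H[x](t) by zeroing the negative-frequency half of the FFT."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0                       # keep DC once
    if n % 2 == 0:
        gain[n // 2] = 1.0              # keep the Nyquist bin once
        gain[1:n // 2] = 2.0            # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def envelope_spectrum(x, fs):
    """Envelope via the analytic signal magnitude, then its amplitude spectrum."""
    env = np.abs(analytic_signal(x))
    env = env - env.mean()              # assumed: drop DC so it does not dominate
    spec = np.abs(np.fft.rfft(env)) / len(env)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    return freqs, spec

# Usage: a 200 Hz carrier amplitude-modulated at 20 Hz.
fs = 1000
t = np.arange(fs) / fs
x = (1 + 0.5 * np.cos(2 * np.pi * 20 * t)) * np.cos(2 * np.pi * 200 * t)
freqs, spec = envelope_spectrum(x, fs)
print(freqs[np.argmax(spec)])
```

The carrier itself does not appear in the envelope spectrum; only the modulation frequency does, which is why bearing fault characteristic frequencies are read from the envelope spectrum rather than from the raw spectrum.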

4.3 Connection of two methods

The sparsity cannot be calculated without the power spectrum of the envelope \(\mathrm{d}(\mathrm{f})\), which is obtained by the FFT of the autocorrelation of \(\overline{A }\left(t\right)\). Because EWT already performs its filtering with the FFT, \(\mathrm{d}(\mathrm{f})\) can be computed within EWT itself. Applying the Hilbert transform at the point where each component is extracted by its filter saves one FFT pass. The empirical envelope signals are therefore obtained directly from EWT:

$${f}_{e0}\left(t\right)={H[{W}_{f}^{\varepsilon }\left(0,t\right)*{\varnothing }_{1}\left(t\right)]}^{\vee }$$
(17)
$${f}_{en}\left(t\right)={H[{W}_{f}^{\varepsilon }\left(n,t\right)*{\Psi }_{n}\left(t\right)]}^{\vee }$$
(18)

where \({[\bullet ]}^{\vee }\) and \(\mathrm{H}[\bullet ]\) denote the Fourier transform and the Hilbert transform, respectively.

4.4 The proposed sparsity-guided multi-scale EWT

In this paper, a new method named sparsity-guided multi-scale EWT (SMSEWT) is proposed. First, the vibration acceleration signals are collected by sensors and saved as digital signals. Then, the scale is set and the entire spectrum is divided equally. In the example shown in Fig. 16, the initial boundaries divide the spectrum into 25 parts. After the kurtosis of each frequency band is calculated, the average kurtosis is obtained. The kurtosis of most frequency bands is lower than the average, while a small number of contiguous frequency bands have large values. These distinctive contiguous frequency bands are merged according to the rule, and the final boundaries are determined. After the filter bank is constructed with the empirical scaling function and the empirical wavelet functions, a set of components is obtained. Figure 16 shows the sparsity of the envelope spectrum of each component, and the component with the largest value is extracted. Its time-domain waveform contains periodic pulses, and its envelope spectrum contains the characteristic frequency and its harmonics. Fault diagnosis is then realized by further analysis.
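
The final selection step, choosing the component whose envelope spectrum is sparsest, can be sketched as follows. The components are assumed to have been produced already by the filter bank; the function names `envelope`, `envelope_sparsity`, and `select_component`, the removal of the envelope mean, and the two test components are hypothetical choices for the example. The power spectrum is computed as the squared FFT magnitude, which equals the FFT of the circular autocorrelation.

```python
import numpy as np

def envelope(x):
    """Magnitude of the analytic signal (one-sided-spectrum construction)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))

def envelope_sparsity(x):
    """L2/L1 ratio (Eq. 13) of the power spectrum d(f) of the envelope."""
    env = envelope(x)
    env = env - env.mean()                      # assumed: ignore the DC term
    d = np.abs(np.fft.rfft(env)) ** 2           # power spectrum of the envelope
    total = d.sum()
    return float(np.sqrt(np.sum(d**2)) / total) if total > 0 else 0.0

def select_component(components):
    """Return the index of the sparsest component and all sparsity scores."""
    scores = np.array([envelope_sparsity(c) for c in components])
    return int(np.argmax(scores)), scores

# Usage: broadband noise versus a carrier modulated at a single frequency.
fs = 1000
t = np.arange(fs) / fs
noise = np.random.default_rng(0).standard_normal(fs)
modulated = (1 + 0.5 * np.cos(2 * np.pi * 20 * t)) * np.cos(2 * np.pi * 200 * t)
index, scores = select_component([noise, modulated])
print(index, scores)
```

The modulated component has an envelope spectrum made of a single line, so its sparsity is close to 1, whereas the noise component spreads its envelope energy over many bins and scores much lower.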

5 Experimental verification

5.1 Experimental verification of Case 1

Figure 17 shows a schematic diagram of the bearing failure test bench, which can simulate inner race, outer race, and ball faults. In the first case, SMSEWT is applied to the diagnosis of bearing faults on the rotor test rig. The collected signal shown in Fig. 18 comes from a 6307 deep groove ball bearing with an outer ring fault, whose characteristic frequency is 76.08 Hz.

Fig. 17 Bearing fault test rig

Fig. 18 Vibration signal of bearing failure of outer race

The bearing type and fault characteristic frequencies are detailed in Table 1. The sampling frequency is 15360 Hz. An obvious resonance sideband near 2000 Hz needs to be extracted for bearing fault analysis.

Table 1 Characteristics frequency of 6307 bearing fault

As shown in Fig. 19, the boundaries obtained by SMSEWT divide the whole frequency domain into seven parts according to the kurtosis of each part. The empirical wavelet filters are constructed based on the obtained boundaries. Following the proposed method, the sparsity of each envelope signal is calculated to assess the fault information contained in that part. The maximum sparsity appears in the fourth part. This part is extracted, and its envelope spectrum, obtained by FFT, is plotted in Fig. 19d. The extracted envelope signal shows obvious periodic pulses in the time domain. At the same time, the fault characteristic frequency 76.88 Hz and its harmonics appear clearly in the envelope spectrum. The proposed method can effectively extract the bearing outer ring fault characteristics and successfully realize the bearing fault diagnosis.

Fig. 19 The results decomposed by SMSEWT

In order to further verify the effectiveness of this method, the Sparsogram, Fast Kurtogram (FK), and Protrugram are used for comparison. Figure 20 shows the maximum sparsity found by the Sparsogram at different decomposition levels (level = 6 and 7). When the number of decomposition levels is 6, the center frequency and bandwidth of the resonance frequency band corresponding to the maximum sparsity are 1980 Hz and 120 Hz. Figure 20b shows the extracted fault feature envelope component. The component contains pulses at periodic intervals, but only the fault characteristic frequency and its second harmonic can be detected in its envelope spectrum. When the level is 7, the center frequency of the selected resonance frequency band is located at 7530 Hz and the bandwidth is 60 Hz, which is completely different from the band determined at level 6. Moreover, the fault characteristic frequency cannot be found in the envelope spectrum, and the extracted envelope signal shows no regularity. The Sparsogram always tends toward a narrower bandwidth in the frequency domain segmentation. As the level increases, the bandwidth becomes smaller and, at the same time, the accuracy of the center frequency of the resonance band decreases.

Fig. 20 Results decomposed by Sparsogram

The filter-based FK is shown in Fig. 21, with a center frequency of 2080 Hz and a bandwidth of 320 Hz. Because of the deviation of the selected center frequency and the narrow bandwidth, the extracted envelope component contains only the fault characteristic frequency and its second harmonic, and the fault information is clearly less than that extracted by SMSEWT. The same result occurs with the STFT-based FK. Overall, the frequency band extracted by the Fast Kurtogram contains less fault feature information than that of SMSEWT because of the limitation of its frequency band division.

Fig. 21 Results decomposed by Fast Kurtogram

The results of the Protrugram are plotted in Fig. 22a; the maximum value appears at 1795 Hz with a bandwidth of 250 Hz. The envelope component extracted in Fig. 22b also contains inadequate fault feature information. Although this method can obtain a more accurate center frequency, its scanning process costs a lot of computing time.

Fig. 22 Results decomposed by Protrugram

5.2 Experimental verification of Case 2

In actual industrial production, accidental impacts caused by human or mechanical collisions are regularly mixed into vibration signals, which makes bearing fault diagnosis very difficult. In this part, a fault vibration signal of a bearing outer race containing accidental impacts is obtained from the HZXT-008 small rotor rolling bearing test bench, shown in Fig. 23a. Figure 23b shows the locations of the vibration sensor measuring points in the actual acquisition: the horizontal and vertical directions on the side near the planetary gearbox. In order to obtain vibration signals with more obvious fault characteristic frequencies, the signals collected by the vertical sensor are analyzed to verify the effectiveness of the proposed method. The fault characteristic frequencies of the measured bearing, type NSK HPS 6200, are shown in Table 2. The fault vibration signal of the bearing outer ring has a characteristic frequency of 76.20 Hz. The motor speed is set to 1500 rpm, the sampling frequency is 12000 Hz, and the length of the sampling sequence is 24000 (2 s sampling time). Figure 24 shows the collected vibration signal and its spectrum from the HZXT-008 rotor test bench. In the time domain, there is a large accidental interference at 1.9 s, which has a great influence on the kurtosis value. Affected by this large disturbance, the spectrum amplitude fluctuates in the range of 2500 Hz to 4000 Hz. If the signal filtered from this band were used for analysis, the result could be unsatisfactory and the diagnosis could fail. Four methods, SMSEWT, Sparsogram, FK, and Protrugram, are applied to analyze the signal with interference, and the results are compared as follows.

Fig. 23 HZXT-008 small rotor test bench

Table 2 Characteristic frequencies of the NSK HPS 6200 bearing fault

Fig. 24 Vibration signal of the outer race bearing fault with an accidental pulse

Figure 25 shows the SMSEWT process. Two parts have higher sparsity: the 0–100 Hz band filtered by \({{\varnothing }}_{1}\) and the 900–1200 Hz band filtered by \({\Psi }_{2}\). The sparsity of the envelope component obtained by \({\Psi }_{2}\) is the largest, indicating that it contains the most fault feature information. In the frequency domain of the component obtained by \({\Psi }_{2}\), the frequency 76.25 Hz and its second and third harmonics can be found, and 76.25 Hz is close to the outer race fault characteristic frequency of 76.20 Hz. These characteristic frequencies are sufficiently prominent to diagnose the bearing outer race fault. In contrast, the filter \({{\varnothing }}_{1}\) extracts the envelope of the low-frequency component, which mainly contains power frequency information. As before, although the envelope contains the interference component in the time domain, the envelope spectrum still shows the periodicity. This demonstrates that the proposed method can effectively avoid the accidental interference and successfully extract the bearing fault information for bearing fault diagnosis.
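The selection step described above ranks candidate components by sparsity and keeps the one with the largest value. This section does not restate the sparsity definition, so the sketch below assumes the ratio of the L2 norm to the L1 norm, a common choice. The function names and the two toy spectra are hypothetical and are not measured data.

```python
import math

def sparsity(x):
    """Ratio of L2 norm to L1 norm; larger means energy sits in fewer samples."""
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    return l2 / l1 if l1 > 0 else 0.0

def select_component(components):
    """Return the index of the sparsest component and all sparsity scores."""
    scores = [sparsity(c) for c in components]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical envelope spectra: one flat, one dominated by a few lines.
flat = [1.0] * 100
peaky = [0.1] * 100
for k in (10, 20, 30):
    peaky[k] = 5.0

best, scores = select_component([flat, peaky])
print(best, [round(s, 3) for s in scores])
```

With this measure, a spectrum dominated by a few lines, such as a fault frequency and its harmonics, scores higher than a flat one, which is the behavior the selection rule relies on.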

Fig. 25 Results decomposed by SMSEWT

As shown in Fig. 26, when the decomposition level is 4, the center frequency and bandwidth of the resonance band corresponding to the maximum value obtained by Sparsogram are 937.5 Hz and 375 Hz, respectively. This resonance band lies slightly lower than that obtained by SMSEWT. However, only the fault characteristic frequency appears in the envelope spectrum, without any of its harmonics. When the level is 7, the whole spectrum is divided into smaller segments, and the resonance band is located in a lower frequency band. The envelope waveform then shows no periodic impulses, and no obvious fault characteristic frequency is found in the envelope spectrum. This result cannot provide a reliable basis for bearing fault diagnosis.
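The level-dependent band sizes quoted above can be reproduced with a short calculation. The sketch assumes a dyadic split of the range from 0 Hz to the Nyquist frequency into 2^level equal bands, which is consistent with the reported 375 Hz bandwidth and 937.5 Hz center frequency at level 4. The helper name band_grid is hypothetical.

```python
def band_grid(fs_hz, level):
    """Split [0, fs/2] into 2**level equal bands; return (bandwidth, centers)."""
    n_bands = 2 ** level
    bandwidth = (fs_hz / 2.0) / n_bands
    centers = [(i + 0.5) * bandwidth for i in range(n_bands)]
    return bandwidth, centers

fs_hz = 12000.0
bw4, centers4 = band_grid(fs_hz, 4)
bw7, centers7 = band_grid(fs_hz, 7)

# Locate the level-4 band whose center matches the reported 937.5 Hz.
idx = centers4.index(937.5)
band4 = (centers4[idx] - bw4 / 2, centers4[idx] + bw4 / 2)

print(bw4, idx, band4)
print(bw7, len(centers7))
```

Under this assumption the level-4 band spans 750 Hz to 1125 Hz, while level 7 uses 128 bands of 46.875 Hz each, which illustrates how much narrower the segments become at the higher level.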

Fig. 26 Results decomposed by Sparsogram

The results of FK are plotted in Fig. 27. For the filter-based FK, the band with the maximum kurtosis lies in the middle of the frequency domain (2000 Hz to 4000 Hz). For the STFT-based FK, the selected band lies at 3000 Hz to 4500 Hz. Figure 27b shows the envelope signal and its envelope spectrum obtained by the filter-based FK. There is a prominent accidental interference in the time-domain envelope signal. Because the amplitude of the interference is large, the vibration information of the bearing fault cannot be clearly expressed. Although the power frequency and the characteristic frequency appear in the envelope spectrum, the spectrum fluctuates because of the interference. The extraction performance of this method is therefore unsatisfactory. The same situation appears in Fig. 27d, the envelope spectrum of the component obtained by the STFT-based FK. Therefore, FK loses its ability to determine the resonance band for a bearing vibration signal with accidental disturbances. At the same time, this confirms that kurtosis is highly sensitive to accidental pulses.
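The sensitivity of kurtosis to a single accidental pulse can be illustrated with a small synthetic experiment. The signal below is a hypothetical stand-in for the measured data: a weak sinusoid plus periodic unit impulses spaced at roughly the outer race fault period, with one large pulse added at 1.9 s. All amplitudes are assumptions chosen only for illustration.

```python
import math

def kurtosis(x):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2)

fs_hz, n_samples = 12000, 24000
period = round(fs_hz / 76.20)   # samples between fault impulses (157)

# Synthetic signal: weak 25 Hz sinusoid plus periodic impulses.
clean = [0.5 * math.sin(2 * math.pi * 25.0 * i / fs_hz) for i in range(n_samples)]
for i in range(0, n_samples, period):
    clean[i] += 3.0

# Same signal with one large accidental pulse at 1.9 s (hypothetical amplitude).
disturbed = list(clean)
disturbed[round(1.9 * fs_hz)] += 30.0

k_clean, k_disturbed = kurtosis(clean), kurtosis(disturbed)
print(round(k_clean, 1), round(k_disturbed, 1))
```

A single outlier among 24000 samples raises the kurtosis by a large factor, so a band selected by maximum kurtosis can be driven by the accidental pulse rather than by the periodic fault impulses.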

Fig. 27 Results decomposed by FK

The analysis results of Protrugram are shown in Fig. 28, with a filter bandwidth of 230 Hz and a step size of 1. The center frequency found by this method is 1322 Hz. Protrugram estimates the center frequency from the kurtosis of the envelope spectrum of the filtered signal, which avoids the effects of accidental interference. However, because of the fixed bandwidth, only the fault characteristic frequency and the power frequency with its second harmonic appear in the obtained envelope spectrum, and the extraction effect is clearly inferior to that of SMSEWT.
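To give a sense of the cost of the scanning process, the sketch below counts the candidate center frequencies for a fixed-bandwidth scan. It assumes that the step size of 1 is expressed in Hz and that the scan covers the whole range from 0 Hz to the Nyquist frequency. Neither assumption is confirmed in this section, and the helper name scan_centers is hypothetical.

```python
def scan_centers(fs_hz, bandwidth_hz, step_hz):
    """Center frequencies for a fixed-bandwidth scan that stays inside [0, fs/2]."""
    first = bandwidth_hz / 2.0
    last = fs_hz / 2.0 - bandwidth_hz / 2.0
    n_steps = int((last - first) / step_hz) + 1
    return [first + i * step_hz for i in range(n_steps)]

centers = scan_centers(fs_hz=12000.0, bandwidth_hz=230.0, step_hz=1.0)

selected_hz = 1322.0                                # center frequency reported above
band = (selected_hz - 115.0, selected_hz + 115.0)   # fixed 230 Hz passband

print(len(centers), centers[0], centers[-1], band)
```

Under these assumptions each analysis requires several thousand filter and envelope evaluations, and the selected passband keeps its fixed 230 Hz width regardless of the actual width of the resonance band.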

Fig. 28 Results decomposed by Protrugram: a kurtosis curve by Protrugram; b extracted component of maximum kurtosis in spectrum

In engineering applications, the collected vibration signal is often mixed with accidental interference. The analysis of a bearing vibration signal containing an accidental pulse verifies that the proposed SMSEWT can effectively avoid the influence of accidental interference and successfully extract the bearing fault characteristic frequency.

6 Conclusion

In this paper, a new method named sparsity-guided multi-scale empirical wavelet transform (SMSEWT) is proposed for rolling element bearing fault feature extraction. The proposed algorithm consists of two major parts. The first part improves the segmentation of the empirical wavelet method. By defining the concept of multi-scale kurtosis, the spectrum is divided into several sub-parts and the kurtosis of each sub-part is obtained. The boundaries are then merged by the artificial averaging rule, which realizes an adaptive and rational division of the boundaries. Three simulated signals were used to verify the validity of the boundary partitioning method. The second part extracts the effective fault component. Sparsity is selected as the criterion for judging the regularity of the fault feature component and thus for deciding whether a segment is extracted. The highlight of this part is that the Hilbert transform and EWT are combined to extract the envelope information of the fault feature component directly, which reduces the calculation time. In this study, two experimental signals were used to verify the effectiveness of the proposed method.

This method optimizes the segmentation of EWT and makes it more suitable for analyzing bearing vibration signals. Because it is adaptive, it avoids the limitations of traditional methods. It can be extended to gear fault diagnosis, providing a novel and effective diagnostic method for rotating machinery.