1 Introduction

Fault detection of rotating machinery has drawn much attention in recent years. Gears and bearings are major components of rotating machinery and are widely used in many industrial fields. These essential parts are prone to failure owing to the adverse working conditions of rotating machinery [1]. Hence, to avoid expensive maintenance and to ensure the safe and stable running of machines, it is valuable to diagnose gear and bearing faults in a timely manner. Furthermore, when a localized fault develops in a gear or bearing, the measured vibration signal is usually nonlinear and nonstationary, and its fault features are easily submerged by stochastic noise, so it is difficult to extract fault signatures from the vibration waveform using time domain or frequency domain analysis alone [2]. It is therefore necessary to put forward an effective method for detecting such faults.

Currently, time–frequency analysis (TFA) is a well-accepted technique because it reveals time and frequency characteristics simultaneously [3]. Typical TFA algorithms include the short-time Fourier transform (STFT) [4], the Wigner–Ville distribution (WVD) [5], and the wavelet transform (WT) [6]. However, each of these techniques has its own limits for nonstationary signal processing. For instance, STFT suffers from low resolution and blurred frequency trajectories. For a superposed signal, WVD can obtain a favorable time–frequency pattern, but its inherent cross-terms restrict its field of application. WT has higher frequency resolution, but it lacks self-adaptability because the wavelet basis and decomposition levels must be selected manually. Motivated by these limitations, several adaptive signal processing techniques have been developed, such as empirical mode decomposition (EMD) [7], local mean decomposition (LMD) [8], intrinsic timescale decomposition (ITD) [9], empirical wavelet transform (EWT) [10], and variational mode decomposition (VMD) [11]. The literature shows that these five methods (i.e., EMD, LMD, ITD, EWT, and VMD) have been successfully applied to fault detection [12,13,14,15]. Nevertheless, these self-adaptive methods still have serious problems. For instance, in practical applications of EMD, drawbacks such as the end effect, mode mixing, and envelope overshoot and undershoot emerge. As an improvement on EMD, LMD also encounters the end effect and mode mixing. In addition, the decomposition ability of LMD depends strongly on the step length chosen for the moving average, and an improper step length produces imprecise decomposition results. ITD is well suited to processing nonstationary signals, but it is liable to produce waveform burrs and curve distortion because it uses a linear transformation. EWT is a useful tool for mechanical fault detection, but it faces the problem of segmenting the Fourier spectrum. VMD can be used effectively for denoising nonstationary signals, but its performance relies largely on the preset mode number. Consequently, this article focuses on developing a novel and effective TFA technique.

Recently, Bonizzi et al. [16] presented a new adaptive signal processing algorithm called singular spectrum decomposition (SSD), which divides a hybrid signal into a sequence of singular spectrum components (SSCs) and greatly reduces the emergence of false content. At present, SSD has been successfully applied to the analysis of ocean field data and ECG signals [17], but it has rarely been used in fault detection. In view of this, this paper introduces SSD to analyze real fault data. Unfortunately, as with EMD, SSD may also exhibit the end effect in practical applications. Therefore, to avoid this phenomenon, an improved singular spectrum decomposition (ISSD) is proposed, which effectively suppresses the end effect and decomposes a combined signal into several SSCs ordered from high to low frequency. To further extract the feature signatures of the SSCs, an appropriate spectrum analysis technique needs to be applied.

The 1.5-dimensional spectrum is a particular form of higher-order spectral analysis, which can enhance the frequency content of transient pulses and has excellent anti-noise performance [18]. The 1.5-dimensional spectrum has been studied for fault feature extraction with some success. For example, Chen et al. [19] first used ensemble empirical mode decomposition (EEMD) to obtain several intrinsic mode functions (IMFs) and then applied 1.5-dimensional spectrum analysis to the IMFs to detect incipient gear crack faults. Jiang et al. [20] first employed an adaptive lifting multiwavelet packet to decompose the vibration signal into a series of frequency bands and then computed the 1.5-dimensional spectrum of the optimal frequency bands to detect rolling bearing faults. Cai and Li [21] first applied EMD to obtain several IMFs, then subjected each IMF to 1.5-dimensional spectrum analysis, and finally reconstructed all the 1.5-dimensional spectrum results to detect gear faults. In addition, the Teager energy operator (TEO) is effective in extracting the instantaneous energy of a signal; it tracks changes in the signal’s total energy accurately and is appropriate for retrieving the impact characteristics hidden in a vibration signal [22, 23]. Several studies on TEO have been conducted. For instance, Zeng et al. [24] proposed a normalized complex Teager energy operator (NCTEO) to obtain the time–frequency information of a vibration signal, which gives a reliable diagnosis result for the rotor rubbing fault. Zhang et al. [25] first adopted resonance-based signal sparse decomposition (RSSD) to obtain the optimal resonance components and then used TEO to extract the fault characteristics and realize compound fault diagnosis of rotating machinery. Bozchalooi and Liang [26] used TEO to extract the modulation information of gear vibration signals and determine the type of gear fault. Building on this, Tang and Wang [27] integrated the merits of the 1.5-dimensional spectrum and TEO and proposed a new spectrum analysis method called the 1.5-dimensional energy spectrum, which can both enhance the characteristic frequency and suppress stochastic noise. Therefore, the 1.5-dimensional energy spectrum is employed to extract the fault information of the SSCs in this paper.

The remainder of this paper is organized as follows. Section 2 describes the SSD method and introduces the detailed steps of ISSD; the decomposition capability of ISSD is then verified using a numerical signal and a rotor oil whirl signal. In Sect. 3, a novel sensitive index is presented. Section 4 briefly reviews the theory of the 1.5-dimensional energy spectrum and describes the implementation of the proposed detection framework. In Sect. 5, the proposed method is applied to the fault diagnosis of gears and rolling bearings, which validates its effectiveness. Conclusions and some directions for future work are given in Sect. 6.

2 ISSD method

2.1 SSD method

SSD is a new self-adaptive time–frequency analysis technique, which can decompose a combined signal into a sum of SSCs and a residual term. For a mixed signal \(x(n)\), the procedure of SSD is described as follows

  1.

    Modify and rearrange the standard trajectory matrix of singular spectrum analysis (SSA). Let \(N\) be the data length of the given signal \(x(n)\) and \(M\) the embedding dimension. The \(M \times N\) matrix is constructed as \(X = [x_{1}^{\text{T}} ,x_{2}^{\text{T}} \ldots ,x_{M}^{\text{T}} ]^{\text{T}}\), where \(x_{i} = (x(i), \ldots ,x(N),x(1), \ldots ,x(i - 1))\) is the ith row of \(X\) and \(i = 1, \ldots ,M\). The following example shows how the elements of the trajectory matrix are rearranged before diagonal averaging. For a time series \(x(n) = \{ 1,\,2,\,3,\,4,\,5\}\) with embedding dimension \(M = 3\), the corresponding trajectory matrix \(X\) is

    $$X = \left[ {\begin{array}{*{20}c} 1 &\quad 2 &\quad 3 \\ 2 &\quad 3 &\quad 4 \\ 3 &\quad 4 &\quad 5 \\ \end{array} \left| {\begin{array}{*{20}l} 4 \hfill &\quad 5 \hfill \\ 5 \hfill &\quad 1 \hfill \\ 1 \hfill &\quad 2 \hfill \\ \end{array} } \right.} \right]$$
    (1)

    where the left-hand \(3 \times 3\) block is the standard trajectory matrix X used in singular spectrum analysis (SSA) [28]. To enhance the oscillatory content of the time series and to guarantee that the energy of the residual term decreases, the trajectory matrix is rearranged as

    $$X = \left[ {\left. {\begin{array}{*{20}l} {} \hfill & {} \hfill &\quad 1 \hfill \\ {} \hfill &\quad 1 \hfill &\quad 2 \hfill \\ 1 \hfill &\quad 2 \hfill &\quad 3 \hfill \\ 2 \hfill &\quad 3 \hfill &\quad 4 \hfill \\ 3 \hfill &\quad 4 \hfill &\quad 5 \hfill \\ \end{array} } \right|\begin{array}{*{20}l} {} \hfill & {} \hfill \\ {} \hfill & {} \hfill \\ 4 \hfill &\quad 5 \hfill \\ 5 \hfill &\quad * \hfill \\ * \hfill &\quad * \hfill \\ \end{array} } \right]$$
    (2)

    where \(*\) marks the previous positions of the elements that were moved to the top right of the left-hand \(3 \times 3\) block. Each cross-diagonal then contains the same number of elements, so the average can be taken along the ith cross-diagonal of \(X\).

  2.

    Adaptively select the embedding dimension \(M\) at iteration \(j\). The choice of embedding dimension is based on the dominant frequency of the residual at iteration \(j\), because the dominant frequency is usually taken to represent the main periodic component of a signal. First, calculate the power spectral density (PSD) of the residual term at iteration \(j\), i.e., \(\upsilon_{j} (n) = x(n) - \sum\nolimits_{k = 1}^{j - 1} {\upsilon_{k} } (n),(\upsilon_{0} (n) = x(n))\). Next, estimate the frequency \(f_{{\rm max} }\) corresponding to the most prominent peak of the PSD. For the first iteration, if the normalized frequency \(f_{{\rm max} } /F_{\text{s}}\) is less than the given threshold \(10^{ - 3}\) (\(F_{\text{s}}\) is the sampling frequency), the residual is considered to contain a sizable trend and \(M\) is set to \({\text{floor}}\,(N /3)\) [29]. Otherwise, and for iterations \(j > 1\), the embedding dimension \(M\) is given as

    $$M = 1.2 \times \frac{{F_{\text{s}} }}{{f_{{\rm max} } }}$$
    (3)

    where \(F_{\text{s}} /f_{{\rm max} }\) denotes the main period in number of samples. More details about the self-adaptive selection of the embedding dimension \(M\) can be found in [16].

  3.

    Obtain the jth SSC successively, from high to low frequencies. For the first iteration, when a sizable trend is detected, only the first left and right eigenvectors are employed to get \(g^{(1)} (n)\), so that \(X_{1} = \sigma_{1} u_{1} \upsilon_{1}^{\text{T}}\), and \(g^{(1)} (n)\) is derived from diagonal averaging of \(X_{1}\). For iterations \(j > 1\), the component \(g^{(j)} (n)\) needs to be obtained. The frequency content of \(g^{(j)} (n)\) is mainly concentrated in the frequency band \([f_{{\rm max} } - \Delta f,f_{{\rm max} } + \Delta f]\), where \(\Delta f\) denotes the half-bandwidth of the prominent peak in the PSD of the residual term. Therefore, a subset \(I_{j} (I_{j} = \{ i_{1} , \ldots ,i_{p} \} )\) is built from all eigentriples whose left eigenvectors have a prominent peak in the frequency band \([f_{{\rm max} } - \Delta f,f_{{\rm max} } + \Delta f]\), together with the one eigentriple that contributes the most energy to the prominent peak. Then, the corresponding component signal is retrieved by diagonal averaging of the matrix \(X_{Ij} = X_{i1} + \cdots + X_{ip}\) along the cross-diagonals. In this step, to better estimate the width of the prominent peak, the PSD profile is described by a spectral model built from three Gaussian functions. The spectral model is defined as follows

    $$\gamma (f,\theta ) = \sum\limits_{i = 1}^{3} {A_{i} } {\text{e}}^{{ - \frac{{(f - u_{i} )^{2} }}{{2\sigma_{i}^{2} }}}}$$
    (4)

    where \(A_{i}\) is the magnitude of the ith Gaussian function, and \(\sigma_{i}\) and \(u_{i}\) are its width and position, respectively. \(\theta = [A\sigma ]^{\text{T}}\) is the parameter vector, where \(A = [A_{1} ,A_{2} ,A_{3} ]\) and \(\sigma = [\sigma_{1} ,\sigma_{2} ,\sigma_{3} ]\). In Eq. (4), the first Gaussian function describes the most prominent spectral peak, the second describes the second highest spectral peak, and the third captures all the peaks between the first and second prominent spectral peaks. That is,

    $$u_{1} = f_{{\rm max} } ,\,u_{2} = f_{2} ,\,u_{3} = \frac{{f_{{\rm max} } + f_{ 2} }}{ 2}$$
    (5)

    The model parameters can be acquired by weighted least squares fitting of the model. The initial parameter values of the model fitting are given as follows

    $$\left\{ {\begin{array}{*{20}l} {A_{1}^{(0)} = \frac{1}{2}{\text{PSD}}(f_{{\rm max} } ),\,\sigma_{1}^{(0)} = f:{\text{PSD}}(f) = \frac{2}{3}{\text{PSD}}(f_{{\rm max} } )} \hfill \\ {A_{2}^{(0)} = \frac{1}{2}{\text{PSD}}(f_{2} ),\,\sigma_{2}^{(0)} = f:{\text{PSD}}(f) = \frac{2}{3}{\text{PSD}}(f_{2} )} \hfill \\ {A_{3}^{(0)} = \frac{1}{4}{\text{PSD}}(f_{3} ),\,\sigma_{3}^{(0)} = 4\left| {f_{{\rm max} } - f_{2} } \right|} \hfill \\ \end{array} } \right.$$
    (6)

    In the optimization process, the Levenberg–Marquardt algorithm is used to obtain the optimal values. Once the value of \(\sigma_{1}\) is obtained, \(\Delta f\) is set to \(\Delta f = 2.5\sigma_{1}\). Moreover, to retrieve the jth component, a second iteration is performed. In this process, a scaling factor \(\hat{a}\) is applied to adjust the difference between \(g^{(j)} (n)\) and the residual term \(\upsilon^{(j)} (n)\), and its expression is as follows

    $$\hat{a} = \mathop {\arg \min }\limits_{a} \left\| {\upsilon^{(j)} (n) - ag^{(j)} (n)} \right\|_{2}^{2}$$
    (7)

    where \(\hat{a} = (g^{\text{T}} \upsilon ) /(g^{\text{T}} g)\) and \(\tilde{g}^{(j)} (n) = \hat{a}g^{(j)} (n)\).

  4.

    Set the stopping criterion of the decomposition process. Subtract \(\tilde{g}^{(j)} (n)\) from the residual \(\upsilon^{(j)} (n)\) to obtain \(\upsilon^{(j + 1)} (n) = \upsilon^{(j)} (n) - \tilde{g}^{(j)} (n)\), which is the input to the next iteration \(j + 1\). Next, the normalized mean square error (NMSE) between the resulting residual \(\upsilon^{(j + 1)} (n)\) and the given signal \(x(n)\) is computed as

    $${\text{NMSE}}^{(j)} = \frac{{\sum\nolimits_{i = 1}^{N} {\left( {\upsilon^{(j + 1)} (i)^{2} } \right)} }}{{\sum\nolimits_{i = 1}^{N} {\left( {x(i)} \right)^{2} } }}$$
    (8)

    SSD is terminated when the NMSE is less than the given threshold \({\text{th}} = 1\%\). Finally, the given signal \(x(n)\) is decomposed into the sum of the SSCs and the residual \(\upsilon^{(m + 1)} (n)\)

    $$x(n) = \sum\limits_{k = 1}^{m} {\tilde{g}^{(k)} (n)} + \upsilon^{(m + 1)} (n)$$
    (9)

    where m is the number of SSCs and \(\tilde{g}^{(k)} (n)\) is the kth SSC. The detailed steps of the original SSD algorithm can be found in Ref. [16].
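Some of the building blocks of the steps above can be sketched in Python as follows. This is a minimal illustration rather than the reference implementation of [16]: the function names are hypothetical, a plain periodogram is assumed as the PSD estimate, Eq. (3) is rounded to the nearest integer, and the Gaussian model fitting and eigentriple selection of step 3 are omitted.

```python
import numpy as np


def wrapped_trajectory_matrix(x, M):
    """Build the M x N matrix of Eq. (1); row i is x shifted cyclically by i samples."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.roll(x, -i) for i in range(M)])


def embedding_dimension(residual, fs, first_iteration, threshold=1e-3):
    """Pick M from the dominant spectral peak of the residual, following Eq. (3)."""
    residual = np.asarray(residual, dtype=float)
    n = residual.size
    # Assumption: a plain periodogram stands in for the PSD estimate.
    psd = np.abs(np.fft.rfft(residual)) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    f_max = freqs[np.argmax(psd)]
    # Trend rule of the first iteration; the f_max == 0 guard is an assumption
    # added in this sketch to avoid dividing by zero.
    if f_max == 0 or (first_iteration and f_max / fs < threshold):
        return n // 3
    # Assumption: Eq. (3) is rounded to the nearest integer.
    return int(round(1.2 * fs / f_max))


def scale_component(g, v):
    """Least-squares scale factor a_hat = g^T v / (g^T g) and the scaled component."""
    g = np.asarray(g, dtype=float)
    v = np.asarray(v, dtype=float)
    a_hat = float(np.dot(g, v) / np.dot(g, g))
    return a_hat, a_hat * g


def nmse(next_residual, x):
    """Eq. (8): energy of the new residual divided by the energy of the signal."""
    next_residual = np.asarray(next_residual, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(next_residual ** 2) / np.sum(x ** 2))


# Example from the text: x(n) = {1, 2, 3, 4, 5} with M = 3.
X = wrapped_trajectory_matrix([1, 2, 3, 4, 5], 3)
```

For the example series, the rows of `X` are (1, 2, 3, 4, 5), (2, 3, 4, 5, 1), and (3, 4, 5, 1, 2), which matches Eq. (1). In a full decomposition loop, `nmse` would be compared with the 1% threshold after each component is subtracted.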

2.2 ISSD method

SSD is a new nonparametric time–frequency analysis method, which has been successfully applied to the processing of low field potential data. SSD can adaptively divide a multi-component signal into several mutually independent SSCs whose instantaneous features have physical meaning. Unfortunately, the end effect may appear in SSD, and it directly affects the quality of the decomposition results. Therefore, to avoid this phenomenon, an appropriate extension technique should be used to handle the signal’s boundaries. Current extension approaches include extreme continuation, data prediction, and waveform matching [30]. However, extreme continuation algorithms use only the extreme points at both ends of the initial signal to perform the extension, so they cannot accurately reflect the natural trend of the raw data. Data prediction approaches, such as the back-propagation (BP) neural network [31], support vector machine (SVM) [32], and auto-regressive (AR) model [33], can suppress the end effect efficiently. Nevertheless, these methods are time-consuming and strongly affected by prediction accuracy and data length. The waveform matching method was used by Hu et al. [30] and Li et al. [34] for end effect suppression. In this method, all waveforms similar to the boundary waveform are first searched, and the one that best matches the signal boundary is then selected as the optimal matching waveform. In the last step, the data before (after) the optimal matching waveform are attached to the left (right) side of the original signal, so that the extension conforms to the trend of the raw data as closely as possible. As illustrated in Fig. 1, \(M_{i} (i = 1,2,3, \ldots )\) are the maxima of the given signal x(t), occurring at times \(tm_{i}\), and \(N_{i}\) are the minima of x(t), occurring at times \(tn_{i}\). \(S_{1}\) denotes the left boundary point of x(t). The triangular waveform \(S_{1} - M_{1} - N_{1}\) is taken as the characteristic waveform with which to search for the best matching waveform in x(t), and the data before (after) the best matching waveform are regarded as the left (right) extension of x(t). Therefore, based on the merits of the waveform matching extension method, a revised version of SSD, called ISSD, is proposed to decompose multi-component signals and alleviate the end effect. For a mixed signal x(t), the specific steps of the ISSD method are as follows:

Fig. 1 Schematic diagram of waveform matching extension

  1.

    Taking the left extension as an example, find all extreme points and the endpoint of the mixed signal x(t) using an extremum-seeking algorithm, and determine the characteristic waveform \(S_{1} - M_{1} - N_{1}\) shown in Fig. 1a.

  2.

    Search for all matching waveforms similar to the characteristic waveform \(S_{1} - M_{1} - N_{1}\) according to the time \(ts_{i}\) of the starting value \(S_{i}\) of each matching waveform, where \(ts_{i}\) is obtained by linear interpolation [29], as shown in Eq. (10).

    $$ts_{i} = \frac{{tm_{1} \times tn_{i} - tn_{1} \times tm_{i} }}{{tm_{1} - tn_{1} }}$$
    (10)

  3.

    Then, calculate the matching error of all similar waveforms, as shown in Eq. (11).

    $$E_{i} = \left| {S_{i} - S_{1} } \right| + \left| {N_{i} - N_{1} } \right| + \left| {M_{i} - M_{1} } \right| + \left| {M_{i + 1} - M_{2} } \right|$$
    (11)

    where \(\left| {M_{i + 1} - M_{2} } \right|\) is the trend term of the similar waveforms, which reflects the relative position of the extreme points of each similar waveform in the mixed signal x(t).

  4.

    Find the minimum of all matching errors, and take the similar waveform with the minimum matching error as the optimal matching waveform.

  5.

    The data before the optimal matching waveform \(S_{i} - M_{i} - N_{i}\) are used to extend the left boundary of the mixed signal.

  6.

    Apply the same principle to the right boundary of the mixed signal to obtain the final extended waveform, so that both ends of the mixed signal x(t) are extended. Note that, when the right boundary is estimated, the characteristic waveform \(S_{1} - M_{1} - N_{1}\) is located on the right, as shown in Fig. 1b. Specifically, the optimal matching waveform \(S_{i} - M_{i} - N_{i}\) similar to the characteristic waveform is first determined, and the data after it are then used to extend the right boundary of the mixed signal.

  7.

    Perform SSD on the extended waveform to obtain several SSCs whose data length includes the extended parts.

  8.

    Apply the Hilbert transform to all SSCs to obtain the corresponding instantaneous frequency, amplitude, and phase. Finally, remove the extended parts and plot the instantaneous frequencies of all SSCs together to obtain the final time–frequency graphs (TFGs). A visual example for signals with sharp transitions illustrates the performance of the waveform matching extension. In this paper, the left and right boundaries are each extended by 100 points [30]. Figure 2a, b displays the extension results obtained by the waveform matching method for a periodic impulse series and a multi-component modulated signal, respectively. The left and right extended parts clearly follow the natural tendency of the original signal, which means that the waveform matching method is appropriate for processing multi-component modulated signals.

    Fig. 2 A visual example of waveform matching extension
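The selection of the optimal matching waveform in steps 3 and 4 can be illustrated with a short Python sketch. It assumes that the starting values \(S_{i}\), maxima \(M_{i}\), and minima \(N_{i}\) have already been extracted; the function names are hypothetical, and the 1-based indices of Eq. (11) are mapped to 0-based lists, so that index 0 holds the characteristic waveform \(S_{1} - M_{1} - N_{1}\).

```python
def matching_errors(S, M, N):
    """Evaluate Eq. (11) for every candidate waveform.

    S, M, N hold the starting values, maxima, and minima in order of
    occurrence. Index 0 is the characteristic waveform; candidates start
    at index 1. M needs one entry beyond the last candidate because the
    trend term uses M[i + 1].
    """
    errors = {}
    last = min(len(S), len(N), len(M) - 1)
    for i in range(1, last):
        errors[i] = (abs(S[i] - S[0]) + abs(N[i] - N[0])
                     + abs(M[i] - M[0]) + abs(M[i + 1] - M[1]))
    return errors


def best_match(S, M, N):
    """Return the 0-based index of the candidate with the smallest matching error."""
    errors = matching_errors(S, M, N)
    return min(errors, key=errors.get)


# Made-up values, used only to show the call pattern.
S = [0.0, 0.5, 0.1]
M = [1.0, 1.4, 1.1, 1.0]
N = [-1.0, -0.7, -0.9]
best = best_match(S, M, N)
```

With these illustrative values, the second candidate has the smaller error, so the data preceding it would be used for the left extension.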

2.3 The evaluation indicator of decomposition performance

To quantitatively assess the signal decomposition capability, evaluation indicators should be introduced. The total energy of the raw signal before decomposition is expected to equal the total energy of the components obtained after decomposition. Hence, decomposition performance can be measured by the change in energy of the signal before and after decomposition. Given this, an end effect evaluation index named energy error \(\theta\) from Ref. [13] is introduced to assess the signal decomposition performance, which is defined as

$$\theta = \frac{{\left| {\sqrt {\sum\limits_{i = 1}^{n + 1} {{\text{RMS}}_{i}^{2} } } - {\text{RMS}}_{x} } \right|}}{{{\text{RMS}}_{x} }},\,{\text{RMS}}_{x} = \sqrt {\frac{{\sum\limits_{i = 1}^{N} {x^{2} (i)} }}{N}}$$
(12)

where \({\text{RMS}}_{x}\) is the root mean square of the given signal x(t), \({\text{RMS}}_{i}\) is the root mean square of the ith decomposition component, and n + 1 is the total number of decomposition components, including the residual term. According to Ref. [13], the larger the value of θ, the greater the energy error between the signals before and after decomposition. That is, a larger θ indicates lower decomposition precision and a greater end effect.
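As a hedged illustration, Eq. (12) can be computed with a few lines of Python. The function names are hypothetical, and `components` is assumed to hold all n + 1 outputs of the decomposition, including the residual term.

```python
import numpy as np


def rms(x):
    """Root mean square of a one-dimensional signal."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))


def energy_error(x, components):
    """Energy error theta of Eq. (12); components includes the residual term."""
    rms_x = rms(x)
    rms_total = np.sqrt(sum(rms(c) ** 2 for c in components))
    return float(abs(rms_total - rms_x) / rms_x)
```

For two sine waves that are orthogonal over the observation window, `energy_error` is close to zero, whereas components that share energy give a larger value.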

Furthermore, in theory, the components obtained by signal decomposition are expected to be mutually orthogonal. Therefore, the orthogonal index (OI) from Ref. [35] is used as an indicator of the orthogonality of the decomposition results. For a given signal x(t), its orthogonal index is defined as

$${\text{OI}} = \frac{{\sum\nolimits_{i = 1}^{{N_{C} }} {\sum\nolimits_{j = 1}^{j < i} {\left| {\sum\nolimits_{k = 1}^{N} {C_{ik} \times C_{jk} } } \right|} } }}{{\sum\nolimits_{k = 1}^{N} {(x_{k} - r_{k} )^{2} } }}$$
(13)

where \(N_{\text{C}}\) denotes the total number of decomposition components, \(N\) is the data length of the decomposition components, and \(C_{ik}\) and \(C_{jk}\) are the kth samples of the ith and jth decomposition components, respectively. \(x_{k}\) is the raw signal and \(r_{k}\) is the residual term. Because the orthogonal index is expected to be close to zero, a smaller orthogonal index indicates better decomposition results.
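A direct transcription of Eq. (13) into Python might look as follows. This is a sketch under the assumption that `components` is an array of shape (N_C, N) that excludes the residual, which is passed separately; the function name is hypothetical.

```python
import numpy as np


def orthogonality_index(x, components, residual):
    """Orthogonal index of Eq. (13)."""
    x = np.asarray(x, dtype=float)
    residual = np.asarray(residual, dtype=float)
    C = np.asarray(components, dtype=float)  # shape (N_C, N)
    cross = 0.0
    for i in range(C.shape[0]):
        for j in range(i):  # all pairs with j < i
            cross += abs(float(np.dot(C[i], C[j])))
    return cross / float(np.sum((x - residual) ** 2))
```

The double loop visits every pair of components once, so the cost grows quadratically with the number of components, which is small in the cases considered here.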

2.4 Performance validation of the ISSD method

2.4.1 Numerical signal analysis without noise

To validate the effectiveness of the ISSD method, here a numerical signal \(x(t)\) is considered as follows

$$x(t) = 0.2\sin (80\pi t) + 0.3\sin (40\pi t) + 2\sin (20\pi t) + 0.5\sin (10\pi t) + t^{2}$$
(14)

The numerical signal is made up of four sine waves and a trend term. The sampling frequency and the number of samples are set to 1000 Hz and 1000, respectively. The numerical analysis is conducted on an Intel Pentium G3420 3.20 GHz CPU with 4.00 GB RAM, and the simulation is implemented in MATLAB (2010a). Figure 3 shows the numerical signal \(x(t)\) and its five constituent components. Six methods, namely, original SSD, mirror-symmetric extension-based SSD (MS-SSD), support vector machine extension-based SSD (SVM-SSD), waveform matching extension-based SSD (ISSD), EMD, and LMD, are each used to decompose the numerical signal. The TFGs obtained using these methods (i.e., SSD, MS-SSD, SVM-SSD, ISSD, EMD, and LMD) are shown in Fig. 4a–f, respectively. From Fig. 4a, b, it can be seen that the numerical signal \(x(t)\) is divided into four SSCs and that the first two SSCs suffer from the end effect to some extent. Besides, in Fig. 4a, b, the fourth SSC obtained by SSD and MS-SSD exhibits a scale-mixing problem. That is, the decomposition components derived from SSD and MS-SSD do not coincide well with the real components of \(x(t)\). Figure 4c clearly shows that the results obtained by SVM-SSD have a small end effect, but the fourth SSC also exhibits scale mixing. In Fig. 4d, the ISSD method gives better decomposition results, which are closer to the actual values. However, as shown in Fig. 4e, f, EMD and LMD cannot accurately extract the true components of the original signal. This comparison indicates that the ISSD method is suitable for analyzing multi-component signals.
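For readers who wish to reproduce the test signal, Eq. (14) with the stated sampling settings can be generated as in the Python sketch below. The function name is hypothetical, and the time axis is assumed to start at t = 0, which the text does not state explicitly.

```python
import numpy as np


def numerical_signal(fs=1000.0, n=1000):
    """Return the time axis, the signal of Eq. (14), and its five components."""
    t = np.arange(n) / fs  # assumption: time starts at t = 0
    parts = [
        0.2 * np.sin(80 * np.pi * t),  # 40 Hz
        0.3 * np.sin(40 * np.pi * t),  # 20 Hz
        2.0 * np.sin(20 * np.pi * t),  # 10 Hz
        0.5 * np.sin(10 * np.pi * t),  # 5 Hz
        t ** 2,                        # trend term
    ]
    return t, np.sum(parts, axis=0), parts


t, x, parts = numerical_signal()
```

With 1000 samples at 1000 Hz, each sine wave completes an integer number of cycles within the one-second record.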

Fig. 3 Numerical signal x(t) and its composed components

Fig. 4 Time–frequency graphs obtained by a SSD; b MS-SSD; c SVM-SSD; d ISSD; e EMD; and f LMD

To further illustrate the validity of the ISSD method, the evaluation indicators of the six algorithms are calculated. The comparative results are given in Table 1. From Table 1, it is clear that the energy error θ and orthogonal index OI of the ISSD method are smaller than those of the other methods. This means that the ISSD method has a smaller end effect and better decomposition results than the SSD variants based on other extension patterns (i.e., MS-SSD and SVM-SSD); that is, the results of ISSD are closer to the real values. However, the running time of ISSD is greater than that of the other methods, except for SVM-SSD. The additional time is caused by the search for the optimal matching waveform.

Table 1 Evaluation index comparison among different methods

2.4.2 Numerical signal analysis with low noise

To show the anti-noise ability of the ISSD method, stochastic noise with an SNR of 6 dB is added to the numerical signal of Eq. (14). Six methods (i.e., SSD, MS-SSD, SVM-SSD, ISSD, EMD, and LMD) are each used to analyze the noisy simulation signal, and the resulting TFGs are shown in Fig. 5. From Fig. 5, we can see that the four frequency components of the numerical signal can be extracted by four of the methods (i.e., SSD, MS-SSD, SVM-SSD, and ISSD). Moreover, several distinct frequency features appear in the TFGs based on ISSD. In other words, ISSD has some anti-noise ability. However, compared with the noise-free analysis (see Fig. 4d), the results in Fig. 5d are somewhat poorer, which indicates that the noise has some influence on the analysis result of the ISSD method. Overall, when the numerical signal is mixed with stochastic noise, the ISSD method can still extract its intrinsic characteristic frequency components. In Fig. 5e, f, the frequency components are mainly distributed in the low-frequency area, which means that EMD and LMD cannot effectively extract the frequency components of the numerical signal.
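One common way to add noise at a prescribed SNR is sketched below in Python. The text specifies only "stochastic noise", so the use of white Gaussian noise, the definition of SNR as the ratio of mean signal power to mean noise power, and the function name are all assumptions of this sketch.

```python
import numpy as np


def add_noise(x, snr_db, rng=None):
    """Add white Gaussian noise scaled so that the realized SNR equals snr_db."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(x.shape)
    signal_power = np.mean(x ** 2)
    target_noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    # Rescale the drawn noise so its sample power matches the target exactly.
    noise *= np.sqrt(target_noise_power / np.mean(noise ** 2))
    return x + noise
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes the noisy signal reproducible across runs.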

Fig. 5 Time–frequency graphs obtained by a SSD; b MS-SSD; c SVM-SSD; d ISSD; e EMD; and f LMD

Table 2 lists the evaluation indicators of the six algorithms. As can be seen, the energy error and orthogonal index of the ISSD method are the smallest, which indicates that the ISSD method has good decomposition performance. However, the ISSD method requires more computing time than the other methods, except for SVM-SSD. The additional time is probably caused by the waveform matching process.

Table 2 Evaluation index comparison among different methods

2.4.3 Numerical signal analysis with high noise

To further demonstrate the anti-noise ability of the ISSD method, stochastic noise with an SNR of 2 dB is added to the numerical signal of Eq. (14). The noisy signal is processed by the six methods (i.e., SSD, MS-SSD, SVM-SSD, ISSD, EMD, and LMD), and the resulting TFGs are plotted in Fig. 6a–f, respectively. Figure 6 shows that four methods (i.e., SSD, MS-SSD, SVM-SSD, and ISSD) can extract the main frequency components of the original signal, but severe fluctuations occur at the frequency of 40 Hz. Overall, the TFGs obtained by ISSD have good frequency resolution, which means that ISSD can still extract the corresponding characteristic frequencies when the signal is polluted by noise. As shown in Fig. 6e, f, there is a serious scale-mixing problem in the TFGs based on EMD and LMD, which indicates that the frequency components of the numerical signal cannot be identified by EMD and LMD. Therefore, the comparative results further verify the effectiveness of the ISSD method.

Fig. 6 Time–frequency graphs obtained by a SSD; b MS-SSD; c SVM-SSD; d ISSD; e EMD; and f LMD

Table 3 gives the evaluation indicators of the six algorithms. From Table 3, the energy error and orthogonal index of ISSD are smaller than those of the other methods, which implies that ISSD is superior to the other methods in signal decomposition. However, owing to the waveform matching step, the computing time of the ISSD method is higher than that of the other methods, except for SVM-SSD.

Table 3 Evaluation index comparison among different methods

2.4.4 The experimental signal analysis

To further verify the effectiveness of the ISSD method, we apply it to a rotor oil whirl signal collected from a rotor test rig at North China Electric Power University (NCEPU). The experiment uses a Bently RK-4 test bench, and the data acquisition equipment is a ZonicBook/618 E, as shown in Fig. 7. The sampling frequency and data length are set to 1280 Hz and 1024, respectively.

Fig. 7 Schematic diagram of rotor test bench

Figure 8a, b shows the waveform and FFT spectrum of the rotor oil whirl signal, respectively. Figure 8a shows that the rotor oil whirl signal resembles a sum of sine waves, and the FFT spectrum has a prominent amplitude at the rotating frequency of 43.75 Hz, which is a feature of the rotor oil whirl fault. Besides, the amplitude at the half-frequency (i.e., 21.25 Hz) is greater than that at the rotating frequency, which indicates that the rotor has an oil whirl fault. Figure 8c–h shows the TFGs produced by the six methods (i.e., SSD, MS-SSD, SVM-SSD, ISSD, EMD, and LMD), respectively. In Fig. 8c–e, the instantaneous frequencies (i.e., 21.25 Hz, 43.75 Hz, and 65 Hz) can be found, but the end effect is serious. In addition, the time–frequency trajectories in Fig. 8c, e are blurred and some false components appear. As shown in Fig. 8f, the fault features of rotor oil whirl are captured clearly and the end effect is relatively small. Moreover, the time–frequency trajectory of the TFG obtained by the ISSD method is clean. Figure 8g, h shows that EMD and LMD suffer from mode mixing and cannot accurately extract the fault features of the rotor oil whirl signal.

Fig. 8 Analyzed results of rotor oil whirl signal: a waveform; b FFT spectrum; time–frequency graphs obtained by c SSD; d MS-SSD; e SVM-SSD; f ISSD; g EMD; and h LMD

Likewise, for quantitative comparison, the evaluation parameters are calculated using Eqs. (12) and (13), and the results are listed in Table 4. Table 4 shows that ISSD has the smallest energy error and orthogonal index. That is, ISSD suppresses the end effect better and gives better decomposition results than the SSD variants based on other extension patterns (i.e., MS-SSD and SVM-SSD). Meanwhile, as shown in Table 4, EMD has the shortest calculation time and SVM-SSD is time-consuming, whereas the computing time of ISSD is close to that of SSD and MS-SSD. Therefore, it can be concluded that ISSD is suitable for decomposing multi-component signals and is effective in suppressing the end effect.

Table 4 Evaluation index comparison among different methods

3 Adaptive selection of sensitive components

The selection of sensitive components after signal decomposition is a critical step in fault diagnosis, so an efficient method should be employed to determine the components that contribute most to fault feature extraction. Kurtosis is a widely used indicator that can disclose the periodic characteristics of cyclic impulses [36]. A larger kurtosis denotes stronger impact features and higher signal energy. Sparseness is a statistical parameter that reflects the sparse characteristics of a vibration signal [37]. A larger sparseness indicates stronger data sparsity and more periodic impulses. Therefore, to combine the advantages of kurtosis and sparseness, a sensitive index (SI) based on their product is presented to select the sensitive components adaptively. For a given signal \(x(t)\), the SI of the jth component is defined as

$${\text{SI}}_{j} = {\text{Kurtosis}}(e_{\text{SSC}}^{j} ) \times {\text{Sparseness}}(e_{\text{SSC}}^{j} )$$
(15)

where \(e_{\text{SSC}}^{j}\) denotes the Hilbert envelope signal of the jth component, and the expression of kurtosis \(K\) and sparseness \(S\) can be, respectively, given as

$$K = \frac{E\left[ (x - u)^{4} \right]}{\sigma^{4}},\quad S = \frac{\sqrt{\frac{1}{N}\sum\nolimits_{n = 1}^{N} x(n)^{2}}}{\frac{1}{N}\sum\nolimits_{n = 1}^{N} \left| x(n) \right|}$$
(16)

where \(u\) and \(\sigma\) are the mean value and standard deviation of signal \(x(t)\), respectively. \(x(n)\) is the vibration signal, and \(N\) is the data length of signal \(x(n)\). The bigger SI indicates the better signal sparseness and the greater impact characteristics. Hence, the largest SI can be used to select the sensitive components containing the most abundant fault characteristic signatures.
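The selection rule of Eqs. (15) and (16) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`kurtosis`, `sparseness`, `sensitive_index`, `select_sensitive`) are hypothetical, the Hilbert envelope is assumed to be the magnitude of the analytic signal returned by `scipy.signal.hilbert`, and the population standard deviation is assumed for \(\sigma\).

```python
import numpy as np
from scipy.signal import hilbert


def kurtosis(x):
    """K = E[(x - u)^4] / sigma^4, as in Eq. (16) (population statistics assumed)."""
    x = np.asarray(x, dtype=float)
    u, sigma = x.mean(), x.std()
    return np.mean((x - u) ** 4) / sigma ** 4


def sparseness(x):
    """S = RMS value divided by mean absolute value, as in Eq. (16)."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.mean(x ** 2)) / np.mean(np.abs(x))


def sensitive_index(component):
    """SI = kurtosis x sparseness of the Hilbert envelope, as in Eq. (15)."""
    envelope = np.abs(hilbert(component))  # assumed envelope definition
    return kurtosis(envelope) * sparseness(envelope)


def select_sensitive(components):
    """Return the index of the component with the largest SI and all SI values."""
    si = np.array([sensitive_index(c) for c in components])
    return int(np.argmax(si)), si
```

Under these assumptions, a component dominated by periodic impulses should receive a larger SI than a component that contains only broadband noise, which is the behavior the maximum-SI criterion relies on.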

4 The proposed detection scheme

4.1 1.5-dimensional energy spectrum

The 1.5-dimensional energy spectrum \({\text{ES}}(\omega )\) is defined by the following formulas and is, in essence, the Fourier spectrum of the third-order cumulant diagonal slice of the instantaneous energy [27].

$${\text{ES}}(\omega ) = \int\limits_{ - \infty }^{ + \infty } {R_{3\phi } (\tau ,\tau )} \,e^{ - j\omega \tau } \,{\text{d}}\tau$$
(17)
$$R_{3\phi }(\tau ,\tau ) = E\left\{ {e(t)e(t + \tau )e(t + \tau )} \right\}$$
(18)

where \(e(t)\) is the instantaneous energy of the given signal \(x(t)\), defined as \(e(t) = [\dot{x}(t)]^{2} - x(t)\ddot{x}(t)\), and \(\dot{x}(t)\) and \(\ddot{x}(t)\) are the first- and second-order derivatives of \(x(t)\), respectively [26]. In Eq. (18), \(R_{3\phi }(\tau ,\tau )\) denotes the third-order cumulant diagonal slice of the instantaneous energy \(e(t)\), and \(E\left\{ \cdot \right\}\) denotes the expected value.
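A discrete-time reading of Eqs. (17) and (18) may help make the definition concrete. The sketch below rests on several assumptions that are not stated in the text: the instantaneous energy is approximated by the common discrete energy operator \(x(n)^{2} - x(n - 1)x(n + 1)\), the mean of \(e(n)\) is removed so that the third-order moment coincides with the cumulant, the expectation is replaced by a time average, and only non-negative lags up to a user-chosen `max_lag` are transformed. The function names are hypothetical.

```python
import numpy as np


def instantaneous_energy(x):
    """Discrete approximation of e(t) = x'(t)^2 - x(t) x''(t) (assumed form)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def energy_spectrum_1p5d(x, fs, max_lag):
    """Estimate the 1.5-dimensional energy spectrum of x.

    Returns the frequency axis in Hz and the magnitude spectrum of the
    estimated diagonal slice R(tau, tau) = E{e(t) e(t + tau)^2}.
    """
    e = instantaneous_energy(x)
    e = e - e.mean()  # zero mean: third-order moment equals the cumulant
    n = e.size
    # Time-average estimate of the diagonal slice for tau = 0 .. max_lag - 1
    r = np.array([np.mean(e[: n - tau] * e[tau:] ** 2) for tau in range(max_lag)])
    spectrum = np.abs(np.fft.rfft(r)) / max_lag
    freqs = np.fft.rfftfreq(max_lag, d=1.0 / fs)
    return freqs, spectrum
```

For an amplitude-modulated test signal, the largest spectral line of this estimate is expected at the modulation frequency, since the carrier is largely removed by the energy operator. The frequency resolution is `fs / max_lag`, so `max_lag` trades resolution against the variance of the estimate.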

4.2 The proposed detection scheme

To improve the diagnostic precision, a new fault detection approach that combines ISSD with the 1.5-dimensional energy spectrum is proposed. Figure 9 shows the flowchart of the proposed detection scheme. The detailed procedure of the proposed method is summarized in three steps:

Fig. 9 Flowchart of the proposed detection scheme

  1. Signal decomposition: Employ the ISSD to process the collected vibration signal, and obtain several SSCs whose instantaneous frequency has physical significance.

  2. Sensitive SSC selection: Calculate the sensitive index of each SSC, and adaptively select the sensitive SSCs according to the maximum sensitive index criterion.

  3. 1.5-dimensional energy spectrum analysis: Perform the 1.5-dimensional energy spectrum analysis on the selected sensitive SSCs, extract the fault frequency of the vibration signal, and identify the fault category of the rotating machinery.

5 Application to rotating machinery fault diagnosis

Gears and rolling bearings are essential parts of rotating machinery and play important roles in engineering applications. When a localized fault appears on a gear or rolling bearing, the generated vibration signal is usually a multi-component modulated signal. Since the ISSD method can decompose a multi-component signal into a sequence of SSCs, it is especially suitable for processing gear and rolling bearing vibration signals. In this section, application examples of gear and bearing fault diagnosis are used to illustrate the effectiveness of the proposed method. Moreover, some available algorithms (e.g., EMD-based Hilbert transform [38], LMD-based Hilbert transform [39], spectral kurtosis [40], and SSD-based Hilbert transform) are also applied to compare their detection ability.

5.1 Case 1: application to gear fault diagnosis

In this subsection, the proposed technique is applied to gear fault data collected from a mechanical fault simulator located at NCEPU. Figure 10a shows a sketch of the QPZZ test rig, and Fig. 10b shows a photograph of the faulty gear. In this experiment, four accelerometers were mounted vertically on the gearbox housing for data acquisition. The motor speed during testing was 834 rpm (about 13.9 Hz). The analyzed gearbox is a single-stage reduction gearbox. The small gear on the input shaft is healthy and has 55 teeth, whereas the large gear on the output shaft has a pitting defect and has 75 teeth. The gear mesh frequency is fm = 764.5 Hz and the gear defect frequency is fg = 10.2 Hz. The gear fault data were acquired at a sampling frequency of 5120 Hz for a sampling time of 1 s.
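The characteristic frequencies quoted above follow from the motor speed and the tooth numbers. The short sketch below reproduces that arithmetic, assuming that the motor drives the input shaft directly and that the defect frequency equals the rotating frequency of the output shaft carrying the faulty gear; the function name `gear_frequencies` is hypothetical.

```python
def gear_frequencies(input_rpm, teeth_input, teeth_output):
    """Return (input shaft frequency, mesh frequency, output shaft frequency) in Hz."""
    f_input = input_rpm / 60.0           # rotating frequency of the input shaft
    f_mesh = f_input * teeth_input       # gear mesh frequency
    f_output = f_mesh / teeth_output     # rotating frequency of the output shaft
    return f_input, f_mesh, f_output


# Values reported for the test rig: 834 rpm, 55 teeth (input), 75 teeth (output)
f_input, f_mesh, f_defect = gear_frequencies(834, 55, 75)
print(round(f_input, 2), round(f_mesh, 2), round(f_defect, 2))
```

With these inputs the mesh frequency is 13.9 Hz × 55 = 764.5 Hz, and the output shaft frequency is 764.5 Hz / 75 ≈ 10.2 Hz, in agreement with the reported fm and fg.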

Fig. 10 Schematic sketch of gearbox

Figure 11a–c shows the waveform of the normal gear vibration signal and its corresponding FFT spectrum and envelope spectrum, respectively. As can be seen, no fault characteristic information appears in the spectra when the gear vibration signal is collected in the healthy state. Moreover, the FFT spectrum of the normal gear vibration signal has no obvious resonance band. Figure 12a shows the waveform of the gear fault signal, in which the transient impulses are submerged in noise. Figure 12b, c shows the FFT spectrum and envelope spectrum of the signal in Fig. 12a, respectively. In Fig. 12b, the frequency components are spread over a broad band and no fault information can be found. Likewise, Fig. 12c shows that the defect frequency fg is absent.

Fig. 11 a Temporal waveform; b fast Fourier transform spectrum; and c envelope spectrum for normal gear vibration signal

Fig. 12 a Temporal waveform; b fast Fourier transform spectrum; and c envelope spectrum for gear pitting vibration signal

The proposed method is then applied to detect the gear fault. Firstly, the gear fault data are processed by ISSD, and the first four SSCs obtained are given in Fig. 13a. Next, the SI of each SSC is plotted in Fig. 13b. According to Fig. 13b, SSC3 has the largest SI and is selected as the sensitive component. Finally, the 1.5-dimensional energy spectrum of SSC3 is computed, giving the result in Fig. 13c. Figure 13c shows clear spectral lines at the defect frequency fg and its harmonics, which indicates that a pitting fault has occurred in the gear. In other words, the proposed method is effective in recognizing the gear fault type.

Fig. 13 Analyzed result of gear fault signal using the proposed method: a the first four SSC components; b sensitive index; and c 1.5-dimensional energy spectrum of SSC3

For comparison, EMD and LMD are used to analyze the same gear fault data. Following Eq. (15), the SI of the first six components (i.e., intrinsic mode functions (IMFs) and product functions (PFs)) obtained by EMD and LMD is drawn in Fig. 14a. The SIs of IMF2 and PF2 are greater than those of the other IMFs and PFs in Fig. 14a, so IMF2 and PF2 are selected as the sensitive components for further processing. Figure 14b shows the waveform and envelope spectrum of IMF2, whereas the waveform and envelope spectrum of PF2 are plotted in Fig. 14c. In the envelope spectra of Fig. 14b, c, the gear fault frequency fg and its harmonics can be detected, but the amplitudes at the fault frequency fg are weak. Likewise, the original SSD method and spectral kurtosis (SK) are also employed to analyze the waveform of Fig. 12a. Figure 15a, b shows the waveform of the sensitive component obtained by the SSD method and its corresponding envelope spectrum, respectively. Figure 15b shows that the gear fault frequency fg and its harmonics can be identified, but the spectrum is not as clear as that in Fig. 13c. It is worth mentioning that the 1.5-dimensional energy spectrum of the signal in Fig. 15a can also yield clear feature extraction results. The kurtogram is shown in Fig. 16a. Figure 16b, c shows the filtered signal and its corresponding envelope spectrum, respectively. As shown in Fig. 16c, although the fault feature information can be observed, the amplitudes at the characteristic frequency fg are not prominent.

Fig. 14 Analyzed result of gear fault signal using EMD and LMD: a sensitive index; b waveform and envelope spectrum of IMF2; and c waveform and envelope spectrum of PF2

Fig. 15 Analyzed result of gear fault signal using the original SSD method

Fig. 16 Analyzed result of gear fault signal using SK method: a kurtogram; b the filtered signal; and c envelope spectrum

5.2 Case 2: application to bearing fault diagnosis

Further, the proposed method is applied to detect a generator front bearing fault in a wind turbine with a nominal output power of 750 kW. A structural sketch of the wind turbine, which is mainly composed of an impeller, a gearbox, and a generator, is shown in Fig. 17b. During vibration testing, two accelerometers were glued on the front and rear bearing housings of the generator to gather the fault data, as shown in Fig. 17a. The generator bearing type is SKF6324C3, and its geometric parameters are listed in Table 5. During data collection, the average spindle speed and generator shaft speed were n1 = 21.63 r/min and n2 = 1519 r/min, respectively. The sampling frequency was 16,384 Hz, and the total length of the measured signal was 163,840 points. Only the vibration signal from the generator front bearing is analyzed in this paper. The outer race defect frequency of the generator bearing was calculated as fo = 79.21 Hz.
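The outer race defect frequency is commonly computed from the bearing geometry and the shaft speed. The sketch below uses the standard kinematic formula for a stationary outer race, \(f_{o} = \frac{Z}{2}f_{r}\left( 1 - \frac{d}{D}\cos \alpha \right)\), which is assumed here because the text does not state the expression actually used. The function name `outer_race_frequency` and its parameter names are hypothetical, and the geometric values must be taken from Table 5.

```python
import math


def outer_race_frequency(shaft_rpm, n_elements, element_diameter,
                         pitch_diameter, contact_angle_deg=0.0):
    """Ball-pass frequency of the outer race in Hz (standard kinematic formula).

    element_diameter and pitch_diameter only need to share the same unit.
    """
    f_shaft = shaft_rpm / 60.0
    ratio = element_diameter / pitch_diameter
    angle = math.radians(contact_angle_deg)
    return 0.5 * n_elements * f_shaft * (1.0 - ratio * math.cos(angle))
```

The formula neglects slip between the rolling elements and the races, so a measured defect frequency can deviate slightly from the computed value.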

Fig. 17 a Sensor placement and b the schematic diagram of wind turbine transmission system

Table 5 Geometric parameters of front–back bearing of generator

Figure 18a shows the waveform of the collected vibration data with a length of 8192 points. Figure 18a indicates that the vibration signal contains considerable noise. Figure 18b, c shows the FFT spectrum and envelope spectrum of the signal in Fig. 18a, respectively. In Fig. 18b, the defect frequency fo is difficult to identify. In Fig. 18c, there is a high amplitude at the bearing defect frequency (fo = 79.21 Hz), but several interference components can also be found, which hinders identification of the bearing fault type.

Fig. 18 a Temporal waveform; b fast Fourier transform spectrum; and c envelope spectrum for bearing vibration signal

The proposed method is used to diagnose the generator bearing fault. Figure 19a shows the results obtained by ISSD, and the SI of each SSC is given in Fig. 19b. As shown in Fig. 19b, SSC1 has the largest SI and is therefore taken as the sensitive component. Figure 19c shows the 1.5-dimensional energy spectrum of SSC1. As depicted in Fig. 19c, the bearing defect frequency fo and its second harmonic 2fo are extracted clearly, which indicates that a localized defect has appeared on the outer race of the generator front bearing. That is, the proposed method performs reasonably well for bearing fault detection.

Fig. 19 Analyzed result of bearing fault signal using the proposed method: a the first four SSC components; b sensitive index; and c 1.5-dimensional energy spectrum of SSC1

For comparison, EMD and LMD are applied to the same data. Figure 20a shows that IMF1 and PF1 have the largest SI. Therefore, IMF1 and PF1 are analyzed, and the results are plotted in Fig. 20b, c, respectively. The defect frequency fo and its second harmonic 2fo can be found in the envelope spectra of IMF1 and PF1. However, the amplitudes at the defect frequency fo and its harmonics are lower than those in Fig. 19c. Meanwhile, the original SSD method and spectral kurtosis (SK) are also used to process the signal of Fig. 18a. The results obtained by the original SSD method are shown in Fig. 21. As can be seen from the envelope spectrum of Fig. 21b, the defect frequency fo and its second harmonic 2fo can be extracted, but the spectral lines are not as clear as those obtained by the ISSD method. The kurtogram is plotted in Fig. 22a. Figure 22b, c shows the filtered waveform and its corresponding envelope spectrum, respectively. In Fig. 22c, there is a distinct peak at the characteristic frequency fo, but its harmonics are not clear.

Fig. 20 Analyzed result of bearing fault signal using EMD and LMD: a sensitive index; b waveform and envelope spectrum of IMF1; and c waveform and envelope spectrum of PF1

Fig. 21 Analyzed result of bearing fault signal using the original SSD method

Fig. 22 Analyzed result of bearing fault signal using SK method: a kurtogram; b the filtered signal; and c envelope spectrum

5.3 Further discussions

The diagnosis examples above verify the performance of the proposed method. For the fault detection of gears and bearings, the proposed method performs better than the comparative methods (i.e., SSD, EMD, LMD, and SK). Nevertheless, all the above results are qualitative comparisons. Therefore, evaluation indicators should also be calculated to compare the methods quantitatively.

Firstly, the decomposition capability of four methods (i.e., ISSD, SSD, EMD, and LMD) is discussed for case 1 and case 2. Two indicators (i.e., energy error and orthogonal index) given in Sect. 2.3, together with CPU time, are used to assess the capability of the four methods. Table 6 lists the comparison results. As shown in Table 6, the energy error and orthogonal index of ISSD are lower than those of the other three methods, which means that ISSD yields better decomposition results. It is noteworthy that ISSD needs more running time than the other three methods. That is, the decomposition capability of ISSD is enhanced at the expense of computation time. Therefore, in future work, we intend to improve the running speed of ISSD and apply ISSD to other types of vibration signals.

Table 6 Quantitative comparisons among different methods

Next, we compare the spectrum analysis of the sensitive component after signal decomposition. The fault feature ratio (FFR) reported in Ref. [41] is used to evaluate quantitatively the detection capability of five methods (i.e., the proposed method, SSD-based Hilbert transform, EMD-based Hilbert transform, LMD-based Hilbert transform, and spectral kurtosis). It is defined as follows:

$$R_{f} = \frac{S(f) + S(2f) + S(3f)}{S}$$
(19)

where \(f\) represents the defect frequency, \(S\) represents the overall amplitude of the Hilbert envelope spectrum of the signal, and \(S(f)\), \(S(2f)\), and \(S(3f)\) denote the envelope spectrum amplitudes at \(f\), \(2f\), and \(3f\), respectively. A greater FFR value indicates better fault detection performance. The FFR values obtained by the five methods are listed in Table 7. As shown in Table 7, the proposed method has FFR values of 0.1626 and 0.2893, which are much higher than the FFR values obtained by the other methods. On the whole, the proposed method improves the detection capability for gear and bearing faults compared with the other methods.
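Equation (19) can be evaluated directly from a Hilbert envelope spectrum. The sketch below is one possible reading of the definition, not the procedure of Ref. [41]: it assumes that the overall amplitude \(S\) is the sum of all envelope spectrum amplitudes, that the envelope mean is removed before the transform, and that \(S(kf)\) is read at the spectral bin nearest to \(kf\). The function names are hypothetical.

```python
import numpy as np
from scipy.signal import hilbert


def envelope_spectrum(x, fs):
    """Amplitude spectrum of the mean-removed Hilbert envelope of x."""
    envelope = np.abs(hilbert(x))
    envelope = envelope - envelope.mean()
    spectrum = np.abs(np.fft.rfft(envelope)) / envelope.size
    freqs = np.fft.rfftfreq(envelope.size, d=1.0 / fs)
    return freqs, spectrum


def fault_feature_ratio(x, fs, f):
    """FFR = [S(f) + S(2f) + S(3f)] / S, following Eq. (19)."""
    freqs, spectrum = envelope_spectrum(x, fs)
    # Amplitudes at the bins nearest to f, 2f, and 3f (assumed lookup rule)
    picks = [spectrum[np.argmin(np.abs(freqs - k * f))] for k in (1, 2, 3)]
    return float(np.sum(picks) / np.sum(spectrum))
```

Under these assumptions the ratio lies between 0 and 1. It approaches 1 when the envelope is dominated by the defect frequency and its harmonics, and it stays close to 0 for a signal whose envelope spectrum is spread over many bins.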

Table 7 FFR value of the analysis results derived from different methods

6 Conclusions

In this paper, an adaptive signal analysis method named ISSD is proposed, which improves the decomposition results and effectively restrains the end effect. Simulation and experimental results verified the feasibility of the ISSD algorithm. Subsequently, a scheme combining ISSD with the 1.5-dimensional energy spectrum is presented for diagnosing local faults in rotating machinery. Applications to the fault diagnosis of a gearbox and a rolling bearing show that the proposed method is effective in extracting the fault characteristics of rotating machinery. In addition, the analysis results indicate that the detection capability of the proposed method is better than that of other available methods (i.e., EMD, LMD, and SK). Importantly, the accuracy of fault diagnosis can be improved by the joint application of ISSD and the 1.5-dimensional energy spectrum. The contributions of this paper are summarized as follows:

  1. A modified algorithm called ISSD is presented, which can improve signal decomposition ability.

  2. A novel integration scheme (ISSD and 1.5-dimensional energy spectrum) is developed for fault detection.

  3. The validity of the presented algorithm is proved using simulated and experimental signals.

The preliminary results show that the proposed algorithm is effective in detecting local faults under constant speed. Whether it can also handle fault diagnosis under variable speed remains unknown. In future work, we intend to extend the proposed algorithm to diagnose mechanical faults under variable speed.