1 Introduction

Diagnostics and fault detection and isolation (FDI) of rotating machines are crucial for cost-effective and reliable industrial processes. Advanced approaches for monitoring and controlling the availability of rotating machines, such as condition-monitoring systems (CMS) or structural health monitoring (SHM) systems, require a reliable diagnostics and prognostics system. In current industrial practice, these systems combine signal-based approaches with machine learning methods. Model-based approaches, which use a suitable system description, are discussed mainly in academic research as a possible alternative to the more application-oriented signal-based approaches. Modern approaches rely on feature extraction and recognition algorithms, along with mathematical modeling and simulation, to detect and/or avoid faults that can break down machines and systems by impairing their functionality. Consequently, the task of diagnostics and prognostics approaches is to provide relevant statements as early as possible.

Cracks in rotating machinery, often caused by fatigue stress, are the most critical and fundamental type of damage in the related industry. Dynamically, they often lead to vibration effects similar to those of asymmetry and misalignment, accompanied by changes in the vibration properties. These changing dynamical properties are used as indicators of changed mechanical parameters of the rotor, including deterioration effects. Knowledge of the causal chain that is initiated by shaft cracks and affects the dynamics of the rotating system can be used by model-based as well as signal-based approaches.

It should be noted that both approaches (model-based, signal-based) use knowledge, but in a different way.

Signal-based approaches use output signals (denoted as \(S_o\)). Fault detection modules compare the raw or filtered signals (denoted as \(S_f\)) with reference or threshold values in order to indicate changes with respect to the normal (regular, healthy) condition; this step is called fault detection. Additional knowledge and modules are necessary to relate these signals (\(S_o\), but mainly \(S_f\)) to those of distinct known states, with the comparison carried out on a third level (feature space). Alternatively, classification approaches are used (possibly in combination with suitably chosen filters) to distinguish states/faults as classes. Here, model information/knowledge is also used implicitly; nevertheless, these approaches are denoted as signal-based (or data-driven) approaches.

Model-based approaches use, besides the output signals \(S_o\), input signals \(S_i\) and a model that is either to be built (parameter identification) or assumed (observer). The comparison is carried out directly at this level, either on parameters (identification) or on residuals (observer). Filtering the residuals allows fault locations to be distinguished within a so-called parity space. Because the models are assumed to be given/known explicitly, these approaches are denoted as model-based.

In this contribution, two different related approaches are explained in detail in order to evaluate their potential and applicability. The differences and the possibilities are illustrated with respect to the development of a reliable crack detection approach to be applied for online monitoring in turbomachinery.

The paper is partly based on a model-based approach of the first author [30], first published in 1993. Here, the approach is briefly introduced and applied to a new example using realistic model data. The signal-based alternative used here has been published previously in a completely different context [25]. Both approaches have been developed further over the last years [25] or decades [30].

This combination of support vector machines (SVM) and wavelets was introduced in a previous publication of the authors [25]. In this publication, the approach is improved and applied to a different system class. This specific kind of system and application task (crack detection in a rotor) is used to compare the approach with the previously introduced model-based approach [30]. In contrast to that early model-based investigation, a realistic model (no lumped-mass approach) and a professional simulation program are used here, and the results are compared. The earlier, often-cited approach [30] is compared with the newly introduced approach, which shows strongly improved results.

1.1 Cracks in rotors

Rotating shafts are considered among the most important and most critical machine elements in industrial processes and machines such as turbines, compressors, and pumps. The rotors of such machines are usually subjected to extreme working conditions of loading and temperature variations. Accordingly, the resulting failures can lead to enormous damage, economic losses, and injuries. A thorough and up-to-date overview is given in [14]; there, Ishida presents case histories of accidents and cracks found in industrial machines.

Different kinds of faults and flaws have been recorded in the rotating machinery such as unbalance, misalignment, rub, and looseness [35]. The cracks in rotors have long been considered as factors limiting the safe and reliable operation of the rotating machinery. A crack may be developed from some surface or internal imperfections and propagate without much apparent warning. In ductile steels used for rotors, cracks are influenced by many factors such as the rapid fluctuation of the bending stresses, the presence of stress raisers and possible design or manufacturing flaws, and the variations in temperature and environment [26].

Methods of crack formation and propagation vary from high and low cycle fatigue to temperature and environment effects. A typical event of cracking in ductile steels can be divided into three stages: crack initiation, in which tiny discontinuities are initiated in the uncracked parent material; crack propagation, in which the discontinuity grows in size as a result of the cyclic stresses induced in the material; and failure, which occurs when the material that has not been affected by the crack cannot withstand the applied loads [3, 26].

The transverse breathing crack is the most critical type of crack in rotating shafts. Here the cross section is reduced, weakening the rotor under certain conditions. In the case of breathing, the crack opens and closes due to the rotation of the shaft: the crack moves from the upper position, in which the static bending moment forces the crack to close, to the opposite position, in which the crack is forced open [3].

The influence of a crack on the rotor structure is related to the change of the local stiffness at the crack. A crack introduces local flexibilities and time-dependent changes in the stiffness of the structure due to strain energy concentration, hence reducing the natural frequencies of the original uncracked rotor [3, 10].
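The link between a local stiffness loss and a lower natural frequency can be illustrated with a single-degree-of-freedom oscillator. This is only a sketch: the mass \(m\), the uncracked stiffness \(k\), and the stiffness loss \(\Delta k\) are illustrative symbols that are not used elsewhere in this paper, and a real rotor has many modes that are affected differently depending on the crack location.

$$\begin{aligned} \omega _0 = \sqrt{\frac{k}{m}}, \qquad \omega _c = \sqrt{\frac{k-\Delta k}{m}} = \omega _0\sqrt{1-\frac{\Delta k}{k}} \approx \omega _0\left( 1-\frac{\Delta k}{2k}\right) \quad \text {for } \Delta k \ll k. \end{aligned}$$

Under this simplification, a stiffness loss of 1% lowers the natural frequency by only about 0.5%, which suggests why small cracks are hard to detect from frequency shifts alone.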

1.2 Crack detection of elastic rotors: a brief review

Several crack models were developed in the 1970s and 1980s, mostly for integration into simulation studies.

Several techniques have been used to monitor cracks in rotors, such as vibration sensors, ultrasonic measurements, tribological analysis, and, more recently, acoustic emission techniques [3, 21]. However, vibration-based techniques have been used most widely as tools for fault diagnosis in rotating machinery [11, 34]. Vibration-based systems directly measure the rotor response forced by rotor flaws. A general review of vibration-based condition monitoring for structures is given by Carden and Fanning [4]. They presented several approaches adopted in the literature for the time, frequency, and modal domains, such as the natural frequency-based methods, which are modal methods that depend on frequency shifts and on the relation between stiffness changes and natural frequency changes. According to Carden and Fanning, the reliability of such methods is limited to single or few damage locations and/or small laboratory structures. They presented and discussed other approaches based on mode shape, modal strain energy, dynamically measured flexibility matrix, residual force vector, wavelet transform, neural networks, genetic algorithms, and statistical pattern recognition. They stressed that there is a lack of research dealing with synchronous faults and the remaining service life. The vibrational behavior of cracked rotors was also studied by Silani et al. [29]. They used a finite element approach and the short-time Fourier transform (STFT) to investigate the detection of small cracks. They showed that although the transient response does not change appreciably in the presence of very small cracks, the STFT of the response can clearly identify cracks. In the work of Sawicki et al. [27], multiresolution wavelet analysis was applied to the vibration signal of a rotating system with and without external force excitation in order to detect the existence of a transverse crack. They found that the differences are more pronounced in the presence of external force excitation. They presented the RMS amplitude values of the vibration signal in different frequency bands as a simple quantification method for fault severity. Several other vibration-based techniques are introduced in the literature [12, 40].

The objectives of any crack monitoring system comprise crack detection, localization, severity quantification, and remaining service life prediction. The crack detection and localization have been much more emphasized in the literature than the other two objectives.

An early concept of model-based crack detection based on the theory of disturbance observer is introduced by Söffker et al. [30]. Here, based on the nominal behavior of the system, the fault effects caused by the crack are interpreted as unknown external effects acting to the nominal behavior of the rotor. Measurements of displacements and/or velocities of the vibrating dynamic system are necessary, together with further information such as the mechanical model of the rotor and the characteristics of the typical behavior of the crack. Another model-based method is proposed by Sekhar [28] for the online identification of cracks in a rotor while it is passing through its flexural critical speed. The fault-induced change of the rotor system is taken into account by equivalent loads in the mathematical model. For crack modeling, the flexibility matrices of the cracked section are utilized according to Papadopoulos and Dimarogonas [23]. The equivalent loads are virtual forces and moments acting on the linear undamaged system to generate a dynamic behavior identical to the measured one of the damaged system. The rotor has been modeled using FEM, and the crack has been identified for its depth and location on the shaft for different rotor accelerations. The CWT has been used to extract the subharmonic features of crack from the time response. The work results show accurate estimation of crack location; however, the error of crack depth estimation increases with the decrease in measured vibration data (DOF). On the contrary, results of Xiang et al. [37] show better accuracy of crack depth estimation than that of location estimation. The model-based detection system proposed by Xiang et al. is based on the combination of wavelet-based elements and genetic algorithm. Genetic algorithm is applied to eliminate the errors of frequencies between numerical simulation and experimental measurement. The wavelet transform is used also in the work of Nagaraju et al. [22]. 
They studied the transient analysis of rotor system with transverse breathing crack for flexural vibrations. To extract the hidden features of the crack, the time signal obtained from the transient analysis is transformed to 3D-CWT plots in which the time–frequency components are clearly represented. A new wavelet plot called cross-wavelet transform (XWT) has been applied to the time signals to obtain the phase angles. The XWT gives the phase angles of different frequency components along with the subcritical peaks in a single plot. The inverse problem of crack detection has been carried out using artificial neural network (ANN).

Some other reference papers for model-based systems are recommended [6, 32].

Signal-based systems for monitoring rotating machines have been used for a long time. The techniques used are gradually updated as data acquisition and signal processing techniques evolve. Advances in machine learning and feature extraction have led to new methods being adopted in the field of rotating machinery fault detection.

The work introduced by Tao et al. [33] comprises a detection method based on Fisher discriminant analysis (FDA) as linear dimensionality reduction technique and Mahalanobis distance (MD) for performance assessment. Data samples are projected into a new low-dimensional space in which MD between the new measurement data and normal population is calculated for performance assessment. As a conclusion, the transformation of MD into the feature space and the determination of an adaptive threshold for fault detection is still a challenge. An energy-based approach to defect diagnosis in rotary machines is introduced by Yan and Gao [38]. The method uses continuous wavelet transform CWT and is based on the analysis of the energy content associated with the signal to determine the best suited base wavelet and decomposition scale for analyzing the signal.

Some other reference papers for signal-based systems are recommended [36, 39].

For a deeper understanding of cracked-rotor techniques and recent advances in general, and of the two main categories (model-based and feature-based) in particular, the general reviews [3, 17, 26, 35] are recommended.

2 Rotor system

Modeling of elastic rotors has been developed over decades to a high degree of sophistication for simulation, fault detection, and isolation purposes. The developed models are validated by comparing numerical results with the natural frequencies, mode shapes and critical speeds acquired from vibration measurements of the rotors. However, sufficient performance of a rotor model should usually consider the dynamics of complex framework foundation which is directly connected to the rotor dynamics. This adds more difficulties and complexity to the process of rotor modeling and restricts the use of models.

The incorporation of breathing behavior into the dynamics of the rotor represents a significant increase in complexity as a result of the nonlinear characteristics of the real transverse fatigue cracks.

During the rotation of the rotor, and mainly as a result of gravity, a portion of the rotor cross section remains under compression and a portion under tension. In a cracked rotor, the part of the crack facing tensile stress opens, while the part facing compression closes. Therefore, the crack opens and closes according to the angular position of rotation. In the completely closed position, the rotor behaves as if it were uncracked.

Crack breathing is accompanied by periodic changes in the stiffness of the rotor. At a certain angular position, when the crack is completely closed, the rotor has almost the stiffness of a faultless rotor. Similarly, at a certain angular position, when the crack is completely open, a significant decrease in local stiffness exists; however, the reduction in global stiffness of the rotor depends on the depth and location of the crack.

Many approaches have been developed for modeling cracks in rotors, and the subsequent reduction in stiffness. The developed models should accurately represent the rotor system; however, the required accuracy depends on the intended use of the model. On the other hand, the complexity of the breathing crack mechanism leads to necessary approximations and assumptions which are required in order to simplify the process of modeling and use of model.

The earliest and simplest, but still often used, model is the hinge model published in 1976 by Gasch [8]. The steering function in the model is a step function approximation of the crack, in which the crack is represented as either entirely open or entirely closed. The model assumes that vibrational amplitudes are small compared with the static weight deflection, and axial and torsional vibrations are ignored [9]. Figure 1 shows the "breathing" of the crack under the influence of weight when the shaft is turned slowly, together with the model steering function f(t).

Fig. 1 Crack breathing, static deflection, and corresponding steering function (ref. to [9])

A smooth transition between the open and closed positions was introduced by the sinusoidally varying model presented by Mayes [20]. The model uses a steering function of the form \(f(t)=(1+\cos \varOmega t)/2\) as a better approximation of the breathing behavior. The use of the Mayes model is more significant for deeper cracks, although the rotor stiffness is not directly connected to the depth of the crack.
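The two steering functions can be compared with a minimal sketch. The function names and the phase convention are assumptions made for illustration: the value 1 is taken as "fully open" and 0 as "fully closed", \(\varOmega t = 0\) is taken as the fully open position, and the hinge model is assumed to switch where the Mayes function crosses 1/2.

```python
import math


def hinge_steering(omega, t):
    """Step-type steering function (hinge model): crack fully open (1.0) or closed (0.0).

    Assumed convention: open while cos(omega * t) > 0.
    """
    return 1.0 if math.cos(omega * t) > 0.0 else 0.0


def mayes_steering(omega, t):
    """Smooth steering function f(t) = (1 + cos(omega * t)) / 2 (Mayes model)."""
    return (1.0 + math.cos(omega * t)) / 2.0


if __name__ == "__main__":
    omega = 2.0 * math.pi  # hypothetical speed: one revolution per second
    for step in range(9):
        t = step / 8.0
        print(f"t={t:5.3f}  hinge={hinge_steering(omega, t):3.1f}  "
              f"mayes={mayes_steering(omega, t):5.3f}")
```

Both functions are periodic with the shaft rotation; the difference lies only in how abruptly the transition between the closed and the open state takes place.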

Taking the crack-related vibrations into account, a more complicated breathing model was introduced by Jun et al. [15]. The model expresses the equation of motion of a simple rotor with a response-dependent stiffness. It uses fracture mechanics to estimate the cross-coupling stiffness as well as the direct stiffness. The degree of crack opening was determined from the response obtained by solving the governing equation, and the response-dependent stiffness was calculated by numerically integrating over the open crack area only [16].

In order to implement more realistic rotor models, the finite element method has been used in connection with the previously mentioned models by many researchers.

The previously mentioned models were applied and compared by Penny and Friswell [24] in order to investigate the influence of the crack model on the response of a general rotor model. According to their results, the three crack models examined had relatively small effect on the predicted steady-state 1X response, but they did have some influence on the predicted whirl orbit and the steady-state 2X response. However, in any crack identification scheme, these differences are not likely to have a significant effect, and simple models are more readily used [18].

The example system in this contribution is an application-oriented rotor described by a finite element model with a length of 4.2 m and a radius of 0.14 m. The rotor is supported by two bearings, which are modeled as simple supports with damping (total damping coefficient \(d = 9\times 10^{3}\) Ns/m and stiffness \(k = 3\times 10^{7}\) N/m). While the four sensors at the bearing positions allow measurement of the rotor’s displacement, the displacements and velocities of the beam nodes in the x and y plane are measured directly during operation. This measurement scenario is the typical one in practice and differs from those used in the previous publications [30]. The discretized model of the rotor (Fig. 2) has eight nodes, seven elements, and 32 elastic degrees of freedom (each node has four degrees of freedom: translation and rotation in the x and y planes). The vector equation of motion reads

$$\begin{aligned} M\ddot{q}(t)+(D+\varOmega G)\dot{q}(t)+Kq(t) = Fw, \quad y = C_rq, \end{aligned}$$
(1)

where M represents the mass matrix, D the damping matrix (including bearing damping), G the gyroscopic matrix, K the stiffness matrix (including bearing stiffness), F the input matrix, q the displacement vector, y the vector of measurements, \(C_r\) the output matrix (corresponding to the sensor nodes), \(\varOmega \) the rotational speed, and w the input vector (unbalance forces and crack forces).

Fig. 2 Discretized rotor model

The shaft cross-crack is modeled by integrating curvature-dependent changes at the location of the crack. Additionally, crack-specific parameters [20, 30] have to be adapted. The procedure used here is similar to that introduced in [30]. Integrating the crack effects adds strongly nonlinear effects to the description of the rotor dynamics. The crack-related flexibilities are primarily influenced by the curvature-induced opening of the crack at the crack position (here assumed to be known). The integration into an FEM-based description is necessary for applications, because signal-based approaches require generated output data and model-based approaches require input–output information in addition to the model. Here, the unknown external effects are interpreted as effects on the right-hand side of the dynamical equation. In simulations, these external effects have to be included as known effects.

The state-space model is used for both simulations and the observer design. It is known that the eigenvalues of a rotor system are speed-dependent. The eigenvalues are calculated at the rotational speed of 9000 rpm. The first four forward modes are illustrated in Fig. 3. The amplitudes for each mode are normalized, so that the maximal amplitudes for each mode are taken as unity. The two rigid body modes and the first two bending modes can be clearly identified.

Fig. 3 The first four rotor eigenmodes (speed of 9000 rpm)

To show the effect of the crack on the shaft dynamics, simulations are performed with and without a crack. The system is excited by an unbalance U = 0.12 kg m acting at node 6 of the rotor; the weight force is taken into account in the form of static forces in the y-direction. With the crack located at node 4, the rotor crack is modeled using a relative compliance of \(hr = 0.001\). Figure 4 shows an example of the system under consideration; the crack causes only a small noticeable change in the oscillation amplitude and none in the phase.

Fig. 4 Vibration amplitude (blue) and crack force (gray) at Lx
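To give a feeling for the size of the unbalance excitation, the following sketch evaluates the rotating unbalance force. It rests on assumptions not stated in the text: U is interpreted as mass times eccentricity, so that the force amplitude is \(U\varOmega ^2\), and the running speed is taken as 9000 rpm, the speed quoted above for the eigenvalue calculation. The function names are hypothetical.

```python
import math


def unbalance_amplitude(unbalance_kg_m, speed_rpm):
    """Amplitude U * Omega^2 of the rotating unbalance force in N."""
    omega = speed_rpm * 2.0 * math.pi / 60.0  # rotational speed in rad/s
    return unbalance_kg_m * omega ** 2


def unbalance_force(unbalance_kg_m, speed_rpm, t):
    """Force components (Fx, Fy) in N at time t, starting along x at t = 0."""
    omega = speed_rpm * 2.0 * math.pi / 60.0
    amplitude = unbalance_amplitude(unbalance_kg_m, speed_rpm)
    return amplitude * math.cos(omega * t), amplitude * math.sin(omega * t)


if __name__ == "__main__":
    # Assumed operating point: U = 0.12 kg m at 9000 rpm
    print(f"amplitude = {unbalance_amplitude(0.12, 9000.0):.1f} N")
```

The static weight force in the y-direction mentioned in the text would be added to the y-component separately; it is left out here because the rotor mass is not given.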

The technical challenge for monitoring is similar for all procedures. For fault detection, the change of the rotor dynamics due to the occurrence of the shaft cross-crack has to be identified from the available (indirect) measurements of the shaft movements at the bearings. For the diagnostic task, the signal characteristics, that is, the mapping between the detected change and its cause, should be used so that the fault can be assigned to the geometrical position of the irregularity.

3 PI-observer (model-based) approach

As mentioned in the section of brief review (Sect. 1.2), crack detection based on model-based approaches has been studied in the last years by many groups.

An early concept of crack detection based on the theory of disturbance observer, later denoted as PI-observer [31], is introduced by Söffker et al. [30], later optimized for practical use, and successfully used for several years [19, 31]. Here, based on the nominal behavior of the system, the fault effects caused by the crack are interpreted as unknown external effects acting to the nominal behavior of the rotor. For this task, measurements of displacements and/or velocities of the vibrating dynamic system are necessary, together with further information such as the mechanical model of the rotor and the characteristics of the typical behavior of the crack.

Theoretical considerations are realized, showing the ability of the method, especially for reconstructing related forces understood as disturbance forces caused by the crack. The introduced method shows that it is possible to detect a crack inside the rotor only using bearing measurements, also if only small stiffness changes occur.

The basic idea of analytical redundancy methods applied to fault detection is to use analytical relationships, for example in the form of systems of differential or difference equations. The assumed relationships are understood as a mathematical equivalent of the real physical system, in which the input–output relationships are mapped into an appropriate form. This may include the modeled (assumed) fault behavior (or related descriptions).

The system equation of motion (Eq. 1) given in Sect. 2 can be transformed into the more general state-space description

$$\begin{aligned} \begin{aligned} \dot{x}&=Ax+Bu+Nf(x,u,t)\\ y&=Cx. \end{aligned} \end{aligned}$$
(2)

Here x denotes the 2n-dimensional state vector (consisting of displacement and velocity variables), A is the 2n \(\times \) 2n system matrix, B represents the input matrix, C the output matrix, y is the vector of measurements, and u is the 2n-dimensional vector of known control inputs and/or excitation functions. The vector function \(f(x,u,t)\) describes the nonlinearities, unknown inputs, and unmodeled dynamics of the system. The matrix N is the corresponding input matrix that couples the nonlinearities into the linear dynamical system. It is assumed that the system parameters A, B, C, and N, as well as the input and output time signals u and y, are known. The task is to reconstruct the unknown nonlinearities \(f(x,u,t)\) (here the external disturbance forces of the crack) by applying a proportional-integral observer (PIO) [31]. The basic idea of the PIO is to extend the well-known Luenberger observer by specific additional degrees of freedom. These degrees of freedom are used to estimate the unknown inputs and are realized by integrating the estimation error within the observer dynamics.
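One common way to arrive at the form of Eq. 2 from Eq. 1 is sketched below. It assumes that M is invertible and that the input term of Eq. 1 can be split into a known part (for example unbalance) and an unknown part (crack forces), \(Fw = F_u w_u + F_c w_c\). This split and the symbols \(w_u\), \(w_c\), \(F_u\), and \(F_c\) are introduced here for illustration only.

$$\begin{aligned} x = \begin{bmatrix} q\\ \dot{q} \end{bmatrix}, \quad A = \begin{bmatrix} 0&I\\ -M^{-1}K&-M^{-1}(D+\varOmega G) \end{bmatrix}, \quad B = \begin{bmatrix} 0\\ M^{-1}F_u \end{bmatrix}, \quad N = \begin{bmatrix} 0\\ M^{-1}F_c \end{bmatrix}, \quad C = \begin{bmatrix} C_r&0 \end{bmatrix}, \end{aligned}$$

with \(u = w_u\) and \(f = w_c\). Because A contains \(\varOmega G\), the state-space matrices depend on the rotational speed.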

To describe approximately the time behavior of the nonlinearities, the state-space vector is extended by a fictitious disturbance vector v(t) including dynamical behavior

$$\begin{aligned} \begin{aligned} f\approx {Hv},\\ \dot{v}\approx {Fv},\\ \end{aligned} \end{aligned}$$
(3)

which leads to the extended system

$$\begin{aligned} \begin{aligned} \begin{bmatrix} \dot{x}\\ \dot{v} \end{bmatrix} =&\begin{bmatrix} A&NH\\ 0&F \end{bmatrix} \begin{bmatrix} x\\ v \end{bmatrix} + \begin{bmatrix} B\\ 0 \end{bmatrix} u\\ y =&\begin{bmatrix} C&0 \end{bmatrix} \begin{bmatrix} x\\ v \end{bmatrix}.\\ \end{aligned} \end{aligned}$$
(4)

Here the matrix N couples the fictitious approximation Hv of the unknown inputs f to the states where they appear. The signal characteristics of these inputs are approximated by a linear dynamical system with the system matrix F.

An extended observer could be constructed for the system mentioned in Eq. 4 in order to estimate the approximation of the disturbances v as \(\hat{v}\). The task of the observer design, with suitably chosen large observer gains and assuming the observability of A by C, is to realize the estimation of at least the displacement variables at the crack location, and typically of all modeled nodal displacements, and of the additional effects known as acting to the system.

From the structure of the PI-observer illustrated in Fig. 5, the dynamics of the PI-observed system are described by

$$\begin{aligned} \begin{aligned} \dot{\hat{x}}&=A \hat{x} + L_3 \hat{v} + Bu + L_1(y- \hat{y}),\\ \dot{\hat{v}}&=L_2(y- \hat{y}), \end{aligned} \end{aligned}$$
(5)

and transformed in a matrix form as

$$\begin{aligned} \begin{bmatrix} \dot{\hat{x}}\\ \dot{\hat{v}} \end{bmatrix} =\underbrace{\begin{bmatrix} A-L_1C&L_3\\ -L_2C&0 \end{bmatrix}}_{A_e} \begin{bmatrix} \hat{x}\\ \hat{v} \end{bmatrix} + \begin{bmatrix} B\\ 0 \end{bmatrix} u + \begin{bmatrix} L_1\\ L_2 \end{bmatrix} y. \end{aligned}$$
(6)

Fig. 5 Structure of PI-observer [31]
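The observer structure can be tried out on a deliberately small example. The sketch below is not the rotor model of this paper: it assumes a single-degree-of-freedom oscillator with hypothetical parameters m, d, and k, a measured displacement (C = [1, 0]), a constant-disturbance model (F = 0, H = 1, so that \(L_3 = N\), consistent with the zero lower-right block of \(A_e\) in Eq. 6), and gains chosen so that all three observer poles lie at \(-p\). The "unknown" force f(t) is used only by the simulated plant.

```python
import math


def simulate_pi_observer(t_end=3.0, dt=1e-3):
    """Simulate a 1-DOF oscillator and a PI-observer that estimates an unknown force."""
    m, d, k = 1.0, 0.4, 100.0  # hypothetical mass, damping, stiffness
    p = 100.0                  # all three observer poles are placed at -p
    # Gains obtained by matching det(sI - A_e) with (s + p)^3
    l11 = 3.0 * p - d / m
    l12 = 3.0 * p ** 2 - k / m - l11 * d / m
    l2 = m * p ** 3

    def u(t):  # known excitation
        return math.sin(8.0 * t)

    def f(t):  # "unknown" slowly varying disturbance force (plant only)
        return 1.0 + math.cos(2.0 * t)

    def deriv(t, s):
        q, qd, qh, qdh, vh = s  # plant states, observer states, force estimate
        r = q - qh              # output error y - y_hat
        return [
            qd,
            (-k * q - d * qd + u(t) + f(t)) / m,
            qdh + l11 * r,
            (-k * qh - d * qdh + u(t) + vh) / m + l12 * r,
            l2 * r,             # integral part: v_hat' = L2 (y - y_hat)
        ]

    def rk4_step(t, s):
        k1 = deriv(t, s)
        k2 = deriv(t + dt / 2, [a + dt / 2 * b for a, b in zip(s, k1)])
        k3 = deriv(t + dt / 2, [a + dt / 2 * b for a, b in zip(s, k2)])
        k4 = deriv(t + dt, [a + dt * b for a, b in zip(s, k3)])
        return [a + dt / 6 * (b + 2 * c + 2 * e + g)
                for a, b, c, e, g in zip(s, k1, k2, k3, k4)]

    state = [0.0] * 5
    times, f_true, f_est = [], [], []
    for i in range(int(round(t_end / dt)) + 1):
        t = i * dt
        times.append(t)
        f_true.append(f(t))
        f_est.append(state[4])
        state = rk4_step(t, state)
    return times, f_true, f_est


if __name__ == "__main__":
    times, f_true, f_est = simulate_pi_observer()
    print(f"t={times[-1]:.2f}  f={f_true[-1]:.3f}  f_hat={f_est[-1]:.3f}")
```

In this toy setting the estimate follows the disturbance after a short transient, with a small lag that shrinks as p is increased. Larger gains also amplify measurement noise, which is not modeled in this sketch.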

For the estimation of crack disturbances, it is necessary to emphasize the estimation residual as

$$\begin{aligned} \begin{bmatrix} \dot{e}\\ \dot{\hat{f}} \end{bmatrix} =A_e \begin{bmatrix} e\\ \hat{f} \end{bmatrix} - \begin{bmatrix} N\\ 0 \end{bmatrix}f \end{aligned}$$
(7)

where the estimation error e(t) is introduced as

$$\begin{aligned} e(t) = \hat{x}(t) - x(t). \end{aligned}$$
(8)

The analysis of the estimation residual can be used for the detection of faults and their localization. It is shown in [19, 31] that a suitable observer design in combination with large gains leads to acceptable estimates even if the approximation used in Eq. 3 is wrong.

3.1 Implementation, simulation, results, and discussion

As an example, the observer-based calculation of the additional flexibility (additional inverse stiffness) from measurements of the rotor displacement at the bearing positions is considered. This type of fault acts as a multiplicative error. From the simultaneous estimation of the fictitious crack force and the related displacement at the location of the crack, the diagnostic indicator "relative stiffness loss" can be determined as a causal indicator [30] showing the dynamical effect of the crack at its location. The rotation-induced "breathing" of the crack can be shown clearly (Fig. 8). The observer-based results obtained with the PI-observer method are shown in Figs. 6, 7, 8, and 9. In some cases the variables can be estimated very accurately, as shown in Fig. 6. In Figs. 6 and 7, the time behavior of the estimated node displacement has been used to reconstruct the crack-induced effect from the vibrational behavior. When simulated noise is applied (to mimic a real application), a partly strong influence on the reconstructed curves is observed (Fig. 9).

Fig. 6 Comparison of simulated and estimated vibration (left) as well as crack force (right)

Fig. 7 Comparison of simulated and estimated vibration (left) as well as crack force (right) with sensor noise

Fig. 8 Reconstructed relative compliance

Fig. 9 Reconstructed relative compliance considering additive sensor noise

4 Data-driven (signal-based) approach

Signal-based detection concepts are easy to apply (no model is needed) and represent a common technique for vibration monitoring systems in practice. The main advantage of this widely used family of techniques is its easy applicability. The main disadvantage is that drawing conclusions from measurable changes about the physical cause (diagnostic statements) requires detailed and specific knowledge of, or assumptions about, the physical behavior of the fault. On the other hand, modern machine learning methods are used to generate features representing different states of the rotor related to the existence of faults or changes, such as the dynamical effects resulting from "breathing" cracks.

The existence and growth of cracks and faults in vibrating systems like turbomachines can be observed implicitly by monitoring features generated from measurements of the system. Although the measurements are easy to use, the specific physical effects behind a change of signal properties are usually difficult to detect or classify directly, especially in the early stages of damage. This results mainly from the weakness of the effects and from disturbances or other effects acting on the measurements and the vibrations. However, reliable measurements supported by appropriate information extraction techniques can, even under these conditions, produce recognizable features and patterns that enable a reliable assignment of the physical causes, indicating the existence and size of cracks even in the early stages.

In particular, the diagnosis of failures is a complex task. Recent developments, however, permit the use of filtering techniques in combination with methods that do not have the limitations of classical threshold-based methods. Suitable filtering techniques for fault detection include the FFT, cepstrum, STFT, and wavelets, which produce features rich enough to describe complex characteristics of the vibrational state [7]. With the help of suitable pattern recognition and classification methods, the generated features can be learned and then used in the application to classify patterns, i.e., to assign new data to the learned patterns. As classification methods, neural networks (NN), support vector machines (SVM), and fuzzy-based methods can be used. As supervised learning methods, they use problem-specific data sets to form method-specific patterns that can differentiate the specified faults and machine health states of interest. These techniques are in general easy to apply and do not need the complex modeling task required by model-based approaches. The main disadvantage is that drawing conclusions from measurable changes about the physical cause (diagnostic statements) requires knowledge of, or assumptions about, the physical behavior of the fault.

Suitable machine learning methods are used to generate features that represent the different states of the rotor related to the existence of faults, changes, etc. As introduced, one important dynamical effect results from ’breathing’ cracks, an important but unusual fault. The feature extraction stage extracts suitable features and excludes redundant ones. The suitable features are additionally transformed into a representative form that eases the recognition process. The transformed features are then classified to detect the specified machine health states of interest.

In wavelet analysis [7], signals are compared with a set of template functions obtained by scaling and shifting a mother wavelet function. Wavelet-based approaches are widely used as feature extraction tools in classification and recognition tasks. Wavelets have proved more flexible than other common approaches because they provide a time–frequency analysis with adaptive time and frequency resolution at the same time, together with perfect reconstruction characteristics.

Vibration signals of rotating machinery are usually a mixture of periodic and transient components buried in broadband background noise. For applications in which noise is to be removed from a signal, a reliable alternative is the discrete wavelet transform (DWT), which is obtained by a dyadic discretization of the parameters of the continuous wavelet transform (CWT). This reduces the computational effort and yields a manageable number of generated parameters. These advantages make the DWT more appropriate for real-time applications than other approaches.
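As a minimal sketch of the dyadic decomposition, the following Python code implements a multilevel DWT for the Haar wavelet only. The function names are illustrative, and the Haar filter is chosen here for brevity; it is not claimed to be the wavelet or implementation used in this contribution.

```python
import math

def haar_dwt_step(signal):
    """One dyadic DWT level with the Haar wavelet (even-length input)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums (approximation coefficients).
    approx = [scale * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    # High-pass branch: scaled pairwise differences (detail coefficients).
    detail = [scale * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def haar_dwt(signal, levels):
    """Return [cA_n, cD_n, ..., cD_1] for an n-level decomposition."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]
```

Each level halves the number of coefficients, which is the reason for the reduced computational effort and the compact parameter set mentioned above.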

In general, the system, including any crack, is to be monitored continuously, so it becomes necessary to observe and isolate growing cracks. The routines used should work robustly, independent of changing operating conditions; damping effects may change as well. In many applications, parameters collected during the start-up of the rotor are used. It is also plausible that stationary signals provide a reliable source of system state information, because disturbing transient events are excluded.

The task of the diagnostic system includes the generation and processing of a feature set that is representative of the different machine states, and the reliable classification of the classes within the feature set. Combining reliable feature extraction and classification procedures enhances the individual capabilities of the two modules (Fig. 10).

Fig. 10 Classification modules

The application of SVM classification requires the selection of a suitable kernel function and of the parameters that adjust the behavior of the classifier. The radial basis function (RBF) kernel is a common first choice for a wide range of applications for many reasons [13]. With only one kernel parameter (\(\gamma \)) to be selected, it can handle nonlinear cases effectively and with few numerical difficulties. Additionally, the penalty parameter (C) of the SVM has to be assigned. Perfect separability of the training data is not necessary, as it can be a cause of over-fitting, which indicates deteriorated generalization of the model. In order to avoid over-fitting, cross-validation (CV) is used. The training set is divided into k equal parts. One part is held out for testing and the rest is used for training the classifier. This is repeated for every part, and the average gives an indication of the classification accuracy for different values of C and \(\gamma \). A multistage grid search (Fig. 11) then helps to find the combination with the best accuracy.
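The k-fold procedure and the coarse-to-fine grid search can be sketched as follows. This is an illustration under stated assumptions, not the implementation used here: the exponent ranges and step sizes of the grids, the interleaved fold assignment, and all function names are hypothetical, and `toy_cv_accuracy` is only a stand-in for an actual SVM training and cross-validation run.

```python
import math

def k_fold_indices(n_samples, k):
    """Split sample indices into k folds of (nearly) equal size."""
    return [list(range(i, n_samples, k)) for i in range(k)]

def cross_validate(n_samples, k, train_and_score):
    """Average accuracy over k folds.

    train_and_score(train_idx, test_idx) is a caller-supplied (hypothetical)
    function that trains a classifier and returns its test accuracy.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k

def exponent_grid(start, stop, step):
    """Inclusive list of exponents from start to stop."""
    count = int(round((stop - start) / step))
    return [start + i * step for i in range(count + 1)]

def grid_search(cv_accuracy, c_exponents, gamma_exponents):
    """Return (best accuracy, log2 C, log2 gamma) over the given grid."""
    best = None
    for c_exp in c_exponents:
        for g_exp in gamma_exponents:
            accuracy = cv_accuracy(2.0 ** c_exp, 2.0 ** g_exp)
            if best is None or accuracy > best[0]:
                best = (accuracy, c_exp, g_exp)
    return best

def two_stage_grid_search(cv_accuracy):
    """Coarse grid first, then a finer grid around the coarse optimum."""
    _, c_exp, g_exp = grid_search(
        cv_accuracy, exponent_grid(-5, 15, 2), exponent_grid(-15, 3, 2))
    return grid_search(
        cv_accuracy,
        exponent_grid(c_exp - 2, c_exp + 2, 0.25),
        exponent_grid(g_exp - 2, g_exp + 2, 0.25))

def toy_cv_accuracy(c, gamma):
    """Stand-in for a real CV run: smooth surface with a single maximum."""
    return 1.0 - 0.01 * ((math.log2(c) - 3.5) ** 2 + (math.log2(gamma) + 6.25) ** 2)
```

Calling `two_stage_grid_search(toy_cv_accuracy)` first locates the best coarse cell and then refines it; with a real classifier, `cv_accuracy` would wrap `cross_validate` for a given pair (C, \(\gamma \)).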

Fig. 11 Cross-validation and parameter combination for (a) coarse grid and (b) fine grid

An illustration of the classification module is presented in Fig. 12. The parameters \(C_\mathrm{opt}\) and \(\gamma _\mathrm{opt}\) represent the optimal configuration of the classifier parameters and are selected for the classification of the test data. In the case of multi-class SVM, the multi-class method must be considered together with the parameter selection strategy. As an example, there are two options for implementing the “one-against-one” method with respect to parameter selection. In the first, parameter selection is conducted for every pair of classes to obtain the best (C, \(\gamma \)) for that pair. In the second, for each (C, \(\gamma \)), cross-validation in combination with the “one-against-one” method is used to estimate the performance of the model, and a sequence of preselected (C, \(\gamma \)) is tried in order to select the best model [5]. With respect to the overall accuracy, using one parameter set for each individual decision function may lead to over-fitting.
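The voting scheme behind the “one-against-one” method can be sketched as follows. The function name and the `pairwise_decide` callback are hypothetical; in practice each pairwise decision would come from a trained binary SVM, whereas the example below uses a simple nearest-value rule as a stand-in.

```python
from collections import Counter
from itertools import combinations

def one_against_one_predict(classes, pairwise_decide, x):
    """Classify x by majority vote over all pairwise binary decisions.

    pairwise_decide(a, b, x) must return either a or b. For k classes,
    k * (k - 1) / 2 binary decisions are evaluated.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    winner = votes.most_common(1)[0][0]
    return winner, votes

# Stand-in for trained binary classifiers: pick the class value closer to x.
def nearest_value(a, b, x):
    return a if abs(x - a) <= abs(x - b) else b

label, votes = one_against_one_predict(
    [0.0, 0.0001, 0.0005, 0.001], nearest_value, 0.0006)
```

For the four classes considered here, six pairwise classifiers are needed, which is why the choice between one (C, \(\gamma \)) per pair and one shared (C, \(\gamma \)) matters for the parameter search.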

Fig. 12 Overview of the classification module

The generation of features using wavelets offers a variety of choices: the wavelet method, the mother wavelet, the level of decomposition, and the way in which the generated subcomponents are handled. The different approaches to feature generation using wavelets are illustrated in Fig. 13.

Fig. 13 Different alternative feature generation procedures

In order to improve the classification process, the feature set is manipulated for the following purposes:

  • Generation of further features

  • Improvement in the existing features

  • Combination of features

The information content of the detail and approximation levels of the DWT is a possible indicator of the system states and accordingly a source of new features. In order to improve the existing features, some features are eliminated because they contain few classification indicators. Eliminating the coefficients with low classification ability puts more focus on the data segments with higher CV accuracy. The proposed procedure for this concept is:

  1. The best CV for all coefficients is determined.

  2. One coefficient is eliminated and the CV is determined again.

  3. In case that the CV accuracy after elimination is at least the same as before, the CV is assigned as the new best CV and the coefficient is permanently eliminated. Otherwise, the coefficient remains in the feature set.

  4. Steps 2 and 3 are implemented for all the coefficients.
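The four steps above can be sketched as a backward elimination loop. The function name and the `cv_accuracy` callback are hypothetical: the callback is assumed to train and cross-validate a classifier on the given coefficient subset and return its accuracy, and coefficient labels are assumed to be unique.

```python
def eliminate_coefficients(coefficients, cv_accuracy):
    """Backward elimination of coefficients with low classification ability.

    cv_accuracy(kept) is a caller-supplied function returning the CV
    accuracy obtained with the list of kept coefficients.
    """
    kept = list(coefficients)
    best = cv_accuracy(kept)                        # step 1
    for candidate in list(coefficients):            # step 4: visit each once
        trial = [c for c in kept if c != candidate]  # step 2
        if not trial:
            continue  # never remove the last remaining coefficient
        accuracy = cv_accuracy(trial)
        if accuracy >= best:                        # step 3: keep the removal
            best, kept = accuracy, trial
    return kept, best

# Toy scoring rule for illustration only: d5 and d6 carry information,
# every other coefficient slightly degrades the accuracy.
def toy_accuracy(kept):
    useful = sum(1 for c in kept if c in ("d5", "d6"))
    useless = len(kept) - useful
    return 0.5 + 0.25 * useful - 0.01 * useless

kept, best = eliminate_coefficients(["d1", "d2", "d5", "d6"], toy_accuracy)
```

Because a coefficient is removed whenever the accuracy does not decrease, the result depends on the order in which the coefficients are visited.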

In order to combine features into new ones, statistical measures are used. As an example, the root-mean-square (RMS), a measure of the magnitude of varying quantities, is used to quantify the wavelet coefficients generated within the analyzing window.
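A minimal sketch of the RMS feature over an analyzing window is given below; the function names and the window and step arguments are illustrative, not values taken from this contribution.

```python
import math

def rms(values):
    """Root-mean-square of a non-empty sequence of coefficients."""
    if not values:
        raise ValueError("values must not be empty")
    return math.sqrt(sum(v * v for v in values) / len(values))

def windowed_rms(coefficients, window, step):
    """RMS of the coefficients inside each position of a sliding window."""
    last_start = len(coefficients) - window
    return [rms(coefficients[s:s + window]) for s in range(0, last_start + 1, step)]
```

For example, `windowed_rms([3.0, -3.0, 4.0, -4.0], 2, 2)` yields one RMS value per window position, so a long coefficient sequence is condensed into a short feature sequence.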

4.1 Implementation and discussion

The sensor data provided by the considered rotor consist of four time series of vibration acceleration signals (Lx, Ly, Rx, and Ry), taken from a stationary region at the two bearing ends (L and R) of the rotor in two independent coordinates, horizontal (x) and vertical (y). A time window of 1 s with a sample rate of 10 kHz is applied. A state set of measurements consists of 51 measurement signals with a size of 5000 points each. The rotational velocity of the rotor remains constant at 9000 rpm.

To apply the DWT, an analyzing window of suitable length is shifted across the data stream, generating the wavelet parameters at a suitable level of the wavelet analysis. In this contribution the SVM method is used; therefore, a model for the classification of test data is developed in a training phase. The corresponding transformation defines a dependency (mapping) between the indicating features and the system state using a separating hyperplane with maximum separation. The main advantage of the SVM is its generalization ability, which is related to the maximum margin criterion applied when selecting the separating hyperplane. Another advantage of the SVM is its robustness against signal-related outliers through the so-called penalty parameter, which allows the misclassification error to be controlled.

The applied dataset is divided into two data subsets. The first is a training dataset of four states: no-crack and three different small crack sizes (hr = 0.0001, 0.0005, and 0.001). The second data subset includes one state with a comparatively larger crack, which leads to chaotic behavior of the rotor. An enlarged example of the four signals is shown in Fig. 14 for stationary crack-free behavior and for the cracked rotor with \(hr=0.022\). The signals are understood as measurements taken from the system at those nodes which can be measured. In general, all signals shown result from simulations of the cracked and uncracked rotor.

Fig. 14 Signal sample with (a) no-crack stationary behavior and (b) crack chaotic behavior (\(hr = 0.022\))

The two data subsets are tested in order to investigate the influence of different artificial disturbances on the classification ability. Two kinds of disturbances are used. The signal \(S(t_i)\) is disturbed by a random noise as

$$\begin{aligned} S_\mathrm{Rnd}(t_i) = S(t_i)+\mathrm{Random}\,\{-10^{-5}...10^{-5}\} \end{aligned}$$
(9)

and by an amplitude-dependent disturbance as

$$\begin{aligned} S_\mathrm{Amp}(t_i) = S(t_i)(1+0.5\mathrm{Random}\,\{0...1\}). \end{aligned}$$
(10)
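The two disturbances of Eqs. (9) and (10) can be sketched as follows. It is assumed here that \(\mathrm{Random}\,\{a...b\}\) denotes a uniformly distributed random number between a and b, drawn independently for each sample; the function names and default arguments are illustrative.

```python
import random

def add_random_noise(signal, amplitude=1e-5, rng=random):
    """Eq. (9): additive noise, assumed uniform in [-amplitude, amplitude]."""
    return [s + rng.uniform(-amplitude, amplitude) for s in signal]

def add_amplitude_noise(signal, factor=0.5, rng=random):
    """Eq. (10): each sample is scaled by 1 + factor * u, with u in [0, 1)."""
    return [s * (1.0 + factor * rng.random()) for s in signal]
```

Passing a seeded generator, for example `rng=random.Random(0)`, makes the disturbed signals reproducible between training and test runs.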

An enlarged view of the influence of the disturbances on the signal Lx is shown in Fig. 15.

Fig. 15 Signal sample Lx with different types of noise

Cross-validation (CV) is used as a measure of the classification quality if no test dataset is available. The scanning window is adjusted to generate 51 measurements for each crack state, so that the CV gives a sufficient estimate of the classification ability.
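How a scanning window can produce 51 measurements is sketched below. The step size of 100 samples is an assumption: it is not stated in the text but is consistent with the stated figures, since a window of 5000 points shifted in steps of 100 over the 10,000 samples of a 1 s record at 10 kHz gives exactly 51 positions.

```python
def sliding_windows(signal, window, step):
    """Return all full-length windows obtained by shifting over the signal."""
    last_start = len(signal) - window
    return [signal[start:start + window] for start in range(0, last_start + 1, step)]

# 1 s at 10 kHz; the zero signal is only a placeholder for a measurement.
record = [0.0] * 10_000
measurements = sliding_windows(record, window=5000, step=100)
```

Neighboring windows obtained this way overlap strongly, which should be kept in mind when interpreting the CV accuracy.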

The four crack states are considered as four classes for training. The classification results are summarized in Table 1. The best classification ability (100 %) is achieved by the vertical measurements (y) in the case of random noise (Rnd.). The CV accuracy of the measurements on the right side of the rotor (Rx and Ry) is generally better, with levels 5 and 6 being the best scale levels of the DWT. These two levels are generally better also in the case of the amplitude-dependent disturbance (Amp.), for which the horizontal measurements (x) perform better.

Table 1 CV results for classification of four crack classes with different noises (\(\%\))

Furthermore, the three crack states are combined into one class, cracked rotor. By disregarding the differences between the crack sizes, the focus is put on size-independent classification. The classification results are summarized in Table 2. The same number of measurements as before is considered, i.e., 51 measurements for the class “no-crack” and 153 measurements for the class “crack.” Accordingly, the worst CV accuracy is 75 %, which corresponds to assigning all 204 measurements to the class “crack” (153/204). The results agree with those presented in Table 1. In addition to the approximation level of the vertical measurements (y), the detail levels 4–6 are the best candidates for classification. In conclusion, the classification between the states no-crack and small-sized crack is considered reliable.

Table 2 CV results for classification of two crack classes with different noises (\(\%\))

Additionally, a larger crack (\(hr=0.022\)) leading to chaotic behavior of the rotor is represented by 51 measurements in order to investigate the classification ability against the no-crack state. As a result, considering all the features together, a CV accuracy of 100 % is achieved independent of the disturbance. In order to investigate the robustness of the classification, the classifier is trained with the four classes mentioned in Table 1 and tested with the larger crack mentioned above. Individual features with a minimum CV accuracy of 70 % are considered. The allocation of the 51 test measurements to the four classes is given in Table 3. With the exception of the Rx and one of the Ry measurements with random noise, all other individual features classify the test measurements as cracked. This indicates that the parameters are reliable for classification. Inconsistently, however, the test measurements in the table are most frequently assigned to the 0.0005 crack class.

Table 3 Classification of the chaotic crack state in four crack classes

A signal sample covering 4 s of rotor run, including the start-up stage, is considered in order to further investigate the separability of the crack size states and the remaining service life. To apply the DWT, an analyzing window of suitable length is required, and the most appropriate mother wavelet has to be selected. Different mother wavelets are tested on the system at different levels and with different parameters in order to find the crack state indicators that best separate the states. The root-mean-square (RMS), as a measure of the magnitude of varying quantities, is used to quantify the wavelet parameters generated within the analyzing window. The tested mother wavelets include haar, dmey, sym, and db. The best results are obtained at decomposition level 6 using the discrete Meyer (dmey) mother wavelet (Fig. 16), which is characterized by the highest separability of the crack indicators and a homogeneous applicability independent of the position within the data. It should be noted that neither the choice of the mother wavelet nor the observation that the coefficients from decomposition level 6 give the best results can be generalized. This result is (as usual with these approaches) obtained by practical comparison.

Fig. 16 Meyer wavelet mother function

The resulting RMS measures and a related moving-average smoothing of the noisy signal Ry are shown in Fig. 17. The results are presented for four different crack size levels of the considered rotor. It can be seen that the separability of the crack levels is not affected by the non-stationary start-up of the rotor. It can also be seen that good results can be achieved with fewer measurements.
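The smoothing step can be sketched as a simple moving average over the sequence of RMS values. The function name and the window width are illustrative; the width actually used for the figure is not stated.

```python
def moving_average(values, width):
    """Unweighted moving average; returns len(values) - width + 1 points."""
    if width < 1 or width > len(values):
        raise ValueError("width must be between 1 and len(values)")
    running = sum(values[:width])
    smoothed = [running / width]
    for i in range(width, len(values)):
        # Slide the window by one: add the new value, drop the oldest.
        running += values[i] - values[i - width]
        smoothed.append(running / width)
    return smoothed
```

A wider window suppresses more of the fluctuation in the RMS sequence but also delays the response to a change in the crack state.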

Fig. 17 RMS and smoothing; Ry, dmey, level 6

For comparison, four alternative results are shown in Fig. 18. They are obtained from the same dataset as used for Fig. 17 but with less appropriate DWT parameters and mother wavelets. In the results presented in Fig. 18a, b, the high level of fluctuation and the non-homogeneous solution prevent the solution from being applicable. The other two wavelet functions presented in Fig. 18c, d result in RMS ranges of the cracks of 1.82e\(-\)6 and 1.83e\(-\)6, respectively, whereas the selected wavelet (Fig. 17) results in a higher RMS range (2.02e\(-\)6), which means better separability of the different crack sizes and accordingly more accurate results of the crack detection and evaluation system. For the selected wavelet (Fig. 17), Fig. 19 shows, in a window of 0.5 s at the end of the data, that the effect of changing the damping of the rotor system is minimal.

Fig. 18 RMS and smoothing of different combinations for comparison. (a) RMS and smoothing; Ly, dmey, 6; (b) RMS and smoothing; Ry, haar, 6; (c) RMS and smoothing; Ry, db4, 6; (d) RMS and smoothing; Ry, sym, 6

Fig. 19 Effect of damping; Ry, dmey, 6

The task of the SVM classification module is to evaluate the extracted features, which contain the indicators of the system state, in order to arrive at a statement about the existence and size of the cracks and faults to be detected. The fusion function of the SVM classifier combines the four filtered sensor signals provided by the rotor system into a more reliable, complementary sensor array.

The required evaluation of the extracted features can be provided in two ways: binary classification and multiclass classification. In binary classification, the classifier is trained with binary training data to distinguish two classes, here cracked and non-cracked rotor. In this case the required quantification of the crack size is calculated from the position and distance of the specified state relative to the separating hyperplane in the feature space. In multiclass classification, the training data are grouped according to crack size into segments that match the required scale and accuracy of the crack size. Furthermore, the results can be fine-tuned using the distance of the state from the corresponding separating hyperplane in the feature space.

The training result of the multiclass classification for the rotor implementation described above is given in a two-dimensional view in Fig. 20. The class borders of the crack size classes can be clearly detected. It can also be seen that the data resemble a one-directional trajectory in the direction of increasing crack size.

Fig. 20 The SVM feature space

The decision function of the binary classification for the rotor implementation is presented in Fig. 21. The decision border in the figure represents the separating hyperplane between the two classes of the training data: cracked and non-cracked rotor. The four training datasets of the four different crack sizes shown in Fig. 20 appear in Fig. 21 at different distances from the decision border. If more data samples representing all the crack sizes in between were available, the decision function would resemble a one-directional trajectory in the direction of increasing crack size. In this case, the size of the crack can be defined implicitly by the distance from the decision border.
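The quantity read off as the distance from the decision border can be sketched as the signed decision value of a trained RBF-kernel SVM, which is proportional to the distance from the separating hyperplane in the feature space. The support vectors, dual coefficients, bias, and \(\gamma \) below are hypothetical placeholders; in practice they result from training.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision_value(x, support_vectors, dual_coefs, bias, gamma):
    """Signed decision value f(x) = sum_i c_i * K(x_i, x) + b.

    The sign gives the class (here: positive for cracked); the magnitude
    grows with the distance from the decision border in feature space.
    """
    return sum(c * rbf_kernel(x, sv, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

# Hypothetical one-dimensional model, for illustration only.
svs = [(0.0,), (1.0,)]
coefs = [-1.0, 1.0]   # each coefficient is alpha_i times the label y_i
f_low = decision_value((0.1,), svs, coefs, bias=0.0, gamma=1.0)
f_mid = decision_value((0.75,), svs, coefs, bias=0.0, gamma=1.0)
f_high = decision_value((1.0,), svs, coefs, bias=0.0, gamma=1.0)
```

With an RBF kernel the decision value is only a monotonic indicator within the region covered by the training data, which is consistent with the remark that samples for the intermediate crack sizes are needed.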

Fig. 21 The SVM decision function

For remaining-life prognosis, the decision function can be extrapolated until it reaches a pretrained crack safety limit, as shown in [25].
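The idea of extrapolating the decision function to a safety limit can be illustrated with a straight-line fit. This is not the procedure of [25], which is not reproduced here; a linear trend of the decision value over time is assumed purely for illustration, and the function name and sample values are hypothetical.

```python
def time_to_limit(times, values, limit):
    """Time at which a least-squares line through (times, values) hits limit.

    Returns None if the fitted trend is not increasing.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    if slope <= 0.0:
        return None
    intercept = mean_v - slope * mean_t
    return (limit - intercept) / slope

# Hypothetical decision values observed at four inspection times.
t_limit = time_to_limit([0.0, 1.0, 2.0, 3.0], [0.1, 0.2, 0.3, 0.4], limit=1.0)
```

The remaining life estimate is then the difference between the returned time and the time of the last observation.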

The previous discussion indicates that binary classification may be more suitable for crack size quantification. This is concluded from the one-directional trajectory of the decision function in the direction of increasing crack size. This conclusion is supported by the fact that SVM classification is originally, and more efficiently, a binary classification [1].

5 General discussion of the approaches to be compared

Many publications present successful applications of methods representing the two mentioned approaches to monitoring system design. It is nevertheless quite difficult to estimate how successful one approach is compared to the other, for several reasons. In the literature, the successfully implemented algorithms have generally been limited to comparatively basic faults. Very few publications have dealt with the realistic behavior of faults and cracks, which might even appear simultaneously. Additionally, most of the published systems have been implemented and validated on laboratory structures, which can differ substantially from real-world industrial systems. The lack of benchmark measurements and assessment criteria makes the comparison even harder. However, some general inferences can be drawn from the study of both approaches.

The main advantage of model-based approaches is that the usually available, very detailed, physically oriented understanding of the fault and crack effects acting on the system is preserved and can be used by the approach to compensate the measurements. The typical, well-known interpretation problem of signal-based approaches is avoided by using problem-related indicators such as the introduced stiffness change in rotating coordinates, which shows a measurement-based reconstruction of the physical breathing of the cracked rotor. The introduced approaches also allow the implicit determination of the crack depth if the measurements are noise-free. One drawback of this kind of approach is the necessity of fault models, as well as of assumed hypotheses about the location of the fault or crack. In general, the success of all methods of analytical redundancy is essentially determined by the quality of the model. Modeling faults inevitably leads to errors, which, together with real effects of the rotor dynamics that may be present, can mean that the corresponding faults cannot be distinguished in principle. Indeed, successful rotor modeling should also consider the dynamics of the related foundation, which is much more complicated.

Modern signal-based approaches, on the other hand, are based on robust machine learning and feature extraction techniques, which make the detection more reliable even for small cracks. The methods used are more robust against disturbances and noise. This advantage makes them more suitable for realistic industrial conditions. Additionally, the simplicity of the signal-based techniques makes them more appropriate for real-time systems.

Considering the targets of the detection systems, more reliable fault localization could be achieved by model-based systems, although multiple-fault localization requires a multiple-model system with location hypotheses. On the other hand, fault severity estimation could be more effective with the modern feature extraction and classification techniques adopted by signal-based systems.

Sensor and decision fusion is one possible way to combine the two main approaches [2]. An alternative is to use a model-based system to recognize changing service conditions and operating setpoints in order to adjust the adaptable feature extraction and classification of a concurrent signal-based system.

6 Short summary and conclusion

The two main approaches used for crack detection and prediction in rotating machinery, model-based and signal-based, are investigated using one typical example. Several strengths and weaknesses of the two approaches are discussed and compared using two representative, applicable methods in order to give a comparative overview of the available techniques.

Besides an observer-based solution predicting crack-depth-related information, a new signal-based/data-driven approach is introduced to improve detection with respect to noise.

The PI-observer-based method is considered as a modern model-based technique to indicate the possibilities and limitations of this kind of method. A novel signal-based approach is introduced, based on SVM and wavelets as modern machine learning techniques.

Modern machine learning techniques are found to be more robust against disturbances and noise, whereas the model-based techniques adapt better to load changes of the system and are easier to connect with the system physics and modeling parameters.