1 Introduction

Industry 4.0 is bolstered by the benefits of self-configuration, self-optimization, early awareness, decision making, and predictive maintenance capabilities (Qin et al. 2016). In order to accelerate the transition from the conventional manufacturing systems to one that caters to the contemporary industrial revolution, the cyber-physical system (CPS) (Mosterman and Zander 2016) has been integrated into Industry 4.0. In CPS, the physical machine working environment is virtually modeled by incorporating sensors in the machine components. These technical advancements significantly contributed to the emergence of the discipline known as prognostics and health management (PHM) as an indispensable arm of Industry 4.0 (Lee et al. 2018). Failure detection and predictive maintenance (FDPM) is a vital component of PHM in that it ensures benefits such as optimum cost, safety, availability, and reliability by preventing the chances of catastrophic failures, serious accidents, and unexpected shutdowns of the whole system. Rotating machinery constitutes approximately 40% of all machinery and is more prone to both deterioration and failures in the mechanical and electromechanical systems (Chen 2009). At the component level, the rotating machinery is mainly divided into three parts: bearings, gears, and rotors (Chen et al. 2018). While there is a plethora of literature related to bearing and gear fault identification, comparatively fewer works address the faults affecting the rotor system (Devendiran and Manivannan 2016; Wang et al. 2017; Zhang et al. 2019b).

In the early days, the rotor fault diagnosis (RFD) research was categorized into two main streams: (i) model-based diagnosis and (ii) signal-processing-based diagnosis. In terms of the former, the physical quantities and features of the monitored system, such as stiffness, mass, and damping-matrices, are modeled using a mathematical model coupled with the facts of physics (Pennacchi et al. 2006). The modeling could be performed either at the system-level or at the component-level. However, researchers have encountered certain shortcomings with this method. For one, it has proven to be unfeasible as the system becomes more complicated with more sophisticated machinery and advanced technologies. Moreover, this approach is insufficient for updating real-time processes with newly read data. More research studies on the model-based identification of rotor faults can be found in the literature review from Bachschmid et al. (2002). Meanwhile, techniques based on signal processing (Ricci and Pennacchi 2011) rely on extracting fault-specific characteristic features from the acquired signal and coupling them with the statistical model of the suspected fault. Fault diagnosis using pure signal-processing-based methods also has certain limitations, much like the model-based methods. Nonetheless, the characteristic features identified by the signal processing methods remained the bedrock of fault diagnosis and the developed decision-making algorithms for some time (Li et al. 2015a; Rai and Mohanty 2007; Cui et al. 2017; Chen et al. 2018).

More recently, the motto of ‘right information and data at the right time for decision making’ became the driving force of AI-based data-driven FDPM (Oztemel and Gursev 2020; Angelopoulos et al. 2020) within the context of Industry 4.0. Under the umbrella of AI, there are two main ideologies, namely, shallow learning (SL)—which is popularly referred to as machine learning (ML)—and deep learning (DL). The availability of low-cost sensors and big data has enhanced the data-driven decision-making philosophy. The widespread use of ML and the rapid emergence of DL algorithms have ensured their popularity in the field of rotating machinery fault diagnosis (Liu et al. 2018; Zhao et al. 2019). Moreover, the robust and adaptable nature of these algorithms (Liu et al. 2018) has ensured that they are highly regarded among the research community. Unfortunately, the majority of the research related to AI-based fault diagnosis of rotating machinery only focuses on the bearing or gear faults (Wei et al. 2019; Chen et al. 2018; Liu et al. 2018), with little attention paid to RFD. As such, the literature appears somewhat fragmentary, falling short of providing opportunities for exploiting the fault-specific characteristics of RFD and utilizing them in the feature engineering phase of AI to produce significant research improvements within the domain. Furthermore, the analysis of rotor faults is paramount since they not only have an unmediated, often disastrous impact on the performance and structural attributes of the affected equipment but may also cause secondary faults in the surrounding equipment such as the bearings and gears (Xue et al. 2013).

Within this context, this paper endeavors to study and review the recent developments in ML and DL approaches in terms of RFD, as well as to gain a thorough understanding of the characteristics and features of rotor faults.

The significance of this review on RFD can be summarized as follows:

  1. Most importantly, this review attempts to instill in the relevant researchers a research practice aimed at utilizing the fault characteristics and the prior knowledge of the failure in automated diagnosis, rather than merely adopting data-driven AI approaches.

  2. The majority of the studies on rotating machinery fault diagnosis focus on component-wise analysis rather than fault-wise analysis, thus leaving loopholes in utilizing the fault characteristics and the associated features (Xia et al. 2012). In view of this, this study pursues an appropriate fault-wise categorization throughout the review process to characterize the error as per the suspected fault.

  3. The existing studies on rotor faults make little reference to the fault characteristics, the correlation between the faults, and the fault simulation process. This work thus attempts to provide a comprehensive reference point for RFD researchers, elaborating on rotor faults, including the practical fault simulation in testbeds.

  4. While structural rotor faults (SRF) are common and straightforward faults in rotating machinery, they receive little attention from the research community compared to bearing or gear faults. This paper thus ventures to draw more AI research attention toward SRF and the attendant properties.

  5. Rotor faults, specifically SRFs, are characterized by spectral changes in the rotating frequency and the harmonic frequencies in the vibration spectrum (see Table 1), and the research that utilizes these fault-specific distinctive frequency components (DFC) with AI is thus emphasized in this literature review.

  6. The studies on RFD promote the development of AI-based tools and techniques to ensure the ‘integrity of the rotating machinery equipment,’ which is an inevitable element of industrial safety and reliability.

  7. The review traces the trajectory of fault diagnosis from traditional ML to advanced DL data-driven approaches along with the attendant advantages and disadvantages as well as a comparative analysis.

Table 1 Description of rotor faults

1.1 Rotating machinery fault categorization

Fig. 1 Categorization of rotating machinery faults

As noted above, the appropriate categorization of rotating machinery faults is crucial to going beyond component-wise analysis and enhancing the fault-characteristics-based analysis for RFD. An example of fault categorization is shown in Fig. 1. Here, the rotor faults are separated from the bearing and gear faults in the first step before the rotor faults are categorized from a ‘cause of vibration’ perspective. Accordingly, SRF, or 1x fault, is the primary cause of vibration and the shaft-related faults are the secondary cause of vibration, with these two groups constituting the first two categories. Meanwhile, the broken rotor bar (BRB) fault has also been included in the fault category list. The most common and critical rotor faults, such as unbalance (UB), misalignment (MA), and looseness (LS), can be categorized under SRF (Chen 2009). A bent shaft (BS), a shaft crack (SC), a rub impact fault (RIF), and corrosion and wear (Cr&Wr) are the faults that affect the shafts. They are frequently associated with SRF and are considered to fall within the shaft fault category. In a practical scenario, it cannot be assumed that only a single fault occurs in the rotor at any given time. Hence, we incorporated compound fault (CF) in the fault grouping process to represent the combination of multiple faults. Meanwhile, any faults other than rotor faults that are highlighted in the existing literature are considered to fall under the ‘other faults’ (OF) category in the remainder of this paper.

Rotor unbalance is induced by the uneven distribution of mass in the rotor. It causes the inertia axis of the rotor to become misaligned with the geometric axis, which induces vibration in the rotor (MacCamhaoil 2016). Misalignment is caused by the improper alignment of the couplings, shafts, and bearings, the thermal distortion of the bearing-housing supports, and asymmetry in the applied load. This results in the bearings carrying a load higher than they are designed for (Bognatz 1995). Looseness can be caused by improper assembly or the long-term running of machinery (Wu et al. 2010), and the existing literature is replete with research on bearing pedestal looseness. Bearing-related looseness can create an effect similar to that of an unbalance, while component-based looseness may result in secondary damage and detachments. It is noteworthy that around 40% of rotor-related problems can be attributed to unbalance, 30% to misalignment, 20% to resonance, and the remaining 10% to other reasons (Fahy and Thompson 2016). Contact between the rotor and the stationary parts of the machine under tighter clearances results in the rub fault. Shaft cracks develop under severe thermal and mechanical stresses, which means the shaft cannot withstand the forces generated during normal operation (Patel and Darpe 2009b). A broken rotor bar fault (MA Cruz 2000) commonly occurs in induction motor (IM) rotors, triggering an uneven current flow through the rotor and creating both thermal and bending issues with the rotor. Electrochemical reactions due to environmental factors exacerbate the corrosion-based faults on the shaft’s surface (Bonnett 2000). Section 3 provides a detailed description of every fault considered in this review.

The remainder of the paper is organized as follows. Section 2 provides an overview of the state-of-the-art techniques, methods, and algorithms used in different phases of an ideal data-driven AI-based RFD framework. Section 3 then examines the significant rotor faults along with their vibrational frequency and phase characteristics, the causes and effects, and the associated faults, while an explanation of the practical aspects and a rotor testbed-based implementation are presented at the end of the section. The research frontiers of ML- and DL-based RFD are analyzed thoroughly in Sects. 4 and 5, respectively. Here, the aim is to highlight the works that demonstrate the fault-specific feature processing, the governing data-related issues, and other RFD challenges. Section 6 then summarises the trends and challenges of each phase and provides recommendations for an insightful selection of the techniques and methods for RFD problems. Future research directions are provided in Sect. 7 with the main aim of bridging the gap between laboratory solutions and industry-acceptable solutions. Finally, Sect. 8 presents the concluding remarks.

2 AI-based RFD framework

An ideal AI-based framework for RFD comprises three main phases: data acquisition, feature processing, and classification, as illustrated in Fig. 2. The ML part follows the above-stated phases sequentially, while DL excludes the feature processing phase due to its inbuilt feature learning capability. Nonetheless, there exist certain DL models such as belief generative models that prefer signal processed data to raw data. The methods and techniques adopted in the different phases of the RFD process under this framework are demonstrated in this section.

2.1 Data acquisition

The data acquisition phase of RFD heavily depends on the data source, the type of acquired signal, and the way the data is saved or presented. Hence, we divided this phase into three parts: data source, sensing method, and input representation. The commonly available datasets for rotating machinery fault diagnosis, such as the Case Western Reserve University (CWRU) dataset, the Paderborn University dataset, the PRONOSTIA dataset, and the Intelligent Maintenance Systems (IMS) dataset, have been devised for bearing-fault diagnosis and are thus not particularly relevant to RFD, especially to SRF. Supervisory control and data acquisition (SCADA) data (Zhang and Wang 2014) for wind turbines (WT) has been established as a useful data source that addresses certain fault conditions of the turbine rotor. The induction motor is another acceptable data source from which voltage and current data can be acquired and monitored for broken rotor bar fault diagnosis, although it is mostly unsuitable for detecting other rotor faults. The data collected from real-world industrial scenarios is highly imbalanced, making it challenging to produce generalized solutions for RFD. The sensible and ubiquitous method adopted by the research community for collecting rotor fault data is the rotor testbed (RTB) method. This requires employing dedicated processing software/hardware along with a computer to ensure the data collection process is smooth. A typical testbed setup with simulated faulty rotor conditions is described in the rotor fault overview section (Sect. 3). Another, less popular option is the ‘rotating machinery library’ (RML) method demonstrated by Ishibashi et al. (2017), in which faulty rotor data is artificially created by applying rotor dynamics theory using efficient programming languages.
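To give a feel for what creating fault data from rotor dynamics theory can look like, the following is a minimal sketch based on the textbook Jeffcott rotor, not the RML of Ishibashi et al. (2017). It uses the standard steady-state unbalance response, amplitude = e r^2 / sqrt((1 - r^2)^2 + (2 zeta r)^2), where e is the mass eccentricity, r the ratio of running speed to natural frequency, and zeta the damping ratio. The function names, the parameter values, and the omission of the phase lag are all assumptions made for illustration.

```python
import math


def unbalance_response(eccentricity, speed_ratio, damping_ratio):
    """Steady-state whirl amplitude of a Jeffcott rotor with mass eccentricity.

    Undefined for zero damping exactly at resonance (speed_ratio == 1).
    """
    r2 = speed_ratio ** 2
    denom = math.sqrt((1.0 - r2) ** 2 + (2.0 * damping_ratio * speed_ratio) ** 2)
    return eccentricity * r2 / denom


def synthetic_unbalance_signal(eccentricity, speed_hz, natural_hz,
                               damping_ratio, fs, n):
    """Generate n samples of a pure 1x vibration signal (phase lag ignored)."""
    amp = unbalance_response(eccentricity, speed_hz / natural_hz, damping_ratio)
    return [amp * math.cos(2.0 * math.pi * speed_hz * i / fs) for i in range(n)]


# Hypothetical values: 0.1 mm eccentricity, 25 Hz running speed,
# 50 Hz natural frequency, 5% damping, 1 kHz sampling.
resonant_amp = unbalance_response(1e-4, 1.0, 0.05)
signal = synthetic_unbalance_signal(1e-4, 25.0, 50.0, 0.05, 1000.0, 200)
```

At resonance the expression reduces to e / (2 zeta), which is a quick sanity check on the implementation. A realistic simulator would add further harmonics, noise, and fault-specific terms.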

Fig. 2 AI-based RFD framework

Rotor-related faults affect the nature and behavior of the vibration (Hariharan and Srinivasan 2009), and vibration has thus become the most widely used signal sensing method for RFD. The subharmonic and superharmonic frequency components of non-linear and complicated vibration motion characterize different rotor faults in distinct terms, which makes it an option of preference for RFD researchers. A rotor testbed used for acquiring the vibration data with the provisions for rotor fault simulation is shown in Fig. 3. Furthermore, the vibration-related studies have proved that SRF is the primary cause of vibration, while other shaft faults, such as rub and crack, are secondary phenomena causing the rotor to exhibit excessive vibration (Patel and Darpe 2009c). While vibration signals carry the system’s dynamic information, the signals are sensitive to noise as well as the sensor mount positions. In such a scenario, alternative sensing methods such as voltage and current (V&C) (Fang and Ma 2006) have been used for data acquisition, especially in terms of induction motor rotor fault analysis. Meanwhile, a number of researchers have focused on dealing with a variety of other sensing methods, including acoustic emission (AE) (Li et al. 2004a), sound (Saimurugan and Ramachandran 2014), and temperature (Langarica et al. 2019b). Here, certain works used combined signals, which are formed either by simply combining signals or by applying a fusion of two or more types of signal (Li et al. 2004a; Timusk et al. 2008).

Fig. 3 A vibration data acquisition testbed (Courtesy: Meggitt India Pvt. Ltd.)

As signals from various sensors are applied to a wide variety of processing models undergoing enhancement on an almost daily basis, it is crucial to know how the input is being represented. For example, vibration signals are usually represented in a univariate or multivariate time-series. However, when applying them to a convolutional neural network (CNN) model, the preferable representation is a 2D image format. The methods used to accomplish such transformations can affect the performance of the whole framework, often to a great extent.
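One simple way to turn a univariate vibration series into a 2D input is to reshape a fixed-length segment into a square grid of grey levels. The sketch below illustrates that idea only; the reviewed works use a range of transformations (for example, time-frequency maps), and the function name and the 0 to 255 scaling are assumptions.

```python
import math


def signal_to_image(signal, size):
    """Reshape the first size*size samples into a size x size grid of 0..255 values."""
    n = size * size
    if len(signal) < n:
        raise ValueError("signal too short for requested image size")
    segment = signal[:n]
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant segment
    pixels = [round(255 * (x - lo) / span) for x in segment]
    # Each consecutive run of `size` samples becomes one image row.
    return [pixels[r * size:(r + 1) * size] for r in range(size)]


# Hypothetical 64-sample sine segment mapped to an 8 x 8 image.
samples = [math.sin(2.0 * math.pi * 5 * t / 64) for t in range(64)]
image = signal_to_image(samples, 8)
```

Because each row holds consecutive samples, the choice of `size` relative to the rotation period determines which periodic patterns become visible as texture, which is one reason the representation step can affect the downstream model.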

2.2 Feature processing

The next phase is feature processing, in which the information buried in a raw signal is extracted by suppressing the noise, identifying the fault-specific features, and presenting only the necessary features to the succeeding phase. Hence, feature processing draws on a variety of signal processing techniques, and it is performed in two steps, known as feature extraction and feature selection. The methods followed for feature extraction are primarily categorized into the time-domain (TD), frequency-domain (FD), and time-frequency-domain (TFD) techniques. Here, the statistical features (Vyas and Satishkumar 2001), such as mean, variance, root mean square, skewness, or kurtosis, are commonly used in the TD feature extraction process. In the FD, fast Fourier transform (FFT) (Tajik et al. 2015a), discrete Fourier transform (DFT), power spectrum analysis, autoregressive (AR) model (Yuan and Chu 2006), eigenvector, envelope analysis, and Welch’s method are the frequently applied approaches for feature extraction. The most popular methods, which include short-time Fourier transform (STFT) (Walker et al. 2014), empirical mode decomposition (EMD), and wavelet packet decomposition (WPD) (Bin et al. 2012), fall under the TFD analysis, which also includes methods such as Hilbert-Huang transform (HHT), Hilbert transform (HT) (Konar et al. 2015), Wigner-Ville distribution (WVD) (Li et al. 2011), and wavelet transform (WT) (Younus and Yang 2012). Due to their exceptional performance, EMD and HHT have been the popular choices among these methods, particularly for non-linear and non-stationary signal processing. Here, it should be noted that rather than depending on these methods directly to extract feature parameters, it is better to utilize them in fault-specific symptom parameter extraction. Xue et al. (2013) made one such attempt in utilizing DFCs (1x to 5x of rotation frequency) to identify the SRFs.
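The two kinds of features mentioned above can be sketched in a few lines. The first function computes common TD statistics; the second evaluates the DFT directly at the 1x to 5x rotation-frequency orders, which is one plausible way to obtain DFC amplitudes and is not claimed to be the procedure of Xue et al. (2013). Function names, the population (1/n) normalization, and the test signal are assumptions.

```python
import cmath
import math


def time_domain_features(x):
    """Mean, variance, RMS, skewness, and (non-excess) kurtosis of a signal."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(var)
    rms = math.sqrt(sum(v * v for v in x) / n)
    skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3) if std else 0.0
    kurt = sum((v - mean) ** 4 for v in x) / (n * std ** 4) if std else 0.0
    return {"mean": mean, "variance": var, "rms": rms,
            "skewness": skew, "kurtosis": kurt}


def dfc_amplitudes(x, fs, f_rot, orders=(1, 2, 3, 4, 5)):
    """Single-sided amplitude at each multiple of the rotation frequency."""
    n = len(x)
    amps = {}
    for k in orders:
        f = k * f_rot
        # Direct DFT evaluated at one frequency (no FFT bin rounding).
        acc = sum(v * cmath.exp(-2j * math.pi * f * i / fs)
                  for i, v in enumerate(x))
        amps[k] = 2.0 * abs(acc) / n
    return amps


# Hypothetical signal: 1x at amplitude 1.0 plus 2x at amplitude 0.5,
# 25 Hz rotation, 1 kHz sampling, exactly 1 s of data.
fs, f_rot, n = 1000.0, 25.0, 1000
signal = [math.sin(2 * math.pi * f_rot * i / fs)
          + 0.5 * math.sin(2 * math.pi * 2 * f_rot * i / fs) for i in range(n)]
features = time_domain_features(signal)
dfc = dfc_amplitudes(signal, fs, f_rot)
```

The amplitudes are exact here because the record contains a whole number of shaft revolutions; with real data, spectral leakage and speed fluctuation would call for windowing or order tracking.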

The next operation is the feature selection, in which the prominent features are selected either by generating new features or by eliminating non-relevant features from the existing feature-set. The popular dimensionality reduction techniques, which include principal component analysis (PCA) (Uddin et al. 2014), linear discriminant analysis (LDA) (Tajik et al. 2015b), and independent component analysis (ICA) (Saimurugan and Ramachandran 2014), fall within the first category, while rough set theory (RST) (Konar et al. 2015), genetic algorithm (GA) (Konar et al. 2015) and sequential selection (SS) (Guyon and Elisseeff 2003) are used in the second category methods. There are no significant observations in terms of feature selection within the context of RFD, as feature extraction plays a key role in utilizing the fault-specific characteristics.
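As an illustration of the first category (generating new features), the sketch below projects a feature matrix onto its first principal component. For brevity it finds that component by power iteration on the covariance matrix rather than a full eigendecomposition, which is a simplification chosen here and not a method taken from the cited works. The function name and the toy data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return the dominant PCA direction and the projected scores."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration; assumes the start vector is not orthogonal
    # to the dominant eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            break
        v = [t / norm for t in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy feature matrix whose second feature is exactly twice the first.
rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, scores = first_principal_component(rows)
```

Here two correlated features collapse onto a single new feature (the score), which is the sense in which PCA reduces dimensionality by generating features rather than discarding them.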

2.3 Classification

In the final phase of the AI-based framework, the classification/prediction is performed by employing ML- or DL-based models. Each of these models is further divided into subcategories based on the underlying principle, architecture, type of input, or design philosophy. It is evident from the framework that ML-based models utilize the features supplied from the previous phase, while DL-based models accept the raw data as input. The classical shallow learning ML algorithms necessitate the feature processing phase, which demands extensive domain expertise and time. The most widely applied ML classifiers for RFD are the support vector machine (SVM) model (Fengqi and Meng 2006), the artificial neural network (ANN) model (Roemer et al. 1995), and a number of their variants. Researchers have experimented with different kernels in SVM, while various learning algorithms and activation functions have been tried for ANN in conjunction with all the popular feature extraction methods. Under the instance-based category, the k-nearest neighbor (k-NN) algorithm (Chen et al. 2011) is popularly used in RFD. Probability-based Bayesian methods, including naïve Bayes (NB) (Yusuf et al. 2013) and the Bayesian belief network (BBN) (Xu 2012), have also been demonstrated in relation to RFD. Non-parametric methods, which include the decision tree (DT) (Nguyen et al. 2008) and random forest (RF) (Quiroz et al. 2018) algorithms, as well as simple classifiers such as logistic regression (LR) (Quiroz et al. 2018) and linear discriminant analysis (LDA) (Glowacz 2018), also feature within the literature. A number of researchers have also introduced the AdaBoost (AB) algorithm (Martin-Diaz et al. 2018) and other ensemble classification algorithms (Niu et al. 2008) by combining the hypotheses of already proven models. There have also been individual attempts (Niu et al. 2007) to perform a comparative analysis of the above algorithms using the same dataset and signal pre-processing methods.
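To make the instance-based idea concrete, the following is a minimal k-NN sketch operating on two hand-made features (1x and 2x amplitudes). The training values and labels are invented for illustration and do not come from any dataset in the reviewed literature; the function name is likewise hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training samples."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Invented toy features: [1x amplitude, 2x amplitude].
train_x = [[1.0, 0.05], [0.9, 0.10], [1.1, 0.08],
           [0.5, 0.80], [0.6, 0.90], [0.4, 0.70]]
train_y = ["unbalance", "unbalance", "unbalance",
           "misalignment", "misalignment", "misalignment"]

pred_a = knn_predict(train_x, train_y, [1.0, 0.1])
pred_b = knn_predict(train_x, train_y, [0.5, 0.75])
```

Because k-NN relies entirely on distances in feature space, its usefulness for RFD depends on how well the extracted features separate the fault classes, which ties it closely to the feature processing phase.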

The conventional SL algorithms are constrained in their ability to learn the non-linear relation of features. The need for extensive computation, the required time (especially for the feature processing), and the demand for specialized expertise in the domain are other limiting factors. By using multiple-layer deep architectures, DL methods can learn increasingly advanced levels of representation of the input data as they go deeper, which enables them to identify more complex features on their own. CNNs (Janssens et al. 2016) have attracted more attention than any other DL model in RFD since they have proven to be as effective with temporal data as with images, leaving the user with the responsibility of providing appropriate input representation. The autoencoder (AE)-based models, such as stacked AE (SAE) (Lei et al. 2016a) and stacked denoising AE (SDAE) (Zhao et al. 2018), also find their place in RFD. Deep belief networks (DBN) (Oh et al. 2016), which are generative hybrid graphical models composed of multiple restricted Boltzmann machines (RBM) or AEs, are also widely used in this context. The sequential DL models, well known for their ability to deal with temporal data, such as the recurrent neural network (RNN) and its variants, known as long short-term memory (LSTM) (Lei et al. 2019) and gated recurrent unit (GRU) (Liao et al. 2019), have also been explored in relation to RFD. Meanwhile, the deep generative model known as the generative adversarial network (GAN) (Lee et al. 2017) has served as a data upsampler in certain RFD works. The details of commonly used ML and DL algorithms are provided in Tables 2 and 4, respectively, while a detailed review of the literature related to these models is provided in Sects. 4 and 5.

3 Theoretical background of rotor faults

As noted above, an array of generic fault categories affecting the rotor, including SRFs, shaft faults, and broken rotor bar faults, are grouped together under the category of rotor faults in this paper. As outlined in the introduction, these three categories are divided into subcategories based on the nature and cause of the faults (Bate 1987; Mais 2002; Alsalaet 2012).

3.1 Structural rotor faults

These faults are also known as 1x faults and are considered as the primary cause of the abnormal vibration in the rotor. The three faults that fall within this category are misalignment, unbalance, and looseness.

3.1.1 Misalignment

The scenario where bearings, shafts, and couplings are not properly aligned along their centrelines is termed misalignment (Alsalaet 2012). As machines operate, heating and cooling result in component expansion and contraction, respectively, the net result of which is the misalignment of the machinery components. Continuous operation with an uneven foundation, a shift in the foundation, or the improper alignment introduced by imparted forces from other components can also lead to misalignment (Mais 2002). In addition, there is a high chance of installation misalignment in the system if the couplings are not properly set up. Meanwhile, faults such as a bent shaft and improper bearing seats also cause a certain type of misalignment with effects and symptoms similar to those of normal misalignment (Alsalaet 2012). Both misalignment and unbalance result in the bearings having to bear a higher dynamic load than they are designed for, and ultimately lead to failure due to early fatigue.

In terms of the couplings, misalignment creates excessive heat and friction, and thereby damages the component, while in terms of the horizontal shafts, any misalignment causes excessive vibration on both the vertical and axial planes. Finally, any misalignment in the overhung horizontal shaft is characterized by excessive vibration in the horizontal and axial planes. Axial vibration is a salient component of both these scenarios. In the case of the vertically aligned shaft, excessive horizontal and axial vibration indicates misalignment (Mais 2002).

There are three types of misalignment: parallel, angular, and combined parallel and angular misalignment (Nakhaeinejad and Ganeriwala 2009).

  • (i) Parallel misalignment: In parallel misalignment, the centerlines of shafts joined by the coupling are parallel, but they will be at an offset. It is characterized by strong radial vibration.

  • (ii) Angular misalignment: Angular misalignment occurs when the joint at the coupling induces a bending force on the shaft. In such cases, the centerlines of the joined shafts cross at an angle, which causes strong vibration in the axial direction.

  • (iii) Parallel and Angular misalignment: It is the combination of both parallel and angular misalignments, and hence produces vibration in both axial and radial directions.

Fault characteristics analysis: Misalignment can be analyzed by comparing the ratio between 1x (unbalance indicator) and 2x (misalignment indicator) components. In normal misalignment scenarios, 2x and 1x components are present in the radial vibration spectrum, with 2x being the predominant component with a range up to 150% of 1x (Mais 2002). The severe misalignment conditions are characterized by the harmonics 3x to 8x or even a full high-frequency harmonics series, and its vibration amplitude at 2x running speed will be greater than 150% of that at 1x. Patel and Darpe (2009c) presented certain observations to identify the type of misalignment as well as to distinguish between misalignment and crack faults in the rotor. In order to uniquely identify misalignments, they suggested investigating the presence of strong negative higher harmonic frequency components compared to that of rotor crack. They also observed that stronger 1x axial as well as torsional responses and weak higher harmonics represent parallel misalignment. They added that a strong 3x harmonic component in its axial and torsional vibration response at 1/3rd critical speed indicates angular misalignments. Phase information is also used as a defining feature of misalignments. Across the coupling or machine, a phase shift of 180\(^\circ\) in the axial position shows angular misalignment, and the same in radial position indicates parallel misalignment, whereas that in both axial and radial position represents combined misalignment (Alsalaet 2012). The vibration waveforms follow a periodic pattern having one or two cycles in each revolution of the shaft.
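The amplitude and phase rules in this paragraph can be written down as a small rule-based check. The sketch below is one possible reading of the text and not a validated diagnostic: the 150% severity limit comes from the paragraph, whereas the lower limit for treating 2x as predominant (`predominant_ratio`), the phase tolerance, and the function names are assumptions.

```python
def misalignment_severity(amp_1x, amp_2x, predominant_ratio=1.0, severe_ratio=1.5):
    """Grade misalignment from the ratio of 2x to 1x radial amplitude."""
    if amp_1x <= 0:
        raise ValueError("1x amplitude must be positive")
    ratio = amp_2x / amp_1x
    if ratio > severe_ratio:          # 2x greater than 150% of 1x
        return "severe"
    if ratio >= predominant_ratio:    # assumed reading of "2x predominant"
        return "normal"
    return "not indicated"


def misalignment_type(axial_shift_deg, radial_shift_deg, tol_deg=30.0):
    """Classify by which direction shows a ~180 degree phase shift across the coupling."""
    axial = abs(axial_shift_deg - 180.0) <= tol_deg
    radial = abs(radial_shift_deg - 180.0) <= tol_deg
    if axial and radial:
        return "combined"
    if axial:
        return "angular"
    if radial:
        return "parallel"
    return "none"


# Hypothetical readings.
grade = misalignment_severity(1.0, 1.2)
kind = misalignment_type(175.0, 10.0)
```

In practice such thresholds would be tuned per machine; within an AI pipeline, the ratio and the two phase shifts are more naturally passed on as fault-specific features than applied as hard rules.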

3.1.2 Unbalance

The faulty state of a machine that occurs when the centreline of the mass of the rotor (inertia axis) and the center of rotation (geometric axis) are non-coinciding is known as unbalance (MacCamhaoil 2016). Rotor mass eccentricity created by the uneven build-up of debris on the rotor, assembly errors (unidentical blades of the wind turbine, windings of the generator rotor, etc.), and the addition of new fittings to the rotor without appropriate counterbalancing, are the major causes of an unbalanced rotor (Mais 2002).

In addition to these aspects, a number of other rotor faults can also lead to unbalance. For example, a bent shaft or loose parts can change the balance, while factors such as corrosion, abrasion, and the falling of damaged rotor parts also lead to unbalance. The critical components of the machine, which include the gears, bearings, and couplings, may be destroyed by an unbalance fault. In fact, unbalance particularly affects the bearings in that, as noted above, they are forced to carry a higher dynamic load than they are specifically designed for, which ultimately leads to failure due to early fatigue. During operation, rotating structures will experience a wobbling movement that is characteristic of the vibration resulting from the unbalance.

Unbalance produces a radial vibration; that is, it is part vertical and part horizontal. Since the machine is generally more flexible in the horizontal plane, excessive horizontal vibration is a good indicator of unbalance. Under ideal conditions, axial measurements should indicate weak vibration as most forces are generated perpendicular to the shaft (Mais 2002). In terms of vertically placed shafts, the unbalance is due to the mass effects of radial plane vibration, and the dominant frequency component will be 1x. A phase shift of 90\(^\circ\) occurs as the sensor moves from the horizontal to the vertical position, and no radial phase shift across the machine or coupling occurs (MacCamhaoil 2016).

In general, there are three types of unbalance:

  • (i) Static unbalance: The unbalance observed when the rotor is at rest is known as static unbalance or force unbalance, wherein only one force is involved in affecting the balance. In this case, the inertial axis of the rotor is displaced and lies parallel to the axis of rotation. This issue is widespread in disk-shaped rotors since there is a high chance of uneven mass distribution that creates a parallel shift between the axes. In simple systems, the chance of static unbalance is greater than that of couple unbalance.

  • (ii) Couple unbalance: Two equal forces or weights placed 180\(^\circ\) apart, which make the rotor appear balanced statically, are known as couple unbalance. This issue is not observable when the rotor is at rest and is a typical phenomenon in elongated cylindrical type rotors. A couple unbalance causes the so-called ‘wobble effect,’ which produces a 180\(^\circ\) out-of-phase reading from opposite ends of the shaft. Complicated systems with multiple locations in the rotor with unbalanced weights or systems with more than one coupling are prone to this type of fault.

  • (iii) Dynamic unbalance: The unbalance condition that occurs in real systems is known as dynamic unbalance. This issue presents a combination of static unbalance and couple unbalance. It has been observed that dynamic unbalance is present in virtually every rotor and that it can be addressed only by applying weights on at least two planes.
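The statement that dynamic unbalance combines static and couple unbalance can be illustrated by splitting two-plane unbalance vectors into a symmetric and an antisymmetric part. This is a minimal sketch assuming a rigid rotor with two correction planes equidistant from the centre of mass; the complex-number representation, the function name, and the g·mm values are assumptions for illustration.

```python
import cmath
import math


def split_unbalance(u1, u2):
    """Split plane unbalances u1, u2 (complex, e.g. g*mm at an angle) into parts.

    Returns (static, couple): the static part acts identically in both planes,
    while the couple part acts as +couple in plane 1 and -couple in plane 2.
    """
    static = (u1 + u2) / 2
    couple = (u1 - u2) / 2
    return static, couple


# Equal unbalances at the same angle: purely static.
s_static, c_static = split_unbalance(cmath.rect(10.0, 0.0), cmath.rect(10.0, 0.0))

# Equal unbalances 180 degrees apart: purely couple.
s_couple, c_couple = split_unbalance(cmath.rect(10.0, 0.0), cmath.rect(10.0, math.pi))

# Unbalance in one plane only: a mix of both, i.e. dynamic unbalance.
s_mixed, c_mixed = split_unbalance(cmath.rect(10.0, 0.0), 0j)
```

The last case shows why correction on at least two planes is needed: a single added weight can cancel the static part but leaves the couple part untouched.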

Fault characteristics analysis: The vibration wave is sinusoidal and occurs at a frequency of ‘one per revolution’ (1x), i.e., a single frequency vibration with the same amplitude in all radial directions. Other than the severe faulty states, vibration generally contains 1x only, without any harmonics of it. The 1x with high amplitude and its harmonics with less than 15% of the 1x is an indication of an unbalanced state (Al-Bedoor 2001). In this case, up to the first critical speed of the machine, the amplitude increases with speed, and the phase from the vertical and horizontal measurements differ by 90\(^\circ\). In dynamic unbalance, there will be 180\(^\circ\) phase shift in the radial direction, while static unbalance shows a phase shift of 0\(^\circ\). For unbalance due to the bent shaft, the phase shift of 180\(^\circ\) happens in the axial direction with no phase shift in the radial direction (Mais 2002).
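The two unbalance symptoms stated above (a dominant 1x with harmonics below 15% of it, and roughly 90\(^\circ\) between horizontal and vertical phase) translate into a short check. The sketch is illustrative only; the phase tolerance and the function name are assumptions, and the input is taken to be a mapping from harmonic order to amplitude.

```python
def looks_like_unbalance(amps, hv_phase_diff_deg,
                         harmonic_limit=0.15, phase_tol_deg=30.0):
    """Check the 1x-dominance and 90-degree H/V phase symptoms of unbalance.

    amps maps harmonic order (1, 2, 3, ...) to vibration amplitude.
    """
    a1 = amps[1]
    if a1 <= 0:
        return False
    # Every harmonic other than 1x must stay below 15% of the 1x amplitude.
    harmonics_small = all(a < harmonic_limit * a1
                          for k, a in amps.items() if k != 1)
    # Horizontal and vertical readings should differ by about 90 degrees.
    phase_ok = abs(hv_phase_diff_deg - 90.0) <= phase_tol_deg
    return harmonics_small and phase_ok


# Hypothetical spectra.
case_unbalance = looks_like_unbalance({1: 1.0, 2: 0.10, 3: 0.05}, 88.0)
case_harmonics = looks_like_unbalance({1: 1.0, 2: 0.50}, 90.0)
case_phase = looks_like_unbalance({1: 1.0, 2: 0.10}, 10.0)
```

Combining the amplitude and phase conditions matters because a strong 1x alone is shared by several faults, including the early stage of looseness.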

3.1.3 Looseness

Looseness is categorized into two types depending on whether the part affected by looseness is a mechanical or a structural component. When the mechanical components are fitted incorrectly, this results in component looseness, while the relative movement between the surfaces of the fundamental structures results in structural looseness. This looseness causes excessive horizontal, vertical, and structural vibrations in the horizontal and overhung horizontal shafts, where excessive horizontal and structural vibrations result in the issue extending to the vertical shafts. The looseness issue accounts for more vertical vibrations than horizontal vibrations (Ma et al. 2011).

(i) Component looseness: Rotating components and/or non-rotating connections that constrain the shaft to its rotating axis (such as the bearing base [pedestal], bearing mounts, and bearing caps) can experience looseness due to improper fittings, wear and tear, and thermal expansion. This is known as component looseness, which results in the components becoming damaged or detached from the assembly, thus inducing secondary damage. For example, component looseness can lead to relatively small residual misalignment, causing increased vibrations that affect both the radial and axial planes. If the loosened components are rotor mounted, this may result in unbalance.

(ii) Structural looseness: While the fundamental structures are not supposed to move freely, loosened bolts or bedplates, a deteriorated concrete foundation, and loose or distorted machine mountings can create a slight movement between the surfaces of these structures. This is known as structural looseness and is often created between one vibrating component, generally the foot of the machine, and one stationary component, that is, the foundation. Both structural looseness and soft foot looseness can lead to vibration in the radial plane.

Fault characteristics analysis: In the case of component looseness, the initial-stage vibration signature contains mostly 1x and 2x components, but as deterioration escalates, fractional harmonics with increased amplitude start to appear. Generally, looseness is characterized by several running speed frequency harmonics (1x–10x) with subharmonics (x/2, x/3, etc.) and their integer multiples (2x/3, 4x/3, etc.) of magnitudes greater than 20% of the 1x amplitude. Structural looseness creates 1x and/or 2x radial components with predominant vertical amplitude, subject to the type of issue. For rigidly connected machines with no belts or couplings, the radial 2x signifies looseness. The waveform generated is periodic with one or two cycles per revolution. A phase difference of 180\(^\circ\) exists between the foundation and the vibrating components in the case of structural looseness (Mais 2002).
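The looseness signature described above (harmonics and fractional harmonics exceeding 20% of the 1x amplitude) can be screened for with a sketch such as the following. The candidate order list, the function names, and the synthetic data are assumptions made for illustration; the cited works do not prescribe this procedure.

```python
import cmath
import math
from fractions import Fraction

# Candidate orders: fractional harmonics plus integer harmonics 2x..10x.
# The exact set of fractional orders is an illustrative assumption.
CANDIDATE_ORDERS = sorted(
    [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3), Fraction(4, 3), Fraction(3, 2)]
    + [Fraction(k) for k in range(2, 11)]
)


def order_amplitude(signal, fs, shaft_hz, order):
    """Peak amplitude of the component at order * shaft_hz (single-bin DFT)."""
    n = len(signal)
    step = -2j * math.pi * float(order) * shaft_hz / fs
    acc = sum(x * cmath.exp(step * k) for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def looseness_orders(signal, fs, shaft_hz, ratio_limit=0.20):
    """Orders whose amplitude exceeds ratio_limit times the 1x amplitude."""
    a1 = order_amplitude(signal, fs, shaft_hz, 1)
    return [
        o for o in CANDIDATE_ORDERS
        if order_amplitude(signal, fs, shaft_hz, o) > ratio_limit * a1
    ]


def synth(fs, shaft_hz, n, amplitudes):
    """Synthetic test signal: amplitudes maps order -> peak amplitude."""
    return [
        sum(a * math.sin(2 * math.pi * o * shaft_hz * k / fs)
            for o, a in amplitudes.items())
        for k in range(n)
    ]


# Assumed values: 30 Hz shaft speed, 1.2 kHz sampling, one second of data.
fs, shaft_hz, n = 1200.0, 30.0, 1200
loose = synth(fs, shaft_hz, n, {1: 1.0, 2: 0.4, 3: 0.3, 0.5: 0.25})
healthy = synth(fs, shaft_hz, n, {1: 1.0, 2: 0.05})

print([str(o) for o in looseness_orders(loose, fs, shaft_hz)])
print(looseness_orders(healthy, fs, shaft_hz))
```

A non-empty result is only a hint: rub and other non-linear faults also produce fractional harmonics, so the phase relationships described above would still be needed to separate them.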

3.2 Shaft faults

In addition to SRF, faults such as bending, cracking, rub impacts, and corrosion and wear, which affect the shaft directly, can occur and are known as shaft faults. These occur either as a consequence of SRF or due to other external reasons. The majority of shaft faults are known to be the secondary cause of abnormal vibration. The most important shaft faults are outlined below.

3.2.1 Bent shaft

A bent shaft is the most common rotating machinery fault and often develops as a result of thermal distortion, creep, or a large unbalance force (Darpe et al. 2006), while rotor rub is another cause of thermal rotor bowing. As a result of gravity, a rotor can experience cold bow while in a resting position, especially in shafts with a high length-to-width ratio. The rotor bow on a rotating machine will result in the re-emergence of the static unbalance faulty condition. High torque pressure and improper handling can also be the reason behind a bent shaft. Bent shafts induce effects similar to those of misalignment, causing the shaft to bear a higher dynamic load than it is designed for, ultimately resulting in failure due to early fatigue (Mais 2002).

Fault characteristics analysis: If the bend is close to the middle of the shaft length, then 1x will dominate, whereas 2x will be dominant for bends near the couplings. The most affected plane is axial, though the vertical and horizontal planes will also show 1x and 2x peaks (Lees et al. 2009). The 2x amplitude can vary from 30% of the 1x amplitude to 100–200% of the 1x amplitude. The spectrum of a bent shaft is very similar to that of misalignment; hence, the phase can be used as a good differentiating indicator. In bent shafts, the radial phase measurements will be in phase, while the axial measurements will be 180\(^\circ\) out of phase at opposite ends of the component (Mais 2002).

3.2.2 Shaft crack

Severe thermal and mechanical stress or manufacturing defects result in weak spots in the shaft, thus limiting its ability to withstand the forces generated during normal operation. This is known as shaft crack. The causative stress may be cyclical, whereby the initial crack is converted into a fatigue fracture resulting in a sudden breakage of the shaft. The first and foremost effect of shaft crack is the reduction in bending stiffness in the direction of the crack, which results in inducing excessive 2x vibration in the shaft (Darpe et al. 2006). The second effect is the rotor bow, where the bending results in the natural axis shift corresponding to the direction of the crack. This effect generates 1x components, which will progressively add to the already existing residual unbalance frequency component (Lees et al. 2009).

Fault characteristics analysis: The cracked shaft vibration mostly affects the radial plane, and it produces increased 1x vibration along with the 2x and 3x harmonics. In certain scenarios, subharmonics (1/2x, 1/x, etc.) are also observed with this fault. Studies revealed that the phase shift is directly proportional to the depth of the crack (Rahman et al. 2013; Darpe et al. 2006).

3.2.3 Rub impact

The contact developed between the rotating components and the stationary components creates a rub impact, the effect of which will remain passive in terms of overall vibration. Excessive unbalance, misalignment, self-excited instability, and resonance can cause stator-rotor rub faults in the system, which results in highly non-linear vibrations (Wang et al. 2016). The static forces working on the rotor in the radial direction and the thermal distortions in the casing have been identified as the cause of stator-rotor rub. The heat induced by the asymmetric friction produced by the rub impact in one-per-revolution fashion can result in a thermal bow in the rotor, which induces both an unbalance effect and a phase difference. As the rub impact becomes stronger, the degree of non-linearity increases, thereby creating higher amplitude harmonics of rotating frequency.

Fault characteristics analysis: The vibration changes in characteristic ways whenever rub occurs between the stationary and rotating components. Outside the intervals of contact, the waveform appears normal. With a strong rub impact, the spectrum contains higher-amplitude subharmonics and superharmonics of the synchronous frequency. Chu and Lu (2005) observed 1/2 and 1/3 fractional harmonic components along with the 2x and 3x harmonic components. Meanwhile, Patel and Darpe (2009a) showed the existence of pseudo-resonance and backward whirling components as effects of rub. The phase information shows that rub-related vibration cannot maintain a consistent phase.

3.2.4 Shaft corrosion and wear

Corrosion and wear fall under the category of nonfracture-type shaft failure. The electrochemical reaction due to environmental factors causes corrosion, and the consequential wear results in the metal being worn away, further increasing the stress and ultimately culminating in fatigue cracks. For the most part, these faults will not result in shaft failure; however, in conjunction with the other faults, they may cause fatigue failures that leave clear evidence. Here, cracks expand in a direction perpendicular to the applied stress on the area, from where the metal part is removed due to the debris created by oxidation. Meanwhile, pitting presents a corrosion issue that leads to short-term failure, which results in a small amount of material loss from the shaft periphery (Bonnett 2000).

Fig. 4 Rotor fault implementation in a testbed

3.3 Broken rotor bar

Broken rotor bar faults are ubiquitous in induction motors and are often caused by thermal, magnetic, dynamic, mechanical, and environmental stresses (MA Cruz 2000). The resulting uneven current flow generates localized heating and a thermal bow on the rotor, which, in turn, results in an unbalance rotor fault. Although a higher 1x amplitude characterizes the vibration in the radial direction, stator or rotor current analysis is commonly adopted in broken rotor bar fault diagnosis (Bate 1987).

Fault characteristics analysis: Uneven current flow in the induction motor rotor owing to a cracked or broken bar results in two kinds of vibration changes. In the first case, 1x and its harmonics, generally up to 4x, are accompanied by pole-pass sidebands, with a quite low 2x line frequency component (Mais 2002). In the second case, the amplitude of the rotor bar passing frequency (RBF, which is the number of rotor bars times the running speed) is higher, with 2x line frequency sidebands (100 Hz or 120 Hz) (Bate 1987). Due to the motion of vibration, the phase will be consistent in broken rotor bar faults.
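As a worked example of the second case, the RBF and its 2x line frequency sidebands follow directly from the definition given above. The motor values used here (28 rotor bars, 29.5 Hz running speed, 60 Hz supply) are hypothetical and chosen only for illustration.

```python
def rotor_bar_passing_frequency(n_bars, running_speed_hz):
    """RBF = number of rotor bars times the running speed (Hz)."""
    return n_bars * running_speed_hz


def rbf_with_sidebands(n_bars, running_speed_hz, line_hz):
    """Return (lower sideband, RBF, upper sideband) with 2x line frequency spacing."""
    rbf = rotor_bar_passing_frequency(n_bars, running_speed_hz)
    offset = 2.0 * line_hz  # 100 Hz for a 50 Hz supply, 120 Hz for a 60 Hz supply
    return (rbf - offset, rbf, rbf + offset)


# Hypothetical motor: 28 rotor bars, 29.5 Hz running speed, 60 Hz supply.
lower, rbf, upper = rbf_with_sidebands(28, 29.5, 60.0)
print(lower, rbf, upper)  # 706.0 826.0 946.0
```

These frequencies indicate where to look in the vibration spectrum; the amplitude levels that count as faulty are machine-specific and are not specified here.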


3.4 Experimental setup for rotor faults


To analyze rotor faults other than broken rotor bar fault, more than 80% of the works depend on rotor testbeds for faulty data generation. Figure 4 presents a diagrammatic representation of a typical rotor testbed-based data acquisition setup along with the fault implementation and the associated frequency responses.

The data acquisition setup contains the rotor testbed (RTB), the signal-conditioning unit (SCU), and the monitoring and analyzing unit (MAU) (Xue et al. 2013). The main component of the testbed assembly is an electric motor controlled by a variable frequency drive (VFD, which varies the frequency and voltage supply) and connected to a rotating shaft via a flexible coupling. The shaft is supported by a bearing housing that is fixed to the testbed base by pedestal bolts. The bearing housing has provisions for mounting sensors in the horizontal, vertical, and axial directions. The shaft is fitted with discs for various purposes. Generally, a disc with one or more notches is placed close to a tachometer that senses the rotational speed of the shaft. This tachometer reference, along with a sensor waveform, is used to calculate the phase information. Meanwhile, weighing discs, which typically incorporate grooves, holes, or another provision for attaching eccentric weights, are also mounted on the shaft. The majority of the testbeds are equipped with a provision for attaching a metal bush to create the rub impact (Lu et al. 2019a). Contact-type general-purpose accelerometers and non-contact-type proximity sensors are commonly used for vibration data acquisition. The acquired data is then passed through an SCU, which often performs signal amplification and analog-to-digital conversion (ADC). It is then applied to the MAU, which is generally a computer that carries out the monitoring and analysis of the input signal and any associated tasks with the help of specifically developed software.
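The phase calculation from the tachometer reference and a sensor waveform can be sketched as follows. This is one possible approach rather than the procedure of any cited work: it assumes one tach pulse per revolution, a constant speed over the analyzed record, and a phase-lag sign convention measured from the pulse to the positive peak of the 1x component. The function name and the synthetic data are hypothetical.

```python
import cmath
import math


def phase_lag_deg(signal, fs, tach_times):
    """Phase lag (degrees) of the 1x component relative to the tach pulse.

    The record between the first and last pulse spans a whole number of
    revolutions, so the 1x component can be extracted with a single-bin DFT.
    """
    t0, t1 = tach_times[0], tach_times[-1]
    revolutions = len(tach_times) - 1
    shaft_hz = revolutions / (t1 - t0)
    start, stop = round(t0 * fs), round(t1 * fs)
    acc = sum(
        signal[k] * cmath.exp(-2j * math.pi * shaft_hz * (k / fs - t0))
        for k in range(start, stop)
    )
    # For x(t) = A*cos(2*pi*f*(t - t0) - phi), the sum is proportional to exp(-j*phi).
    return (-math.degrees(cmath.phase(acc))) % 360.0


# Synthetic check: 20 Hz shaft speed, 2 kHz sampling, 60 degree lag.
fs, shaft_hz, true_lag = 2000.0, 20.0, 60.0
tach_times = [0.1 + k / shaft_hz for k in range(11)]  # 10 revolutions
vib = [
    0.8 * math.cos(2 * math.pi * shaft_hz * (k / fs - 0.1) - math.radians(true_lag))
    for k in range(1400)
]
print(round(phase_lag_deg(vib, fs, tach_times), 3))
```

Comparing the phases obtained this way at different sensor positions gives the in-phase and out-of-phase relationships used in the fault characteristics described earlier.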

The unbalance fault is created by connecting weights (bolts, nuts, washers, etc.) on the weighing disc. Two equal weights are placed 180\(^\circ\) apart to create couple unbalance, while a one-sided weight is positioned to generate static unbalance. Much like in couple unbalance, if unequal weights are set in place, then dynamic unbalance can be simulated (Walker et al. 2014). The coupling is loosened to induce a parallel shift and/or angular shift to create parallel and/or angular misalignment (Lu et al. 2019a). In terms of loosening faults, the majority of the works deal with pedestal loosening, which is simulated by loosening the pedestal bolt of the bearing housing to create clearance from the testbed base (Xue et al. 2013). Meanwhile, structural looseness can be demonstrated by loosening the bolt connecting the testbed base to the foundation. The stationary bush that is in contact with the shaft or disc periphery will induce rub impact (Walker et al. 2014). The shaft bend and shaft crack provisions are also shown in Fig. 4. Here, the experiment is conducted by setting different fault conditions as described before, and then the testbed is set to different rotational frequencies and load conditions to collect data via repeated trials.
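The effect of the trial weights can be checked with simple vector arithmetic: each weight contributes an unbalance vector of magnitude mass times radius at its angular position, and its axial position determines the couple it creates about a reference plane. The sketch below is illustrative only. It assumes that the two weights of the couple case sit on two different discs (weights 180\(^\circ\) apart on a single disc would cancel completely), and the masses, radii, and axial positions are hypothetical.

```python
import cmath
import math


def unbalance_vectors(weights, z_ref=0.0):
    """Net force unbalance (kg*m) and couple unbalance (kg*m^2) as complex vectors.

    Each weight is (mass_kg, radius_m, angle_deg, axial_position_m).
    """
    force, couple = 0j, 0j
    for mass, radius, angle_deg, z in weights:
        u = mass * radius * cmath.exp(1j * math.radians(angle_deg))
        force += u
        couple += u * (z - z_ref)
    return force, couple


def classify_unbalance(weights, z_ref=0.0, tol=1e-9):
    """Label the weight arrangement as static, couple, dynamic, or balanced."""
    force, couple = unbalance_vectors(weights, z_ref)
    has_force, has_couple = abs(force) > tol, abs(couple) > tol
    if has_force and has_couple:
        return "dynamic"
    if has_force:
        return "static"
    if has_couple:
        return "couple"
    return "balanced"


# Hypothetical trial weights: 10 g or 20 g at 50 mm radius, discs at z = -0.1 m and +0.1 m.
static_case = [(0.010, 0.05, 0.0, 0.0)]
couple_case = [(0.010, 0.05, 0.0, -0.1), (0.010, 0.05, 180.0, 0.1)]
dynamic_case = [(0.010, 0.05, 0.0, -0.1), (0.020, 0.05, 180.0, 0.1)]

for case in (static_case, couple_case, dynamic_case):
    print(classify_unbalance(case))
```

The reference plane `z_ref` is taken at the rotor mid-span in this example; a single weight placed away from that plane would already register as dynamic, which is consistent with the observation that virtually every real rotor exhibits dynamic unbalance.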

4 Machine learning-based approaches to RFD

RFD research has successfully employed a variety of SL algorithms both before and after the advent of DL methods in the research arena. Although the ML methods require a complex feature engineering process that demands sufficient domain expertise and time, they have certain advantages that make them an option of preference for many machine health monitoring applications. The provision for applying domain knowledge and the comparatively smaller data requirements make the ML methods significantly more favorable in this regard. One of the earliest methods found in the existing literature involves using ANN for motor fault diagnosis. Thereafter, the research in this domain flourished, with several algorithms adopted across the supervised and unsupervised categories. The commonly used ML methods, along with their features, pros and cons, relevance, and compatibility with RFD, are presented in Table 2.

Table 2 Summary on ML classifiers

4.1 Artificial neural network

In AI-based machine health monitoring, the history of the application of ANNs stretches over three decades (Chow et al. 1991), and it has been an indispensable part of the literature on SL of RFD right from its inception. The approaches using ANN in RFD are custom-made according to the variations of ANN architecture and signal processing techniques used in feature processing.

The method of integrating finite element (FE) analysis with neural networks is adopted in specific works of RFD. In an early attempt, Roemer et al. (1995) proposed FE-based neural networks for the classification of unbalance and misalignment rotor faults, which enable real-time machinery diagnostics and component stress prediction. They observed that the shaft is affected by low cycle fatigue with the start/stop cycle of the rotor under the unbalance condition, while the bearings are more affected by high cycle fatigue correlated with rotor speed. Later, Yu and Han (2010) developed an FE model of a rotor with transverse crack(s) and proposed an ANN to identify the location and depth of a crack in the rotor. They introduced fracture mechanics theory and the energy principle of Paris for the FE modeling of single and multiple cracks.

Several researchers have tried to identify the best-suited ANN architecture and feature extraction method for specific tasks in RFD. Hassan et al. (2003) compared different NN models such as perceptrons, linear filters, feed-forward networks, self-organizing networks, and learning vector quantization (LVQ). They concluded that the perceptron and LVQ architectures were superior to the others and could achieve 100% accuracy in unbalance and looseness conditions. Applying these findings, El-Shafei et al. (2007) used the LVQ neural network, connected separately in series and in parallel to a fuzzy inference engine, to address the SRFs. To overcome the limitation of LVQ in handling compound faults, a feed-forward NN with the resilient propagation (RP) algorithm was introduced. They demonstrated the success of NN as a high-resolution configuration diagnostics tool and the appropriateness of fuzzy logic as a low-resolution configuration diagnostics tool. They regarded the neuro-fuzzy system as an option that can provide a confidence index for a fault. Another performance comparison was carried out by Tajik et al. (2015b), who applied six different feature extraction methods and three different NNs (BPNN, RBFNN, PNN) for classification, with PCA and LDA dimensionality reduction techniques, to different rotor unbalance conditions. Reddy and Sekhar (2013) compared the performance of ANN using statistical features and amplitudes in the FD. Applying both to unbalance and pedestal looseness conditions, they concluded that statistical features give better results than FD magnitudes.

Only a few works incorporate TD features in ANN, whereas FD and TFD features appear in a fair number of recent works. Vyas and Satishkumar (2001) used statistical features integrated with ANN, in which moments of vibration signals acquired in the TD were utilized as features, thereby reducing the number of inputs produced after high-resolution preprocessing techniques. They applied a backpropagation ANN to detect five rotor faults, including looseness, unbalance, and misalignment. Singh and Kumar (2015) used six statistical features extracted from the TD vibration signal and compared the performance of ANN and SVM in terms of single and multiple faults of shaft bow and misalignment, concluding that SVMs require comparatively less running time. Statistical feature extraction has also seen some novel attempts, such as that of Li et al. (2015b), who proposed a feature extraction method that takes the sampled average of some conventional features to improve the performance of ANN and SVM. In addition, they developed an evaluation method using decoupling vectors and thresholds based on simple algebraic computation.
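To make the notion of TD statistical features concrete, the sketch below computes six commonly used quantities from a raw vibration record. The particular set (mean, standard deviation, RMS, skewness, kurtosis, crest factor) is an assumption chosen for illustration; the cited works do not necessarily use these six, and the function name is hypothetical.

```python
import math


def td_features(x):
    """Six common time-domain statistical features of a vibration record."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    var = sum(c * c for c in centered) / n        # second central moment
    std = math.sqrt(var)
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "skewness": sum(c ** 3 for c in centered) / (n * std ** 3),
        "kurtosis": sum(c ** 4 for c in centered) / (n * var ** 2),
        "crest_factor": peak / rms,
    }


# Sanity check on a pure sine (5 whole cycles): RMS = 1/sqrt(2), kurtosis = 1.5.
sine = [math.sin(2 * math.pi * 5 * k / 1000) for k in range(1000)]
features = td_features(sine)
print({name: round(value, 4) for name, value in features.items()})
```

A constant signal has zero variance and would cause a division by zero here, so real records would need a guard or a small regularizing term. The resulting dictionary, flattened into a vector per sensor channel, is the kind of low-dimensional input that an ANN or SVM classifier can consume directly.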

Certain significant contributions have been made to the preprocessing of input data in the FD and TFD for the smooth learning of neural networks in RFD. Bin et al. (2012) conducted FD preprocessing with EMD and WPD to extract the energy of five spectral bandwidths of the vibration signal and successfully identified ten rotor faults, including rotating stall and breathing vibration, loose rotating parts, and fluid-related defects. The ability of WPD to generate a time-frequency spectrum containing the wavelet packet coefficients for each depth and node was utilized by Sadeghian et al. (2009) for feature extraction from the stator current of an induction motor. The extracted coefficients with different frequency resolutions of the same frequency component, together with the slip speed, were applied to the ANN. This work is significant for the online detection of faults. Chen and Chen (2011) addressed six common faults of the rotor system (unbalance, bow, misalignment, rub, whirl, and whip) using a set of individual neural networks based on a structured genetic algorithm (sGAINNs). A notable feature of this work is its contribution to FD preprocessing and to exploiting the effect of frequency harmonics in rotor faults. Unbalance localization was performed by Walker et al. (2014) through the simulation of different types of unbalance, including multiple-plane unbalance, and they derived subsynchronous non-linear features in the FD as an input to ANN. This unbalance localization is particularly useful in situations where sensor placement options are limited. The results for unbalance, misalignment, rub, or combinations of these faults were also presented in the same work. Shaft bow, unbalance, and their combined effect were studied by Srinivas et al. (2010a). The Daubechies wavelet was used for FD transformation, and they utilized the amplitudes of the 1x, 2x, 3x, and 4x vibration harmonics in the horizontal, vertical, and axial directions, resulting in an accuracy of 99.9% using ANN. The same authors continued their study on unbalance and cracked rotors in Srinivas et al. (2010b). Frequency spectrum analysis of combined faults of unbalance and bearing clearance was another contribution, by Srinivas and Holla (2014). As in the previous work (Srinivas et al. 2010a), the same Daubechies wavelet transform and amplitudes of vibration harmonics were utilized. In one recent work, Pang et al. (2020) used various methods such as ensemble empirical mode decomposition (EEMD), the Hu invariant moment feature vector, and morphological image processing with a BP neural network. Unbalance, misalignment, oil whirl, and oil whip faults were simulated in both single-span and double-span rotor test rigs, achieving an accuracy of up to 95%.

Moreover, ANN is viewed as a replacement for conventional feature extraction methods. One such approach, with a two-stage learning process, was proposed by Lei et al. (2016b). In the first stage, an unsupervised two-layer neural network known as sparse filtering was applied to learn features directly from mechanical vibration signals, and in the second stage, softmax regression was employed to classify the health conditions. Without any specific feature extraction technique, the accuracy of this method improved as the amount of unlabeled data increased. Works that use fused features to represent multiple faulty conditions are also available in the ANN literature of RFD. A feature fusion model based on information entropy and the probabilistic neural network was proposed by Jiang et al. (2018). They showed that the accuracy of their fusion method was 10% higher than that of the processes using each of the information entropies separately.

In addition to testbeds, induction motors and wind turbines are the machines where ANNs are employed mainly to identify broken rotor bar faults and unbalances, respectively, often using voltage and current input. Broken rotor bar fault, along with misalignment and unbalance, was determined by Cabal-Yepez et al. (2015) using statistical features of voltage and current signal. They collected the data from variable speed drives (VSD) rather than the direct connection. A field-programmable gate array-based implementation was developed to offer an online, system-on-chip solution for real-time condition monitoring with ANN to perform classification. The simulation of five imbalance fault conditions of a wind turbine was performed through the TurbSim, FAST, and Simulink by Malik and Mishra (2017) considering the parameters from three-phase stator current and voltage. By using EMD and PCA in preprocessing and ANN in classification, the method achieved a higher degree of accuracy. Cacciola et al. (2016) utilized NN to decide whether the unbalance is present or not on a wind turbine rotor using 1x harmonics amplitude. By using load harmonics analysis, they could determine the severity of unbalance and its location. They additionally classified the root cause of the unbalance by rotor response. Acoustic signal, a supposedly robust input, was applied to ANN by Li et al. (2004b). The signal was converted to normalized power spectra together with the rectified statistic moments to identify unbalance in rotating machinery. They introduced the d-normalization technique to nullify the distance effects of sensors and components.

The preceding discussion can be summarised as follows. As one of the top-rated ML methods, ANNs feature prominently in RFD, accounting for around 22 of the 71 papers falling within the ML category. Here, it is clear that the majority of the researchers have tried different feature processing methods to improve the output, while a fair number of works follow TD approaches, despite the prevailing trend of adopting FD approaches. Similarly, the proportion of works that have modified the ANN structure (23%) and the proportion of works that consider DFC feature extraction (27%) are very high with ANN. Moreover, the percentage of works (73%) that deal with RFD exclusively (without considering bearing or gear faults) demonstrates that ANNs are highly suitable for RFD analysis.

4.2 Support vector machine

As one of the frontrunners in both classification and regression tasks, SVM has been proven to be a decent performer in machine health monitoring over the last two decades. The method is well known for handling data overfitting and for its exceptional performance in terms of accuracy in RFD research.

There are numerous ways in which researchers modify SVMs to adapt them to their research problems in RFD. The following are a few approaches that changed the conventional way of using SVM by upgrading the kernel or incorporating other techniques and data structures. Widodo and Yang (2007) developed a two-phase kernel algorithm with kernel PCA and ICA. The performance of the classification process using various feature extraction methods and kernel functions was presented in their work. The same authors (Widodo and Yang 2008) later introduced an SVM with a kernel function based on the wavelet transform, known as the wavelet support vector machine (W-SVM), with strong generalization capability. The transient current signal, preprocessed by discrete wavelet transform and PCA, was used as input for the classifier. They showed that introducing a non-linear kernel using wavelets improves the SVM performance significantly. They presented another work (Widodo et al. 2007) in which the sequential minimal optimization algorithm carried out the training of the SVMs, and various related scenarios were examined using datasets of vibration and stator current signals. They concluded that SVMs achieved high classification performance using multiclass strategies such as one-against-all (OAA) and one-against-one (OAO). A broken rotor bar fault detection system based on the stationary wavelet packet transform (SWPT) and multiclass wavelet kernel support vector machines (MWSVM) was presented by Keskes et al. (2013). They discussed the classification performance of different multiclass SVMs with various kernel functions. They found that OAA is about five times faster than OAO and that it required fewer singular values. Kang and Kim (2013) proposed multi-layer SVMs (MLSVMs) with an RBF kernel, which extracted features from both the TD and the FD. In the TD, the method used short-time energy (STE) with the singular value decomposition (SVD) technique to represent various faults. In the FD, it adopted the discrete cosine transform (DCT) with the SVD technique and addressed eight different rotor faults.
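The difference between the OAA and OAO multiclass strategies mentioned above can be seen by enumerating the binary sub-problems each one creates: OAA trains one classifier per class, whereas OAO trains one per pair of classes and combines them by voting. The sketch below only enumerates the tasks and the vote; the class labels are hypothetical, and no SVM is trained.

```python
from collections import Counter
from itertools import combinations


def oaa_tasks(classes):
    """One-against-all: one binary problem per class (K classifiers)."""
    return [(c, "rest") for c in classes]


def oao_tasks(classes):
    """One-against-one: one binary problem per class pair (K*(K-1)/2 classifiers)."""
    return list(combinations(classes, 2))


def oao_vote(pairwise_winners):
    """Final OAO label: the class that wins the most pairwise decisions."""
    return Counter(pairwise_winners).most_common(1)[0][0]


# Hypothetical label set with eight rotor health conditions.
classes = ["normal", "unbalance", "misalignment", "looseness",
           "rub", "bent_shaft", "crack", "broken_bar"]

print(len(oaa_tasks(classes)))  # 8
print(len(oao_tasks(classes)))  # 28
print(oao_vote(["unbalance", "rub", "unbalance", "unbalance"]))  # unbalance
```

The classifier count is only one factor in the running time. Each OAO classifier sees just two classes of data, so the reported speed difference between the strategies cannot be derived from these counts alone.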

By fusing the advantages of the information entropy method and SVM, Fei et al. (2014) proposed an information-fusion-based method known as process power spectrum entropy and SVM (PPSE-SVM). The PPSE feature vector has robust learning and generalization ability, fault tolerance, and strong resistance to noise interference, so that the proposed method demonstrated state-of-the-art performance. Zhang et al. (2015) proposed significant improvements to SVM as well as to the feature processing methods to deliver better performance even when little faulty data is available. They applied the fuzzy support vector machine (FSVM) optimized by a multi-population GA. EMD and WPD were utilized in the preprocessing of the vibration signal. This work succeeded in making full use of rotor dynamics and computational intelligence. Unbalance conditions in wind turbines were identified by Malik and Mishra (2016) using intrinsic mode functions (IMFs), with the EMD method for decomposing the stator current input and PCA for dimensionality reduction. The significance of the work was the use of the proximal SVM (PSVM), which had been introduced by Mangasarian and Wild (2001). Their results showed that the method was appropriate for online fault diagnosis applications. Saimurugan and Ramachandran (2014) compared vibration and sound input signals for an SVM classifier, with and without fast-ICA-based preprocessing. They simulated 12 health conditions with a bent and/or unbalanced shaft and different bearing fault conditions. They concluded that in certain circumstances, vibration-based diagnosis performs much better than sound signals in RFD, but both are effective in fast-ICA-SVM-based classification. Tang et al. (2010) showed that the multiclass SVM trained with chaos particle swarm optimization (CPSO) outperforms ANN in identifying rotor faults like unbalance, misalignment, bending, and bolt looseness. In another work combining PSO with SVM, Duan et al. (2016a) proposed a support vector data description with a binary tree structure for multi-classification problems with unbalanced datasets. The parameters of the support vector data description are optimized by PSO, resulting in the BT-PSO-SVDD technique for imbalanced classification, which was able to outperform ANN, SVM, and fuzzy c-means (FCM).
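The idea behind entropy-based spectral features can be illustrated with a generic power spectrum entropy: the power spectrum is normalized to a probability distribution and its Shannon entropy is taken, so a spectrum concentrated in one line gives a value near zero and a spread-out spectrum gives a larger value. The sketch below is this generic quantity only; it is not the PPSE formulation of Fei et al. (2014), and the function names and test signals are hypothetical.

```python
import cmath
import math


def power_spectrum(signal):
    """One-sided power spectrum (DC excluded) computed with a direct DFT."""
    n = len(signal)
    spectrum = []
    for b in range(1, n // 2):
        step = -2j * math.pi * b / n
        acc = sum(x * cmath.exp(step * k) for k, x in enumerate(signal))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(signal):
    """Shannon entropy (nats) of the normalized power spectrum."""
    power = power_spectrum(signal)
    total = sum(power)
    if total == 0.0:
        return 0.0
    probs = [p / total for p in power if p > 0.0]
    return -sum(p * math.log(p) for p in probs)


n = 128
one_tone = [math.sin(2 * math.pi * 8 * k / n) for k in range(n)]
two_tones = [
    math.sin(2 * math.pi * 8 * k / n) + math.sin(2 * math.pi * 20 * k / n)
    for k in range(n)
]

print(round(spectral_entropy(one_tone), 6))
print(round(spectral_entropy(two_tones), 6))
```

Two equal-amplitude spectral lines give an entropy of ln 2, which shows how the value grows as fault-related harmonics spread the energy across the spectrum. The direct DFT is used to keep the sketch dependency-free; an FFT routine would normally be preferred for long records.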

SVMs have also been used with a variety of data and feature representations, and several such works are reported in the literature. One such far-reaching data representation used with SVM in RFD is the infrared or thermal image. Thermal images have the advantage of being non-contact and non-intrusive, and thereby avoid many hurdles commonly experienced with other sensing methods. Younus and Yang (2012) used the thermal image as an input to a two-stage process for RFD. The data decomposed by a 2-D DWT operation on the input was applied to a Mahalanobis distance and relief algorithm for feature selection. By using SVM and LDA classifiers, they demonstrated the applicability of the method. Uddin et al. (2014) put forward a different approach in which a 2-D gray-scale texture created from TD vibration signals was applied as input to the feature extractor. The dominant neighborhood structure (DNS) map generated as a result of feature extraction was applied to OAA multiclass support vector machines with an RBF kernel. Across the eight rotor health conditions considered, this approach achieved 100% accuracy. Similarly, Janssens et al. (2015) focused on extracting features such as histogram features, a measurement of concentration related to the spatial temperature distribution, the Gini coefficient, etc. They identified unbalance and various bearing faults in two different pipeline stages. SVM and random decision forest (RDF) classifiers performed the classification task. Another work with a similar kind of data was done by Duan et al. (2016b), in which image segmentation and histogram processing with PCA dimensionality reduction were applied with SVM and NB classifiers. It was also reported that SVMs work with features extracted by CNN to enhance the fault diagnosis performance (Sun et al. 2017).

Moreover, it is important to highlight some works with SVM, which have given far-reaching attention to the feature extraction process. Li et al. (2015b) developed a method for evaluating feature extraction and made use of SVM to validate the performance in the RFD. Based on the central limit theory, the statistical features were extracted so that it followed a normal distribution and were evaluated with a decoupling technique and compared with thresholds to decide on fault category. An improved version of the distance evaluation technique known as the compensation distance evaluation technique (CDET) was introduced by Fatima et al. (2014) to select the most sensitive features for each rotor fault from the set of statistical features in the TD. A grid search approach was operated to optimize the hyper-parameters of SVM and applied for identifying unbalance as well as bearing faults. It was shown that irrespective of the number of transducers used, a certain level of accuracy could be attained through this method; therefore, it is applicable for online condition monitoring. To extract the fault features around the sub-harmonics and sup-harmonics of the rotational speed, Pang et al. (2018) introduced a novel index known as characteristic frequency band energy entropy (CFBEE). It was applied to the time-frequency spectrum (TFS) derived from raw vibration signals with improved singular spectrum decomposition (ISSD) and HT. This method attained higher accuracy comparing to the EMD-based characteristic frequency band energy entropy method. Martínez-Morales et al. (2018) performed feature data fusion by fusing electric data (stator current) and mechanical data (axial vibration) to identify the misalignment, unbalance, and the bearing faults of three-phase induction motors. OAO multi-classification SVM with RBF kernel performed the fault classification task. 
Similarly, in a recent work, Gangsar and Tiwari (2019) used vibration and current data for diagnosing mechanical and electrical faults of an induction motor, adopting grid-search methodology with cross-validation to select optimal SVM features for building the best model. Aydin et al. (2007) proposed a method of tuning kernel and penalized parameters of SVM using an artificial immune system and Park’s vector approach for extracting features from three-phase motor current. Nguyen et al. (2008) applied GA for optimal feature selection from a set of TD features. SVM and DT were employed for analyzing the classification performance of unbalance, looseness, and bearing faults, concluding that SVM with the selected features works better. Fault specific symptom parameter extraction is the key element of SRF diagnosis. The nature and behavior of DFCs in rotor faults like misalignment, unbalance, and looseness was utilized by Xue et al. (2011) with SVM for fault diagnosis learning. This work was one of the initial attempts to make use of symptom parameters of SRF, addressing varying load conditions. They continued to use DFCs in their next work (Xue et al. 2013) in which they identified the SRF, performing normalization and extraction of DFCs using a multi-band pass filter. The filter range accounted for the change in frequency band under varying load conditions. Lately, Lobato et al. (2019) proved that data augmentation combined with GA for feature optimization could provide better accuracy of 95.19% even with significantly less amount of input data. It was observed that EEMD features with GA could improve the results by 17.31%, and, together with the augmented data, the improvement increased to 20.19%. For the detection of broken rotor bar fault in induction motor, Armaki and Roshanfekr (2010) introduced three new features like harmonic curve area, harmonic crest angle, and harmonic amplitude. 
These were extracted from power spectral density (PSD) of stator current using FFT, after which the performance of linear, polynomial, and RBF kernels was compared. They concluded that harmonic amplitude is not a useful feature because of its motor load dependency; meanwhile, among the kernels, the RBF kernel provides decent performance. To identify rotor crack under strong noise condition, Li et al. (2011) proposed feature extraction from amplitude and frequency of AE signal based on the pseudo Wigner-Ville distribution (PWVD) with SVM. This method was able to classify cracks of three different depths.
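Several of the works above start from statistical features computed in the TD. The following is a minimal sketch of such a feature vector in Python. The particular feature set (mean, standard deviation, RMS, peak, skewness, kurtosis, crest factor) and the function name `td_features` are assumptions chosen for illustration; they are not the specific feature sets used in the cited studies.

```python
import math

def td_features(signal):
    """Return common time-domain statistical features of a 1-D signal.

    Illustrative feature set only; the cited works each define their own.
    """
    n = len(signal)
    mean = sum(signal) / n
    # Central moments in population form (divide by n).
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m3 = sum((x - mean) ** 3 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "rms": rms,
        "peak": peak,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "crest_factor": peak / rms if rms > 0 else 0.0,
    }

# Synthetic stand-in for a vibration record: one sine cycle, 64 samples.
sine = [math.sin(2 * math.pi * k / 64) for k in range(64)]
features = td_features(sine)
print({name: round(value, 4) for name, value in features.items()})
```

A vector of this kind would then be passed to a feature selection step and to the classifier; impulsive faults tend to raise kurtosis and crest factor relative to a pure sinusoid, which is why such features appear so often in the reviewed works.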

Some researchers demonstrated the dominance of SVM over other state-of-the-art ML techniques. Singh and Kumar (2015) compared SVM with ANN in terms of classification accuracy using six statistical features derived from the TD and concluded that the SVM model took remarkably less time to run than the ANN. In similar work, Moosavian et al. (2014) compared the performance of SVM and k-NN on the unbalance rotor fault with 29 FD features and found that, although k-NN is faster than SVM, the latter is substantially more accurate. Ruiming and Hongzhong (2006) presented motor current signal analysis (MCSA) with SVM for induction motor rotor faults. FFT was used for FD feature extraction, and the performance comparison was made with majority voting, binary tree, neural network and hybrid matrix. Some contributions to RFD using SVM are distinctive in that more fault-specific information or conclusions have been derived from them. In one such work, Fengqi and Meng (2006) showed that full-spectrum cascade analysis of the acceleration signal remains ideal for particular types of fault such as rub impact. The relation between the amplitude of the harmonic spectrum and the rub malfunction intensity was well explored in this work, and significant results were produced. Baccarini et al. (2011) demonstrated the OAA technique for SVM, in which they utilized only one vibration sensor and four SVMs to deliver improved classification results. They observed that, for the best signal acquisition and analysis, the vertical sensor position is preferred. Another significant contribution in motor RFD was made by Yuan and Chu (2006), in which 14 rotor faults were identified by a multiclass SVM. FFT and AR models were used to design a feature space with nine frequency bands, and PCA assisted in the dimensionality reduction process.
In the context of broken rotor bar fault, Kurek and Osowski (2010) proposed an SVM based method that performed online detection of the presence of rotor fault and identification of fault affected rotor bars. Spectral information of the motor current, voltage, and shaft field of one phase was employed as input to SVM. They discovered that sufficient diagnostic information is present in the phase current for SVM based fault diagnosis of induction motor.

The research on RFD using SVM can be summarised as follows. First, it is worth noting that researchers mainly attempt to change the kernel and to introduce new data structures to SVM in order to achieve good RFD accuracy. SVMs account for the largest share (44%) of the ML methods used in RFD analysis. One pertinent fact is that, for broken rotor bar faults, SVM is the traditional ML model with proven compatibility with voltage and current features. Moreover, as the literature states, the improved performance of SVM depends heavily on FD processing for feature extraction. SVMs succeed in utilizing information entropy fusion and spectrum analysis in certain applications, demonstrating their ability to deal with a host of input representations, and they are even used to evaluate the performance of feature extraction. Owing to their excellent potential for handling outliers, SVMs can be used with any form of sensing method. Similarly, a number of the significant contributions that made use of DFC in RFD involved using SVM for classification. In short, SVMs are the most versatile of the ML models from an RFD perspective.

4.3 k-nearest neighbor

k-NN is an instance-based, non-parametric algorithm renowned for its interpretability and ease of implementation. It is the most popular algorithm used in RFD after SVMs and ANNs.

Chen et al. (2011) proposed a k-NN based fault classification method for unbalance and bearing faults with the raw TD vibration signature as input. The significant contribution of this work was the use of the 'maximum cross-correlation sum operator' as a similarity measure, which is shift-invariant and noise-tolerant. The proposed method achieved an error rate of 0.74% over varying operating speeds. Another approach using k-NN was proposed by Biet (2013), which dealt with faults in nuclear plant generators using rotor flux measurements and classical electrical measurements. The Fisher criterion and the sequential backward selection algorithm were used for feature selection from the scalar parameters. k-NN with Euclidean and Mahalanobis distances was applied for classification and achieved 85.1% accuracy on rotor faults. This work was a follow-up to previous work (Biet and Bijeire 2011), which tried a large number of rotor fault combinations with feature selection made only from the radial flux density. Recently, Gohari and Eydi (2020) studied the identification of unbalance parameters of a rotating shaft with multiple discs using k-NN and DT. They concluded that, in terms of locating unbalance, k-NN estimates the unbalance parameters more accurately than DT.
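The value of a shift-invariant similarity measure for k-NN on raw TD signals can be sketched as follows. The exact definition of the maximum cross-correlation sum operator is not given here, so the code substitutes a simpler stand-in: the maximum of the normalized circular cross-correlation over all shifts. The function names, the synthetic signals, and the class labels are hypothetical.

```python
import math
from collections import Counter

def max_xcorr(a, b):
    """Maximum normalized circular cross-correlation over all shifts.

    Stand-in similarity measure (an assumption), not the operator of the
    cited work. Both signals must have the same length.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    a0 = [x - mean_a for x in a]
    b0 = [x - mean_b for x in b]
    norm = math.sqrt(sum(x * x for x in a0) * sum(x * x for x in b0))
    if norm == 0:
        return 0.0
    return max(
        sum(a0[i] * b0[(i + shift) % n] for i in range(n)) / norm
        for shift in range(n)
    )

def knn_predict(train, query, k=3):
    """train: list of (signal, label). Majority vote among the k most similar."""
    ranked = sorted(train, key=lambda item: max_xcorr(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def tone(order, shift, n=64):
    """Synthetic signal: a sine at `order` cycles per record, delayed by `shift` samples."""
    return [math.sin(2 * math.pi * order * (k + shift) / n) for k in range(n)]

# Hypothetical training set: two classes that differ in dominant harmonic,
# each recorded at arbitrary (unknown) phase.
train = [(tone(1, s), "1x-dominant") for s in (0, 5, 11)]
train += [(tone(2, s), "2x-dominant") for s in (0, 3, 9)]

print(knn_predict(train, tone(1, 20)))
```

Because the similarity is maximized over shifts, a query recorded at a different phase still matches its own class, which a plain Euclidean distance on raw samples would not guarantee.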

A more suitable fault specific method for finding broken rotor bar fault using the nearest neighbor (NN) algorithm was demonstrated by Karvelis et al. (2015). Start-up current was taken as input, and its transient analysis applied to the model using wavelet approximation to isolate the characteristic component of the specific fault. The input discretization was performed by maximum entropy partitioning (MEP) and represented with relative information created by an intelligent icon-like approach. Both simulation data, as well as experimental data, were used for testing, and the method produced high-classification accuracy. In another fault specific frequency selection based approach, Glowacz (2018) proposed a method for selecting essential frequency components as features and applied to k-NN along with BPNN as well as words coding based classifiers. It was based on computing the absolute values of differences in the frequency spectrum of acoustic signals, and a threshold was used to determine one or two groups of essential frequency components.

Nguyen and Lee (2008) utilized GA based on a distance criterion for feature extraction. By evaluating the features with k-NN and DT, they proved that selected features improve the accuracy and reduce the running time. The authors continued their work (Nguyen and Lee 2010) with an enhancement of applying weight factors in decision making. They implemented the weight factor for the selection of best features and evaluation of k neighbors. The performance was compared with SVM to prove the dominance of the proposed method. In the works of Son et al. (2009), Timusk et al. (2008), Moosavian et al. (2014), and Yang et al. (2015), k-NN was employed for comparison purpose and proven as one of the best-performing algorithms specifically appreciated for its faster and simpler operation.

The attendant discussion can be summarised as follows. First, while a k-NN is simple and convenient, it is not widely adopted in RFD. In fact, barely 10% of the works utilize k-NN despite its ability to deal with decision boundaries of any form. A number of researchers developed different similarity measures derived from well-known distance functions, while others operated well-established algorithms such as GA to boost the performance of k-NN. However, the inability to recognize important attributes, the overheads involved in deciding the parameter ‘k,’ and the interpretability issue due to its non-parametric nature have been identified as the main limitations of this algorithm, which make researchers reluctant to adopt it for RFD analysis.

4.4 Naïve–Bayes

Naïve–Bayes is the most popularly used Bayesian model in RFD, which works on the conditional probability basis. Wang et al. (2012) used envelope features of the motor current extracted using HT and compared it with the conventional motor current features using Naïve–Bayes as well as two other classifiers. Current envelope features, AR coefficients, statistical features, and FD features were applied to mRMR (min redundancy max relevancy rule) for proper feature selection. The results of three different classifiers validated the effectiveness of current envelope features in this work.

Bayesian modeling has a drawback that it is unable to model and learn from the time-series level change of data. As an attempt to mitigate this disadvantage, Yusuf et al. (2013) presented the NB classifier on the fault groups created by the hidden Markov model (HMM). The results showed that the false positives and negatives were reduced with an identification accuracy of 84.55% by the probabilistic belief-based identification capability of Bayes theorem. Duan et al. (2016b) applied image segmentation and histogram processing with PCA dimensionality reduction to the NB classifier. They formulated a region selection criterion named ’dispersion degree’ to discriminate fault representative regions. The performance of the algorithm was evaluated using the original image, segmented image analysis, and segmented image analysis with feature reduction methods.

Bayesian Belief Network: In the field of rotating machinery fault diagnosis, incorporating human expertise is essential, especially when there is little one-to-one correspondence between symptoms and faults. The Bayesian belief network (BBN) is a Bayesian model that is useful in such situations, typically when dealing with compound faults. Xu (2012) developed a BBN with three layers, namely the machine running conditions layer, the machine faults layer, and the fault symptoms layer, with two topological configurations of causality and fault symptom. Compared with the traditional Naive Bayesian network, the proposed BBN contained two topological configurations of causality, which carry information not only about the fault symptoms but also about the machine conditions. The designed system could give reasonable uncertainty inference for rotor faults, in keeping with the practical knowledge and experience of experts.

Here, the attendant analysis can be summarised as follows. First, as a probability-based classifier, an independence assumption is the core aspect of the NB algorithm. Meanwhile, the higher chance of correlated features in the RFD dataset results in certain restrictions to its application. The inability to model the time-series data and to learn from it, as well as the issues created in the classifier due to the absence of data, are just some of the limiting factors. However, attempts such as HMM to deal with time-series and mRMR for feature extraction have been made to return some ground-breaking results. In the Bayesian family, the importance of BBN cannot be denied as it is capable of dealing with most of the aforementioned problems. In short, the attribute correlation and the probability of missing the RFD sensor data can be given as justifiable reasons for the nominal participation of NB in RFD.
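The independence assumption discussed above can be made concrete with a minimal Gaussian NB sketch: each feature contributes its own one-dimensional likelihood term, and the terms are simply summed in the log domain. The Gaussian likelihood, the function names, the feature values, and the class labels are assumptions for illustration and are not taken from the cited works.

```python
import math
from collections import defaultdict

def fit_gnb(X, y):
    """Estimate the prior, per-feature means, and per-feature variances per class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    model = {}
    for label, rows in groups.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small variance floor avoids division by zero on constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the largest log-posterior under feature independence."""
    def log_posterior(prior, means, variances):
        total = math.log(prior)
        # Independence assumption: one additive term per feature.
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda label: log_posterior(*model[label]))

# Hypothetical two-feature data (for example, RMS and kurtosis of a signal).
X = [[0.10, 1.0], [0.20, 1.1], [0.15, 0.9], [0.90, 3.0], [1.00, 3.2], [1.10, 2.9]]
y = ["healthy"] * 3 + ["unbalance"] * 3
model = fit_gnb(X, y)
print(predict_gnb(model, [0.12, 1.0]), predict_gnb(model, [1.0, 3.0]))
```

When two features are strongly correlated, as is common in RFD datasets, their evidence is effectively counted twice by the summation, which is one way to see why the assumption restricts the applicability of NB.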

4.5 Fuzzy logic and other methods

The ability of fuzzy logic to mimic human reasoning has been utilized in RFD by certain researchers. El-Shafei et al. (2007) used the LVQ neural network in parallel with a fuzzy inference engine to address SRFs. The whole harmonics and half harmonics up to 6x were utilized in the fuzzy decision logic, demonstrating the success of fuzzy logic as a low-resolution configuration diagnostics tool for spectral data. They found the neuro-fuzzy system to be a good option that provides a confidence index for faults. The fuzzy Sugeno model adaptive neuro-fuzzy inference system (ANFIS) was used by Lei et al. (2008) for rotating machinery fault diagnosis. They employed TD, FD, and TFD features along with an improved distance evaluation technique for feature selection. The results showed the ability of the system to identify bearing faults, fault severity, and fault category with high accuracy. Another work by the same authors (Lei et al. 2007), which combined multiple ANFIS with GAs for a more reliable and intelligent RFD system, was also reported in the literature. Zhang et al. (2015) introduced a fuzzy support vector machine (FSVM) by integrating fuzzy logic with SVM. Qiu and Rao (2005) summarized a significant analysis of unbalanced non-linear rotor systems involving uncertain parameters and concluded that the fuzzy approach could be considered more versatile and robust in system design and analysis. Lately, the application of fuzzy neural network technology to equipment fault diagnosis was proposed by Xu et al. (2020). In this work, network learning weights are transformed into case-based reasoning-based diagnostic guidance operators.
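How a fuzzy rule base can turn harmonic amplitudes into a confidence index per fault can be sketched as below. The membership function, its breakpoints, and the two rules are hypothetical and purely illustrative (they follow the common rule of thumb that unbalance is dominated by 1x while misalignment also raises 2x); they are not the rule base of any cited work. The fuzzy AND is taken as the minimum, which is one standard choice.

```python
def ramp_up(x, lo, hi):
    """Membership in 'high': 0 below lo, 1 above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def fault_confidence(amp_1x, amp_2x):
    """Rule firing strengths for two illustrative rules.

    Inputs are harmonic amplitudes normalized to [0, 1] (an assumption).
    """
    high_1x = ramp_up(amp_1x, 0.25, 0.75)
    high_2x = ramp_up(amp_2x, 0.25, 0.75)
    low_2x = 1.0 - high_2x
    return {
        "unbalance": min(high_1x, low_2x),      # IF 1x is high AND 2x is low
        "misalignment": min(high_1x, high_2x),  # IF 1x is high AND 2x is high
    }

print(fault_confidence(0.9, 0.1))
print(fault_confidence(0.5, 0.5))
```

The graded outputs, rather than a single hard label, are what allow such a system to report a confidence index alongside the diagnosis.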

Among the remaining ML algorithms, RF was used by Yang et al. (2015) with multiple class feature selection (MCFS), and Quiroz et al. (2018) used the same algorithm with TD statistical features. The LDA algorithm was explored for RFD with acoustic signal input by Glowacz (2018), and DT was used by Nguyen and Lee (2008). A vibration image-based diagnosis was proposed by Yan et al. (2019b), in which three features generated from vibration images were used: the histogram of vibration image (HVI), the histogram of oriented vibration image (HOVI), and the 2-D FFT. The AdaBoost classifier proposed in this work overcomes the overfitting problem and fuses the multiple features. Martin-Diaz et al. (2018) selected AdaBoost for comparative analysis to prove its significance in RFD analysis.

Here, we can summarise the discussion as follows. The decision making based on fuzzy logic has advantages over other methods since the rules derived from the fault state play the key role. As the fuzzy classification system does not have the capacity for pattern recognition as other ML models do, and since certain concerns exist in terms of its implementation, it is not widely accepted in RFD scenarios. However, individual attempts to utilize the harmonics of the rotating frequency as the input to the fuzzy system and the use of ANFIS have returned some significant result improvements. Another way of utilizing fuzzy logic to overcome its limitations is to incorporate it with other ML methods, which has, in fact, been adopted in a number of works related to RFD.

4.6 Classifier fusion

The conventional fault diagnosis method of using a single information source together with a unique decision method exhibits certain shortcomings. Niu et al. (2007) proposed a decision fusion by finding the optimal sequence of classifiers’ fusion. It is based on selected decision vectors using the correlation measure of classifiers and sensor fusion method using relativity theory. SVM, LDA, k-NN, improved iterative scaling (IIS), Gaussian mixture model (GMM), and LVQ classifiers were used, and multi-agent classifiers fusion algorithm was engaged for fusing them. The comparison results with majority voting and Bayesian belief classifiers proved the superiority of multi-agent fusion. The same authors continued the fusion method in (Niu et al. 2008) using multi-level wavelet decomposition with transient current as the input. After the feature extraction using wavelet-based decomposition, the signals were applied to the two decision-level fusion strategies, the multi-agent fusion, and Bayesian belief fusion. The results demonstrate that the proposed framework fusing SVM, ART-KNN, LDA, and k-NN classifiers could achieve an accuracy of 98.5%. In contrast, the best individual classifier could achieve only 90% accuracy.

Santos et al. (2012) combined bagging, AB, general boosting projection (GBPC), and RF classifiers to obtain an ensemble classifier for unbalance and misalignment classification of wind turbines under various speed and load conditions, using current and vibration signals as inputs. The results of the proposed method were compared with C4.5 DTs, k-NN, and NB, and AB using J48 DTs as base classifiers was found to achieve the highest accuracy of 96.24%. Tao et al. (2016) proposed a novel classifier ensemble technique, known as weighted majority voting with a different confidence level, to combine NB, RF, and SVM classifiers. They assembled multiple classifiers based on voting through the confidence diversity, and the results were compared with the conventional weighted majority voting method. They observed that the classification accuracy of the NB classifier was low compared with that of RF and SVM. Moreover, the proposed ensemble method gave significantly improved accuracy compared with the conventional weight-based voting method, especially for the broken rotor bar fault. The AB ensemble classifier presented by Martin-Diaz et al. (2017a) addressed the issue of imbalanced data in rotating machinery fault diagnosis. TD and FD features extracted from the stator current were passed through variable filtering by correlation analysis for feature selection. The SMOTE sampling technique was followed by cross-validation to ensure rebalancing of the data and generalization of the results. AB classifiers with DT (CART) and NB classifiers were utilized for performance evaluation with datasets of varying size and imbalance ratio. The results demonstrated the superiority of the AB ensemble classifier in RFD of induction motors.
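The conventional weighted majority voting scheme that serves as the baseline in such comparisons can be sketched in a few lines. This is the plain weighted vote only, not the confidence-level variant of Tao et al. (2016); the classifier names, labels, and weights (for example, validation accuracies) are hypothetical.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Fuse classifier decisions by summing the weights behind each label.

    predictions: {classifier_name: predicted_label}
    weights:     {classifier_name: non-negative weight}
    Returns the winning label and the per-label scores.
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        scores[label] += weights[name]
    winner = max(scores, key=scores.get)
    return winner, dict(scores)

# Hypothetical decisions from three base classifiers on one sample.
predictions = {"NB": "healthy", "RF": "broken_bar", "SVM": "broken_bar"}
weights = {"NB": 0.70, "RF": 0.92, "SVM": 0.90}

label, scores = weighted_vote(predictions, weights)
print(label, scores)
```

With equal weights this reduces to ordinary majority voting; weighting lets a reliable classifier outvote several weak ones, which is the behaviour the confidence-based variants refine further.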

The fusion of classifiers in the RFD literature can be summarized as follows. The works that performed classifier fusion constitute less than 5% of the overall ML literature on RFD. SVM, k-NN, and NB are often found among the fused classifiers. Different fusion methods, such as multi-agent fusion, Bayesian belief fusion, decision-level fusion, and majority voting and its variants, have been used in most of the works, and their performance has been compared. The literature shows that classifier fusion in RFD can address multiple drawbacks of individual classifiers, but research in this direction still has a long way to go.

Table 3 Summary on ML comparison studies

4.7 Studies on classifier comparison

There have been certain efforts from researchers to study and compare several ML methods that will help in the selection of the most appropriate classifier according to the requirement. The summary of such classifier comparison studies is given in Table 3. In one such work, Gang et al. (2007) compared the performance of five classifiers with ten TD, three FD, and eight AR features on acceleration, voltage and current, and flux signals of an induction motor. They concluded the order of classifier performance as RF, SVM, k-NN, LDA, and ART-kNN (Adaptive resonance theory-Kohonen neural network), and identified flux2 as the excellent performing signal for classification. Timusk et al. (2008) experimented with transient but repeatable operating data using vibration information alone, taking into consideration time-series statistics and AR models. Eight different classifiers were tested on motor and gear datasets with and without re-sampling along with the Gabor feature statistics. The study revealed the failure of Gabor feature statistics as a feature in rotor faults. A smart sensor system architecture that acquired vibration, current and flux signals from induction motors was developed by Son et al. (2009). The SVM, LDA, k-NN, and RF were used for dealing with the fault classification section. The results showed that LDA and k-NN achieved high accuracy and performance stability, where RF performed the worst. Wang et al. (2012) worked with the envelope of the motor current and the order of classifier performance in terms of accuracy was k-NN (96.15%), NB (94.5%), and SVM (91.43%).

Cunha Palácios et al. (2015) conducted a meticulous review on machine fault analysis, including stator and rotor faults with NB, k-NN, SVM, ANN, repeated incremental pruning (RIP), and DT classifiers. Considering the rotor faults alone and in combination with other faults under the varied power supply and mechanical loading conditions, they observed that ANN and k-NN could perform with the highest accuracies 96.5% and 96.1%, respectively, where NB could only get an accuracy nearing 70%. Godoy et al. (2016) employed fuzzy ARTMAP network, SVM, k-NN, and MLP to classify broken rotor bar defects in inverter-fed induction motors analyzing the TD stator current signal, and the rate of acquisition was based on machine frequency supply. Over a wide range of frequencies and load conditions, k-NN performed the best followed by MLP, and the fuzzy-based classifier was the weakest performer.

Martin-Diaz et al. (2017b) worked on the current signals from an inverter-fed induction motor with one broken rotor bar under changing operational conditions. Various TD and FD features were extracted, and feature selection was performed with the help of the Pearson correlation coefficient. NB was able to classify the faulty data with only two instances misclassified, showing the same behavior in both classes, whereas SVM and k-NN assigned all instances to a single class. The same authors (Martin-Diaz et al. 2018) continued the work by additionally including the classifiers AB, ANN, and bagging. Two load conditions and two faulty conditions (half-broken rotor bar and broken rotor bar) were considered over a total of eight cases, resulting in the following classifier ranking: AB, bagging, MLP, NB, SVM, and k-NN. Li et al. (2018) worked with WPD-based feature extraction for the classifiers BPNN, SVM, and k-NN on looseness, friction, and load unbalance fault conditions. In the classification process, BPNN achieved 98.7% accuracy with WPD, and in degradation assessment, k-NN showed the best performance.

The start-up transient current signal of a line start-permanent magnet synchronous motor (LS-PMSM) was introduced by Quiroz et al. (2018) for identifying the broken rotor bar fault using an RF classifier. They compared its performance with five other algorithms using 13 statistical TD features. The experiments were conducted with different numbers of trees within RF and with three reduced features. RF performed well with 99.6% accuracy, which proved to be independent of the number of trees. Konar et al. (2015) applied signal processing techniques such as CWT and HT for feature extraction, and PCA, GA, and RST for dimensionality reduction. They showed that powerful signal processing tools, in conjunction with the data mining technique, could reduce the burden on SVM. In terms of classification performance on six types of induction motor faults, SVM was the best performer with an accuracy of 99.15%, followed by k-NN with 98.86% and MLP with 97.23%, while simple logistic performed worst with 96.96%.

5 Deep learning-based approaches to RFD

The abundance of data and the evolution of new algorithms supported by the GPU-like hardware expansions have accelerated the development of DL, which is essentially a nested hierarchical and more abstract representational framework. The advantage of the automatic feature extraction of DL models made them widely accepted within the field of machine fault diagnosis since this allows for avoiding the requirement of high domain expertise in feature engineering. In fact, RFD makes use of a variety of DL models, including the most popular CNNs, autoencoder based models such as SAE and SDAE, sequential models such as LSTM and GRU, the generative model like GAN, generative and probabilistic models such as DBN, and other deep neural networks (DNN). Indeed, the majority of the DL algorithms have gained traction in a short space of time. The commonly used DL algorithms, along with their features, pros and cons, relevance, and compatibility with RFD are presented in Table 4.

Table 4 Summary on DL classifiers

5.1 Convolutional neural network

The CNN is established as a convolution-based hierarchical image pattern recognizer and is the most widely used DL model in RFD. The discrepancy caused by using 1-D time-series signals in a 2-D convolution-based network is solved by different methods such as 1-D to 2-D conversion or by introducing 1-D CNNs.

In one of the early RFD attempts, Janssens et al. (2016) addressed rotor misalignment fault using various levels of lubricant degradation and other common bearing faults. They presented their work to manifest the improvements of feature-learning based approach over feature-engineering based approach by quantifying CNN’s performance over linear regression based on handcrafted features. The proposed CNN contained 32 convolution filters and 200 hidden units in the fully connected layer with scaled DFT of raw data as input, and it achieved an accuracy of 93.61% over 87.25% of the conventional model.

CNN is specifically designed for handling 2-D data; therefore, several image representations of input data have been widely used with CNNs. Jeong et al. (2016) proposed orbital image-based diagnosis using CNN for identifying unbalance and misalignment. Orbital images require specific preprocessing such as denoising, offset shifting, re-orienting, and rescaling before they are applied to a CNN. The proposed CNN was formed by a stack of two convolution layers and two max-pooling (sub-sampling) layers, followed by fully connected layers. This framework attained significantly good results in classifying orbital images. IR images are one of the successful ways of providing input data for structural fault analysis of the rotor. Liu et al. (2017b) showed CNN to be effective in such a scenario by comparing its performance with that of SVM on input images of different sizes. SVM with features extracted from the histogram of thermal images was unable to match the accuracy of CNN. A data fusion method for converting data into an RGB-channel false-color image and a position-based polar image was proposed by Yao et al. (2018). First, they created a simulation of the fault in the MATLAB/Simulink environment, in which harmonics with frequencies 1x, 2x, 4x, and higher values were injected to simulate characteristic components of certain faults. The raw data collected in this way was then converted to color and polar images in order to connect with practical working conditions. The method was validated on a ball screw testbed with a misalignment fault.

There are several ways by which vibration data has been applied to CNNs in 2-D form. The symmetrized dot pattern (SDP) image was shown to be one convincing option by Zhu et al. (2019a). The normalized time waveform was mapped onto a polar graph to produce SDP images, which was demonstrated to be an effective method of representing time-series data from multiple sensors. The effectiveness of the SDP image was shown by applying it to a CNN with two convolution and pooling layers and one fully connected layer, using faulty data on unbalance, misalignment, rub, and oil whirl. The same authors developed another method using SDP images and CNN with characteristic information fusion (Zhu et al. 2019b). They used the Hilbert vibration decomposition (HVD) technique to decompose the original signal into modal components and then fused them based on the information fusion of SDP. A CNN with two convolution and pooling layers and one fully connected layer achieved 97.7% precision, indicating that the proposed fault diagnosis method was highly reliable. Another approach, which converts raw data (sampled into a phonetic form) to a 2-D image input for the CNN based on the spectrogram, was proposed by Yu et al. (2018). The power spectral function of the signal was converted to a 2-D image and applied to a batch regularized CNN with VGG16 architecture. The effectiveness of the proposed method was demonstrated by comparing its performance with BPNN, SVM, and ELM. An interesting approach for providing a 2-D representation of vibration data is the continuous wavelet transform scalogram (CWTS), which was proposed by Guo et al. (2018). The CWTS was obtained by decomposing vibration signals at different scales using the wavelet transform with certain preprocessing, and it was then applied to a parallel CNN to acquire the scores used to select the appropriate fault model.
The model was applied to data from different testbeds, demonstrating that the proposed approach was a universal and generic solution. The dislocated time-series CNN (DTS-CNN) was proposed by Liu et al. (2017a) to handle the disparity between time-series industrial data (most often in 1-D form) and 2-D images, thereby helping CNNs to perform as well as they do in other domains. DTS-CNN adds a dislocate layer that extracts the relationship between signals with different intervals in periodic mechanical signals. The dislocated data matrix was applied to a CNN architecture with four convolution layers, two pooling layers, and two fully connected layers, resulting in superior performance over the other methods, especially on non-stationary data. Recently, Yongbo et al. (2020) developed a method using CNN that extracted fault features from IR thermal (IRT) images and fed them into a softmax regression classifier for fault pattern identification. The proposed method achieved an accuracy of 98.59%, outperforming DBN, DNN, and SAE. In another similar work, Liu et al. (2020) used shaft orbit images as input to the CNN. This generated the first set of features, while the DWT of the denoised swing signal together with SVD created the second set of features. Finally, the combined CNN and DWT-SVD features were applied to an SVM classifier to attain a better recognition effect than other methods. Presenting the vibration signal to a CNN in the form of a 2-D Kurtogram was also observed in some bearing fault analysis works (Udmale et al. 2019a, b).
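The mapping of a normalized time waveform onto a polar graph can be sketched with the commonly used SDP formulation: the sample at index i sets the radius, and the sample at index i + lag sets an angular deflection that is mirrored about each symmetry axis. The parameter values below (lag, number of arms, angular gain) and the function name are assumptions; they are not the settings used by Zhu et al. (2019a, b).

```python
import math

def sdp_points(signal, lag=1, arms=6, gain_deg=30.0):
    """Map a 1-D signal to symmetrized dot pattern points (radius, angle in degrees).

    radius = normalized x[i]
    angle  = axis +/- gain_deg * normalized x[i + lag], for each mirror axis
    """
    lo, hi = min(signal), max(signal)
    span = hi - lo
    if span == 0:
        raise ValueError("signal must not be constant")
    points = []
    for i in range(len(signal) - lag):
        radius = (signal[i] - lo) / span
        deflection = (signal[i + lag] - lo) / span * gain_deg
        for m in range(arms):
            axis = 360.0 * m / arms  # mirror-symmetry axis of this arm
            points.append((radius, (axis + deflection) % 360.0))
            points.append((radius, (axis - deflection) % 360.0))
    return points

# Synthetic waveform: fundamental plus a weaker second harmonic.
wave = [math.sin(2 * math.pi * k / 50) + 0.3 * math.sin(4 * math.pi * k / 50)
        for k in range(200)]
points = sdp_points(wave)
print(len(points), points[:2])
```

Rasterizing these points into a fixed-size image gives the 2-D input for the CNN; changes in the waveform show up as changes in the thickness and curvature of the arms.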

Table 5 Summary on CNN architecture

Most of the CNN architectures reported in the industrial fault diagnosis system either deal with 2-D plots or convert raw data to a 2-D matrix for operational convenience. But Ince et al. (2016) introduced adaptive 1-D CNN, which used 1-D arrays instead of 2-D matrices for both kernels and feature maps. In this method, the 2-D convolution and rotation operations were replaced with corresponding 1-D operations. Moreover, instead of applying raw motor current signals to CNN, a notch filtered current spectrum was applied as input to suppress power system fundamental frequencies. Wang et al. (2018) demonstrated another notable work utilizing 1-D CNN. They proposed an efficient way of fusing multi-sensor data for multiple motor faults with a multi-model DL containing 1-D CNN and MLP. They applied the frequency spectrum of current and vibration signals with TD values as input to the learning model. The current envelope analysis, performed by HT helped the model to identify sideband harmonics of faulty conditions. The experiments proved that the method was efficient and robust not only in rotor faults but also in bias sensor faults in varying load conditions. Vector CNN (VCNN) developed by Xiaoxun et al. (2019) was another 1-D CNN based approach. It took the original 1-D vibration signal vector into the model, avoiding the information loss observed in the signal processing operations. The model was tested with rotor rubbing, unbalance, misalignment, and whirl faults on a testbed. Liao et al. (2019) proposed a 1-D CNN-GRU model for hydroelectric generating unit (HGU) fault diagnosis in order to combine the speed and lightness of convnets with the order-sensitivity of GRU. The framework followed a three-layer structure consisting of the input layer, the hidden layer, and the output layer, where the hidden layer contained the convolution layer, pooling layer, GRU layer, and the fully-connected layer. 
GRU, combined with CNN, was able to achieve significant improvement in the results. A recent contribution in 1-D CNN was presented by Zhao et al. (2020a) to identify shaft misalignment and crack in the rotor system. The proposed multi-input convolutional neural network (MI-CNN) took the raw TD input signal and applied it to 1-D CNNs, in which kernels of different sizes were adopted to extract signal features of diverse dimensions. The method achieved an accuracy of 99.42% when the fused features were applied to a softmax function for classification.
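
The idea of applying 1-D kernels of different sizes to a raw signal and fusing the resulting features can be sketched as follows. This is only an illustration with random, untrained kernels. The kernel sizes, the number of kernels, and the use of global max pooling are assumptions and do not reproduce the MI-CNN of Zhao et al. (2020a).

```python
import numpy as np

def conv1d_valid(signal, kernels):
    """Cross-correlate a 1-D signal with each row of `kernels` (valid mode)."""
    k = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(signal, k)  # (n-k+1, k)
    return windows @ kernels.T                                     # (n-k+1, n_kernels)

def multi_scale_features(signal, kernel_sizes=(3, 9, 27), n_kernels=4, seed=0):
    """Extract and fuse features from several kernel sizes (illustrative)."""
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    feats = []
    for k in kernel_sizes:
        # Random kernels stand in for learned ones (assumption).
        kernels = rng.standard_normal((n_kernels, k)) / np.sqrt(k)
        fmap = np.maximum(conv1d_valid(signal, kernels), 0.0)  # ReLU activation
        feats.append(fmap.max(axis=0))                         # global max pooling
    return np.concatenate(feats)                               # fused feature vector
```

In a trained network the kernels would be learned by backpropagation and the fused vector would be passed to a softmax classifier.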

Modifying CNN structure is another way by which the researchers utilize CNN in order to adapt to specific requirements of the RFD problem. Sadoughi and Hu (2019) proposed a physics-based CNN (PCNN) which modified the CNN structure by incorporating physical knowledge about the faulty component and fault characteristics. They achieved it by designing a convolutional layer that functioned based on the similarity between the input signals (data) and the fault characteristics signals (physics). The overall structure of PCNN contained five convolutional layers, three max-pooling layers, and one global max-pooling layer, as well as spectral kurtosis (SK), envelope, and FFT function layers around the first convolution layer. The CNN proposed by Zhang et al. (2017) with wide first-layer convolutional kernel and deep network structure with small convolutional layers is known as WDCNN (Wide first-layer kernels deep CNN). The most important property of WDCNN is its ability to suppress high-frequency noise while taking raw input data directly. This work utilized data augmentation to increase the number of inputs in order to improve generalization ability. It was able to achieve 100% accuracy in the public CWRU bearing dataset. Different CNN models and architectures used in RFD are shown in Table 5.

Along with structure alteration, there were several attempts to make CNN suitable for multi-sensor data through data fusion or feature fusion methods. For integrating automatic feature learning, adaptive feature fusion, and accurate fault classification of data from multiple sources, a multi-stream CNN was proposed by Yuan et al. (2019). In the feature learning layer, 1-D convolution was employed for the raw vibration signal. Then 2-D convolution was used for the infrared image, followed by a feature fusion layer and a softmax layer for classification. Misalignment, rub impact and looseness fault data from a testbed was used for demonstrating the model, and the results were compared with BPNN. Sonkul et al. (2019) developed multi-channel CNN (McCNN), which accepted multi-channel input from bearing, shaft, and gearbox with sensors arranged in horizontal and vertical directions. It took raw TD data and needed little preprocessing. Yuan et al. (2018) proposed a novel multi-mode convolutional neural network (M-CNN) for the fusion of both structured and unstructured, heterogeneous data. Adjustable filter banks were introduced to decide appropriate convolution mode, and t-SNE was applied to fuse the deep features. On raw vibration as well as infrared image input data, CNN achieved a classification rate of 98.97% on almost all important SRFs. The convolutional pooling architecture on CNN operated as a feature learning tool for specific other ML models. Sun et al. (2017) developed such a framework in which the local filters convolutional pooling architecture was learned by backpropagation neural network to give it discriminative power so that it derived invariant and robust features. Finally, SVM classified six faults of an induction motor with the best performance among state-of-the-art methods.

Moreover, improving the generalization ability, even with scarce labeled samples, is one of the major issues in applying DL to SRFs. Zhang et al. (2018) proposed the use of handcrafted features generated by STFT and the labeling of unlabeled data with SVM. A CNN with six convolution layers, two pooling layers, and two fully connected layers was designed to discriminate different unbalance conditions and looseness with better accuracy and generalization ability. The same work was modified (Zhang et al. 2019a) by applying the features from raw data to several candidate SVMs and selecting the best-performing one, which assisted in creating an augmented training set (ATS). The CNN was designed with two sets of convolution, pooling, and fully connected layers to improve the fault diagnosis results, and it was tested with two different datasets. Another approach to the problem of insufficient data with an extremely imbalanced distribution in DL models was presented by Xu et al. (2019) with a small data-driven convolution neural network (SDD-CNN). They adopted the label dilation method to address the imbalance in data distribution and a semi-supervised data augmentation method to address data scarcity. Among the four architecture variants they demonstrated, SDD-inception v3, which used a deep transfer learning strategy, proved superior with 99.56% accuracy. Transfer learning has been used effectively for the same purpose in certain works. Xiao et al. (2019b) proposed transfer learning with a CNN embedded with maximum mean discrepancy (MMD) to handle unlabeled data, including unlabeled data under constantly varying working conditions (which are common in real applications). The MMD regularization term drove the CNN parameters to reduce the distribution mismatch between the features in the source and target domains. 
The CNN was designed with four convolution layers, one pooling layer, and four fully connected layers, and was tested on New European Driving Cycle (NEDC) data. The results revealed that the proposed method achieved higher accuracy than the other methods.
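
The MMD term used as a domain-mismatch regularizer can be illustrated with a minimal sketch of the (biased) squared-MMD estimate under a Gaussian kernel. The kernel choice and the bandwidth sigma are assumptions made for illustration, and the exact formulation in Xiao et al. (2019b) may differ.

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two feature sets."""
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st
```

During training, a value such as mmd2(source_features, target_features) would be added to the classification loss with a weighting factor, so that minimizing the total loss also pulls the two feature distributions together.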

Hardware associated approaches with CNN that directly deal with the industrial internet of things (IIoT) based hardware platforms, as well as with other embedded systems, have also been observed in RFD literature. An IIoT based application development for rotor fault diagnosis was demonstrated by Langarica et al. (2019a) with dynamic incremental PCA (DIPCA) and reconstruction-based contribution (RBC). In this approach, a statistical data-driven algorithm (for preliminary detection and identification of faults) was combined with a signal-based method that utilized a CNN for refined fault diagnosis of vibration signals. The CNN, with two convolutional layers of 20 filters and 40 filters, respectively, two pooling layers and softmax output function, achieved an identification accuracy of over 90%. The IIoT components and design principles were well defined and followed in this framework. In another hardware related approach, Lu et al. (2019b) implemented an embedded system with Raspberry Pi and signal acquisition and processing circuit for evaluating the performance of the proposed CNN in real-time data. The CNN input image was created from a vibration signal by base two logarithmic spectrum (B2LS), and the proposed CNN structure included two convolution layers, two max-pooling layers, one fully connected layer, and a softmax layer. This work proved to be a heterogeneous computing framework for motor fault diagnosis on a small-size, flexible, and convenient handheld device.

Adding discriminative ability to an existing classifier improves performance to a great extent, as seen in Sun et al. (2017), where the CNN provided discriminative power and the SVM was the classifier. A closely related approach, with an adversarial network as the discriminator and a CNN as the classifier (DACNN), was proposed by Han et al. (2019b). DACNN merged the idea of adversarial training into the feature learning of the deep model, i.e., the parameters of a CNN trained with small datasets were transferred for training with the adversarial strategy. By experimenting with a wind turbine test rig and a generic gearbox fault dataset, they showed that their method was effective and superior even with smaller datasets. The same authors (Han et al. 2019a) proposed a way to extract spatial (between measurements) and temporal (for each measurement) features using a spatio-temporal pattern network (STPN). It provided an efficient way to apply multivariate time series to a 1-D CNN in single-channel or multi-channel mode with a complexity less than that of a 1-D CNN. The authors showed experimentally that the spatiotemporal CNN (ST-CNN) outperformed SL methods like SVM and RF. Besides, dealing with compound faults remains one of the least addressed research challenges in RFD. Liang et al. (2019) made such an attempt with an end-to-end diagnosis model integrating CNN with wavelet transform and multi-label classification, namely WT-MLCNN, which directly used raw vibration signals. The main attraction of the work is that, rather than treating a compound fault as a new single fault, the model treated diagnosis as a multi-label image classification problem by converting the 1-D vibration signal to 2-D time-frequency images using the wavelet transform.
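
The multi-label view of compound faults can be illustrated with a short sketch of the output stage only: each fault gets its own sigmoid output, so several faults can be active at once. The label set, the threshold, and the binary cross-entropy loss are assumptions made for illustration and are not taken from Liang et al. (2019).

```python
import numpy as np

# Hypothetical label set, chosen only for illustration.
FAULTS = ("unbalance", "misalignment", "rub", "looseness")

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))

def multilabel_loss(logits, targets, eps=1e-12):
    """Mean binary cross-entropy over independent per-fault outputs."""
    p = np.clip(sigmoid(logits), eps, 1.0 - eps)
    t = np.asarray(targets, dtype=float)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

def decode_faults(logits, threshold=0.5):
    """Return, per sample, every fault whose probability passes the threshold."""
    probs = np.atleast_2d(sigmoid(logits))
    return [[name for name, p in zip(FAULTS, row) if p >= threshold]
            for row in probs]
```

With this formulation a compound fault is simply a sample with more than one active label, so no extra class has to be defined for each fault combination.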

The discussion on CNN can be summarised as follows. First, around 51% of the RFD-related works that use DL adopted CNN as the classifier, which indicates how CNN has been a game changer, even with time-series data inputs. However, the main concern with using CNN for RFD relates to the 2-D representation of time-series data produced by machine health monitoring sensors. Several time-tested, efficacious methods have been used for this conversion, including SDP, CWTS, and the spectrogram. Meanwhile, rather than using time-series data as input, a number of methods directly acquire 2-D images such as orbital images or IR images. A fair number of the related works (68%) altered the structure of CNN to make it compatible with RFD issues, while the review substantiates the fact that utilizing DFC with CNN is close to impossible. It is also clear that 1-D CNNs have begun to gain popularity, and that adding discriminative ability to CNN supports its wider acceptance.

5.2 Deep belief networks

DBN is an unsupervised, probabilistic DL network with a stacked structure of RBMs or AEs, pre-trained using a greedy learning algorithm. Its ability to deliver impressive results with fault-specific features rather than with raw data has earned this model wide acceptance in the RFD research community.

Certain approaches are significant for the way they preprocess the input so that the DBN can make more out of it. In this direction, Oh et al. (2016) developed a method for converting raw vibration signals to images by the omnidirectional regeneration (ODR) method. The histogram of oriented gradients (HOG) descriptor was presented as input to the DBN. They observed that the classification accuracy varied from 95% to 100%, depending on the number of nodes in the hidden layers of the DBN. Guan et al. (2019) made a significant contribution to structural fault identification by combining EMD and DBN. The IMF signals generated by EMD were selected based on fault information content using sample entropy. The selected signals were reconstructed into new vibration signals, transformed by FFT, and then presented to the DBN. The effectiveness of the proposed method was tested with the FD and TD structural fault features. Compared with BPNN and CNN, this method achieved the highest diagnostic accuracy and was superior in precision. Compressed sensing has proved to be an efficient method to reduce the data volume and improve analysis efficiency. It was applied effectively by Shao et al. (2018) along with a convolutional DBN (CDBN). To enhance the feature learning ability for the compressed data, the CDBN model was constructed with Gaussian visible units. The exponential moving average (EMA) technique was employed to improve generalization performance.

The handling of heterogeneous data sources is another criterion in RFD. Yan et al. (2019a) proposed a multi-DBN model to solve multi-source heterogeneous information fusion problems. Multiple DBNs approach was adopted in such a way that one DBN was used per each sensor stream, and finally, the fused results were presented as output. The authors demonstrated their method with shaft orbit information and multi-source heterogeneous information, which were collected from a testbed on different rotor misalignment conditions. DBN was observed as a fault feature extractor by Shao et al. (2016). The features from each layer of the multi-layer representations were collected for the performance evaluation of layer-wise feature extraction. The same was utilized for comparison with the traditional features. BPNN was demonstrated as a decent classifier that is capable of classifying six different rotor faults far better than the other methods with the DBN extracted features. A DBN approach for rotor fault classification and mechanical degradation assessment was presented in the work of Li et al. (2019), who applied DBN by stacking five RBM layers with Bernoulli functions. The input vibration data of unbalance, rub, and looseness faulty conditions decomposed by WPD, was applied to the model, and its performance was compared with DNN, k-NN, SVM, and BPNN models.

We can summarise the following aspects of DBN based on the preceding discussion. First, DBN is among the preferred options for RFD researchers, as it leaves them free to apply feature processing and thus to take advantage of fault-specific features. Although we identified only a few works related to DBN, around 44% of them focused exclusively on rotor faults. Most of them demonstrated the scope and possibilities of using DBN in different respects, including feature extraction, the fusing of DBNs for multi-sensor data, using DBN with regenerated input, and convolutional DBNs. Meanwhile, layer-wise feature extraction is another interesting aspect of utilizing DBNs. In short, the possibility of using extracted features rather than raw data is the distinguishing feature of this method.

5.3 Autoencoders and DNN models

As an unsupervised method capable of learning features, AE is widely used in many applications, including RFD. The basic AE models are enhanced by stacking multiple AEs to form a stacked autoencoder (SAE), while, through the addition of denoising capability, stacked denoising autoencoders (SDAE) can be developed.

Chen and Li (2017) proposed a simple and straightforward implementation of AE by developing a deep neural network with a stack of multiple DAEs as hidden layer. The model employed dropout regularization to avoid overfitting, and it was used for the identification of SRF with rub and oil film whirl. Even without any fault specific signal preprocessing, their method was able to incorporate the characteristics of input data effectively and extract characteristic information of different faults. Additionally, the feature learning ability of AE has been widely utilized in RFD, and several attempts were made to enhance this feature learning capability. A hybrid AE model combining one denoising autoencoder (DAE) and multiple contractive auto-encoders (CAE), was proposed by Shao et al. (2017) that enhanced the feature learning ability of standard AE. They utilized locality preserving projection (LPP) for the fusion of features to extract the most representative information and a softmax layer for the classification of SRF. They considered different inputs like raw vibration data, features extracted from certain frequency-band signals, features extracted from the most sensitive frequency band signal, etc. The performance of the model was compared with standard DAE, standard CAE, BPNN, and SVM. For good measure, learning fault specific features from TD and FD transformed data with SDAE was also studied in the RFD literature. Zhao et al. (2018) illustrated SDAE to detect the rotor faults with FFT transformed FD signal as input. The performance of SDAE was compared with the different number of hidden nodes and deep architectures, as well as with other models like BPNN and SVM, supplying TD and FD input signals. The SDAE deep architecture adaptively extracted the fault features and was able to achieve an optimum accuracy of 99.89% with softmax output function. Meng et al. 
(2019) identified several issues associated with deep learning architecture in SRF analysis, especially when using AE. They addressed irregularity of raw vibration, generalization issue, increased number of parameters in the DL model, etc. in their work. They proposed segmentation of raw input, data augmentation, and scaled conjugate gradient (SCG) algorithm in SAE respectively to address these issues. In induction rotating machinery fault diagnosis, the proposed method proved to be superior over the other methods, and it achieved an accuracy of 99.6%.

Furthermore, rotor misalignment introduced through ball removal from the bearing of a self-aligning bearing rotor system, along with other bearing faults, was studied by Narendiranath Babu et al. (2018). They presented two autoencoders and a softmax layer for the realization of a DNN with an SCG algorithm and the cross-entropy technique. The performance was compared with ANN, and EMD implemented along with HHT achieved 100% accuracy for DNN. The same authors presented a similar work (Babu et al. 2018) using a journal bearing dataset, and they extended the study with TD, FD, and TFD features on the models. The AE with an increased number of hidden layers and a final softmax layer again yielded a classification accuracy of 100%. The results highlighted that fault types begin to overlap as the number of faults increases, and that a deep AE performs well in such situations. An SDAE was employed by Kong et al. (2019) for identifying rotor-related faults of an aero-engine. The vibration input from multiple sensors underwent multi-modal heterologous feature extraction by SDAE, and support vector classifiers were then used to fuse the SDAEs in different modalities to obtain acceptable classification results. Recently, Tang et al. (2020) introduced an SAE to identify contact rubbing, oil whirl, dynamic unbalance, and static unbalance faults. The SAE extracted significant fault features after a statistical filtering process, which reduced the interference information and increased the signal-to-noise ratio. The results showed the dominance of the proposed method over DAE, SVM, and BPNN.

Improving the robustness of the feature representation of sparse AE is important, and one such work was presented by Sun et al. (2016). Partial corruption was added to the input through denoising coding, and the learned sparse feature representation assisted the training of a neural network with dropout regularization for classification. During forward propagation, the average activation was computed to calculate the sparse cost function. The framework was compared with NN, LR, and SVM on rotor fault data of an induction motor and was superior to the other methods. Lei et al. (2016a) proposed a two-stage learning process utilizing sparse filtering in SAE. The local discriminative features were extracted from raw vibration signals using sparse filtering. Then, by averaging these local features, the learned features were obtained and applied to softmax regression. Though the method does not deal with SRF data, we included this work since it can handle the scarcity of labeled data, which is very common in RFD. Moreover, the authors observe that the weight vectors of sparse filtering have properties similar to those of Gabor filters.
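
The role of the average activation in the sparse cost function can be made concrete with a small sketch. It assumes the widely used Kullback-Leibler sparsity penalty, sigmoid units, and masking noise for the input corruption. The source does not state that Sun et al. (2016) used exactly these choices, and the parameters rho, beta, and corruption are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparsity_penalty(hidden, rho=0.05, eps=1e-8):
    """KL divergence between the target sparsity rho and the average
    activation of each hidden unit, summed over the hidden units."""
    rho_hat = np.clip(hidden.mean(axis=0), eps, 1.0 - eps)
    kl = (rho * np.log(rho / rho_hat)
          + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))
    return float(kl.sum())

def denoising_sparse_cost(x, w, b, w_out, b_out,
                          rho=0.05, beta=3.0, corruption=0.2, seed=0):
    """Reconstruction error on corrupted input plus the sparsity penalty."""
    rng = np.random.default_rng(seed)
    mask = rng.random(x.shape) >= corruption   # zero out a fraction of the inputs
    hidden = sigmoid((x * mask) @ w + b)       # forward pass on the corrupted input
    recon = sigmoid(hidden @ w_out + b_out)    # reconstruction of the clean input
    mse = 0.5 * np.mean(np.sum((recon - x) ** 2, axis=1))
    return float(mse + beta * sparsity_penalty(hidden, rho))
```

The penalty vanishes when every hidden unit is active at the target rate rho on average and grows as the average activations drift away from it.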

Certain methods reduce the overhead of tuning the parameters of the DL models, and one such method was proposed by Yang et al. (2016). An ELM based autoencoder generated a weight matrix of the hidden layer, that transformed the input dataset into a new feature representation, and this representative learning avoided information loss compared to the traditional feature extraction methods. The last single layer of ELM performed the fault classification task. The framework was compared with different types of feature extraction methods of wind turbine data and achieved the best result. Another approach with ELM was proposed by Sharma et al. (2015), in which RMS value of 3-phase voltage and current signature was considered as input to the model. They identified six external faults of three-phase induction motor and compared ELM performance with SVM and MLP to highlight the performance of ELM. From the ELM design perspective, the work of Yang et al. (2016) was differentiated from other ELM designs by the hierarchical structure of ELM. The ELM in this work was designed by concatenating the forwarding list of ELM and were processed independently. The two crucial ELM based components in the framework were ELM based autoencoder and ELM based classifier so that the ’number of hidden nodes’ was the only adjustable parameter. Another architecture modification in ELM was observed in ensemble ELM (EELM), proposed by Wang et al. (2020) for compound-fault diagnosis of rotating machinery. It consisted of an ELM based clustering sub-network that generated the Euclidean distance representations from each point to every centroid followed by another subnet that identified potential output tags through multiple-output-node multi-label learning. TD and FD features were extracted by PSO and variational mode decomposition (VMD), and the method proved its dominance on a real-world database. Recently Zhao et al. 
(2020b) proposed a multi-manifold deep extreme learning machine (MDELM) algorithm to classify UB, MA, LS, and rub faults. The feature extraction was performed by an ELM based modified sparse filtering, and then the mining of the intra-class and inter-class discriminant feature information was accomplished by MDELM with multi-manifold constraints. State-of-the-art algorithms were selected as the benchmark for comparative studies of the MDELM algorithms. In essence, ELM proved to be an efficient method of fault classification for any kind of rotating machinery.
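
The reason ELM-based models need so little tuning can be seen in a minimal sketch of a basic single-hidden-layer ELM classifier: the hidden weights are drawn at random and kept fixed, and only the output weights are solved in closed form with a pseudo-inverse. This is the plain ELM, not the hierarchical, ensemble, or multi-manifold variants discussed above, and the class name, the tanh activation, and the defaults are assumptions.

```python
import numpy as np

class ELMClassifier:
    """Basic extreme learning machine with one random hidden layer (sketch)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden            # the main adjustable parameter
        self.rng = np.random.default_rng(seed)

    def _hidden(self, x):
        return np.tanh(x @ self.w + self.b)

    def fit(self, x, y):
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=int)
        n_classes = int(y.max()) + 1
        # Random input weights and biases are never trained.
        self.w = self.rng.standard_normal((x.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        targets = np.eye(n_classes)[y]      # one-hot class targets
        # Output weights: least-squares solution via the pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(x)) @ targets
        return self

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        return np.argmax(self._hidden(x) @ self.beta, axis=1)
```

A call such as ELMClassifier(n_hidden=20).fit(features, labels).predict(new_features) would train and apply the model without any iterative optimization.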

Neural networks with a deep structure of multiple feed-forward layers (with neither RBMs nor AEs) are also available in the RFD literature. Bo et al. (2017) developed a simple DNN model that dealt with the whirl machinery rotor vibration data to identify rotor unbalance, misalignment, and the rubbing fault. Eight spectrum values, including the 1x, 2x, 3x harmonics, which have a high effect on SRFs, were presented as the input eigenvalues of the network. Li et al. (2019) applied a DNN with five hidden hyperbolic tangent layers with 50, 50, 32, 32, and 32 nodes, respectively. The model was trained on coefficient and energy-based features extracted by DB4 WPD, using a parallelized stochastic gradient descent algorithm. Unbalance, rub, and looseness faults were implemented, and the results were compared with DNN, k-NN, SVM, and BPNN to indicate its superiority.
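
Extracting the 1x, 2x, and 3x harmonic amplitudes that serve as network inputs can be sketched as follows. The sampling rate fs and the shaft frequency shaft_hz are assumed to be known, the nearest FFT bin is used for each order, and the remaining spectrum values used by Bo et al. (2017) are not specified in the text and are therefore not reproduced.

```python
import numpy as np

def harmonic_amplitudes(signal, fs, shaft_hz, orders=(1, 2, 3)):
    """Single-sided FFT amplitudes at multiples of the shaft frequency.

    fs and shaft_hz are assumed known; the nearest FFT bin is used.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    spectrum = np.abs(np.fft.rfft(signal)) * 2.0 / n   # single-sided amplitude
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amps = []
    for k in orders:
        idx = np.argmin(np.abs(freqs - k * shaft_hz))  # bin closest to k x shaft
        amps.append(spectrum[idx])
    return np.array(amps)
```

In practice, windowing and a small search band around each order would make the estimate more robust to speed fluctuation and spectral leakage.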

The AE-based RFD research can be summarised as follows. First, autoencoder-based models are the most widely used DL models after CNNs in the RFD-related literature. In fact, since RFD uses the features learned by AE, only nominal attempts have been made to utilize DFC in the literature, and there is an ongoing trend within the research fraternity to enhance the feature-learning capability of AE. Deep AE performs well in terms of multiple-fault or overlapped-fault situations. Generally, SDAE performance has been studied with varying numbers of hidden nodes and different deep architectures. Meanwhile, ELM-based AE is becoming increasingly popular since it shifts the emphasis from the overheads involved in parameter tuning.

5.4 LSTM and GRU

An RNN is capable of learning temporal information of sequential data by holding the past information and sharing weights using its recursive structure. LSTM and GRU are two variants of RNN that are equipped with gates to avoid vanishing gradient problem. There were only a few attempts made to capture the long-term dependencies of input time-series signal using the sequential analysis techniques of AI. One such work was manifested by Lei et al. (2019), who introduced LSTM for categorizing SRFs, bearing faults, and other compound faults in a wind turbine test rig. They dealt with multivariate time-series raw data and used simple data fusing techniques, opening an opportunity for applying advanced data fusion methods. The proposed method was compared with SVM, MLP, RNN, WDCNN, and DCNN with a single sensor as well as multi-sensor data, and projected the dominance of their method. Xiao et al. (2019a) proposed a more advanced approach of hybrid feature learning that combined statistical parameters, recurrence quantification analysis (RQA), and three-layer stacked LSTM. Twenty-nine statistical features, including 16 TD features and 13 FD features and 10 RQA features were extracted and applied to the batch normalization layer. The method proved its preeminence by comparing state-of-the-art AI-based fault diagnosis methods, together with different feature extraction techniques. Similar to LSTM, GRU can also utilize long term dependency of time-series data. Liao et al. (2019) proposed a 1-D CNN-GRU architecture that adaptively learns fault factors. In this three-layer structure, the hidden layer contained a GRU layer with the convolution layer, the pooling layer, and the fully-connected layer. GRU, combined with CNN, helped to exploit the speed and lightness of convnets with the order-sensitivity of GRU. 
The authors proved the effectiveness of the system by implementing the model in a working hydropower station and illustrated good results in four small hydro operation monitoring systems.

We can conclude the following about sequential learning DL approaches in RFD. Although the majority of sensing methods used in RFD produce time-series data, only a few attempts have been made to apply sequential analysis to capture the time-series behavior of RFD data, resulting in a nominal share for RNN in the RFD literature. To deal with multivariate time series, data fusion has shown its effectiveness in certain works. LSTM and GRU models have gained recognition in recent times, which provides a roadmap for RFD analysis to explore long-term dependencies. These models are easily combined with other models to exploit the strengths of both.

5.5 Generative adversarial networks

Being an input-reconstruction-based semi-supervised learning method, GAN has turned out to be a critical solution to data-related issues in RFD. In one such attempt, the data imbalance issue was addressed by Lee et al. (2017) by applying GANs to oversample the minority class. Their method contained two GANs for two different types of faults. They compared GAN-based oversampling with standard oversampling methods and their combined effects on the performance of DNN. Based on the results, they concluded that when the imbalance ratio is minor, GAN is adequate, and if the imbalance ratio is significant, then standard methods are desirable. Their results showed that GANs could be utilized effectively in intelligent fault detection and diagnosis using DNN with imbalanced data. The data imbalance issue was also addressed by Han et al. (2019b) using an adversarial network as the discriminator and a CNN as the classifier (DACNN). In this model, a CNN trained on small datasets is transferred to DACNN and then trained with the adversarial strategy. By experimenting with a wind turbine test rig and a generic gearbox fault dataset, they showed that their method was effective and superior even when the dataset size is very small.

We can infer the following observations about GANs from the literature. GANs are mainly used for oversampling data in RFD research. While dealing with small-sized datasets, GANs proved to be a solution to many issues, including data imbalance. The philosophy of adversarial training is applicable in other DL models as well.

6 Summary and challenges

This study presents a systematic overview of various state-of-the-art AI-based techniques for RFD. Moreover, special attention has been given to introducing rotor faults and their characteristics (see Table 1), which are instrumental in deciding the feature selection process, the preprocessing methods, and even the choice of classifier. This section summarizes the research progress made so far in RFD, in line with the various phases of the data-driven AI-based RFD framework. The overall statistics of the state-of-the-art research are shown in the heatmap in Fig. 5.

Fig. 5 Summary heatmap of RFD (values in percentage)

Data acquisition phase: ML and DL methods follow the same process in the data acquisition phase of RFD, though they demand data in different quantities and at different precision. For an acceptable level of generalization, DL cannot make do with the amount of data customarily needed by ML, but it can deal with noisy data environments. Therefore, the selection of the data source is always critical in RFD. From the literature summarized in the heatmap, around 42% of works depended on the RTB method for data collection and about 20% relied on other sources, including test rigs for bearings, gears, or wind turbines, whereas a few works utilized open-source datasets as well. About 30% of works collected induction motor current and voltage as input. This is not an encouraging fact from the RFD perspective, as this input mainly deals with the BRB fault only, though a few exceptions are noted. It is interesting to notice that ML depended on other open sources in barely 13% of works, while DL drew on other sources in around 34%. This indicates that testbed-style data collection methods are unable to meet the data requirements of DL, which has to be analyzed in conjunction with the number of works performed explicitly for SRF. A significant finding is that 55% of ML-related works deal exclusively with RFD, whereas the corresponding share for DL is just 36%, which indicates that the remaining works in both ML and DL consider rotor faults along with bearing or gear faults. This limits the scope of RFD-specific research improvements, which in turn is reflected in the low percentage of AI works that utilized rotor-fault-specific parameters (DFC, frequency harmonics, etc.): 14% in ML and 8% in DL. This statistical analysis is shown in Fig. 6.

The facts about the data sources of RFD summarized so far lead to two important conclusions. First, the testbed, which is recognized as the primary data source of RFD, often fails to provide sufficient data for DL, so DL methods turn to open-source datasets. Second, these datasets lack rotor-specific data even though they contain a large amount of bearing or gear fault data.

The other challenges identified from the literature in acquiring faulty data are:

  • In real situations, a machine runs in faulty conditions very rarely compared with its normal running conditions, so it is difficult to obtain faulty data while keeping the sampling balanced.

  • Although faulty conditions can be simulated in testbeds to acquire data, running a testbed for a long time in a faulty environment is a challenging task.

  • Commonly available testbeds find it difficult to simulate frequently changing RPMs, loads, and other environmental conditions such as noise, which makes the data unrealistic most of the time.

Fig. 6 RFD specific research (percentage of works)

This scenario opens a new research direction in data generation or augmentation. Dataset augmentation is familiar to the image and speech processing research community but has not been significantly explored in time-series data processing. The techniques proposed for augmentation frequently use random sampling with replacement, which is readily implementable but has certain issues. First, there is no guarantee that the generated data follow the same distribution as the original data. Second, in most cases it does not ensure that the time-series properties (trend, seasonality, etc.) are present in the newly synthesized data. In augmentation, it is also challenging to confirm the labels of newly created samples, which opens a further direction for the research community. Another way of generating data is to use GANs as the data generator. As mentioned earlier, GAN-based approaches are the least explored in time-series-based machine health monitoring. Many researchers have shown that popular minority-class oversampling methods, such as ADASYN (He et al. 2008) and SMOTE (Chawla et al. 2002), are unable to provide good generalization when data are sparse and outliers are frequent, whereas GANs have often demonstrated much better performance in such scenarios. Proving the effectiveness of GANs in time-series data generation is therefore also a new area of research. Methods that address unlabelled or scarcely labelled datasets demand more research in small-dataset situations, which also invites more exploration of unsupervised learning paradigms. The option of ‘rotating machinery libraries’ that artificially create faulty rotor data, mentioned in Sect. 2, is not in vogue, but it offers something to count on in the present scenario.
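To make the augmentation discussion concrete, the following is a minimal sketch of three simple transforms that are commonly applied to time-series windows: window slicing, jittering, and amplitude scaling. It is not a method taken from the reviewed works. The function names, the parameter values, and the label `"unbalance"` are illustrative assumptions, and the sketch simply assumes that small perturbations leave the fault label unchanged, which is exactly the point the paragraph above flags as unverified. Nothing here guarantees that the synthesized windows follow the original data distribution.

```python
import math
import random


def window_slices(signal, width, step):
    """Cut overlapping windows; sample order inside each window is kept."""
    return [signal[i:i + width] for i in range(0, len(signal) - width + 1, step)]


def jitter(window, sigma, rng):
    """Add zero-mean Gaussian noise to every sample of the window."""
    return [x + rng.gauss(0.0, sigma) for x in window]


def scale(window, low, high, rng):
    """Multiply the whole window by a single random gain factor."""
    gain = rng.uniform(low, high)
    return [gain * x for x in window]


def augment(signal, label, width=64, step=32, sigma=0.01, seed=0):
    """Return (window, label) pairs: each original window plus two variants.

    Assumption: the label of every variant is inherited from the source
    signal; this is not verified by the code.
    """
    rng = random.Random(seed)
    samples = []
    for window in window_slices(signal, width, step):
        samples.append((window, label))
        samples.append((jitter(window, sigma, rng), label))
        samples.append((scale(window, 0.9, 1.1, rng), label))
    return samples


# Synthetic stand-in for a vibration record (hypothetical data).
signal = [math.sin(2 * math.pi * 5 * k / 256) for k in range(256)]
augmented = augment(signal, label="unbalance")
```

Because every window is a contiguous slice, the local temporal order is preserved, but longer-range properties such as trend are only kept to the extent that they fit inside one window.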

Among the signal sensing methods in the data acquisition phase, the most commonly adopted approach is vibration analysis. SRF is evidently a root cause that influences the characteristics and behavior of vibration. Hence, except for broken rotor bar faults, more than 80% of works depend on vibration data. Unlike some other methods, vibration sensors have to be mounted on the device, and as a result the signals are sensitive to noise as well as to the sensor mounting positions. DL mitigates these issues by applying deeper layers, but ML is not robust to noise, which explains the comparatively higher percentage of other sensing methods in the ML literature. In signal sensing, the position and number of sensors have strong implications for SRF since different faults affect different planes (refer to Table 1), and placing the sensors on that basis helps to achieve better diagnosis results.

Meanwhile, DL prefers higher-dimensional input data, demanding a greater number and variety of sensors, which leads to research problems related to the fusion of multi-sensor data. The statistics show that only around 10% of works in ML and DL combined try to deal with multi-sensor data, which indicates that SRF diagnosis has to take a big leap in this direction. Another important observation is that the selection of sensing methods heavily depends on the fault type. For example, the broken rotor bar fault is one of the prevalent rotor faults in induction motors, and the literature shows that no sensing methods apart from motor current and voltage have been tried for this fault in either ML or DL models. Only a small fraction of research (less than 5%) accounts for acoustic emission and temperature-related RFD studies. Still, these methods are gaining popularity because of their low cost, ease of installation, and non-contact, non-intrusive nature, which avoids many sensor-related hurdles.

Feature processing phase: DL automates feature extraction and selection by learning discriminative features in an end-to-end manner, and this phase is therefore skipped in DL. Certain signal processing techniques have nevertheless been utilized in DL models for data preprocessing and proper input representation. As opposed to end-to-end DL, ML relies on manual feature extraction, which is summarized from the RFD perspective as follows:

  • The symptom parameters of SRF are frequency harmonics. TD methods are unable to capture and utilize these parameters, resulting in poor performance for RFD.

  • The non-linear and non-stationary nature of rotor vibration signals complicates accurate fault diagnosis with FD methods, because the Fourier transform is suitable only for stationary signals and fails to reveal the inherent information of non-stationary signals. In the case of the wavelet transform, the selection of a wavelet basis and its inability to adapt to changes are the key factors that adversely affect performance.

  • While TFD processing techniques, particularly STFT and WVD, have gained immense popularity in RFD owing to their exceptional ability to handle non-stationary signals, they exhibit certain deficiencies. For instance, STFT cannot achieve ideal resolution in frequency and time simultaneously. Furthermore, STFT lacks the orthogonal bases needed to implement an efficient and fast algorithm. Similarly, WVD produces interference terms in its decomposition process.
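The STFT resolution trade-off noted in the last item can be illustrated with simple arithmetic. For a window of N samples at sampling rate fs, one window spans N / fs seconds and the frequency bins are spaced fs / N hertz apart, so improving one degrades the other. The sketch below only evaluates these two quantities; the sampling rate of 10240 Hz and the window lengths are assumed values chosen for illustration, not figures from the reviewed works.

```python
def stft_resolution(fs_hz, window_samples):
    """Return (window duration in seconds, frequency bin spacing in Hz)."""
    delta_t = window_samples / fs_hz
    delta_f = fs_hz / window_samples
    return delta_t, delta_f


# Assumed sampling rate; longer windows sharpen frequency but blur time.
FS_HZ = 10240.0
table = {n: stft_resolution(FS_HZ, n) for n in (256, 1024, 4096)}
for n, (dt, df) in table.items():
    print(f"N={n:5d}  window={dt * 1e3:6.1f} ms  bin spacing={df:6.2f} Hz")
```

The product of the two values is always 1 for a given window, which is why no single window length gives fine resolution in both time and frequency.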

The literature review statistics indicate that 18%, 40%, and 31% of works still depend on conventional TD, FD, and TFD features, respectively. Although TFD processing is best suited for extracting SRF-specific features from non-linear and non-stationary vibration signals, FD operations enjoy the larger share.

In such a scenario, fault-specific DFC extraction is evidently the key to feature extraction in RFD. However, using the aforementioned methods from a general perspective is ineffective for fault-specific feature extraction, as the fault categories are complex and numerous. About 14% of the ML literature in RFD accounts for fault-specific component extraction (refer to Fig. 6), which depends either on FD operations to extract the harmonics of the rotation frequency or on frequency decomposition (such as EMD and its variants, WPD, etc.) in the TFD. These facts signify that further research can be conducted on fault-specific symptom parameter extraction by judiciously utilizing conventional FD and TFD processing methods. Table 6 lists the contributions that utilize rotor fault specific symptom parameters, which demonstrates their significance in RFD.

Table 6 Fault specific component usage
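As an illustration of the FD route described above, the following sketch reads the spectral amplitude at integer multiples (1x, 2x, 3x) of the rotation frequency and returns them as features. It is a minimal example under stated assumptions rather than the procedure of any reviewed work: the rotation speed is assumed to be known and constant, the record is assumed to contain a whole number of rotation periods so that no spectral leakage occurs, and the function names and the synthetic signal are hypothetical. The mapping from individual harmonics to specific faults is deliberately left out; see Table 1 for the fault characteristics.

```python
import cmath
import math


def amplitude_at(signal, fs_hz, freq_hz):
    """Single-frequency DFT: amplitude of the sinusoidal component at freq_hz."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / fs_hz)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def harmonic_features(signal, fs_hz, rpm, orders=(1, 2, 3)):
    """Amplitudes at integer multiples of the rotation frequency (rpm / 60)."""
    f_rot = rpm / 60.0
    return {f"{k}x": amplitude_at(signal, fs_hz, k * f_rot) for k in orders}


# Synthetic record: 1800 rpm gives a 30 Hz rotation frequency.
fs, rpm = 1000.0, 1800.0
t = [k / fs for k in range(1000)]
sig = [1.0 * math.sin(2 * math.pi * 30 * x) + 0.4 * math.sin(2 * math.pi * 60 * x)
       for x in t]
features = harmonic_features(sig, fs, rpm)
```

With varying speed, which the text identifies as typical of industrial data, the rotation frequency would have to be tracked over time before such features remain meaningful.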

From the literature, it is evident that DL methods for RFD also require certain signal processing, with 26% and 18% of works using the FD and the TFD, respectively. Certain observations from the literature are summarized as follows:

  • The raw vibration signal is a 1-D time series, which necessitates conversion into 2-D form for a CNN to operate conveniently and quickly, though 1-D CNNs are also present in the literature.

  • As input, deep neural networks prefer frequency-decomposed components to raw vibration signals.

  • The performance of DBNs with raw data or TD features is worse than with features from the other domains.

  • The signal acquisition mechanism also matters; for example, infrared images or sound inputs need proper preprocessing before being applied to DL models.
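The 1-D to 2-D conversion mentioned in the first observation can take several forms, and the text does not single out one. The sketch below shows one simple possibility under that caveat: consecutive segments of the signal are stacked as the rows of a square grid and min-max scaled to [0, 1], giving a grayscale-like array. The function name and the scaling choice are assumptions for illustration; time-frequency images are another common option.

```python
def to_square_image(signal, side):
    """Stack consecutive segments of a 1-D signal as rows of a side x side grid.

    Values are min-max scaled to [0, 1]; samples beyond side * side are dropped.
    """
    needed = side * side
    if len(signal) < needed:
        raise ValueError("signal too short for the requested image size")
    chunk = signal[:needed]
    lo, hi = min(chunk), max(chunk)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant signal
    return [[(chunk[r * side + c] - lo) / span for c in range(side)]
            for r in range(side)]


# Toy input: 16 samples become a 4 x 4 grid.
image = to_square_image(list(range(16)), 4)
```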

With respect to fault-specific DFC utilization in DL, DBNs, AE-based models, and 1-D CNNs have opened up some provisions in the literature, but these constitute only 8% of the overall works (refer to Fig. 6) and require more research attention. It is observed that, once features of adequate quality are extracted, the feature selection process followed in RFD is the same as that followed in any other rotating machinery fault diagnosis process.

Classification phase: Around 80% of the ML models used in the RFD literature are SVM (41%), ANN (29%), and k-NN. NB and all the other classifiers account for less than 5% each. Similarly, CNN holds a predominant position among the DL methods with a 56% share of the literature, followed by AE-based models, while DBN has also registered some significant contributions. The general features of these classifiers, their pros and cons, and their compatibility with RFD are already summarized in Tables 2 and 4.

In terms of accuracy and classification speed, SVM and ANN are top-notch, which is evident from their wide acceptance among the ML models in RFD. However, the analysis of studies comparing ML classifiers showed that, even where ANN met expectations, the accuracy of SVM was not commendable (refer to Table 3). The reason may be the architectural versatility of ANN compared with SVM, which is evident from the fact that a higher fraction of ANNs in the literature altered their basic structure to suit the RFD problem. NB and k-NN are valued for their ability to deal with overfitting and their good interpretability, while SVM and ANN still have to advance considerably in this regard. Since the raw signals acquired from sensors are affected by noise, it is observed that researchers depended on preprocessing steps to deal with noisy raw data rather than opting for ML models such as NB, which show strong robustness to noise.

Regarding RFD, one of the primary focuses of our discussion is the incorporation of domain-specific knowledge in decision making, which we have tried to explain through the fault-specific symptom parameter extraction capability of the model. It is observed that ML does this much better (though just 14% of works tried it) by using frequency decomposition and utilizing harmonics of the rotation frequency as features. But this approach poses certain challenges:

  • Ideal single fault assumption: Most works assume that only one fault occurs in the rotor at a time, which is quite impractical in real industrial scenarios. The monitored characteristic frequency will be the result of frequency interplay between different faults.

  • Unobservable faults: Rotor faults such as corrosion and wear, and some oil-based faults, do not manifest clearly in these frequency components.

  • Parity with industrial data: Real data with continuously changing rotational frequency and load conditions cannot be simulated in most testbeds, which limits the applicability of this strategy.

  • Multiple movements: The trajectory of rotor motion is not always the same, which affects the vibrational frequency components. Environmental vibration is also an influencing factor.

Applying such domain knowledge to DL models by means of fault-specific features is not always as straightforward as in ML, because of the automated feature extraction and selection of the end-to-end learning strategy. CNN is one example of this phenomenon, given that just 7.14% of works tried to incorporate fault-specific features through data fusion and a 1-D CNN model. But as mentioned earlier, DBN, AE-based models, and DNNs give some scope in this direction since they prefer processed data to raw data. One remarkable finding in the existing literature is that fault-specific discriminative feature extraction reduces the size of the DL model and eventually leads to better performance. Some other works made a significant effort to reduce the ‘black box’ nature of DL by using visualization tools such as t-SNE. Almost all DL algorithms are robust to noise, but CNNs present somewhat more denoising capability than the others, especially AEs. SDAE-like models overcome these limitations. The performance generalization issues of DL related to the size and diversity of data, class imbalance, etc. have already been discussed at the beginning of the summary section. A few more observations are added here:

  • The reported accuracy of DL models is around 99%, and some achieve 100% on testbed datasets, which does not guarantee the same performance on industry data with varying RPM and load conditions. The unavailability of generalized datasets limits the opportunity to compare different works in terms of accuracy.

  • Unbalanced datasets demand the use of metrics such as precision, recall, and F1-score along with the most commonly utilized accuracy metric.

  • Although the acquired signal is sequential in nature, very few works have tried to address RFD problems with sequential deep learning models such as RNNs.
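The point about unbalanced datasets can be seen in a small worked example. The sketch below computes accuracy, precision, recall, and F1-score from binary labels using their standard definitions. The dataset is hypothetical: 95 healthy records (label 0) and 5 faulty records (label 1), with a degenerate classifier that always predicts "healthy".

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced test set and an always-"healthy" classifier.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
metrics = binary_metrics(y_true, y_pred)
```

Here accuracy is 0.95 although no faulty record is detected, so recall and F1 are both 0. This is why accuracy alone is a weak basis for comparing works on imbalanced fault data.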

7 Future research directions

In order to encourage research on RFD, with special emphasis on SRFs, and to provide a more generalized, industry-conforming, and realistic solution, the current research can be extended in the following ways.

  1. Synthetic data generation: Utilize synthesized data, obtained through data augmentation or GAN-based data generation, to address data scarcity. Research can be done on time-series data augmentation with a focus on preserving the time-series properties of the data, maintaining the correlation between different columns of multivariate time series, and ultimately confirming the labels of synthesized data. This poses a challenging research problem in the time-series domain and the augmentation literature. GAN-based data generation has been tried in certain bearing fault diagnosis works, leaving scope for research in RFD data generation.

  2. Challenging datasets: Another research direction lies in filling the gap between testbed data and real-world industrial data. As mentioned earlier, creating a complicated dataset by applying varying RPMs and load conditions in testbed simulations can raise new research issues and make the solution industry-ready.

  3. Challenging problems: Most RFD research deals with single faults, and only very few works have tried to study compound faults, which is not on par with the real-world scenario. The research challenge of dealing with compound faults can be taken further by treating compound faults as a multi-label problem.

  4. SRF specific symptom parameters: Extraction and usage of fault-specific symptom parameters pose new research challenges both in ‘feature engineering’ and in refining classifier architecture.

  5. More learning strategies: Learning strategies such as classifier fusion and transfer learning are the least applied in RFD. Fusing classifiers at different levels (data level, feature level, decision level, etc.) opens many opportunities to improve accuracy. Similarly, training with artificial faults and transferring the trained model to classify natural faults in a real-world scenario (by means of classifier fusion, transfer learning, and domain adaptation) is a challenging research problem. Few-shot learning is another option, which has already been shown to achieve reasonable accuracy with less data.

  6. Sequential deep learning: The RFD literature indicates that no significant attempts have yet been made to exploit the sequential nature of the sensed signals using RNN-based deep sequential models such as LSTM and GRU.
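As a small illustration of the multi-label view of compound faults raised in item 3 above, the sketch below encodes the set of faults present in one record as a multi-hot target vector, so that several faults can be active at once instead of forcing a single class. The fault names and their order are a hypothetical label set chosen for the example, not a taxonomy taken from this review.

```python
# Hypothetical label set; the order fixes the position of each fault in the vector.
FAULTS = ("unbalance", "misalignment", "rub", "crack")


def multi_hot(active_faults, classes=FAULTS):
    """Encode the faults present in one record as a 0/1 vector over classes."""
    active = set(active_faults)
    unknown = active - set(classes)
    if unknown:
        raise ValueError(f"unknown fault labels: {sorted(unknown)}")
    return [1 if name in active else 0 for name in classes]


healthy = multi_hot([])                       # no fault present
single = multi_hot(["misalignment"])          # classical single-fault case
compound = multi_hot(["unbalance", "rub"])    # two faults at the same time
```

A classifier trained on such targets would output one independent decision per fault, in contrast to the single-fault assumption discussed in the summary section.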

8 Conclusion

This paper presented a comprehensive review of the literature related to RFD involving ML and DL strategies. We first introduced SRF along with various other rotor faults, emphasizing their relevance and characteristics in order to prioritize a fault-wise analysis rather than the currently followed practice of component-wise analysis. RFD was further discussed in relation to an AI-based framework to present the methodologies adopted in each phase, together with their pros, cons, and challenges, to provide an overview of the current RFD scenario. The review of the ML and DL classifier models was presented with the aim of emphasizing rotor fault specific factors, such as the ability to incorporate fault-specific symptom parameters in feature engineering and to deal with imbalanced, scarce, or noisy data. The works that compared ML models within an SRF context were summarised separately, while similar attention was also paid to the various classifier fusion methods. The summary section consolidated the current trends, challenges, and recommendations in terms of all the aforementioned aspects, whereas the future directions section focused mainly on facilitating the application of testbed solutions in real-world industrial scenarios. Ultimately, this work is intended to provide a single point of reference, with practical insights, for RFD researchers working with AI.