
1 Introduction

Condition monitoring and fault diagnosis of critical machines are gaining importance in industry, because detecting and diagnosing the state of a machine early helps avoid catastrophic failure. Efficient monitoring and diagnosis can improve reliability and reduce the maintenance and downtime costs of machines. Electric motors have become the workhorse of industry and are among its most critical devices. The induction motor (IM) is by far the most widely used electric motor because of its ruggedness, low cost, low maintenance, and capability to run in harsh operating environments [1]. In spite of their reliability, IMs subjected to undesirable mechanical, electrical and environmental stresses may develop various modes of faults. These faults can lead to complete failure of the IM, an unexpected shutdown of the production process and, sometimes, personal injury. In the last two to three decades, numerous fault monitoring and diagnostic techniques have been developed to avoid unexpected IM failure.

IM failure occurs through the progression of mechanical faults (for example, bearing and rotor faults) and/or electrical faults (for example, broken rotor bar, stator winding and phase unbalance faults) [2, 3]. Many condition monitoring techniques are used for early detection of these faults, based on current, vibration, acoustic emission, magnetic flux, and thermal signals [3]. However, vibration and current signals are the most preferred in industry owing to their ease of measurement and reliability [4]. These signals can also reveal most of the electrical and mechanical faults in the motor. Faults can be detected by investigating appropriate features of the measured signals, such as the fault characteristic frequency. Various conventional monitoring techniques are available to detect IM faults, based on the fast Fourier transform (FFT), the Hilbert transform (HT), high-resolution spectrum analysis, the short-time Fourier transform (STFT), Park's vector approach, etc. [5, 6]. However, these techniques are not always reliable. In recent years, significant efforts have been made to use artificial intelligence (AI) techniques for reliable fault diagnosis of IMs [7].

An appreciable improvement in condition monitoring and fault diagnosis methods has been achieved using AI techniques such as artificial neural networks, fuzzy logic, neuro-fuzzy systems, etc. [7, 8, 9]. More recently, a competitive AI technique called the support vector machine (SVM) has been gaining recognition in the field of fault diagnostics. The SVM is claimed to have the best generalization capability and a high success rate even with a small number of samples [10]. AI techniques are preferred because they are data-based; they therefore do not require detailed knowledge of the IM parameters or a model of the machine. Suitable statistical features extracted from the available data in the time domain, frequency domain or time-frequency domain are used as input to the AI classifier for fault diagnosis. Wavelet transforms (WT), a form of time-frequency analysis, are now widely used to extract suitable features because they can represent the signal with a limited number of coefficients. In addition, they have good energy concentration properties because of the compact support of the basis functions used in the WT. Three versions of the WT have been developed, namely the continuous wavelet transform (CWT), the discrete wavelet transform (DWT) and the wavelet packet transform (WPT). Of these, the WPT is often preferred as a signal processing tool in AI-based fault diagnostics. Many mother wavelets (such as Daubechies, Symlet, discrete Meyer, etc.) have been used in IM fault diagnosis [11].

To perform diagnostics of IM faults using AI and wavelets, researchers have generally used experimental data acquired in the laboratory. However, it is very important to check whether the developed fault diagnosis methodology is robust against the various operating conditions of the IM, which researchers have rarely attempted. Therefore, this paper investigates the performance of AI- and wavelet-based diagnosis of IM faults under various operating conditions. In this work, the WPT is used as the signal processing tool and the SVM as the classifier. In total, nine mechanical and electrical fault conditions are considered for the diagnosis, together with a healthy IM. Both current and vibration signals are used, to exploit their individual effectiveness in detecting mechanical and electrical faults. To perform the proposed diagnosis, experimental data were first acquired from all faulty IMs and the healthy IM under various operating conditions. Useful statistical features based on the WPT are then extracted from the experimental data. For effective fault diagnosis, the useful features are chosen using the wrapper model of feature selection. A multiclass SVM based on the one-versus-one method is employed, and a cross-validation technique is used to select the optimal SVM parameters. To check the robustness of the methodology, the diagnostics are executed for several operating conditions (i.e., load and speed) of the IMs.

2 SVM Theory

The SVM was first developed by Boser, Guyon and Vapnik in 1992 [12, 13]. It was originally developed as a binary classifier, and its principle can be explained in a two-dimensional domain as presented in Fig. 1, which shows the classification of two classes, i.e., class A (squares) and class B (circles). To separate these classes, the SVM considers a set of candidate hyperplanes between the two classes and selects the optimal separating hyperplane by maximizing the margin, namely, the gap between the nearest data points of the two classes.

Fig. 1. Illustration of basic principle of SVM

The optimal separating hyperplane can be obtained by solving the following optimization problem:

$$ {\text{Minimum }}[ \tau ({\mathbf{w}})] = \frac{1}{2}\left\| {\mathbf{w}} \right\|^{2} + C\sum\limits_{j = 1}^{m} {\xi_{j} } $$
(1)
$$ {\text{Subject}}\,{\text{to}},\,y_{j} f({\mathbf{x}}_{j} ) = y_{j} ({\mathbf{w}} \cdot {\mathbf{x}}_{j} + b) \ge 1 - \xi_{j} ,\quad \xi_{j} \ge 0,\quad j = 1, \ldots ,m $$
(2)

where the training dataset is \( \left\{ {({\mathbf{x}}_{j} ,y_{j} )} \right\}_{j = 1}^{m} ; \, {\mathbf{x}}_{j} \in R^{l} , \, y_{j} \in \left\{ { - 1,1} \right\}, \) \( {\mathbf{x}}_{j} \) is an input vector, \( y_{j} \) is the label of \( {\mathbf{x}}_{j} \), m is the number of input vectors, l is the input dimension, \( {\mathbf{w}} \) is the normal direction of the hyperplane, b is a scalar, \( \xi_{j} \) denotes the non-negative slack variables and C is the penalty (regularization) parameter that controls generalization. The function that corresponds to the hyperplane is linear for linearly separable data. However, data are not always linearly separable. Therefore, the SVM has also been extended to non-linearly separable data with the help of the kernel trick. Furthermore, the SVM has been extended to handle multi-class classification [14]. A multiclass case is handled by decomposing it into a number of two-class cases. Various multiclass techniques, such as one-versus-all, one-versus-one and the directed acyclic graph, have been introduced. The literature reports that one-versus-one is the most effective of these because of its good classification ability and shorter training time.
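The one-versus-one decomposition described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function names `ovo_pairs` and `ovo_predict` and the stand-in pairwise classifiers are hypothetical, and the label names are the ten condition abbreviations introduced later in the paper. For k classes, k(k − 1)/2 binary classifiers are needed, and the predicted class is the one that collects the most pairwise votes.

```python
from collections import Counter
from itertools import combinations

# The ten condition labels used later in the paper (healthy motor plus nine faults).
LABELS = ["HM", "BRB", "PUF1", "PUF2", "SWF1", "SWF2", "BF", "UR", "BR", "MR"]


def ovo_pairs(labels):
    """All unordered class pairs; one binary classifier is trained per pair."""
    return list(combinations(labels, 2))


def ovo_predict(sample, binary_classifiers):
    """Majority vote over the pairwise decisions.

    `binary_classifiers` maps a pair (a, b) to a function that returns
    either a or b for the given sample.
    """
    votes = Counter(clf(sample) for clf in binary_classifiers.values())
    return votes.most_common(1)[0][0]


pairs = ovo_pairs(LABELS)
print(len(pairs))  # 45 = 10 * 9 / 2

# Hypothetical stand-in classifiers (not trained SVMs): each pair votes for
# "BF" when "BF" is one of its two classes, otherwise for its first class.
toy = {pair: (lambda s, p=pair: "BF" if "BF" in p else p[0]) for pair in pairs}
print(ovo_predict(None, toy))  # BF
```

In a real setting each entry of the dictionary would be a trained binary SVM; the voting step stays the same.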

3 WPT Theory

The WPT was first introduced by Coifman and Wickerhauser as a generalization of the DWT [15]. It allows a richer signal analysis owing to the enhanced resolution obtained in the higher-frequency region. The WPT shares the multi-resolution analysis (MRA) framework of the DWT. It is advantageous because it provides the same frequency bandwidth for every packet at a given resolution level, by dividing not only the low-frequency but also the high-frequency sub-band. In other words, the DWT splits only the approximation version, while the WPT decomposes the approximation as well as the detail versions. Therefore, in the WPT the information carried by the parent signal is not altered (neither lost nor increased) by the decomposition. Thus, the WPT is a preferred signal processing technique, especially for non-stationary signals [11]. The wavelet packet function can be defined as a set of time-frequency functions with the help of the following equations:

$$ W_{j,k}^{m} (t) = 2^{{\frac{j}{2}}} W^{m} \left( {2^{j} t - k} \right), $$
(3)

where the variable m (m = 0, 1, 2, …) denotes the modulation or oscillation parameter, and the integers j and k denote the scale and translation variables, respectively. The first two wavelet packet functions are defined as the scaling function and the mother wavelet function, respectively:

$$ W_{0,0}^{0} (t) = \phi (t), $$
(4)

and

$$ W_{0,0}^{1} (t) = \psi (t), $$
(5)

The remaining wavelet packet functions follow recursively:

$$ W_{0,0}^{2m} (t) = \sqrt 2 \sum\limits_{k} {h(k)} W^{m} (2t - k), $$
(6)

and

$$ W_{0,0}^{2m + 1} (t) = \sqrt 2 \sum\limits_{k} {g(k)} W^{m} (2t - k), $$
(7)

where h(k) and g(k) denote the low-pass and high-pass filter coefficients, respectively, which are determined by the chosen scaling and mother wavelet functions. For a function \( f(t), \) the wavelet packet coefficient \( W_{j,k}^{m} \), computed via the inner product \( \left\langle {f(t),W_{j,k}^{m} } \right\rangle \), is written as

$$ W_{j,k}^{m} = \left\langle {f(t),W_{j,k}^{m} } \right\rangle = \int {f(t)} W_{j,k}^{m} (t)dt. $$
(8)

When a signal S with sampling frequency \( f_{s} \) is decomposed by the WPT up to the third resolution level, i.e. j = 3, it is split into \( 2^{j} = 8 \) subspaces (or packets). With the whole scaling space covering the frequency interval \( (0,2^{ - 1} f_{s} ) \), the frequency interval of each packet is [16]:

$$ ((m - 1)2^{ - j - 1} f_{s} ,m2^{ - j - 1} f_{s} ),\quad m = 1,2, \ldots ,8 $$
(9)
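As a worked instance of Eq. (9), the short sketch below lists the packet bands for the 1 kHz sampling rate used in the experiments of Sect. 4.1 and j = 3. The function name `packet_bands` is illustrative only. Each of the eight packets spans \( 2^{-4} f_{s} = 62.5 \) Hz, and together they cover 0 to 500 Hz.

```python
def packet_bands(fs, level):
    """Return the (low, high) frequency limits in Hz of each packet, per Eq. (9)."""
    width = fs / 2 ** (level + 1)  # 2^(-j-1) * fs
    return [((m - 1) * width, m * width) for m in range(1, 2 ** level + 1)]


bands = packet_bands(fs=1000.0, level=3)
for m, (low, high) in enumerate(bands, start=1):
    print(f"packet {m}: {low:.1f} to {high:.1f} Hz")
```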

4 Experimentation and Feature Extraction

In this section, experiments are first performed to generate sufficient data for fault diagnosis, and suitable WPT-based features are then extracted from the raw datasets.

4.1 Experimental Set-Up and Procedure

Experimental tests were carried out on the test rig shown in Fig. 2. The rig comprises a Machine Fault Simulator (MFS), a data acquisition (DAQ) system with a signal monitor, three AC current probes, and a tri-axial accelerometer. The MFS comprises a test IM (three-phase, 0.5 HP and 50 Hz), a shaft, bearings, a pulley, a belt, a variable frequency drive for changing the speed and a magnetic clutch attached to a gear box for changing the load. Six IMs were used to generate six different conditions, namely healthy motor (HM), bearing fault (BF), broken rotor bar (BRB), unbalanced rotor (UR), misaligned rotor (MR) and bowed rotor (BR), and two further IMs were used to generate two faults with two severity levels each, i.e. phase unbalance (PUF1 and PUF2) and stator winding faults (SWF1 and SWF2). Altogether, ten IM conditions (nine faulty and one healthy) are considered in this study. Three current probes were attached to the three phases of the IM to measure current, and an accelerometer was attached to the upper surface of the IM adjacent to the shaft end to measure vibration.

Fig. 2. (a) Experimental set up (b) IM with different faults

The current probes and the accelerometer were connected to the DAQ system for data collection. Each dataset was acquired at a 1 kHz sampling rate and contains 10000 samples, i.e. 10 s of data. In total, 25 datasets were acquired in 250 s. The data were generated for all ten IM conditions over a wide speed range (i.e., 10 Hz to 40 Hz in steps of 5 Hz) and three load conditions {i.e., no load (0% of rated torque), light load (0.11% of rated torque), and high load (0.55% of rated torque)}.

4.2 Feature Extraction

In AI-based fault diagnosis, feature extraction is an important step for capturing specific characteristics of the parent signal. Even with a capable classifier, inappropriate features may degrade the classification performance. Thus the WPT, one of the preferred techniques for feature extraction, is adopted in this study. The Haar wavelet is applied to the time-domain data up to the third level. To extract suitable statistical features, the wavelet packet coefficients are first computed at all available nodes of the 3-level wavelet tree. However, using all of these coefficients directly as fault features is not advisable, because most nodes do not contain valuable information and including them can reduce the classification performance. Thus, an appropriate node is selected and features are then extracted from the selected node. For the selection of an appropriate node, either the single-level basis selection (SLBS) or the multi-level basis selection (MLBS) approach is used. Criteria such as best basis selection (BBS) and its local discriminant basis (LDB) extension are usually employed with these approaches [17]. In this study, the SLBS approach is used, which restricts the search space to the lowest level of the tree. The most appropriate node is selected with a relative wavelet energy (RWE) criterion: the most suitable node is the one with the maximum RWE. The RWE depends on the energy concentration of the signal in the frequency band of a particular node. It represents the probability distribution of energy and is calculated as:

$$ {\text{RWE}}_{\text{m}} = p_{m} = \frac{E(m)}{{E_{overall} }}\quad {\text{with}}\quad \sum\limits_{m} {p_{m} } = 1 $$
(10)

Here \( E(m) \) denotes the energy of the decomposed signal at the mth node and is calculated as:

$$ E(m) = \sum\limits_{j = 1}^{n} {\left| {C_{m,j} } \right|}^{2} $$
(11)

where \( C_{m,j} \) is the jth wavelet coefficient of the mth node, n is the number of wavelet coefficients in the node, and \( E_{overall} \) denotes the overall energy of the signal summed over all the nodes:

$$ E_{overall} = \sum\limits_{m = 1}^{N} {E(m)} $$
(12)

where m = 1, 2, …, N indexes the nodes. The RWE is computed from Eq. (10). For the current and vibration signals, the suitable nodes are those with the maximum RWE value. After choosing the most suitable node, fault features are extracted from the coefficients of the chosen node. Initially, fourteen features {the first eight higher moments (µ1 to µ8), standard deviation (σ), skewness (χ), kurtosis (к), crest factor (CF), peak-to-peak (Rpp), and mean-to-standard-deviation ratio (Rmsd)} are extracted, from which the most appropriate fault features are selected.
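The node selection of Eqs. (10) to (12) can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' code: the Haar packet tree is built with a hand-written orthonormal split, the input is a synthetic 20 Hz sine rather than measured data, all function names are hypothetical, and only a few of the fourteen features are computed, using their common textbook definitions (the paper does not spell out its exact formulas, in particular for µ1 to µ8).

```python
import math


def haar_split(x):
    """One Haar analysis step on an even-length sequence: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_wpt(x, level=3):
    """Return the 2**level terminal nodes of the Haar wavelet packet tree.

    Both the approximation and the detail branch are split at every level.
    Nodes are returned in natural (filter-bank) order, not frequency order.
    """
    nodes = [list(x)]
    for _ in range(level):
        nodes = [part for node in nodes for part in haar_split(node)]
    return nodes


def relative_wavelet_energy(nodes):
    """Eqs. (10)-(12): energy of each node divided by the overall energy."""
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies]


def node_features(c):
    """A few statistical features of one node (textbook definitions assumed)."""
    n = len(c)
    mean = sum(c) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in c) / n)
    rms = math.sqrt(sum(v * v for v in c) / n)
    return {
        "std": std,
        "skewness": sum((v - mean) ** 3 for v in c) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in c) / (n * std ** 4),
        "peak_to_peak": max(c) - min(c),
        "crest_factor": max(abs(v) for v in c) / rms,
    }


# Synthetic stand-in signal (an assumption): 20 Hz sine, 1 kHz sampling, 10000 samples.
fs, n_samples = 1000.0, 10000
signal = [math.sin(2 * math.pi * 20.0 * i / fs) for i in range(n_samples)]

nodes = haar_wpt(signal, level=3)
rwe = relative_wavelet_energy(nodes)
best = max(range(len(rwe)), key=rwe.__getitem__)  # node with maximum RWE
features = node_features(nodes[best])
print(best, round(rwe[best], 3), features)
```

For this low-frequency test signal the lowest-band node (index 0) carries most of the energy and is therefore selected; with measured current or vibration data the selected node depends on where the signal energy is concentrated.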

5 Diagnosis Results and Discussions

The one-versus-one SVM (OVO-SVM) is employed for IM fault diagnosis, and the LIBSVM tool is used to implement it [18]. First, the total dataset is divided for training and testing of the classifier in the ratio of 80% to 20%, respectively. A distinct label is then assigned to each condition, including the healthy motor (HM): HM-1, BRB-2, PUF1-3, PUF2-4, SWF1-5, SWF2-6, BF-7, UR-8, BR-9 and MR-10. The SVM classifier is trained at each IM operating condition using the training data, with the RBF kernel. The RBF-kernel SVM has two basic hyper-parameters, the kernel parameter γ and the penalty (regularization) parameter C, which must be optimized for effective fault diagnosis. They are optimized by cross-validation combined with a grid search: several pairs of (C, γ) are evaluated and the one with the maximum training or cross-validation (CV) accuracy is selected. This is done for each wavelet feature, one by one. The optimization of (C, γ) for the µ1 and µ2 wavelet features, for the case when cross-validation is applied at 40 Hz and high load, is shown in Fig. 3. The maximum training or CV accuracy is 99.5% for µ1 and 98.5% for µ2. The optimal pair of (C, γ) is the one corresponding to the highest CV accuracy, and the final training is performed using this pair. The trained SVM model is then used to classify, or diagnose, all ten motor conditions. The classification or prediction performance of the classifier is described by the percentage prediction rate, i.e. the number of successfully classified test data as a percentage of the total number of test data.

Fig. 3. CV or training accuracy at 40 Hz and high load
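The percentage prediction rate defined above, together with its per-class breakdown, can be computed as in the sketch below. The function name `prediction_rates` and the example labels are hypothetical and serve only to illustrate the arithmetic; they are not taken from the experimental results.

```python
from collections import Counter


def prediction_rates(y_true, y_pred):
    """Overall and per-class percentage of correctly classified test samples."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    per_class = {label: 100.0 * correct[label] / total[label] for label in total}
    overall = 100.0 * sum(correct.values()) / len(y_true)
    return overall, per_class


# Made-up example: three test samples of class 1 and three of class 2,
# with one class-2 sample misclassified as class 1.
y_true = [1, 1, 1, 2, 2, 2]
y_pred = [1, 1, 1, 2, 2, 1]
overall, per_class = prediction_rates(y_true, y_pred)
print(round(overall, 1), {k: round(v, 1) for k, v in per_class.items()})
# 83.3 {1: 100.0, 2: 66.7}
```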

For effective fault diagnosis, it is crucial to select the wavelet features that best represent each fault condition. Therefore, fourteen features are initially considered, and the wrapper model is employed to select the effective ones. According to the wrapper model, the feature(s) giving the highest prediction accuracy of the classifier is/are the effective feature(s). The diagnosis is performed by inputting the features one by one, with the classifier tested at the same operating condition as it was trained. The diagnostics are executed at seven different speeds (10 Hz, 15 Hz, 20 Hz, 25 Hz, 30 Hz, 35 Hz, and 40 Hz) for the high load condition, and the average prediction performances are shown in Fig. 4. Most of the features, namely the first six higher moments (µ1 to µ6), Rpp and σ, classify the considered IM faults with more than 90% prediction accuracy. Other features, such as Rmsd, χ and к, are also able to classify the faults to some extent. However, for µ7, µ8 and CF the prediction accuracy falls below 75%, which is unacceptable for fault diagnosis. From these results, it can be concluded that the wavelet features µ1 to µ6, Rpp and σ characterize most IM faults better than the other considered wavelet features. Among these, the µ1 feature shows the highest prediction accuracy, i.e. 98.6%. Therefore, µ1 is taken as the most effective wavelet (Haar) feature for the present fault diagnosis.

Fig. 4. Wavelet feature selection based on the wrapper model

After selecting the most effective feature, i.e., µ1, the fault diagnosis is checked over a range of IM operating conditions. First, the diagnosis is performed for the high load condition, as presented in Table 1. The average overall prediction accuracy is 98.6% over the considered motor speeds. All faults are classified perfectly (100%) at all considered speeds for the high load condition, except BRB at 20 Hz (66.7%), SWF2 at 30 Hz (66.7%) and BR at 15 Hz (83.3%) and 25 Hz (83.3%). The diagnosis is then performed for the no load condition, and the results are also tabulated in Table 1. The average overall prediction accuracy is 98.8% over the considered motor speeds. All faults are classified perfectly at all considered speeds for the no load condition, except SWF1 at 20 Hz (66.7%), SWF2 at 40 Hz (66.7%) and BF at 10 Hz (83.3%). From these results, it can be concluded that fault diagnosis with the Haar wavelet and the µ1 feature yields nearly perfect prediction, not only of the different IM faults but also of their severity, at all considered speeds and at different levels of load torque. The prediction performance is nearly the same for all considered operating conditions, so the diagnosis is effectively independent of the operating condition of the IM within the tested range. That means the diagnosis of IM faults based on the SVM classifier and the WPT (Haar wavelet) can be accomplished at all the considered loads and speeds.

Table 1. Fault diagnosis for various operating conditions

6 Conclusions

This paper presents a performance analysis of SVM- and WPT-based IM fault diagnostics at various speeds and loads. Vibration and current signals are used to diagnose various mechanical and electrical fault conditions, with the Haar wavelet function used for the decomposition. Initially, fourteen wavelet features are considered, and the most effective feature is then selected using the wrapper model of feature selection. It is found that the µ1 feature of the Haar wavelet characterizes the IM fault conditions better than the other considered features. Finally, to check the fault diagnosis performance over different operating conditions of IMs, the diagnosis is performed with the µ1 feature for various speeds and mechanical loadings. The average prediction accuracies are found to be 98.6% and 98.8% for the high load and no load conditions, respectively. The results show that the combination of the SVM and the µ1 feature of the Haar wavelet is capable of diagnosing IM faults, as well as their severities, effectively at all considered operating conditions. It is also observed that the performance of the present diagnosis does not depend appreciably on the rotational speed or the external load of the IMs within the tested range. In this study, 80% and 20% of the total datasets are used for training and testing, respectively. This split could be varied, for example with a smaller share of training data, to test other scenarios. In addition, other wavelet functions, such as Gaussian, Shannon, Morlet, etc., could be used for a comparative study.