Keywords

1 Introduction and Literature Review

Induction motor (IM) is an essential part in many industries, which drives the moving and lifting arrangements. There is an urgent need to give some special attention to smooth running of the IM in order to have a stable and high performance. Due to various stress due to severe operating conditions, the wear and tears may happen on the different parts and leads to mechanical and electrical faults. By early fault detection of IMs and proper preventive maintenance improve the machine life or loss of valuable production time and avoiding more serious accidents. Numerous condition monitoring techniques are developed in last four to five decades based on the acoustic emission, stator current, vibration, etc. [1, 2].

Among several conditioning methods, the monitoring of the current and vibration are so common due to their low cost and non-intrusiveness. The accuracy of these techniques depends on the loading of the machine, and also the signal-to-noise ratio of measuring instruments [3]. Signal-based methods commonly use the stator current as a measurement since it is sensitive to the rotor and stator faults (i.e. the stator winding fault, broken rotor bar fault, and phase unbalance and single phasing), and it is a suitable method to obtain a diagnostic index and a threshold stating the edge between faulty and healthy conditions. Detecting and identifying mechanical faults (i.e. bearing faults, unbalance rotor, bowed and misaligned rotor) and separating them from each other are major challenges in electrical drive systems. Generally, vibration is commonly used for detecting the healthy and faulty condition [4]. The IM may fail due to electrical or mechanical fault or combination of both. In this condition, it will be beneficial to study the both electrical and mechanical faults or corresponding signatures together.

In recent years, many intelligent based methods have been offered such as artificial neural network, fuzzy expert system, condition-based reasoning, random forest, etc. Among those, the SVM is uncommon in the field of condition monitoring and fault diagnosis of machinery. The SVM performance is excellent in respect of accuracy [5]. Also the SVM is suitable to online identification of IM faults: broken rotor bar, unbalanced voltage, air-gap eccentricity fault and outer raceway bearing defect [6]. Nguyen and Lee [7] and Nguyen et al. [8] investigated mechanical faults diagnosis of IMs based on time domain vibration signal using the SVM, decision tree and GA. They used C-SVM with RBF kernel. Widodo et al. [9] presented the fault diagnosis of IM using combination of independent component analysis (ICA) and SVM based on the vibration and current signatures. The combination of ICA and SVM can serve as an encouraging alternative. From literature survey, it is evident that faults prediction of IMs using multi-class SVM (MSVM) algorithms is still uncommon and has lot of potential, especially of the mechanical and electrical faults prediction together. Also the optimal selection of MSVM parameters for C-SVM and nu-SVM is not found in the literature. Hence, it can be explored further for the perfect multiclass fault prediction in IMs. In this paper, a comprehensive study on the prediction of faults (mechanical and electrical) in IMs has been attempted using the MSVM classifier with optimal selection of classifier parameters for best result.

2 Introduction to SVM Classifier

The SVM is a supervised learning method by examining data and identifying patterns, which is used for classification and regression analysis. Vapnik [10] was the first inventor and the recent version was proposed by Cortes and Vapnik [11]. Generally, the SVM version is used for classification between two data by placing the data in a hyper plane along with couple of support vectors. The following paragraph is written about the C-support vector machine and nu-support vector machine, which are used for classification.

C-Support Vector Machine (C-SVM):

Given training vector \( x_{i} \in R^{n} ,i = 1, \ldots ,l, \) in two classes, and an indicator vector \( y \in R^{l} \) such that \( y_{i} \in \left\{ {1, - 1} \right\} \), C-SVC (Boser et al. [12]; Cortes and Vapnik [11]) formulated the following primal optimization problem

$$ \begin{array}{*{20}l} {\mathop {\hbox{min} }\limits_{w,b,\xi } \quad \quad \tfrac{1}{2}w^{T} w + C\sum\limits_{i = 1}^{l} {\xi_{i} } } \hfill \\ {\begin{array}{*{20}l} {{\text{Subject}}\,{\text{to}}} \hfill & {y_{i} \left\{ {w^{T} \phi \left( {x_{i} } \right) + b} \right\} \ge 1 - \xi_{i} } \hfill \\ {\text{With}} \hfill & {\xi_{i} \ge 0,i = 1, \ldots ,l,} \hfill \\ \end{array} } \hfill \\ \end{array} $$
(1)

where \( \phi \left( {x_{i} } \right) \) function maps \( x_{i} \) into a higher-dimensional space, w is the weight vector, b is the bias, \( \xi \) is the slack variable allowed for the misclassification of difficult or noisy point, and \( C > 0 \) is the regularization parameter.

nu-Support Vector Machine (nu-SVM):

The nu-support vector classification (Scholkopf et al. [13]) introduces a new parameter \( nu \in \left( {0,1} \right) \). It has been proved that nu an upper bound on the fraction of support vectors. Giving training vectors \( x_{i} \in R^{n} ,i = 1, \ldots ,l \), in two classes, and a vector \( y \in R^{l} \) such that \( y_{i} \in \left\{ {1, - 1} \right\} \), the primal optimization problem could be written as

$$ \begin{array}{*{20}l} {\mathop {\hbox{min} }\limits_{w,b,\xi ,\rho } \quad \tfrac{1}{2}w^{T} w - nu\;\rho + \tfrac{1}{l}\sum\limits_{i = 1}^{l} {\xi_{i} } } \hfill \\ {\begin{array}{*{20}l} {{\text{Subject}}\,{\text{to}}} \hfill & {y_{i} \left\{ {w^{T} \phi \left( {x_{i} } \right) + b} \right\} \ge \rho - \xi_{i} } \hfill \\ {\text{with}} \hfill & {\xi_{i} \ge 0,i = 1, \ldots ,l,\rho \ge 0} \hfill \\ \end{array} } \hfill \\ \end{array} $$
(2)

Performance Index:

The classification of the testing data could be found with the SVM algorithm. Suppose that a set of testing data are analyzed by the SVM and classified, among them some are classified correctly and remaining is not. The term classification accuracy can be written as

$$ {\text{Accuracy}} = \left( {{\text{Number}}\,{\text{of}}\,{\text{correctly}}\,{\text{classified}}\,{\text{data}}/{\text{Total}}\,{\text{number}}\,{\text{of}}\,{\text{testing}}\,{\text{data}}} \right) \times 100\% $$
(3)

Multi-class Classification:

In reality the classification demands more than two classes of faults to be classified. In the rotor machinery fault diagnosis, classification of same machine element faults for example in motor faults: bearing fault, rotor misalignment fault, bowed rotor fault, unbalanced rotor fault, and for different machine element faults: gear faults, motor faults, bearing faults. This type of multi-classification was addressed by several methods (one-against-all, one-against-one, direct acyclic graph, etc.). The method ‘one-against-one’ presented by the Knerr et al. [14] and Kressel [15] is applied here.

If k is the number of classes, then k(k − 1)/2 classifiers are constructed and each one trains data from two classes. For the training data from the ith and jth classes, it could be solved in the two classification problem.

$$ \begin{array}{*{20}l} {\mathop {\hbox{min} }\limits_{{w^{ij} ,b^{ij} ,\xi^{ij} }} \quad \tfrac{1}{2}\left( {w^{ij} } \right)^{T} w^{ij} + C\sum\limits_{t}^{{}} {\left( {\xi^{ij} } \right)_{t} } } \hfill \\ {\text{Subject to}} \hfill \\ {\left( {w^{ij} } \right)^{T} \phi \left( {x_{t} } \right) + b^{ij} \ge 1 - \xi_{t}^{ij} ,{\text{if}}\,\,x_{t} \,\,{\text{in the}}\,\,i^{th} {\text{class}}} \hfill \\ {\left( {w^{ij} } \right)^{T} \phi \left( {x_{t} } \right) + b^{ij} \ge - 1 + \xi_{t}^{ij} ,{\text{if}}\,\,x_{t} \,\,{\text{in the}}\,\,j^{th} {\text{class}}} \hfill \\ {\xi_{ij} \ge 0} \hfill \\ \end{array} $$
(4)

In the classification, we use a voting strategy in which each binary classification is considered to be a voting, where votes could be casted for all data points, x, at the end a point is designated to be in a class with the maximum number of votes. In case those two classes have identical votes, though it may not be a good strategy, it chooses the class appearing first in the array of storing class names. Many methods are available for the multi-class SVM classification (MSVM); and Hsu and Lin [16] gave a detailed comparison and concluded that ‘one-against-one’ is a competitive approach. The LIBSVM [17, 18] freely available software package is used for the multi-class classification.

3 Parameter Selections

Two MSVM parameters C and nu, named as regularization and support vector fraction, respectively; and another related to the kernel (gama or \( \gamma \)) to be fixed before classification. The accuracy depends upon the choice of these parameters. The best one can be picked by the help of tools, like the grid-search technique (GSM), the genetic algorithm (GA), etc. The following paragraph gives briefs of these two methods.

3.1 Grid Search Method (GSM)

In this method, cross validation (CV) accuracies are calculated for different set of parameters. These sets are generated by a mesh grid. The best CV is selected from among CVs and corresponding parameter is picked. Final classification is done with that parameter. LIBSVM [17, 18] is used for this technique.

3.2 Genetic Algorithms (GAs)

The detail explanation of GA is avoided; reader may refer the book of Deb [19, 20]. The freely available GA software developed by Kay [21, 22] is used. Figure 1 shows the flow chart for selection of parameters. Initially, total data set (features) is divided into two components. The first component is again divided into two subdivisions one for training with the genetically generated parameters and other for testing to find the accuracy (which is based on Eq. (3)). The best accuracy is finally picked and corresponding parameters are chosen. Those chosen parameters are used for the final testing to find the accuracy. Table 1 indicates the components used for the GA.

Fig. 1.
figure 1

Flow chart for selection of SVM parameters by GA

Table 1. Optimization fitness function and design parameters

4 Experimental Setup and Feature Extraction

In this section the experimental setup and measurements of vibration and current signal are explained. The procedure for time domain data collection and features generation from the healthy and faulty IMs are presented.

4.1 Experimental Setup

The Machine Fault Simulator (MFS) is the laboratory experimental test rig (shown in Fig. 2). It consists of an IM (0.37 kW, 50 Hz, 4-pole, and rated RPM-3450) connected with one end of a shaft using a flexible coupling. The shaft is mounted in a bed by means of ball bearing. One pulley-belt drive is connected on the other end the shaft. This pulley-belt drive is connected with a gear box. A magnetic brake clutch is attached with this gear box to load the IM. Eight IMs are replaced one by one to generate the data for ten different faulty/healthy conditions.

Fig. 2.
figure 2

Experimental set-up of induction motor with loading and measurement arrangement

One tri-axial accelerometer (sensitivity: 10.23 mV/m/s2, 10.27 mV/m/s2, 10.34 mV/m/s2) is mounted on the top of the motor to capture the data. The position of the accelerometer is found near to the bearing of the armature of the motor because the bearing is the only load carrying component. Three AC current probes are attached with input power line of the IM to measure its variations. All the sensors are connected to the DAQ (NI make) to collect the variation of current and vibration. NI LabView software is used for recording the current and vibration time domain data. A constant DC power source is used to power one tachometer for measurement of speed of the shaft. The time domain data are acquired at the sampling rate of 2000 Hz for nine faulty and one healthy IM (or no defect motor, i.e. ND). Total 300 raw data-sets (\( 300\; \times \,2000 \) sample points) were collected for each IM faulty conditions.

Among the eight IMs, fours IMs have mechanical faults (i.e. bearing fault (BF), rotor misalignment fault (RMF), bowed rotor fault (BRF), unbalanced rotor fault (URF)) and another three IMs have electrical faults (i.e. the broken rotor bar fault (BRBF), stator winding fault with maximum and minimum resistance (MSWF and SWF, respectively), and phase unbalance and single phasing fault with maximum and minimum resistance (MPUSPF and PUSPF), respectively. Here, two different severity levels of stator winding fault and phase unbalance-single phasing was introduced by varying the resistance of winding. An external control box was connected to one phase of the winding to vary the resistance (0–1 O) of the same. Measurements were taken in a range of angular speeds (10 Hz to 40 Hz in 5 Hz interval), and also for three different external loads (or torques) on the motor, no load named as T1 (0 N m i.e., 0% of rated torque), light load named as T2 (0.113 N m i.e., 11% of rated torque) and high-load named as T3 (0.565 N m i.e., 55% of rated torque). Raw data sets were stored in the DAQ at individual speeds and loads for various IM faults for further processing. The MPUSPF data below 15 Hz rotational speed is not able to take for all the three loading condition.

4.2 Feature Extraction

In order to predict faults, the feature selection is critical, which comprises all vital information of fault conditions. Features are needed to feed as an input to the MSVM classifier for the training and the testing. Standard deviation, skewness and kurtosis are calculated from the time domain data (vibration and current) and these are used for features [23, 24]. Altogether, \( 6\; \times \;300 \) (for \( 3\; \times \;300 \) data sets of three orthogonal direction vibration signals) and \( 6\; \times \;300 \) (for \( 3\; \times \;300 \) data sets of three phase current signals) features are calculated for further classification. That means \( 12\; \times \;300 \) sets of data are available for a single faults, hence for 10 numbers of mechanical and electrical faults \( 12\; \times \;10\; \times \;300 \) sets of data are available for a particular speed.

5 Fault Prediction Using MSVM

5.1 Feature Optimization of MSVM Parameters

During the optimization of MSVM parameters by GA, \( 12\; \times \;10\; \times \;180 \) data points were used for the training of fault classification, and \( 12\; \times \;10\; \times \;90 \) data points were used for the testing. The variation of initial and final fitness values (i.e., the percentage accuracy) with the population for C-SVM and nu-SVM for 40 Hz rotational speeds are shown in Fig. 3(a)–(f) with three loads, respectively. Populations (including inside and outside the limit of constrains) are shown in these figures. Initial populations indicate the divergence of domain and final populations indicate the most of its chosen population reaches the converged level.

Fig. 3.
figure 3

Variation of initial and final fitness with population in GA optimization for two MSVM with three loads

In the case of GSM, \( 12\; \times \;10\; \times \;270 \) data points are used for the parameter estimation. The cross-validation accuracy in the GSM for 40 Hz rotational speed for the C-SVM and the nu-SVM is shown in Fig. 4(a)–(f) with three loads, respectively. In these the contour line for percentage accuracy is plotted and the best CV accuracy is marked. Correspondingly, the best SVM parameter is found from the best CV accuracy and the testing accuracy is tabulated. The optimized percentage accuracy of different MSVM formulation is shown in Tables 2 and 3 (for C-SVM and nu-SVM with T1 load), Tables 4 and 5 (for C-SVM and nu-SVM with T2 load), Tables 6 and 7 (for C-SVM and nu-SVM with T3 load). The accuracy with bold mark refers to the best ones in that particular speed.

Fig. 4.
figure 4

Cross validation accuracy at 40 Hz speed for two MSVM with three loads

Table 2. Fault classifications with C-SVM for T1 load
Table 3. Fault classifications with nu-SVM for T1 load
Table 4. Fault classifications with C-SVM for T2 load
Table 5. Fault classifications with nu-SVM for T2 load
Table 6. Fault classifications with C-SVM for T3 load
Table 7. Fault classifications with nu-SVM for T2 load

5.2 Prediction Ability

After optimization of MSVM parameters \( 12 \times 10 \times 30 \) data points are used for the final testing of the fault classification for GA and GSM. Many occasions the accuracy is more in GA as compared with the GSM, which reflect the soundness of the GA.

At T1 Load.

It observes the testing accuracy, the lowest one is equal to 80.84% and this occurs at 15 Hz rotational speeds for nu-SVM case. Tables 2 and 3 illustrate the percentage prediction in various rotational speeds against the best prediction. The prediction 58.62% is the individual lowest against RMF case at 10 Hz rotational speed.

At T2 Load:

It observes the testing accuracy, the lowest one is equal to 83.91% and this occurs at 10 Hz rotational speeds for C-SVM case. Tables 4 and 5 illustrate the percentage prediction in various rotational speeds against the best prediction. The prediction 55.17% is the individual lowest against ND case at 15 Hz rotational speed.

At T3 Load:

It observes the testing accuracy, the lowest one is equal to 84.25% and this occurs at 10 Hz rotational speeds for C-SVM case. Tables 6 and 7 illustrate the percentage prediction in various rotational speeds against the best prediction. The prediction 55.17% is the individual lowest against ND case at 15 Hz rotational speed.

Initially for each of the ten classification cases (i.e., BF, RMF, BRF, UR, BRBF, MSWF, SWF, MPUSPF, PUSPF and ND), the training data was provided at the running speeds from 10 Hz to 40 Hz in the intervals of 5 Hz and then the multiclass classification capability of two classes of MSVM was noted for these running speeds. It is concluded that MSVM has the ability to make perfect classifications if the training data is available for that particular running speed. It is also observed that the prediction accuracy gradually increases with the increase of the rotational speed and load. This is due to the high signal-to-noise level at the high rotation speed due to better manifestation of faults in vibration signals at these speeds. It is observed that nu-SVM showed good predictions. Overall at 15 Hz rotational speed the prediction is lowest.

6 Conclusions

In this work, the induction motor fault classification capabilities of the C-SVM and the nu-SVM in MSVM with the use of the best parameter chosen by the GA is demonstrated and results are compared with the parameter chosen by the conventional GSM technique. The raw data in time domain were measured and stored from an experimental setup with the interchanging of nine defective IMs (BF, RMF, BRF, UR, BRBF, MSWF, SWF, MPUSPF, PUSPF) along with the healthy (ND) by a tri-axial accelerometer and three current probes in a range of motor speeds and three load levels. Three statistical features were calculated from the raw data. The classification accuracy was calculated with RBF kernel in C-SVM and nu-SVM by the using of the GA and GSM techniques. The convergence of population was also demonstrated. The GA based technique shows its ability to improved accuracy with respect to the GSM based technique. The prediction ability of MSVM progressively increases at higher speeds due to better manifestation of fault dynamics in signals. Among the two MSVM, using of nu-SVM shows better results. This same technique can also be applied for the other kernels as discussed in LIBSVM tool. Another important factor is to train the MSVM by a range of rotational speeds and to test it for the prediction at out of the range rotational speed by the proposed method. The frequency domain and time-frequency domain data analysis can also be done with the proposed method.