
1 Introduction

Centrifugal pumps play a vital role in many industries. Hence, their condition monitoring is essential to prevent early failure and production line breakdown, and to improve plant safety, efficiency and reliability. Furthermore, pumps, compressors and piping are among the major causes of equipment failure in oil and gas plants. Centrifugal pumps are sensitive to: (1) variation in liquid condition (i.e. viscosity, specific gravity and temperature), (2) suction variation, such as pressure and the availability of a continuous volume of fluid, and (3) variation in demand. Some failures are induced by cavitation, hydraulic instability or other system-related problems. Others are the direct result of improper maintenance, maintenance-related problems, improper lubrication, misalignment, unbalance, seal leakage, and a variety of other issues that periodically affect machine reliability.

In this research, we use the data of a real centrifugal pump operating in a petroleum plant in the south of Iran. The data consists of 7 columns: the first six are features, i.e. flow, temperature, suction pressure, discharge pressure, velocity and vibration; the last column is the fault class related to those features, ranging from 1 to 5. Table 1 shows an example of the given data and illustrates our fault classification problem.

Table 1. A row of the given data sheet containing the values of the six features and the related fault type, followed by a row of features without the fault type, which is to be diagnosed.

Considering the above explanations, our problem is to devise precise intelligent approaches which receive a number of the data sheet's rows, learn the pattern linking the features to the given fault types, and are afterwards able to detect faults when given only the feature values. Obviously, approaches with fewer errors or misclassifications are preferable. Because failure diagnosis by humans is time-consuming and prone to human error, artificial intelligence and machine learning classification methods have gained popularity for developing diagnostic schemes. Artificial Neural Networks (ANNs), which are inspired by biological nervous systems, have been widely used by researchers in the field of classification. The Support Vector Machine (SVM), presented by Vapnik 1995 [15], is a strong classification method based on Structural Risk Minimisation (SRM). The application of SVM to classification is called Support Vector Classification (SVC); hence, SVM and SVC mean exactly the same in this paper.

The remainder of this paper is organised as follows: Sect. 2 reviews the related literature and the different methods used for fault classification of pumps and similar devices. Our ANN and SVC approaches are described in Sect. 3. Section 4 contains the results and comparisons of all the methods applied. Finally, conclusions and a recommendation for future research are covered in Sect. 5.

2 Related Work

In this section we present an overview of the methods applied to classifying and clustering faults in centrifugal pumps and similar equipment. Researchers in this field have widely used Artificial Intelligence (AI) due to its applicability and capability in learning complicated patterns and performing accurate classification. Sun et al. 2012 [13] review Computational Intelligence (CI) approaches for oil-immersed power transformer maintenance by discussing historical developments and by presenting state-of-the-art fault diagnosis methods.

ANNs, which are among the prominent approaches in AI, have been chosen as classifiers in many papers. For example, Unal et al. 2014 [14] propose an ANN-based fault estimation algorithm verified with experimental tests and promising results. Their ANN model was modified using a genetic algorithm, providing an optimal, fast-reacting network architecture with improved classification results. In Azadeh et al. 2013 [2], a flexible algorithm is proposed for condition monitoring of a centrifugal pump in two different states, based on ANN and SVM with hyper-parameter optimisation.

SVM has gained considerable popularity in studies conducted in recent years. In Bacha et al. 2012 [3], an intelligent fault classification approach based on SVM is applied to power transformer Dissolved Gas Analysis (DGA). An application of SVM to multi-class gear-fault diagnosis is studied by Bansal et al. 2013 [4]. Bordoloi & Tiwari 2014 [5] attempt multi-fault classification of gears with the SVM learning technique using frequency-domain data. Fai & Zhang 2014 [6] apply a support vector machine with a genetic algorithm to fault diagnosis of a power transformer, in which the genetic algorithm is used to select appropriate free parameters of the SVM.

An improved Ant Colony Optimisation (IACO) algorithm is proposed in Li et al. 2013 [9] to determine the parameters of SVM, which is then applied to rolling element bearing fault detection. Gryllias and Antoniadis [7] propose a hybrid two-stage one-against-all SVM approach for the automated diagnosis of defective rolling element bearings. In Muralidharan et al. 2014 [12], the application of the SVM algorithm in the field of fault diagnosis and condition monitoring is discussed. Wang et al. 2014 [16] develop a noise-based intelligent method for Engine Fault Diagnosis (EFD), the so-called HHT-SVM model, based on the techniques of the Hilbert-Huang Transform (HHT) and the Support Vector Machine (SVM). Zhu et al. 2014 [19] train a multi-class SVM to achieve a prediction model, using Particle Swarm Optimisation (PSO) to seek the optimal parameters.

Other methods are used in this area as well. For instance, the survey of Lei et al. 2013 [8] summarises the recent research and development of Empirical Mode Decomposition (EMD) in fault diagnosis of rotating machinery. Azadeh et al. 2010 [1] provide a correct and timely diagnosis mechanism for pump failures through knowledge acquisition in a fuzzy rule-based inference system that can approximate human reasoning. The study of Muralidharan & Sugumaran 2013 [11] uses vibration signals for fault diagnosis of centrifugal pumps using wavelet analysis. Zhang and Nadi 2007 [18] propose three Genetic Programming based approaches for solving multi-class classification problems in roller bearing fault detection.

Finally, it is worth mentioning that, although rarely, clustering approaches are also used in the field of fault detection. Zogg et al. 2006 [20] is an example that simplifies known clustering techniques and introduces new vector clustering techniques for faults of heat pumps.

3 Description of the Applied Methods

In this section, detailed explanations of the structure of our employed ANN and SVC (SVM) methods are given. Afterwards, we describe how the GA is combined with these two classification approaches and illustrate the overall procedure of the devised integrated ANN-GA and SVC-GA algorithms.

3.1 The ANN-GA Framework

In machine learning, Artificial Neural Networks are a family of statistical learning algorithms inspired by biological neural networks. They are mainly used for function approximation, pattern recognition and classification. ANNs are presented as systems of interconnected “neurons” which compute values and whose combination leads to a network that can learn a complicated pattern between inputs and outputs. An ANN consists of nodes acting as neurons in different layers. Each node transmits a final value, obtained by a function in the node called the activation function, to the nodes of the next layer. The first layer has as many neurons as there are inputs, whereas the last layer has as many neurons as there are outputs. Between these two layers, some hidden layers may exist to boost the capability of the ANN.

For our classification problem we need an ANN that receives the values of the six features as inputs and diagnoses the fault type based on them. Figure 1 depicts the structure of the applied ANN with its nodes and activation functions, which converts a series of features to a fault type. The network is fed by the six inputs and sends them to all 3 neurons of the next layer. Each neuron i of the middle layer computes the weighted sum of the inputs plus a constant \(b_i\), denoted \(T_i\): \(T_i=w_{i1}x_1+w_{i2}x_2+\dots +w_{i6}x_6+b_i\). The three values resulting from this layer are sent to the last layer, which consists of only one neuron. This last neuron acts exactly like the neurons of the previous layer and returns the weighted sum of the received values plus a constant \(b_4\), \(E=\displaystyle \sum _{i=1} ^{3}l_i T_i+b_4\). Finally, E goes through a step function, which determines the final output of the network, i.e. the fault class, based on the values of \(c_i\).

Fig. 1.
figure 1

The structure of the applied ANN for fault classification

We summarise the main parameters of this network, which have a crucial effect on its performance, in matrix and vector form:

$$ W = \left[ \begin{array}{cccc} w_{1,1} & w_{1,2} & \cdots & w_{1,6} \\ w_{2,1} & w_{2,2} & \cdots & w_{2,6} \\ w_{3,1} & w_{3,2} & \cdots & w_{3,6} \end{array} \right] $$

\(B=[b_1,b_2,b_3,b_4]\), \(L=[l_1,l_2,l_3]\), \(C=[c_1,c_2,c_3,c_4]\).
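To make the computation concrete, the following minimal Python sketch (not the authors' implementation) evaluates the forward pass of the 6-3-1 network of Fig. 1 for one row of features; the interpretation of the step function, i.e. that the thresholds in C partition the output E into the five fault classes, is an assumption.

```python
import numpy as np

def forward_pass(x, W, B, L, C):
    """Forward pass of the 6-3-1 network sketched in Fig. 1.

    x : (6,)   feature vector (flow, temperature, suction pressure,
               discharge pressure, velocity, vibration)
    W : (3, 6) input-to-hidden weights
    B : (4,)   biases b1..b4 (b1..b3 for the hidden neurons, b4 for the output)
    L : (3,)   hidden-to-output weights l1..l3
    C : (4,)   thresholds c1..c4 of the step function (assumed interpretation)
    """
    T = W @ x + B[:3]                 # T_i = sum_j w_ij * x_j + b_i
    E = L @ T + B[3]                  # E = sum_i l_i * T_i + b_4
    # Step function: map E to a fault class 1..5 via the four thresholds in C
    # (assumption: class k corresponds to the k-th interval between thresholds).
    return int(np.searchsorted(np.sort(C), E)) + 1   # fault class in 1..5

# Example with arbitrary parameter values
rng = np.random.default_rng(0)
x = rng.random(6)
W, B, L, C = rng.random((3, 6)), rng.random(4), rng.random(3), rng.random(4)
print(forward_pass(x, W, B, L, C))
```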

Choosing the best values for the above parameters can improve the classification performance of the ANN, but tuning them to their best or optimal values is a very difficult task. Hence, besides the conventional training methods, we apply the Genetic Algorithm (GA), a powerful evolutionary optimisation algorithm that is able to obtain solutions of good quality in reasonable time. For a detailed explanation of GA, readers are referred to Vose 1999 [10]. Because these parameters are continuous, we use the continuous version of GA, in which the parameter values are considered as genes that together constitute a chromosome. The initial population is generated by producing 200 chromosomes. The fitness of each chromosome is evaluated with the following function:

$$\begin{aligned} \text{Fitness function}=1-\text{proportion of correctly predicted classes}=1-\frac{N_c}{N_T} \end{aligned}$$

where \( N_c \) is the number of correctly predicted faults and \( N_T \) is the total number of predictions. The algorithm seeks to minimise the above fitness function iteration by iteration, reaching a near-optimal solution in the end. The main characteristics of the applied GA are as follows: \(Population\ size=200\), \(Crossover\ percentage=0.7\), \(Mutation\ percentage=0.3\), \(Maximum\ number\ of\ iterations=100\).

Figure 2 illustrates the procedure of our combined ANN-GA algorithm. According to the figure, the initial values of (W, B, L, C) are set first and the data sheet is divided into a training data set and a testing data set. The GA then optimises the parameter values with its selection, crossover and mutation operators. The algorithm terminates when the maximum number of iterations is reached and returns the best values found for the ANN parameters. These values are then used for fault detection.

Fig. 2.
figure 2

The procedure of the applied ANN-GA algorithm
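A compact sketch of how the continuous GA of Fig. 2 could tune the flattened parameter vector (W, B, L, C) against the fitness \(1-N_c/N_T\) is given below. It reuses forward_pass from the sketch above; the specific operators (elitist truncation selection, uniform crossover, Gaussian mutation) are illustrative assumptions rather than the exact operators of the original study, while the population size, iteration limit and crossover/mutation percentages follow Sect. 3.1.

```python
import numpy as np

def fitness(chrom, X, y):
    """1 - proportion of correctly diagnosed faults (to be minimised);
    uses forward_pass from the earlier sketch."""
    W = chrom[:18].reshape(3, 6)
    B, L, C = chrom[18:22], chrom[22:25], chrom[25:29]
    preds = np.array([forward_pass(x, W, B, L, C) for x in X])
    return 1.0 - np.mean(preds == y)

def ga_train(X, y, pop_size=200, iters=100, pc=0.7, pm=0.3, seed=0):
    """Continuous GA over the 29 ANN parameters (18 + 4 + 3 + 4)."""
    rng = np.random.default_rng(seed)
    dim = 29
    pop = rng.uniform(-1.0, 1.0, (pop_size, dim))
    for _ in range(iters):
        fit = np.array([fitness(c, X, y) for c in pop])
        pop = pop[np.argsort(fit)]                 # best chromosomes first
        parents = pop[: pop_size // 2]
        children = parents.copy()
        for child in children:
            if rng.random() < pc:                  # uniform crossover with a random parent
                mate = parents[rng.integers(len(parents))]
                mask = rng.random(dim) < 0.5
                child[mask] = mate[mask]
            if rng.random() < pm:                  # Gaussian mutation
                child += rng.normal(0.0, 0.1, dim)
        pop = np.vstack([parents, children])
    fit = np.array([fitness(c, X, y) for c in pop])
    return pop[np.argmin(fit)]                     # best (W, B, L, C) found
```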

3.2 The SVM-GA Framework

In SVM (SVC), we have a set of training inputs \(D=\{(x_1,y_1),\dots ,(x_l,y_l)\}\), where \(x_i\in R^d\) and \(y_i\in \{-1,1\}\) is the class label, \( i=1,\dots ,l \). The method seeks to find a separating hyperplane that maximises the distance to the nearest data points of each class. This goal is met by minimising the following objective function:

$$\begin{aligned} Min \ \frac{1}{2} \Vert w \Vert ^2+C \sum _{i=1}^{l} \varepsilon _i \end{aligned}$$
(1)
$$\begin{aligned} Subject\ to \ \ y_i[w^T \varPhi (x_i)+b]\ge 1-\varepsilon _i \end{aligned}$$
(2)
$$ \varepsilon _i\ge 0, i=1,...,l $$

This model is called the soft-margin SVM: \(\varepsilon _i \) handles misclassification, w is the weight vector, b is the bias and C is the misclassification penalty that trades off model complexity against training error. In equation (2), \( \varPhi (x_i) \) is a non-linear function that maps the input data to a high-dimensional feature space where the data can be separated linearly. Considering the necessary conditions for optimality, one can turn the above minimisation problem into the following dual form:

$$\begin{aligned} Max \sum _{i=1}^{l} \alpha _i -\frac{1}{2} \sum _{i=1}^{l} \sum _{j=1}^{l} \alpha _i \alpha _j y_i y_j K(x_i,x_j) \end{aligned}$$
(3)
$$\begin{aligned} Subject\ to \ \sum _{i=1}^{l} \alpha _i y_i =0 \end{aligned}$$
(4)
$$0\le \alpha _i\le C, i=1,...,l,$$

where \( K(x_i,x_j) \) is a kernel function representing the inner product \(\langle \varPhi (x_i),\varPhi (x_j)\rangle \) and \(\alpha _i\) is a Lagrangian multiplier. Solving the dual problem leads to the optimal separating hyperplane:

$$\begin{aligned} \sum _{i \in SV} \alpha _i y_i K(x_i,x)+b=0. \end{aligned}$$
(5)

The optimal classifying rule is:

$$\begin{aligned} f(x)=sgn\Big (\sum _{i \in SV} \alpha _i y_i K(x_i,x)+b\Big ), \end{aligned}$$
(6)

where SV is the set of support vectors, i.e. the training points whose Lagrangian multipliers are positive. Figure 3 shows how a soft-margin SVM with a linear separating hyperplane divides the data into two classes.

Fig. 3.
figure 3

Linear separating hyperplanes in the soft-margin SVM
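Written out in code, the classifying rule of Eq. (6) is simply a weighted kernel expansion over the support vectors. The sketch below is a generic illustration (with kernel standing for any kernel function K), not the paper's implementation:

```python
import numpy as np

def svm_decision(x, sv_x, sv_y, alphas, b, kernel):
    """Classifying rule of Eq. (6): sign of the kernel expansion over the
    support vectors sv_x with labels sv_y and positive multipliers alphas."""
    s = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alphas, sv_y, sv_x)) + b
    return 1 if s >= 0 else -1
```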

We used the following kernel functions in our SVC for the fault diagnosis of the centrifugal pump:

$$\begin{aligned} Polynomial: K(x_i,x_j)=(\gamma . <x_i,x_j>+s)^d \end{aligned}$$
(7)
$$\begin{aligned} Gaussian\ basis\ function:K(x_i,x_j)=\exp (-\gamma \Vert x_i-x_j\Vert ^2) \end{aligned}$$
(8)
$$\begin{aligned} Linear: K(x_i,x_j)=<x_i,x_j> \end{aligned}$$
(9)
$$\begin{aligned} Quadratic:K(x_i,x_j)=(<x_i,x_j>+1)^2 \end{aligned}$$
(10)
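For reference, the four kernels of Eqs. (7)-(10) can be written directly as NumPy functions; \(\gamma \), s and d are the free parameters that the GA tunes later (the default values below are placeholders):

```python
import numpy as np

def k_polynomial(xi, xj, gamma=1.0, s=1.0, d=3):   # Eq. (7)
    return (gamma * np.dot(xi, xj) + s) ** d

def k_gaussian(xi, xj, gamma=1.0):                  # Eq. (8)
    return np.exp(-gamma * np.linalg.norm(xi - xj) ** 2)

def k_linear(xi, xj):                               # Eq. (9)
    return np.dot(xi, xj)

def k_quadratic(xi, xj):                            # Eq. (10)
    return (np.dot(xi, xj) + 1) ** 2
```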

Like those of the ANN, the parameters of the SVM, i.e. C, \( \gamma \) and s, strongly affect its performance, so adjusting them properly can considerably improve it. Hence, rather than conventional methods, these parameters are determined with the same GA used for the ANN. Since SVM can only divide the data into two groups and there are 5 fault classes in our problem, we run SVM four times in sequence. Each run separates one fault class from the remaining ones and has its own training process. Figures 4 and 5 illustrate the framework of our SVM-GA algorithm. As shown in Fig. 5, the initial parameters of the SVM are set first. The GA then searches the space of parameter values for each of the 4 runs separately; as it terminates for one run, it begins the parameter setting of the next. Finally, when the parameters have been tuned for all runs, the procedure ends with the best values found. A minimal code sketch of this cascade is given after Fig. 5.

Fig. 4.
figure 4

SVM approach to obtain 5 classes

Fig. 5.
figure 5

The procedure of the applied SVM-GA
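As noted above, the cascade of Figs. 4 and 5 can be sketched as follows, here using scikit-learn's SVC as the binary classifier; the hyper-parameter dictionaries stand in for the GA-tuned values and the peel-off order (class 1 first, class 5 as the remainder) is an assumed reading of Fig. 4:

```python
import numpy as np
from sklearn.svm import SVC

def train_cascade(X, y, params):
    """Train four binary SVMs: stage k separates fault class k from classes k+1..5.

    params: list of 4 dicts with the (GA-tuned) hyper-parameters of each stage,
            e.g. {"kernel": "rbf", "C": 10.0, "gamma": 0.5}.
    """
    Xr, yr = np.asarray(X, dtype=float), np.asarray(y)
    stages = []
    for k in range(1, 5):                      # classes 1..4; class 5 is the remainder
        clf = SVC(**params[k - 1])
        clf.fit(Xr, (yr == k).astype(int))     # class k vs. the rest at this stage
        stages.append(clf)
        keep = yr != k                         # pass only the remaining classes on
        Xr, yr = Xr[keep], yr[keep]
    return stages

def predict_cascade(stages, x):
    """Walk the cascade of Fig. 4 until one stage claims the sample."""
    for k, clf in enumerate(stages, start=1):
        if clf.predict(np.asarray(x, dtype=float).reshape(1, -1))[0] == 1:
            return k
    return 5                                   # none of the first four stages fired
```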

Fig. 6.
figure 6

Fitness function values of SVC-Gaussian-GA

Figure 6 illustrates the fitness function values of SVM-GA with Gaussian kernel function from the first up to the last iteration of GA.

4 Results and Comparisons

In this section we present a brief overview of the achieved results. We have 100 rows of data in total. To feed the algorithms, 70 % of the data are randomly selected for training and 30 % for testing. To make the data noisy and test the robustness of the approaches, 0.1 is added to 30 % of the entries in columns 1, 3 and 6 of the data sheet. Table 2 shows the percentage of correct fault diagnoses of the pure ANN and SVM methods without GA improvements, and Fig. 7 depicts these values visually. SVM-Gaussian has the best performance and good robustness. SVC-Linear is in second position but has the highest robustness of all. SVC-Quadratic, ANN and SVC-Polynomial occupy the next ranks. It is worth mentioning that ANN has the worst robustness.
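A hedged sketch of this experimental protocol, i.e. the 70/30 random split and the additive 0.1 perturbation on a random 30 % of the entries of columns 1, 3 and 6, is given below; since the paper does not state which entries were perturbed, the random selection here is an assumption:

```python
import numpy as np
from sklearn.model_selection import train_test_split

def make_noisy(X, cols=(0, 2, 5), frac=0.3, delta=0.1, seed=0):
    """Add `delta` to a random `frac` of the entries of the given columns
    (columns 1, 3 and 6 of the data sheet, zero-indexed here)."""
    rng = np.random.default_rng(seed)
    Xn = np.asarray(X, dtype=float).copy()
    for c in cols:
        rows = rng.choice(len(Xn), size=int(frac * len(Xn)), replace=False)
        Xn[rows, c] += delta
    return Xn

def split_and_perturb(X, y):
    """70 % training / 30 % testing split, plus a noisy copy of the test set."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    return X_tr, X_te, make_noisy(X_te), y_tr, y_te
```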

Table 2. Correct diagnosis proportion of the pure ANN and SVM
Fig. 7.
figure 7

Correct diagnosis of the pure ANN and SVM with Normal and Noisy Data

To show the superiority of our ANN-GA and SVM-GA, the diagnosis experiments are also conducted with K-Nearest Neighbours (KNN) and Decision Tree, which are highly rated classification methods. Table 3 and Fig. 8 show these performance comparisons. According to the results, SVM-Gaussian again has the best performance of all, and the GA enhancement enables it to detect faults correctly in all cases, both in the normal and the noisy environment. Therefore, it is also the most robust, together with SVM-GA-Linear. ANN-GA performs worse than SVM-GA with all kernels in terms of correctness; in terms of robustness, SVM-GA is superior except with the polynomial kernel, which yields almost the same result as ANN-GA. Finally, the worst diagnosis performances belong to KNN and Decision Tree.

Table 3. Correct diagnosis proportion of ANN-GA, SVM-GA, KNN and Decision Tree with Normal and Noisy Data
Fig. 8.
figure 8

Correct diagnosis proportion of ANN-GA, SVM-GA, KNN and Decision Tree with Normal and Noisy Data

To show the effect of GA on ANN and SVM, the performance improvements are depicted in Fig. 9. The largest improvement is for SVM-Gaussian in the noisy condition and the lowest for SVM-Polynomial in the noisy condition.

Fig. 9.
figure 9

The improvement of methods by GA

To compare the methods in detail, McNemar's tests are executed and the results are tabulated in Table 4 to examine which model significantly outperforms the others. McNemar's test is a nonparametric statistical test for two related nominal samples with a null hypothesis of marginal homogeneity in a \( 2\times 2 \) contingency table. A detailed explanation of McNemar's test is provided in Webb et al. 2011 [17]. If the significance level is set to 10 %, a p-value of less than 0.1 indicates that the models differ significantly. The tests for SVC and SVC-GA are performed with their best kernel function, which is Gaussian according to the accuracy results.

Table 4. McNemar's test results (p-values)
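For reference, one way the pairwise p-values of Table 4 could be computed (a sketch using statsmodels; the software actually used is not stated in the paper):

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

def mcnemar_p(y_true, pred_a, pred_b):
    """p-value of McNemar's test comparing two classifiers on the same test set."""
    a_ok, b_ok = pred_a == y_true, pred_b == y_true
    # 2x2 table of joint correctness: rows = model A correct/wrong, cols = model B
    table = np.array([[np.sum(a_ok & b_ok),  np.sum(a_ok & ~b_ok)],
                      [np.sum(~a_ok & b_ok), np.sum(~a_ok & ~b_ok)]])
    return mcnemar(table, exact=True).pvalue
```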

Finally, we perform 10-fold cross-validation to evaluate the validity of the models. To this end, the data sheet is divided into 10 even subsets; each is used once as the test dataset with the remaining 9 as the training dataset. The averages of the models' accuracies (the proportion of correctly predicted fault types) are then used for validity evaluation and are presented in Table 5.

Table 5. Average of accuracies over the 10 folds
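A minimal sketch of this 10-fold evaluation, assuming scikit-learn's cross-validation utilities (the hyper-parameters shown are placeholders, not the GA-tuned values):

```python
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

def ten_fold_accuracy(model, X, y):
    """Average accuracy over 10 even folds, as reported in Table 5."""
    cv = KFold(n_splits=10, shuffle=True, random_state=0)
    return cross_val_score(model, X, y, cv=cv).mean()

# Example for the Gaussian-kernel SVC (placeholder hyper-parameters):
# print(ten_fold_accuracy(SVC(kernel="rbf", C=10.0, gamma=0.5), X, y))
```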

5 Conclusion

In this paper, we presented fault classification algorithms that combine two intelligent machine learning methods, namely ANN and SVM, with the Genetic Algorithm. The results showed that GA can significantly improve the performance of the classifiers. The performances of all employed algorithms, i.e. ANN, ANN-GA, SVC, SVC-GA, KNN and Decision Tree, were compared through different tests in normal and noisy conditions. The comparisons showed that SVM with the Gaussian kernel function had the best accuracy in correct fault diagnosis and excellent robustness against noise. It was also observed that SVM is superior to ANN in most cases. For future research in this direction, testing the ability of other optimisation algorithms to improve ANN, SVM and other classification methods is recommended.