1 Introduction

Industrial plants must currently maintain high levels of quality and efficiency in production while complying with demanding regulations on environmental protection, industrial reliability, and safety [1,2,3]. It is well known that faults reduce the availability of industrial systems and adversely affect the safety of the operators. For these reasons, modern industries have increasingly incorporated condition monitoring systems for detecting and locating faults [4,5,6].

In the scientific literature, condition monitoring schemes are classified into two main groups: model-based [7,8,9] and data-driven methods [10, 11]. In the model-based approach, condition monitoring relies on residuals generated as the difference between the variables measured from the industrial plant and the variables estimated by a model that simulates the operation of the process. The effectiveness of these methods depends on the quality of the model, which demands a high degree of process knowledge from experts. However, the complexity of current industrial plants, manifested in strongly nonlinear behaviors of the variables and their relations, makes such knowledge very difficult to achieve [12]. Data-driven solutions do not need such precise knowledge of the system parameters and the relationships among variables, which is an advantage in complex processes [13,14,15,16].

A review of condition monitoring strategies over the last two decades shows an important increase in the use of fuzzy techniques [17,18,19,20,21]. Membership grades are essential elements in the theory of fuzzy sets. Intuitionistic [22, 23] and interval-valued [24] fuzzy sets were introduced to handle situations where the classical theory of fuzzy sets cannot be applied [25]. More recently, another non-standard fuzzy subset, the Pythagorean fuzzy set, was introduced by Yager [26, 27]. As shown in [27], the space of membership grades of Pythagorean fuzzy sets exceeds that generated by the membership grades of the intuitionistic type, which is a significant advantage in several types of applications [28,29,30,31].

The operation of a condition monitoring system in an industrial plant is seriously affected by external disturbances and noise in the measurements of the process variables because they introduce imprecision and uncertainty into the observations. To overcome these difficulties, a condition monitoring scheme based on Pythagorean membership grades (PyMGs) is presented as the main contribution of this paper. In the proposal, a set of n fuzzy classification algorithms is trained in an off-line stage. The Pythagorean membership grades (PyMGs) and the complements obtained by these n algorithms are used in rule-based decisions to obtain an enhanced partition matrix. This improves the positioning of the class centers and the clustering of the data and, as a result, the performance of the condition monitoring system. The use of a set of n algorithms in the training stage improves the performance of the classification process, while the Pythagorean fuzzy sets reduce false alarms by improving the robustness of the fault diagnosis against noise and external disturbances.

The remainder of the paper is organized as follows. Section 2 presents a background on the theory of Pythagorean membership grades, the methodology that uses PyMGs to improve the performance of condition monitoring systems, and an illustrative example. In Sect. 3, the Gustafson-Kessel (GK) and kernel fuzzy C-means (KFCM) algorithms, the UCI machine learning data sets, and the Tennessee Eastman (TE) benchmark, which are used to test the proposed methodology, are presented. In Sect. 4, the results obtained are evaluated. A performance comparison with successful algorithms is developed in Sect. 5. Finally, the conclusions are presented.

2 Materials and Methods

This section presents the main concepts of Pythagorean membership grades and describes the main proposal of this paper.

2.1 Pythagorean Membership Grades

Pythagorean fuzzy sets (PyFS) and their associated membership grades, identified in this paper as Pythagorean membership grades (PyMGs), were presented in [27]. Next, the fundamental elements that characterize the PyMGs are presented using a mathematical formulation similar to that of the original article [27].

Consider a space S and a fuzzy subset D of this space. The PyMG of each element \(s \in S\) is represented by the values \(f(s), h(s) \in [0,1]\), termed the strength and the direction of commitment at s, respectively. Both are related to the support for membership (\(D_{Y}(s)\)) and the support against membership (\(D_{N}(s)\)) of s in D. What distinguishes the PyFS from other nonstandard membership grades is the relation between \(D_{Y}(s)\) and \(D_{N}(s)\), which is established by using the Pythagorean complement with respect to f(s). In this case, \(D_{Y}(s)\) and \(D_{N}(s)\) are defined from f(s) and h(s) as

$$\begin{aligned} D_{Y}(s) = f(s)\cos (\phi (s)) \end{aligned}$$
(1)
$$\begin{aligned} D_{N}(s) = f(s)\sin (\phi (s)) \end{aligned}$$
(2)

where

$$\begin{aligned} \phi (s) = (1-h(s))\frac{\pi }{2}\ \textrm{radians}, \quad \phi (s)\in \left[ 0, \frac{\pi }{2}\right] \end{aligned}$$
(3)

From these relations, it is demonstrated in [27] that

$$\begin{aligned} D^{2}_{Y}(s) = f^{2}(s) - D^{2}_{N}(s) \end{aligned}$$
(4)

From Eq. (3), if \(h(s)= 1\), then \(\phi (s) = 0\), and, therefore, \(\cos (\phi (s)) = 1\) and \(\sin (\phi (s)) = 0\). This indicates that \(D_{Y}(s) = f(s)\) and \(D_{N}(s) = 0\). On the other hand, if \(h(s) = 0\), then \(\phi (s) = \pi /2\), \(D_{Y}(s) = 0\) and \(D_{N}(s) = f(s)\). Therefore, h(s) indicates how f(s) points toward membership. If \(h(s) = 1\), the strength f(s) is fully committed to membership, and if \(h(s) = 0\), it is fully committed to nonmembership. Partial support for membership and nonmembership is represented by values of h(s) between 0 and 1.

Note that, in general form, a PyMG is represented by two values \(m,n \in [0,1]\) that satisfy \(m^{2} + n^{2} \le 1\). Here, \(m = D_{Y}(s)\) represents the degree of membership of s in D, and \(n = D_{N}(s)\) is the degree of nonmembership of s in D. Since \(m^{2} + n^{2} = f^{2}\), a PyMG represents a point on a circle of radius f. Since it is required that \(m, n \in [0, 1]\), then \(\phi \in [0, \pi /2]\), which locates the point that represents the PyMG in the upper right quadrant.
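To make these relations concrete, the following minimal Python sketch (an illustration, not taken from the original article) computes a PyMG from a given strength f(s) and direction h(s) using Eqs. (1)-(3) and checks that the resulting pair lies inside the unit circle; the input values are illustrative:

```python
import math

def pythagorean_grade(f, h):
    """Map strength f and direction h (both in [0, 1]) to (D_Y, D_N)."""
    phi = (1.0 - h) * math.pi / 2.0   # Eq. (3)
    d_y = f * math.cos(phi)           # Eq. (1): support for membership
    d_n = f * math.sin(phi)           # Eq. (2): support against membership
    return d_y, d_n

m, n = pythagorean_grade(f=0.9, h=0.75)   # illustrative values
assert m**2 + n**2 <= 1.0 + 1e-12         # a PyMG lies on a circle of radius f
print(m, n)                               # 0.9*cos(pi/8), 0.9*sin(pi/8)
```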

Fig. 1 Spaces of Pythagorean and intuitionistic membership grades [27]

Fig. 2 Classification scheme using Pythagorean membership grades

An intuitionistic membership grade (IMG) is also represented by values \(m,n \in [0, 1]\), but in this case it is required that \(m + n \le 1\). It is demonstrated in [27] that every point (m, n) that represents an IMG is also a PyMG. For example, consider the point \((\frac{\sqrt{5}}{4}, \frac{\sqrt{11}}{4})\): \((\frac{\sqrt{5}}{4})^{2} + (\frac{\sqrt{11}}{4})^{2} = \frac{5}{16} + \frac{11}{16} = 1\), which indicates that it is a PyMG. However, since \(\frac{\sqrt{5}}{4} = \frac{2.236}{4} = 0.559\) and \(\frac{\sqrt{11}}{4} = \frac{3.317}{4} = 0.829\), then \(0.559 + 0.829 > 1\), which indicates that it is not an IMG. From this, it can be concluded that the set of PyMGs exceeds the set of IMGs. Figure 1 shows this result: the IMGs are the points under the line \(m + n = 1\), and the PyMGs are the points under the curve \(m^{2} + n^{2} = 1\).
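The check above is easy to reproduce; the following sketch (illustrative only) tests whether a point (m, n) qualifies as an IMG and/or a PyMG, using the example point from the text:

```python
import math

def is_img(m, n):
    return m + n <= 1.0                   # intuitionistic constraint

def is_pymg(m, n):
    return m**2 + n**2 <= 1.0 + 1e-12    # Pythagorean constraint (fp tolerance)

m, n = math.sqrt(5) / 4, math.sqrt(11) / 4
print(is_pymg(m, n))   # True:  5/16 + 11/16 = 1
print(is_img(m, n))    # False: 0.559 + 0.829 > 1
```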

One important consequence of the above analysis is the possibility of using PyFSs in situations where intuitionistic fuzzy sets (IFSs) cannot be used. This advantage is very useful for enhancing the performance of condition monitoring schemes in industrial plants.

2.2 Description of the Proposal

In Fig. 2, the condition monitoring scheme based on PyMGs proposed in this paper is displayed. The proposal comprises two stages: a training stage (TS) performed off-line and a recognition stage (RS) performed online. In the TS, the fuzzy classifiers are trained using a historical database of the process, which contains an adequate amount of data representative of the l operating states or classes of the system (normal operation and fault states). In the RS, the trained fuzzy classifiers evaluate the membership degree of each observation \(x_q\) to the different classes. The highest membership value decides the class to which the observation is allocated, i.e.,

$$\begin{aligned} C_{l} = \left\{ l: \max \left\{ \mu _{lx_q}\right\} , \forall l,x_q\right\} \end{aligned}$$
(5)
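As an illustration of Eq. (5), the following sketch (with hypothetical membership values) assigns each observation to the class with the highest membership:

```python
import numpy as np

# Hypothetical membership matrix: rows are l = 3 classes, columns are 4 observations.
mu = np.array([[0.7, 0.1, 0.2, 0.5],
               [0.2, 0.8, 0.3, 0.4],
               [0.1, 0.1, 0.5, 0.1]])

assigned = np.argmax(mu, axis=0)   # Eq. (5): class with the highest membership
print(assigned)                    # [0 1 2 0]
```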

2.3 Training Stage

In this phase, n classifiers are trained and the centers of the different classes are determined. The outputs of these algorithms (the membership grades and their complements) are used in rule-based decisions to obtain an enhanced partition matrix U, which improves the position of the class centers and thereby the classification process. In the construction of the rule base for decision making, Pythagorean fuzzy sets are used to obtain a larger classification space. This is a powerful advantage in industrial systems, where process variables are affected by noise and external disturbances, and it contributes to a robust condition monitoring system.

Let \(Al_i,\ i=1,2,\ldots ,n\) be a set of n fuzzy classification algorithms, and \(Al_i^+\), \(Al_i^-\) their respective fuzzy partition matrices and complements obtained in the training stage. The rule base has \(2^n\) rules, built as follows [26, 27]:

(6)

where \(U_1,U_2,\ldots ,U_{2^n}\) are the fuzzy partition matrices obtained as outputs of the rules \(R_1, R_2,\ldots ,R_{2^n}\), respectively.

This rule base allows a better clustering of the data (a better fuzzy partition matrix \(U_{better}\)) because it uses the classification information obtained by the n classification algorithms. In addition, the use of PyMGs enhances the robustness of the system with respect to noise. Both aspects contribute to a better location of the class centers in the training stage, which leads to a robust and more accurate classification during the online recognition stage.
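Since the full rule base of Eq. (6) is not reproduced here, the following Python sketch (an illustration, not the authors' exact formulation) shows one plausible way to form the \(2^n\) rule outputs: each rule selects, for every algorithm, either its partition matrix \(Al_i^+\) or the complement \(Al_i^-\), and the selections are combined with the min operator as the fuzzy AND (the choice of t-norm is an assumption):

```python
from itertools import product
import numpy as np

def build_rule_outputs(partitions):
    """Form the 2**n rule outputs U_1, ..., U_{2^n}: each rule picks, per
    algorithm, either its partition matrix (Al_i^+) or the complement
    (Al_i^- = 1 - Al_i^+) and combines the picks with min as the fuzzy AND."""
    complements = [1.0 - u for u in partitions]
    outputs = []
    for signs in product((0, 1), repeat=len(partitions)):
        chosen = [partitions[i] if s == 0 else complements[i]
                  for i, s in enumerate(signs)]
        outputs.append(np.minimum.reduce(chosen))
    return outputs

# Example with n = 2 algorithms and (2 classes x 3 observations) matrices:
U_plus = [np.random.default_rng(0).random((2, 3)) for _ in range(2)]
print(len(build_rule_outputs(U_plus)))   # 2**2 = 4 rules
```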

2.4 Recognition Stage

The observations coming from the industrial plant are classified into the different classes by using Algorithm 1. To perform the classification, the distance between each observation \(x_q\) and the centers of the different classes obtained in the previous stage is computed, and the membership degree of the observation to the l classes is determined. An observation \(x_q\) is allocated to the class with the highest membership degree:

$$\begin{aligned} C_{l} = \left\{ l: \max \left\{ \mu ^{Al_1}_{lx_q},......., \mu ^{Al_n}_{lx_q}\right\} , \forall l,x_q\right\} \end{aligned}$$
(7)
Algorithm 1
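As a sketch of the recognition stage (not the authors' exact Algorithm 1), the following code computes FCM-type memberships of a new observation from its distances to the class centers and assigns the class with the highest membership, as in Eq. (7); the centers are illustrative:

```python
import numpy as np

def recognize(x_q, centers, m=2.0):
    """Assign observation x_q to a class by FCM-type membership (cf. Eq. (13))."""
    d2 = np.array([np.sum((x_q - z) ** 2) for z in centers])  # squared distances
    d2 = np.maximum(d2, 1e-12)                                # guard division by zero
    mu = 1.0 / np.sum((d2[:, None] / d2[None, :]) ** (1.0 / (m - 1.0)), axis=1)
    return int(np.argmax(mu)), mu                             # Eq. (7)

centers = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]        # illustrative centers
label, mu = recognize(np.array([4.5, 5.2]), centers)
print(label, mu)   # the class with the highest membership wins
```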

3 Study Cases and Experimental Design

To validate the proposed condition monitoring scheme, the Gustafson-Kessel (GK) and kernel fuzzy C-means (KFCM) classifiers will be used.

3.1 Gustafson-Kessel (GK) Algorithm

In [32], the standard fuzzy c-means (FCM) classifier is modified by using an adaptive distance norm with the aim of recognizing clusters of different shapes in a data set. The norm-inducing matrix \(A_{i}\) of each cluster is used to yield the inner-product norm:

$$\begin{aligned} d^{2}_{iq} = \left( x_{q} - z_{i}\right) ^{T}A_{i}\left( x_{q} - z_{i}\right) \end{aligned}$$
(8)

To fit the clusters, the matrices \(A_{i}\) are used as optimization variables in the c-means functional. The objective function of the GK classifier is defined by:

$$\begin{aligned} J\left( X;U,z,\left\{ A_{i}\right\} \right) = \sum _{i=1}^{c}\sum _{q=1}^{N}\left( \mu _{iq}\right) ^{m}\left( d_{iq,Ai}\right) ^{2} \end{aligned}$$
(9)

The following constraints are established [32]:

$$\begin{aligned} |A_{i}| = \rho _{i}, \quad \rho _{i}>0, \quad \forall i \end{aligned}$$
(10)

where \(\rho _{i}\) is fixed for each cluster. The expression for \(A_{i}\) is obtained by using the Lagrange multiplier method as follows:

$$\begin{aligned} A_{i} = \left[ \rho _{i}\det (F_{i})\right] ^{1/n}F_{i}^{-1} \end{aligned}$$
(11)

\(F_{i}\) represents the fuzzy covariance matrix of the \(i\)-th cluster, and it is computed as:

$$\begin{aligned} F_{i} = \frac{\sum _{q=1}^{N}\left( \mu _{iq}\right) ^{m}\left( x_{q} - z_{i}\right) \left( x_{q} - z_{i}\right) ^{T}}{\sum _{q=1}^{N}\left( \mu _{iq}\right) ^{m}} \end{aligned}$$
(12)

By using Lagrangian multipliers, the conditions for the local extrema of the objective function given by Eq. (9) are derived:

$$\begin{aligned} \mu _{iq} = \frac{1}{\sum _{j=1}^{c}\left( d_{iq,A_{i}}/d_{jq,A_{j}}\right) ^{2/\left( m-1\right) }} \end{aligned}$$
(13)
$$\begin{aligned} {\textbf{z}}_{i} = \frac{\sum _{q=1}^{N}\mu _{iq}^{m}{\textbf{x}}_{q}}{\sum _{q=1}^{N}\mu _{iq}^{m}} \end{aligned}$$
(14)

Algorithm 2 shows the GK algorithm.

Algorithm 2
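A minimal Python sketch of one GK iteration, assuming \(\rho _i = 1\) and following Eqs. (8) and (11)-(14), could look as follows (a sketch, not the authors' exact Algorithm 2):

```python
import numpy as np

def gk_step(X, U, m=2.0, rho=None):
    """One GK update. X: (N, n_feat) data; U: (c, N) partition matrix."""
    c, N = U.shape
    n_feat = X.shape[1]
    rho = np.ones(c) if rho is None else rho
    Um = U ** m
    Z = (Um @ X) / Um.sum(axis=1, keepdims=True)           # Eq. (14): centers
    d2 = np.empty((c, N))
    for i in range(c):
        diff = X - Z[i]                                     # (N, n_feat)
        F = (Um[i][:, None, None] * (diff[:, :, None] * diff[:, None, :])).sum(0)
        F /= Um[i].sum()                                    # Eq. (12): fuzzy covariance
        A = (rho[i] * np.linalg.det(F)) ** (1.0 / n_feat) * np.linalg.inv(F)  # Eq. (11)
        d2[i] = np.einsum('nj,jk,nk->n', diff, A, diff)     # Eq. (8): A_i-norm distance
    d2 = np.maximum(d2, 1e-12)
    U_new = 1.0 / ((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1.0))).sum(1)  # Eq. (13)
    return U_new, Z
```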

3.2 Kernel Fuzzy C-Means Algorithm (KFCM)

The kernel function in the KFCM algorithm carries out the mapping of the data from the input space to a space of higher dimension. A mapping \(\varvec{\Omega }\) modifies the objective function of FCM as follows:

$$\begin{aligned} J_{\textrm{KFCM}} = \sum _{i=1}^{c}\sum _{q=1}^{N}\left( \mu _{iq}\right) ^{m}\left\| \varvec{\Omega }({\textbf{x}}_{q}) - \varvec{\Omega }({\textbf{z}}_{i})\right\| ^{2} \end{aligned}$$
(15)

\(\left\| \varvec{\Omega }({\textbf{x}}_{q})-\varvec{\Omega }({\textbf{z}}_{i})\right\| ^{2}\) represents the squared distance between \(\varvec{\Omega }({\textbf{x}}_{q})\) and \(\varvec{\Omega }({\textbf{z}}_{i})\). By using the kernel function, this distance in the feature space is computed in the following form:

$$\begin{aligned} \left\| \varvec{\Omega }({\textbf{x}}_{q})-\varvec{\Omega }({\textbf{z}}_{i})\right\| ^{2} = {\textbf{K}}({\textbf{x}}_{q},{\textbf{x}}_{q}) - 2{\textbf{K}}({\textbf{x}}_{q},{\textbf{z}}_{i}) + {\textbf{K}}({\textbf{z}}_{i},{\textbf{z}}_{i}) \end{aligned}$$
(16)

Several kernel functions can be used, and the selection depends on the type of application. However, one of the most popular kernel functions, owing to its satisfactory results, is the Gaussian kernel, which is selected for use in this paper [10, 33].

As the Gaussian kernel will be used, \(\mathbf {K(x_q,x_q)=}\mathbf {K(z_{i},z_{i})= 1}\) and \(\left\| \varvec{\Omega (x_{q})}-\varvec{\Omega (z_{i})}\right\| ^{2}= \mathbf {2\left( 1-K(x_{q},z_{i})\right) }\). Therefore, Eq. (15) is expressed as:

$$\begin{aligned} J_\textrm{KFCM} = 2\sum _{i=1}^{c}\sum _{q=1}^{N}\left( \mu _{iq}\right) ^{m}\left( 1-\mathbf {K(x_{q},z_{i})}\right) \end{aligned}$$
(17)

where

$$\begin{aligned} \mathbf {K(x_{q},z_{i})} = e^{-\left\| {\textbf{x}}_{q}-{\textbf{z}}_{i}\right\| ^{2}/\sigma ^{2}} \end{aligned}$$
(18)

The conditions for the local extrema of the objective function given by Eq. (17) are derived using Lagrangian multipliers:

$$\begin{aligned} {\textbf{z}}_{i} = \frac{\sum _{q=1}^{N}\mu _{iq}^{m}\mathbf {K(x_{q},z_{i})}{\textbf{x}}_{q}}{\sum _{q=1}^{N}\mu _{iq}^{m}\mathbf {K(x_{q},z_{i})}} \end{aligned}$$
(19)
$$\begin{aligned} \mu _{iq} = \frac{1}{\sum _{j=1}^{c}\left( \frac{1-\mathbf {K(x_{q},z_{i})}}{1-\mathbf {K(x_{q},z_{j})}}\right) ^{1/\left( m-1\right) }} \end{aligned}$$
(20)

The KFCM algorithm is displayed in Algorithm 3.

Algorithm 3
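A sketch of one KFCM update with the Gaussian kernel, following Eqs. (18)-(20), is shown below (illustrative, not the authors' exact Algorithm 3):

```python
import numpy as np

def kfcm_step(X, Z, U, m=2.0, sigma=30.0):
    """One KFCM update. X: (N, n_feat) data; Z: (c, n_feat) centers; U: (c, N)."""
    # K[i, q] = exp(-||x_q - z_i||^2 / sigma^2)              Eq. (18)
    d2 = ((X[None, :, :] - Z[:, None, :]) ** 2).sum(-1)      # (c, N)
    K = np.exp(-d2 / sigma**2)
    Um = U ** m
    W = Um * K                                               # weights mu^m * K
    Z_new = (W @ X) / W.sum(axis=1, keepdims=True)           # Eq. (19): centers
    one_minus_K = np.maximum(1.0 - K, 1e-12)                 # guard division by zero
    U_new = 1.0 / ((one_minus_K[:, None, :] / one_minus_K[None, :, :])
                   ** (1.0 / (m - 1.0))).sum(1)              # Eq. (20)
    return U_new, Z_new
```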
Fig. 3 Flowchart of the condition monitoring process

The general condition monitoring scheme using the GK and KFCM algorithms is shown in Fig. 3. In this scheme, the online recognition algorithm declares the existence of a fault only if j samples representative of it are received within a time window; only then is the abnormal-situation alarm of the process triggered. This is done to reduce false alarms in the presence of noise or outliers. The parameter j and the size of the time window are selected using expert knowledge of the process.
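The windowed decision logic can be sketched as follows; the values of j and the window size are illustrative, since in practice they are chosen from expert knowledge:

```python
from collections import deque

def alarm_logic(predictions, j=5, window=10, fault_label=1):
    """Raise the alarm only when at least j of the last `window` classified
    samples indicate a fault (illustrative parameter values)."""
    recent = deque(maxlen=window)
    for k, label in enumerate(predictions):
        recent.append(label)
        if sum(1 for r in recent if r == fault_label) >= j:
            return k  # sample index at which the abnormal-situation alarm fires
    return None       # no alarm raised
```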

Fig. 4 Data sets for experiments in the first study case

3.3 Study Cases

Case Study 1: Synthetic Datasets The datasets shown in Fig. 4 are used to validate the performance of the proposed condition monitoring scheme. Although each contains only two classes of two variables, these datasets were created synthetically to represent complex classification situations. They were obtained from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.php). The sets Data A, Data B and Data D have 1,000 observations each, and the set Data C has 700 observations, to evaluate the performance on small amounts of data.

In the training stage, 750 observations from each of the sets Data A, Data B and Data D were used, and the remaining 250 observations were used in the recognition stage. In the case of the set Data C, 525 observations were used in the training stage and 175 in the recognition stage (see Fig. 5).

Fig. 5 Case Study 1: dataset preparation

Table 1 shows the parameter values used in the GK and KFCM classifiers for the experiments. For the parameter \(\sigma \), several experiments (\(\sigma =10,20,30,40,\ldots ,100\)) were performed, and the value of \(\sigma \) that gave the best classification results was selected.
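This selection can be sketched as a simple grid search; evaluate_kfcm, train_data and test_data are hypothetical placeholders for training the classifier and returning its classification accuracy:

```python
# Illustrative sweep over sigma: train with each value and keep the one
# with the best classification accuracy (evaluate_kfcm is hypothetical).
best_sigma, best_acc = None, -1.0
for sigma in range(10, 101, 10):
    acc = evaluate_kfcm(train_data, test_data, sigma=sigma)  # hypothetical helper
    if acc > best_acc:
        best_sigma, best_acc = sigma, acc
```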

Fig. 6 Piping diagram of the Tennessee Eastman process

Case Study 2: Tennessee Eastman Process The second case study is the Tennessee Eastman (TE) benchmark process, which has been extensively used to assess the effectiveness of new control strategies and monitoring schemes [34]. The plant has five interconnected subprocesses, as shown in Fig. 6.

This benchmark contains one data set corresponding to the normal operating condition and 21 preprogrammed faults. The noisy data sets of the TE process are generated for 48 hours, and the faults are introduced after 8 hours of simulation. A detailed description of the control objectives, the main process features and its simulation is presented in [35]. All data sets used were obtained from http://web.mit.edu/braatzgroup/TEprocess.zip. Table 2 displays the set of faults used in the performance evaluation of the proposed condition monitoring strategy. For training, 480 observations of each of the four faults were used, and 960 observations were considered in the recognition stage (see Fig. 7).

Table 1 Parameters used in the GK and KFCM algorithms in the case study 1

Table 3 shows the values of the parameters used in the GK and KFCM algorithms in case study 2. The parameters were selected according to the experience gained in previous work [36].

3.4 Explanatory Example

To better understand why the proposed scheme improves the robustness and classification performance of the condition monitoring system, which is the main contribution of this paper, an illustrative example applying the kernel fuzzy C-means (KFCM) and Gustafson-Kessel (GK) algorithms, identified as B and E respectively, is presented. For this example, the training database has 500 observations from the data set Data A. The first 250 observations correspond to Class 1, and the other 250 observations correspond to Class 2. Figures 8 and 9 show the fuzzy partition matrices obtained after training the KFCM and GK algorithms, respectively. These results indicate that, in general, the membership grades of the observations in each class have close values, which will create confusion in the classification process.

Table 2 Faults considered for the TE process (See Fig. 6)
Fig. 7 Case Study 2: dataset preparation

Table 3 Parameter values in the GK and KFCM algorithms for the case study 2

Now, the KFCM (B) and GK (E) algorithms are trained, and their fuzzy partition matrices (\(B^+,\,E^+\)) and complements (\(B^{-}\), \(E^{-}\)) are obtained, which allows the following rule base to be built:

(21)

\(U_1\), \(U_2\), \(U_3\) and \(U_4\) are the fuzzy partition matrices obtained as outputs of the rules \(R_1\), \(R_2\), \(R_3\) and \(R_4\), respectively.

Table 4 Number of observations used in each class by data set
Table 5 Confusion matrices for the experimental data set (Data A)
Table 6 Confusion matrices for the experimental data set (Data B)
Table 7 Confusion matrices for the experimental data set (Data C)
Table 8 Confusion matrices for the experimental data set (Data D)
Fig. 8 Example of the fuzzy partition matrix for the KFCM algorithm

Fig. 9 Example of the fuzzy partition matrix for the GK algorithm

This rule base allows a better clustering, which corresponds to a better fuzzy partition matrix \(U_{better}\). In this case, \(U_{better}\) is obtained for the classes \(C_1\) and \(C_2\) as follows (a runnable version is sketched after the loop):

for i = 1 to number of observations of \(C_1\) do

$$\begin{aligned} U_{C_1}(1,i)&= U_1(1,i)\cap U_2(1,i)\cap U_3(1,i)\cap U_4(1,i)\\ U_{C_2}(1,i)&= U_1(1,i)\cup U_2(1,i)\cup U_3(1,i)\cup U_4(1,i) \end{aligned}$$

end for
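A runnable version of this loop (with min as intersection and max as union, which is an assumption about the operators) might look as follows; the partition matrices are random placeholders standing in for \(U_1,\ldots ,U_4\) of Eq. (21):

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder partition matrices standing in for U1..U4 from Eq. (21),
# each of shape (2 classes, N observations).
U1, U2, U3, U4 = (rng.random((2, 100)) for _ in range(4))

U_stack = np.stack([U1, U2, U3, U4])         # (4, 2, N)
U_better = np.empty_like(U1)
U_better[0] = U_stack[:, 0, :].min(axis=0)   # class C1: fuzzy intersection (min)
U_better[1] = U_stack[:, 1, :].max(axis=0)   # class C2: fuzzy union (max)
```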

Fig. 10 Example of the result obtained with \(U_{better}\)

The aim is to maximize the membership of the observations in their own class while minimizing their membership in the other class. Figure 10 shows the result, which explains the robustness with respect to noise.

Subsequently, the results obtained in the TS are used in the RS, which enhances the classification performance and the robustness of the condition monitoring system with respect to noise and external disturbances.

4 Results and Discussion

Next, the results obtained with the proposed condition monitoring scheme in the case studies are presented.

4.1 Results of the Case Study 1

Table 4 shows the number of observations used from Class 1 (\(NOC_1\)) and Class 2 (\(NOC_2\)) of each data set. Tables 5, 6, 7 and 8 show the confusion matrices obtained after applying the GK and KFCM algorithms individually in their standard versions and the methodology proposed in this paper, which combines both algorithms with the Pythagorean fuzzy sets (PyFS). The main diagonal shows the number of observations successfully classified (NOSC), and the column TA represents the accuracy of the classification process (NOSC/\(NOC_i\) for \(i=1,2\)). The average values (AVE) of TA are shown in the last row.

In Fig. 11, the classification results for the KFCM, GK and PyFS algorithms are shown as the correct classification percentage obtained by each algorithm.

4.2 Results of the Case Study 2

Table 9 shows the confusion matrices for the TE process, where F1: Fault 1, F2: Fault 2, F6: Fault 6 and F7: Fault 7. The results obtained confirm that the proposal put forward in this paper led to the best performance.

Figure 12 shows the classification results for the KFCM and GK algorithms, as well as those obtained with the proposed approach based on PyFS, for the TE process.

5 Comparison with Other Successful Algorithms

In this section, a comparison in terms of performance and execution time with recently presented fuzzy clustering algorithms that have achieved successful results in classification is first made. Subsequently, a comparison table is presented with the results of recent condition monitoring methods on the TE process based on other types of computational tools.

5.1 GAKFCM Algorithm

The GAKFCM algorithm is presented in [37]. An adaptive genetic algorithm is first used to improve the initial cluster centers, and then the KFCM algorithm performs the classification. Table 10 shows the values of the parameters used in this paper for this classifier, which were obtained from [37].

Fig. 11 Classification for experimental data sets (Case Study 1)

5.2 EWFCM and KEWFCM Algorithms

The maximum-entropy-regularized weighted fuzzy c-means (EWFCM) algorithm and its kernel version (KEWFCM) are presented in [38] for extracting the important features of a data set and improving the clustering. To this end, the dispersion within clusters is minimized while the entropy of the attribute weights is simultaneously maximized. The kernel version (KEWFCM), which uses the Gaussian kernel, was developed to improve the clustering of data with non-spherically shaped clusters.

Table 11 shows the values utilized for the parameters of the EWFCM and KEWFCM algorithms which were obtained from [38].

Table 9 Confusion matrices for the TE process (F1: 960, F2: 960, F6: 960, F7: 960)
Fig. 12 Classification for the TE process (Case Study 2)

5.3 Results of the Comparison

To establish the comparison, the Tennessee Eastman (TE) process was used. Table 12 shows the results obtained with the GAKFCM, EWFCM and KEWFCM algorithms.

To establish whether there exist significant differences among the obtained results, statistical tests should be applied [39].

5.3.1 Statistical Tests

First, the Friedman test is used to determine whether there is at least one classifier whose results differ significantly from the rest. Rejection of the null hypothesis of Friedman's test leads to pairwise comparisons of the algorithms using Wilcoxon's test in order to determine the best algorithm(s).

Table 10 Parameters of the GAKFCM algorithm
Table 11 Parameters in the EWFCM and KEWFCM algorithms

Friedman Test

Six algorithms (\(k = 6\)) were analyzed on \(N=10\) data sets. The Friedman statistic obtained was \(F_{F} = 340\). The critical value of the F distribution with \((k-1)\) and \((k-1)(N-1)\) degrees of freedom, F(5,45), at a significance level \(\alpha =0.05\) is 2.449, so the null hypothesis is rejected (\(F(5,45) < F_{F}\)). This result indicates that there is at least one classifier whose average performance differs significantly from the rest.
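For reference, such a statistic can be computed from a matrix of per-data-set accuracies. The sketch below uses hypothetical data, scipy's Friedman chi-square, and the Iman-Davenport correction, which is assumed to be the \(F_{F}\) statistic reported:

```python
import numpy as np
from scipy import stats

# Hypothetical accuracy matrix: N = 10 data sets (rows) x k = 6 algorithms (columns).
acc = np.random.default_rng(1).random((10, 6))

chi2, p = stats.friedmanchisquare(*acc.T)       # Friedman chi-square over columns
N, k = acc.shape
F_F = (N - 1) * chi2 / (N * (k - 1) - chi2)     # Iman-Davenport F statistic
# Reject the null hypothesis when F_F exceeds the critical value F(k-1, (k-1)(N-1)).
```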

Wilcoxon Test

Given the result of the Friedman test, the six algorithms were compared pairwise using the Wilcoxon statistical test. Table 13 displays the comparison results. Each algorithm is identified as follows: F: GK, G: KFCM, H: GAKFCM, I: EWFCM, J: KEWFCM and K: PyFS. The positive ranks (\(R^{+}\)) and negative ranks (\(R^{-}\)) of each comparison are shown in the first and second rows, respectively. The statistical value of T and the critical value of T at a significance level \(\alpha =0.05\) are shown in the next two rows. Finally, the last row shows the winning algorithm of each comparison. Table 14 summarizes the number of wins of each algorithm.
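The pairwise comparisons can be reproduced with scipy's Wilcoxon signed-rank test; the accuracies below are hypothetical placeholders:

```python
from itertools import combinations
import numpy as np
from scipy import stats

acc = np.random.default_rng(1).random((10, 6))  # hypothetical per-data-set accuracies
names = ['F', 'G', 'H', 'I', 'J', 'K']          # F: GK, ..., K: PyFS (as in Table 13)
for (ia, a), (ib, b) in combinations(enumerate(names), 2):
    T, p = stats.wilcoxon(acc[:, ia], acc[:, ib])   # paired signed-rank test
    print(f'{a} vs {b}: T = {T:.1f}, p = {p:.3f}')
```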

Table 12 Confusion matrices for the TE process
Table 13 Wilcoxon test results
Table 14 Result of the performance comparison of the algorithms
Table 15 Average computational time for the different algorithms used
Table 16 Results of the comparison

5.4 Analysis of the Execution Time

The experiments were performed on a computer with the following characteristics: Intel Core i7-6500U, 2.5-3.1 GHz, 8 GB DDR3L memory. Table 15 shows a comparison of the average execution time of each algorithm.

The Tennessee Eastman process has a high time constant, which is a characteristic of chemical processes. In general, the execution times shown in Table 15 are very small compared with the time constant of the TE process, which confirms the feasibility of applying these algorithms in the condition monitoring scheme.

5.5 Comparison with Recent Condition Monitoring Methods

In [40], the support vector machine (SVM), convolutional neural network (CNN), generative adversarial network CNN (GAN-CNN), auxiliary classifier GAN (ACGAN), and auxiliary classifier GAN combined with Wasserstein distance, gradient penalty and Bayesian optimization (HGAN) algorithms are compared using the TE process. The goal of that paper is to analyze the performance of these algorithms when they are trained with small and imbalanced training data.

To make a fair comparison, the experiments have to be performed under similar conditions. Accordingly, data sets with the same characteristics as the experiments in [40] (100 samples and an imbalance ratio of normal samples to fault samples of 5:1) were used to train the KFCM and GK algorithms employed in this paper as an example of applying the proposed condition monitoring scheme.

The obtained results are shown in Table 16.

This comparison shows that the proposed method achieves better results for most of the analyzed faults. The Wilcoxon test was applied to compare the performances of the HGAN algorithm and the proposal of this paper. The result showed that there are no significant differences between the performances of these algorithms. This experiment demonstrates the capacity of the proposed condition monitoring scheme to achieve a satisfactory performance with small and imbalanced training data.

6 Conclusions

In the present paper, a novel robust condition monitoring approach for industrial plants based on the use of n fuzzy classification algorithms and the PyMGs is proposed.

In the training stage of the proposal, the membership grades and their complements from the n classification algorithms are used to build the rule-based decisions with the aim of achieving an enhanced partition matrix U, which allows a better location of the centers of the different classes. Furthermore, the use of Pythagorean membership degrees yields a larger classification space, which improves the robustness of the system in the presence of external disturbances and noise in the measured process variables. The rule base allows a better clustering of the data because it uses the information of the n fuzzy classification algorithms. Later, in the online recognition stage, the highest membership grade over the set of n algorithms is used for the final decision.

It is worth highlighting the potential of the proposed condition monitoring scheme, since any set of n fuzzy classification algorithms can be used.

Experiments were performed using the GK and KFCM algorithms both individually and combined in the proposed scheme. The results show that the proposed methodology yields the best performance.

To compare the classification performance of the proposal with other computational tools, three recently presented fuzzy clustering algorithms (GAKFCM, EWFCM and KEWFCM) with excellent performance in classification processes were first used. In this case, the classification performance and the execution time were compared. In all cases, the proposal obtained the best results.

Finally, a comparison with other computational tools using small and imbalanced training data was carried out. In this comparison, the proposed condition monitoring scheme showed a satisfactory performance.

For future research, an interesting idea would be to evaluate the proposal in the diagnosis of multiple faults.