1 Introduction

A main premise of the Industry 4.0 paradigm is to achieve high production levels with low operating expenses, improving the benefit-cost ratio [6, 10]. A major cause of rising operating expenses and falling productivity in industrial plants is the occurrence of faults [3, 17].

Many results on fault diagnosis in industrial systems have been published over the last two decades under two main approaches: model-based and data-based fault diagnosis [14, 15]. Advances in the Internet of Things (IoT) and Big Data technologies have recently drawn more attention to, and produced more results for, the latter approach [2, 11].

Several computational tools have been presented in scientific papers and books to improve the performance of industrial fault diagnosis systems [7, 8]. However, the need for new strategies remains, because the results depend on the type of industrial plant analyzed.

The training stage of a data-based supervised fault diagnosis system is decisive for achieving the best online performance. To obtain better training results, the classes that represent the operation of the industrial plant must be clearly identified [16]. However, this is a complex task because industrial measurements are affected by uncertainties caused by external disturbances and noise [18].

Type-2 fuzzy sets are used to overcome some difficulties of type-1 fuzzy sets in dealing with the uncertainty that characterizes industrial processes due to noise and external disturbances. In type-1 fuzzy sets, the membership degree is a crisp number, whereas in type-2 fuzzy sets, the membership degree is a type-1 fuzzy number. The goal is that higher membership values contribute more than smaller ones when the cluster centers are updated [19, 20]. In this paper, a fault diagnosis methodology based on type-2 fuzzy classification algorithms is presented.

The main contribution of this paper is a condition monitoring scheme that is robust to external disturbances and noise. To this end, a scheme based on type-2 fuzzy sets is presented. To reduce misclassification, a kernel variant of the proposed algorithms is implemented to achieve better differentiation between classes. The proposal exhibits high performance in the presence of noisy observations.

2 Materials and Methods

2.1 Type-2 Fuzzy C-Means Algorithm (T2FCM) and Kernelized T2FCM (KT2FCM)

In T2FCM, the cluster centers are updated using the weighted mean of all observations [19]. The type-2 membership values are obtained as follows:

$$\begin{aligned} a_{ik} = u_{ik} - \frac{1-u_{ik}}{2} \end{aligned}$$
(1)

where \(a_{ik}\) and \(u_{ik}\) are the type-2 and type-1 memberships, respectively. The cluster centers are updated as in traditional FCM, but using the new type-2 fuzzy membership. Although T2FCM has proven effective for spherical data, it fails when the structure of the input patterns is non-spherical. One way to increase the accuracy of T2FCM is to use a kernel function to calculate the distance between the data points and the cluster centers, i.e., to map the data points from the input space to a high-dimensional feature space. This yields better separability among classes and improves the classification results. The KT2FCM algorithm minimizes the following objective function:

$$\begin{aligned} J_{KT2FCM} \,=\, \sum _{i=1}^{l}\sum _{k=1}^{N}a_{ik}^{*m}\left\| \mathbf {\Psi (z_{k})}-\mathbf {\Psi (v_{i})}\right\| ^{2} \end{aligned}$$
(2)

where, \(\left\| \mathbf {\Psi (z_{k})}-\mathbf {\Psi (v_{i})}\right\| ^{2}\) is the square of the distance between \(\mathbf {\Psi (z_{k})}\) and \(\mathbf {\Psi (v_{i})}\). In the feature space, the distance is computed through the kernel in the input space as:

$$\begin{aligned} \left\| \mathbf {\Psi (z_{k})}-\mathbf {\Psi (v_{i})}\right\| ^{2} = & {} \mathbf {K(z_{k},z_{k})}- \mathbf {2K(z_{k},v_{i})}\nonumber \\ {} & {} + \mathbf {K(v_{i},v_{i})} \end{aligned}$$
(3)
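As an illustration only, the type-2 membership of Eq. (1) and the kernel-induced distance of Eq. (3) can be sketched in a few lines of Python. The helper names, the toy vectors, and the linear kernel are assumptions made for this sketch; the linear kernel is chosen only because the result can then be checked against the ordinary squared Euclidean distance.

```python
def type2_membership(u):
    """Eq. (1): type-2 membership a_ik from a type-1 membership u_ik."""
    # Algebraically equal to (3u - 1) / 2, so it is negative for u < 1/3.
    return u - (1.0 - u) / 2.0


def kernel_distance_sq(z, v, kernel):
    """Eq. (3): squared feature-space distance from kernel evaluations only."""
    return kernel(z, z) - 2.0 * kernel(z, v) + kernel(v, v)


def linear_kernel(a, b):
    """Illustrative kernel (an assumption, not the paper's choice)."""
    return sum(x * y for x, y in zip(a, b))


z_k = [1.0, 2.0]  # hypothetical observation
v_i = [4.0, 6.0]  # hypothetical cluster center

# With a linear kernel, Eq. (3) reduces to the squared Euclidean distance.
d2 = kernel_distance_sq(z_k, v_i, linear_kernel)
euclidean_sq = sum((a - b) ** 2 for a, b in zip(z_k, v_i))
```

Passing the kernel as an argument keeps the distance computation independent of the kernel that is eventually selected.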

Many kernel functions can be found in the scientific literature, and the most appropriate one depends on the application [13]. Nonetheless, the most widely used is the Gaussian Kernel Function (GKF).

If the GKF is used, then \(\mathbf {K(z,z) = 1}\) and \(\left\| \mathbf {\Psi (z_{k})}-\mathbf {\Psi (v_{i})}\right\| ^{2} = \mathbf {2\left( 1-K(z_{k},v_{i})\right) }\). So, Eq. (2) can be expressed as:

$$\begin{aligned} J_{KT2FCM} = & {} 2\sum _{i=1}^{l}\sum _{k=1}^{N}a _{ik}^{*m}\left( 1-\mathbf {K(z_{k},v_{i})}\right) \end{aligned}$$
(4)

where,

$$\begin{aligned} \mathbf {K(z_{k},v_{i})} = e^{-\left\| \textbf{z}_{k}-\textbf{v}_{i}\right\| ^{2}/\delta ^{2}} \end{aligned}$$
(5)

where \(\delta \) is the bandwidth, which controls the smoothness of the GKF. Minimizing Eq. (4) yields:

$$\begin{aligned} a _{ik}^{*} = \frac{1}{\sum _{j=1}^{l}\left( \frac{1-\mathbf {K(z_{k},v_{i})}}{1-\mathbf {K(z_{k},v_{j})}}\right) ^{1/\left( m-1\right) }} \end{aligned}$$
(6)
$$\begin{aligned} \textbf{q}_{i} = \frac{\sum _{k=1}^{N}\left( a_{ik}^{*m}\mathbf {K(z_{k},v_{i})z_{k}}\right) }{\sum _{k=1}^{N}a_{ik}^{*m}\mathbf {K(z_{k},v_{i})}} \end{aligned}$$
(7)
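A minimal sketch of one update step built from Eqs. (5) to (7) is given below. It is not the authors' implementation: the function names, the toy data, the bandwidth value, and the small floor that guards against division by zero are all assumptions introduced for illustration.

```python
import math


def gaussian_kernel(z, v, delta):
    """Eq. (5): Gaussian kernel with bandwidth delta."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(z, v))
    return math.exp(-sq_dist / delta ** 2)


def kernel_memberships(z, centers, delta, m=2.0):
    """Eq. (6): memberships of one observation z in every cluster."""
    # 1 - K(z, v) is half the squared feature-space distance under the GKF.
    # The floor is an added safeguard for the case where z equals a center.
    gaps = [max(1.0 - gaussian_kernel(z, v, delta), 1e-12) for v in centers]
    exponent = 1.0 / (m - 1.0)
    return [1.0 / sum((g_i / g_j) ** exponent for g_j in gaps) for g_i in gaps]


def update_center(observations, memberships_i, center, delta, m=2.0):
    """Eq. (7): kernel-weighted mean for one cluster, given its current center."""
    weights = [
        (a ** m) * gaussian_kernel(z, center, delta)
        for z, a in zip(observations, memberships_i)
    ]
    total = sum(weights)
    return [
        sum(w * z[d] for w, z in zip(weights, observations)) / total
        for d in range(len(center))
    ]


# Hypothetical toy data: two observations, two centers, bandwidth 2.0.
data = [[0.0, 0.0], [4.0, 0.0]]
centers = [[0.5, 0.0], [3.5, 0.0]]
u = [kernel_memberships(z, centers, delta=2.0) for z in data]
new_first_center = update_center(data, [row[0] for row in u], centers[0], delta=2.0)
```

In a full training loop these two steps would alternate until the change in the centers falls below a tolerance; that loop is omitted here.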

2.2 Interval Type-2 Fuzzy C-Means Algorithm (IT2FCM) and Kernelized IT2FCM (KIT2FCM)

The parameter m is crucial in fuzzy clustering algorithms because it determines the uncertainty of the partition matrix. However, choosing the value of m in advance is not easy. IT2FCM treats the fuzzification coefficient as an interval [\(m_{1}\),\(m_{2}\)] and minimizes the following objective function [20]:

$$\begin{aligned} J_{IT2FCM} \,=\, \sum _{i=1}^{l}\sum _{k=1}^{N}u _{ik}^{*m}d_{ik}^{2} \end{aligned}$$
(8)

where the parameter m is replaced by \(m_{1}\) and \(m_{2}\), which represent different fuzzy degrees and yield objective functions different from that of FCM. Minimizing the objective function gives the following upper and lower memberships [20]:

$$\begin{aligned} \overline{u_{i}}(k) \,=\, max\left( 1/\sum _{j=1}^{l}(d _{ik}/d _{jk})^{2/(m_{1}-1)}, 1/\sum _{j=1}^{l}(d _{ik}/d _{jk})^{2/(m_{2}-1)} \right) \end{aligned}$$
(9)
$$\begin{aligned} \underline{u_{i}}(k) \,=\, min\left( 1/\sum _{j=1}^{l}(d _{ik}/d _{jk})^{2/(m_{1}-1)}, 1/\sum _{j=1}^{l}(d _{ik}/d _{jk})^{2/(m_{2}-1)} \right) \end{aligned}$$
(10)

where \(d_{ik} = \left\| z_{k} - q_{i}\right\| \) is the distance between the input pattern \(z_{k}\) and the cluster center \(q_{i}\), and \(\overline{u_{i}}(k)\) \((\underline{u_{i}}(k))\) is the upper (lower) membership of \(z_{k}\) in the cluster with center \(q_{i}\).
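Eqs. (9) and (10) can be read as evaluating the ordinary FCM membership twice, once per fuzzifier, and keeping the larger and smaller values. The sketch below assumes strictly positive distances; the function names and the values of \(m_{1}\) and \(m_{2}\) are hypothetical.

```python
def fcm_membership(dists, i, m):
    """Type-1 FCM membership of one observation in cluster i for fuzzifier m."""
    exponent = 2.0 / (m - 1.0)
    return 1.0 / sum((dists[i] / d_j) ** exponent for d_j in dists)


def interval_membership(dists, i, m1, m2):
    """Eqs. (9)-(10): (lower, upper) membership for the interval [m1, m2]."""
    u1 = fcm_membership(dists, i, m1)
    u2 = fcm_membership(dists, i, m2)
    return min(u1, u2), max(u1, u2)


# Hypothetical distances from one observation to two cluster centers.
distances = [1.0, 2.0]
lower, upper = interval_membership(distances, 0, m1=1.5, m2=3.0)
```

For this toy input, the smaller fuzzifier gives the sharper (larger) membership in the nearest cluster, so the interval width reflects how much the result depends on the choice of m.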

Unlike FCM, the output of the IT2FCM algorithm is an interval type-2 fuzzy set, which cannot be converted directly to a crisp set by a defuzzification operation. Type reduction is therefore executed as the first step of output processing, to calculate the centroid of the type-2 fuzzy set and reduce it to a type-1 fuzzy set [9]. The interval-valued cluster centers are calculated as:

$$\begin{aligned} \widetilde{\textbf{q}_{i}} = [\widetilde{q}_{i,1}, \widetilde{q}_{i,2}]= \sum _{u_{i1}}\cdots \sum _{u_{iN}}\frac{1}{\frac{\sum _{k=1}^{N}u_{ik}^{m^{*}}z_{k}}{\sum _{k=1}^{N}u_{ik}^{m^{*}}}} \end{aligned}$$
(11)

Eq. (11) is based on these type-2 memberships: \(m^{*}\) switches between \(m_{1}\) and \(m_{2}\), and \(\widetilde{q}_{i,1}\) and \(\widetilde{q}_{i,2}\) are usually obtained with the Karnik-Mendel algorithm [5]. The kernel version of the IT2FCM algorithm (KIT2FCM) is obtained in the same way as for the T2FCM algorithm: the distance is calculated through the Gaussian Kernel Function (GKF).
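To make the type-reduction step more concrete, the sketch below computes the two end points of an interval centroid for one coordinate. It is not the iterative Karnik-Mendel procedure itself: it uses an exhaustive search over all switch points, which reaches the same end points on small inputs because the extreme weighted means are attained at a switch-point configuration. The function name, the toy values, and the assumption that all lower weights are strictly positive are introduced here for illustration.

```python
def centroid_bounds(x, lower, upper):
    """Left and right end points of the weighted mean of x when each weight
    may take any value between its lower and upper bound."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    lo = [lower[i] for i in order]
    hi = [upper[i] for i in order]

    def weighted_mean(weights):
        return sum(w * v for w, v in zip(weights, xs)) / sum(weights)

    n = len(xs)
    # Left end point: large weights on small x, small weights on large x.
    left = min(weighted_mean(hi[:k] + lo[k:]) for k in range(n + 1))
    # Right end point: small weights on small x, large weights on large x.
    right = max(weighted_mean(lo[:k] + hi[k:]) for k in range(n + 1))
    return left, right


# Hypothetical one-dimensional observations with lower/upper weights
# (for IT2FCM these weights would be memberships raised to m1 or m2).
q_left, q_right = centroid_bounds([0.0, 1.0], [0.5, 0.5], [1.0, 1.0])
```

The exhaustive search costs one weighted mean per candidate switch point, so the iterative algorithm is preferable for large data sets; the sketch only shows what the two end points represent.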

2.3 Proposed Methodology

The proposed classification scheme for Fault Detection and Isolation (FDI) is shown in Fig. 1. It comprises an offline training phase and an online recognition phase. In the first phase, the fuzzy classifier is trained using a training database built from historical process data. In the online phase, the classifier analyzes each observation collected from the process, and the result informs the operator about the state of the system in real time. Training is the most important stage, since it determines the center of each class that represents the operation of the process, either in normal operation or in the presence of faults.

Fig. 1. Classification scheme for detection and isolation of faults.

Offline Training Phase. In this phase, the FDI system is trained with a set of historical data that contains the necessary information about each known operating state or class of the industrial plant (normal operation condition (NOC) and fault states). The main aim of the training process is to determine the centers of the known classes \(\textbf{Q} = \{\textbf{q}_{1},\textbf{q}_{2}, \ldots , \textbf{q}_{c}\}\), which are then used in the online recognition stage.

On-Line Recognition Phase. In this phase, the class of each observation k is determined at each time instant. First, the distance between the observation and the class centers determined in the offline stage is computed. Subsequently, the degree of membership of observation k in each class is obtained. The observation is assigned to the class with the highest degree of membership (see Algorithm 1).

Algorithm 1.
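The three steps of the recognition phase described above (distance to each center, membership in each class, assignment to the class with the highest membership) can be sketched as follows. This is a simplified stand-in, not a reproduction of Algorithm 1: it assumes Euclidean distances and the type-1 FCM membership formula, and the centers, labels, and function name are hypothetical.

```python
import math


def classify(observation, centers, labels, m=2.0):
    """Assign one observation to the class with the highest membership."""
    # Step 1: distance to every class center (floored to avoid division by zero).
    dists = [max(math.dist(observation, q), 1e-12) for q in centers]
    # Step 2: membership of the observation in every class.
    exponent = 2.0 / (m - 1.0)
    memberships = [
        1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists
    ]
    # Step 3: pick the class with the highest membership.
    best = max(range(len(centers)), key=lambda i: memberships[i])
    return labels[best], memberships


# Hypothetical class centers obtained from an offline training phase.
trained_centers = [[0.0, 0.0], [10.0, 10.0]]
class_labels = ["NOC", "F1"]
label, mu = classify([1.0, 1.0], trained_centers, class_labels)
```

For the kernelized variants, the Euclidean distance in step 1 would be replaced by the kernel-induced distance.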

2.4 Case Study: DAMADICS

To validate the proposed methodology, it was applied to the DAMADICS benchmark, which represents an intelligent electro-pneumatic actuator widely used in industry [1]. The diagram of this actuator is shown in Fig. 2. Table 1 and Fig. 3 (with 300 observations per class) show the operation modes evaluated in the actuator and the measured variables used. The selected faults occur in different parts of the actuator and were chosen to test the robustness of the diagnostic system.

Fig. 2. Diagram of benchmark actuator system [1].

Table 1. Operation modes and measured variables in DAMADICS.

Fig. 3. Operation modes.

2.5 Design of Experiments

Table 2 shows the characteristics of the training database used, which is free of outliers, noise, and missing variables. The values of the parameters used for the applied algorithms were: \(\epsilon \) = \(10^{-5}\), m = 2, \(\sigma \) = 50. The parameters were taken from [12].

Table 2. Characteristics of the training database.

K-fold cross-validation with K = 5 was used, with 800 observations for training and 200 for validation. In the online-phase experiments, 2400 observations were used (400 new observations of each operation mode, not used in training). Each experiment was replicated 100 times to ensure repeatability, and the average of the 100 results was taken as the final result. To evaluate the robustness of the proposal, three experiments were carried out:

  1. Observations without noise.
  2. Observations with a 2% noise level.
  3. Observations with a 5% noise level.
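The 5-fold split mentioned above (800 training and 200 validation observations per fold) can be sketched as below. The total of 1000 observations is inferred from those two figures, and the function name is hypothetical; shuffling and class stratification, which would normally precede the split, are left out to keep the sketch short.

```python
def k_fold_indices(n_observations, k=5):
    """Return a list of (training_indices, validation_indices), one per fold."""
    fold_size = n_observations // k
    folds = []
    for f in range(k):
        start = f * fold_size
        # The last fold absorbs any remainder.
        stop = n_observations if f == k - 1 else start + fold_size
        validation = list(range(start, stop))
        training = list(range(0, start)) + list(range(stop, n_observations))
        folds.append((training, validation))
    return folds


folds = k_fold_indices(1000, k=5)
sizes = [(len(train), len(val)) for train, val in folds]
```

Each observation appears in exactly one validation fold, so every observation is used for both training and validation across the five runs.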

3 Discussion of Results

4 Online Recognition Stage

The confusion matrix (CM) was used to evaluate the performance of the proposed FDI system. The values \(CM_{rs}\) for \( r \ne s\) show the number of observations of operation mode r that the classifier misclassifies as operation mode s.

Table 3 shows the confusion matrix (without noise in the measurements), with the results for the operation states Normal Operation Condition (NOC), Fault 1 (F1), Fault 7 (F7), Fault 12 (F12), Fault 15 (F15) and Fault 19 (F19). The main diagonal contains the number of correctly classified observations. The accuracy of the classification process is obtained as TA = correctly classified observations / total observations. The average (AVE) of TA is displayed in the last row.
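As a small illustration of how the confusion matrix and the accuracy TA are computed, the sketch below uses a made-up set of four labels; the data and function names are hypothetical and unrelated to the values in Table 3.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """cm[r][s] counts observations of class r that were assigned to class s."""
    index = {c: i for i, c in enumerate(classes)}
    cm = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        cm[index[t]][index[p]] += 1
    return cm


def per_class_accuracy(cm):
    """Diagonal entry divided by the row total, for each class."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


def total_accuracy(cm):
    """TA = correctly classified observations / total observations."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Hypothetical labels for four observations.
truth = ["NOC", "NOC", "F1", "F1"]
predicted = ["NOC", "F1", "F1", "F1"]
cm = confusion_matrix(truth, predicted, ["NOC", "F1"])
ta = total_accuracy(cm)
```

Off-diagonal entries show which operation modes are confused with each other, which a single accuracy figure does not reveal.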

Figure 4 shows the classification results for the different operation modes (NOC and faults 1, 7, 12, 15, 19) using the T2FCM, IT2FCM, KT2FCM and KIT2FCM algorithms for the DAMADICS process, as the classification percentage obtained for each data set. Figure 5 displays the global classification percentage obtained for each algorithm (without noise, and with 2% and 5% noise levels).

Table 3. Confusion matrix for the DAMADICS process (NOC: 400, F1: 400, F7: 400, F12: 400, F15: 400, F19: 400)

Fig. 4. Classification (\(\%\)) for DAMADICS process.

Fig. 5. Global classification (\(\%\)) obtained for each algorithm.

4.1 Statistical Tests

Since several algorithms are used, statistical tests should be applied to compare their performance [4]. The Friedman test can be used to establish whether the differences among the obtained performances are significant. If significant differences are found, pairwise comparisons should be performed to find the best classifier. For this purpose, the Wilcoxon test was used.

Friedman Test. Applying the test for \(k = 4\) algorithms and \(N = 10\) datasets, the value obtained for the Friedman statistic is \(F_{F}\) = 241. \(F_{F}\) is distributed according to the F distribution with \(k-1=3\) and \((k-1)\times (N-1)=27\) degrees of freedom. From the F distribution table, the critical value F(3,27) for \(\alpha =0.05\) is 2.9604. Since \(F_{F}\) > F(3,27), the null hypothesis that all algorithms perform equally is rejected. This means that there are significant differences among the obtained performances.
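For readers who want to reproduce this kind of test, the sketch below computes an F-distributed Friedman statistic from a table of scores. It assumes that \(F_{F}\) is the commonly used form derived from the Friedman chi-square, \(F_{F} = (N-1)\chi ^{2}_{F} / (N(k-1)-\chi ^{2}_{F})\), with rank 1 given to the best score in each dataset; the score table and function names are hypothetical and are not the paper's data.

```python
def average_ranks(row):
    """Rank the scores of one dataset (rank 1 = highest score, ties averaged)."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for j in order[pos:end + 1]:
            ranks[j] = shared
        pos = end + 1
    return ranks


def friedman_statistic(scores):
    """Return (F_F, (dof1, dof2)) for an N x k table of scores."""
    n, k = len(scores), len(scores[0])
    rank_rows = [average_ranks(row) for row in scores]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12.0 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4.0
    )
    denom = n * (k - 1) - chi2
    # denom is zero when every dataset ranks the algorithms identically.
    f_f = float("inf") if denom <= 0 else (n - 1) * chi2 / denom
    return f_f, (k - 1, (k - 1) * (n - 1))


# Hypothetical accuracies: 4 datasets (rows) x 3 algorithms (columns).
toy_scores = [
    [0.9, 0.8, 0.7],
    [0.9, 0.8, 0.7],
    [0.9, 0.7, 0.8],
    [0.8, 0.9, 0.7],
]
f_f, dof = friedman_statistic(toy_scores)
```

The resulting value would then be compared with the critical value of the F distribution for the returned degrees of freedom, as done in the text for F(3,27).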

Wilcoxon Test. Table 4 presents the results of applying the Wilcoxon test (A1: T2FCM, A2: IT2FCM, A3: KT2FCM, A4: KIT2FCM). The first row displays the sum of positive ranks \(R^{+}\), and the second row displays the sum of negative ranks \(R^{-}\) obtained from each comparison. The values of the T statistic and its critical values for a significance level \(\alpha =0.05\) are shown below them. Finally, the winning algorithm in each comparison is shown. Table 5 shows that KT2FCM and KIT2FCM obtain the best results.

Table 4. Results of the Wilcoxon test
Table 5. Algorithm comparison summary
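The quantities \(R^{+}\), \(R^{-}\) and T reported for the Wilcoxon test can be computed as in the following sketch. It assumes the classic convention of discarding zero differences and averaging the ranks of tied absolute differences, which may differ from the convention used for Table 4; the scores and the function name are hypothetical.

```python
def wilcoxon_signed_ranks(scores_a, scores_b):
    """Return (R_plus, R_minus, T) for paired scores of two algorithms."""
    # Zero differences are discarded (an assumption of this sketch).
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        shared = (pos + end) / 2.0 + 1.0  # average rank for tied values
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    r_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return r_plus, r_minus, min(r_plus, r_minus)


# Hypothetical paired scores of two algorithms on four datasets.
r_plus, r_minus, t_stat = wilcoxon_signed_ranks([10, 20, 30, 40], [9, 22, 27, 36])
```

The null hypothesis of equal performance is rejected when T is less than or equal to the critical value for the chosen significance level and number of datasets.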

5 Conclusions

This paper presented the design of a robust fault diagnosis system based on type-2 fuzzy classification algorithms. The main contribution of the proposal was the application of the theory of type-2 fuzzy sets to overcome the effect of the uncertainties that characterize industrial processes due to noisy observations and external disturbances.

The experiments demonstrated the capacity of kernel functions to discriminate better among the operation modes, reducing misclassification. The proposed FDI scheme was successfully validated using the DAMADICS process benchmark.