Introduction

Over the years, the security and monitoring of industrial processes have become increasingly important to maintain product quality and to guarantee proper functioning. Industrial processes are highly complex and automated, and quality-related monitoring issues are a hot topic, especially for large-scale systems. To increase availability, reliability and safety, the choice of the monitoring algorithm is decisive for efficient process monitoring.

On the one hand, a fault is understood as an incorrect functioning of the actual dynamic system that leads to an unacceptable anomaly in the overall system performance (Frank 1990). On the other hand, Fault Detection (FD) techniques are necessary to monitor the continuity of system operation under normal conditions and to ensure safety. The FD principle helps to limit process disturbances and to keep the system safe and reliable. For this reason, several techniques have been reported (Wang et al. 2018; Isermann 1984; Venkatasubramanian et al. 2003).

The FD literature on detecting any fault that might occur is large and well established (Joe Qin 2003; Jaffel et al. 2013). Over the years, several algorithms have been developed. According to the available knowledge of the system, these methods are usually classified into two main classes (De Angelo et al. 2009): qualitative or quantitative.

In this context, and in order to model and analyze the relationships between variables, multivariate statistical techniques have been developed for process monitoring, such as Principal Component Analysis (PCA) (Neffati et al. 2019; Said et al. 2018), Independent Component Analysis (ICA) (Lee et al. 2006; Kano et al. 2003) and Partial Least Squares (PLS) (Li et al. 2010; Wold 1985). ICA tries to capture non-Gaussian information with a set of independent components. PCA aims to extract linear relations among the considered variables and represents them with orthogonal Principal Components (Peng et al. 2014). PLS extracts process data and quality data together and models the relationship between them.

The PLS method, as a data-driven technique, has shown good performance and has been widely used for modeling, monitoring and diagnosis in analytical, physical and clinical chemistry as well as in industrial processes. PLS, which extracts relationships between two sets of variables (inputs and outputs), builds a linear model based on linear Latent Variables (LVs) (Tang et al. 2017). Unlike PCA, which captures variations in the input data in descending order of variance, the PLS model finds pairs of latent variables in the input and output data such that these transformed variables have the largest covariance.

For process monitoring, several extended PLS methods have also been proposed in the literature. Among the existing work, statistical process monitoring based on PLS has been frequently studied. MacGregor et al. suggested monitoring methods with multiblock PLS models and showed the performance of contribution diagrams used to identify the faulty variables (MacGregor et al. 1994). In Kresta et al. (1991), the authors presented the basic methodology, using PLS models, to detect faults related to output data for continuous processes. Li et al. (2010) specified the geometric properties of the PLS decomposition structure in the input data and then compared different PLS models for process monitoring. Helland et al. (1992) proposed a recursive PLS algorithm to update the PLS model with the latest process data. To solve several of these problems, Zhou et al. (2010) suggested a total PLS model for output-relevant process monitoring.

In practice, most industrial processes are nonlinear, while the application of PLS is limited by its linearity assumption. For nonlinear input and output data, nonlinear PLS (Rosipal 2010), polynomial PLS (Malthouse et al. 1997), neural PLS (Lee et al. 2006) and Kernel PLS (KPLS) (Rosipal and Trejo 2001; Zhang et al. 2015; Zhang and Hu 2011) have been proposed. Kernel methods have been extensively developed in the literature (Wu et al. 2017). Among them, KPLS has become one of the simplest, most elegant and fastest techniques for building soft-measurement models of nonlinear systems relative to other nonlinear approaches. The KPLS technique provides good monitoring performance by finding the LVs that present a nonlinear correlation with the response variables, besides improving model understanding.

KPLS handles the original input data by transforming them non-linearly into a feature space of arbitrary dimensionality via a nonlinear mapping; a linear PLS model is then built in that feature space. The main advantage of the KPLS is that it does not involve any nonlinear optimization when using the kernel function, nor the associated stabilization problem (Kim et al. 2005). Recent papers have used the standard KPLS with commonly used indices, such as the Squared Prediction Error (SPE) and the Hotelling \(T^{2}\) charts (Zhang et al. 2015).

There are still some problems for industrial process monitoring based on the KPLS technique. Since the standard KPLS performs an oblique projection onto the input space, it has limitations in distinguishing quality-related from quality-unrelated faults. As a consequence, the number of latent variables selected for the KPLS may be larger than that of the linear PLS (Kim et al. 2005). Moreover, the computation time may increase with the number of samples selected for the KPLS, because of the storage of the symmetric kernel matrix during the identification phase of a KPLS monitoring model.

The main aim of this manuscript is to exploit the advantages of the KPLS technique by introducing it as part of a new method suggested for nonlinear systems. In this study, we propose a new Reduced KPLS (RKPLS) in which we consider only the set of observations that approximates the retained important components, in order to produce a reduced-size kernel matrix.

In this paper, the optimized statistic RKPLS consists in computing the optimal parameters of the KPLS to further improve the detection phase. For this purpose, a metaheuristic technique is chosen to compute the optimal value, namely the tabu search algorithm (Marappan and Gopalakrishnan 2018). This method is addressed using two objective functions: the reduced size and the reduced false alarm rate.

We propose, in the first place, a reduced method to overcome the FD problems and facilitate this task. The next important point is to follow real and dynamic systems (Seera et al. 2016; Mosallam et al. 2016). Most real industrial processes are dynamic, i.e. time varying, whereas the static RKPLS and the KPLS are based on a time-invariant model built from the training data. For complex and dynamic systems, the fixed KPLS and RKPLS methods give, in many cases, false alarms, which can reduce the reliability of these methods. For this reason, we suggest a new online FD method based on the reduced model. In the literature, several dynamic methods have been proposed to update kernel models (Chen et al. 2017; He et al. 2013). To better monitor a real mode and actual data, a moving window RKPLS (MW-RKPLS) is put forward. The main contribution is to determine the reduced model, which can be updated if new useful data are available. The suggested MW-RKPLS consists in updating the RKPLS model using a moving window. To conclude, the main contributions of this paper lie in:

  • We handle, firstly, the FD problem with a reduced method that consists in selecting the significant components, together with an optimized statistic version.

  • We then use the online MW-RKPLS, which updates the RKPLS model using a moving window if and only if a new normal sample presents useful and important information about the monitored system.

  • We use only a reduced set of information-rich observations, which improves the FD performance in the online version.

  • The suggested approach is evaluated using real datasets.

The statistic RKPLS method is tested on the Continuous Stirred Tank Reactor (CSTR) benchmark process and the Tennessee Eastman Process (TEP). Afterwards, the suggested online MW-RKPLS method is used to monitor the air quality and the TEP. The FD performances of both developed techniques are illustrated in terms of False Alarm Rate (FAR), Good Detection Rate (GDR) and Computation Time (CT).

The paper is organized as follows. In section “Previous work”, an overview of the PLS and KPLS methods is given. Section “Proposed RKPLS for fault detection” presents the proposed RKPLS method. After that, the FD index SPE is presented in section “Fault detection theory”. Thereafter, in section “KPLS and RKPLS based EWMA-SPE chart”, the KPLS based EWMA-SPE chart and the RKPLS based EWMA-SPE chart are presented. Section “Suggested MW-RKPLS monitoring” presents the suggested MW-RKPLS method. The tabu search metaheuristic method is presented in section “Selection of kernel parameter using tabu search algorithm”. Section “Simulation results” shows the fault detection performances using the CSTR process, the air quality and the TEP. Finally, section “Conclusion” concludes the paper.

Previous work

Standard PLS method

Principle

The PLS method, which extracts a set of vectors called latent components from the original input/output data space, builds a linear multivariable regression model. Given the input matrix X\(\in \)\(\mathfrak {R}^{N \times m}\), containing N samples of m process variables, and the output matrix Y\(\in \)\(\mathfrak {R}^{N \times J}\), comprising N observations of J quality variables, we get:

$$\begin{aligned} X= \left[ {\begin{array}{c} x_{1} \\ \vdots \\ x_{N} \\ \end{array} } \right] \in \mathfrak {R}^{N \times m}, Y= \left[ {\begin{array}{c} y_{1} \\ \vdots \\ y_{N} \\ \end{array} } \right] \in \mathfrak {R}^{N \times J}, \end{aligned}$$

The objective of the PLS method is to search for a set of LV components T=\([ t_{1},t_{2}\ldots t_{l}] \) and U=\([ u_{1},u_{2}\ldots u_{l}] \) that represent as much as possible of the variation in the input and output observations (Taouali et al. 2015). The PLS model projects the input matrix and the output matrix onto a low-dimensional space spanned by L LVs. The PLS decomposes the X and Y matrices as follows:

$$\begin{aligned} \left\{ \begin{array}{l} X=TP^{T}+E \\ Y=UQ^{T}+F \end{array} \right. \end{aligned}$$
(1)

where \(\hbox {P}=[ p_{1},p_{2}\ldots p_{l}] \) and \(\hbox {Q}=[ q_{1},q_{2}\ldots q_{l}]\) represent the loadings of X and Y, respectively, and the matrices E and F are the PLS residuals corresponding to the input matrix X and the output matrix Y, respectively. The number of latent factors is determined by cross-validation, which gives the maximum prediction power based on data excluded from the training data (Qin 2012).

Algorithm

The main idea of the PLS algorithm is to extract each pair of corresponding latent variables as a linear combination of the input and output variables (Baffi et al. 1999). In the classical form presented by Wold (1985), the PLS model is computed with the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm. In each iteration, we find weight vectors \(w_{i}\) and \(c_{i}\), which represent the weights of the input and output projections, as follows:

$$\begin{aligned} w_{i}= & {} u_{i}^{T}X /(u_{i}^{T}u_{i}), i=1,2..l, \end{aligned}$$
(2)
$$\begin{aligned} c_{i}= & {} t_{i}^{T}Y /(t_{i}^{T}t_{i}), i=1,2..l, \end{aligned}$$
(3)

In each iteration, a new column is obtained for each of the matrices built from the variables \(t_{i}\), \(u_{i}\), \(p_{i}\) and \(q_{i}\). Throughout the algorithm, the results of the previous iteration are used as inputs for the next iteration, so that the scores are successively extracted from the input/output matrices.

A typical PLS algorithm is divided into four essential steps as follows:

  1. Step 1:

    Mean-center and scale X and Y;

  2. Step 2:

    Using the NIPALS algorithm, compute the following quantities: \(t_{i}\), \(u_{i}\), \(p_{i}\) and \(q_{i}\);

  3. Step 3:

    Deflate X and Y by subtracting the computed latent vectors from them;

  4. Step 4:

    Go to step 2 to compute the next latent vector.

The following flowchart presents the essential steps of the PLS model, as depicted in Fig. 1.

Fig. 1
figure 1

Flowchart of NIPALS algorithm
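To make the NIPALS iteration above concrete, the following minimal Python sketch (using NumPy; the function name, convergence tolerance and iteration cap are our own choices, not part of the original algorithm description) extracts the score vectors \(t_{i}\), \(u_{i}\) and the loadings \(p_{i}\), \(q_{i}\) for a chosen number of latent variables, following Eqs. (1)–(3) and the four steps above.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Minimal NIPALS-style PLS sketch: extracts score vectors and loadings."""
    X = X - X.mean(axis=0)              # Step 1: mean-center (variance scaling omitted for brevity)
    Y = Y - Y.mean(axis=0)
    T, U, P, Q, W = [], [], [], [], []
    for _ in range(n_components):
        u = Y[:, [0]]                   # initialize u with a column of Y
        t_old = None
        for _ in range(max_iter):       # Step 2: inner NIPALS loop
            w = X.T @ u / (u.T @ u)     # input weights, Eq. (2)
            w /= np.linalg.norm(w)
            t = X @ w                   # input score
            c = Y.T @ t / (t.T @ t)     # output weights, Eq. (3)
            u = Y @ c / (c.T @ c)       # output score
            if t_old is not None and np.linalg.norm(t - t_old) < tol:
                break
            t_old = t
        p = X.T @ t / (t.T @ t)         # input loading
        q = Y.T @ u / (u.T @ u)         # output loading
        X = X - t @ p.T                 # Step 3: deflate X and Y
        Y = Y - t @ c.T
        T.append(t); U.append(u); P.append(p); Q.append(q); W.append(w)
    # Step 4 corresponds to looping over the desired number of latent vectors
    return [np.hstack(m) for m in (T, U, P, Q, W)]
```

The returned matrices T, U, P and Q correspond to the decomposition of Eq. (1).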

KPLS method

Principle

Generally, the PLS method is limited by its linearity assumption. In recent years, kernel methods (Rosipal and Trejo 2001; Willis 2010) have received a lot of attention because the kernel trick builds a nonlinear latent variable model at an approximately linear computational cost.

Therefore, linear systems can be handled by the standard PLS theory. Nonlinear data, in contrast, can be mapped into a higher-dimensional space, called the feature space F, where they can be modeled linearly; this is the idea exploited by the KPLS (Jalali-Heravi and Kyani 2007). The latter is formulated in this feature space to extend the linear PLS to its nonlinear kernel form.

Furthermore, the key idea of the KPLS is to map the input process variable data \(x_{i}\), i=1,...,N into a feature space F via a nonlinear transformation \(\Phi \), as illustrated in Eq. (4):

$$\begin{aligned} \Phi :x_{i}\in \mathfrak {R}^{m}\rightarrow \Phi (x_{i})\in F \end{aligned}$$
(4)

In that case, and due to the curse of dimensionality, it is generally impossible to explicitly compute the nonlinear mapping of each unfolded sample from batch processes. To tackle this issue, the Mercer kernel k(., .) is defined as the inner product of two mapped samples (Wang 2012):

$$\begin{aligned} k(x_{i},x_{j})=<\Phi (x_{i}),\Phi (x_{j})>=\Phi (x_{i})\Phi (x_{j})^{T} \end{aligned}$$
(5)

where \(\Phi (x_{i})\)\(\in \mathfrak {R}^{1 \times S}\), i=1,...,N and S is the dimension of the feature space.

The requirement on the kernel function is that it satisfies Mercer's theorem. According to Eq. (5), the Gram matrix K \(\in \mathfrak {R}^{N \times N}\) can be obtained as follows:

$$\begin{aligned} K=\Phi (X)\Phi (X)^{T} \end{aligned}$$
(6)

with \(\Phi (X)=[\varphi (x_{1})^{T},...,\varphi (x_{N})^{T}]\).

Many kernel functions are commonly used; typical choices include:

Polynomial kernel : \(K(X,Y)=<X,Y>^{p}\),

Sigmoid kernel : \(K(X,Y)=tanh(\beta _{0}<X,Y>+\beta _{1})\),

Radial basis kernel: \(K(X,Y)= \exp (-\frac{\parallel X-Y\parallel ^{2}}{2\sigma })\),

where p, \(\beta _{0}\), \(\beta _{1}\), and \(\sigma \) are determined using the cross-validation technique.
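For illustration, the three kernels above can be written as follows in Python; this is only a sketch, and the default hyperparameter values (p, \(\beta _{0}\), \(\beta _{1}\), \(\sigma \)) are placeholders to be tuned, e.g. by cross-validation.

```python
import numpy as np

def polynomial_kernel(X, Y, p=2):
    """Polynomial kernel <X, Y>^p evaluated row-wise."""
    return (X @ Y.T) ** p

def sigmoid_kernel(X, Y, beta0=1.0, beta1=0.0):
    """Sigmoid kernel tanh(beta0 <X, Y> + beta1)."""
    return np.tanh(beta0 * (X @ Y.T) + beta1)

def rbf_kernel(X, Y, sigma=1.0):
    """Radial basis kernel with denominator 2*sigma, as in the expression above."""
    # squared Euclidean distances between every row of X and every row of Y
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma))
```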

Algorithm

In this section, we present the necessary steps of the KPLS method. Kernel algorithms for the PLS were given by Lindgren et al. (1993) for the efficient treatment of matrices with a large number of samples.

First, mean centering in the high-dimensional feature space must be performed. The centered Gram matrix K can then be computed as indicated in Eq. (7):

$$\begin{aligned} K\leftarrow \left( I_{n}-\dfrac{1}{n}1_{n}1^{T}_{n}\right) K\left( I_{n}-\dfrac{1}{n}1_{n}1^{T}_{n}\right) \end{aligned}$$
(7)

where \(1_{n}\) denotes a vector of ones with length N, and \(I_{n}\) is an N-dimensional identity matrix.
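A minimal sketch of the centering operation of Eq. (7), assuming the Gram matrix K has already been computed from one of the kernel functions above:

```python
import numpy as np

def center_gram(K):
    """Center the Gram matrix as in Eq. (7): K <- (I - 1/n 11^T) K (I - 1/n 11^T)."""
    n = K.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n
    return C @ K @ C
```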

Now, considering a modified version of the PLS algorithm, the score vectors T and U are scaled to a unit norm instead of scaling the weight vectors W and C (Rosipal 2010).

Next comes the deflation step. It is based on a rank-one reduction of the K and Y matrices using the newly extracted score vector t (Kim et al. 2005). The K and Y matrices are deflated as:

$$\begin{aligned}&K \leftarrow (I_{n}-tt^{T})K(I_{n}-tt^{T})\nonumber \\&\quad =K-tt^{T}K-Ktt^{T}+tt^{T}Ktt^{T} \end{aligned}$$
(8)
$$\begin{aligned}&Y \leftarrow Y-tt^{T}Y \end{aligned}$$
(9)

where \(I_{n}\) is an N-dimensional identity matrix.

Fig. 2
figure 2

Flowchart of KPLS-based SPE chart

After extracting the desired number of latent variables, the following steps consist in calculating the predicted outputs for the training and testing samples. The corresponding predicted outputs for the training samples can be written as follows:

$$\begin{aligned} \widehat{Y}=KU(T^{T}KU)^{-1}T^{T}Y \end{aligned}$$
(10)

The corresponding prediction outputs on the testing samples can be represented as follows:

$$\begin{aligned} \widehat{Y_{t}}=K_{t}U(T^{T}KU)^{-1}T^{T}Y \end{aligned}$$
(11)

where \(K_{t}\) is the kernel matrix of the test samples.

A typical KPLS algorithm is divided into the following steps:

  1. Step 1:

    Calculate the kernel matrix K and center it;

  2. Step 2:

    Set i=1, \(K_{1}=K\), \(Y_{1}=Y\);

  3. Step 3:

    Randomly initialize \(u_{i}\) as any column of \(Y_{i}\);

  4. Step 4:

    \(t_{i}=K^{T}_{i}u_{i}\), \(t_{i}=t_{i} / \parallel t_{i} \parallel \);

  5. Step 5:

    \(c_{i}=Y^{T}_{i}t_{i}\);

  6. Step 6:

    \(u_{i}=Y_{i}c_{i}\), \(c_{i}=c_{i} / \parallel c_{i} \parallel \);

  7. Step 7:

    If \(t_{i}\) converges, go to Step 8; otherwise, return to Step 4;

  8. Step 8:

    Deflate K and Y;

  9. Step 9:

    Repeat Steps 3 to 8 to extract more latent variables;

  10. Step 10:

    Obtain the cumulative matrices T and U.

The following flowchart presents the essential steps of the KPLS model, as illustrated in Fig. 2.
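Putting the pieces together, an illustrative Python sketch of the KPLS iteration is given below; it follows our reading of the steps above (normalizing u rather than c, which is numerically equivalent for the extracted scores), uses the deflation of Eqs. (8)–(9) and returns the regression part of Eqs. (10)–(11). It is a sketch, not the authors' implementation.

```python
import numpy as np

def kpls_fit(K, Y, n_components, tol=1e-10, max_iter=500):
    """Illustrative KPLS: K is the centered Gram matrix, Y the (centered) output matrix."""
    n = K.shape[0]
    I = np.eye(n)
    K_d, Y_d = K.copy(), Y.astype(float).copy()   # deflated working copies
    T, U = [], []
    for _ in range(n_components):
        u = Y_d[:, [0]]                   # Step 3: initialize u with a column of Y
        t_old = None
        for _ in range(max_iter):
            t = K_d @ u                   # Step 4: score from the kernel matrix
            t = t / np.linalg.norm(t)
            c = Y_d.T @ t                 # Step 5: output weights
            u = Y_d @ c                   # Step 6: output score (normalized here)
            u = u / np.linalg.norm(u)
            if t_old is not None and np.linalg.norm(t - t_old) < tol:
                break                     # Step 7: convergence of t
            t_old = t
        D = I - t @ t.T                   # Step 8: deflation, Eqs. (8)-(9)
        K_d = D @ K_d @ D
        Y_d = Y_d - t @ (t.T @ Y_d)
        T.append(t); U.append(u)          # Steps 9-10: accumulate T and U
    T, U = np.hstack(T), np.hstack(U)
    B = U @ np.linalg.inv(T.T @ K @ U) @ (T.T @ Y)   # coefficient part of Eqs. (10)-(11)
    return T, U, B
```

Predictions then follow Eqs. (10)–(11) as \(\widehat{Y}=KB\) on the training kernel and \(\widehat{Y_{t}}=K_{t}B\) on the test kernel.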

Proposed RKPLS for fault detection

For kernel methods, the training data used for monitoring and modeling must be stored in memory. In particular, monitoring methods based on the KPLS suffer from computational complexity (Jaffel et al. 2017), because the learning time as well as the amount of computer memory increase rapidly with the number of observations. As a result, memory and computation problems arise when the number of observations becomes large, mainly when dynamic processes are being monitored. Although the KPLS method solves the nonlinearity problem, it is computationally limited because the N-dimensional kernel matrix grows with the number of observations (Taouali et al. 2016).

For this reason, a new reduction method, called the RKPLS, is proposed in this section.

RKPLS principle

The main principle of the proposed RKPLS method is to reduce the computation time. In this method, we select a reduced number of observations among the N measured samples of the information matrix. The number of retained observations of the resulting RKPLS model is equal to the number L of latent components. In short, to maintain the detection performance, we gather in a reduced data set the samples that carry the most information. This reduced set is used to generate a reduced KPLS model, which will then be used for monitoring.

The suggested RKPLS method consists in approximating each latent component \(\lbrace w_{j}\rbrace _{j=1..L}\) by the transformed input datum \(\phi (x_{Latent}^{(j)}) \in \phi \lbrace x^{i} \rbrace _{i=1...M} \) that has the highest projection value in the direction of \(w_{j}\) (Taouali et al. 2015).

The projection of vector \(\phi (x_{Latent}^{(j)})\) can be written as:

$$\begin{aligned} \phi (x_{Latent}^{(j)})= \alpha _{j}*k_{j}(x), j=1,2..L \end{aligned}$$
(12)

After that, we project all the vectors of the transformed data \(\phi \lbrace x^{i} \rbrace _{i=1...M}\) onto the latent component \(w_{j}\) and we retain \(x_{Latent}^{(j)} \in \lbrace x^{(i)} \rbrace _{i=1...M} \) that satisfies Eq. (13):

$$\begin{aligned} \left\{ \begin{array}{l} \phi (x_{Latent}^{(j)})_{j} = \max \limits _{i=1,..,M} \phi (x^{i})_{j} \\ and \\ \phi (x_{Latent}^{(j)})_{i \ne j} < \varsigma \end{array} \right. \end{aligned}$$
(13)

where \(\varsigma \) is a given threshold.

Once the reduced data set \(\lbrace x_{Latent}^{(j)}\rbrace _{j=1..L}\) is determined, a reduced data matrix can be defined as:

$$\begin{aligned} X_{r}=[x_{Latent}^{(1)} x_{Latent}^{(2)} ... x_{Latent}^{(L)}]^{T} \end{aligned}$$
(14)

Furthermore, we build a reduced kernel matrix \(K_{r}\) associated with the kernel function k, as indicated in Eq. (15):

$$\begin{aligned} \begin{matrix} K_{r} \end{matrix} = \begin{bmatrix} k(x_{1},x_{1})&\ldots&k(x_{1},x_{L}) \\ \vdots&\ddots&\vdots \\ k(x_{L},x_{1})&\ldots&k(x_{L},x_{L}) \end{bmatrix} \in R^{L \times L} \end{aligned}$$
(15)
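As an illustration of the reduction step, the following Python sketch reflects one possible reading of Eqs. (12)–(15): the dual coefficient matrix A of the latent components is assumed to be available from the KPLS fit (so that the projection of the mapped sample i onto component j is the entry (KA)[i, j]); the function and variable names are ours, not from the paper.

```python
import numpy as np

def select_reduced_set(K, A, threshold):
    """
    Sketch of the RKPLS selection, Eq. (13).
    K : (N, N) kernel matrix of the training data.
    A : (N, L) dual coefficients of the L latent components.
    Returns the indices of the retained observations x_Latent^(j).
    """
    P = K @ A                                   # projections of all mapped samples
    selected = []
    for j in range(A.shape[1]):
        order = np.argsort(-np.abs(P[:, j]))    # candidates by decreasing projection on component j
        for i in order:
            other = np.delete(np.abs(P[i, :]), j)
            if np.all(other < threshold):       # projections on the other components stay below the threshold
                selected.append(int(i))
                break
    return sorted(set(selected))

def reduced_kernel_matrix(X, idx, kernel):
    """Build K_r of Eq. (15) from the retained observations X[idx]."""
    Xr = X[idx]
    Kr = np.array([[kernel(a, b) for b in Xr] for a in Xr])
    return Kr, Xr
```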

RKPLS method

The main algorithmic steps of the suggested RKPLS are shown as follows:

  1. Step 1:

    Acquire an initial standardized block of training data \( \lbrace x_{i}\rbrace _{i=1..N}\) and scale them,

  2. Step 2:

    Construct the kernel matrix K and scale it,

  3. Step 3:

    Project \( \lbrace \phi _{i} \rbrace _{i=1..N}\) onto the latent components \(\lbrace w_{i} \rbrace \) and choose the \(x_{Latent}^{(i)}\) that satisfy Eq. (13),

  4. Step 4:

    Construct the reduced kernel matrix \(K_{r} \in R^{L \times L}\), as in Eq. (15),

  5. Step 5:

    Estimate the reduced KPLS model,

  6. Step 6:

    Determine the control limits of the SPE chart presented in the next section.

To sum up, the fault detection flowchart of the RKPLS is shown in Fig. 3.

Fig. 3
figure 3

Flowchart of proposed RKPLS-based SPE chart

Fault detection theory

FD indices

Statistical process monitoring relies on normal process data to build process models. The FD stage is the first step in process monitoring. The key idea of the kernel methods is to map the measurement space into the feature space F, so that the data in F are distributed linearly. The two statistics SPE and \(T^{2}\) can then be used for fault detection in F.

In general, traditional PLS-based FD methods use the SPE and \(T^{2}\) statistics, which are expressed in terms of Euclidean and Mahalanobis distances, respectively (Choi et al. 2005; Li et al. 2011).

In this paper, the model is used for FD through the SPE detection index, which is presented in the next subsection.

Squared prediction error

The SPE index is defined as the norm of the residual vector in the feature space F (Fezai et al. 2018). The SPE index measures variability that breaks the normal process correlation, which often indicates an abnormal situation (Joe Qin 2003). It is possible to detect new events by computing the SPE or the Q statistic of the residuals for a new observation.

The SPE index is calculated as the squared norm of the residual components as follows:

$$\begin{aligned} SPE=\parallel \tilde{X_{r}} \parallel ^{2}= \parallel X_{r}-\widehat{X_{r}} \parallel ^{2}= \parallel (I-\widehat{W}\widehat{W}^{T})X_{r} \parallel ^{2} \nonumber \\ \end{aligned}$$
(16)

where \(\widehat{X_{r}}\) is the reduced estimate for the RKPLS method and \(\widehat{W}\) is the weight matrix. The control limit of the SPE, which defines the normal region of the residual components, was developed by Jackson and Mudholkar (1979); faults with small to moderate magnitudes can easily exceed this limit. Therefore, the process is considered normal if:

$$\begin{aligned} SPE\leqslant \delta ^{2}_{\alpha } \end{aligned}$$
(17)

where \(\delta ^{2}_{\alpha }\) denotes the upper control limit for the SPE with a significance level \(\alpha \).

The confidence limits \(\delta ^{2}_{\alpha }\) for the SPE with a significance level can be calculated as:

$$\begin{aligned} \delta ^{2}_{\alpha }=g\chi ^{2}_{h,\alpha } \end{aligned}$$
(18)

where the confidence level is \((1-\alpha )\times 100\%\), and g and h are given as:

$$\begin{aligned} g=\dfrac{Variance(SPE)}{2\times mean(SPE)}\hbox { and }h=\dfrac{2\times (mean(SPE))^{2}}{Variance(SPE)}. \end{aligned}$$
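As a sketch, the control limit of Eq. (18) can be estimated from the SPE values computed on fault-free training data, using the \(\chi ^{2}\) quantile (here via SciPy); the function name is ours.

```python
import numpy as np
from scipy.stats import chi2

def spe_control_limit(spe_train, alpha=0.05):
    """Upper control limit delta^2_alpha = g * chi2_{h, alpha}, Eq. (18)."""
    m, v = np.mean(spe_train), np.var(spe_train)
    g = v / (2.0 * m)                 # g = Variance(SPE) / (2 * mean(SPE))
    h = 2.0 * m**2 / v                # h = 2 * mean(SPE)^2 / Variance(SPE)
    return g * chi2.ppf(1.0 - alpha, h)
```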

KPLS and RKPLS based EWMA-SPE chart

In this section, the classical KPLS and the proposed RKPLS methods are applied in the KPLS-based EWMA-SPE technique and in the suggested RKPLS-based EWMA-SPE technique, respectively, to improve the detection phase. The EWMA chart has been widely used to improve the quality of a process when small process shifts are of interest (Abbas et al. 2014; Lu and Tsai 2015). It was first introduced by Roberts (1959).

The single-valued EWMA statistic Z may be calculated using Eq. (19):

$$\begin{aligned} Z_{i}=\lambda \bar{X}_{i}+(1-\lambda )Z_{i-1}, i=1..N \end{aligned}$$
(19)

where \(\lambda \) is the smoothing parameter, chosen such that \(0 < \lambda \leqslant 1 \), and i is the sample number. \(\bar{X}_{i}\) is the average of the \(i^{th}\) sample, and \(Z_{i-1}\) carries the past information; its initial value \(Z_{0}\) is equal to the target mean or the average of the preliminary samples.
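A minimal sketch of the EWMA recursion of Eq. (19), applied in this paper to the SPE sequence to obtain the EWMA-SPE charts; the default smoothing value \(\lambda =0.2\) is an illustrative assumption, not the authors' setting.

```python
import numpy as np

def ewma(x, lam=0.2, z0=None):
    """EWMA recursion Z_i = lam * x_i + (1 - lam) * Z_{i-1}, Eq. (19)."""
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    z[0] = x[0] if z0 is None else lam * x[0] + (1.0 - lam) * z0
    for i in range(1, len(x)):
        z[i] = lam * x[i] + (1.0 - lam) * z[i - 1]
    return z
```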

Suggested MW-RKPLS monitoring

Several techniques, such as the MW-KPLS (Liu et al. 2010; Shinzawa et al. 2006), have been developed to improve the static KPLS method. Moreover, collecting only the most useful observations in a reduced data matrix helps to reduce the FAR. For this reason, we use, as a first step, the RKPLS method, which solves the memory and computation time problems when the number of observations becomes large. For dynamic-process monitoring in particular, a significant complication remains. Hence, the principal objective is to develop the proposed MW-RKPLS method.

MW-RKPLS formulation

The basic idea of the MW-RKPLS technique is a fixed-length window moving along the data in real time. The moving window allows the algorithm to operate in an online mode in a time-varying environment. The general principle of the moving window is to eliminate the oldest sample and then add a newly available one (Jiang and Yan 2013; Jaffel et al. 2016). The main procedure of the suggested FD method is divided into two phases, performed as follows:

Offline RKPLS model identification: In the first place, we focus on determining an RKPLS model containing the most informative samples, which will be used for online monitoring. The reduced reference model is built so that it adequately describes the normal operating condition.

Let us consider data sets X \(\in R^{N' \times m}\) and Y \(\in R^{N' \times J}\), where \(N'\) indicates the size of the moving window. All RKPLS stages, already presented in the previous sections, are executed to identify the reduced model. To update the RKPLS model, we adopt the moving window method. The algorithm of this step is defined in the next section.

Online RKPLS model update by moving window for FD: In the second place, we focus on updating the model and downdating the kernel matrix. The online procedure consists in updating the reduced model if and only if a new normal sample presents useful information about the monitored system. The update strategy consists in adding the next observation to the reduced data set as follows.

The SPE index using the RKPLS, as indicated by Eq. (16), evaluates whether the new observation \(x_{k+1}\) should be used to update the RKPLS model. Once \(x_{k+1}\) is considered a healthy observation, we calculate the projection of \(\phi _{k+1}\) onto the space spanned by \(\lbrace \phi (x^{(j)}_{Latent})\rbrace _{j=1,2...L}\). The projection onto the \(j^{th}\) component, denoted \(\hat{\phi }(x_{k+1})_{j}\), is defined by Eq. (20):

$$\begin{aligned} \hat{\phi }(x_{k+1})_{j}=\langle \phi (x^{(j)}_{Latent}), \phi _{k+1} \rangle ; j=1,...,L \end{aligned}$$
(20)

To assess whether \(\phi _{k+1}\) is well approximated, it is necessary to verify the condition related to Eq. (13), which can be presented as follows:

$$\begin{aligned} \vert \Vert \hat{\phi }(x_{k+1})\Vert - \Vert \phi _{k+1} \Vert \vert < \varsigma \end{aligned}$$
(21)

where \(\varsigma \) is a given threshold.

Thereafter, two cases arise. If the condition is true, the RKPLS model is not updated. Otherwise, if Eq. (21) is not satisfied, the reduced data set \(\lbrace \phi (x^{(j)}_{Latent})\rbrace _{j=1,2...L}\) does not adequately approximate \(\phi _{k+1}\). Consequently, the RKPLS model is updated based on the moving window technique. The main objective is to add \(x_{k+1}\) to the reduced data set, as shown in the following equation:

$$\begin{aligned} \lbrace \phi (x^{(j)}_{Latent})\rbrace _{j=1,2...L+1}=\lbrace \lbrace \phi (x^{(j)}_{Latent})\rbrace _{j=1,2...L}, \phi _{k+1} \rbrace \end{aligned}$$
(22)

The reduced kernel matrix is then expanded with an additional last row and column. Thus, the kernel matrix \(K^{r}_{L+1}\) is given by:

$$\begin{aligned} K^{r}_{L+1}= \left[ {\begin{array}{c c} K^{r}_{L} &{} a \\ \\ a^{T} &{} b \\ \end{array} } \right] \in \mathfrak {R}^{(L+1) \times (L+1)}, \end{aligned}$$

where a is a vector with entries \(a_{i}=k(x^{(i)}_{Latent},x_{k+1})\), i=1,...,L, and \(b=k(x_{k+1}, x_{k+1})\) is a scalar.

The next step consists in downdating the \(K^{r}_{L+1}\) matrix by excluding the influence of the oldest observation of the reduced data set. In this case, it is necessary to remove the first row and the first column of \(K^{r}_{L+1}\). We denote by \(\hat{K}^{r}_{L}\in \mathfrak {R}^{L \times L}\) the resulting matrix, where the reduced data set becomes \(\lbrace \phi (x^{(j)}_{Latent})\rbrace _{j=1,2...,L}=\lbrace \lbrace \phi (x^{(j)}_{Latent})\rbrace _{j=2...L}, \phi _{k+1} \rbrace \). This makes the suggested MW-RKPLS practical for many complex FD problems.
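The update/downdate of the reduced kernel matrix can be sketched as follows; the function and variable names are ours, and the kernel function is assumed to be given (e.g. the RBF kernel with the selected \(\sigma \)).

```python
import numpy as np

def moving_window_update(K_r, X_r, x_new, kernel):
    """
    Sketch of the MW-RKPLS kernel update/downdate.
    K_r : (L, L) reduced kernel matrix, X_r : (L, m) retained observations.
    Returns the updated pair after adding x_new and discarding the oldest sample.
    """
    a = np.array([kernel(x, x_new) for x in X_r])       # new column, a_i = k(., x_new)
    b = kernel(x_new, x_new)                            # new diagonal element
    K_plus = np.block([[K_r, a[:, None]],
                       [a[None, :], np.array([[b]])]])  # expanded (L+1) x (L+1) matrix
    K_down = K_plus[1:, 1:]                             # downdate: drop oldest row and column
    X_down = np.vstack([X_r[1:], x_new])                # the window slides by one sample
    return K_down, X_down
```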

Flowchart of proposed MW-RKPLS

The main algorithmic steps of the proposed MW-RKPLS are shown as follows:

Offline phase

  1. Step 1:

    Select an initial standardized training data set of input/output and set kernel parameter.

  2. Step 2:

    Construct the kernel matrix K and scale it.

  3. Step 3:

    Estimate the initial KPLS model to determine the L latent variables.

  4. Step 4:

    Project \(\lbrace \phi _{i}\rbrace _{i=1,..,n}\) and choose the \( \lbrace \phi (x^{(j)}_{Latent})\rbrace _{j=1,..,L}\) that satisfy Eq. (13).

  5. Step 5:

    Construct the reduced kernel matrix \( K^{r}_{L} \in \mathfrak {R}^{L \times L}\).

  6. Step 6:

    Determine initial RKPLS model.

  7. Step 7:

    Determine initial control limit of the SPE statistic.

Online phase

  1. Step 8:

    Obtain a new testing observation \(x_{k+1}\) and scale it.

  2. Step 9:

    Evaluate the FD index SPE for \(x_{k+1}\). If the control limit is not exceeded, the new observation \(x_{k+1}\) is considered normal, so go to step 10; otherwise, return to step 8.

  3. Step 10:

    If the condition given by Eq. (21) is satisfied, it is not necessary to update the RKPLS model, so return to step 8; otherwise, go to the next step.

  4. Step 11:

    Update the RKPLS model, downdate the kernel matrix \( K^{r}_{L} \in \mathfrak {R}^{L \times L}\) and reject the oldest observation.

  5. Step 12:

    Update the number of LVs and the \(X_{r}\) matrix.

  6. Step 13:

    Update the control limit of the SPE index of the monitoring statistic and the RKPLS model.

  7. Step 14:

    When a new testing observation is available, return to step 8.

To sum up, the FD flowchart of the MW-RKPLS is illustrated in Fig. 4.

Fig. 4
figure 4

Flowchart of suggested MW-RKPLS

Selection of kernel parameter using tabu search algorithm

Selection principle of kernel parameter

In this paper, the kernel methods are central to the suggested approach, which depends on the kernel function and its parameters. In general, for system diagnosis using the KPLS method, the Gaussian RBF kernel is the most common choice of nonlinear function. To obtain an optimal shape, it is necessary to choose the width parameter \(\sigma \). This parameter is the key element of the Gaussian RBF kernel and has a direct influence on the capability of the KPLS method. Thus, \(\sigma \) has an important effect on the partitioning outcome in the feature space F. In this section, an optimization approach is presented to select an optimal Gaussian kernel parameter, which is used in the suggested RKPLS method. A large value of \(\sigma \) results in over-fitting, and a small value of \(\sigma \) results in under-fitting. Furthermore, the optimal kernel parameter is usually the one that yields the best FD performance. For this reason, the choice of the \(\sigma \) parameter needs to be treated according to the given application. The tabu search method is used to optimize the \(\sigma \) parameter for the RKPLS algorithm (Scrich et al. 2004).

Initial solution

In this context, we focus on the use of a meta-heuristic called the tabu search. The general idea is to start from an initial solution and let the tabu search algorithm find an optimal \(\sigma \) value for the RKPLS model. The initial solution is generated at random. It is recommended to constrain the parameter \(\sigma \) to the interval \( \sigma \in \mathopen {[}2^{-6},2^{6}\mathclose {]}\) in order to reduce the search space of the RKPLS model. To improve the FD performance, the next solution is selected among the nearest unvisited neighbor values of the parameter. The process repeats until all the neighbors are visited.
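A simplified tabu search loop over \(\sigma \in [2^{-6},2^{6}]\) could look like the sketch below. The objective function (for instance, the false alarm rate of the RKPLS model built with a candidate \(\sigma \)) is left as a user-supplied callable, and the neighbourhood step and tabu tenure are assumptions rather than the authors' exact settings.

```python
import numpy as np

def tabu_search_sigma(objective, n_iter=50, tabu_tenure=5, step=0.5):
    """Minimize objective(sigma) over sigma in [2**-6, 2**6] with a basic tabu search."""
    lo, hi = 2.0 ** -6, 2.0 ** 6
    current = float(np.random.uniform(lo, hi))       # random initial solution
    best, best_val = current, objective(current)
    tabu = [round(current, 3)]                       # short-term memory of visited values
    for _ in range(n_iter):
        # neighbourhood: nearby values of sigma that are not in the tabu list
        candidates = [min(max(current + d, lo), hi) for d in (-2 * step, -step, step, 2 * step)]
        neighbours = [s for s in candidates if round(s, 3) not in tabu]
        if not neighbours:
            break
        vals = [objective(s) for s in neighbours]
        current = neighbours[int(np.argmin(vals))]   # move to the best non-tabu neighbour
        tabu = (tabu + [round(current, 3)])[-tabu_tenure:]
        if min(vals) < best_val:
            best, best_val = current, min(vals)
    return best, best_val
```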

Simulation results

The performance measures considered to evaluate the suggested method are the FAR, the GDR and the CT. The FAR is the ratio between the number of false alarms and the total number of fault-free samples (Lahdhiri et al. 2017):

$$\begin{aligned} FAR=\dfrac{Violated \, samples}{Faultless \, data}\% \end{aligned}$$
(23)

The GDR is calculated as the ratio between the number of detected faulty samples and the total number of faulty samples:

$$\begin{aligned} GDR=\dfrac{Violated \, samples}{Faulty \, data}\% \end{aligned}$$
(24)
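For clarity, a small Python helper (ours, not from the paper) shows how both rates can be computed from an SPE sequence, its control limit and a boolean fault indicator.

```python
import numpy as np

def far_gdr(spe, limit, fault_mask):
    """False Alarm Rate (Eq. 23) on fault-free samples and Good Detection Rate (Eq. 24) on faulty samples."""
    spe = np.asarray(spe, dtype=float)
    fault_mask = np.asarray(fault_mask, dtype=bool)
    alarms = spe > limit                                            # samples violating the control limit
    far = 100.0 * np.sum(alarms & ~fault_mask) / np.sum(~fault_mask)
    gdr = 100.0 * np.sum(alarms & fault_mask) / np.sum(fault_mask)
    return far, gdr
```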

Static RKPLS method

In this section, we demonstrate the FD performance of the suggested RKPLS method. To evaluate the proposed FD statistic, simulations on the CSTR chemical reactor and on the TEP are presented, and the results are compared with those of the conventional KPLS method proposed in the literature.

Case study on CSTR benchmark process

As a first case study, we consider the non-isothermal CSTR process, which is used to conduct chemical reactions. The dynamic model of the system is described by the following equations:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{dC_{A}}{dt}=\frac{F}{V}(C_{A0}-C_{A})-k_{0} e^{-E/RT}C_{A}\\ \frac{dT_{A}}{dt}=\frac{F}{V}(T_{A0}-T_{A})+ \frac{(-\Delta H)k_{0}}{\rho C_{p}}e^{-E/RT}C_{A}-\frac{q}{V \rho C_{p}}\\ q=\frac{aF^{b+1}}{F_{c}+(\frac{aF^{b}}{2\rho _{c} C_{pc}})}(T-T_{cin}) \end{array} \right. \end{aligned}$$
(25)

where the variables used to construct the data matrix are presented in Table 1.

Table 1 Measurement variables in CSTR benchmark process
Table 2 Summary of good detection rates, false alarm rates, and computation time for CSTR data for case 1: fault in temperature T
Fig. 5
figure 5

Monitoring faults in temperature using KPLS and RKPLS techniques in sample intervals of [250–350]

In this paper, the input matrix X is composed of a cooling water flow rate, a reactant flow rate, temperature, and concentration at the exit of the CSTR. The input vector is expressed by:

$$\begin{aligned} X=[F_{c} \,\, F \,\, C_{A} \,\, T] \end{aligned}$$
(26)

Temperature T and concentration \(C_{A}\) are controlled using proportional integral controllers by manipulating the inlet cooling water flow rate \(F_{c}\) and the feed flow rate F, respectively.

To evaluate the performance of the suggested RKPLS, the number of used observations is equal to 1000. The training data \(X_{training}\) consist of 500 samples, and so do the testing data \(X_{testing}\). Thereafter, the tabu search method is used to compute the optimal value of \(\sigma \). In this case, for the CSTR system, the optimal \(\sigma \) value is equal to 4.5.

In the simulated CSTR process, two fault scenarios representing two different types of faults are generated.

  • Fault 1 is a step bias of the sensor measuring the temperature T of the reactor, of the order of its range of variation. The fault is introduced between samples 250 and 350.

  • Fault 2 is a similar fault on the sensor measuring the concentration \(C_{A}\). The fault is introduced between samples 300 and 400.

Case 1: Fault in temperature T

In the first case study, we introduce a fault in the temperature in the sample interval from 250 to 350. The FD results are provided in Table 2.

Figure 5a shows the FD results of the KPLS-based SPE and KPLS-based EWMA-SPE techniques, while Fig. 5b presents the FD results of the proposed RKPLS-based SPE and RKPLS-based EWMA-SPE techniques.

Fig. 6
figure 6

Monitoring faults in concentration using KPLS and RKPLS techniques in sample intervals of [300–400]

The suggested RKPLS provides a reduced kernel matrix with 10 observations. From Fig. 5 and Table 2, we notice that the proposed RKPLS-based SPE provides better results than the classical KPLS-based SPE technique from the GDR and CT points of view. In addition, the proposed RKPLS-based EWMA-SPE demonstrates a better GDR than the suggested RKPLS-based SPE, and good results, with some false alarms, compared to the conventional KPLS-based EWMA-SPE technique. Overall, the FD results indicate that the suggested RKPLS-based SPE and RKPLS-based EWMA-SPE techniques give a good performance compared to the classical KPLS-based SPE technique.

Table 3 Summary of good detection rates, false alarm rates, and computation time for CSTR data for case 2: fault in concentration \(C_{A}\)

Case 2: Fault in concentration \(C_{A}\)

In this section, we introduce a fault in the concentration in the sample interval [300–400]. The FD results are given in Table 3.

Figure 6a depicts the FD results of the KPLS-based SPE and KPLS-based EWMA-SPE techniques. Furthermore, Fig. 6b displays the FD results of the suggested RKPLS-based SPE and RKPLS-based EWMA-SPE techniques.

In this case, we notice that the proposed RKPLS-based SPE detection strategy shows an improvement in the GDR compared to the KPLS technique. Moreover, we observe in Table 3 that the suggested RKPLS drastically reduces the CT, which is very useful for real-time applications. Figure 6 and Table 3 also highlight that the developed RKPLS-based EWMA-SPE provides better results than the KPLS-based EWMA-SPE. Finally, the FD results given by the CSTR benchmark process, in the two studied cases, show a good performance of the proposed RKPLS compared to the classical KPLS.

Next, to better study the performance of the proposed method, the RKPLS algorithm is illustrated on the TEP.

Case study on TEP

In this part, the effectiveness of the proposed RKPLS for process monitoring is investigated on the TEP. The TEP is a complex, nonlinear and dynamic process (Downs and Vogel 1993). It was developed to provide a realistic industrial process for evaluating control and monitoring approaches, including the KPLS and Fisher discriminant analysis. The five major units of this process are a reactor, a stripper, a condenser, a recycle compressor and a vapor-liquid separator, as depicted in Fig. 7. The TEP produces two products, G and H, from four reactants: A, C, D and E. The reaction scheme is as follows:

$$\begin{aligned} \left\{ \begin{array}{l} A(g)+C(g)+D(g)\rightarrow G(liq);\\ A(g)+C(g)+E(g)\rightarrow H(liq);\\ A(g)+E(g)\rightarrow F(liq);\\ 3D(g)\rightarrow 2F(liq);\\ \end{array} \right. \end{aligned}$$
(27)
Fig. 7
figure 7

Flow diagram of TEP

Table 4 Measurement variables in TEP

For the TEP, modeling, identification and monitoring represent a challenge for the control community and the subject of several studies (Fezai et al. 2018; Jaffel et al. 2016). The \(X_{train}\) matrix contains 22 continuously measured variables among the 41 process variables, while 19 compositional measurements form the quality matrix \(Y_{train}\). All variables used to construct the data matrix are presented in Table 4. For the testing data set, the fault is introduced at observation 224. The TEP data used for training the model were presented in Lahdhiri et al. (2018), where 21 fault types could be introduced, as presented in Table 5.

Figure 8 shows the FD performance of the RKPLS-based SPE and RKPLS-based EWMA-SPE techniques for the IDV 1 fault test data. The proposed RKPLS method requires 0.2366 s of CT, compared to 0.6991 s for the KPLS method. The optimal value of \(\sigma \) for the TEP system, obtained from the solution given by the tabu search method, is equal to 10.

Tables 6 and 7 provide the FAR and GDR for several fault test data sets to validate the suggested method. Undetected faults are marked in the tables by “–”.

Table 5 List of monitoring variables in TEP
Fig. 8
figure 8

Monitoring TEP IDV 1 fault using KPLS and RKPLS techniques

The suggested RKPLS provides a reduced kernel matrix with 135 observations. The RKPLS-based SPE as well as the RKPLS-based EWMA-SPE statistics show better FD than the KPLS-based SPE and KPLS-based EWMA-SPE statistics, as depicted in Fig. 8. The KPLS-based EWMA-SPE method shows a slight FD improvement compared to the KPLS-based SPE. The RKPLS-based SPE and the RKPLS-based EWMA-SPE demonstrate a better FD performance in terms of FAR, as represented in Table 6, and a better GDR, as given in Table 7.

Table 6 False alarm rates (%) for TEP data
Table 7 Good detection rates (%) for TEP data

MW-RKPLS method

In this section, we present the FD performances of the suggested MW-RKPLS. Tables 6 and 7 show that many faults remain undetectable, which illustrates that the KPLS and the RKPLS are not appropriate for monitoring non-stationary processes. We also note that the FARs provided by the KPLS and RKPLS approaches are undesirably high, especially for dynamic and complex systems. This can be explained by the fact that the KPLS and the RKPLS are unable to adequately monitor non-stationary processes since they are based on a fixed model.

For dynamic and real systems, the MW-RKPLS method presents several advantages over the static one. In this part, we consider two real, complex and dynamic systems. In the first place, the study is based on the TEP, which is a highly nonlinear and dynamic process. In the second place, the air quality system, which is a real system, is addressed.

Table 8 Performances of MW-KPLS and MW-RKPLS for online FD

Case study on TEP

The TEP benchmark, consisting of real sensor data, is widely used in FD research. In the static mode, the model cannot correctly track the variations and changes in the complex TEP. This problem makes FD in the process more difficult, and many faults are not well detected by the static method. In the rest of this section, we repeat the simulation with the online MW-RKPLS method.

Figure 9 depicts the FD performance of the MW-KPLS-based SPE and the MW-RKPLS-based SPE for the IDV 1 fault test data. The updated numbers of LVs using the MW-RKPLS and the MW-KPLS are presented in Fig. 10.

Fig. 9
figure 9

Monitoring TEP IDV 1 fault using MW-KPLS and MW-RKPLS techniques

Fig. 10
figure 10

Evolution of number of LVs with MW-KPLS and MW-RKPLS

The detection results, in terms of FAR and GDR, of the two simulated methods (MW-RKPLS and MW-KPLS) using the SPE index in the normal operation condition are provided in Table 8.

According to Table 8, the MW-RKPLS-based SPE presents good performance in terms of FAR, and all faults are detected, in contrast to both static approaches based on the EWMA-SPE.

We notice that the MW-RKPLS algorithm provides a performance comparable to that of the MW-KPLS in terms of FAR. The suggested MW-RKPLS method requires 0.9971 s of CT, compared to 1.304 s for the MW-KPLS method. This highlights that the MW-RKPLS performs better than the MW-KPLS, especially in terms of average CT and computation cost. The online method also performs better than the static one in terms of FAR and FD.

Case study on air quality

To further assess the suggested method, we use the AIRLOR air quality monitoring network, which operates in Lorraine, France. The AIRLOR monitoring network consists of 20 stations placed in rural, urban and peri-urban sites. Six neighboring measurement stations are dedicated to recording pollutants such as sulfur dioxide (\(SO_{2}\)), carbon monoxide (CO), ozone (\(O_{3}\)) and nitrogen oxides (NO and \(NO_{2}\)) (Bell et al. 2004; Harkat et al. 2006).

The principal idea is then to detect faults in the sensors measuring the ozone concentration \(O_{3}\) and the nitrogen oxides NO and \(NO_{2}\). The photochemical pollution phenomenon in fact presents a dynamic nonlinear behavior, which is why we use the proposed MW-RKPLS method in this part. The observation vector X contains 18 monitored variables, named \(\upsilon _{1}\) to \(\upsilon _{18}\), including the ozone concentration \(O_{3}\), the nitrogen oxide and the nitrogen dioxide collected from each station, as depicted in Eq. (28).

$$\begin{aligned} x(k)= \begin{bmatrix} \underbrace{ \upsilon _{1}(k) \upsilon _{2}(k)\upsilon _{3}(k)}_{Station 1}&\ldots&\underbrace{\upsilon _{10}(k) \upsilon _{11}(k)\upsilon _{12}(k)}_{Station 4}&\ldots&\underbrace{\upsilon _{16}(k) \upsilon _{17}(k)\upsilon _{18}(k)}_{Station 6} \end{bmatrix}^{T} \end{aligned}$$
(28)
Fig. 11
figure 11

Monitoring faults in the ozone \(O_{3}\) using MW-KPLS and MW-RKPLS techniques in sample intervals of [400–500]

Case 1: Fault in ozone \(O_{3}\)

To illustrate the effectiveness of the suggested MW-RKPLS method for FD, a bias fault is simulated on the variable \(\upsilon _{10}(k)\) between observations 400 and 500. The magnitude of the fault is equal to 30% of the range of its variation.

Next, Fig. 11 shows the evolution of the SPE index using the MW-KPLS and the MW-RKPLS. The updated numbers of LVs using the MW-RKPLS and the MW-KPLS, for case 1, are presented in Fig. 12.

The compared performances of the suggested MW-RKPLS in terms of FAR, GDR and CT are summarized in Table 9.

Fig. 12
figure 12

Evolution of number of LVs with MW-KPLS and MW-RKPLS, for \(O_{3}\)

The MW-RKPLS approach is compared to the MW-KPLS approach. The results indicate that the MW-RKPLS method performs better than the MW-KPLS method, especially in terms of average CT and FAR, as presented in Table 9 and Fig. 11.

Case 2: Fault in nitrogen oxides \(NO_{2}\)

In this part, to illustrate the effectiveness of the suggested MW-RKPLS method for FD, a bias fault is simulated on the variable \(\upsilon _{15}(k)\) between observations 250 and 350. The magnitude of the fault is equal to 30% of the range of its variation.

The monitoring results of the MW-KPLS and the MW-RKPLS, using the SPE index for the nitrogen oxides \(NO_{2}\) bias fault, are depicted in Fig. 13.

Table 9 Summary of good detection rates, false alarm rates, and computation time for air quality data for case 1: fault in ozone \(O_{3}\) in online mode
Fig. 13
figure 13

Monitoring faults in nitrogen oxides \(NO_{2}\) using MW-KPLS and MW-RKPLS techniques in sample intervals of [250–350]

Fig. 14
figure 14

Evolution of number of LVs with MW-KPLS and MW-RKPLS, for \(NO_{2}\)

From this figure, the injected fault is clearly detected in time. Figure 14 shows the updated numbers of LVs using the MW-RKPLS and the MW-KPLS.

The compared performances of the suggested MW-RKPLS in terms of FAR, GDR and CT are summarized in Table 10.

According to Tables 9 and 10, the FAR and GDR of the proposed method are always the best compared to the MW-KPLS in the two cases. Moreover, we deduce that the suggested MW-RKPLS drastically decreases the CT, which is very useful for real-time applications.

Our proposed reduced method is much less expensive in terms of memory and time than the standard KPLS methods, such as the MW-KPLS and the KPLS, which confirms the efficiency of the suggested MW-RKPLS as well as of the static one.

Table 10 Summary of good detection rates, false alarm rates, and computation time for air quality data for case 2: fault in nitrogen oxides \(NO_{2}\)

Conclusion

In this research, we have presented a new FD method applicable to process monitoring using the KPLS in static and dynamic forms. Furthermore, we have put forward a reduced KPLS method based on the SPE index for nonlinear dynamic process monitoring, in which process data and product quality data are readily modeled using LV methods such as PLS.

The idea of this paper is to provide a reduced data-driven method for FD in an online version. Our main contribution is to use, first, the RKPLS, which solves the problems of CT and of variable storage; in fact, we retain only the information-rich observations. Second, we suggest the MW-RKPLS to update the reduced model and to better monitor real data.

The RKPLS-based SPE and RKPLS-based EWMA-SPE FD performances are assessed and compared to those of the classical KPLS-based SPE. In addition, we have proposed an improved RKPLS method, called the MW-RKPLS, for nonlinear dynamic process monitoring. Firstly, the FD of the CSTR benchmark process and of the highly nonlinear dynamic TEP using the suggested RKPLS-based SPE and RKPLS-based EWMA-SPE has been addressed to evaluate the performance of the developed techniques. Then the MW-RKPLS method has been compared to the MW-KPLS method using the nonlinear TEP and the air quality monitoring network data.

With the RKPLS method, the results have been satisfactory relative to the classical KPLS-based SPE and KPLS-based EWMA-SPE. More precisely, the results have demonstrated the efficiency of the developed technique in terms of false alarm rate, good detection rate and computation time compared with the conventional KPLS fault detection. To address detection for time-varying behavior, the dynamic MW-RKPLS method has been tested on highly dynamic systems, and the results have been satisfactory compared to the static method. Most importantly, the MW-RKPLS has shown better performance than the MW-KPLS, especially in terms of average CT and FAR.

Finally, the performances and good scaling properties of the suggested method have been proved through several experiments. The suggested MW-RKPLS method may be very helpful to design a real time monitoring strategy for fault reconstruction and isolation.