1 Introduction

Since the introduction of Support Vector Machines (SVM) [19], many learning algorithms have been transferred to a kernel representation [2, 7]. The important benefit lies in the fact that nonlinearities can be handled while avoiding the solution of a nonlinear optimization problem. The transfer is implicitly accomplished by means of a nonlinear map into a Reproducing Kernel Hilbert Space \( F_{k} \) (RKHS) [9].

Kernel methods have been successfully applied to a large class of problems, such as nonlinear system identification [2, 3, 16, 17], diagnostic systems [24], time series prediction [13], face recognition [26] and biological data processing for medical diagnosis [23]. The attractiveness of such algorithms stems from their elegant treatment of data generated by nonlinear processes. However, these techniques suffer from computational complexity, as the required computer memory and the training time increase rapidly with the number of observations. It is clear that for large datasets (as, for example, in image processing, computer vision or object recognition), the kernel method, despite its powerful ability to deal with nonlinearities, is computationally limited. For large datasets, an eigendecomposition of the Gram matrix can simply become too time-consuming to extract the principal components, and the identification of the system parameters therefore becomes a tough task. To overcome this burden, a theoretical foundation for online learning algorithms with kernel methods in reproducing kernel Hilbert spaces was recently proposed [4, 5, 14, 15, 18, 22, 25]. Online kernel algorithms are also more useful when the system to be identified is time-varying, because they can automatically track changes of the system model with time-varying and time-lagging characteristics.

In this paper, we propose a new method for the online identification of the parameters of a nonlinear system modeled in a Reproducing Kernel Hilbert Space (RKHS). This method uses the Reduced Kernel Principal Component Analysis (RKPCA), which selects the observation data that approximate the principal components retained by the Kernel Principal Component Analysis (KPCA) method [2]. The selected observations are used to build an RKHS model with a reduced number of parameters. The proposed online identification method updates the list of the retained principal components, and then the RKHS model, by evaluating the error between the model output and the process output. The proposed technique may be very helpful to design an adaptive control strategy for nonlinear systems.

The paper is organized as follows. In Sect. 2, we review Reproducing Kernel Hilbert Spaces (RKHS). Section 3 is devoted to modeling in RKHS. The Reduced Kernel Principal Component Analysis (RKPCA) method is presented in Sect. 4. In Sect. 5, we propose the new online RKPCA method, and the corresponding algorithm is summarized in Sect. 6. In Sect. 7, the proposed algorithm is tested on the Wiener-Hammerstein benchmark [21] and on a chemical reactor [8].

2 Reproducing kernel Hilbert space

Let \( E \subset {\mathbb R}^{d} \) be an input space and L 2(E) the Hilbert space of square integrable functions defined on E. Let \( k:E \times E \to \mathbb{R} \) be a continuous positive definite kernel. It is proved [6, 9] that there exist a sequence of orthonormal eigenfunctions (ψ 1, ψ 2, …, ψ l ) in L 2(E) and a sequence of corresponding real positive eigenvalues (σ 1, σ 2, …, σ l ) (where l can be infinite) such that

$$ k(x,t) = \sum\limits_{j = 1}^{l} {\sigma_{j} \psi_{j} (x)\psi_{j} (t)} ;\quad x,t \in E. $$
(1)

Let \( F_{k} \subset L^{2} (E) \) be a Hilbert space associated to the kernel k and defined by:

$$ F_{k} = \left\{ {f \in L^{2} (E)/f = \sum\limits_{i = 1}^{l} {w_{i} \varphi_{i} } {\text{ and }}\sum\limits_{j = 1}^{l} {{\frac{{w_{j}^{2} }}{{\sigma_{j} }}}} < +\infty } \right\} $$
(2)

where \( \varphi_{i} = \sqrt {\sigma_{i} } \psi_{i} \) for i = 1, …, l. The scalar product in the space F k is given by:

$$ \left\langle {f,g} \right\rangle_{{F_{k} }} = \left\langle {\sum\limits_{i = 1}^{l} {w_{i} \varphi_{i} } ,\sum\limits_{j = 1}^{l} {z_{j} \varphi_{j} } } \right\rangle_{{F_{k} }} = \sum\limits_{i = 1}^{l} {w_{i} z_{i} } $$
(3)

The kernel k is said to be a reproducing kernel of the Hilbert space F k if and only if the following conditions are satisfied.

$$ \left\{ {\begin{array}{*{20}c} {\forall x \in E,\quad k\left( {x, \cdot } \right) \in F_{k} } \hfill \\ {\forall x \in E \quad {\text{ and }} \quad \forall f \in F_{k} ,\left\langle { \, f( \cdot ),k(x, \cdot )} \right\rangle_{{{F_k} }} = f(x)} \hfill \\ \end{array} } \right. $$
(4)

where k(x,·) denotes the function \( x^{\prime } \mapsto k(x,x^{\prime } ) \) for \( x^{\prime } \in E \). F k is called a reproducing kernel Hilbert space (RKHS) with kernel k and dimension l. Moreover, every RKHS has a unique positive definite reproducing kernel and vice versa [10].

Among the possible reproducing kernels, we mention the Radial Basis Function (RBF) kernel defined as:

$$ k(x,t) = \exp \left( { - \left\| {x - t} \right\|^{2} /2\mu^{2} } \right) ;\quad \quad \forall x,t \in E $$
(5)

with μ a fixed parameter.
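As an illustration, the RBF kernel (5) and the Gram matrix used later in (11) can be computed as in the following sketch (Python/NumPy); the function names and the default value of μ are illustrative assumptions, not part of the original method.

```python
import numpy as np

def rbf_kernel(x, t, mu=1.0):
    """RBF kernel k(x, t) = exp(-||x - t||^2 / (2 mu^2)), as in (5)."""
    x, t = np.asarray(x, dtype=float), np.asarray(t, dtype=float)
    return float(np.exp(-np.sum((x - t) ** 2) / (2.0 * mu ** 2)))

def gram_matrix(X, mu=1.0):
    """Gram matrix K[i, j] = k(x^(i), x^(j)) for the rows of X (M x d), as in (11)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * mu ** 2))
```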

3 RKHS models

Consider a set of observations \( \{ x^{(i)} ,y^{(i)} \}_{i = 1, \ldots ,M} \), where \( x^{(i)} \in {\mathbb R}^{n} \) and \( y^{(i)} \in \mathbb{R} \) are the system input and output, respectively. According to the statistical learning theory (SLT) [19, 20], the identification problem in the RKHS F k can be formulated as the minimization of the regularized empirical risk. Thus, it consists in finding the function \( f^{*} \in F_{k} \) such that

$$ f^{*} = \sum\limits_{j = 1}^{l} {w_{j}^{*} \varphi_{j} } = \mathop {\arg \min }\limits_{{f \in F_{k} }} \frac{1}{M}\sum\limits_{i = 1}^{M} {\left( {y^{(i)} - f\left( {x^{(i)} } \right)} \right)^{2} } + \lambda \left\| f \right\|_{{F_{k} }}^{2} $$
(6)

where M is the number of measurements and λ is a regularization parameter chosen to ensure the generalization ability of the solution f *. According to the representer theorem [9], the solution f * of the optimization problem (6) is a linear combination of the kernel k evaluated at the M measurements x (i), i = 1, …, M:

$$ f^{*} (x) = \sum\limits_{i = 1}^{M} {a_{i}^{*} k\left( {x^{(i)} ,x} \right)} .$$
(7)

To solve the optimization problem (6), we can use kernel methods such as the Support Vector Machine (SVM) [11], the Least Squares Support Vector Machine (LSSVM) [7], Regularization Networks (RN) [3] and Kernel Partial Least Squares (KPLS) [12]. In [2], the Kernel Principal Component Analysis (KPCA) method was proposed. This method reconsiders the regularization idea by finding the solution to the identification problem in a subspace F kpca spanned by the so-called principal components, and yields an RKHS model with M parameters.

In the next section, we present the Reduced KPCA method, in which we approximate the principal components retained by the KPCA with a set of vectors of input observations. This approximation is performed with a set of particular training observations and allows the construction of an RKHS model with far fewer parameters.

4 RKPCA method

Consider a nonlinear system with input \( u \in \mathbb{R} \) and output \( y \in \mathbb{R} \), from which we extract a set of observations \( \{ u^{(i)} ,y^{(i)} \}_{i = 1, \ldots ,M} \). Let F k be an RKHS with kernel k. To build the input vector x (i) of the RKHS model, we use the NARX (Nonlinear AutoRegressive with eXogenous input) structure:

$$ x^{(i)} = \left\{ {u^{(i)} , \ldots ,u^{{(i - m_{u} )}} , \, y^{(i - 1)} , \ldots , \, y^{{(i - m_{y} )}} } \right\}^{T} ;\quad m_{u} ,m_{y} \in \mathbb{N} $$
(8)

The set of observations becomes \( D = \{ x^{(i)} ,y^{(i)} \}_{i = 1, \ldots ,M} \) where \( x^{(i)} \in {\mathbb R}^{{m_{u} + m_{y} + 1}} \) and \( y^{(i)} \in \mathbb{R} \) and the RKHS model of this system based on (7) can be written as:

$$ \tilde{y}^{(j)} = \sum\limits_{i = 1}^{M} {a_{i} k(x^{(i)} ,x^{(j)} )} $$
(9)
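For concreteness, the regressors (8) and the observation set D can be assembled as in the following sketch (Python/NumPy); the function name and the handling of the first samples, for which the regressor is incomplete, are assumptions made for illustration.

```python
import numpy as np

def build_narx_data(u, y, m_u, m_y):
    """Build the NARX regressors x^(i) = [u(i), ..., u(i - m_u), y(i-1), ..., y(i - m_y)]^T
    of (8) and the corresponding targets y^(i), skipping the first samples
    for which the regressor is incomplete."""
    u, y = np.asarray(u, dtype=float), np.asarray(y, dtype=float)
    start = max(m_u, m_y)
    X, Y = [], []
    for i in range(start, len(u)):
        past_u = u[i - m_u:i + 1][::-1]   # u(i), u(i-1), ..., u(i - m_u)
        past_y = y[i - m_y:i][::-1]       # y(i-1), ..., y(i - m_y)
        X.append(np.concatenate([past_u, past_y]))
        Y.append(y[i])
    return np.array(X), np.array(Y)       # shapes (M, m_u + m_y + 1) and (M,)
```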

Let the application Φ:

$$ \begin{gathered} \Upphi :E \to {\mathbb R}^{l} \hfill \\ \, x \,\mapsto\, \Upphi (x) = \left( {\begin{array}{*{20}c} {\varphi_{1} (x)} \\ \vdots \\ {\varphi_{l} (x)} \\ \end{array} } \right) \hfill \\ \end{gathered} $$
(10)

where φ i are given in (2).

The Gram matrix K associated with the kernel k is an M-dimensional square matrix so that

$$ K_{i,j} = k(x^{(i)} ,x^{(j)} )\quad \quad {\text{for}}\quad i,j = 1, \ldots ,M $$
(11)

The kernel trick [10] is so that

$$ \left\langle {\Upphi (x),\Upphi (x^{\prime } )} \right\rangle = k(x,x^{\prime } )\quad x,\quad x^{\prime } \in E $$
(12)

We assume that the transformed data \( \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \in {\mathbb R}^{l} \) are centered [2]. The empirical covariance matrix of the transformed data is symmetric and l-dimensional. It is written as follows:

$$ C_{\phi } = \frac{1}{M}\sum\limits_{i = 1}^{M} {\Upphi (x^{(i)} )\Upphi (x^{(i)} )^{T} } , \quad C_{\phi } \in {\mathbb R}^{l \times l} $$
(13)

Let \( l^{\prime } \) be the number of eigenvectors \( \{ V_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) of the matrix C ϕ that correspond to the nonzero positive eigenvalues \( \{ \lambda_{j} \}_{{j = 1, \ldots , \, l^{\prime } }} \). It is proved in [2] that \( l^{\prime } \) is less than or equal to M.

Due to the large size l of C ϕ , the computation of \( \{ V_{j} \}_{{j = 1, \ldots , \, l^{\prime } \, }} \) can be difficult. The KPCA method shows that these \( \{ V_{j} \}_{{j = 1, \ldots , \, l^{\prime } \, }} \) are related to the eigenvectors \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } \, }} \) of the Gram matrix K according to [1]:

$$ V_{j} = \sum\limits_{i = 1}^{M} {\beta_{j,i} } \Upphi (x^{(i)} ),\quad j = 1, \ldots ,l^{\prime } $$
(14)

where \( (\beta_{j,i} )_{i = 1, \ldots ,M} \) are the components of \( \beta_{j} \), associated with the nonzero eigenvalues \( \mu_{1} > \cdots > \mu_{{l^{\prime } }} \) of K.

The principle of the KPCA method consists in ordering the eigenvectors \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) in decreasing order of their corresponding eigenvalues \( \{ \mu_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \). The principal components are the first p vectors \( \{ V_{j} \}_{j = 1, \ldots ,p} \) associated with the largest eigenvalues, and they are often sufficient to describe the structure of the data [1, 2]. The number p satisfies the Inertia Percentage Criterion (IPC) given by:

$$ p^{*} = \arg ({\text{IPC}} \ge 99) $$
(15)

where

$$ {\text{IPC}} = {\frac{{\sum\nolimits_{i = 1}^{p} {\mu_{i} } }}{{\sum\nolimits_{i = 1}^{M} {\mu_{i} } }}} \times 100 $$
(16)

The RKHS model provided by the KPCA method is [1]:

$$ \tilde{y}^{{({\text{new}})}} = \sum\limits_{q = 1}^{p} {w_{q} } \sum\limits_{i = 1}^{M} {\beta_{q,i} } k(x^{(i)} ,x^{{({\text{new}})}} ) $$
(17)

Since the principal components are linear combinations of the transformed input data \( \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \) [3], the Reduced KPCA approximates each vector \( \{ V_{j} \}_{j = 1, \ldots ,p} \) by a transformed input datum \( \Upphi (x_{b}^{(j)} ) \in \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \) having a high projection value in the direction of V j [1].

The projection of \( \Upphi (x^{(i)} ) \) onto V j , denoted \( \tilde{\Upphi }(x^{(i)} )_{j} \in \mathbb{R} \), can be written as:

$$ \tilde{\Upphi }(x^{(i)} )_{j} = \left\langle {V_{j} ,\Upphi (x^{(i)} )} \right\rangle ,\quad \, j = 1, \ldots ,p $$
(18)

According to (14) and (12), relation (18) becomes:

$$ \tilde{\Upphi }(x^{(i)} )_{j} = \sum\limits_{m = 1}^{M} {\beta_{j,m} } k(x^{(m)} ,x^{(i)} ),\quad \, j = 1, \ldots ,p $$
(19)

To select the vectors \( \{ \Upphi (x_{b}^{(i)} )\} \), we project all the vectors \( \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \) on each principal component \( \{ V_{j} \}_{j = 1, \ldots ,p} \) and we retain the \( x_{b}^{(j)} \in \{ x^{(i)} \}_{i = 1, \ldots ,M} \) that satisfies

$$ \left\{ {\begin{array}{*{20}c} {\tilde{\Upphi }(x_{b}^{(j)} )_{j} = \mathop {\text{Max}}\limits_{i = 1, \ldots ,M} \tilde{\Upphi }(x^{(i)} )_{j} } \hfill \\ {\text{and}} \hfill \\ {\tilde{\Upphi }(x_{b}^{(j)} )_{q} < \zeta \quad {\text{for}}\quad q \ne j} \hfill \\ \end{array} } \right. $$
(20)

where ζ is a given threshold.
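A rough implementation of the selection rule (20), based on the projections (19), is sketched below (Python/NumPy); the function name, the fallback when no observation meets the threshold, and the use of absolute values for the cross-projections are assumptions.

```python
import numpy as np

def select_support_observations(K, beta, zeta=0.1):
    """For each retained direction V_j, pick the observation whose projection (19)
    onto V_j is maximal while its projections onto the other directions stay
    below the threshold zeta, as in (20). Returns the list of selected indices."""
    proj = K @ beta                      # proj[i, j] = sum_m beta[m, j] k(x^(m), x^(i))
    M, p = proj.shape
    selected = []
    for j in range(p):
        others = np.delete(proj, j, axis=1)
        admissible = np.where(np.all(np.abs(others) < zeta, axis=1))[0]
        if admissible.size == 0:         # fallback: plain maximum over all observations
            admissible = np.arange(M)
        selected.append(int(admissible[np.argmax(proj[admissible, j])]))
    return selected
```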

Once the \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) corresponding to the p principal components \( \{ V_{j} \}_{j = 1, \ldots ,p} \) are determined, we transform the vector \( \Upphi (x) \in {\mathbb R}^{l} \) into the vector \( \hat{\Upphi }(x) \in {\mathbb R}^{p} \) belonging to the space spanned by \( \{ \Upphi (x_{b}^{(j)} )\}_{j = 1, \ldots ,p} \), and the proposed reduced model is

$$ \tilde{y}_{\text{reduced}}^{{({\text{new}})}} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } \hat{\Upphi }(x^{({\rm new})} )_{j} $$
(21)

where

$$ \hat{\Upphi }(x^{{({\text{new}})}} )_{j} = \left\langle {\Upphi (x_{b}^{(j)} ),\Upphi (x^{{({\text{new}})}} )} \right\rangle \quad \quad {\text{for}}\quad j = 1, \ldots ,p $$
(22)

and according to the kernel trick (12), the model (21) is

$$ \tilde{y}_{\text{reduced}}^{{({\text{new}})}} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k_{j} (x^{{({\text{new}})}} ) $$
(23)

where

$$ k_{j} (x) = k(x_{b}^{(j)} ,x)\quad \quad {\text{for}} \quad j = 1, \ldots ,p $$
(24)

The model (23) is less complex than the one provided by the KPCA, since it involves only p parameters instead of M. The identification problem can then be formulated as the minimization of the regularized least-squares criterion:

$$ J_{r} (\hat{a}) = \frac{1}{2}\sum\limits_{i = 1}^{M} {\left( {y^{(i)} - \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k_{j} \left( {x^{(i)} } \right)} \right)^{2} } + {\frac{\rho }{2}}\left\| {\hat{a}} \right\|^{2} $$
(25)

where ρ is a regularization parameter and \( \hat{a} = (\hat{a}_{1} , \ldots ,\hat{a}_{p} )^{T} \) is the parameter estimate vector.

The solution of the problem (25) is

$$ \hat{a}^{*} = \left( {F + \rho I_{p} } \right)^{ - 1} G $$
(26)

where

$$ G = \left( {\begin{array}{*{20}c} {\sum\limits_{i = 1}^{M} {k_{1} \left( {x^{(i)} } \right)y^{(i)} } } \\ \vdots \\ {\sum\limits_{i = 1}^{M} {k_{p} \left( {x^{(i)} } \right)y^{(i)} } } \\ \end{array} } \right) \in {\mathbb R}^{p} \quad {\text{and}}\quad F = \left( {\begin{array}{*{20}c} {\sum\limits_{i = 1}^{M} {k_{1} \left( {x^{(i)} } \right)k_{1} \left( {x^{(i)} } \right)} } & \cdots & {\sum\limits_{i = 1}^{M} {k_{1} \left( {x^{(i)} } \right)k_{p} \left( {x^{(i)} } \right)} } \\ \vdots & \ddots & \vdots \\ {\sum\limits_{i = 1}^{M} {k_{p} \left( {x^{(i)} } \right)k_{1} \left( {x^{(i)} } \right)} } & \cdots & {\sum\limits_{i = 1}^{M} {k_{p} \left( {x^{(i)} } \right)k_{p} \left( {x^{(i)} } \right)} } \\ \end{array} } \right) \in {\mathbb R}^{p \times p} $$
(27)

and \( I_{p} \in {\mathbb R}^{p \times p} \) is the identity matrix.
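Under these definitions, the estimate (26) can be computed as in the following sketch (Python/NumPy with the RBF kernel (5)); the matrix Kb of entries k_j(x^(i)) and the function names are conveniences introduced here, not part of the original formulation.

```python
import numpy as np

def fit_reduced_model(X, y, Xb, mu, rho):
    """Solve the regularized least-squares problem (25) through (26)-(27).
    X : (M, d) training inputs, y : (M,) outputs, Xb : (p, d) selected x_b^(j),
    mu : RBF kernel width, rho : regularization parameter."""
    d2 = np.sum(X ** 2, axis=1)[:, None] + np.sum(Xb ** 2, axis=1)[None, :] - 2.0 * X @ Xb.T
    Kb = np.exp(-d2 / (2.0 * mu ** 2))     # Kb[i, j] = k_j(x^(i)), see (24)
    F = Kb.T @ Kb                          # p x p matrix of (27)
    G = Kb.T @ y                           # p-dimensional vector of (27)
    return np.linalg.solve(F + rho * np.eye(Kb.shape[1]), G)   # a_hat^* of (26)

def predict_reduced_model(x_new, Xb, a_hat, mu):
    """Evaluate the reduced RKHS model (23) at a new input x_new."""
    k_new = np.exp(-np.sum((Xb - x_new) ** 2, axis=1) / (2.0 * mu ** 2))
    return float(a_hat @ k_new)
```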

The RKPCA algorithm is summarized in the following five steps:

  1. Determine the nonzero eigenvalues \( \{ \mu_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) and the eigenvectors \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) of the Gram matrix K.

  2. Order the \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) in decreasing order of their corresponding eigenvalues.

  3. For the p retained principal components, choose the \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) that satisfy (20).

  4. Solve (25) to determine \( \hat{a}^{*} \in {\mathbb R}^{p} \).

  5. The reduced RKHS model is given by (23).

5 Online RKPCA method

In this section, we propose an online Reduced Kernel Principal Component Analysis method, which consists in updating the vectors that approximate the principal components. The proposed method is detailed in the following steps.

In a first step, to determine the optimal value of the parameter μ of the kernel associated with the RKHS model, an offline learning step on an n-observation set \( I = \{ (x^{(1)} ,y^{(1)} ), \ldots ,(x^{(n)} ,y^{(n)} )\} \) is carried out until the resulting RKHS model approximates the nonlinear system correctly. Then, we apply the RKPCA method to reduce the number of model parameters, and the resulting model is written as:

$$ \tilde{y}(x) = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k(x_{b}^{(j)} ,x) $$
(28)

Let \( I_{n} = \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) be the set of observations corresponding to the retained principal components.

At time instant (n + 1), the RKHS model output is obtained according to (28) as:

$$ \tilde{y}^{(n + 1)} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k(x_{b}^{(j)} ,x^{(n + 1)} ) $$
(29)

The error between the estimated output and the real one is

$$ e^{(n + 1)} = \left| {\tilde{y}^{(n + 1)} - y^{(n + 1)} } \right| $$
(30)

If e (n+1) < ɛ 1, where ɛ 1 is a given threshold, the model describes the system behavior sufficiently well. Otherwise, an update of the RKHS model is required, which can be accomplished either by updating the model parameters or by updating the retained principal components.

In both cases, we calculate the projection of Φ(x (n+1)) on the space F kpca spanned by \( \{ \Upphi (x_{b}^{(j)} )\}_{j = 1, \ldots ,p} \). This projection is denoted \( \hat{\Upphi }(x^{(n + 1)} ) \), and its jth component is given by:

$$ \hat{\Upphi }\left( {x^{(n + 1)} } \right)_{j} = \left\langle {\Upphi \left( {x_{b}^{(j)} } \right),\Upphi \left( {x^{(n + 1)} } \right)} \right\rangle = k\left( {x_{b}^{(j)} ,x^{(n + 1)} } \right),\quad \, j = 1, \ldots ,p $$
(31)

A good approximation of \( \Upphi (x^{(n + 1)} ) \) by \( \hat{\Upphi }(x^{(n + 1)} ) \) requires satisfying the following condition:

$$ \left| { \, \left\| {\hat{\Upphi }\left( {x^{(n + 1)} } \right)} \right\| - \left\| {\Upphi \left( {x^{(n + 1)} } \right)} \right\| \, } \right| < \varepsilon_{2} $$
(32)
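For the RBF kernel (5), \( \left\| {\Upphi (x)} \right\|^{2} = k(x,x) = 1 \), so condition (32) can be checked roughly as in the sketch below; reading the norm of \( \hat{\Upphi }(x) \) as the Euclidean norm of the component vector (31) is our assumption, not a statement of the original method.

```python
import numpy as np

def projection_is_accurate(x_new, Xb, mu, eps2):
    """Check condition (32): compare the norm of Phi_hat(x), built from the
    components (31), with ||Phi(x)|| = sqrt(k(x, x)) = 1 for the RBF kernel (5)."""
    k_vec = np.exp(-np.sum((Xb - x_new) ** 2, axis=1) / (2.0 * mu ** 2))   # components (31)
    return abs(np.linalg.norm(k_vec) - 1.0) < eps2
```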

If (32) is satisfied, the set I n is updated to \( I_{n + 1} = \{ I_{n} , \, x^{(n + 1)} \} \), which is used to determine the parameters \( \{ \hat{a}_{j} \}_{j = 1, \ldots ,p} \) of the RKHS model (28).

If (32) is not satisfied, we update the set \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) using the observation set I n+1: we build the Gram matrix corresponding to I n+1, given by:

$$ G_{n + 1} = \left( {\begin{array}{*{20}c} {k\left( {x_{b}^{(1)} ,x_{b}^{(1)} } \right)} & \ldots & {k\left( {x_{b}^{(1)} ,x^{(n + 1)} } \right)} \\ \vdots & \ddots & \vdots \\ {k\left( {x^{(n + 1)} ,x_{b}^{(1)} } \right)} & \cdots & {k\left( {x^{(n + 1)} ,x^{(n + 1)} } \right)} \\ \end{array} } \right) $$
(33)

and we compute its eigenvalues.

According to relations (15), (16) and (33), we determine the new principal components. Then, we use the RKPCA selection to determine the new set \( \{ x_{b}^{(j)} \}_{{j = 1, \ldots ,p^{\prime } }} \) that approximates the \( p^{\prime } \) retained principal components. The RKPCA model is given by:

$$ \tilde{y} = \sum\limits_{j = 1}^{{p^{\prime } }} {\hat{a}_{j} } k\left( {x_{b}^{(j)} ,x} \right) $$
(34)

Finally, we estimate the parameters \( \hat{a}_{j} ,\quad j = 1, \ldots ,p^{\prime } \).
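Putting the pieces together, one iteration of the online update described above could look like the following sketch; it reuses the illustrative helpers gram_matrix, select_principal_components, select_support_observations, projection_is_accurate and fit_reduced_model defined earlier, and the choice of re-estimating the parameters on \( I_{n + 1} \) (with the corresponding stored outputs) is our reading of the method, not the authors' implementation.

```python
import numpy as np

def online_rkpca_step(x_new, y_new, Xb, yb, a_hat, mu, rho, eps1, eps2, zeta):
    """One online RKPCA iteration at time n+1. Xb, yb hold the retained
    observations I_n and their outputs; a_hat holds the current model parameters."""
    k_vec = np.exp(-np.sum((Xb - x_new) ** 2, axis=1) / (2.0 * mu ** 2))
    y_model = float(a_hat @ k_vec)                      # model output (29)
    if abs(y_model - y_new) < eps1:                     # error test (30): keep the model
        return Xb, yb, a_hat
    X1 = np.vstack([Xb, x_new[None, :]])                # I_{n+1} = {I_n, x^(n+1)}
    y1 = np.append(yb, y_new)
    if projection_is_accurate(x_new, Xb, mu, eps2):     # (32) holds: keep the same centers
        return Xb, yb, fit_reduced_model(X1, y1, Xb, mu, rho)
    K1 = gram_matrix(X1, mu)                            # Gram matrix (33) of I_{n+1}
    _, beta = select_principal_components(K1)           # new components via (15)-(16)
    idx = select_support_observations(K1, beta, zeta)   # new x_b^(j) via (20)
    Xb, yb = X1[idx], y1[idx]
    return Xb, yb, fit_reduced_model(X1, y1, Xb, mu, rho)   # model (34)
```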

In the following, we summarize the algorithm of the online RKPCA method.

6 Online RKPCA algorithm

6.1 Offline phase

  1. According to (15) and (16), we determine the p retained principal components resulting from the processing of an n-measurement set. Then, we determine the set \( I_{n} = \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) according to (20). The RKHS model is given by:

    $$ \tilde{y} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k(x_{b}^{(j)} ,x) $$

6.2 Online phase

  1. At time instant (n + 1), a new datum (x (n+1), y (n+1)) is available. If e (n+1) < ɛ 1, the model describes the system behavior sufficiently well; otherwise, we need to update the RKHS model (28) using the projection \( \hat{\Upphi }(x^{(n + 1)} ) \) given by (31).

  2. If (32) is satisfied, we use the set \( I_{n + 1} = \{ I_{n} ,x^{(n + 1)} \} \) to update the parameters \( \{ \hat{a}_{j} \}_{j = 1, \ldots ,p} \); otherwise, we update the set \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) using the set I n+1 and the relations (15), (16) and (33). The new RKHS model is given by (34), and its parameters \( \hat{a}_{j} \) can be determined using the least-squares method.

7 Simulations

The proposed method has been tested for modeling the Wiener-Hammerstein benchmark and a chemical reactor.

7.1 Description of the Wiener-Hammerstein benchmark model

The system to be modeled is sketched in Fig. 1. It consists of an electronic nonlinear system with a Wiener-Hammerstein structure that was built by Gerd Vandersteen [21]. This process was adopted as a nonlinear system benchmark at SYSID 2009.

Fig. 1 Wiener-Hammerstein benchmark

7.2 Results

To build the RKHS model, we use the RBF (Radial Basis Function) kernel

$$ K\left( {x,x^{\prime } } \right) = \exp \left( { - {\frac{{\left\| {x - x^{\prime } } \right\|^{2} }}{{2\mu^{2} }}}} \right),\quad \mu = 88 $$
(35)

We use a heuristic approach, called sequential forward search, to select the input vector that yields the minimal normalized mean square error (NMSE) between the real output and the estimated one; each input is added sequentially. The selected vector is \( x(k) = \left\{ {u(k - 1),u(k - 2),u(k - 4), \ldots ,u(k - 15),y(k - 1)} \right\}^{T} \in {\mathbb R}^{15} \), with a validation NMSE of 0.063.

The chosen thresholds are

$$ \varepsilon_{1} = 0.09,\quad \varepsilon_{2} = 0.01 $$

We performed the online identification using the online RKPCA algorithm developed in Sect. 5. The total number of observations is 187,000.

The parameter estimate vector is

$$ \hat{a} = 10^{4} \times \left[ { - 1.95\quad - 0.05\quad 1.62\quad - 0.06\quad 0.43\quad 0.25\quad - 0.24\quad 0.74\quad 0.02\quad - 0.88\quad 0.20\quad - 1.3\quad 0.96\quad - 0.09\quad 0.52\quad - 0.17} \right]^{\text{T}} \in {\mathbb R}^{16} $$

The number of retained principal components is p = 16. The corresponding selected observations form the columns of the following matrix:

$$ x_{b} = \left( {\begin{array}{*{20}c}
0.1548 & -0.0130 & 0.5902 & 0.4587 & -1.8770 & -0.3059 & -1.2271 & -0.5081 & -0.0058 & 0.5579 & -0.0031 & -0.6057 & -0.1593 & -0.1160 & 0.4391 & -0.0076 \\
0.3063 & 0.0124 & 0.5394 & 1.0115 & -1.2494 & -0.3609 & -1.5622 & -1.1670 & -0.0052 & 0.6482 & -0.0076 & -1.2271 & -0.1878 & -0.0141 & 0.1023 & -0.0024 \\
0.4549 & -0.6057 & 0.1463 & 1.3868 & -0.2050 & -0.3986 & -1.6233 & -1.3631 & -0.0076 & 0.4391 & -0.0038 & -1.5622 & -0.0165 & 0 & -0.1593 & -0.0130 \\
0.6884 & -1.2271 & -0.3183 & 1.2855 & 0.7722 & -0.3921 & -1.4379 & -1.1622 & -0.0024 & 0.1023 & -0.0076 & -1.6233 & 0.2627 & -0.0096 & -0.1878 & 0.0124 \\
1.0424 & -1.5622 & -0.4996 & 0.7018 & 1.2917 & -0.1648 & -1.1519 & -0.8728 & -0.0130 & -0.1593 & -0.0041 & -1.4379 & 0.5648 & -0.0024 & -0.0165 & -0.6057 \\
1.4352 & -1.6233 & -0.2486 & -0.0577 & 1.2196 & 0.3866 & -0.7533 & -0.7543 & 0.0124 & -0.1878 & -0.0072 & -1.1519 & 0.8893 & -0.0082 & 0.2627 & -1.2271 \\
1.6841 & -1.4379 & 0.2949 & -0.5806 & 0.6960 & 1.0159 & -0.2881 & -0.8175 & -0.6057 & -0.0165 & -0.0048 & -0.7533 & 1.1832 & -0.0031 & 0.5648 & -1.5622 \\
1.6209 & -1.1519 & 0.8508 & -0.6249 & 0.0398 & 1.1821 & 0.2074 & -0.8614 & -1.2271 & 0.2627 & -0.0065 & -0.2881 & 1.2989 & -0.0076 & 0.8893 & -1.6233 \\
1.1935 & -0.7533 & 1.2003 & -0.2688 & -0.4113 & 0.5181 & 0.5579 & -0.6887 & -1.5622 & 0.5648 & -0.0048 & 0.2074 & 1.0348 & -0.0038 & 1.1832 & -1.4379 \\
0.5020 & -0.2881 & 1.2693 & 0.1535 & -0.4522 & -0.7251 & 0.6482 & -0.2781 & -1.6233 & 0.8893 & -0.0062 & 0.5579 & 0.3567 & -0.0076 & 1.2989 & -1.1519 \\
-0.2245 & 0.2074 & 1.0915 & 0.2609 & -0.0752 & -1.7504 & 0.4391 & 0.1919 & -1.4379 & 1.1832 & -0.0048 & 0.6482 & -0.5081 & -0.0041 & 1.0348 & -0.7533 \\
-0.7056 & 0.5579 & 0.7615 & -0.1370 & 0.5456 & -1.8503 & 0.1023 & 0.4755 & -1.1519 & 1.2989 & -0.0065 & 0.4391 & -1.1670 & -0.0072 & 0.3567 & -0.2881 \\
0.4249 & 0.6157 & -0.1363 & 1.4868 & 0.2250 & -0.3786 & 1.5243 & -1.2731 & -0.0076 & 0.4431 & -0.0048 & -1.5421 & -0.0187 & -0.0045 & -0.1373 & -0.0172 \\
-0.3052 & 0.4391 & 0.2393 & -1.6206 & 1.4795 & 0.0906 & -0.1878 & 0.1827 & -0.2881 & 0.3567 & -0.0058 & -0.1593 & -1.1622 & -0.0065 & -1.1670 & 0.5579 \\
0.1257 & -0.3746 & 0.0989 & 0.3945 & -0.2572 & -0.2520 & -0.5236 & -0.1092 & -0.1274 & -0.2397 & -0.0666 & -0.4865 & 0.0670 & -0.1569 & -0.0841 & -0.2417 \\
\end{array} } \right) $$

In Fig. 2a–c, we present the system output and the model output during the online identification. We show the performance of our algorithm in the training sample windows [4800, 5800], \( [10^{5} ,\;1.1 \times 10^{5} ] \) and [183200, 184200]. We note that the model output agrees well with the system output; indeed, the normalized mean square error is equal to 0.087. This shows the good performance of the proposed online identification method.

Fig. 2 System and model outputs during the online identification

To evaluate the performance of the proposed method, we plot in Fig. 3 the evolution of the NMSE. We notice that the error decreases as the number of observations increases.

Fig. 3 Evolution of NMSE

Table 1 summarizes the results of the online identification algorithm in terms of kernel parameter, NMSE and number of parameters.

Table 1 Results of the Wiener Hammerstein benchmark identification with the online RKPCA algorithm

7.3 Chemical reactor modeling

7.3.1 Process description

The process is a Continuous Stirred Tank Reactor (CSTR), a nonlinear system used to carry out chemical reactions [8]. A diagram of the reactor is given in Fig. 4.

Fig. 4 Chemical reactor diagram

The physical equations describing the process are

$$ \begin{aligned} {\frac{{{\text{d}}h(t)}}{{{\text{d}}t}}} & = w_{1} (t) + w_{2} (t) - 0.2\sqrt {h(t)} \\ {\frac{{{\text{d}}C_{b} (t)}}{{{\text{d}}t}}} & = \left( {C_{b1} - C_{b} (t)} \right){\frac{{w_{1} (t)}}{h(t)}} + \left( {C_{b2} - C_{b} (t)} \right){\frac{{w_{2} (t)}}{h(t)}} - {\frac{{k_{1} C_{b} (t)}}{{(1 + k_{2} C_{b} (t))^{2} }}} \\ \end{aligned} $$
(36)

where h(t) is the height of the mixture in the reactor, w 1 (resp. w 2) is the feed flow rate of reactant 1 (resp. reactant 2) with concentration C b1 (resp. C b2), w 0 is the outlet flow rate of the reaction product and C b its concentration, and k 1 and k 2 are the reactant consumption rates. The temperature in the reactor is assumed constant and equal to the ambient temperature. We are interested in modeling the subsystem presented in Fig. 5.

Fig. 5 Considered subsystem

For the purpose of the simulations, we used the CSTR model of the reactor provided with Simulink of Matlab. The parameter μ = 208 is determined using the cross-validation technique in the offline phase.

The input vector of RKHS model is

$$ x(k) = [w_{1} (k - 1),w_{1} (k - 2),cb(k - 1),cb(k - 2)]^{T} $$
(37)

The number of observations is 300.

In Fig. 6, we represent the output of the online reduced kernel principal component analysis model as well as the system output. We notice good agreement between both outputs, with a normalized mean square error equal to 0.083%.

Fig. 6 System and model outputs during the online identification

In Fig. 7, we draw the evolution of the NMSE. We notice that the online identification algorithm readily reaches an error of less than 1% from the 20th training sample onward.

Fig. 7 Evolution of NMSE

Table 2 summarizes the performance of the online identification algorithm in terms of kernel parameter, NMSE and number of parameters.

Table 2 Results of the chemical reactor identification with the online RKPCA algorithm

8 Conclusion

In this paper, we have proposed an online reduced kernel principal component analysis method for nonlinear system parameter identification. Through several experiments, we showed the accuracy and good scaling properties of the proposed method. The algorithm has been tested on the identification of the Wiener-Hammerstein benchmark model and of a chemical reactor, and the results were satisfactory. The proposed technique may be very helpful to design an adaptive control strategy for nonlinear systems.