1 Introduction

Since the introduction of Support Vector Machines (SVM) [19], many learning algorithms have been transferred to a kernel representation [2, 7]. The important benefit lies in the fact that nonlinearities can be handled while avoiding the solution of a nonlinear optimization problem. The transfer is implicitly accomplished by means of a nonlinear map into a Reproducing Kernel Hilbert Space \( F_{k} \) (RKHS) [9].

Kernel methods have been successfully applied to a large class of problems, such as nonlinear system identification [2, 3, 16, 17], diagnostic systems [24], time series prediction [13], face recognition [26] and biological data processing for medical diagnosis [23]. The attractiveness of such algorithms stems from their elegant treatment of data generated by nonlinear processes. However, these techniques suffer from computational complexity, as the required computer memory and the training time increase rapidly with the number of observations. It is clear that for large datasets (as, for example, in image processing, computer vision or object recognition), the kernel method, despite its powerful ability to deal with nonlinearities, is computationally limited. For large datasets, an eigendecomposition of the Gram matrix can simply become too time-consuming to extract the principal components, and the identification of the system parameters therefore becomes a tough task. To overcome this burden, a theoretical foundation for online learning algorithms with kernel methods in reproducing kernel Hilbert spaces was recently proposed [4, 5, 14, 15, 18, 22, 25]. Online kernel algorithms are also more useful when the system to be identified is time-varying, because they can automatically track changes of the system model with time-varying and time-lagging characteristics.

In this paper, we propose a new method for the online identification of the parameters of a nonlinear system modeled in a Reproducing Kernel Hilbert Space (RKHS). This method uses the Reduced Kernel Principal Component Analysis (RKPCA), which selects the observation data that approximate the principal components retained by the Kernel Principal Component Analysis (KPCA) method [2]. The selected observations are used to build an RKHS model with a reduced number of parameters. The proposed online identification method updates the list of the retained principal components, and then the RKHS model, by evaluating the error between the model output and the process output. The proposed technique may be very helpful to design an adaptive control strategy for nonlinear systems.

The paper is organized as follows. In Sect. 2, we review Reproducing Kernel Hilbert Spaces (RKHS). Section 3 is devoted to modeling in RKHS. The Reduced Kernel Principal Component Analysis (RKPCA) method is presented in Sect. 4. In Sect. 5, we propose the new online RKPCA method, and the corresponding algorithm is summarized in Sect. 6. In Sect. 7, the proposed algorithm is tested on the Wiener-Hammerstein benchmark [21] and on a chemical reactor [8].

2 Reproducing kernel Hilbert space

Let \( E \subset {\mathbb R}^{d} \) be an input space and L 2(E) the Hilbert space of square integrable functions defined on E. Let \( k:E \times E \to \mathbb{R} \) be a continuous positive definite kernel. It is proved [6, 9] that there exist a sequence of orthonormal eigenfunctions (ψ 1, ψ 2, …, ψ l ) in L 2(E) and a sequence of corresponding real positive eigenvalues (σ 1, σ 2, …, σ l ) (where l can be infinite) such that

$$ k(x,t) = \sum\limits_{j = 1}^{l} {\sigma_{j} \psi_{j} (x)\psi_{j} (t)} ;\quad x,t \in E. $$
(1)

Let \( F_{k} \subset L^{2} (E) \) be a Hilbert space associated to the kernel k and defined by:

$$ F_{k} = \left\{ {f \in L^{2} (E)/f = \sum\limits_{i = 1}^{l} {w_{i} \varphi_{i} } {\text{ and }}\sum\limits_{j = 1}^{l} {{\frac{{w_{j}^{2} }}{{\sigma_{j} }}}} < +\infty } \right\} $$
(2)

where \( \varphi_{i} = \sqrt {\sigma_{i} } \psi_{i} \) for i = 1, …, l. The scalar product in the space F k is given by:

$$ \left\langle {f,g} \right\rangle_{{F_{k} }} = \left\langle {\sum\limits_{i = 1}^{l} {w_{i} \varphi_{i} } ,\sum\limits_{j = 1}^{l} {z_{j} \varphi_{j} } } \right\rangle_{{F_{k} }} = \sum\limits_{i = 1}^{l} {w_{i} z_{i} } $$
(3)

The kernel k is said to be a reproducing kernel of the Hilbert space F k if and only if the following conditions are satisfied.

$$ \left\{ {\begin{array}{*{20}c} {\forall x \in E,\quad k\left( {x, \cdot } \right) \in F_{k} } \hfill \\ {\forall x \in E \quad {\text{ and }} \quad \forall f \in F_{k} ,\left\langle { \, f( \cdot ),k(x, \cdot )} \right\rangle_{{{F_k} }} = f(x)} \hfill \\ \end{array} } \right. $$
(4)

where k(x,·) denotes the function \( x^{\prime } \mapsto k(x,x^{\prime } ) \) for \( x^{\prime } \in E \). F k is called a reproducing kernel Hilbert space (RKHS) with kernel k and dimension l. Moreover, every RKHS has a unique positive definite reproducing kernel and vice versa [10].

Among the possible reproducing kernels, we mention the Radial Basis Function (RBF) kernel defined as:

$$ k(x,t) = \exp \left( { - \left\| {x - t} \right\|^{2} /2\mu^{2} } \right) ;\quad \quad \forall x,t \in E $$
(5)

with μ a fixed parameter.
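As an illustration, the RBF kernel (5) and the Gram matrix used later in (11) can be computed as in the following sketch (Python/NumPy); the function names and the default value of μ are illustrative assumptions, not part of the original method.

```python
import numpy as np

def rbf_kernel(x, t, mu=1.0):
    """RBF kernel k(x, t) = exp(-||x - t||^2 / (2 mu^2)), as in (5)."""
    x, t = np.asarray(x, dtype=float), np.asarray(t, dtype=float)
    return float(np.exp(-np.sum((x - t) ** 2) / (2.0 * mu ** 2)))

def gram_matrix(X, mu=1.0):
    """Gram matrix K[i, j] = k(x^(i), x^(j)) for the rows of X (M x d), as in (11)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * mu ** 2))
```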

3 RKHS models

Consider a set of observations \( \{ x^{(i)} ,y^{(i)} \}_{i = 1, \ldots ,M} \), where \( x^{(i)} \in {\mathbb R}^{n} \) and \( y^{(i)} \in \mathbb{R} \) are the system input and output, respectively. According to the statistical learning theory (SLT) [19, 20], the identification problem in the RKHS F k can be formulated as the minimization of the regularized empirical risk. Thus, it consists in finding the function \( f^{*} \in F_{k} \) such that

$$ f^{*} = \sum\limits_{j = 1}^{l} {w_{j}^{*} \varphi_{j} } = \mathop {\arg \min }\limits_{{f \in F_{k} }} \frac{1}{M}\sum\limits_{i = 1}^{M} {\left( {y^{(i)} - f\left( {x^{(i)} } \right)} \right)^{2} } + \lambda \left\| f \right\|_{{F_{k} }}^{2} $$
(6)

where M is the number of measurements and λ is a regularization parameter chosen to ensure the generalization ability of the solution f *. According to the representer theorem [9], the solution f * of the optimization problem (6) is a linear combination of the kernel k evaluated at the M measurements x (i), i = 1, …, M:

$$ f^{*} (x) = \sum\limits_{i = 1}^{M} {a_{i}^{*} k\left( {x^{(i)} ,x} \right)} .$$
(7)

To solve the optimization problem (6), we can use kernel methods such as the Support Vector Machine (SVM) [11], the Least Squares Support Vector Machine (LSSVM) [7], Regularization Networks (RN) [3] and Kernel Partial Least Squares (KPLS) [12]. In [2], the Kernel Principal Component Analysis (KPCA) method was proposed. This method reconsiders the regularization idea by finding the solution to the identification problem in a subspace F kpca spanned by the so-called principal components, and yields an RKHS model with M parameters.

In the next section, we present the Reduced KPCA method, in which we approximate the principal components retained by the KPCA with a set of vectors of input observations. This approximation is performed with a set of particular training observations and allows the construction of an RKHS model with far fewer parameters.

4 RKPCA method

Consider a nonlinear system with input \( u \in \mathbb{R} \) and output \( y \in \mathbb{R} \), from which we extract a set of observations \( \{ u^{(i)} ,y^{(i)} \}_{i = 1, \ldots ,M} \). Let F k be an RKHS with kernel k. To build the input vector x (i) of the RKHS model, we use the NARX (Nonlinear AutoRegressive with eXogenous input) structure:

$$ x^{(i)} = \left\{ {u^{(i)} , \ldots ,u^{{(i - m_{u} )}} , \, y^{(i - 1)} , \ldots , \, y^{{(i - m_{y} )}} } \right\}^{T} ;\quad m_{u} ,m_{y} \in \mathbb{N} $$
(8)

The set of observations becomes \( D = \{ x^{(i)} ,y^{(i)} \}_{i = 1, \ldots ,M} \) where \( x^{(i)} \in {\mathbb R}^{{m_{u} + m_{y} + 1}} \) and \( y^{(i)} \in \mathbb{R} \) and the RKHS model of this system based on (7) can be written as:

$$ \tilde{y}^{(j)} = \sum\limits_{i = 1}^{M} {a_{i} k(x^{(i)} ,x^{(j)} )} $$
(9)
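For concreteness, the regressors (8) and the observation set D can be assembled as in the following sketch (Python/NumPy); the function name and the handling of the first samples, for which the regressor is incomplete, are assumptions made for illustration.

```python
import numpy as np

def build_narx_data(u, y, m_u, m_y):
    """Build the NARX regressors x^(i) = [u(i), ..., u(i - m_u), y(i-1), ..., y(i - m_y)]^T
    of (8) and the corresponding targets y^(i), skipping the first samples
    for which the regressor is incomplete."""
    u, y = np.asarray(u, dtype=float), np.asarray(y, dtype=float)
    start = max(m_u, m_y)
    X, Y = [], []
    for i in range(start, len(u)):
        past_u = u[i - m_u:i + 1][::-1]   # u(i), u(i-1), ..., u(i - m_u)
        past_y = y[i - m_y:i][::-1]       # y(i-1), ..., y(i - m_y)
        X.append(np.concatenate([past_u, past_y]))
        Y.append(y[i])
    return np.array(X), np.array(Y)       # shapes (M, m_u + m_y + 1) and (M,)
```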

Let the application Φ:

$$ \begin{gathered} \Upphi :E \to {\mathbb R}^{l} \hfill \\ \, x \,\mapsto\, \Upphi (x) = \left( {\begin{array}{*{20}c} {\varphi_{1} (x)} \\ \vdots \\ {\varphi_{l} (x)} \\ \end{array} } \right) \hfill \\ \end{gathered} $$
(10)

where φ i are given in (2).

The Gram matrix K associated with the kernel k is an M-dimensional square matrix so that

$$ K_{i,j} = k(x^{(i)} ,x^{(j)} )\quad \quad {\text{for}}\quad i,j = 1, \ldots ,M $$
(11)

The kernel trick [10] is so that

$$ \left\langle {\Upphi (x),\Upphi (x^{\prime } )} \right\rangle = k(x,x^{\prime } )\quad x,\quad x^{\prime } \in E $$
(12)

We assume that the transformed data \( \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \in {\mathbb R}^{l} \) are centered [2]. The empirical covariance matrix of the transformed data is symmetric and l-dimensional. It is written as follows:

$$ C_{\phi } = \frac{1}{M}\sum\limits_{i = 1}^{M} {\Upphi (x^{(i)} )\Upphi (x^{(i)} )^{T} } , \quad C_{\phi } \in {\mathbb R}^{l \times l} $$
(13)

Let \( l^{\prime } \) be the number of eigenvectors \( \{ V_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) of the matrix C ϕ that correspond to the nonzero positive eigenvalues \( \{ \lambda_{j} \}_{{j = 1, \ldots , \, l^{\prime } }} \). It is proved in [2] that \( l^{\prime } \) is less than or equal to M.

Due to the large size l of C ϕ , the computation of \( \{ V_{j} \}_{{j = 1, \ldots , \, l^{\prime } \, }} \) can be difficult. The KPCA method shows that these \( \{ V_{j} \}_{{j = 1, \ldots , \, l^{\prime } \, }} \) are related to the eigenvectors \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } \, }} \) of the Gram matrix K according to [1]:

$$ V_{j} = \sum\limits_{i = 1}^{M} {\beta_{j,i} } \Upphi (x^{(i)} ),\quad j = 1, \ldots ,l^{\prime } $$
(14)

where \( (\beta_{j,i} )_{i = 1, \ldots ,M} \) are the components of \( \beta_{j} \), associated with the nonzero eigenvalues \( \mu_{1} > \cdots > \mu_{{l^{\prime } }} \) of K.

The principle of the KPCA method consists in ordering the eigenvectors \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) in decreasing order of their corresponding eigenvalues \( \{ \mu_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \). The principal components are the first p vectors \( \{ V_{j} \}_{j = 1, \ldots ,p} \) associated with the largest eigenvalues, and they are often sufficient to describe the structure of the data [1, 2]. The number p satisfies the Inertia Percentage Criterion (IPC) given by:

$$ p^{*} = \arg ({\text{IPC}} \ge 99) $$
(15)

where

$$ {\text{IPC}} = {\frac{{\sum\nolimits_{i = 1}^{p} {\mu_{i} } }}{{\sum\nolimits_{i = 1}^{M} {\mu_{i} } }}} \times 100 $$
(16)

The RKHS model provided by the KPCA method is [1]:

$$ \tilde{y}^{{({\text{new}})}} = \sum\limits_{q = 1}^{p} {w_{q} } \sum\limits_{i = 1}^{M} {\beta_{q,i} } k(x^{(i)} ,x^{{({\text{new}})}} ) $$
(17)

Since the principal components are linear combinations of the transformed input data \( \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \) [3], the Reduced KPCA approximates each vector \( \{ V_{j} \}_{j = 1, \ldots ,p} \) by a transformed input datum \( \Upphi (x_{b}^{(j)} ) \in \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \) having a high projection value in the direction of V j [1].

The projection of \( \Upphi (x^{(i)} ) \) onto V j , denoted \( \tilde{\Upphi }(x^{(i)} )_{j} \in \mathbb{R} \), can be written as:

$$ \tilde{\Upphi }(x^{(i)} )_{j} = \left\langle {V_{j} ,\Upphi (x^{(i)} )} \right\rangle ,\quad \, j = 1, \ldots ,p $$
(18)

According to (14) and (12), relation (18) becomes:

$$ \tilde{\Upphi }(x^{(i)} )_{j} = \sum\limits_{m = 1}^{M} {\beta_{j,m} } k(x^{(m)} ,x^{(i)} ),\quad \, j = 1, \ldots ,p $$
(19)

To select the vectors \( \{ \Upphi (x_{b}^{(i)} )\} \), we project all the vectors \( \{ \Upphi (x^{(i)} )\}_{i = 1, \ldots ,M} \) on each principal component \( \{ V_{j} \}_{j = 1, \ldots ,p} \) and we retain the \( x_{b}^{(j)} \in \{ x^{(i)} \}_{i = 1, \ldots ,M} \) that satisfies

$$ \left\{ {\begin{array}{*{20}c} {\tilde{\Upphi }(x_{b}^{(j)} )_{j} = \mathop {\text{Max}}\limits_{i = 1, \ldots ,M} \tilde{\Upphi }(x^{(i)} )_{j} } \hfill \\ {\text{and}} \hfill \\ {\tilde{\Upphi }(x_{b}^{(j)} )_{q} < \zeta \quad {\text{for}}\quad q \ne j} \hfill \\ \end{array} } \right. $$
(20)

where ζ is a given threshold.
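A rough implementation of the selection rule (20), based on the projections (19), is sketched below (Python/NumPy); the function name, the fallback when no observation meets the threshold, and the use of absolute values for the cross-projections are assumptions.

```python
import numpy as np

def select_support_observations(K, beta, zeta=0.1):
    """For each retained direction V_j, pick the observation whose projection (19)
    onto V_j is maximal while its projections onto the other directions stay
    below the threshold zeta, as in (20). Returns the list of selected indices."""
    proj = K @ beta                      # proj[i, j] = sum_m beta[m, j] k(x^(m), x^(i))
    M, p = proj.shape
    selected = []
    for j in range(p):
        others = np.delete(proj, j, axis=1)
        admissible = np.where(np.all(np.abs(others) < zeta, axis=1))[0]
        if admissible.size == 0:         # fallback: plain maximum over all observations
            admissible = np.arange(M)
        selected.append(int(admissible[np.argmax(proj[admissible, j])]))
    return selected
```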

Once the \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) corresponding to the p principal components \( \{ V_{j} \}_{j = 1, \ldots ,p} \) are determined, we transform the vector \( \Upphi (x) \in {\mathbb R}^{l} \) into the vector \( \hat{\Upphi }(x) \in {\mathbb R}^{p} \) belonging to the space spanned by \( \{ \Upphi (x_{b}^{(j)} )\}_{j = 1, \ldots ,p} \), and the proposed reduced model is

$$ \tilde{y}_{\text{reduced}}^{{({\text{new}})}} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } \hat{\Upphi }(x^{({\rm new})} )_{j} $$
(21)

where

$$ \hat{\Upphi }(x^{{({\text{new}})}} )_{j} = \left\langle {\Upphi (x_{b}^{(j)} ),\Upphi (x^{{({\text{new}})}} )} \right\rangle \quad \quad {\text{for}}\quad j = 1, \ldots ,p $$
(22)

and according to the kernel trick (12), the model (21) is

$$ \tilde{y}_{\text{reduced}}^{{({\text{new}})}} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k_{j} (x^{{({\text{new}})}} ) $$
(23)

where

$$ k_{j} (x) = k(x_{b}^{(j)} ,x)\quad \quad {\text{for}} \quad j = 1, \ldots ,p $$
(24)

The model (23) is less complex than the one provided by the KPCA, since it involves only p parameters instead of M. The identification problem can then be formulated as the minimization of the regularized least-squares criterion:

$$ J_{r} (\hat{a}) = \frac{1}{2}\sum\limits_{i = 1}^{M} {\left( {y^{(i)} - \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k_{j} \left( {x^{(i)} } \right)} \right)^{2} } + {\frac{\rho }{2}}\left\| {\hat{a}} \right\|^{2} $$
(25)

where ρ is a regularization parameter and \( \hat{a} = (\hat{a}_{1} , \ldots ,\hat{a}_{p} )^{T} \) is the parameter estimate vector.

The solution of the problem (25) is

$$ \hat{a}^{*} = \left( {F + \rho I_{p} } \right)^{ - 1} G $$
(26)

where

$$ G = \left( {\begin{array}{*{20}c} {\sum\limits_{i = 1}^{M} {k_{1} \left( {x^{(i)} } \right)y^{(i)} } } \\ \vdots \\ {\sum\limits_{i = 1}^{M} {k_{p} \left( {x^{(i)} } \right)y^{(i)} } } \\ \end{array} } \right) \in {\mathbb R}^{p} \quad {\text{and}}\quad F = \left( {\begin{array}{*{20}c} {\sum\limits_{i = 1}^{M} {k_{1} \left( {x^{(i)} } \right)k_{1} \left( {x^{(i)} } \right)} } & \cdots & {\sum\limits_{i = 1}^{M} {k_{1} \left( {x^{(i)} } \right)k_{p} \left( {x^{(i)} } \right)} } \\ \vdots & \ddots & \vdots \\ {\sum\limits_{i = 1}^{M} {k_{p} \left( {x^{(i)} } \right)k_{1} \left( {x^{(i)} } \right)} } & \cdots & {\sum\limits_{i = 1}^{M} {k_{p} \left( {x^{(i)} } \right)k_{p} \left( {x^{(i)} } \right)} } \\ \end{array} } \right) \in {\mathbb R}^{p \times p} $$
(27)

and \( I_{p} \in {\mathbb R}^{p \times p} \) is the identity matrix.
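Under these definitions, the estimate (26) can be computed as in the following sketch (Python/NumPy with the RBF kernel (5)); the matrix Kb of entries k_j(x^(i)) and the function names are conveniences introduced here, not part of the original formulation.

```python
import numpy as np

def fit_reduced_model(X, y, Xb, mu, rho):
    """Solve the regularized least-squares problem (25) through (26)-(27).
    X : (M, d) training inputs, y : (M,) outputs, Xb : (p, d) selected x_b^(j),
    mu : RBF kernel width, rho : regularization parameter."""
    d2 = np.sum(X ** 2, axis=1)[:, None] + np.sum(Xb ** 2, axis=1)[None, :] - 2.0 * X @ Xb.T
    Kb = np.exp(-d2 / (2.0 * mu ** 2))     # Kb[i, j] = k_j(x^(i)), see (24)
    F = Kb.T @ Kb                          # p x p matrix of (27)
    G = Kb.T @ y                           # p-dimensional vector of (27)
    return np.linalg.solve(F + rho * np.eye(Kb.shape[1]), G)   # a_hat^* of (26)

def predict_reduced_model(x_new, Xb, a_hat, mu):
    """Evaluate the reduced RKHS model (23) at a new input x_new."""
    k_new = np.exp(-np.sum((Xb - x_new) ** 2, axis=1) / (2.0 * mu ** 2))
    return float(a_hat @ k_new)
```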

The RKPCA algorithm is summarized in the following five steps:

  1. Determine the nonzero eigenvalues \( \{ \mu_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) and the eigenvectors \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) of the Gram matrix K.

  2. Order the \( \{ \beta_{j} \}_{{j = 1, \ldots ,l^{\prime } }} \) in decreasing order of their corresponding eigenvalues.

  3. For the p retained principal components, choose the \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) that satisfy (20).

  4. Solve (25) to determine \( \hat{a}^{*} \in {\mathbb R}^{p} \).

  5. The reduced RKHS model is given by (23).

5 Online RKPCA method

In this section, we propose an online Reduced Kernel Principal Component Analysis method, which consists in updating the vectors that approximate the principal components. The proposed method is detailed in the following steps.

In a first step, to determine the optimal value of the parameter μ of the kernel associated with the RKHS model, an offline learning step on an n-observation set \( I = \{ (x^{(1)} ,y^{(1)} ), \ldots ,(x^{(n)} ,y^{(n)} )\} \) is carried out until the resulting RKHS model approximates the nonlinear system correctly. Then, we apply the RKPCA method to reduce the number of model parameters, and the resulting model is written as:

$$ \tilde{y}(x) = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k(x_{b}^{(j)} ,x) $$
(28)

Let \( I_{n} = \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) be the set of observations corresponding to the retained principal components.

At time instant (n + 1), the RKHS model output is obtained according to (28) as:

$$ \tilde{y}^{(n + 1)} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k(x_{b}^{(j)} ,x^{(n + 1)} ) $$
(29)

The error between the estimated output and the real one is

$$ e^{(n + 1)} = \left| {\tilde{y}^{(n + 1)} - y^{(n + 1)} } \right| $$
(30)

If e (n+1) < ɛ 1, where ɛ 1 is a given threshold, the model describes the system behavior sufficiently well. Otherwise, an update of the RKHS model is required, which can be accomplished either by updating the model parameters or by updating the retained principal components.

In both cases, we calculate the projection of Φ(x (n+1)) on the space F kpca spanned by \( \{ \Upphi (x_{b}^{(j)} )\}_{j = 1, \ldots ,p} \). This projection is denoted \( \hat{\Upphi }(x^{(n + 1)} ) \), and its jth component is given by:

$$ \hat{\Upphi }\left( {x^{(n + 1)} } \right)_{j} = \left\langle {\Upphi \left( {x_{b}^{(j)} } \right),\Upphi \left( {x^{(n + 1)} } \right)} \right\rangle = k\left( {x_{b}^{(j)} ,x^{(n + 1)} } \right),\quad \, j = 1, \ldots ,p $$
(31)

A good approximation of \( \Upphi (x^{(n + 1)} ) \) by \( \hat{\Upphi }(x^{(n + 1)} ) \) requires satisfying the following condition:

$$ \left| { \, \left\| {\hat{\Upphi }\left( {x^{(n + 1)} } \right)} \right\| - \left\| {\Upphi \left( {x^{(n + 1)} } \right)} \right\| \, } \right| < \varepsilon_{2} $$
(32)
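For the RBF kernel (5), \( \left\| {\Upphi (x)} \right\|^{2} = k(x,x) = 1 \), so condition (32) can be checked roughly as in the sketch below; reading the norm of \( \hat{\Upphi }(x) \) as the Euclidean norm of the component vector (31) is our assumption, not a statement of the original method.

```python
import numpy as np

def projection_is_accurate(x_new, Xb, mu, eps2):
    """Check condition (32): compare the norm of Phi_hat(x), built from the
    components (31), with ||Phi(x)|| = sqrt(k(x, x)) = 1 for the RBF kernel (5)."""
    k_vec = np.exp(-np.sum((Xb - x_new) ** 2, axis=1) / (2.0 * mu ** 2))   # components (31)
    return abs(np.linalg.norm(k_vec) - 1.0) < eps2
```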

If (32) is satisfied, the set I n is updated to \( I_{n + 1} = \{ I_{n} , \, x^{(n + 1)} \} \), which is used to determine the parameters \( \{ \hat{a}_{j} \}_{j = 1, \ldots ,p} \) of the RKHS model (28).

If (32) is not satisfied, we update the set \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) using the observation set I n+1: we build the Gram matrix corresponding to I n+1, given by:

$$ G_{n + 1} = \left( {\begin{array}{*{20}c} {k\left( {x_{b}^{(1)} ,x_{b}^{(1)} } \right)} & \ldots & {k\left( {x_{b}^{(1)} ,x^{(n + 1)} } \right)} \\ \vdots & \ddots & \vdots \\ {k\left( {x^{(n + 1)} ,x_{b}^{(1)} } \right)} & \cdots & {k\left( {x^{(n + 1)} ,x^{(n + 1)} } \right)} \\ \end{array} } \right) $$
(33)

and we compute its eigenvalues.

According to relations (15), (16) and (33), we determine the new principal components. Then, we use the RKPCA selection to determine the new set \( \{ x_{b}^{(j)} \}_{{j = 1, \ldots ,p^{\prime } }} \) that approximates the \( p^{\prime } \) retained principal components. The RKPCA model is given by:

$$ \tilde{y} = \sum\limits_{j = 1}^{{p^{\prime } }} {\hat{a}_{j} } k\left( {x_{b}^{(j)} ,x} \right) $$
(34)

Finally, we estimate the parameters \( \hat{a}_{j} ,\quad j = 1, \ldots ,p^{\prime } \).
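Putting the pieces together, one iteration of the online update described above could look like the following sketch; it reuses the illustrative helpers gram_matrix, select_principal_components, select_support_observations, projection_is_accurate and fit_reduced_model defined earlier, and the choice of re-estimating the parameters on \( I_{n + 1} \) (with the corresponding stored outputs) is our reading of the method, not the authors' implementation.

```python
import numpy as np

def online_rkpca_step(x_new, y_new, Xb, yb, a_hat, mu, rho, eps1, eps2, zeta):
    """One online RKPCA iteration at time n+1. Xb, yb hold the retained
    observations I_n and their outputs; a_hat holds the current model parameters."""
    k_vec = np.exp(-np.sum((Xb - x_new) ** 2, axis=1) / (2.0 * mu ** 2))
    y_model = float(a_hat @ k_vec)                      # model output (29)
    if abs(y_model - y_new) < eps1:                     # error test (30): keep the model
        return Xb, yb, a_hat
    X1 = np.vstack([Xb, x_new[None, :]])                # I_{n+1} = {I_n, x^(n+1)}
    y1 = np.append(yb, y_new)
    if projection_is_accurate(x_new, Xb, mu, eps2):     # (32) holds: keep the same centers
        return Xb, yb, fit_reduced_model(X1, y1, Xb, mu, rho)
    K1 = gram_matrix(X1, mu)                            # Gram matrix (33) of I_{n+1}
    _, beta = select_principal_components(K1)           # new components via (15)-(16)
    idx = select_support_observations(K1, beta, zeta)   # new x_b^(j) via (20)
    Xb, yb = X1[idx], y1[idx]
    return Xb, yb, fit_reduced_model(X1, y1, Xb, mu, rho)   # model (34)
```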

In the following, we summarize the algorithm of the online RKPCA method.

6 Online RKPCA algorithm

6.1 Offline phase

  1. According to (15) and (16), we determine the p retained principal components resulting from the processing of an n-measurement set. Then, we determine the set \( I_{n} = \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) according to (20). The RKHS model is given by:

    $$ \tilde{y} = \sum\limits_{j = 1}^{p} {\hat{a}_{j} } k(x_{b}^{(j)} ,x) $$

6.2 Online phase

  1. At time instant (n + 1), a new datum (x (n+1), y (n+1)) is available. If e (n+1) < ɛ 1, the model describes the system behavior sufficiently well; otherwise, we need to update the RKHS model (28) using the projection \( \hat{\Upphi }(x^{(n + 1)} ) \) given by (31).

  2. If (32) is satisfied, we use the set \( I_{n + 1} = \{ I_{n} ,x^{(n + 1)} \} \) to update the parameters \( \{ \hat{a}_{j} \}_{j = 1, \ldots ,p} \); otherwise, we update the set \( \{ x_{b}^{(j)} \}_{j = 1, \ldots ,p} \) using the set I n+1 and the relations (15), (16) and (33). The new RKHS model is given by (34), and its parameters \( \hat{a}_{j} \) can be determined using the least-squares method.

7 Simulations

The proposed method has been tested for modeling the Wiener-Hammerstein benchmark and a chemical reactor.

7.1 Description of the Wiener-Hammerstein benchmark model

The system to be modeled is sketched in Fig. 1. It consists of an electronic nonlinear system with a Wiener-Hammerstein structure that was built by Gerd Vandersteen [21]. This process was adopted as a nonlinear system benchmark at SYSID 2009.

Fig. 1 Wiener-Hammerstein benchmark

7.2 Results

To build the RKHS model, we use the RBF (Radial Basis Function) kernel

$$ K\left( {x,x^{\prime } } \right) = \exp \left( { - {\frac{{\left\| {x - x^{\prime } } \right\|^{2} }}{{2\mu^{2} }}}} \right),\quad \mu = 88 $$
(35)

We use a heuristic approach, called sequential forward search, to select the input vector that yields the minimal normalized mean square error (NMSE) between the real output and the estimated one; each input is added sequentially. The selected vector is \( x(k) = \left\{ {u(k - 1),u(k - 2),u(k - 4), \ldots ,u(k - 15),y(k - 1)} \right\}^{T} \in {\mathbb R}^{15} \), with a validation NMSE of 0.063.

The chosen thresholds are

$$ \varepsilon_{1} = 0.09,\quad \varepsilon_{2} = 0.01 $$

We performed the online identification using the online RKPCA algorithm developed in Sect. 5. The total number of observations is 187,000.

The parameter estimate vector is

$$ \hat{a} = 10^{4} \times \left[ { - 1.95\quad - 0.05\quad 1.62\quad - 0.06\quad 0.43\quad 0.25\quad - 0.24\quad 0.74\quad 0.02\quad - 0.88\quad 0.20\quad - 1.3\quad 0.96\quad - 0.09\quad 0.52\quad - 0.17} \right]^{\text{T}} \in {\mathbb R}^{16} $$

The number of retained principal components is p = 16. The corresponding selected observations form the columns of the following matrix:

$$ x_{b} = \left( {\begin{array}{*{20}c}
0.1548 & -0.0130 & 0.5902 & 0.4587 & -1.8770 & -0.3059 & -1.2271 & -0.5081 & -0.0058 & 0.5579 & -0.0031 & -0.6057 & -0.1593 & -0.1160 & 0.4391 & -0.0076 \\
0.3063 & 0.0124 & 0.5394 & 1.0115 & -1.2494 & -0.3609 & -1.5622 & -1.1670 & -0.0052 & 0.6482 & -0.0076 & -1.2271 & -0.1878 & -0.0141 & 0.1023 & -0.0024 \\
0.4549 & -0.6057 & 0.1463 & 1.3868 & -0.2050 & -0.3986 & -1.6233 & -1.3631 & -0.0076 & 0.4391 & -0.0038 & -1.5622 & -0.0165 & 0 & -0.1593 & -0.0130 \\
0.6884 & -1.2271 & -0.3183 & 1.2855 & 0.7722 & -0.3921 & -1.4379 & -1.1622 & -0.0024 & 0.1023 & -0.0076 & -1.6233 & 0.2627 & -0.0096 & -0.1878 & 0.0124 \\
1.0424 & -1.5622 & -0.4996 & 0.7018 & 1.2917 & -0.1648 & -1.1519 & -0.8728 & -0.0130 & -0.1593 & -0.0041 & -1.4379 & 0.5648 & -0.0024 & -0.0165 & -0.6057 \\
1.4352 & -1.6233 & -0.2486 & -0.0577 & 1.2196 & 0.3866 & -0.7533 & -0.7543 & 0.0124 & -0.1878 & -0.0072 & -1.1519 & 0.8893 & -0.0082 & 0.2627 & -1.2271 \\
1.6841 & -1.4379 & 0.2949 & -0.5806 & 0.6960 & 1.0159 & -0.2881 & -0.8175 & -0.6057 & -0.0165 & -0.0048 & -0.7533 & 1.1832 & -0.0031 & 0.5648 & -1.5622 \\
1.6209 & -1.1519 & 0.8508 & -0.6249 & 0.0398 & 1.1821 & 0.2074 & -0.8614 & -1.2271 & 0.2627 & -0.0065 & -0.2881 & 1.2989 & -0.0076 & 0.8893 & -1.6233 \\
1.1935 & -0.7533 & 1.2003 & -0.2688 & -0.4113 & 0.5181 & 0.5579 & -0.6887 & -1.5622 & 0.5648 & -0.0048 & 0.2074 & 1.0348 & -0.0038 & 1.1832 & -1.4379 \\
0.5020 & -0.2881 & 1.2693 & 0.1535 & -0.4522 & -0.7251 & 0.6482 & -0.2781 & -1.6233 & 0.8893 & -0.0062 & 0.5579 & 0.3567 & -0.0076 & 1.2989 & -1.1519 \\
-0.2245 & 0.2074 & 1.0915 & 0.2609 & -0.0752 & -1.7504 & 0.4391 & 0.1919 & -1.4379 & 1.1832 & -0.0048 & 0.6482 & -0.5081 & -0.0041 & 1.0348 & -0.7533 \\
-0.7056 & 0.5579 & 0.7615 & -0.1370 & 0.5456 & -1.8503 & 0.1023 & 0.4755 & -1.1519 & 1.2989 & -0.0065 & 0.4391 & -1.1670 & -0.0072 & 0.3567 & -0.2881 \\
0.4249 & 0.6157 & -0.1363 & 1.4868 & 0.2250 & -0.3786 & 1.5243 & -1.2731 & -0.0076 & 0.4431 & -0.0048 & -1.5421 & -0.0187 & -0.0045 & -0.1373 & -0.0172 \\
-0.3052 & 0.4391 & 0.2393 & -1.6206 & 1.4795 & 0.0906 & -0.1878 & 0.1827 & -0.2881 & 0.3567 & -0.0058 & -0.1593 & -1.1622 & -0.0065 & -1.1670 & 0.5579 \\
0.1257 & -0.3746 & 0.0989 & 0.3945 & -0.2572 & -0.2520 & -0.5236 & -0.1092 & -0.1274 & -0.2397 & -0.0666 & -0.4865 & 0.0670 & -0.1569 & -0.0841 & -0.2417 \\
\end{array} } \right) $$

In Fig. 2a–c, we present the system output and the model output during the online identification. We show the performance of our algorithm in the training sample windows [4800, 5800], \( [10^{5} ,\;1.1 \times 10^{5} ] \) and [183200, 184200]. We note that the model output agrees well with the system output; indeed, the normalized mean square error is equal to 0.087. This shows the good performance of the proposed online identification method.

Fig. 2 System and model outputs during the online identification

To evaluate the performance of the proposed method, we plot in Fig. 3 the evolution of the NMSE. We notice that the error decreases as the number of observations increases.

Fig. 3 Evolution of NMSE

Table 1 summarizes the results of the online identification algorithm in terms of kernel parameter, NMSE and number of parameters.

Table 1 Results of the Wiener Hammerstein benchmark identification with the online RKPCA algorithm

7.3 Chemical reactor modeling

7.3.1 Process description

The process is a Continuous Stirred Tank Reactor (CSTR), a nonlinear system used to carry out chemical reactions [8]. A diagram of the reactor is given in Fig. 4.

Fig. 4 Chemical reactor diagram

The physical equations describing the process are

$$ \begin{aligned} {\frac{{{\text{d}}h(t)}}{{{\text{d}}t}}} & = w_{1} (t) + w_{2} (t) - 0.2\sqrt {h(t)} \\ {\frac{{{\text{d}}C_{b} (t)}}{{{\text{d}}t}}} & = \left( {C_{b1} - C_{b} (t)} \right){\frac{{w_{1} (t)}}{h(t)}} + \left( {C_{b2} - C_{b} (t)} \right){\frac{{w_{2} (t)}}{h(t)}} - {\frac{{k_{1} C_{b} (t)}}{{(1 + k_{2} C_{b} (t))^{2} }}} \\ \end{aligned} $$
(36)

where h(t) is the height of the mixture in the reactor, w 1 (resp. w 2) is the feed flow rate of reactant 1 (resp. reactant 2) with concentration C b1 (resp. C b2), w 0 is the outlet flow rate of the reaction product and C b its concentration, and k 1 and k 2 are the reactant consumption rates. The temperature in the reactor is assumed constant and equal to the ambient temperature. We are interested in modeling the subsystem presented in Fig. 5.

Fig. 5 Considered subsystem

For the purpose of the simulations, we used the CSTR model of the reactor provided with Simulink of Matlab. The parameter μ = 208 is determined using the cross-validation technique in the offline phase.

The input vector of RKHS model is

$$ x(k) = [w_{1} (k - 1),w_{1} (k - 2),cb(k - 1),cb(k - 2)]^{T} $$
(37)

The number of observations is 300.

In Fig. 6, we represent the output of the online reduced kernel principal component analysis model as well as the system output. We notice good agreement between both outputs, with a normalized mean square error equal to 0.083%.

Fig. 6 System and model outputs during the online identification

In Fig. 7, we draw the evolution of the NMSE. We notice that the online identification algorithm readily reaches an error of less than 1% from the 20th training sample onward.

Fig. 7 Evolution of NMSE

Table 2 summarizes the performance of the online identification algorithm in terms of kernel parameter, NMSE and number of parameters.

Table 2 Results of the chemical reactor identification with the online RKPCA algorithm

8 Conclusion

In this paper, we have proposed an online reduced kernel principal component analysis method for nonlinear system parameter identification. Through several experiments, we showed the accuracy and good scaling properties of the proposed method. The algorithm has been tested on the identification of the Wiener-Hammerstein benchmark model and of a chemical reactor, and the results were satisfactory. The proposed technique may be very helpful to design an adaptive control strategy for nonlinear systems.